TTS Output - RunAnywhere Documentation

Early Beta — The Web SDK is in early beta. APIs may change between releases.

Overview

The Web SDK’s TTS produces raw PCM audio as a Float32Array. This page covers how to work with that output: playing it back, converting it to WAV, and integrating it with the Web Audio API.

The Web SDK currently produces audio all-at-once (non-streaming). Streaming TTS output is planned for a future release.
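Because the result is plain sample data, basic properties such as duration can be derived from the array length and sample rate. The helper below is a minimal sketch, not part of the SDK: `describePcm` and `PcmStats` are hypothetical names, and it assumes a single (mono) channel, matching the WAV example later on this page.

```typescript
// Hypothetical helper (not an SDK API): summarize a mono PCM buffer.
interface PcmStats {
  durationSeconds: number // samples / sampleRate
  peak: number            // largest absolute sample value
}

function describePcm(samples: Float32Array, sampleRate: number): PcmStats {
  if (sampleRate <= 0) throw new RangeError('sampleRate must be positive')
  // Scan once for the largest magnitude; 0 for an empty buffer.
  const peak = samples.reduce((max, s) => Math.max(max, Math.abs(s)), 0)
  return { durationSeconds: samples.length / sampleRate, peak }
}
```

With a synthesis result, this would be called as `describePcm(result.audioData, result.sampleRate)`. A peak above 1 indicates samples that will be clipped when converted to 16-bit PCM.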

Playing Audio

Using AudioPlayback

The SDK includes a built-in AudioPlayback class for playing synthesized audio:

import { TTS, AudioPlayback } from '@runanywhere/web'

const result = await TTS.synthesize('Hello from RunAnywhere!')

const player = new AudioPlayback()
player.onComplete(() => console.log('Playback finished'))
await player.play(result.audioData, result.sampleRate)

Stop Playback

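const result = await TTS.synthesize('Hello from RunAnywhere!')
const player = new AudioPlayback()

// Not awaited: if play() resolves only when playback ends, awaiting it would make stop() a no-op
void player.play(result.audioData, result.sampleRate)

// Stop playback at any time
player.stop()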

Using Web Audio API Directly

For more control, use the Web Audio API directly:

import { TTS } from '@runanywhere/web'

const result = await TTS.synthesize('Hello world')

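// Browsers may keep an AudioContext suspended until a user gesture (autoplay policy),
// so create or resume it from a click or key handler.
const audioCtx = new AudioContext()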
const buffer = audioCtx.createBuffer(1, result.audioData.length, result.sampleRate)
buffer.getChannelData(0).set(result.audioData)

const source = audioCtx.createBufferSource()
source.buffer = buffer
source.connect(audioCtx.destination)
source.start()

Converting to WAV

Convert synthesized audio to WAV format for download or storage:

import { TTS } from '@runanywhere/web'

function float32ToWav(samples: Float32Array, sampleRate: number): Blob {
  const buffer = new ArrayBuffer(44 + samples.length * 2)
  const view = new DataView(buffer)

  // WAV header
  const writeString = (offset: number, str: string) => {
    for (let i = 0; i < str.length; i++) {
      view.setUint8(offset + i, str.charCodeAt(i))
    }
  }

  writeString(0, 'RIFF')
  view.setUint32(4, 36 + samples.length * 2, true)
  writeString(8, 'WAVE')
  writeString(12, 'fmt ')
  view.setUint32(16, 16, true)
  view.setUint16(20, 1, true) // PCM
  view.setUint16(22, 1, true) // mono
  view.setUint32(24, sampleRate, true)
  view.setUint32(28, sampleRate * 2, true)
  view.setUint16(32, 2, true)
  view.setUint16(34, 16, true)
  writeString(36, 'data')
  view.setUint32(40, samples.length * 2, true)

  // Convert float32 to int16
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]))
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true)
  }

  return new Blob([buffer], { type: 'audio/wav' })
}

// Usage
const result = await TTS.synthesize('Hello world')
const wavBlob = float32ToWav(result.audioData, result.sampleRate)

// Download
const url = URL.createObjectURL(wavBlob)
const a = document.createElement('a')
a.href = url
a.download = 'speech.wav'
a.click()
// Revoke in a later task so the browser has started the download first
setTimeout(() => URL.revokeObjectURL(url), 0)
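The float-to-int16 step inside float32ToWav can be looked at in isolation. The sketch below pulls it into two standalone functions so the clamping and scaling are easy to check; `floatToInt16` and `toInt16Array` are hypothetical names introduced here, not SDK APIs. Negative samples are scaled by 0x8000 and positive ones by 0x7fff because the 16-bit range is asymmetric (-32768 to 32767).

```typescript
// Hypothetical helpers (not SDK APIs) mirroring the conversion loop above.
function floatToInt16(sample: number): number {
  // Clamp to [-1, 1] so out-of-range samples saturate instead of wrapping.
  const s = Math.max(-1, Math.min(1, sample))
  // Truncate toward zero, which is what DataView.setInt16 does with a fractional value.
  return Math.trunc(s < 0 ? s * 0x8000 : s * 0x7fff)
}

function toInt16Array(samples: Float32Array): Int16Array {
  // Convert a whole buffer sample by sample.
  return Int16Array.from(samples, (s) => floatToInt16(s))
}
```

For example, `floatToInt16(1)` gives 32767, `floatToInt16(-1)` gives -32768, and an out-of-range input such as 1.5 saturates at 32767.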

Queuing Multiple Utterances

Play multiple TTS results in sequence:

import { TTS, AudioPlayback } from '@runanywhere/web'

async function speakSequence(texts: string[]) {
  const player = new AudioPlayback()

  for (const text of texts) {
    const result = await TTS.synthesize(text)
    await player.play(result.audioData, result.sampleRate)
  }
}

await speakSequence(['First sentence.', 'Second sentence.', 'And the third.'])
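An alternative to playing clips one after another is to join them into a single buffer first, which can then be played once or exported as one WAV file. The following is a minimal sketch under stated assumptions: `concatWithGaps` is a hypothetical helper, not an SDK API, and it assumes every clip is mono and shares the same sample rate.

```typescript
// Hypothetical helper (not an SDK API): join mono clips with silence between them.
function concatWithGaps(
  clips: Float32Array[],
  sampleRate: number,
  gapSeconds = 0.25
): Float32Array {
  // Number of silent samples inserted between consecutive clips.
  const gap = Math.max(0, Math.round(gapSeconds * sampleRate))
  const audioLength = clips.reduce((n, clip) => n + clip.length, 0)
  const total = audioLength + gap * Math.max(0, clips.length - 1)

  // A new Float32Array is zero-filled, so the gaps are already silent.
  const out = new Float32Array(total)
  let offset = 0
  clips.forEach((clip, i) => {
    out.set(clip, offset)
    // Skip over the gap after every clip except the last one.
    offset += clip.length + (i < clips.length - 1 ? gap : 0)
  })
  return out
}
```

With synthesis results collected in an array named `results`, this could be used as `concatWithGaps(results.map((r) => r.audioData), results[0].sampleRate)`, and the joined buffer passed to `player.play` or `float32ToWav`.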

Related

Synthesize: Text-to-Speech synthesis

TTS Voices: Voice management
