Early Beta — The Web SDK is in early beta. APIs may change between releases.

Overview

The Text-to-Speech (TTS) API converts text to spoken audio using on-device neural voice synthesis with Piper TTS compiled to WebAssembly. All synthesis happens locally in the browser.

Basic Usage

import { TTS } from '@runanywhere/web'

// A voice must be loaded before synthesizing (see Setup)
const result = await TTS.synthesize('Hello, welcome to the RunAnywhere SDK!')
console.log('Duration:', result.durationMs, 'ms')
console.log('Sample rate:', result.sampleRate)
// result.audioData is a Float32Array of PCM samples

Setup

Before synthesizing, load a TTS voice model:
import { TTS } from '@runanywhere/web'

await TTS.loadVoice({
  voiceId: 'piper-en-lessac',
  modelPath: '/models/piper-en-lessac.onnx',
  tokensPath: '/models/tokens.txt',
  dataDir: '/models/espeak-ng-data',
})

API Reference

TTS.loadVoice

Load a TTS voice model.
TTS.loadVoice(config: TTSVoiceConfig): Promise<void>

TTSVoiceConfig

interface TTSVoiceConfig {
  /** Unique voice identifier */
  voiceId: string

  /** Path to VITS/Piper ONNX model */
  modelPath: string

  /** Path to tokens file */
  tokensPath: string

  /** Path to espeak-ng-data directory (for Piper voices) */
  dataDir?: string

  /** Path to lexicon file */
  lexicon?: string

  /** Number of threads (default: 1) */
  numThreads?: number
}
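
A config can be sanity-checked before it is handed to `TTS.loadVoice`. The sketch below is a hypothetical helper, not part of the SDK: `VoiceConfigLike` is a local copy of the interface above, and the rules it applies (an `.onnx` extension on `modelPath`, a positive integer `numThreads`) are assumptions drawn from the field comments rather than documented SDK behavior.

```typescript
// Local copy of the documented TTSVoiceConfig shape, so the sketch is self-contained.
interface VoiceConfigLike {
  voiceId: string
  modelPath: string
  tokensPath: string
  dataDir?: string
  lexicon?: string
  numThreads?: number
}

// Hypothetical helper (not an SDK API): returns a list of problems,
// or an empty array when the config looks usable.
function validateVoiceConfig(config: VoiceConfigLike): string[] {
  const problems: string[] = []

  if (!config.voiceId.trim()) problems.push('voiceId is empty')

  // Assumption: the model file carries an .onnx extension.
  if (!config.modelPath.toLowerCase().endsWith('.onnx')) {
    problems.push('modelPath should point to an .onnx file')
  }

  if (!config.tokensPath.trim()) problems.push('tokensPath is empty')

  // Assumption: numThreads, when given, must be a positive integer.
  if (
    config.numThreads !== undefined &&
    (!Number.isInteger(config.numThreads) || config.numThreads < 1)
  ) {
    problems.push('numThreads must be a positive integer')
  }

  return problems
}
```

A caller might log the returned problems and skip `TTS.loadVoice` when the array is non-empty.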

TTS.synthesize

Convert text to audio data.
TTS.synthesize(
  text: string,
  options?: TTSSynthesizeOptions
): Promise<TTSSynthesisResult>
Parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| text | string | Text to synthesize |
| options | TTSSynthesizeOptions | Optional voice settings |

TTSSynthesizeOptions

interface TTSSynthesizeOptions {
  /** Speaker ID for multi-speaker models (default: 0) */
  speakerId?: number

  /** Speech speed multiplier (default: 1.0) */
  speed?: number
}

TTSSynthesisResult

interface TTSSynthesisResult {
  /** Raw PCM audio data */
  audioData: Float32Array

  /** Audio sample rate in Hz */
  sampleRate: number

  /** Audio duration in milliseconds */
  durationMs: number

  /** Processing time in milliseconds */
  processingTimeMs: number
}
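
The two timing fields can be combined into a real-time factor (processing time divided by audio duration), a common way to judge whether synthesis keeps up with playback. This is a minimal sketch; `SynthesisTiming` and `realTimeFactor` are illustrative names, not SDK exports.

```typescript
// Only the timing fields of the documented TTSSynthesisResult are needed here.
interface SynthesisTiming {
  durationMs: number
  processingTimeMs: number
}

// Hypothetical helper: values below 1 mean audio was produced faster than
// it takes to play back. Returns NaN when there is no audio to compare against.
function realTimeFactor(timing: SynthesisTiming): number {
  if (timing.durationMs <= 0) return Number.NaN
  return timing.processingTimeMs / timing.durationMs
}
```

For example, a result with `durationMs: 2000` and `processingTimeMs: 500` gives a factor of 0.25.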

Examples

Basic Synthesis with Playback

import { TTS, AudioPlayback } from '@runanywhere/web'

const result = await TTS.synthesize('Hello from RunAnywhere!')
const player = new AudioPlayback()
await player.play(result.audioData, result.sampleRate)
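
When the audio should be saved or downloaded rather than played, the PCM samples can be wrapped in a WAV container. The sketch below assumes the samples are mono and normalized to the range [-1, 1], which this page does not state explicitly; `encodeWav` is a hypothetical helper, not an SDK function.

```typescript
// Hypothetical helper: encodes mono Float32 PCM as a 16-bit PCM WAV file.
// Assumptions: single channel, samples normalized to [-1, 1].
function encodeWav(samples: Float32Array, sampleRate: number): ArrayBuffer {
  const bytesPerSample = 2
  const headerSize = 44
  const dataSize = samples.length * bytesPerSample
  const buffer = new ArrayBuffer(headerSize + dataSize)
  const view = new DataView(buffer)

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i))
    }
  }

  // RIFF header
  writeAscii(0, 'RIFF')
  view.setUint32(4, 36 + dataSize, true) // file size minus the first 8 bytes
  writeAscii(8, 'WAVE')

  // "fmt " chunk describing uncompressed 16-bit mono PCM
  writeAscii(12, 'fmt ')
  view.setUint32(16, 16, true) // chunk size
  view.setUint16(20, 1, true) // audio format: PCM
  view.setUint16(22, 1, true) // channels: mono
  view.setUint32(24, sampleRate, true)
  view.setUint32(28, sampleRate * bytesPerSample, true) // byte rate
  view.setUint16(32, bytesPerSample, true) // block align
  view.setUint16(34, 16, true) // bits per sample

  // "data" chunk
  writeAscii(36, 'data')
  view.setUint32(40, dataSize, true)

  // Clamp each sample and scale it to the signed 16-bit range.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]))
    view.setInt16(headerSize + i * bytesPerSample, s < 0 ? s * 0x8000 : s * 0x7fff, true)
  }

  return buffer
}
```

In the browser, the returned buffer can be passed to `new Blob([buffer], { type: 'audio/wav' })` and exposed through `URL.createObjectURL` for download or an `<audio>` element.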

With Speed Options

// Slower speech
const slow = await TTS.synthesize('This is spoken slowly.', { speed: 0.75 })

// Faster speech
const fast = await TTS.synthesize('This is spoken quickly!', { speed: 1.5 })

// Normal speed (default)
const normal = await TTS.synthesize('This is normal speed.', { speed: 1.0 })

Multi-Speaker Model

// Use different speakers from the same model
const speaker0 = await TTS.synthesize('Hello from speaker zero.', { speakerId: 0 })
const speaker1 = await TTS.synthesize('Hello from speaker one.', { speakerId: 1 })
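
Not every model has more than one speaker, so it can help to check a requested ID against the speaker count before synthesizing. The sketch below is a hypothetical helper that falls back to speaker 0 (the documented default) when the ID is out of range; how the SDK itself treats an invalid `speakerId` is not stated on this page.

```typescript
// Hypothetical helper (not an SDK API): returns the requested speaker ID when it
// is a valid index for the loaded model, otherwise the default speaker 0.
function resolveSpeakerId(requested: number, numSpeakers: number): number {
  const isValid =
    Number.isInteger(requested) && requested >= 0 && requested < numSpeakers
  return isValid ? requested : 0
}
```

With a loaded voice, this could be called as `resolveSpeakerId(1, TTS.numSpeakers)` and the result passed as `speakerId` to `TTS.synthesize`.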

React TTS Component

TextToSpeech.tsx
import { useState, useCallback } from 'react'
import { TTS, AudioPlayback } from '@runanywhere/web'

export function TextToSpeech() {
  const [text, setText] = useState('')
  const [isPlaying, setIsPlaying] = useState(false)

  const handleSpeak = useCallback(async () => {
    if (!text.trim()) return

    setIsPlaying(true)
    try {
      const result = await TTS.synthesize(text, { speed: 1.0 })
      const player = new AudioPlayback()

      player.onComplete(() => setIsPlaying(false))
      await player.play(result.audioData, result.sampleRate)
    } catch (error) {
      console.error('Synthesis failed:', error)
      setIsPlaying(false)
    }
  }, [text])

  return (
    <div>
      <textarea
        value={text}
        onChange={(e) => setText(e.target.value)}
        placeholder="Enter text to speak..."
      />
      <button onClick={handleSpeak} disabled={isPlaying || !text.trim()}>
        {isPlaying ? 'Speaking...' : 'Speak'}
      </button>
    </div>
  )
}

Voice Properties

After loading a voice, check its properties:
console.log('Voice loaded:', TTS.isVoiceLoaded)
console.log('Voice ID:', TTS.voiceId)
console.log('Sample rate:', TTS.sampleRate)
console.log('Speakers:', TTS.numSpeakers)

Speed Options

| Speed | Effect |
| --- | --- |
| 0.5 | Half speed (very slow) |
| 0.75 | Slower than normal |
| 1.0 | Normal speed (default) |
| 1.25 | Slightly faster |
| 1.5 | Fast |
| 2.0 | Double speed |
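
When the speed comes from user input such as a slider, it can be clamped before it reaches `TTS.synthesize`. This sketch assumes the 0.5 to 2.0 range listed above is the useful range; the SDK may accept values outside it, and `clampSpeed` is an illustrative name rather than an SDK function.

```typescript
// Assumed bounds, taken from the speed values listed in this section.
const MIN_SPEED = 0.5
const MAX_SPEED = 2.0
const DEFAULT_SPEED = 1.0

// Hypothetical helper: keeps a user-supplied speed inside the assumed range
// and falls back to the default for NaN or infinite input.
function clampSpeed(speed: number): number {
  if (!Number.isFinite(speed)) return DEFAULT_SPEED
  return Math.min(MAX_SPEED, Math.max(MIN_SPEED, speed))
}
```

The clamped value can then be passed as the `speed` option, for example `{ speed: clampSpeed(sliderValue) }`, where `sliderValue` stands in for whatever the UI provides.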

Error Handling

import { TTS, SDKError, SDKErrorCode } from '@runanywhere/web'

const text = 'Hello from RunAnywhere!'

try {
  const result = await TTS.synthesize(text)
  console.log('Synthesized', result.durationMs, 'ms of audio')
} catch (err) {
  if (err instanceof SDKError) {
    switch (err.code) {
      case SDKErrorCode.NotInitialized:
        console.error('Initialize the SDK first')
        break
      case SDKErrorCode.ModelNotLoaded:
        console.error('Load a TTS voice first')
        break
      default:
        console.error('TTS error:', err.message)
    }
  } else {
    // Not an SDK error: rethrow instead of swallowing it silently
    throw err
  }
}