Early Beta — The Web SDK is in early beta. APIs may change between releases.

Overview

The Web SDK runs Piper TTS voice models (VITS architecture, ONNX format) in the browser through sherpa-onnx compiled to WebAssembly. This page covers the available voices, how to switch between them, and how to manage multiple voice models.

Available Voices

The Web SDK uses Piper TTS voice models in ONNX format. These are neural voices that produce natural-sounding speech.
Voice                  Language      Quality  Size    Speed
en_US-lessac-medium    English (US)  High     ~65 MB  Fast
en_US-amy-medium       English (US)  High     ~65 MB  Fast
en_GB-alba-medium      English (UK)  High     ~65 MB  Fast
de_DE-thorsten-medium  German        High     ~65 MB  Fast
fr_FR-siwis-medium     French        High     ~65 MB  Fast
es_ES-davefx-medium    Spanish       High     ~65 MB  Fast
Piper TTS has hundreds of voices in many languages. Browse the full catalog at Piper Samples.

Loading a Voice

import { TTS } from '@runanywhere/web'

await TTS.loadVoice({
  voiceId: 'piper-en-lessac',
  modelPath: '/models/en_US-lessac-medium.onnx',
  tokensPath: '/models/tokens.txt',
  dataDir: '/models/espeak-ng-data',
})

Switching Voices

Unload the current voice before loading a new one:
// Unload current voice
await TTS.unloadVoice()

// Load a different voice
await TTS.loadVoice({
  voiceId: 'piper-de-thorsten',
  modelPath: '/models/de_DE-thorsten-medium.onnx',
  tokensPath: '/models/tokens.txt',
  dataDir: '/models/espeak-ng-data',
})

// Synthesize in German
const result = await TTS.synthesize('Hallo, willkommen bei RunAnywhere!')
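
The unload-then-load sequence above can be wrapped in a small helper so callers cannot forget the unload step. This is a minimal sketch, not part of the SDK: `switchVoice`, `VoiceEngine`, and `VoiceConfig` are hypothetical names, and the structural type only assumes the members this page shows on `TTS` (`isVoiceLoaded`, `voiceId`, `loadVoice`, `unloadVoice`).

```typescript
// Hypothetical minimal shapes, modeled on the members shown on this page.
interface VoiceConfig {
  voiceId: string
  modelPath: string
  tokensPath: string
  dataDir?: string
}

interface VoiceEngine {
  readonly isVoiceLoaded: boolean
  readonly voiceId?: string | null
  loadVoice(config: VoiceConfig): Promise<void>
  unloadVoice(): Promise<void>
}

// Hypothetical helper: switches to `next`, unloading any current voice first.
// Resolves to false when the requested voice is already active (nothing done),
// true when a load actually happened.
async function switchVoice(engine: VoiceEngine, next: VoiceConfig): Promise<boolean> {
  if (engine.isVoiceLoaded && engine.voiceId === next.voiceId) {
    return false
  }
  if (engine.isVoiceLoaded) {
    await engine.unloadVoice()
  }
  await engine.loadVoice(next)
  return true
}

// Usage (assumes the SDK's TTS object satisfies VoiceEngine):
// await switchVoice(TTS, {
//   voiceId: 'piper-de-thorsten',
//   modelPath: '/models/de_DE-thorsten-medium.onnx',
//   tokensPath: '/models/tokens.txt',
//   dataDir: '/models/espeak-ng-data',
// })
```

Skipping the reload when the voice ID already matches is a design choice here; drop that check if the same ID may point at different model files.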

Voice Properties

Check the currently loaded voice:
console.log('Voice loaded:', TTS.isVoiceLoaded)
console.log('Voice ID:', TTS.voiceId)
console.log('Sample rate:', TTS.sampleRate) // in Hz, typically 22050
console.log('Number of speakers:', TTS.numSpeakers)
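
One practical use of the sample rate is converting a sample count into a playback duration. The sketch below is plain arithmetic and does not call the SDK; `audioDurationSeconds` is a hypothetical helper, and it assumes you already have the number of mono PCM samples in the synthesized audio (the shape of the `synthesize` result is not shown on this page).

```typescript
// Hypothetical helper: duration in seconds of `sampleCount` mono PCM samples
// played back at `sampleRate` Hz.
function audioDurationSeconds(sampleCount: number, sampleRate: number): number {
  if (!Number.isFinite(sampleRate) || sampleRate <= 0) {
    throw new RangeError(`sampleRate must be a positive number, got ${sampleRate}`)
  }
  if (!Number.isFinite(sampleCount) || sampleCount < 0) {
    throw new RangeError(`sampleCount must be non-negative, got ${sampleCount}`)
  }
  return sampleCount / sampleRate
}

// 44100 samples at 22050 Hz is 2 seconds of audio.
const seconds = audioDurationSeconds(44100, 22050)
```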

Multi-Speaker Models

Some Piper voice models include multiple speakers. Select a speaker by ID:
// Check available speakers
console.log('Speakers:', TTS.numSpeakers)

// Synthesize with different speakers
const speaker0 = await TTS.synthesize('Hello!', { speakerId: 0 })
const speaker1 = await TTS.synthesize('Hello!', { speakerId: 1 })
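
Before passing a speaker ID, it can help to check it against the speaker count. The following is a minimal sketch under the assumption that speaker IDs are zero-based and run from 0 to `numSpeakers - 1`, which matches the IDs 0 and 1 used above but is not stated explicitly on this page. `resolveSpeakerId` is a hypothetical helper, not an SDK function.

```typescript
// Hypothetical helper: returns `requested` if it is a valid zero-based speaker
// index for a model with `numSpeakers` speakers, otherwise throws.
function resolveSpeakerId(requested: number, numSpeakers: number): number {
  const valid =
    Number.isInteger(requested) && requested >= 0 && requested < numSpeakers
  if (!valid) {
    throw new RangeError(
      `speakerId ${requested} is out of range for a model with ${numSpeakers} speaker(s)`
    )
  }
  return requested
}

// Usage (assumes TTS.numSpeakers as shown above):
// const speakerId = resolveSpeakerId(1, TTS.numSpeakers)
// const audio = await TTS.synthesize('Hello!', { speakerId })
```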

Voice Configuration Options

interface TTSVoiceConfig {
  /** Unique voice identifier */
  voiceId: string

  /** Path to VITS/Piper ONNX model file */
  modelPath: string

  /** Path to tokens.txt file */
  tokensPath: string

  /** Path to espeak-ng-data directory */
  dataDir?: string

  /** Path to custom lexicon file */
  lexicon?: string

  /** Number of inference threads (default: 1) */
  numThreads?: number
}
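
When an app offers several voices, the config objects can be generated from the Piper voice name instead of written by hand. This is a sketch under stated assumptions: `piperVoiceConfig` is a hypothetical helper, the file layout (`<baseDir>/<name>.onnx`, a shared `tokens.txt`, and a shared `espeak-ng-data` directory) simply mirrors the example paths on this page, and the `piper-<language>-<speaker>` ID scheme is inferred from the two IDs used above. Adjust all three to match how your models are actually hosted.

```typescript
// Same fields as the TTSVoiceConfig interface above, repeated so the sketch is self-contained.
interface TTSVoiceConfig {
  voiceId: string
  modelPath: string
  tokensPath: string
  dataDir?: string
  lexicon?: string
  numThreads?: number
}

// Hypothetical helper: builds a config from a Piper voice name such as
// 'en_US-lessac-medium', assuming all files live under `baseDir`.
function piperVoiceConfig(piperName: string, baseDir = '/models'): TTSVoiceConfig {
  // <lang>_<REGION>-<speaker>-<quality>, e.g. de_DE-thorsten-medium
  const match = /^([a-z]{2,3})_[A-Z]{2}-([a-z0-9_]+)-(x_low|low|medium|high)$/.exec(piperName)
  if (!match) {
    throw new Error(`Unrecognized Piper voice name: ${piperName}`)
  }
  const [, language, speaker] = match
  const root = baseDir.replace(/\/+$/, '') // drop trailing slashes
  return {
    voiceId: `piper-${language}-${speaker}`,
    modelPath: `${root}/${piperName}.onnx`,
    tokensPath: `${root}/tokens.txt`,
    dataDir: `${root}/espeak-ng-data`,
  }
}

const german = piperVoiceConfig('de_DE-thorsten-medium')
// german.voiceId is 'piper-de-thorsten'
// german.modelPath is '/models/de_DE-thorsten-medium.onnx'
```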

Clean Up

Release TTS resources when no longer needed:
await TTS.unloadVoice()
TTS.cleanup()
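
To make sure the voice is unloaded even when synthesis throws, the load and unload calls can be paired with `try`/`finally`. This is a minimal sketch, not an SDK feature: `withVoice`, `VoiceLifecycle`, and `VoiceConfig` are hypothetical names, and the structural type assumes only the `loadVoice` and `unloadVoice` methods shown on this page. It deliberately leaves `TTS.cleanup()` to the caller, since that call is made once when TTS is no longer needed at all.

```typescript
// Hypothetical minimal shapes, modeled on the methods shown on this page.
interface VoiceConfig {
  voiceId: string
  modelPath: string
  tokensPath: string
  dataDir?: string
}

interface VoiceLifecycle {
  loadVoice(config: VoiceConfig): Promise<void>
  unloadVoice(): Promise<void>
}

// Hypothetical helper: loads a voice, runs `task`, and always unloads the
// voice afterwards, whether `task` resolves or rejects.
async function withVoice<E extends VoiceLifecycle, T>(
  engine: E,
  config: VoiceConfig,
  task: (engine: E) => Promise<T>
): Promise<T> {
  await engine.loadVoice(config)
  try {
    return await task(engine)
  } finally {
    await engine.unloadVoice()
  }
}

// Usage (assumes the SDK's TTS object satisfies VoiceLifecycle):
// const audio = await withVoice(TTS, config, (tts) => tts.synthesize('Hello!'))
// TTS.cleanup() // once, when TTS is no longer needed
```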