Overview

The RunAnywhere SDK supports both neural TTS voices (Piper) and system voices (platform-native TTS). This guide covers available voices and how to configure them.

Voice Types

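| Type | Provider | Quality | Offline | Languages |
| --- | --- | --- | --- | --- |
| Neural | Piper TTS | High | Yes | Multi |
| System | iOS AVSpeech | Good | Yes | Multi |
| System | Android TTS | Good | Yes | Multi |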

Getting Available Voices

import { RunAnywhere } from '@runanywhere/core'

// Get all available voices
const voices = await RunAnywhere.availableTTSVoices()

for (const voice of voices) {
  console.log(`${voice.id}`)
  console.log(`  Name: ${voice.name}`)
  console.log(`  Language: ${voice.language}`)
  console.log(`  Type: ${voice.type}`)
  console.log(`  Quality: ${voice.quality}`)
}

Voice Info Structure

interface TTSVoiceInfo {
  /** Unique voice identifier */
  id: string

  /** Display name */
  name: string

  /** Language code (e.g., 'en-US') */
  language: string

  /** Voice type (neural, system) */
  type: 'neural' | 'system'

  /** Quality level */
  quality: 'low' | 'medium' | 'high'

  /** Gender (if available) */
  gender?: 'male' | 'female' | 'neutral'

  /** Whether voice requires download */
  requiresDownload: boolean

  /** Whether voice is ready to use */
  isAvailable: boolean
}
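
The `requiresDownload` and `isAvailable` flags make it possible to split a voice list into voices that are ready now and voices that still need a download. The sketch below is illustrative only: it redeclares the interface locally so it runs on its own, `partitionVoices` is a hypothetical helper rather than an SDK function, the sample entries are made up, and it assumes that `isAvailable: true` always means the voice can be used immediately.

```typescript
// Local copy of the interface above so the sketch is self-contained.
interface TTSVoiceInfo {
  id: string
  name: string
  language: string
  type: 'neural' | 'system'
  quality: 'low' | 'medium' | 'high'
  gender?: 'male' | 'female' | 'neutral'
  requiresDownload: boolean
  isAvailable: boolean
}

interface VoicePartition {
  ready: TTSVoiceInfo[]
  needsDownload: TTSVoiceInfo[]
  unavailable: TTSVoiceInfo[]
}

// Hypothetical helper (not part of the SDK): group voices by readiness.
function partitionVoices(voices: TTSVoiceInfo[]): VoicePartition {
  const partition: VoicePartition = { ready: [], needsDownload: [], unavailable: [] }
  for (const voice of voices) {
    if (voice.isAvailable) {
      partition.ready.push(voice) // assumption: usable right away
    } else if (voice.requiresDownload) {
      partition.needsDownload.push(voice) // usable after a download
    } else {
      partition.unavailable.push(voice) // not usable and nothing to download
    }
  }
  return partition
}

// Made-up sample data for illustration; a real list would come from
// RunAnywhere.availableTTSVoices().
const sampleVoices: TTSVoiceInfo[] = [
  {
    id: 'com.apple.ttsbundle.Samantha-compact',
    name: 'Samantha',
    language: 'en-US',
    type: 'system',
    quality: 'medium',
    requiresDownload: false,
    isAvailable: true,
  },
  {
    id: 'piper-en-lessac',
    name: 'English (Lessac)',
    language: 'en-US',
    type: 'neural',
    quality: 'high',
    requiresDownload: true,
    isAvailable: false,
  },
  {
    id: 'en-us-x-tpf-local',
    name: 'Android English',
    language: 'en-US',
    type: 'system',
    quality: 'medium',
    requiresDownload: false,
    isAvailable: false,
  },
]

const samplePartition = partitionVoices(sampleVoices)
console.log('Ready:', samplePartition.ready.map((v) => v.id))
console.log('Needs download:', samplePartition.needsDownload.map((v) => v.id))
```

A UI could show the `ready` group as selectable and offer a download button for the `needsDownload` group.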

Using Neural Voices (Piper)

Add a Piper Voice

import { ONNX, ModelArtifactType } from '@runanywhere/onnx'
import { RunAnywhere, ModelCategory } from '@runanywhere/core'

// English (US) - Lessac voice
await ONNX.addModel({
  id: 'piper-en-lessac',
  name: 'English (Lessac)',
  url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/.../vits-piper-en_US-lessac-medium.tar.gz',
  modality: ModelCategory.SpeechSynthesis,
  artifactType: ModelArtifactType.TarGzArchive,
  memoryRequirement: 65_000_000,
})

// Download and load
await RunAnywhere.downloadModel('piper-en-lessac')
const modelInfo = await RunAnywhere.getModelInfo('piper-en-lessac')
await RunAnywhere.loadTTSModel(modelInfo.localPath, 'piper')

// Use the voice
const result = await RunAnywhere.synthesize('Hello, world!')

Available Piper Voices

| Voice ID | Language | Quality | Size |
| --- | --- | --- | --- |
| en_US-lessac-medium | English (US) | High | ~65MB |
| en_US-amy-medium | English (US) | High | ~65MB |
| en_GB-alba-medium | English (UK) | High | ~65MB |
| de_DE-thorsten-medium | German | High | ~65MB |
| fr_FR-upmc-medium | French | High | ~65MB |
| es_ES-sharvard-medium | Spanish | High | ~65MB |

Piper voices provide the highest quality but require downloading model files. See Piper TTS for more voices.
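
The Voice IDs in the table follow the `<locale>-<speaker>-<tier>` naming pattern, with an underscore in the locale (`en_US`), whereas `TTSVoiceInfo.language` uses hyphenated codes such as `'en-US'`. A small parser can bridge the two formats. This is a minimal sketch: `parsePiperVoiceId` and `PiperVoiceParts` are hypothetical names, not SDK exports, and the accepted pattern is inferred from the IDs listed above.

```typescript
interface PiperVoiceParts {
  language: string // hyphenated locale, e.g. 'en-US'
  speaker: string // e.g. 'lessac'
  tier: string // trailing model-size segment of the ID, e.g. 'medium'
}

// Hypothetical helper: split a Piper-style voice ID into its parts.
// Returns null when the ID does not match the assumed pattern.
function parsePiperVoiceId(voiceId: string): PiperVoiceParts | null {
  const match = /^([a-z]{2,3})_([A-Z]{2})-(.+)-([a-z_]+)$/.exec(voiceId)
  if (match === null) return null
  const [, lang = '', region = '', speaker = '', tier = ''] = match
  return { language: `${lang}-${region}`, speaker, tier }
}

const parsedLessac = parsePiperVoiceId('en_US-lessac-medium')
console.log(parsedLessac)
```

The model `id` passed to `ONNX.addModel` (`'piper-en-lessac'` in the example above) is chosen by the app and does not follow this pattern, so the parser returns `null` for it.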

Using System Voices

System voices use the platform’s built-in TTS engine and don’t require downloads.

iOS (AVSpeechSynthesizer)

// Use system TTS (no model loading needed)
await RunAnywhere.speak('Hello from iOS!', {
  voice: 'com.apple.ttsbundle.Samantha-compact',
  rate: 1.0,
  pitch: 1.0,
})

// List iOS voices
const iosVoices = (await RunAnywhere.availableTTSVoices()).filter((v) => v.type === 'system')

Android (TextToSpeech)

// Use system TTS
await RunAnywhere.speak('Hello from Android!', {
  voice: 'en-us-x-tpf-local',
  rate: 1.0,
  pitch: 1.0,
})

// List Android voices
const androidVoices = (await RunAnywhere.availableTTSVoices()).filter((v) => v.type === 'system')
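
Because system voice identifiers differ between iOS and Android, cross-platform code needs to pick the identifier per platform. The following sketch builds the options object used by `speak` in the examples above. `systemVoiceOptions` is a hypothetical helper, the two voice IDs are simply the ones from the examples and may not be installed on every device, and omitting `voice` on other platforms assumes the engine then falls back to its default voice.

```typescript
interface SystemSpeakOptions {
  voice?: string
  rate: number
  pitch: number
}

type SupportedOS = 'ios' | 'android'

// Voice IDs copied from the examples above; not guaranteed to exist on a device.
const DEFAULT_SYSTEM_VOICE: Record<SupportedOS, string> = {
  ios: 'com.apple.ttsbundle.Samantha-compact',
  android: 'en-us-x-tpf-local',
}

// Hypothetical helper: choose speak options for the current platform.
function systemVoiceOptions(os: string, rate = 1.0, pitch = 1.0): SystemSpeakOptions {
  if (os === 'ios' || os === 'android') {
    return { voice: DEFAULT_SYSTEM_VOICE[os], rate, pitch }
  }
  // Unknown platform: leave the voice unset (assumed to mean "engine default").
  return { rate, pitch }
}

console.log(systemVoiceOptions('ios'))
console.log(systemVoiceOptions('android', 0.9))
```

In a React Native app the `os` argument would typically be `Platform.OS` from `react-native`, and the result would be passed as the second argument to `RunAnywhere.speak`.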

Voice Selection Component

VoiceSelector.tsx
import React, { useState, useEffect } from 'react'
import { View, Text, FlatList, TouchableOpacity, StyleSheet } from 'react-native'
import { RunAnywhere, TTSVoiceInfo } from '@runanywhere/core'

export function VoiceSelector({ onSelect }: { onSelect: (voice: TTSVoiceInfo) => void }) {
  const [voices, setVoices] = useState<TTSVoiceInfo[]>([])
  const [selected, setSelected] = useState<string | null>(null)

  useEffect(() => {
    RunAnywhere.availableTTSVoices()
      .then(setVoices)
      .catch((error) => console.warn('Failed to load TTS voices:', error))
  }, [])

  const handleSelect = (voice: TTSVoiceInfo) => {
    setSelected(voice.id)
    onSelect(voice)
  }

  const renderVoice = ({ item }: { item: TTSVoiceInfo }) => (
    <TouchableOpacity
      style={[styles.voiceItem, selected === item.id && styles.selected]}
      onPress={() => handleSelect(item)}
      disabled={!item.isAvailable}
    >
      <Text style={styles.voiceName}>{item.name}</Text>
      <Text style={styles.voiceInfo}>
        {item.language} · {item.type} · {item.quality}
      </Text>
      {!item.isAvailable && <Text style={styles.downloadRequired}>Download required</Text>}
    </TouchableOpacity>
  )

  return (
    <FlatList
      data={voices}
      renderItem={renderVoice}
      keyExtractor={(item) => item.id}
      style={styles.list}
    />
  )
}

const styles = StyleSheet.create({
  list: { flex: 1 },
  voiceItem: {
    padding: 16,
    borderBottomWidth: 1,
    borderBottomColor: '#eee',
  },
  selected: {
    backgroundColor: '#e3f2fd',
  },
  voiceName: {
    fontSize: 16,
    fontWeight: '600',
  },
  voiceInfo: {
    fontSize: 12,
    color: '#666',
    marginTop: 4,
  },
  downloadRequired: {
    fontSize: 12,
    color: '#f44336',
    marginTop: 4,
  },
})

Voice Comparison

Neural vs System Voices

// Compare voice quality
async function compareVoices(text: string) {
  // Neural voice (higher quality, requires download)
  const neural = await RunAnywhere.synthesize(text, {
    voice: 'piper-en-lessac',
  })

  // System voice (built-in, no download)
  await RunAnywhere.speak(text, {
    voice: 'com.apple.ttsbundle.Samantha-compact',
  })

  console.log('Neural duration:', neural.duration, 's')
}

| Aspect | Neural (Piper) | System |
| --- | --- | --- |
| Quality | Very High | Good |
| Naturalness | Very Natural | Natural |
| Download Required | Yes | No |
| Offline | Yes | Yes |
| Customization | Limited | Platform-dependent |
| Languages | Many | Many |

Loading Multiple Voices

// Load voice for use
await RunAnywhere.loadTTSVoice('piper-en-lessac')

// Switch voices without reloading the model
// ('voice1' and 'voice2' are placeholders for the IDs of voices you have loaded)
const result1 = await RunAnywhere.synthesize('Hello', { voice: 'voice1' })
const result2 = await RunAnywhere.synthesize('Hello', { voice: 'voice2' })

Best Practices

For production apps, pre-download neural voices during onboarding so they’re ready when needed.
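
One way to structure that onboarding step is to download voices one after another and record failures instead of aborting on the first error. The sketch below is an assumption about how an app might do this, not an SDK feature: `preDownloadVoices`, `DownloadFn`, and `PreDownloadReport` are hypothetical names, and the download function is passed in so the logic can be exercised without the SDK.

```typescript
// Any function that downloads a model by ID, e.g. a wrapper around
// RunAnywhere.downloadModel (the wrapper itself is up to the app).
type DownloadFn = (modelId: string) => Promise<unknown>

interface PreDownloadReport {
  downloaded: string[]
  failed: { id: string; reason: string }[]
}

// Hypothetical helper: download each voice in order, collecting failures.
async function preDownloadVoices(
  modelIds: string[],
  download: DownloadFn
): Promise<PreDownloadReport> {
  const report: PreDownloadReport = { downloaded: [], failed: [] }
  for (const id of modelIds) {
    try {
      await download(id)
      report.downloaded.push(id)
    } catch (error) {
      // Keep going so one bad download does not block the others.
      const reason = error instanceof Error ? error.message : String(error)
      report.failed.push({ id, reason })
    }
  }
  return report
}

// Possible usage during onboarding (requires the SDK import):
// const report = await preDownloadVoices(['piper-en-lessac'], (id) =>
//   RunAnywhere.downloadModel(id)
// )
```

Downloading sequentially keeps peak bandwidth and storage pressure predictable. The `failed` list gives the app something concrete to retry or surface to the user.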

Voice Selection Strategy

import { RunAnywhere, TTSVoiceInfo } from '@runanywhere/core'

async function getBestVoice(language: string): Promise<TTSVoiceInfo | null> {
  const voices = await RunAnywhere.availableTTSVoices()

  // Prefer neural voices if available
  const neuralVoice = voices.find(
    (v) => v.language.startsWith(language) && v.type === 'neural' && v.isAvailable
  )

  if (neuralVoice) return neuralVoice

  // Fall back to system voice
  const systemVoice = voices.find(
    (v) => v.language.startsWith(language) && v.type === 'system' && v.isAvailable
  )

  return systemVoice || null
}
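
The `startsWith` check above is case-sensitive and treats `'en_US'` and `'en-US'` as different languages. A variant that normalizes language tags and ranks candidates by score might look like the sketch below. It is a pure function over an already-fetched voice list; `pickBestVoice`, `VoiceLike`, and the score weights are illustrative assumptions. Unlike `getBestVoice`, it ranks an exact locale match above voice type, so a request for `'en-US'` yields a US system voice before a British neural one. Swapping the two weights restores the neural-first order.

```typescript
// Minimal shape needed for ranking; TTSVoiceInfo objects satisfy it.
interface VoiceLike {
  id: string
  language: string
  type: 'neural' | 'system'
  quality: 'low' | 'medium' | 'high'
  isAvailable: boolean
}

const QUALITY_RANK = { low: 0, medium: 1, high: 2 } as const

// 'en_US' and 'en-US' both become 'en-us'.
function normalizeTag(tag: string): string {
  return tag.replace(/_/g, '-').toLowerCase()
}

// Returns -1 for voices that cannot be used for the wanted language.
function scoreVoice(voice: VoiceLike, wanted: string): number {
  const lang = normalizeTag(voice.language)
  const exact = lang === wanted
  const sameBase = lang.split('-')[0] === wanted.split('-')[0]
  if (!voice.isAvailable || (!exact && !sameBase)) return -1
  // Illustrative weights: exact locale > neural > quality level.
  return (exact ? 100 : 0) + (voice.type === 'neural' ? 10 : 0) + QUALITY_RANK[voice.quality]
}

// Hypothetical helper: highest-scoring available voice, or null if none match.
function pickBestVoice<T extends VoiceLike>(voices: T[], language: string): T | null {
  const wanted = normalizeTag(language)
  let best: T | null = null
  let bestScore = -1
  for (const voice of voices) {
    const score = scoreVoice(voice, wanted)
    if (score > bestScore) {
      best = voice
      bestScore = score
    }
  }
  return best
}

// Made-up candidates for illustration.
const candidateVoices: VoiceLike[] = [
  { id: 'sys-en-us', language: 'en-US', type: 'system', quality: 'high', isAvailable: true },
  { id: 'piper-en-gb', language: 'en_GB', type: 'neural', quality: 'medium', isAvailable: true },
  { id: 'piper-en-us', language: 'en-US', type: 'neural', quality: 'high', isAvailable: false },
]

console.log(pickBestVoice(candidateVoices, 'en-US')?.id)
console.log(pickBestVoice(candidateVoices, 'en')?.id)
```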