Stream TTS audio as it’s generated for faster time-to-first-audio, especially with longer text.
Overview
Streaming TTS starts playing audio before the entire synthesis is complete. This is particularly useful for:
Long text passages
Voice assistants responding in real-time
Reducing perceived latency
Basic Concept
// The Voice Agent pipeline handles streaming TTS automatically
final session = await RunAnywhere.startVoiceSession(
  config: VoiceSessionConfig(
    autoPlayTTS: true, // Automatically plays synthesized audio
  ),
);

session.events.listen((event) {
  if (event is VoiceSessionSpeaking) {
    print('Playing audio response...');
  }
});
Chunked Synthesis
For manual control, synthesize text in chunks:
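Future<void> speakInChunks(String longText) async {
  // Split after sentence-ending punctuation followed by whitespace
  final sentences = longText.split(RegExp(r'(?<=[.!?])\s+'));
  for (final sentence in sentences) {
    if (sentence.trim().isEmpty) continue; // Skip blank chunks
    final result = await RunAnywhere.synthesize(sentence);
    // playAudio is not defined on this page; supply your own playback helper
    await playAudio(result);
  }
}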
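If `playAudio` completes only when playback ends, the loop above leaves a gap between sentences while the next one is synthesized. One way to hide that gap is to start synthesizing the next sentence before playing the current one. The following is a minimal sketch of that idea, not the SDK's own implementation: `speakWithPrefetch` and its `synthesize` and `play` callbacks are hypothetical names that stand in for the synthesis call and your playback helper.

```dart
import 'dart:async';

/// Plays [sentences] in order while synthesizing the next one in the
/// background.
///
/// [synthesize] and [play] are hypothetical callbacks supplied by the caller;
/// they stand in for the SDK's synthesis call and the app's audio player.
Future<void> speakWithPrefetch<A>(
  List<String> sentences,
  Future<A> Function(String sentence) synthesize,
  Future<void> Function(A audio) play,
) async {
  if (sentences.isEmpty) return;

  // Kick off synthesis of the first sentence immediately.
  Future<A> pending = synthesize(sentences.first);

  for (var i = 0; i < sentences.length; i++) {
    final audio = await pending;

    // Start synthesizing the next sentence before playing the current one,
    // so that synthesis overlaps with playback.
    if (i + 1 < sentences.length) {
      pending = synthesize(sentences[i + 1]);
    }

    await play(audio);
  }
}
```

Only one sentence is prefetched at a time, which keeps memory use bounded. Error handling is omitted here; a production version would need to handle a failed prefetch and stop cleanly if playback is cancelled.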
With Voice Agent Pipeline
The Voice Agent pipeline handles streaming TTS automatically as part of each voice turn:
// Initialize all components
await RunAnywhere.loadSTTModel('sherpa-onnx-whisper-tiny.en');
await RunAnywhere.loadModel('smollm2-360m-q8_0');
await RunAnywhere.loadTTSVoice('vits-piper-en_US-lessac-medium');

// Start session with auto-play
final session = await RunAnywhere.startVoiceSession(
  config: VoiceSessionConfig(
    autoPlayTTS: true,
    continuousMode: true,
  ),
);

// The pipeline automatically:
// 1. Detects speech (VAD)
// 2. Transcribes audio (STT)
// 3. Generates response (LLM)
// 4. Synthesizes and plays audio (TTS)
session.events.listen((event) {
  switch (event) {
    case VoiceSessionTranscribed(:final text):
      print('User: $text');
    case VoiceSessionResponded(:final text):
      print('AI: $text');
    case VoiceSessionSpeaking():
      print('Playing response...');
    case VoiceSessionTurnCompleted():
      print('Ready for next turn');
    default:
      break;
  }
});
Latency Optimization Tips
Load the TTS voice during app startup or idle time, not when the user first needs it.
// In app initialization
await RunAnywhere.loadTTSVoice('vits-piper-en_US-lessac-medium');
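If several parts of the app might trigger the preload (startup code, an idle callback, the first voice request), it can help to make sure the load runs only once. The sketch below shows one way to do that by caching the in-flight future. `OncePreloader` is a hypothetical helper written for this page, not part of the RunAnywhere SDK.

```dart
import 'dart:async';

/// Runs an expensive async load at most once and shares the result.
///
/// Hypothetical helper, not part of the RunAnywhere SDK.
class OncePreloader {
  OncePreloader(this._load);

  final Future<void> Function() _load;
  Future<void>? _inFlight;

  /// Starts the load on the first call; later calls return the same future.
  Future<void> ensureLoaded() => _inFlight ??= _load();
}
```

For example, you could wrap the call above as `OncePreloader(() => RunAnywhere.loadTTSVoice('vits-piper-en_US-lessac-medium'))`, call `ensureLoaded()` without awaiting during startup, and await it again just before the first synthesis. Note that this sketch also caches a failed load, so add retry logic if the load can fail.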
Smaller voice models synthesize faster. Choose based on your quality/speed tradeoff.
For very long responses, synthesize and play sentence by sentence rather than waiting for the
complete response.
Start playback as soon as you have enough audio buffered (typically 100-200ms).
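To turn a buffer duration into a concrete threshold, convert milliseconds to bytes for your audio format. The sketch below assumes the audio arrives as raw 16-bit PCM, which this page does not confirm; the function names and the sample rates in the usage note are illustrative assumptions, so check the format your voice model actually produces.

```dart
/// Number of bytes of raw PCM audio needed to cover [bufferMs] milliseconds.
///
/// Assumes uncompressed PCM; the defaults describe 16-bit mono audio.
int pcmBytesForDuration({
  required int sampleRateHz,
  required int bufferMs,
  int channels = 1,
  int bytesPerSample = 2, // 16-bit PCM
}) {
  // Round up so the threshold never falls short of the requested duration.
  final frames = (sampleRateHz * bufferMs / 1000).ceil();
  return frames * channels * bytesPerSample;
}

/// True once enough audio has accumulated to begin playback.
bool readyToPlay(int bufferedBytes, int thresholdBytes) =>
    bufferedBytes >= thresholdBytes;
```

With an assumed 22,050 Hz mono stream, a 150 ms buffer works out to `pcmBytesForDuration(sampleRateHz: 22050, bufferMs: 150)`, which is 6616 bytes. At 16,000 Hz, a 100 ms buffer is 3200 bytes.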
See Also
synthesize(): Basic synthesis
Voice Agent: Complete voice pipeline