Overview
Streaming STT processes audio in chunks, providing partial transcriptions that update as more audio arrives. This creates a responsive experience where users see their words appear in real time.
Basic Usage
With Voice Session
The easiest way to use streaming STT is through the Voice Agent, which handles VAD and audio capture automatically.
Real-Time Transcription Widget
Tips for Streaming STT
Buffer Management
Accumulate audio in buffers of 100-500 ms to balance accuracy against latency: shorter buffers reduce delay, while longer buffers give the recognizer more context per chunk.
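One way to apply this tip is a small accumulator that collects incoming audio and only releases fixed-duration chunks. The sketch below is illustrative, not part of any SDK: `ChunkBuffer` is a hypothetical helper, and it assumes 16 kHz, mono, 16-bit PCM input.

```python
from typing import List


class ChunkBuffer:
    """Accumulates raw PCM bytes and emits fixed-duration chunks.

    Hypothetical helper; assumes 16 kHz mono 16-bit PCM unless overridden.
    """

    def __init__(self, chunk_ms: int = 200, sample_rate: int = 16000,
                 bytes_per_sample: int = 2) -> None:
        if not 100 <= chunk_ms <= 500:
            raise ValueError("chunk_ms should be within 100-500 ms")
        # Number of bytes that make up one chunk of the requested duration.
        self.chunk_bytes = sample_rate * bytes_per_sample * chunk_ms // 1000
        self._pending = bytearray()

    def push(self, data: bytes) -> List[bytes]:
        """Add captured audio and return every complete chunk now available."""
        self._pending.extend(data)
        chunks = []
        while len(self._pending) >= self.chunk_bytes:
            chunks.append(bytes(self._pending[:self.chunk_bytes]))
            del self._pending[:self.chunk_bytes]
        return chunks

    def flush(self) -> bytes:
        """Return whatever is left, e.g. when the user stops speaking."""
        rest = bytes(self._pending)
        self._pending.clear()
        return rest
```

Each chunk returned by `push` can be handed to the recognizer as it becomes available; calling `flush` at the end of an utterance avoids dropping the trailing audio.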
Silence Detection
Use voice activity detection (VAD) to detect the end of speech and finalize the transcription.
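To make the idea concrete, here is a minimal end-of-speech detector based on a plain energy threshold. It is a stand-in for a real VAD, not the method used by the Voice Agent: `EndOfSpeechDetector`, the threshold of 500, and the 600 ms hangover are all assumptions chosen for illustration, and frames are assumed to be 16-bit PCM in the machine's native byte order.

```python
import math
from array import array


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a 16-bit PCM frame (native byte order)."""
    samples = array("h")
    samples.frombytes(frame)  # frame length must be a multiple of 2 bytes
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class EndOfSpeechDetector:
    """Signals end of speech after a run of quiet frames that follows speech."""

    def __init__(self, threshold: float = 500.0, frame_ms: int = 20,
                 hangover_ms: int = 600) -> None:
        self.threshold = threshold
        # How many consecutive quiet frames count as "the user stopped".
        self.frames_needed = max(1, hangover_ms // frame_ms)
        self._silent_frames = 0
        self._speaking = False

    def update(self, frame: bytes) -> bool:
        """Feed one frame; returns True exactly once per finished utterance."""
        if rms(frame) >= self.threshold:
            self._speaking = True
            self._silent_frames = 0
            return False
        if not self._speaking:
            return False  # silence before any speech is ignored
        self._silent_frames += 1
        if self._silent_frames >= self.frames_needed:
            self._speaking = False
            self._silent_frames = 0
            return True
        return False
```

When `update` returns `True`, the application can treat the latest partial result as final. A fixed threshold is sensitive to background noise, so a production setup would more likely rely on a trained VAD.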
Error Recovery
Handle network interruptions and audio glitches gracefully. Consider retrying failed chunks.
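Retrying a failed chunk could look like the sketch below, which wraps an upload call in exponential backoff. The `send` callable is a placeholder for whatever function actually transmits a chunk in your application; `send_with_retry` and its defaults are assumptions, not part of a documented API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def send_with_retry(send: Callable[[bytes], T], chunk: bytes,
                    max_attempts: int = 3, base_delay: float = 0.2,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call send(chunk), retrying transient network errors with backoff.

    `send` is a hypothetical, caller-supplied function. The delay doubles
    after each failed attempt: base_delay, 2 * base_delay, and so on.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return send(chunk)
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts:
                raise  # give up and let the caller surface the error
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")
```

Keeping `max_attempts` small matters for streaming: a chunk that arrives several seconds late is often less useful than skipping it and continuing with fresh audio.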
UI Feedback
Show a visual indicator (waveform, pulsing dot) to confirm audio is being captured.
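An indicator like this needs a number to drive it. The following sketch derives a smoothed 0-1 input level from captured frames, which a UI layer could map to the size of a pulsing dot or the height of a waveform bar. `LevelMeter` and the smoothing factor are hypothetical, and the input is assumed to be 16-bit PCM in native byte order.

```python
import math
from array import array


def normalized_level(frame: bytes) -> float:
    """RMS amplitude of a 16-bit PCM frame, scaled to the range 0.0-1.0."""
    samples = array("h")
    samples.frombytes(frame)  # frame length must be a multiple of 2 bytes
    if not samples:
        return 0.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return min(rms / 32768.0, 1.0)


class LevelMeter:
    """Exponentially smoothed level, so the indicator does not flicker."""

    def __init__(self, smoothing: float = 0.8) -> None:
        self.smoothing = smoothing
        self.level = 0.0

    def update(self, frame: bytes) -> float:
        """Blend the newest frame into the running level and return it."""
        current = normalized_level(frame)
        self.level = self.smoothing * self.level + (1 - self.smoothing) * current
        return self.level
```

A higher `smoothing` value gives a calmer animation at the cost of reacting more slowly when the user starts or stops speaking.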