Stream audio for real-time transcription as the user speaks. Ideal for voice assistants and live captioning.
Overview
Streaming STT processes audio in chunks and emits partial transcriptions that update as more audio arrives. Users see their words appear in real time, which makes the interface feel responsive.
Basic Usage
// Ensure the STT model is loaded
await RunAnywhere.loadSTTModel('sherpa-onnx-whisper-tiny.en');
// Stream transcription results from audio chunks
await for (final result in RunAnywhere.transcribeStream(audioStream)) {
  print('Partial: ${result.text}'); // result.isFinal is true on the last segment
}
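Partial results usually need to be kept apart from finalized text so the UI can show both. The following is a minimal sketch of that bookkeeping, not the SDK's own implementation. `SttResult` and `TranscriptAccumulator` are hypothetical names; the only fields taken from the example above are `text` and `isFinal`, and the sketch assumes each partial replaces the previous partial for the current segment.

```dart
/// Hypothetical stand-in for the streamed result; only `text` and
/// `isFinal` are taken from the example above.
class SttResult {
  const SttResult(this.text, {this.isFinal = false});
  final String text;
  final bool isFinal;
}

/// Keeps finalized segments separate from the in-progress partial.
class TranscriptAccumulator {
  final List<String> _committed = [];
  String _partial = '';

  void add(SttResult result) {
    if (result.isFinal) {
      // A final result closes the current segment.
      final text = result.text.trim();
      if (text.isNotEmpty) _committed.add(text);
      _partial = '';
    } else {
      // Assumption: a new partial replaces the previous one.
      _partial = result.text;
    }
  }

  /// Finalized segments only.
  String get committed => _committed.join(' ');

  /// Finalized segments followed by the live partial, for display.
  String get display => [
        ..._committed,
        if (_partial.trim().isNotEmpty) _partial.trim(),
      ].join(' ');
}
```

Inside the `await for` loop you would call `accumulator.add(...)` with each result and render `accumulator.display`.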
With Voice Session
The easiest way to use streaming STT is through the Voice Agent, which handles VAD and audio capture automatically:
final session = await RunAnywhere.startVoiceSession();
session.events.listen((event) {
  if (event is VoiceSessionTranscribed) {
    print('Transcription: ${event.text}');
  }
});
See Voice Agent for the complete voice pipeline.
import 'dart:async';
import 'dart:typed_data';

import 'package:flutter/material.dart';
// Assumption: AudioRecorder, RecordConfig, and AudioEncoder come from the `record` package.
import 'package:record/record.dart';

class LiveTranscriptionWidget extends StatefulWidget {
  const LiveTranscriptionWidget({super.key});

  @override
  State<LiveTranscriptionWidget> createState() => _LiveTranscriptionWidgetState();
}

class _LiveTranscriptionWidgetState extends State<LiveTranscriptionWidget> {
  final _recorder = AudioRecorder();
  String _partialText = '';
  String _finalText = '';
  bool _isListening = false;
  StreamSubscription<Uint8List>? _audioSubscription;

  Future<void> _startListening() async {
    if (!await _recorder.hasPermission()) return;
    if (!mounted) return; // The widget may have been removed while awaiting permission
    setState(() {
      _isListening = true;
      _partialText = '';
    });
    // Start recording as a stream of 16 kHz mono 16-bit PCM chunks
    final stream = await _recorder.startStream(
      const RecordConfig(
        encoder: AudioEncoder.pcm16bits,
        sampleRate: 16000,
        numChannels: 1,
      ),
    );
    // Process audio chunks
    _audioSubscription = stream.listen((chunk) async {
      // This would use your streaming transcription implementation.
      // The exact API depends on how you've set up streaming.
    });
  }

  Future<void> _stopListening() async {
    await _audioSubscription?.cancel();
    await _recorder.stop();
    if (!mounted) return;
    setState(() {
      _isListening = false;
      _finalText = _partialText;
      _partialText = '';
    });
  }

  @override
  Widget build(BuildContext context) {
    return Column(
      children: [
        // Show the partial transcription with a typing indicator
        Container(
          padding: const EdgeInsets.all(16),
          child: Text(
            _isListening ? '$_partialText |' : _finalText,
            style: TextStyle(
              fontSize: 18,
              color: _isListening ? Colors.grey : Colors.black,
            ),
          ),
        ),
        // Recording button
        IconButton(
          icon: Icon(_isListening ? Icons.stop : Icons.mic),
          iconSize: 48,
          color: _isListening ? Colors.red : Colors.blue,
          onPressed: _isListening ? _stopListening : _startListening,
        ),
      ],
    );
  }

  @override
  void dispose() {
    _audioSubscription?.cancel();
    _recorder.dispose();
    super.dispose();
  }
}
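The chunk handler in the widget above is left open because the wiring depends on your setup. One possible shape, sketched here under assumptions, is to hand the whole audio stream to a transcriber function and listen to its output instead of handling chunks one by one. `StreamingTranscriber` and `pipeAudioToTranscriber` are hypothetical names, and the sketch assumes the transcriber accepts a `Stream<Uint8List>` and yields text; check the actual `transcribeStream` signature before adapting it.

```dart
import 'dart:async';
import 'dart:typed_data';

/// Hypothetical signature standing in for a streaming transcriber.
/// The real SDK call may take and return different types.
typedef StreamingTranscriber = Stream<String> Function(Stream<Uint8List> audio);

/// Pipes [audio] through [transcribe] and reports each partial text.
/// Returns the subscription so the caller can cancel it when recording stops.
StreamSubscription<String> pipeAudioToTranscriber({
  required Stream<Uint8List> audio,
  required StreamingTranscriber transcribe,
  required void Function(String partial) onPartial,
  void Function(Object error)? onError,
}) {
  return transcribe(audio).listen(
    onPartial,
    onError: (Object error, StackTrace _) => onError?.call(error),
  );
}
```

In a widget, `onPartial` would typically call `setState` to update the partial text, and the returned subscription would be cancelled in the stop handler and in `dispose`.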
Tips for Streaming STT
Accumulate audio in buffers of 100-500 ms to balance accuracy against latency: shorter buffers update sooner, while longer ones give the model more context.
Use VAD (Voice Activity Detection) to detect end of speech and finalize transcriptions.
Handle network interruptions and audio glitches gracefully. Consider retrying failed chunks.
Show a visual indicator (waveform, pulsing dot) to confirm audio is being captured.
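The buffering tip above can be sketched as a small helper that regroups recorder chunks of any size into fixed-duration buffers. This is an illustrative sketch, not part of the SDK: `PcmChunker` is a hypothetical name, and the defaults assume the 16 kHz, mono, 16-bit PCM format used in the recording example, where 200 ms of audio is 3200 frames or 6400 bytes.

```dart
import 'dart:typed_data';

/// Regroups arbitrary-sized PCM chunks into fixed-duration buffers.
/// Hypothetical helper; defaults match 16 kHz mono 16-bit PCM.
class PcmChunker {
  PcmChunker({
    this.sampleRate = 16000,
    this.bytesPerSample = 2,
    this.channels = 1,
    this.chunkMs = 200,
  });

  final int sampleRate;
  final int bytesPerSample;
  final int channels;
  final int chunkMs;

  Uint8List _pending = Uint8List(0);

  /// Bytes per output buffer, computed in whole frames so that a
  /// buffer never splits a sample.
  int get chunkBytes =>
      (sampleRate * chunkMs ~/ 1000) * bytesPerSample * channels;

  /// Adds [input] and returns every complete buffer now available.
  List<Uint8List> add(Uint8List input) {
    // Join leftover bytes from the previous call with the new input.
    final all = Uint8List(_pending.length + input.length)
      ..setAll(0, _pending)
      ..setAll(_pending.length, input);
    final out = <Uint8List>[];
    var offset = 0;
    while (all.length - offset >= chunkBytes) {
      out.add(all.sublist(offset, offset + chunkBytes));
      offset += chunkBytes;
    }
    _pending = all.sublist(offset); // Keep the remainder for the next call
    return out;
  }

  /// Returns leftover audio shorter than one buffer and clears it.
  /// Call this when end of speech is detected.
  Uint8List flush() {
    final rest = _pending;
    _pending = Uint8List(0);
    return rest;
  }
}
```

Each buffer returned by `add` can be forwarded to the transcriber, and `flush` hands over the tail once VAD reports that the speaker has stopped.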
See Also
transcribe(): Batch transcription
Voice Agent: Complete voice pipeline