Streaming STT
Stream audio for real-time transcription as the user speaks. Ideal for voice assistants and live captioning.

Overview

Streaming STT processes audio in chunks, providing partial transcriptions that update as more audio arrives. This creates a responsive experience where users see their words appear in real time.

Basic Usage

// Ensure STT model is loaded
await RunAnywhere.loadSTTModel('sherpa-onnx-whisper-tiny.en');

// Create audio stream from microphone
final audioStream = microphoneManager.audioStream;

// Process audio chunks
await for (final audioChunk in audioStream) {
  // Feed audio to STT (returns partial transcription)
  final partial = await RunAnywhere.transcribeChunk(audioChunk);
  print('Partial: $partial');
}

With Voice Session

The easiest way to use streaming STT is through the Voice Agent, which handles VAD and audio capture automatically:

final session = await RunAnywhere.startVoiceSession();

session.events.listen((event) {
  if (event is VoiceSessionTranscribed) {
    print('Transcription: ${event.text}');
  }
});

See Voice Agent for the complete voice pipeline.

Real-Time Transcription Widget

import 'dart:async';

import 'package:flutter/material.dart';
// AudioRecorder and RecordConfig are assumed to come from the `record` package.
import 'package:record/record.dart';

class LiveTranscriptionWidget extends StatefulWidget {
  @override
  _LiveTranscriptionWidgetState createState() => _LiveTranscriptionWidgetState();
}

class _LiveTranscriptionWidgetState extends State<LiveTranscriptionWidget> {
  final _recorder = AudioRecorder();
  String _partialText = '';
  String _finalText = '';
  bool _isListening = false;
  StreamSubscription? _audioSubscription;

  Future<void> _startListening() async {
    if (!await _recorder.hasPermission()) return;

    setState(() {
      _isListening = true;
      _partialText = '';
    });

    // Start recording with streaming
    final stream = await _recorder.startStream(
      const RecordConfig(
        encoder: AudioEncoder.pcm16bits,
        sampleRate: 16000,
        numChannels: 1,
      ),
    );

    // Process audio chunks as they arrive, using the same
    // transcribeChunk API shown in Basic Usage above
    _audioSubscription = stream.listen((chunk) async {
      final partial = await RunAnywhere.transcribeChunk(chunk);
      if (!mounted) return;
      setState(() => _partialText = partial);
    });
  }

  Future<void> _stopListening() async {
    await _audioSubscription?.cancel();
    await _recorder.stop();

    setState(() {
      _isListening = false;
      _finalText = _partialText;
      _partialText = '';
    });
  }

  @override
  Widget build(BuildContext context) {
    return Column(
      children: [
        // Show partial transcription with typing indicator
        Container(
          padding: EdgeInsets.all(16),
          child: Text(
            _isListening ? '$_partialText|' : _finalText,
            style: TextStyle(
              fontSize: 18,
              color: _isListening ? Colors.grey : Colors.black,
            ),
          ),
        ),

        // Recording button
        IconButton(
          icon: Icon(_isListening ? Icons.stop : Icons.mic),
          iconSize: 48,
          color: _isListening ? Colors.red : Colors.blue,
          onPressed: _isListening ? _stopListening : _startListening,
        ),
      ],
    );
  }

  @override
  void dispose() {
    _audioSubscription?.cancel();
    _recorder.dispose();
    super.dispose();
  }
}
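
To try the widget, mount it in a minimal app, loading the model first as in Basic Usage:

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();
  // The STT model must be loaded before transcription starts.
  await RunAnywhere.loadSTTModel('sherpa-onnx-whisper-tiny.en');
  runApp(
    MaterialApp(
      home: Scaffold(
        body: Center(child: LiveTranscriptionWidget()),
      ),
    ),
  );
}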

Tips for Streaming STT

Accumulate audio in buffers of 100-500 ms for the best accuracy/latency tradeoff (see the buffering sketch after this list).
Use VAD (Voice Activity Detection) to detect the end of speech and finalize transcriptions; the sketch below uses a crude energy threshold as a stand-in.
Handle network interruptions and audio glitches gracefully, for example by retrying failed chunks (second sketch below).
Show a visual indicator (waveform, pulsing dot) to confirm audio is being captured (third sketch below).
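
A minimal sketch of the first two tips, assuming the transcribeChunk API from Basic Usage accepts raw PCM bytes, and assuming 16 kHz mono PCM16 audio (so 300 ms ≈ 9,600 bytes). The energy threshold is a crude stand-in for a real VAD:

import 'dart:async';
import 'dart:math' as math;
import 'dart:typed_data';

/// Buffers incoming PCM16 chunks into ~300 ms windows before transcribing,
/// and counts consecutive quiet windows to approximate end-of-speech.
class ChunkedTranscriber {
  // 16,000 samples/s * 2 bytes/sample * 0.3 s = 9,600 bytes per window.
  static const int _windowBytes = 9600;
  static const double _silenceRms = 500; // tune for your mic and environment
  static const int _quietWindowsToFinalize = 3; // ~0.9 s of silence

  final BytesBuilder _buffer = BytesBuilder();
  int _quietWindows = 0;

  final void Function(String partial) onPartial;
  final void Function() onEndOfSpeech;

  ChunkedTranscriber({required this.onPartial, required this.onEndOfSpeech});

  Future<void> addChunk(Uint8List chunk) async {
    _buffer.add(chunk);
    if (_buffer.length < _windowBytes) return;

    final window = _buffer.takeBytes(); // drains the accumulated audio
    final partial = await RunAnywhere.transcribeChunk(window);
    onPartial(partial);

    // Crude end-of-speech detection: several quiet windows in a row.
    if (_rms(window) < _silenceRms) {
      if (++_quietWindows >= _quietWindowsToFinalize) {
        _quietWindows = 0;
        onEndOfSpeech();
      }
    } else {
      _quietWindows = 0;
    }
  }

  // Root-mean-square amplitude of PCM16 samples (host byte order).
  static double _rms(Uint8List bytes) {
    final samples =
        Int16List.view(bytes.buffer, bytes.offsetInBytes, bytes.length ~/ 2);
    if (samples.isEmpty) return 0;
    var sumSquares = 0.0;
    for (final s in samples) {
      sumSquares += s * s;
    }
    return math.sqrt(sumSquares / samples.length);
  }
}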
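
For retrying failed chunks, a small wrapper, assuming transcribeChunk throws on transient failures:

/// Retries a flaky transcription call with a short linear backoff,
/// dropping the chunk rather than stalling the stream if all attempts fail.
Future<String?> transcribeWithRetry(Uint8List chunk, {int maxAttempts = 3}) async {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await RunAnywhere.transcribeChunk(chunk);
    } catch (_) {
      if (attempt == maxAttempts) return null; // give up on this chunk
      await Future.delayed(Duration(milliseconds: 100 * attempt));
    }
  }
  return null; // unreachable, but satisfies the analyzer
}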
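
And for the capture indicator, one way to build a pulsing dot with plain Flutter animation (no extra packages):

import 'package:flutter/material.dart';

/// A red dot that pulses while recording, confirming audio is being captured.
class PulsingDot extends StatefulWidget {
  const PulsingDot({super.key});

  @override
  State<PulsingDot> createState() => _PulsingDotState();
}

class _PulsingDotState extends State<PulsingDot>
    with SingleTickerProviderStateMixin {
  late final AnimationController _controller = AnimationController(
    vsync: this,
    duration: const Duration(milliseconds: 700),
  )..repeat(reverse: true); // fade out and back in while mounted

  @override
  Widget build(BuildContext context) {
    return FadeTransition(
      opacity: Tween(begin: 0.3, end: 1.0).animate(_controller),
      child: Container(
        width: 12,
        height: 12,
        decoration: const BoxDecoration(
          color: Colors.red,
          shape: BoxShape.circle,
        ),
      ),
    );
  }

  @override
  void dispose() {
    _controller.dispose();
    super.dispose();
  }
}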

See Also

Voice Agent – the complete voice pipeline, with VAD and audio capture handled for you.