STT Options

Configure speech-to-text (STT) behavior for different use cases and languages.

Model Selection

Choose the right model based on your needs:

Model	Size	Languages	Quality	Speed	Use Case
`whisper-tiny.en`	~75MB	English	Good	Fastest	Quick commands
`whisper-base.en`	~150MB	English	Better	Fast	General use
`whisper-small.en`	~250MB	English	Best	Medium	Accuracy-critical
`whisper-tiny`	~75MB	Multi	Good	Fast	Multilingual apps

Register Multiple Models

// English-optimized for primary use
Onnx.addModel(
  id: 'whisper-tiny-en',
  name: 'Whisper Tiny English',
  url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-models-v1/sherpa-onnx-whisper-tiny.en.tar.gz',
  modality: ModelCategory.speechRecognition,
);

// Multilingual for international users
Onnx.addModel(
  id: 'whisper-tiny-multi',
  name: 'Whisper Tiny Multilingual',
  url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-models-v1/sherpa-onnx-whisper-tiny.tar.gz',
  modality: ModelCategory.speechRecognition,
);

Switch Models at Runtime

// Load English model by default
await RunAnywhere.loadSTTModel('whisper-tiny-en');

// Switch to multilingual when needed
Future<void> switchToMultilingual() async {
  await RunAnywhere.unloadSTTModel();
  await RunAnywhere.loadSTTModel('whisper-tiny-multi');
}
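One possible way to decide when to switch is to key the choice on the user's language tag. The sketch below is an illustration, not part of the SDK: `sttModelIdFor` is a hypothetical helper, and it assumes the two model IDs registered in the previous section.

```dart
/// Model IDs registered in "Register Multiple Models".
const englishModelId = 'whisper-tiny-en';
const multilingualModelId = 'whisper-tiny-multi';

/// Hypothetical helper: maps a language tag such as 'en-US' or 'de'
/// to the ID of the STT model that should handle it.
String sttModelIdFor(String languageTag) {
  // Keep only the primary subtag ('en' from 'en-US' or 'en_GB').
  final primary = languageTag.split(RegExp('[-_]')).first.toLowerCase();
  return primary == 'en' ? englishModelId : multilingualModelId;
}
```

The returned ID can then be passed to `RunAnywhere.loadSTTModel` after unloading the current model, as `switchToMultilingual` does.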

Memory Management

Unload the STT model when it is not needed to free memory:

// Check if loaded
if (RunAnywhere.isSTTModelLoaded) {
  print('Current STT model: ${RunAnywhere.currentSTTModelId}');
}

// Unload to free memory
await RunAnywhere.unloadSTTModel();

Audio Preprocessing Tips

Sample Rate Conversion

If your audio isn’t 16 kHz, convert it before transcription:

// Example: Convert 44.1 kHz to 16 kHz
// Use a package like 'flutter_sound' for resampling
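If pulling in a package is not an option, a rough resampler can be written in plain Dart. The following is a minimal sketch that assumes mono PCM16 input; `resampleLinear` is a hypothetical helper, not an SDK function. It uses linear interpolation with no low-pass filter, so some aliasing is expected when downsampling, and a dedicated resampling library is likely the better choice for production audio.

```dart
import 'dart:typed_data';

/// Resamples mono PCM16 samples from [fromRate] to [toRate] using
/// linear interpolation. No anti-aliasing filter is applied.
Int16List resampleLinear(Int16List input, int fromRate, int toRate) {
  if (input.isEmpty || fromRate == toRate) return input;

  final outLength = (input.length * toRate / fromRate).floor();
  final output = Int16List(outLength);
  final step = fromRate / toRate; // input samples per output sample

  for (var i = 0; i < outLength; i++) {
    final pos = i * step;
    // Clamp indices so rounding error can never read past the end.
    var left = pos.floor();
    if (left > input.length - 1) left = input.length - 1;
    final right = left + 1 < input.length ? left + 1 : left;
    final frac = pos - left;
    output[i] = (input[left] * (1 - frac) + input[right] * frac).round();
  }
  return output;
}
```

For example, `resampleLinear(samples, 44100, 16000)` turns one second of 44.1 kHz audio into 16,000 samples.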

Noise Reduction

For noisy environments, consider preprocessing audio:

- Apply a high-pass filter to remove low-frequency noise
- Normalize audio levels
- Remove silence at beginning/end
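The three preprocessing steps could be sketched in plain Dart as follows. This is an illustration under stated assumptions, not SDK functionality: samples are assumed to be floats in the range -1.0 to 1.0, the helper names are hypothetical, and the 80 Hz cutoff, 0.9 peak target, and 0.01 silence threshold are placeholder values to tune for your audio.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// First-order high-pass filter that attenuates content below [cutoffHz].
Float64List highPass(Float64List x,
    {double cutoffHz = 80, int sampleRate = 16000}) {
  final y = Float64List(x.length);
  if (x.isEmpty) return y;
  final rc = 1 / (2 * math.pi * cutoffHz);
  final dt = 1 / sampleRate;
  final alpha = rc / (rc + dt);
  y[0] = x[0];
  for (var i = 1; i < x.length; i++) {
    y[i] = alpha * (y[i - 1] + x[i] - x[i - 1]);
  }
  return y;
}

/// Scales samples so that the loudest one reaches [targetPeak].
Float64List normalizePeak(Float64List x, {double targetPeak = 0.9}) {
  var peak = 0.0;
  for (final s in x) {
    peak = math.max(peak, s.abs());
  }
  if (peak == 0) return x; // pure silence, nothing to scale
  final gain = targetPeak / peak;
  return Float64List.fromList([for (final s in x) s * gain]);
}

/// Drops leading and trailing samples quieter than [threshold].
Float64List trimSilence(Float64List x, {double threshold = 0.01}) {
  var start = 0;
  var end = x.length;
  while (start < end && x[start].abs() < threshold) {
    start++;
  }
  while (end > start && x[end - 1].abs() < threshold) {
    end--;
  }
  return Float64List.sublistView(x, start, end);
}
```

A typical chain would be `trimSilence(normalizePeak(highPass(samples)))`, followed by conversion back to PCM16 before transcription.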

Audio Format

Always ensure correct format:

- PCM16 (16-bit signed integer)
- 16,000 Hz sample rate
- Mono (single channel)
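If your recorder produces stereo or floating-point audio, it needs converting to this format first. The helpers below are a hypothetical sketch, not SDK functions. Little-endian byte order is an assumption here, since the byte order is not stated on this page.

```dart
import 'dart:typed_data';

/// Averages interleaved stereo PCM16 samples (L, R, L, R, ...) into mono.
Int16List stereoToMono(Int16List interleaved) {
  final mono = Int16List(interleaved.length ~/ 2);
  for (var i = 0; i < mono.length; i++) {
    mono[i] = (interleaved[2 * i] + interleaved[2 * i + 1]) ~/ 2;
  }
  return mono;
}

/// Converts float samples in [-1.0, 1.0] to PCM16 bytes.
/// Assumption: little-endian byte order.
Uint8List floatToPcm16Bytes(List<double> samples) {
  final data = ByteData(samples.length * 2);
  for (var i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around.
    final clamped = samples[i].clamp(-1.0, 1.0);
    data.setInt16(i * 2, (clamped * 32767).round(), Endian.little);
  }
  return data.buffer.asUint8List();
}
```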

Error Handling

try {
  final text = await RunAnywhere.transcribe(audioBytes);
  print('Transcribed: $text');
} on SDKError catch (e) {
  switch (e.type) {
    case SDKErrorType.sttNotAvailable:
      print('STT not available. Load an STT model first.');
      break;
    case SDKErrorType.componentNotReady:
      print('STT model not loaded.');
      break;
    default:
      print('STT error: ${e.message}');
  }
}

Best Practices

Start with the smallest model that meets your accuracy needs. You can always upgrade later if needed.

- Preload during idle time: download and load the STT model before the user needs it
- Use English-specific models: they’re smaller and more accurate for English
- Handle empty audio: check the audio length before transcribing
- Provide feedback: show transcription progress to users
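The empty-audio check can be derived from the byte count alone, because the expected format fixes the sample rate and sample width. This is a minimal sketch: the helper names are hypothetical, and the 300 ms minimum is an assumed threshold, not a documented SDK limit.

```dart
import 'dart:typed_data';

const int sttSampleRate = 16000; // Hz, per the required audio format
const int bytesPerSample = 2; // PCM16, mono

/// Duration of a mono PCM16 buffer recorded at 16 kHz.
Duration pcm16Duration(Uint8List audioBytes) {
  final samples = audioBytes.length ~/ bytesPerSample;
  return Duration(microseconds: samples * 1000000 ~/ sttSampleRate);
}

/// Returns false for empty or very short clips that are not worth
/// sending to the model. The default minimum is an assumption.
bool isWorthTranscribing(
  Uint8List audioBytes, {
  Duration minDuration = const Duration(milliseconds: 300),
}) =>
    pcm16Duration(audioBytes) >= minDuration;
```

Calling `isWorthTranscribing(audioBytes)` before `RunAnywhere.transcribe(audioBytes)` skips clips that are too short to contain speech.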

See Also

transcribe(): Basic transcription
Voice Agent: Complete voice pipeline
