Configure STT behavior for different use cases and languages.

Model Selection

Choose the right model based on your needs:
| Model | Size | Languages | Quality | Speed | Use Case |
|---|---|---|---|---|---|
| whisper-tiny.en | ~75MB | English | Good | Fastest | Quick commands |
| whisper-base.en | ~150MB | English | Better | Fast | General use |
| whisper-small.en | ~250MB | English | Best | Medium | Accuracy-critical |
| whisper-tiny | ~75MB | Multi | Good | Fast | Multilingual apps |

Register Multiple Models

// English-optimized for primary use
Onnx.addModel(
  id: 'whisper-tiny-en',
  name: 'Whisper Tiny English',
  url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-models-v1/sherpa-onnx-whisper-tiny.en.tar.gz',
  modality: ModelCategory.speechRecognition,
);

// Multilingual for international users
Onnx.addModel(
  id: 'whisper-tiny-multi',
  name: 'Whisper Tiny Multilingual',
  url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-models-v1/sherpa-onnx-whisper-tiny.tar.gz',
  modality: ModelCategory.speechRecognition,
);

Switch Models at Runtime

// Load English model by default
await RunAnywhere.loadSTTModel('whisper-tiny-en');

// Switch to multilingual when needed
Future<void> switchToMultilingual() async {
  await RunAnywhere.unloadSTTModel();
  await RunAnywhere.loadSTTModel('whisper-tiny-multi');
}
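
One way to decide when to switch is to map the user's language to one of the two model IDs registered above. The helper below is a minimal sketch: `sttModelIdForLanguage` is a hypothetical name, not part of the SDK, and it assumes the language arrives as a tag such as `en-US` or `de`.

```dart
/// Model IDs as registered in "Register Multiple Models".
const englishModelId = 'whisper-tiny-en';
const multilingualModelId = 'whisper-tiny-multi';

/// Hypothetical helper (not an SDK API): returns the English-only model
/// for English language tags and the multilingual model for everything else.
String sttModelIdForLanguage(String languageTag) {
  // Keep only the primary subtag, e.g. 'en' from 'en-US' or 'en_GB'.
  final primary = languageTag.split(RegExp('[-_]')).first.toLowerCase();
  return primary == 'en' ? englishModelId : multilingualModelId;
}
```

The returned ID can then be passed to `RunAnywhere.loadSTTModel(...)`, after `RunAnywhere.unloadSTTModel()` if a different model is already loaded.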

Memory Management

Unload the STT model when it is not needed to free memory:
// Check if loaded
if (RunAnywhere.isSTTModelLoaded) {
  print('Current STT model: ${RunAnywhere.currentSTTModelId}');
}

// Unload to free memory
await RunAnywhere.unloadSTTModel();
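
Unloading can also be tied to inactivity. The sketch below is one possible approach, not an SDK feature: `IdleUnloader` is a hypothetical class that runs a callback once no activity has been reported for a given timeout. The callback is left generic, so the block does not depend on the SDK.

```dart
import 'dart:async';

/// Hypothetical helper (not part of the SDK): invokes [onIdle] once
/// [timeout] has elapsed since the last call to [markActivity].
class IdleUnloader {
  IdleUnloader({required this.timeout, required this.onIdle});

  final Duration timeout;
  final Future<void> Function() onIdle;
  Timer? _timer;

  /// Call after every transcription to restart the idle countdown.
  void markActivity() {
    _timer?.cancel();
    _timer = Timer(timeout, () => unawaited(onIdle()));
  }

  /// Cancels any pending countdown, e.g. when the owning widget is disposed.
  void dispose() {
    _timer?.cancel();
    _timer = null;
  }
}
```

In an app, `onIdle` could be a closure that checks `RunAnywhere.isSTTModelLoaded` and then awaits `RunAnywhere.unloadSTTModel()`. The timeout is for you to choose, since the right value depends on how often users dictate.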

Audio Preprocessing Tips

If your audio isn’t 16kHz, convert it before transcription:
// Example: Convert 44.1kHz to 16kHz
// Use a package like 'flutter_sound' for resampling
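
If you already hold raw PCM16 samples in memory, the conversion can be sketched in plain Dart. The functions below are illustrative and not part of the SDK or of any package: `stereoToMono` averages interleaved left/right pairs, and `resampleLinear` uses simple linear interpolation. Linear interpolation applies no low-pass filter, so some aliasing is expected when downsampling. A dedicated resampler will usually sound better.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// Downmixes interleaved stereo PCM16 to mono by averaging each L/R pair.
Int16List stereoToMono(Int16List interleaved) {
  final mono = Int16List(interleaved.length ~/ 2);
  for (var i = 0; i < mono.length; i++) {
    mono[i] = (interleaved[2 * i] + interleaved[2 * i + 1]) ~/ 2;
  }
  return mono;
}

/// Resamples mono PCM16 from [fromRate] to [toRate] by linear interpolation.
/// Illustrative only: no anti-aliasing filter is applied.
Int16List resampleLinear(Int16List input, int fromRate, int toRate) {
  if (input.isEmpty || fromRate == toRate) return Int16List.fromList(input);
  final outLength = (input.length * toRate / fromRate).floor();
  final out = Int16List(outLength);
  final step = fromRate / toRate; // input samples consumed per output sample
  for (var i = 0; i < outLength; i++) {
    final pos = i * step;
    final i0 = math.min(pos.floor(), input.length - 1);
    final i1 = math.min(i0 + 1, input.length - 1);
    final frac = pos - i0;
    out[i] = (input[i0] + (input[i1] - input[i0]) * frac).round();
  }
  return out;
}
```

For 44.1kHz stereo input, the call would look like `resampleLinear(stereoToMono(samples), 44100, 16000)`.
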
For noisy environments, consider preprocessing audio:
- Apply a high-pass filter to remove low-frequency noise
- Normalize audio levels
- Remove silence at beginning/end
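
The three preprocessing steps above can be sketched on mono PCM16 samples as follows. Everything here is an assumption chosen for illustration rather than an SDK recommendation: the function names, the 80 Hz cutoff, the 0.9 target peak, and the silence threshold of 500. The high-pass filter is a basic first-order (one-pole) design.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// First-order high-pass filter; attenuates content below [cutoffHz].
Int16List highPass(Int16List samples,
    {double cutoffHz = 80, int sampleRate = 16000}) {
  if (samples.isEmpty) return Int16List(0);
  final rc = 1 / (2 * math.pi * cutoffHz);
  final alpha = rc / (rc + 1 / sampleRate);
  final out = Int16List(samples.length);
  var prevIn = samples[0].toDouble();
  var prevOut = prevIn;
  out[0] = samples[0];
  for (var i = 1; i < samples.length; i++) {
    final x = samples[i].toDouble();
    prevOut = alpha * (prevOut + x - prevIn);
    prevIn = x;
    out[i] = prevOut.round().clamp(-32768, 32767).toInt();
  }
  return out;
}

/// Scales samples so the loudest peak sits at [targetPeak] of full scale.
Int16List normalizePeak(Int16List samples, {double targetPeak = 0.9}) {
  var peak = 0;
  for (final s in samples) {
    peak = math.max(peak, s.abs());
  }
  if (peak == 0) return Int16List.fromList(samples);
  final gain = targetPeak * 32767 / peak;
  final out = Int16List(samples.length);
  for (var i = 0; i < samples.length; i++) {
    out[i] = (samples[i] * gain).round().clamp(-32768, 32767).toInt();
  }
  return out;
}

/// Drops leading and trailing samples whose magnitude is <= [threshold].
Int16List trimSilence(Int16List samples, {int threshold = 500}) {
  var start = 0;
  while (start < samples.length && samples[start].abs() <= threshold) {
    start++;
  }
  var end = samples.length;
  while (end > start && samples[end - 1].abs() <= threshold) {
    end--;
  }
  return samples.sublist(start, end);
}
```

A typical order is filter, then trim, then normalize, so that low-frequency rumble does not influence the silence threshold or the gain.
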
Always ensure correct format:
- PCM16 (16-bit signed integer)
- 16,000 Hz sample rate
- Mono (single channel)

Error Handling

try {
  final text = await RunAnywhere.transcribe(audioBytes);
  print('Transcribed: $text');
} on SDKError catch (e) {
  switch (e.code) {
    case SDKErrorCode.sttNotAvailable:
      print('STT not available. Load an STT model first.');
      break;
    case SDKErrorCode.componentNotReady:
      print('STT model not loaded.');
      break;
    default:
      print('STT error: ${e.message}');
  }
}

Best Practices

Start with the smallest model that meets your accuracy needs. You can always upgrade later if needed.
  1. Preload during idle time: download and load the STT model before the user needs it
  2. Use English-specific models: they’re smaller and more accurate for English
  3. Handle empty audio: check the audio length before transcribing
  4. Provide feedback: show transcription progress to users
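
The empty-audio check in item 3 can be sketched from the byte length alone, since the expected format is 16-bit mono at 16,000 Hz. The helper names and the 300 ms minimum below are assumptions for illustration, as is the use of `Uint8List` for the audio bytes. The SDK may expect a different type.

```dart
import 'dart:typed_data';

const sampleRateHz = 16000; // expected sample rate
const bytesPerSample = 2; // PCM16, mono

/// Duration of raw PCM16 mono audio, derived from its byte length.
Duration pcm16Duration(Uint8List audioBytes) {
  final samples = audioBytes.length ~/ bytesPerSample;
  return Duration(microseconds: samples * 1000000 ~/ sampleRateHz);
}

/// Hypothetical guard: true when the clip is at least [minimum] long.
/// The 300 ms default is an arbitrary starting point, not an SDK limit.
bool isLongEnough(Uint8List audioBytes,
        {Duration minimum = const Duration(milliseconds: 300)}) =>
    pcm16Duration(audioBytes) >= minimum;
```

Calling `isLongEnough(audioBytes)` before `RunAnywhere.transcribe(audioBytes)` lets the app skip the call and prompt the user to speak again when the recording is effectively empty.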

See Also