Model Selection
Choose the right model based on your needs:

| Model | Size | Languages | Quality | Speed | Use Case |
|---|---|---|---|---|---|
| whisper-tiny.en | ~75 MB | English | Good | Fastest | Quick commands |
| whisper-base.en | ~150 MB | English | Better | Fast | General use |
| whisper-small.en | ~250 MB | English | Best | Medium | Accuracy-critical |
| whisper-tiny | ~75 MB | Multi | Good | Fast | Multilingual apps |
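
The table can be turned into a small lookup so the choice is made in one place. This is a minimal sketch: `SttPriority` and `pickSttModel` are hypothetical names, not part of any SDK, and the returned strings are simply the model names from the table.

```dart
/// What the caller cares about most. Illustrative names only.
enum SttPriority { speed, balanced, accuracy }

// English-only rows of the table, keyed by priority.
const Map<SttPriority, String> _englishModels = {
  SttPriority.speed: 'whisper-tiny.en',
  SttPriority.balanced: 'whisper-base.en',
  SttPriority.accuracy: 'whisper-small.en',
};

/// Returns a model name from the table.
/// Non-English input maps to whisper-tiny, the only multilingual row.
String pickSttModel({
  required bool englishOnly,
  SttPriority priority = SttPriority.balanced,
}) {
  if (!englishOnly) return 'whisper-tiny';
  return _englishModels[priority]!;
}
```

For example, `pickSttModel(englishOnly: true, priority: SttPriority.speed)` selects the model listed for quick commands.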
Register Multiple Models
Switch Models at Runtime
Memory Management
Unload the STT model when it is not needed to free memory.
Audio Preprocessing Tips
Sample Rate Conversion
If your audio isn’t 16kHz, convert it before transcription:
```dart
// Example: Convert 44.1kHz to 16kHz
// Use a package like 'flutter_sound' for resampling
```
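
If pulling in a package is not an option, the conversion can be sketched in plain Dart. The function below is a minimal illustration, assuming mono PCM16 samples already decoded into an `Int16List`; `resampleLinear` is a hypothetical helper name. It uses linear interpolation with no low-pass filter, so downsampling may introduce some aliasing; a dedicated resampler will generally sound better.

```dart
import 'dart:typed_data';

/// Resamples mono PCM16 samples from [fromRate] to [toRate]
/// using linear interpolation between neighbouring samples.
Int16List resampleLinear(Int16List input, int fromRate, int toRate) {
  if (input.isEmpty || fromRate == toRate) return input;

  final outLength = (input.length * toRate / fromRate).floor();
  final output = Int16List(outLength);
  final step = fromRate / toRate; // input samples per output sample

  for (var i = 0; i < outLength; i++) {
    final pos = i * step;
    final left = pos.floor();
    // Reuse the last sample when there is no right-hand neighbour.
    final right = left + 1 < input.length ? left + 1 : left;
    final frac = pos - left;
    output[i] = (input[left] * (1 - frac) + input[right] * frac).round();
  }
  return output;
}
```

Calling `resampleLinear(samples, 44100, 16000)` covers the 44.1kHz case mentioned above.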
Noise Reduction
For noisy environments, consider preprocessing audio:
- Apply a high-pass filter to remove low-frequency noise
- Normalize audio levels
- Remove silence at beginning/end
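
The three steps above can be sketched in plain Dart as follows. This is an illustration under stated assumptions, not a tuned pipeline: all function names are hypothetical, samples are floats in the range -1.0 to 1.0, and the 80 Hz cutoff, 0.9 target peak, and 0.01 silence threshold are placeholder values to adjust for your audio.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// Converts PCM16 samples to floats in [-1.0, 1.0).
Float64List pcm16ToFloat(Int16List pcm) =>
    Float64List.fromList([for (final s in pcm) s / 32768.0]);

/// First-order (RC) high-pass filter that attenuates content below [cutoffHz].
Float64List highPass(Float64List x,
    {double sampleRate = 16000, double cutoffHz = 80}) {
  final y = Float64List(x.length);
  if (x.isEmpty) return y;
  final rc = 1 / (2 * math.pi * cutoffHz);
  final dt = 1 / sampleRate;
  final alpha = rc / (rc + dt);
  y[0] = x[0];
  for (var i = 1; i < x.length; i++) {
    y[i] = alpha * (y[i - 1] + x[i] - x[i - 1]);
  }
  return y;
}

/// Scales the signal so its loudest sample reaches [targetPeak].
Float64List normalizePeak(Float64List x, {double targetPeak = 0.9}) {
  var peak = 0.0;
  for (final s in x) {
    peak = math.max(peak, s.abs());
  }
  if (peak == 0) return Float64List.fromList(x); // all silence, nothing to scale
  final gain = targetPeak / peak;
  return Float64List.fromList([for (final s in x) s * gain]);
}

/// Drops leading and trailing samples quieter than [threshold].
Float64List trimSilence(Float64List x, {double threshold = 0.01}) {
  var start = 0;
  while (start < x.length && x[start].abs() < threshold) {
    start++;
  }
  var end = x.length;
  while (end > start && x[end - 1].abs() < threshold) {
    end--;
  }
  return x.sublist(start, end);
}
```

A typical order would be `trimSilence(normalizePeak(highPass(pcm16ToFloat(samples))))`, so that the silence threshold is applied to the filtered, normalized signal.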
Audio Format
Always ensure correct format:
- PCM16 (16-bit signed integer)
- 16,000 Hz sample rate
- Mono (single channel)
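
When the audio comes from a WAV file, these three properties can be checked from the header before transcription. The sketch below assumes the canonical 44-byte layout in which the `fmt ` chunk starts at byte 12; files with extra chunks before `fmt ` would need a real chunk parser. `WavFormat` and `readWavFormat` are hypothetical names used for illustration.

```dart
import 'dart:typed_data';

/// Format fields read from a canonical WAV header.
class WavFormat {
  const WavFormat({
    required this.audioFormat,
    required this.channels,
    required this.sampleRate,
    required this.bitsPerSample,
  });

  final int audioFormat; // 1 means uncompressed PCM
  final int channels;
  final int sampleRate;
  final int bitsPerSample;

  /// True for PCM16, 16,000 Hz, mono.
  bool get isSttReady =>
      audioFormat == 1 &&
      channels == 1 &&
      sampleRate == 16000 &&
      bitsPerSample == 16;
}

/// Returns the format, or null if [bytes] is not a canonical WAV header.
WavFormat? readWavFormat(Uint8List bytes) {
  if (bytes.length < 44) return null;

  String tag(int offset) => String.fromCharCodes(bytes, offset, offset + 4);
  if (tag(0) != 'RIFF' || tag(8) != 'WAVE' || tag(12) != 'fmt ') return null;

  final data = ByteData.sublistView(bytes);
  return WavFormat(
    audioFormat: data.getUint16(20, Endian.little),
    channels: data.getUint16(22, Endian.little),
    sampleRate: data.getUint32(24, Endian.little),
    bitsPerSample: data.getUint16(34, Endian.little),
  );
}
```

A caller might reject or convert the file when `readWavFormat(bytes)?.isSttReady` is not `true`.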
Error Handling
Best Practices
- Preload during idle time: download and load the STT model before the user needs it
- Use English-specific models: for English-only input they are more accurate than a multilingual model of the same size
- Handle empty audio: check the audio length before transcribing
- Provide feedback: show transcription progress to users
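
The empty-audio check can be as simple as computing the clip duration from the byte count. This is a minimal sketch assuming raw PCM16 bytes at 16,000 Hz mono, as described under Audio Format; the function names and the 200 ms minimum are illustrative choices, not SDK values.

```dart
import 'dart:typed_data';

/// Duration of raw PCM16 audio, derived from its byte length.
Duration pcm16Duration(Uint8List pcmBytes,
    {int sampleRate = 16000, int channels = 1}) {
  final frames = pcmBytes.length ~/ (2 * channels); // 2 bytes per sample
  return Duration(microseconds: frames * 1000000 ~/ sampleRate);
}

/// Returns false for empty or very short clips so the caller can skip them.
bool isWorthTranscribing(Uint8List pcmBytes,
        {Duration minDuration = const Duration(milliseconds: 200)}) =>
    pcm16Duration(pcmBytes) >= minDuration;
```

Skipping clips that fail this check avoids sending empty buffers to the transcriber.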