The transcribe() method converts audio data to text using on-device speech recognition models such as Whisper.
Basic Usage
Setup
Before transcribing, register the ONNX module and load an STT model.
Method Signatures
Simple Transcription
Transcription with Options
Buffer Transcription
Transcribe audio directly from an AVAudioPCMBuffer.
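Microphone buffers usually arrive at the hardware rate (often 44.1 or 48 kHz, sometimes stereo), so they may need converting before they are passed to the transcriber. The following is a minimal sketch using Apple's AVAudioConverter to produce 16 kHz mono Float32 audio. The function name resampledForSTT and the ResampleError type are hypothetical helpers for illustration and are not part of the SDK.

```swift
import AVFoundation

// Hypothetical error type for this sketch; not part of the SDK.
enum ResampleError: Error {
    case formatUnavailable
    case converterUnavailable
    case bufferAllocationFailed
}

/// Converts a buffer to 16 kHz, mono, non-interleaved Float32.
/// Hypothetical helper; the SDK may already do this internally.
func resampledForSTT(_ input: AVAudioPCMBuffer) throws -> AVAudioPCMBuffer {
    guard let target = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                     sampleRate: 16_000,
                                     channels: 1,
                                     interleaved: false) else {
        throw ResampleError.formatUnavailable
    }
    guard let converter = AVAudioConverter(from: input.format, to: target) else {
        throw ResampleError.converterUnavailable
    }

    // Size the output for the rate change, with one frame of headroom.
    let ratio = target.sampleRate / input.format.sampleRate
    let capacity = AVAudioFrameCount((Double(input.frameLength) * ratio).rounded(.up)) + 1
    guard let output = AVAudioPCMBuffer(pcmFormat: target, frameCapacity: capacity) else {
        throw ResampleError.bufferAllocationFailed
    }

    // Hand the input buffer to the converter exactly once, then signal end of stream.
    var delivered = false
    var conversionError: NSError?
    let status = converter.convert(to: output, error: &conversionError) { _, inputStatus in
        if delivered {
            inputStatus.pointee = .endOfStream
            return nil
        }
        delivered = true
        inputStatus.pointee = .haveData
        return input
    }

    if status == .error, let error = conversionError {
        throw error
    }
    return output
}
```

The converted buffer can then be passed to whichever buffer-based transcription call the SDK exposes. For continuous streams, keeping a single converter alive across buffers generally avoids discontinuities at buffer boundaries.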
Audio Requirements
| Property | Requirement |
|---|---|
| Sample Rate | 16,000 Hz (recommended) |
| Channels | Mono (1 channel) |
| Format | Float32 or Int16 PCM |
| Duration | Up to 30 seconds per call (Whisper limitation) |
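The requirements in the table can be met with a small amount of preprocessing. The sketch below assumes the audio is already mono at 16 kHz. It normalizes Int16 PCM samples to Float32 and splits a long recording into pieces of at most 30 seconds. The names floatSamples(fromInt16:) and chunked(_:sampleRate:maxSeconds:) are hypothetical helpers, not SDK functions.

```swift
import Foundation

// Values taken from the requirements table above.
let targetSampleRate = 16_000   // Hz
let maxChunkSeconds = 30        // per-call limit for Whisper models

/// Converts signed 16-bit PCM samples to Float32 in the range [-1.0, 1.0).
func floatSamples(fromInt16 samples: [Int16]) -> [Float] {
    samples.map { Float($0) / 32_768.0 }
}

/// Splits mono samples into consecutive chunks of at most `maxSeconds` each.
/// The last chunk holds whatever remains and may be shorter.
func chunked(_ samples: [Float],
             sampleRate: Int = targetSampleRate,
             maxSeconds: Int = maxChunkSeconds) -> [[Float]] {
    let chunkSize = sampleRate * maxSeconds
    guard chunkSize > 0 else { return [] }
    return stride(from: 0, to: samples.count, by: chunkSize).map { start in
        Array(samples[start..<min(start + chunkSize, samples.count)])
    }
}

// Usage: a 65-second recording becomes two 30-second chunks and one 5-second chunk.
let recording = [Float](repeating: 0, count: targetSampleRate * 65)
let pieces = chunked(recording)
print(pieces.map { $0.count / targetSampleRate })
```

Fixed-size splitting can cut a word in half at a chunk boundary. Splitting on silence, or overlapping adjacent chunks slightly, is a common way to reduce that effect.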
STTOutput
Examples
Recording and Transcribing
With Timestamps
Multi-Language Support
SwiftUI Voice Input
Model Management
Available Models
These are the Sherpa-ONNX Whisper models available as tar.gz archives:
| Model ID | Size | Quality | Speed |
|---|---|---|---|
| sherpa-onnx-whisper-tiny.en | ~40MB | Good | Fastest |
| sherpa-onnx-whisper-base.en | ~150MB | Better | Fast |
| sherpa-onnx-whisper-small.en | ~500MB | Best | Slower |
These models use framework: .onnx, modality: .speechRecognition, and artifactType: .archive(.tarGz, structure: .nestedDirectory).