transcribe()

Transcribe audio data to text using on-device speech recognition models.

Basic Transcription

// Load an STT model first
RunAnywhere.loadSTTModel("whisper-tiny")

// Transcribe audio bytes
val audioData: ByteArray = // ... from file or recording
val text = RunAnywhere.transcribe(audioData)
println(text)  // "Hello, how are you today?"

Transcription with Options

Get detailed output including confidence scores and timestamps:

val output = RunAnywhere.transcribeWithOptions(
    audioData = audioData,  // the same ByteArray as in Basic Transcription
    options = STTOptions(
        language = "en",
        enableTimestamps = true,
        enablePunctuation = true
    )
)

println("Text: ${output.text}")
println("Confidence: ${output.confidence}")

// Access word-level timestamps
output.wordTimestamps?.forEach { word ->
    println("[${word.startTime}s - ${word.endTime}s]: ${word.word}")
}

STTOutput

The detailed output object:

Property	Type	Description
`text`	`String`	Transcribed text
`confidence`	`Float`	Confidence score (0.0-1.0)
`wordTimestamps`	`List<WordTimestamp>?`	Word-level timing
`detectedLanguage`	`String?`	Auto-detected language code
`metadata`	`TranscriptionMetadata`	Processing metrics
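
As an illustration of how word-level timing could be consumed, the sketch below groups words into short caption lines. `WordTimestamp` here is a local stand-in that mirrors the `word`, `startTime`, and `endTime` fields used above; the `Double` field types and the `toCaptions` helper are assumptions, not part of the SDK.

```kotlin
// Stand-in for the SDK's WordTimestamp; the field types are assumed, not confirmed.
data class WordTimestamp(val word: String, val startTime: Double, val endTime: Double)

// Formats one group of words as "[start - end] text".
fun formatCaption(words: List<WordTimestamp>): String {
    val text = words.joinToString(" ") { it.word }
    return "[${words.first().startTime}s - ${words.last().endTime}s] $text"
}

// Groups consecutive words so that each caption spans at most maxSeconds.
// A single word longer than maxSeconds still gets a caption of its own.
fun toCaptions(words: List<WordTimestamp>, maxSeconds: Double = 3.0): List<String> {
    val captions = mutableListOf<String>()
    var current = mutableListOf<WordTimestamp>()
    for (w in words) {
        // Start a new caption when adding this word would exceed the time budget.
        if (current.isNotEmpty() && w.endTime - current.first().startTime > maxSeconds) {
            captions += formatCaption(current)
            current = mutableListOf()
        }
        current += w
    }
    if (current.isNotEmpty()) captions += formatCaption(current)
    return captions
}

fun main() {
    val words = listOf(
        WordTimestamp("Hello,", 0.0, 0.5),
        WordTimestamp("how", 0.5, 1.0),
        WordTimestamp("are", 1.0, 1.5),
        WordTimestamp("you", 3.5, 4.0),
        WordTimestamp("today?", 4.0, 4.5)
    )
    toCaptions(words).forEach(::println)
}
```

With real output, the list passed in would be `output.wordTimestamps ?: emptyList()`, since the property is nullable.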

Example: Transcribe Audio File

suspend fun transcribeAudioFile(uri: Uri): String {
    val audioData = contentResolver.openInputStream(uri)?.readBytes()
        ?: throw IllegalArgumentException("Cannot read audio file")

    // Ensure STT model is loaded
    if (!RunAnywhere.isSTTModelLoaded()) {
        RunAnywhere.loadSTTModel("whisper-tiny")
    }

    return RunAnywhere.transcribe(audioData)
}

Example: Record and Transcribe

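class VoiceRecorderViewModel : ViewModel() {
    private var audioRecorder: AudioRecord? = null
    private val audioBuffer = mutableListOf<Byte>()

    // State holders written in stopAndTranscribe(); MutableStateFlow is one possible choice
    private val _transcription = MutableStateFlow("")
    private val _confidence = MutableStateFlow(0f)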

    fun startRecording() {
        // Start recording audio...
    }

    fun stopAndTranscribe() {
        viewModelScope.launch {
            val audioData = audioBuffer.toByteArray()

            val result = RunAnywhere.transcribeWithOptions(
                audioData,
                options = STTOptions(
                    language = "en",
                    enablePunctuation = true
                )
            )

            _transcription.value = result.text
            _confidence.value = result.confidence
        }
    }
}
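
The recording step is elided in the example above. One possible way to collect recorded chunks without boxing every byte in a `MutableList<Byte>` is a small accumulator built on `ByteArrayOutputStream`. `PcmAccumulator` is a hypothetical helper, not part of the RunAnywhere SDK; this is a sketch and assumes the recorder delivers raw bytes chunk by chunk.

```kotlin
import java.io.ByteArrayOutputStream

// Hypothetical helper, not part of the RunAnywhere SDK.
class PcmAccumulator {
    private val out = ByteArrayOutputStream()

    // Appends the first `length` bytes of `chunk`, for example the count
    // returned by a read call on the recorder. Non-positive lengths are ignored.
    @Synchronized
    fun append(chunk: ByteArray, length: Int = chunk.size) {
        if (length > 0) out.write(chunk, 0, length)
    }

    // Returns everything collected so far and clears the buffer for the next recording.
    @Synchronized
    fun drain(): ByteArray {
        val bytes = out.toByteArray()
        out.reset()
        return bytes
    }
}

fun main() {
    val acc = PcmAccumulator()
    acc.append(byteArrayOf(1, 2, 3))
    acc.append(byteArrayOf(4, 5, 6), length = 2) // only the first two bytes are valid
    println(acc.drain().size) // 5
}
```

The array returned by `drain()` could then be passed as `audioData` when the recording stops.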

Model Management

// Load a specific STT model
RunAnywhere.loadSTTModel("whisper-tiny")

// Check if loaded
val isLoaded = RunAnywhere.isSTTModelLoaded()

// Get current model ID
val modelId = RunAnywhere.currentSTTModelId

// Unload when done
RunAnywhere.unloadSTTModel()

Supported Audio Formats

Format	Sample Rate	Notes
PCM	16000 Hz	Recommended for best quality
WAV	16000 Hz	Standard audio file format
MP3	Any	Converted internally
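
To check whether a WAV file matches the 16000 Hz rate listed above before transcribing it, the header can be inspected directly. The sketch below assumes the canonical 44-byte PCM WAV layout, where the `fmt ` chunk immediately follows the RIFF header; files with extra chunks before `fmt ` are not handled. `WavInfo`, `readWavInfo`, and `isRecommendedFormat` are illustrative names, not SDK APIs.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Fields read from a canonical 44-byte PCM WAV header.
data class WavInfo(val channels: Int, val sampleRate: Int, val bitsPerSample: Int)

// Returns null when the data does not begin with a RIFF/WAVE header
// followed directly by a "fmt " chunk.
fun readWavInfo(data: ByteArray): WavInfo? {
    if (data.size < 44) return null
    fun tag(offset: Int) = String(data, offset, 4, Charsets.US_ASCII)
    if (tag(0) != "RIFF" || tag(8) != "WAVE" || tag(12) != "fmt ") return null

    // All multi-byte WAV header fields are little-endian.
    val buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN)
    return WavInfo(
        channels = buf.getShort(22).toInt(),
        sampleRate = buf.getInt(24),
        bitsPerSample = buf.getShort(34).toInt()
    )
}

// True for 16 kHz mono audio, the combination recommended on this page.
fun isRecommendedFormat(info: WavInfo): Boolean =
    info.sampleRate == 16_000 && info.channels == 1
```

A caller might log a warning or resample when `isRecommendedFormat` returns false; how the SDK itself reacts to other sample rates is not stated here.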

For best transcription accuracy:
- Use 16 kHz mono PCM audio
- Keep audio clips under 30 seconds for optimal performance
- Use a smaller model (whisper-tiny) for faster results, larger models (whisper-base) for better accuracy
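
The 30-second guideline can be applied to raw PCM by computing the clip duration from its byte length and splitting longer recordings. This is a minimal sketch that assumes 16-bit samples (the bit depth is not stated on this page) in 16 kHz mono; the helper names are illustrative.

```kotlin
const val SAMPLE_RATE_HZ = 16_000
const val BYTES_PER_SAMPLE = 2 // assumes 16-bit PCM; the bit depth is an assumption

// Duration of a mono PCM buffer in seconds.
fun durationSeconds(pcm: ByteArray): Double =
    pcm.size.toDouble() / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)

// Splits a mono PCM buffer into consecutive clips of at most maxSeconds each.
// Clip boundaries fall on whole samples because maxBytes is a multiple of BYTES_PER_SAMPLE.
fun splitIntoClips(pcm: ByteArray, maxSeconds: Int = 30): List<ByteArray> {
    require(maxSeconds > 0) { "maxSeconds must be positive" }
    val maxBytes = maxSeconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE
    return (0 until pcm.size step maxBytes).map { start ->
        pcm.copyOfRange(start, minOf(start + maxBytes, pcm.size))
    }
}

fun main() {
    val seventySeconds = ByteArray(70 * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
    val clips = splitIntoClips(seventySeconds)
    println(clips.map { durationSeconds(it) }) // [30.0, 30.0, 10.0]
}
```

Fixed-length cuts can land in the middle of a word, so splitting at a pause in the audio may give better results when that is practical.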
