Early Beta — The Web SDK is in early beta. APIs may change between releases.

Overview

Streaming STT provides real-time transcription as audio is being captured, without waiting for the full recording to complete. This enables live captioning, real-time voice interfaces, and interactive dictation.

Basic Usage

import { STT } from '@runanywhere/web'

// Create a streaming session
const session = STT.createStreamingSession()

// Feed audio chunks as they arrive from the microphone
function onAudioChunk(samples: Float32Array) {
  session.acceptWaveform(samples)

  // Check for partial results
  const result = session.getResult()
  if (result.text) {
    console.log('Partial:', result.text)
  }
}

// When done speaking
session.inputFinished()
const finalResult = session.getResult()
console.log('Final:', finalResult.text)

// Clean up
session.destroy()

API Reference

STT.createStreamingSession

Create a new streaming transcription session.
STT.createStreamingSession(options?: STTTranscribeOptions): STTStreamingSession

STTStreamingSession

interface STTStreamingSession {
  /** Feed audio samples into the session */
  acceptWaveform(samples: Float32Array, sampleRate?: number): void

  /** Signal that no more audio will be provided */
  inputFinished(): void

  /** Get the current transcription result */
  getResult(): { text: string; isEndpoint: boolean }

  /** Reset the session for a new utterance */
  reset(): void

  /** Release all resources */
  destroy(): void
}
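The `sampleRate` parameter on `acceptWaveform()` matters because browsers often capture audio at 48 kHz while speech models typically expect 16 kHz. You can pass the capture rate through, or resample before feeding the session. The helper below is an illustrative sketch, not part of the SDK; it does naive linear-interpolation resampling, which is usually adequate for speech (a production pipeline would low-pass filter first to avoid aliasing):

```typescript
// Naive linear-interpolation resampler for mono Float32Array audio.
// Converts samples from `fromRate` to `toRate`.
function resample(samples: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (fromRate === toRate) return samples
  const ratio = fromRate / toRate
  const outLength = Math.floor(samples.length / ratio)
  const out = new Float32Array(outLength)
  for (let i = 0; i < outLength; i++) {
    // Fractional position in the source buffer for output sample i
    const pos = i * ratio
    const left = Math.floor(pos)
    const right = Math.min(left + 1, samples.length - 1)
    const frac = pos - left
    out[i] = samples[left] * (1 - frac) + samples[right] * frac
  }
  return out
}

// A 480-sample chunk captured at 48 kHz becomes 160 samples at 16 kHz
const chunk48k = new Float32Array(480)
const chunk16k = resample(chunk48k, 48000, 16000)
console.log(chunk16k.length) // 160
```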

Examples

Live Microphone Transcription

import { STT, AudioCapture } from '@runanywhere/web'

const capture = new AudioCapture()
const session = STT.createStreamingSession()

// Feed microphone audio into the streaming session
capture.onAudioChunk((samples) => {
  session.acceptWaveform(samples, 16000)

  const result = session.getResult()
  if (result.text) {
    document.getElementById('transcript')!.textContent = result.text
  }

  if (result.isEndpoint) {
    console.log('Endpoint detected:', result.text)
    session.reset() // Ready for next utterance
  }
})

// Start capturing
await capture.start({ sampleRate: 16000 })

// Stop when done
// capture.stop()
// session.destroy()

React Component

LiveTranscription.tsx
import { useState, useCallback, useRef, useEffect } from 'react'
import { STT, AudioCapture, STTStreamingSession } from '@runanywhere/web'

export function LiveTranscription() {
  const [transcript, setTranscript] = useState('')
  const [isListening, setIsListening] = useState(false)
  const captureRef = useRef<AudioCapture | null>(null)
  const sessionRef = useRef<STTStreamingSession | null>(null)

  const startListening = useCallback(async () => {
    const capture = new AudioCapture()
    const session = STT.createStreamingSession()
    captureRef.current = capture
    sessionRef.current = session

    capture.onAudioChunk((samples) => {
      session.acceptWaveform(samples, 16000)
      const result = session.getResult()
      if (result.text) {
        setTranscript(result.text)
      }
    })

    await capture.start({ sampleRate: 16000 })
    setIsListening(true)
  }, [])

  const stopListening = useCallback(() => {
    captureRef.current?.stop()
    sessionRef.current?.inputFinished()

    const finalResult = sessionRef.current?.getResult()
    if (finalResult?.text) {
      setTranscript(finalResult.text)
    }

    sessionRef.current?.destroy()
    sessionRef.current = null
    captureRef.current = null
    setIsListening(false)
  }, [])

  // Release the microphone and session if the component unmounts mid-capture
  useEffect(() => {
    return () => {
      captureRef.current?.stop()
      sessionRef.current?.destroy()
    }
  }, [])

  return (
    <div>
      <button onClick={isListening ? stopListening : startListening}>
        {isListening ? 'Stop' : 'Start Listening'}
      </button>
      <p>{transcript || 'Speak to see transcription...'}</p>
    </div>
  )
}

Session Lifecycle

1. Create session: call STT.createStreamingSession() to create a new session.
2. Feed audio: call acceptWaveform() with each audio chunk from the microphone.
3. Read results: call getResult() to get the partial transcription at any time.
4. Reset or finish: call reset() to start a new utterance, or inputFinished() when done.
5. Clean up: call destroy() to release all resources.
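The lifecycle above can be walked end to end. Since the real engine is not available here, this sketch uses a minimal in-memory stand-in that matches the STTStreamingSession interface; in application code you would call STT.createStreamingSession() instead of the mock factory:

```typescript
// Shape copied from the API reference above
interface STTStreamingSession {
  acceptWaveform(samples: Float32Array, sampleRate?: number): void
  inputFinished(): void
  getResult(): { text: string; isEndpoint: boolean }
  reset(): void
  destroy(): void
}

// Hypothetical stand-in: counts chunks instead of transcribing,
// but enforces the same lifecycle ordering as a real session.
function createMockSession(): STTStreamingSession {
  let chunks = 0
  let finished = false
  let destroyed = false
  return {
    acceptWaveform(_samples) {
      if (destroyed) throw new Error('session destroyed')
      chunks += 1
    },
    inputFinished() { finished = true },
    getResult() {
      return { text: chunks > 0 ? `heard ${chunks} chunk(s)` : '', isEndpoint: finished }
    },
    reset() { chunks = 0; finished = false },
    destroy() { destroyed = true },
  }
}

// 1. Create  2. Feed  3. Read  4. Finish  5. Destroy
const session = createMockSession()
session.acceptWaveform(new Float32Array(1600), 16000)
session.acceptWaveform(new Float32Array(1600), 16000)
console.log(session.getResult().text)       // "heard 2 chunk(s)"
session.inputFinished()
console.log(session.getResult().isEndpoint) // true
session.destroy()
```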