# Configuration

> SDK configuration and settings

<Note>**Early Beta** -- The Web SDK is in early beta. APIs may change between releases.</Note>

## Overview

This guide covers SDK initialization options, backend registration, environments and logging, events, model sources and management, audio and video utilities, and acceleration.

## SDK Initialization

### Basic Initialization

```typescript theme={null}
import { RunAnywhere, SDKEnvironment } from '@runanywhere/web'
import { LlamaCPP } from '@runanywhere/web-llamacpp'
import { ONNX } from '@runanywhere/web-onnx'

// Step 1: Initialize core SDK
await RunAnywhere.initialize({
  environment: SDKEnvironment.Development,
  debug: true,
})

// Step 2: Register backends (loads WASM automatically)
await LlamaCPP.register() // LLM + VLM
await ONNX.register() // STT + TTS + VAD
```

### Full Configuration

```typescript theme={null}
interface SDKInitOptions {
  /** SDK environment */
  environment?: SDKEnvironment // Development | Staging | Production

  /** Enable debug logging */
  debug?: boolean

  /** API key for authentication (optional) */
  apiKey?: string

  /** Base URL for API requests */
  baseURL?: string

  /** Acceleration preference */
  acceleration?: AccelerationPreference // 'auto' | 'webgpu' | 'cpu'

  /** Custom URL for WebGPU WASM glue */
  webgpuWasmUrl?: string
}
```

### Backend Registration

After initializing the core SDK, register the inference backends you need:

```typescript theme={null}
import { LlamaCPP } from '@runanywhere/web-llamacpp'
import { ONNX } from '@runanywhere/web-onnx'

// LlamaCpp: LLM text generation + VLM vision
await LlamaCPP.register()
console.log('LlamaCpp registered:', LlamaCPP.isRegistered)
console.log('Acceleration:', LlamaCPP.accelerationMode) // 'webgpu' or 'cpu'

// ONNX (sherpa-onnx): STT + TTS + VAD
await ONNX.register()
```

<Warning>
  Backend registration loads WASM binaries and can take a few seconds. Always `await` the register
  calls before using any inference APIs. Registration is idempotent -- calling it multiple times is
  safe.
</Warning>
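Idempotent async setup of this kind is often built by caching the in-flight promise. The sketch below shows that pattern under stated assumptions: `makeRegisterOnce` and `loadWasm` are hypothetical names, and nothing here is taken from the SDK's actual implementation.

```typescript
// Sketch of an idempotent async initializer (hypothetical; not SDK code).
// `loadWasm` stands in for whatever expensive work registration performs.
function makeRegisterOnce(loadWasm: () => Promise<void>): () => Promise<void> {
  let pending: Promise<void> | null = null

  return () => {
    if (pending === null) {
      // The first call starts the work; later calls reuse the same promise.
      pending = loadWasm().catch((err) => {
        pending = null // allow a retry after a failed attempt
        throw err
      })
    }
    return pending
  }
}

// Usage: both calls share a single load.
let loadCount = 0
const registerOnce = makeRegisterOnce(async () => {
  loadCount += 1
})
const firstCall = registerOnce()
const secondCall = registerOnce()
```

Because the cached promise is returned as-is, concurrent callers all wait on the same work instead of starting it twice.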

## Environment Modes

| Environment | Enum Value                   | Description                       | Logging |
| ----------- | ---------------------------- | --------------------------------- | ------- |
| Development | `SDKEnvironment.Development` | Local development, full debugging | Debug   |
| Staging     | `SDKEnvironment.Staging`     | Testing with real services        | Info    |
| Production  | `SDKEnvironment.Production`  | Production deployment             | Warning |
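As an illustration of the table above, an app might pick the environment from the page hostname and look up the matching default log level. This is a sketch only: the types are local stand-ins for the SDK enums, and the hostname rules (`localhost`, a `staging.` prefix) are assumptions you would replace with your own.

```typescript
// Local mirrors of the names in the table (hypothetical types, defined here
// so the sketch runs without the SDK).
type EnvName = 'Development' | 'Staging' | 'Production'
type DefaultLevel = 'Debug' | 'Info' | 'Warning'

const DEFAULT_LEVEL_BY_ENV: Record<EnvName, DefaultLevel> = {
  Development: 'Debug',
  Staging: 'Info',
  Production: 'Warning',
}

// Pick an environment from the hostname. The rules below are assumptions.
function envForHostname(hostname: string): EnvName {
  if (hostname === 'localhost' || hostname === '127.0.0.1') return 'Development'
  if (hostname.startsWith('staging.')) return 'Staging'
  return 'Production'
}

const chosenEnv = envForHostname('staging.example.com')
const chosenLevel = DEFAULT_LEVEL_BY_ENV[chosenEnv]
```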

## Logging

### Configure Log Level

```typescript theme={null}
import { SDKLogger, LogLevel } from '@runanywhere/web'

SDKLogger.level = LogLevel.Debug // Trace | Debug | Info | Warning | Error | Fatal
SDKLogger.enabled = true
```

### Log Levels

| Level     | Description             | Use Case    |
| --------- | ----------------------- | ----------- |
| `Trace`   | Very detailed tracing   | Deep debug  |
| `Debug`   | Detailed debugging info | Development |
| `Info`    | General information     | Staging     |
| `Warning` | Potential issues        | Production  |
| `Error`   | Errors and failures     | Production  |
| `Fatal`   | Critical failures       | Always      |
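The levels are listed from least to most severe. Assuming the logger treats its level as a threshold (this guide does not spell that out), the filtering can be sketched as follows; `LEVEL_ORDER` and `passesThreshold` are hypothetical names, not SDK exports.

```typescript
// Severity order taken from the table above, least to most severe.
const LEVEL_ORDER = ['Trace', 'Debug', 'Info', 'Warning', 'Error', 'Fatal'] as const
type LevelName = (typeof LEVEL_ORDER)[number]

// Returns true when a message at `message` severity passes the `threshold`.
function passesThreshold(message: LevelName, threshold: LevelName): boolean {
  return LEVEL_ORDER.indexOf(message) >= LEVEL_ORDER.indexOf(threshold)
}

const errorShownAtWarning = passesThreshold('Error', 'Warning')
const debugShownAtInfo = passesThreshold('Debug', 'Info')
```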

## Events

### EventBus

The SDK provides a typed event system for monitoring SDK activity:

```typescript theme={null}
import { EventBus } from '@runanywhere/web'

// Subscribe to model download progress
const unsubscribe = EventBus.shared.on('model.downloadProgress', (evt) => {
  console.log(`Model: ${evt.modelId}, Progress: ${((evt.progress ?? 0) * 100).toFixed(0)}%`)
})

EventBus.shared.on('model.loadCompleted', (evt) => {
  console.log(`Model loaded: ${evt.modelId}`)
})

// Clean up
unsubscribe()
```

<Note>
  Event properties are directly on the event object (e.g., `evt.modelId`, `evt.progress`), not
  nested under `evt.data`.
</Note>
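To make the subscribe/unsubscribe contract and the flat event shape concrete, here is a minimal typed event bus. It is an illustrative sketch, not the SDK's `EventBus`: `DemoEventBus` and `DemoEvents` are hypothetical and cover only two of the events in this guide.

```typescript
// Minimal typed event bus (illustrative sketch; not the SDK's EventBus).
type DemoEvents = {
  'model.downloadProgress': { modelId: string; progress?: number }
  'model.loadCompleted': { modelId: string }
}
type AnyHandler = (evt: any) => void

class DemoEventBus {
  private handlers = new Map<string, Set<AnyHandler>>()

  // Registers a handler and returns a function that removes it.
  on<K extends keyof DemoEvents>(type: K, handler: (evt: DemoEvents[K]) => void): () => void {
    const set = this.handlers.get(type) ?? new Set<AnyHandler>()
    set.add(handler)
    this.handlers.set(type, set)
    return () => {
      set.delete(handler)
    }
  }

  emit<K extends keyof DemoEvents>(type: K, evt: DemoEvents[K]): void {
    this.handlers.get(type)?.forEach((handler) => handler(evt))
  }
}

const demoBus = new DemoEventBus()
const seenProgress: string[] = []
const stopListening = demoBus.on('model.downloadProgress', (evt) => {
  // Properties are read directly from the event object.
  seenProgress.push(`${evt.modelId}:${((evt.progress ?? 0) * 100).toFixed(0)}%`)
})
demoBus.emit('model.downloadProgress', { modelId: 'demo', progress: 0.5 })
stopListening()
demoBus.emit('model.downloadProgress', { modelId: 'demo', progress: 1 }) // ignored
```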

### Event Types

| Event                     | Description                                     |
| ------------------------- | ----------------------------------------------- |
| `model.downloadProgress`  | Model download progress (`modelId`, `progress`) |
| `model.downloadCompleted` | Model download finished                         |
| `model.loadCompleted`     | Model loaded into memory                        |
| `model.unloaded`          | Model unloaded                                  |
| `generation.started`      | Text generation started                         |
| `generation.completed`    | Text generation completed                       |
| `generation.failed`       | Text generation failed                          |

## Model Sources

Models in RunAnywhere are typically sourced from **[HuggingFace](https://huggingface.co)**; a definition can also supply a direct `url` on another host (see URL Resolution Rules). The SDK provides a model registry that resolves compact model definitions into full download URLs and manages the complete lifecycle: registration -> download -> storage -> loading.

### How It Works

```mermaid theme={null}
graph LR
    A(registerModels) --> B(ModelManager)
    B --> C(downloadModel)
    C --> D(fetch from HuggingFace)
    D --> E(Store in OPFS)
    E --> F(loadModel)
    F --> G(Ready for Inference)

    style A fill:#334155,color:#fff,stroke:#334155
    style B fill:#475569,color:#fff,stroke:#475569
    style C fill:#475569,color:#fff,stroke:#475569
    style D fill:#ff6900,color:#fff,stroke:#ff6900
    style E fill:#fb2c36,color:#fff,stroke:#fb2c36
    style F fill:#475569,color:#fff,stroke:#475569
    style G fill:#334155,color:#fff,stroke:#334155
```

When you register a model with a `repo` field, the SDK constructs the download URL automatically:

```
https://huggingface.co/{repo}/resolve/main/{filename}
```

For example, `repo: 'LiquidAI/LFM2-350M-GGUF'` with `files: ['LFM2-350M-Q4_K_M.gguf']` resolves to:

```
https://huggingface.co/LiquidAI/LFM2-350M-GGUF/resolve/main/LFM2-350M-Q4_K_M.gguf
```

### CompactModelDef

The `registerModels` API accepts an array of compact model definitions:

```typescript theme={null}
import { ModelCategory, LLMFramework } from '@runanywhere/web'

interface CompactModelDef {
  /** Unique identifier for the model */
  id: string

  /** Human-readable model name */
  name: string

  /** Inference backend */
  framework: LLMFramework // LLMFramework.LlamaCpp | LLMFramework.ONNX

  /** Model category (determines which engine handles it) */
  modality: ModelCategory
  // ModelCategory.Language         — LLM text generation
  // ModelCategory.Multimodal       — VLM image + text
  // ModelCategory.SpeechRecognition — STT
  // ModelCategory.SpeechSynthesis   — TTS
  // ModelCategory.Audio             — VAD

  /** HuggingFace repo path (e.g., 'LiquidAI/LFM2-350M-GGUF') */
  repo?: string

  /** Model files in the repo. First file = primary, rest = additional (e.g., mmproj for VLM) */
  files?: string[]

  /** Direct URL (alternative to repo + files) */
  url?: string

  /** 'archive' for tar.gz bundles (STT/TTS); omit for individual model files (GGUF, ONNX) */
  artifactType?: 'archive'

  /** Estimated memory requirement in bytes */
  memoryRequirement?: number
}
```

### URL Resolution Rules

| Config                            | URL Pattern                                         | Use Case                     |
| --------------------------------- | --------------------------------------------------- | ---------------------------- |
| `repo` + `files`                  | `https://huggingface.co/{repo}/resolve/main/{file}` | Most models (LLM, VLM)       |
| `url` only                        | Used as-is                                          | Direct links, non-HF sources |
| `url` + `artifactType: 'archive'` | Used as-is, extracted after download                | STT/TTS model bundles        |
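The three rules in the table can be sketched as a single function. This is an approximation for illustration, not the SDK's resolver: `SourceFields`, `ResolvedSource`, and `resolveSource` are hypothetical names, and giving `url` precedence when both `url` and `files` are present is an assumption.

```typescript
// Trimmed-down shape of a model definition (hypothetical; only the fields
// needed for URL resolution are included).
interface SourceFields {
  repo?: string
  files?: string[]
  url?: string
  artifactType?: 'archive'
}

interface ResolvedSource {
  urls: string[]
  extractAfterDownload: boolean
}

function resolveSource(def: SourceFields): ResolvedSource {
  // Rows 2 and 3: a direct URL is used as-is (assumed to take precedence).
  if (def.url !== undefined) {
    return { urls: [def.url], extractAfterDownload: def.artifactType === 'archive' }
  }
  // Row 1: repo + files expands to one URL per file.
  if (def.repo !== undefined && def.files !== undefined && def.files.length > 0) {
    const repo = def.repo
    return {
      urls: def.files.map((file) => `https://huggingface.co/${repo}/resolve/main/${file}`),
      extractAfterDownload: false,
    }
  }
  throw new Error('Model definition needs either `url` or `repo` + `files`')
}

const lfm2Source = resolveSource({
  repo: 'LiquidAI/LFM2-350M-GGUF',
  files: ['LFM2-350M-Q4_K_M.gguf'],
})
```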

## Model Management

Register models with `RunAnywhere.registerModels`; download, load, inspect, and delete them with `ModelManager`. Both are exported from `@runanywhere/web`.

### Register Models

```typescript theme={null}
import { RunAnywhere, ModelCategory, LLMFramework } from '@runanywhere/web'

RunAnywhere.registerModels([
  // LLM: Liquid AI LFM2
  {
    id: 'lfm2-350m-q4_k_m',
    name: 'LFM2 350M Q4_K_M',
    repo: 'LiquidAI/LFM2-350M-GGUF',
    files: ['LFM2-350M-Q4_K_M.gguf'],
    framework: LLMFramework.LlamaCpp,
    modality: ModelCategory.Language,
    memoryRequirement: 250_000_000,
  },

  // VLM: Liquid AI LFM2-VL (two files: model + mmproj)
  {
    id: 'lfm2-vl-450m-q4_0',
    name: 'LFM2-VL 450M Q4_0',
    repo: 'runanywhere/LFM2-VL-450M-GGUF',
    files: ['LFM2-VL-450M-Q4_0.gguf', 'mmproj-LFM2-VL-450M-Q8_0.gguf'],
    framework: LLMFramework.LlamaCpp,
    modality: ModelCategory.Multimodal,
    memoryRequirement: 500_000_000,
  },

  // STT: Whisper (archive bundle from direct URL)
  {
    id: 'sherpa-onnx-whisper-tiny.en',
    name: 'Whisper Tiny English (ONNX)',
    url: 'https://huggingface.co/runanywhere/sherpa-onnx-whisper-tiny.en/resolve/main/sherpa-onnx-whisper-tiny.en.tar.gz',
    framework: LLMFramework.ONNX,
    modality: ModelCategory.SpeechRecognition,
    memoryRequirement: 105_000_000,
    artifactType: 'archive' as const,
  },

  // TTS: Piper (archive bundle)
  {
    id: 'vits-piper-en_US-lessac-medium',
    name: 'Piper TTS US English (Lessac)',
    url: 'https://huggingface.co/runanywhere/vits-piper-en_US-lessac-medium/resolve/main/vits-piper-en_US-lessac-medium.tar.gz',
    framework: LLMFramework.ONNX,
    modality: ModelCategory.SpeechSynthesis,
    memoryRequirement: 65_000_000,
    artifactType: 'archive' as const,
  },

  // VAD: Silero (single ONNX file)
  {
    id: 'silero-vad-v5',
    name: 'Silero VAD v5',
    url: 'https://huggingface.co/runanywhere/silero-vad-v5/resolve/main/silero_vad.onnx',
    files: ['silero_vad.onnx'],
    framework: LLMFramework.ONNX,
    modality: ModelCategory.Audio,
    memoryRequirement: 5_000_000,
  },
])
```
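Before passing a long list to `registerModels`, it can help to check the definitions for duplicate IDs and missing sources. The helper below is a hypothetical pre-flight check, not part of the SDK, and whether the SDK performs similar validation itself is not covered here.

```typescript
// Hypothetical pre-registration check (not part of the SDK).
interface DefToCheck {
  id: string
  repo?: string
  files?: string[]
  url?: string
}

// Returns human-readable problems; an empty list means the input looks consistent.
function findDefinitionProblems(defs: DefToCheck[]): string[] {
  const problems: string[] = []
  const seenIds = new Set<string>()
  for (const def of defs) {
    if (seenIds.has(def.id)) problems.push(`duplicate id: ${def.id}`)
    seenIds.add(def.id)
    const hasRepoSource = def.repo !== undefined && (def.files?.length ?? 0) > 0
    if (!hasRepoSource && def.url === undefined) {
      problems.push(`no source for ${def.id}: set url, or repo + files`)
    }
  }
  return problems
}

const definitionProblems = findDefinitionProblems([
  { id: 'lfm2-350m-q4_k_m', repo: 'LiquidAI/LFM2-350M-GGUF', files: ['LFM2-350M-Q4_K_M.gguf'] },
  { id: 'lfm2-350m-q4_k_m', url: 'https://example.com/duplicate.gguf' },
  { id: 'missing-source' },
])
```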

### Available Models on HuggingFace

#### LLM Models

| Model              | HuggingFace Repo                                                                            | Size    | Notes                             |
| ------------------ | ------------------------------------------------------------------------------------------- | ------- | --------------------------------- |
| **LFM2 350M**      | [`LiquidAI/LFM2-350M-GGUF`](https://huggingface.co/LiquidAI/LFM2-350M-GGUF)                 | \~250MB | Liquid AI, ultra-compact          |
| **LFM2 1.2B Tool** | [`LiquidAI/LFM2-1.2B-Tool-GGUF`](https://huggingface.co/LiquidAI/LFM2-1.2B-Tool-GGUF)       | \~800MB | Liquid AI, tool-calling optimized |
| Qwen 2.5 0.5B      | [`Qwen/Qwen2.5-0.5B-Instruct-GGUF`](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF) | \~400MB | Multilingual                      |

#### VLM Models

| Model            | HuggingFace Repo                                                                                          | Size    | Notes                   |
| ---------------- | --------------------------------------------------------------------------------------------------------- | ------- | ----------------------- |
| **LFM2-VL 450M** | [`runanywhere/LFM2-VL-450M-GGUF`](https://huggingface.co/runanywhere/LFM2-VL-450M-GGUF)                   | \~500MB | Liquid AI, smallest VLM |
| SmolVLM 500M     | [`runanywhere/SmolVLM-500M-Instruct-GGUF`](https://huggingface.co/runanywhere/SmolVLM-500M-Instruct-GGUF) | \~500MB | HuggingFace SmolVLM     |
| Qwen2-VL 2B      | [`runanywhere/Qwen2-VL-2B-Instruct-GGUF`](https://huggingface.co/runanywhere/Qwen2-VL-2B-Instruct-GGUF)   | \~1.5GB | Higher quality          |

#### STT / TTS / VAD Models

| Model              | HuggingFace Repo                             | Size    | Notes            |
| ------------------ | -------------------------------------------- | ------- | ---------------- |
| Whisper Tiny EN    | `runanywhere/sherpa-onnx-whisper-tiny.en`    | \~105MB | Archive bundle   |
| Piper TTS (Lessac) | `runanywhere/vits-piper-en_US-lessac-medium` | \~65MB  | Archive bundle   |
| Silero VAD v5      | `runanywhere/silero-vad-v5`                  | \~5MB   | Single ONNX file |

### Download and Load

```typescript theme={null}
import { ModelManager, ModelCategory, EventBus } from '@runanywhere/web'

// Track download progress
EventBus.shared.on('model.downloadProgress', (evt) => {
  console.log(`Downloading ${evt.modelId}: ${((evt.progress ?? 0) * 100).toFixed(0)}%`)
})

// Download to OPFS (persists across sessions)
await ModelManager.downloadModel('lfm2-350m-q4_k_m')

// Load into memory for inference
await ModelManager.loadModel('lfm2-350m-q4_k_m')

// Inspect registered models and the currently loaded language model
const allModels = ModelManager.getModels()
const loaded = ModelManager.getLoadedModel(ModelCategory.Language)
console.log('Loaded:', loaded?.id)
```

### Multi-Model Loading with `coexist`

By default, loading a new model unloads any previously loaded model. For the voice pipeline (which needs STT + LLM + TTS + VAD simultaneously), pass `coexist: true`:

```typescript theme={null}
// Load all four voice models; `coexist: true` keeps the earlier ones loaded
await ModelManager.loadModel('silero-vad-v5', { coexist: true })
await ModelManager.loadModel('sherpa-onnx-whisper-tiny.en', { coexist: true })
await ModelManager.loadModel('lfm2-350m-q4_k_m', { coexist: true })
await ModelManager.loadModel('vits-piper-en_US-lessac-medium', { coexist: true })
```
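Keeping several models loaded at once adds up their memory footprints. A rough budget check, using the `memoryRequirement` estimates from the registration example, might look like the sketch below; `MemoryEstimate` and `totalMemoryBytes` are hypothetical helpers, and the estimates are approximate.

```typescript
// Hypothetical helper: sums the estimated memory of models to be co-loaded.
interface MemoryEstimate {
  id: string
  memoryRequirement?: number
}

function totalMemoryBytes(models: MemoryEstimate[]): number {
  // Models without an estimate contribute 0, so the result is a lower bound.
  return models.reduce((sum, model) => sum + (model.memoryRequirement ?? 0), 0)
}

const voiceModels: MemoryEstimate[] = [
  { id: 'silero-vad-v5', memoryRequirement: 5_000_000 },
  { id: 'sherpa-onnx-whisper-tiny.en', memoryRequirement: 105_000_000 },
  { id: 'lfm2-350m-q4_k_m', memoryRequirement: 250_000_000 },
  { id: 'vits-piper-en_US-lessac-medium', memoryRequirement: 65_000_000 },
]
const voicePipelineMb = totalMemoryBytes(voiceModels) / 1_000_000 // 425
```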

### Storage (OPFS)

Downloaded models are persisted in the browser's **Origin Private File System (OPFS)**. This means:

* Models survive page refreshes and browser restarts
* Each origin (domain) has its own isolated storage
* The SDK auto-detects previously downloaded models on page load
* If storage quota is exceeded, the SDK auto-evicts least-recently-used models

<Warning>
  **Large model downloads (>200MB) can crash the browser tab** on memory-constrained devices,
  because OPFS writes buffer data in memory before committing. If the tab crashes mid-download,
  refresh and retry; the SDK can resume partial downloads. Start with smaller models (LFM2 350M at
  \~250MB) before attempting larger ones (Qwen2-VL 2B at \~1.5GB).
</Warning>
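Since storage is limited per origin, it can be worth checking free space before starting a download. The browser's `navigator.storage.estimate()` reports `quota` and `usage` for the origin; the sketch below wraps that shape in a pure helper. `QuotaSnapshot` and `hasRoomFor` are hypothetical names, and the 10% safety margin is an arbitrary assumption rather than an SDK setting.

```typescript
// Shape returned by navigator.storage.estimate(), redeclared locally so the
// helper can also run outside a browser.
interface QuotaSnapshot {
  quota?: number
  usage?: number
}

// Returns true when the remaining quota covers `requiredBytes` plus a margin.
function hasRoomFor(snapshot: QuotaSnapshot, requiredBytes: number, margin = 0.1): boolean {
  // Without both numbers there is nothing to compare, so report "no room".
  if (snapshot.quota === undefined || snapshot.usage === undefined) return false
  const freeBytes = snapshot.quota - snapshot.usage
  return freeBytes >= requiredBytes * (1 + margin)
}

const roomForLfm2 = hasRoomFor({ quota: 1_000_000_000, usage: 500_000_000 }, 250_000_000)
```

In a browser, the snapshot would come from `await navigator.storage.estimate()`.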

### Delete Models

```typescript theme={null}
import { ModelManager } from '@runanywhere/web'

// Delete a specific model from OPFS
await ModelManager.deleteModel('lfm2-350m-q4_k_m')
```

## Audio Utilities

<Note>
  Audio utilities (`AudioCapture`, `AudioPlayback`) are in `@runanywhere/web-onnx`, while video
  utilities (`VideoCapture`) are in `@runanywhere/web-llamacpp`. Don't mix up the import sources.
</Note>

### AudioCapture (Microphone)

`AudioCapture` is in `@runanywhere/web-onnx`. Configuration is passed to the constructor, and callbacks are passed to `start()`:

```typescript theme={null}
import { AudioCapture } from '@runanywhere/web-onnx'

const capture = new AudioCapture({ sampleRate: 16000 })

await capture.start(
  (chunk: Float32Array) => {
    // Process audio samples (e.g., feed to VAD)
  },
  (level: number) => {
    // Audio level 0.0-1.0 (for UI visualization)
  }
)

// Stop when done
capture.stop()
```
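If you want your own level meter from the raw chunks, root-mean-square (RMS) is one common measure. How the SDK computes the `level` it passes to the second callback is not documented here, so treat the following as an independent sketch; `rmsLevel` is a hypothetical helper.

```typescript
// Root-mean-square level of one chunk, clamped to the 0.0-1.0 range.
function rmsLevel(chunk: Float32Array): number {
  if (chunk.length === 0) return 0
  let sumSquares = 0
  for (let i = 0; i < chunk.length; i++) {
    sumSquares += chunk[i] * chunk[i]
  }
  return Math.min(1, Math.sqrt(sumSquares / chunk.length))
}

const squareWaveLevel = rmsLevel(new Float32Array([0.5, -0.5, 0.5, -0.5])) // 0.5
const silenceLevel = rmsLevel(new Float32Array(160)) // 0
```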

### AudioPlayback (Speaker)

`AudioPlayback` is in `@runanywhere/web-onnx`:

```typescript theme={null}
import { AudioPlayback } from '@runanywhere/web-onnx'

const player = new AudioPlayback({ sampleRate: 22050 })

// `audioFloat32Array` is a Float32Array of audio samples (for example, TTS output)
await player.play(audioFloat32Array, 22050)

// Clean up resources
player.dispose()
```
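To try playback without a TTS model, you can synthesize a short test tone as a `Float32Array`. This is a sketch: `makeSineTone` is a hypothetical helper, and it assumes `play` accepts samples in the usual -1.0 to 1.0 floating-point range.

```typescript
// Generates a sine tone as Float32Array samples in [-amplitude, amplitude].
function makeSineTone(
  frequencyHz: number,
  seconds: number,
  sampleRate: number,
  amplitude = 0.2
): Float32Array {
  const sampleCount = Math.round(seconds * sampleRate)
  const samples = new Float32Array(sampleCount)
  for (let i = 0; i < sampleCount; i++) {
    samples[i] = amplitude * Math.sin((2 * Math.PI * frequencyHz * i) / sampleRate)
  }
  return samples
}

// One second of A4 (440 Hz) at the 22050 Hz rate used above.
const testTone = makeSineTone(440, 1, 22050)
```

The resulting array can be passed as the first argument to `player.play` with a matching sample rate.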

### VideoCapture (Camera)

`VideoCapture` is in `@runanywhere/web-llamacpp`:

```typescript theme={null}
import { VideoCapture } from '@runanywhere/web-llamacpp'

const camera = new VideoCapture({ facingMode: 'environment' }) // or 'user' for selfie
await camera.start()

// Add the video preview to the DOM
document.getElementById('preview')!.appendChild(camera.videoElement)

// Capture a frame (downscaled to 256px max dimension)
const frame = camera.captureFrame(256)
// frame.rgbPixels: Uint8Array (RGB, no alpha)
// frame.width, frame.height: actual dimensions

// Check state
console.log('Is capturing:', camera.isCapturing)

camera.stop()
```
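Canvas APIs expect RGBA data, so drawing a captured frame requires adding an alpha channel. The helper below is a sketch that assumes `rgbPixels` is tightly packed in row-major order with `width * height * 3` bytes; that layout is an assumption, and `rgbToRgba` is a hypothetical name.

```typescript
// Expands tightly packed RGB bytes to RGBA with full opacity.
function rgbToRgba(rgb: Uint8Array, width: number, height: number): Uint8ClampedArray {
  const pixelCount = width * height
  if (rgb.length !== pixelCount * 3) {
    throw new Error(`expected ${pixelCount * 3} bytes, got ${rgb.length}`)
  }
  const rgba = new Uint8ClampedArray(pixelCount * 4)
  for (let i = 0; i < pixelCount; i++) {
    rgba[i * 4] = rgb[i * 3] // red
    rgba[i * 4 + 1] = rgb[i * 3 + 1] // green
    rgba[i * 4 + 2] = rgb[i * 3 + 2] // blue
    rgba[i * 4 + 3] = 255 // alpha: fully opaque
  }
  return rgba
}

// Two pixels: (1, 2, 3) and (4, 5, 6).
const twoPixels = rgbToRgba(new Uint8Array([1, 2, 3, 4, 5, 6]), 2, 1)
```

In a browser, the result can be wrapped with `new ImageData(rgba, width, height)` and drawn with `putImageData` on a 2D canvas context.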

## Acceleration

### GPU Acceleration

The SDK auto-detects WebGPU availability when `LlamaCPP.register()` is called:

```typescript theme={null}
import { LlamaCPP } from '@runanywhere/web-llamacpp'

await LlamaCPP.register()
console.log('Acceleration:', LlamaCPP.accelerationMode) // 'webgpu' or 'cpu'
```

| Mode     | Description                                            |
| -------- | ------------------------------------------------------ |
| `webgpu` | WebGPU detected and WASM loaded successfully           |
| `cpu`    | CPU-only WASM (WebGPU not available or failed to load) |

<Note>
  If the WebGPU WASM file returns a 404, the SDK gracefully falls back to CPU mode. This is normal
  behavior -- check `LlamaCPP.accelerationMode` to confirm which mode is active.
</Note>
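The fallback described above can be summarized as a small decision function. This is a guess at the logic based on this page, not SDK code: `resolveAccelMode` is hypothetical, and treating an explicit `'webgpu'` preference as "fall back to CPU if unavailable" is an assumption.

```typescript
type AccelPreference = 'auto' | 'webgpu' | 'cpu'
type AccelMode = 'webgpu' | 'cpu'

// `webgpuUsable` means the browser exposes WebGPU and the WebGPU WASM build loaded.
function resolveAccelMode(preference: AccelPreference, webgpuUsable: boolean): AccelMode {
  if (preference === 'cpu') return 'cpu'
  // 'auto' and 'webgpu' both fall back to CPU when WebGPU cannot be used.
  return webgpuUsable ? 'webgpu' : 'cpu'
}

const modeWithoutWebgpu = resolveAccelMode('auto', false) // 'cpu'
const modeWithWebgpu = resolveAccelMode('auto', true) // 'webgpu'
```

In a browser, a first availability check is `'gpu' in navigator`; a successful WASM load is still required on top of that.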

## Related

<CardGroup cols={2}>
  <Card title="Error Handling" icon="triangle-exclamation" href="/web/error-handling">
    Handle errors gracefully
  </Card>

  <Card title="Best Practices" icon="star" href="/web/best-practices">
    Optimization tips
  </Card>
</CardGroup>
