Early Beta — The Web SDK is in early beta. APIs may change between releases.

Package Installation

The Web SDK is split into three packages. Install all three to access every feature, or pick only the backends you need:
npm install @runanywhere/web @runanywhere/web-llamacpp @runanywhere/web-onnx
Or with yarn:
yarn add @runanywhere/web @runanywhere/web-llamacpp @runanywhere/web-onnx

Package Breakdown

Package | Version | What’s Inside
@runanywhere/web | 0.1.0-beta.9 | Core SDK: RunAnywhere, ModelManager, ModelCategory, EventBus, VoicePipeline, SDKEnvironment, LLMFramework, CompactModelDef
@runanywhere/web-llamacpp | 0.1.0-beta.9 | LLM/VLM backend: LlamaCPP, TextGeneration, VLMWorkerBridge, VideoCapture, startVLMWorkerRuntime
@runanywhere/web-onnx | 0.1.0-beta.9 | STT/TTS/VAD backend: ONNX, AudioCapture, AudioPlayback, VAD, SpeechActivity
If you only need LLM text generation, you can skip @runanywhere/web-onnx. If you only need STT/TTS/VAD, you can skip @runanywhere/web-llamacpp. The core @runanywhere/web package is always required.

Bundler Configuration

The starter app uses Vite. Here is the complete vite.config.ts that handles WASM serving, Cross-Origin Isolation, Web Workers, and production builds:
vite.config.ts
import { defineConfig, type Plugin } from 'vite'
import react from '@vitejs/plugin-react'
import path from 'path'
import fs from 'fs'
import { fileURLToPath } from 'url'

const __dir = path.dirname(fileURLToPath(import.meta.url))

/**
 * Copies WASM binaries from @runanywhere npm packages into dist/assets/
 * for production builds. In dev mode Vite serves node_modules directly.
 */
function copyWasmPlugin(): Plugin {
  const llamacppWasm = path.resolve(__dir, 'node_modules/@runanywhere/web-llamacpp/wasm')
  const onnxWasm = path.resolve(__dir, 'node_modules/@runanywhere/web-onnx/wasm')

  return {
    name: 'copy-wasm',
    writeBundle(options) {
      const outDir = options.dir ?? path.resolve(__dir, 'dist')
      const assetsDir = path.join(outDir, 'assets')
      fs.mkdirSync(assetsDir, { recursive: true })

      // LlamaCpp WASM binaries (LLM/VLM)
      for (const file of [
        'racommons-llamacpp.wasm',
        'racommons-llamacpp.js',
        'racommons-llamacpp-webgpu.wasm',
        'racommons-llamacpp-webgpu.js',
      ]) {
        const src = path.join(llamacppWasm, file)
        if (fs.existsSync(src)) {
          fs.copyFileSync(src, path.join(assetsDir, file))
        }
      }

      // Sherpa-ONNX: copy all files in sherpa/ subdirectory (STT/TTS/VAD)
      const sherpaDir = path.join(onnxWasm, 'sherpa')
      const sherpaOut = path.join(assetsDir, 'sherpa')
      if (fs.existsSync(sherpaDir)) {
        fs.mkdirSync(sherpaOut, { recursive: true })
        for (const file of fs.readdirSync(sherpaDir)) {
          fs.copyFileSync(path.join(sherpaDir, file), path.join(sherpaOut, file))
        }
      }
    },
  }
}

export default defineConfig({
  plugins: [react(), copyWasmPlugin()],
  server: {
    headers: {
      'Cross-Origin-Opener-Policy': 'same-origin',
      'Cross-Origin-Embedder-Policy': 'credentialless',
    },
  },
  assetsInclude: ['**/*.wasm'],
  worker: { format: 'es' },
  optimizeDeps: {
    // CRITICAL: exclude WASM packages from pre-bundling so import.meta.url
    // resolves correctly for automatic WASM file discovery
    exclude: ['@runanywhere/web-llamacpp', '@runanywhere/web-onnx'],
  },
})
optimizeDeps.exclude is critical. Without excluding the WASM packages from Vite’s pre-bundling, import.meta.url resolves to the wrong paths and WASM files won’t be found at runtime. This is the most common cause of “WASM not loading” errors with Vite.
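To make the path problem concrete, here is a minimal sketch of how a URL built relative to import.meta.url changes when the importing module moves into Vite's dependency cache. The dist/index.js entry file, the relative ../wasm/ path, and the cache file name are assumptions chosen for illustration; the SDK's actual lookup logic may differ.

```typescript
// Resolve a WASM path the way `new URL(relative, import.meta.url)` would,
// with the module URL passed in explicitly so both cases can be compared.
function resolveWasm(moduleUrl: string, relativePath: string): string {
  return new URL(relativePath, moduleUrl).href
}

// Assumed layout: the package entry file sits in dist/, next to wasm/.
const relativeWasm = '../wasm/racommons-llamacpp.wasm'

// Package excluded from pre-bundling: served straight from node_modules.
const direct = resolveWasm(
  'http://localhost:5173/node_modules/@runanywhere/web-llamacpp/dist/index.js',
  relativeWasm,
)

// Package pre-bundled: the code now lives in Vite's dependency cache.
const prebundled = resolveWasm(
  'http://localhost:5173/node_modules/.vite/deps/@runanywhere_web-llamacpp.js',
  relativeWasm,
)

console.log(direct)
// http://localhost:5173/node_modules/@runanywhere/web-llamacpp/wasm/racommons-llamacpp.wasm
console.log(prebundled)
// http://localhost:5173/node_modules/.vite/wasm/racommons-llamacpp.wasm
```

The second URL points at a directory that contains no WASM files, which is why the request fails once the package is pre-bundled.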

Webpack

webpack.config.js
module.exports = {
  module: {
    rules: [{ test: /\.wasm$/, type: 'asset/resource' }],
  },
}
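
The rule above only covers asset handling. If you also want multi-threaded WASM during local development, the dev server needs the same two headers as the other setups on this page. The sketch below assumes webpack-dev-server and uses its devServer.headers option; it is written as a typed object for illustration and would be exported from webpack.config.js in a real project.

```typescript
// Sketch: the WASM asset rule plus Cross-Origin Isolation headers
// for webpack-dev-server (an assumption; adapt to your dev server).
const webpackConfig = {
  module: {
    // Emit .wasm files as static assets and import them as URLs.
    rules: [{ test: /\.wasm$/, type: 'asset/resource' }],
  },
  devServer: {
    headers: {
      'Cross-Origin-Opener-Policy': 'same-origin',
      'Cross-Origin-Embedder-Policy': 'credentialless',
    },
  },
}

console.log(Object.keys(webpackConfig.devServer.headers).join(', '))
```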

Next.js

next.config.js
/** @type {import('next').NextConfig} */
const nextConfig = {
  async headers() {
    return [
      {
        source: '/(.*)',
        headers: [
          { key: 'Cross-Origin-Opener-Policy', value: 'same-origin' },
          { key: 'Cross-Origin-Embedder-Policy', value: 'credentialless' },
        ],
      },
    ]
  },
  webpack: (config) => {
    config.module.rules.push({
      test: /\.wasm$/,
      type: 'asset/resource',
    })
    return config
  },
}

module.exports = nextConfig

Cross-Origin Isolation Headers

For multi-threaded WASM (significantly better performance), your server must set two HTTP headers:
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: credentialless
These enable SharedArrayBuffer, which is required for multi-threaded WASM. Without them, the SDK falls back to single-threaded mode.
Always use credentialless, not require-corp, for COEP. Using require-corp will break WASM loading in most setups because it requires every sub-resource (including Vite’s /@fs/ served files, CDN assets, fonts, and worker scripts) to include a Cross-Origin-Resource-Policy header. In practice, require-corp causes failures that are hard to diagnose, such as module scripts being blocked with “non-JavaScript MIME type” errors. credentialless enables SharedArrayBuffer without breaking cross-origin resource loading.
Iframe environments (Replit, CodeSandbox, StackBlitz): These previews run your app inside an iframe hosted by a page on a different origin. Cross-Origin-Opener-Policy applies only to top-level documents, so an embedded frame cannot become cross-origin isolated through its own headers. In these environments, SharedArrayBuffer will NOT be available regardless of your header configuration. The SDK will fall back to single-threaded WASM mode, which still works but is slower. This is an environment limitation, not a bug. When accessed directly (not in an iframe), the headers work correctly.
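To see which mode a given page is likely to get, you can inspect the standard crossOriginIsolated flag and the presence of SharedArrayBuffer. The sketch below is a hypothetical helper, not an SDK API; it takes the globals as a parameter so the logic can be exercised outside a browser.

```typescript
// Minimal shape of the globals this check reads.
interface IsolationScope {
  crossOriginIsolated?: boolean
  SharedArrayBuffer?: unknown
}

type ThreadingMode = 'multi-threaded' | 'single-threaded'

// Hypothetical helper: predicts the mode from the environment alone.
// It does not ask the SDK which mode it actually selected.
function expectedThreadingMode(scope: IsolationScope): ThreadingMode {
  const isolated = scope.crossOriginIsolated === true
  const hasSharedMemory = typeof scope.SharedArrayBuffer !== 'undefined'
  return isolated && hasSharedMemory ? 'multi-threaded' : 'single-threaded'
}

// Page served with both headers, opened directly in a tab.
console.log(expectedThreadingMode({ crossOriginIsolated: true, SharedArrayBuffer }))
// Same page inside an iframe preview, where isolation is not granted.
console.log(expectedThreadingMode({ crossOriginIsolated: false }))
```

In a browser, passing the page's own global object gives the result for the current context.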

Server Configuration

vercel.json
{
  "headers": [
    {
      "source": "/(.*)",
      "headers": [
        { "key": "Cross-Origin-Opener-Policy", "value": "same-origin" },
        { "key": "Cross-Origin-Embedder-Policy", "value": "credentialless" }
      ]
    },
    {
      "source": "/assets/(.*).wasm",
      "headers": [
        { "key": "Content-Type", "value": "application/wasm" },
        { "key": "Cache-Control", "value": "public, max-age=31536000, immutable" }
      ]
    }
  ]
}
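
After deploying, it can help to confirm that the headers actually reach the browser. The following is a minimal sketch using the standard fetch and Headers APIs; the function names are hypothetical and the check covers only the two isolation headers.

```typescript
// Header values the configuration above is expected to produce.
const REQUIRED_HEADERS: Record<string, string> = {
  'cross-origin-opener-policy': 'same-origin',
  'cross-origin-embedder-policy': 'credentialless',
}

// Returns the names of required headers that are absent or have another value.
function missingIsolationHeaders(headers: Headers): string[] {
  return Object.entries(REQUIRED_HEADERS)
    .filter(([name, expected]) => headers.get(name)?.trim().toLowerCase() !== expected)
    .map(([name]) => name)
}

// Usage sketch: request a page from the deployment and inspect its headers.
async function checkDeployment(url: string): Promise<string[]> {
  const response = await fetch(url, { method: 'HEAD' })
  return missingIsolationHeaders(response.headers)
}
```

An empty array from checkDeployment means both headers are present with the expected values.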

Package Contents

@runanywhere/web-llamacpp

Component | Description | Size
TypeScript API | TextGeneration, VLMWorkerBridge, VideoCapture, LlamaCPP | ~50KB
racommons-llamacpp.wasm | CPU WASM binary (llama.cpp) | ~3.6MB
racommons-llamacpp.js | CPU WASM glue code | ~200KB
racommons-llamacpp-webgpu.wasm | WebGPU WASM binary (optional) | ~4MB
racommons-llamacpp-webgpu.js | WebGPU WASM glue code (optional) | ~200KB

@runanywhere/web-onnx

Component | Description | Size
TypeScript API | ONNX, AudioCapture, AudioPlayback, VAD, SpeechActivity | ~30KB
sherpa/ directory | sherpa-onnx WASM + glue (STT/TTS/VAD) | ~12MB
The sherpa-onnx WASM module is only loaded when you call ONNX.register(). If you only need LLM text generation, you don’t need @runanywhere/web-onnx at all.

Supported Model Formats

Format | Extension | Backend | Package | Use Case
GGUF | .gguf | llama.cpp | web-llamacpp | LLM text generation, VLM
ONNX | .onnx | sherpa-onnx | web-onnx | STT, TTS, VAD
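
If your app accepts arbitrary model files, the table above can be encoded as a small lookup. This is a hypothetical helper built only from the table, not an SDK function; it inspects the file extension and does not validate the file contents.

```typescript
interface BackendInfo {
  backend: string
  package: string
  useCases: string[]
}

// One entry per row of the Supported Model Formats table.
const FORMATS: Record<string, BackendInfo> = {
  '.gguf': {
    backend: 'llama.cpp',
    package: '@runanywhere/web-llamacpp',
    useCases: ['LLM text generation', 'VLM'],
  },
  '.onnx': {
    backend: 'sherpa-onnx',
    package: '@runanywhere/web-onnx',
    useCases: ['STT', 'TTS', 'VAD'],
  },
}

// Returns the backend for a model file name, or undefined if unsupported.
function backendForModel(fileName: string): BackendInfo | undefined {
  const dot = fileName.lastIndexOf('.')
  if (dot < 0) return undefined
  return FORMATS[fileName.slice(dot).toLowerCase()]
}

console.log(backendForModel('assistant-q4.gguf')?.package)
// @runanywhere/web-llamacpp
```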

Browser Compatibility

Browser | Version | Status
Chrome | 96+ | Fully supported
Edge | 96+ | Fully supported
Firefox | 119+ | Supported (no WebGPU)
Safari | 17+ | Basic support (limited OPFS)
Safari has known reliability issues with OPFS. Mobile browsers have memory constraints that limit larger models. Chrome/Edge 120+ is recommended for the best experience.
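
Version numbers are only a rough guide, so feature detection at startup is often more reliable. The sketch below checks for the standard navigator.gpu (WebGPU) and navigator.storage.getDirectory (OPFS) entry points. It is a hypothetical helper, and the presence of navigator.gpu does not guarantee that a usable GPU adapter exists.

```typescript
// Minimal shape of the navigator properties this check reads.
interface NavigatorLike {
  gpu?: unknown
  storage?: { getDirectory?: unknown }
}

interface BrowserFeatures {
  webgpu: boolean
  opfs: boolean
}

// Hypothetical helper: reports which optional browser features are exposed.
function detectFeatures(nav: NavigatorLike): BrowserFeatures {
  return {
    webgpu: nav.gpu !== undefined,
    opfs: typeof nav.storage?.getDirectory === 'function',
  }
}

// Example with a stand-in for a browser that has OPFS but no WebGPU.
const features = detectFeatures({ storage: { getDirectory: () => undefined } })
console.log(features)
```

In a browser, pass the real navigator object instead of the stand-in.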

Troubleshooting

“SharedArrayBuffer is not defined”

Cause: Missing Cross-Origin Isolation headers.
Fix: Add the COOP/COEP headers to your server configuration. The SDK will still work in single-threaded mode without them, but performance will be reduced.
In iframe-based environments (Replit preview, CodeSandbox, StackBlitz), SharedArrayBuffer is unavailable even with correct headers because an embedded frame cannot become cross-origin isolated through its own headers. The SDK still works in single-threaded mode. Access the app directly (not through the iframe preview) for full multi-threaded performance.

WASM file not loading

Cause: Bundler not configured correctly, or optimizeDeps.exclude missing for Vite.
Fix: For Vite, ensure @runanywhere/web-llamacpp and @runanywhere/web-onnx are in optimizeDeps.exclude. For other bundlers, configure .wasm files as static assets.

WASM loads HTML instead of binary (production)

Cause: Your server has a SPA catch-all route (e.g., Express app.get('*', (req, res) => res.sendFile('index.html'))) that serves HTML for any unmatched path, including .wasm file requests. The WASM compiler then receives HTML bytes (3c 21 44 4f = <!DO…) instead of the binary, causing a cryptic error.
Error message: CompileError: WebAssembly.instantiate(): expected magic word 00 61 73 6d, found 3c 21 44 4f
Fix: Ensure your server serves .wasm files with the correct MIME type before the SPA catch-all. For Express:
import express from 'express'

const app = express()

// Serve static assets BEFORE the catch-all — wasm files need correct MIME type
app.use(express.static('dist/public', {
  setHeaders: (res, filePath) => {
    if (filePath.endsWith('.wasm')) {
      res.setHeader('Content-Type', 'application/wasm')
    }
  },
}))

// SPA catch-all AFTER static files
app.get('*', (req, res) => {
  // Only serve index.html for non-asset requests
  if (!req.path.match(/\.(js|css|wasm|json|png|jpg|svg|ico|woff2?)$/)) {
    res.sendFile('index.html', { root: 'dist/public' })
  } else {
    res.status(404).end()
  }
})
This is the #1 production deployment issue. The copyWasmPlugin() correctly copies .wasm files to dist/assets/, but if your server’s catch-all route intercepts the request first, the browser receives HTML instead of the WASM binary. Always serve static assets before SPA routing.
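
To confirm what the server really returns for a .wasm URL, you can compare the first four bytes of the response with the magic word from the error message (00 61 73 6d). This is a minimal diagnostic sketch with hypothetical function names; checkWasmUrl downloads the whole file, which is acceptable for a one-off check but not for production code.

```typescript
// The four bytes every WebAssembly binary starts with: "\0asm".
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d]

// True if the buffer begins with the WebAssembly magic word.
function isWasmBinary(bytes: Uint8Array): boolean {
  return bytes.length >= 4 && WASM_MAGIC.every((byte, i) => bytes[i] === byte)
}

// Usage sketch: fetch the URL and check what actually came back.
async function checkWasmUrl(url: string): Promise<boolean> {
  const response = await fetch(url)
  const bytes = new Uint8Array(await response.arrayBuffer())
  return response.ok && isWasmBinary(bytes)
}

// An HTML fallback page starts with "<!DO" (3c 21 44 4f), so it fails the check.
console.log(isWasmBinary(new Uint8Array([0x3c, 0x21, 0x44, 0x4f]))) // false
console.log(isWasmBinary(new Uint8Array([0x00, 0x61, 0x73, 0x6d]))) // true
```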

VLM Worker fails with “non-JavaScript MIME type”

Cause: The VLM Web Worker script URL resolves to a path that returns HTML (same catch-all issue as above), or Vite’s ?worker&url import isn’t configured correctly.
Error message: Failed to load module script: The server responded with a non-JavaScript MIME type of "text/html"
Fix:
  1. Ensure worker: { format: 'es' } is in your Vite config
  2. Ensure the catch-all route doesn’t intercept .js file requests (see fix above)
  3. For the ?worker&url import, add a TypeScript declaration:
// src/vite-env.d.ts or global.d.ts
declare module '*?worker&url' {
  const url: string
  export default url
}

OPFS storage not persisting

Cause: Incognito/Private mode or browser eviction.
Fix: Ensure you are not in private browsing mode. Safari has known OPFS issues; Chrome/Edge is recommended.
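For the eviction case, browsers expose the standard StorageManager API (navigator.storage.persisted() and navigator.storage.persist()) to ask that an origin's storage be exempt from automatic eviction. The sketch below is a hypothetical helper; whether the SDK already makes this request is not documented here, and the browser may decline it.

```typescript
// Minimal shape of the StorageManager methods this helper uses.
interface StorageManagerLike {
  persisted?: () => Promise<boolean>
  persist?: () => Promise<boolean>
}

// Returns true if storage is (or becomes) persistent, false otherwise.
async function ensurePersistentStorage(
  storage: StorageManagerLike | undefined,
): Promise<boolean> {
  // Older browsers may not expose the API at all.
  if (!storage?.persist || !storage.persisted) return false
  // Already persistent: nothing to request.
  if (await storage.persisted()) return true
  // Ask the browser; it may grant or deny without prompting the user.
  return storage.persist()
}

// In a browser: ensurePersistentStorage(navigator.storage).then(console.log)
```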

Large model download crashes the tab

Cause: Downloading models larger than ~200MB can exhaust available browser memory, especially on memory-constrained devices or when other tabs are open. The OPFS write operation buffers the entire model before committing.
Fix:
  • Close other browser tabs to free memory before downloading large models
  • Start with smaller models (350M-500M parameter models are typically under 300MB)
  • Monitor model.downloadProgress events to detect stalls
  • If the tab crashes during download, refresh and retry — OPFS supports resuming from partial downloads
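
The stall monitoring mentioned in the list above can be sketched as a small class that tracks when the byte count last changed. How you subscribe to model.downloadProgress and the shape of its payload are not shown on this page, so the class takes plain numbers; StallDetector and the 15-second timeout are illustrative assumptions.

```typescript
// Hypothetical helper: flags a download whose byte count has stopped changing.
class StallDetector {
  private readonly timeoutMs: number
  private lastBytes = -1
  private lastChangeAt = 0

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs
  }

  // Record a progress report; `now` is a timestamp in milliseconds.
  update(bytesLoaded: number, now: number): void {
    if (bytesLoaded !== this.lastBytes) {
      this.lastBytes = bytesLoaded
      this.lastChangeAt = now
    }
  }

  // True once no new bytes have arrived for at least `timeoutMs`.
  isStalled(now: number): boolean {
    return this.lastBytes >= 0 && now - this.lastChangeAt >= this.timeoutMs
  }
}

const detector = new StallDetector(15_000)
detector.update(0, 0)
detector.update(1024, 1_000)
console.log(detector.isStalled(5_000)) // false: bytes arrived 4 s ago
console.log(detector.isStalled(16_000)) // true: no new bytes for 15 s
```

In an app, call update() from your progress handler with Date.now() and poll isStalled() on a timer to decide when to prompt the user to retry.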

WebGPU WASM 404 in console

Cause: The SDK tries to load racommons-llamacpp-webgpu.wasm for GPU acceleration but it may not be available.
Fix: This is harmless. The SDK gracefully falls back to CPU mode. You can suppress the 404 by ensuring the WebGPU WASM files are copied to your assets directory.

Next Steps

Quick Start

Initialize the SDK and run your first browser inference