Installation - RunAnywhere Documentation

Early Beta — The Web SDK is in early beta. APIs may change between releases.

Package Installation

The Web SDK is split into three packages. Install all three to access every feature, or pick only the backends you need:

npm install @runanywhere/web @runanywhere/web-llamacpp @runanywhere/web-onnx

Or with yarn:

yarn add @runanywhere/web @runanywhere/web-llamacpp @runanywhere/web-onnx

Package Breakdown

Package	Version	What’s Inside
`@runanywhere/web`	`0.1.0-beta.9`	Core SDK: `RunAnywhere`, `ModelManager`, `ModelCategory`, `EventBus`, `VoicePipeline`, `SDKEnvironment`, `LLMFramework`, `CompactModelDef`
`@runanywhere/web-llamacpp`	`0.1.0-beta.9`	LLM/VLM backend: `LlamaCPP`, `TextGeneration`, `VLMWorkerBridge`, `VideoCapture`, `startVLMWorkerRuntime`
`@runanywhere/web-onnx`	`0.1.0-beta.9`	STT/TTS/VAD backend: `ONNX`, `AudioCapture`, `AudioPlayback`, `VAD`, `SpeechActivity`

If you only need LLM text generation, you can skip @runanywhere/web-onnx. If you only need STT/TTS/VAD, you can skip @runanywhere/web-llamacpp. The core @runanywhere/web package is always required.

Bundler Configuration

Vite (Recommended)

The starter app uses Vite. Here is the complete vite.config.ts that handles WASM serving, Cross-Origin Isolation, Web Workers, and production builds:

vite.config.ts

import { defineConfig, type Plugin } from 'vite'
import react from '@vitejs/plugin-react'
import path from 'path'
import fs from 'fs'
import { fileURLToPath } from 'url'

const __dir = path.dirname(fileURLToPath(import.meta.url))

/**
 * Copies WASM binaries from @runanywhere npm packages into dist/assets/
 * for production builds. In dev mode Vite serves node_modules directly.
 */
function copyWasmPlugin(): Plugin {
  const llamacppWasm = path.resolve(__dir, 'node_modules/@runanywhere/web-llamacpp/wasm')
  const onnxWasm = path.resolve(__dir, 'node_modules/@runanywhere/web-onnx/wasm')

  return {
    name: 'copy-wasm',
    writeBundle(options) {
      const outDir = options.dir ?? path.resolve(__dir, 'dist')
      const assetsDir = path.join(outDir, 'assets')
      fs.mkdirSync(assetsDir, { recursive: true })

      // LlamaCpp WASM binaries (LLM/VLM)
      for (const file of [
        'racommons-llamacpp.wasm',
        'racommons-llamacpp.js',
        'racommons-llamacpp-webgpu.wasm',
        'racommons-llamacpp-webgpu.js',
      ]) {
        const src = path.join(llamacppWasm, file)
        if (fs.existsSync(src)) {
          fs.copyFileSync(src, path.join(assetsDir, file))
        }
      }

      // Sherpa-ONNX: copy all files in sherpa/ subdirectory (STT/TTS/VAD)
      const sherpaDir = path.join(onnxWasm, 'sherpa')
      const sherpaOut = path.join(assetsDir, 'sherpa')
      if (fs.existsSync(sherpaDir)) {
        fs.mkdirSync(sherpaOut, { recursive: true })
        for (const file of fs.readdirSync(sherpaDir)) {
          fs.copyFileSync(path.join(sherpaDir, file), path.join(sherpaOut, file))
        }
      }
    },
  }
}

export default defineConfig({
  plugins: [react(), copyWasmPlugin()],
  server: {
    headers: {
      'Cross-Origin-Opener-Policy': 'same-origin',
      'Cross-Origin-Embedder-Policy': 'credentialless',
    },
  },
  assetsInclude: ['**/*.wasm'],
  worker: { format: 'es' },
  optimizeDeps: {
    // CRITICAL: exclude WASM packages from pre-bundling so import.meta.url
    // resolves correctly for automatic WASM file discovery
    exclude: ['@runanywhere/web-llamacpp', '@runanywhere/web-onnx'],
  },
})

optimizeDeps.exclude is critical. Without excluding the WASM packages from Vite’s pre-bundling, import.meta.url resolves to the wrong paths and WASM files won’t be found at runtime. This is the most common cause of “WASM not loading” errors with Vite.
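To see why pre-bundling breaks discovery, it helps to look at how a path relative to import.meta.url resolves. The sketch below is illustrative only: the dist/index.js entry point and the ../wasm/ relative path are assumed for the example and are not confirmed details of the package layout. It resolves the same relative path from the package's real location and from Vite's dependency cache.

```typescript
// Resolve a WASM path the way `new URL(relativePath, import.meta.url)` would.
function resolveWasmUrl(moduleUrl: string, relativePath: string): string {
  return new URL(relativePath, moduleUrl).href
}

// Hypothetical layout: the package entry sits in dist/, next to a wasm/ folder.
const relativeWasm = '../wasm/racommons-llamacpp.wasm'

// Excluded from pre-bundling: the module is served from its real location.
const excluded = resolveWasmUrl(
  'http://localhost:5173/node_modules/@runanywhere/web-llamacpp/dist/index.js',
  relativeWasm,
)

// Pre-bundled: Vite serves a flattened copy from its dependency cache.
const prebundled = resolveWasmUrl(
  'http://localhost:5173/node_modules/.vite/deps/@runanywhere_web-llamacpp.js',
  relativeWasm,
)

console.log(excluded)   // points inside the package, where the binary exists
console.log(prebundled) // points inside node_modules/.vite/, where it does not
```

In the pre-bundled case the relative path is resolved against the cache directory, so the request misses the binary.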

Webpack

webpack.config.js

module.exports = {
  module: {
    rules: [{ test: /\.wasm$/, type: 'asset/resource' }],
  },
}
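The Webpack snippet above only covers the .wasm asset rule. If you also want multi-threaded WASM during local development, webpack-dev-server can send the isolation headers through its devServer.headers option. The following is a minimal sketch that assumes webpack 5 with webpack-dev-server and a TypeScript config file; it is not taken from the starter app.

```typescript
// webpack.config.ts (sketch): the asset rule plus dev-server isolation headers.
const config = {
  module: {
    // Emit .wasm files as static assets and expose their URLs to the bundle.
    rules: [{ test: /\.wasm$/, type: 'asset/resource' }],
  },
  devServer: {
    // Sent with every dev-server response; production servers need their own config.
    headers: {
      'Cross-Origin-Opener-Policy': 'same-origin',
      'Cross-Origin-Embedder-Policy': 'credentialless',
    },
  },
}

// In webpack.config.ts, finish with: export default config
```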

Next.js

next.config.js

/** @type {import('next').NextConfig} */
const nextConfig = {
  async headers() {
    return [
      {
        source: '/(.*)',
        headers: [
          { key: 'Cross-Origin-Opener-Policy', value: 'same-origin' },
          { key: 'Cross-Origin-Embedder-Policy', value: 'credentialless' },
        ],
      },
    ]
  },
  webpack: (config) => {
    config.module.rules.push({
      test: /\.wasm$/,
      type: 'asset/resource',
    })
    return config
  },
}

module.exports = nextConfig

Cross-Origin Isolation Headers

For multi-threaded WASM (significantly better performance), your server must set two HTTP headers:

Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: credentialless

These enable SharedArrayBuffer, which is required for multi-threaded WASM. Without them, the SDK falls back to single-threaded mode.

Always use credentialless, not require-corp, for COEP. With require-corp, every sub-resource (including Vite’s /@fs/ served files, CDN assets, fonts, and worker scripts) must include a Cross-Origin-Resource-Policy header, which breaks WASM loading in most setups. In practice, the failures are hard to trace: module scripts get blocked with “non-JavaScript MIME type” errors. credentialless enables SharedArrayBuffer without breaking cross-origin resource loading.

Iframe environments (Replit, CodeSandbox, StackBlitz): The Cross-Origin-Opener-Policy: same-origin header breaks iframe embedding because the parent and child frames are on different origins. In these environments, SharedArrayBuffer will NOT be available regardless of your header configuration. The SDK will fall back to single-threaded WASM mode, which still works but is slower. This is an environment limitation, not a bug. When accessed directly (not in an iframe), the headers work correctly.
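To tell these two situations apart at runtime, an app can inspect the standard crossOriginIsolated flag, the presence of SharedArrayBuffer, and whether it is running inside a frame. The helper below is a rough sketch; diagnoseThreading and its return strings are made-up names for illustration and are not part of the SDK.

```typescript
// Minimal view of the browser state needed for the diagnosis.
interface IsolationState {
  crossOriginIsolated: boolean
  hasSharedArrayBuffer: boolean
  inIframe: boolean
}

type ThreadingDiagnosis =
  | 'multi-threaded'
  | 'single-threaded: embedded in an iframe'
  | 'single-threaded: check COOP/COEP headers'

// Classify why multi-threaded WASM is or is not available.
function diagnoseThreading(state: IsolationState): ThreadingDiagnosis {
  if (state.crossOriginIsolated && state.hasSharedArrayBuffer) {
    return 'multi-threaded'
  }
  return state.inIframe
    ? 'single-threaded: embedded in an iframe'
    : 'single-threaded: check COOP/COEP headers'
}

// In a browser, the state could be collected like this:
//   const state = {
//     crossOriginIsolated: globalThis.crossOriginIsolated === true,
//     hasSharedArrayBuffer: typeof SharedArrayBuffer !== 'undefined',
//     inIframe: window.self !== window.top,
//   }
console.log(
  diagnoseThreading({ crossOriginIsolated: false, hasSharedArrayBuffer: false, inIframe: true }),
)
```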

Server Configuration

The examples below cover Vercel, Netlify, Cloudflare Pages, Nginx, and Apache.

vercel.json

{
  "headers": [
    {
      "source": "/(.*)",
      "headers": [
        { "key": "Cross-Origin-Opener-Policy", "value": "same-origin" },
        { "key": "Cross-Origin-Embedder-Policy", "value": "credentialless" }
      ]
    },
    {
      "source": "/assets/(.*).wasm",
      "headers": [
        { "key": "Content-Type", "value": "application/wasm" },
        { "key": "Cache-Control", "value": "public, max-age=31536000, immutable" }
      ]
    }
  ]
}

netlify.toml

[[headers]]
  for = "/*"
  [headers.values]
    Cross-Origin-Opener-Policy = "same-origin"
    Cross-Origin-Embedder-Policy = "credentialless"

[[headers]]
  for = "/assets/*.wasm"
  [headers.values]
    Content-Type = "application/wasm"
    Cache-Control = "public, max-age=31536000, immutable"

Cloudflare Pages: create a _headers file in your build output directory (with Vite, put it in public/ so it is copied into dist/):

/*
  Cross-Origin-Opener-Policy: same-origin
  Cross-Origin-Embedder-Policy: credentialless

/assets/*.wasm
  Content-Type: application/wasm
  Cache-Control: public, max-age=31536000, immutable

nginx.conf

server {
    listen 443 ssl;
    server_name app.example.com;

    add_header Cross-Origin-Opener-Policy "same-origin" always;
    add_header Cross-Origin-Embedder-Policy "credentialless" always;

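    # A types block replaces the inherited MIME map instead of extending it,
    # so include the defaults alongside the extra entry.
    include mime.types;
    types {
        application/wasm wasm;
    }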

    location ~* \.wasm$ {
        add_header Cross-Origin-Opener-Policy "same-origin" always;
        add_header Cross-Origin-Embedder-Policy "credentialless" always;
        add_header Cache-Control "public, max-age=31536000, immutable";
    }
}

.htaccess

<IfModule mod_headers.c>
    Header always set Cross-Origin-Opener-Policy "same-origin"
    Header always set Cross-Origin-Embedder-Policy "credentialless"
</IfModule>

AddType application/wasm .wasm
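After deploying any of these configurations, it is worth confirming that the headers actually reach the browser. The sketch below checks a set of response headers against the two values recommended above; findHeaderProblems and readerFrom are hypothetical helper names, and the HeaderReader interface is just the get() subset of the standard Headers object.

```typescript
// The subset of the standard Headers API that the check needs.
interface HeaderReader {
  get(name: string): string | null
}

const REQUIRED_HEADERS: ReadonlyArray<[string, string]> = [
  ['cross-origin-opener-policy', 'same-origin'],
  ['cross-origin-embedder-policy', 'credentialless'],
]

// Return a human-readable problem for each missing or unexpected header.
function findHeaderProblems(headers: HeaderReader): string[] {
  const problems: string[] = []
  for (const [name, expected] of REQUIRED_HEADERS) {
    const actual = headers.get(name)
    if (actual === null) {
      problems.push(`${name} is missing`)
    } else if (actual.trim().toLowerCase() !== expected) {
      problems.push(`${name} is "${actual}", expected "${expected}"`)
    }
  }
  return problems
}

// Small stand-in for Headers so the example runs without a network call.
function readerFrom(record: Record<string, string>): HeaderReader {
  return { get: (name) => record[name.toLowerCase()] ?? null }
}

console.log(findHeaderProblems(readerFrom({ 'cross-origin-opener-policy': 'same-origin' })))

// Against a live deployment (Node 18+ or a browser):
//   const res = await fetch('https://app.example.com/', { method: 'HEAD' })
//   console.log(findHeaderProblems(res.headers))
```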

Package Contents

`@runanywhere/web-llamacpp`

Component	Description	Size
TypeScript API	`TextGeneration`, `VLMWorkerBridge`, `VideoCapture`, `LlamaCPP`	~50KB
`racommons-llamacpp.wasm`	CPU WASM binary (llama.cpp)	~3.6MB
`racommons-llamacpp.js`	CPU WASM glue code	~200KB
`racommons-llamacpp-webgpu.wasm`	WebGPU WASM binary (optional)	~4MB
`racommons-llamacpp-webgpu.js`	WebGPU WASM glue code (optional)	~200KB

`@runanywhere/web-onnx`

Component	Description	Size
TypeScript API	`ONNX`, `AudioCapture`, `AudioPlayback`, `VAD`, `SpeechActivity`	~30KB
`sherpa/` directory	sherpa-onnx WASM + glue (STT/TTS/VAD)	~12MB

The sherpa-onnx WASM module is only loaded when you call ONNX.register(). If you only need LLM text generation, you don’t need @runanywhere/web-onnx at all.

Supported Model Formats

Format	Extension	Backend	Package	Use Case
GGUF	`.gguf`	llama.cpp	`web-llamacpp`	LLM text generation, VLM
ONNX	`.onnx`	sherpa-onnx	`web-onnx`	STT, TTS, VAD

Browser Compatibility

Browser	Version	Status
Chrome	96+	Fully supported
Edge	96+	Fully supported
Firefox	119+	Supported (no WebGPU)
Safari	17+	Basic support (limited OPFS)

Safari has known reliability issues with OPFS. Mobile browsers have memory constraints that limit larger models. Chrome/Edge 120+ is recommended for the best experience.
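Because support varies by browser, an app may want to probe for features at startup rather than rely on version numbers. The sketch below checks only for the presence of the standard navigator.gpu and navigator.storage.getDirectory entry points; detectCapabilities is a hypothetical helper, and a present navigator.gpu does not guarantee that a usable GPU adapter exists.

```typescript
// The parts of `navigator` the probe looks at, typed loosely for portability.
interface NavigatorLike {
  gpu?: unknown
  storage?: { getDirectory?: unknown }
}

interface Capabilities {
  webgpu: boolean // WebGPU entry point is exposed
  opfs: boolean   // Origin Private File System entry point is exposed
}

function detectCapabilities(nav: NavigatorLike): Capabilities {
  return {
    webgpu: nav.gpu !== undefined && nav.gpu !== null,
    opfs: typeof nav.storage?.getDirectory === 'function',
  }
}

// In a browser: detectCapabilities(navigator)
console.log(detectCapabilities({}))
```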

Troubleshooting

“SharedArrayBuffer is not defined”

Cause: Missing Cross-Origin Isolation headers.

Fix: Add the COOP/COEP headers to your server configuration. The SDK will still work in single-threaded mode without them, but performance will be reduced.

In iframe-based environments (Replit preview, CodeSandbox, StackBlitz), SharedArrayBuffer is unavailable even with correct headers because COOP: same-origin conflicts with iframe embedding. The SDK still works in single-threaded mode. Access the app directly (not through the iframe preview) for full multi-threaded performance.

WASM file not loading

Cause: Bundler not configured correctly, or optimizeDeps.exclude missing for Vite.

Fix: For Vite, ensure @runanywhere/web-llamacpp and @runanywhere/web-onnx are in optimizeDeps.exclude. For other bundlers, configure .wasm files as static assets.

WASM loads HTML instead of binary (production)

Cause: Your server has a SPA catch-all route (e.g., Express app.get('*', (req, res) => res.sendFile('index.html'))) that serves HTML for any unmatched path, including .wasm file requests. The WASM compiler then receives HTML bytes (3c 21 44 4f = <!DO…) instead of the binary, causing a cryptic error.

Error message: CompileError: WebAssembly.instantiate(): expected magic word 00 61 73 6d, found 3c 21 44 4f

Fix: Ensure your server serves .wasm files with the correct MIME type before the SPA catch-all. For Express:

import express from 'express'

const app = express()

// Serve static assets BEFORE the catch-all — wasm files need correct MIME type
app.use(express.static('dist/public', {
  setHeaders: (res, filePath) => {
    if (filePath.endsWith('.wasm')) {
      res.setHeader('Content-Type', 'application/wasm')
    }
  },
}))

// SPA catch-all AFTER static files
app.get('*', (req, res) => {
  // Only serve index.html for non-asset requests
  if (!req.path.match(/\.(js|css|wasm|json|png|jpg|svg|ico|woff2?)$/)) {
    res.sendFile('index.html', { root: 'dist/public' })
  } else {
    res.status(404).end()
  }
})

This is the #1 production deployment issue. The copyWasmPlugin() correctly copies .wasm files to dist/assets/, but if your server’s catch-all route intercepts the request first, the browser receives HTML instead of the WASM binary. Always serve static assets before SPA routing.
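A quick way to confirm this diagnosis is to fetch the .wasm URL and compare its first four bytes with the WebAssembly magic word quoted in the error message. The helper below is a minimal sketch; looksLikeWasm is a hypothetical name, not an SDK function.

```typescript
// Every WebAssembly binary starts with the bytes 00 61 73 6d ("\0asm").
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d]

function looksLikeWasm(bytes: Uint8Array): boolean {
  return bytes.length >= 4 && WASM_MAGIC.every((byte, i) => bytes[i] === byte)
}

const htmlStart = new Uint8Array([0x3c, 0x21, 0x44, 0x4f]) // "<!DO"
const wasmStart = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00])

console.log(looksLikeWasm(htmlStart), looksLikeWasm(wasmStart)) // false true

// Against a deployed site:
//   const res = await fetch('/assets/racommons-llamacpp.wasm')
//   const head = new Uint8Array(await res.arrayBuffer()).subarray(0, 4)
//   if (!looksLikeWasm(head)) {
//     console.error('Not a WASM binary; content-type was', res.headers.get('content-type'))
//   }
```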

VLM Worker fails with “non-JavaScript MIME type”

Cause: The VLM Web Worker script URL resolves to a path that returns HTML (same catch-all issue as above), or Vite’s ?worker&url import isn’t configured correctly.

Error message: Failed to load module script: The server responded with a non-JavaScript MIME type of "text/html"

Fix:

Ensure worker: { format: 'es' } is in your Vite config
Ensure the catch-all route doesn’t intercept .js file requests (see fix above)
For the ?worker&url import, add a TypeScript declaration:

// src/vite-env.d.ts or global.d.ts
declare module '*?worker&url' {
  const url: string
  export default url
}

OPFS storage not persisting

Cause: Incognito/Private mode or browser eviction.

Fix: Ensure you are not in private browsing mode. Safari has known OPFS issues, so Chrome or Edge is recommended.
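For the eviction case, the standard Storage API lets a page ask the browser to treat its origin's storage as persistent. The sketch below wraps navigator.storage.persisted() and navigator.storage.persist(); ensurePersistentStorage is a hypothetical helper, and the browser may still decline the request.

```typescript
// The subset of StorageManager used here; both methods are optional
// because older browsers may not implement them.
interface StorageManagerLike {
  persisted?: () => Promise<boolean>
  persist?: () => Promise<boolean>
}

// Resolves to true if storage is (or becomes) persistent, false otherwise.
async function ensurePersistentStorage(
  storage: StorageManagerLike | undefined,
): Promise<boolean> {
  if (!storage || !storage.persisted || !storage.persist) return false
  if (await storage.persisted()) return true // already granted
  return storage.persist() // ask the browser; it may say no
}

// In a browser:
//   const persistent = await ensurePersistentStorage(navigator.storage)
//   if (!persistent) console.warn('Downloaded models may be evicted under storage pressure')
```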

Large model download crashes the tab

Cause: Downloading models larger than ~200MB can exhaust available browser memory, especially on memory-constrained devices or when other tabs are open. The OPFS write operation buffers the entire model before committing.

Fix:

Close other browser tabs to free memory before downloading large models
Start with smaller models (350M-500M parameter models are typically under 300MB)
Monitor model.downloadProgress events to detect stalls
If the tab crashes during download, refresh and retry — OPFS supports resuming from partial downloads
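Stall detection from progress events can be sketched as a small timer-free helper: record each progress report and flag a stall when the byte count has not advanced for a chosen interval. DownloadStallDetector is a hypothetical class, and the way it is fed from model.downloadProgress events (including the payload shape) is an assumption, since the event format is not described here.

```typescript
// Tracks the last time the downloaded byte count increased.
class DownloadStallDetector {
  private readonly stallAfterMs: number
  private lastBytes = -1
  private lastChangeMs = 0

  constructor(stallAfterMs: number) {
    this.stallAfterMs = stallAfterMs
  }

  // Call on every progress report with the bytes received so far.
  update(bytesLoaded: number, nowMs: number): void {
    if (bytesLoaded > this.lastBytes) {
      this.lastBytes = bytesLoaded
      this.lastChangeMs = nowMs
    }
  }

  // True once progress has been reported but has not advanced for stallAfterMs.
  isStalled(nowMs: number): boolean {
    return this.lastBytes >= 0 && nowMs - this.lastChangeMs >= this.stallAfterMs
  }
}

// Assumed wiring; the subscription call and payload fields are hypothetical:
//   const detector = new DownloadStallDetector(30_000)
//   onDownloadProgress((e) => detector.update(e.bytesLoaded, Date.now()))
//   setInterval(() => { if (detector.isStalled(Date.now())) showRetryPrompt() }, 5_000)
```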

WebGPU WASM 404 in console

Cause: The SDK tries to load racommons-llamacpp-webgpu.wasm for GPU acceleration but it may not be available.

Fix: This is harmless. The SDK gracefully falls back to CPU mode. You can suppress the 404 by ensuring the WebGPU WASM files are copied to your assets directory.

Next Steps

Quick Start

Initialize the SDK and run your first browser inference
