Overview

The RunAnywhere Swift SDK is a production-grade, on-device AI SDK for Apple platforms. It lets developers run AI models directly on Apple devices, with no network connectivity required for inference, which keeps latency low and user data private. The SDK provides a unified interface to multiple AI capabilities: language models (LLM), speech-to-text (STT), text-to-speech (TTS), voice activity detection (VAD), and a full voice agent pipeline.

Key Capabilities

  • Multi-backend architecture – Choose from LlamaCPP (GGUF models), ONNX Runtime, or Apple Foundation Models
  • Metal acceleration – GPU-accelerated inference on Apple Silicon
  • Event-driven design – Subscribe to SDK events for reactive UI updates
  • Production-ready – Built-in analytics, logging, device registration, and model lifecycle management

Core Philosophy

  • On-device inference – All AI inference runs locally, ensuring low latency and data privacy. Once models are downloaded, no network connection is required for inference.
  • Modular backends – Backend engines are optional modules; include only what you need to keep your app's binary size minimal.
  • Privacy by default – Audio and text data never leave the device unless explicitly configured. Only anonymous analytics are collected by default.
  • Event-driven – Subscribe to SDK events for reactive UI updates and observability; track generation progress, model loading, and errors in real time (see the sketch below).
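
The event names and subscription mechanism are defined by the SDK itself and may change between versions; the sketch below only illustrates the general pattern of driving UI state from an event stream. The `SDKEvent` cases, `EventEmitting` protocol, and `events` property are assumptions for illustration, not the SDK's documented API.

```swift
import Foundation

// Hypothetical event cases; the real SDK defines its own event type.
enum SDKEvent {
    case modelLoadStarted(modelID: String)
    case modelLoadCompleted(modelID: String)
    case generationProgress(tokensSoFar: Int)
    case error(message: String)
}

// Hypothetical facade exposing SDK events as an AsyncStream.
protocol EventEmitting {
    var events: AsyncStream<SDKEvent> { get }
}

/// Drives UI state (here, a simple status string) from the event stream.
func observe(_ sdk: some EventEmitting, update: @escaping (String) -> Void) -> Task<Void, Never> {
    Task {
        for await event in sdk.events {
            switch event {
            case .modelLoadStarted(let id):       update("Loading \(id)…")
            case .modelLoadCompleted(let id):     update("Ready: \(id)")
            case .generationProgress(let tokens): update("Generated \(tokens) tokens")
            case .error(let message):             update("Error: \(message)")
            }
        }
    }
}
```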

Features

Language Models (LLM)

  • On-device text generation with streaming support
  • Structured output generation with Generatable protocol
  • System prompts and customizable generation parameters
  • Support for thinking/reasoning models with token extraction
  • Multiple framework backends (LlamaCPP, Apple Foundation Models)
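
The list above mentions streaming generation with system prompts and tunable parameters. The SDK's actual call signatures may differ; as a hedged sketch of the general pattern (the `TextGenerating` protocol, `generateStream`, and its parameter names are illustrative assumptions), token-by-token output in Swift typically flows through an `AsyncThrowingStream`:

```swift
import Foundation

/// Illustrative abstraction over an on-device LLM backend
/// (e.g. LlamaCPP or Apple Foundation Models); not the SDK's real API.
protocol TextGenerating {
    func generateStream(prompt: String,
                        systemPrompt: String?,
                        maxTokens: Int,
                        temperature: Double) -> AsyncThrowingStream<String, Error>
}

func printHaiku(using llm: some TextGenerating) async throws {
    let tokens = llm.generateStream(prompt: "Write a haiku about the sea.",
                                    systemPrompt: "You are a concise poet.",
                                    maxTokens: 64,
                                    temperature: 0.7)
    // Tokens arrive incrementally, so the UI can render partial output.
    for try await token in tokens {
        print(token, terminator: "")
    }
    print()
}
```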

Speech-to-Text (STT)

  • Real-time streaming transcription
  • Batch audio transcription
  • Multi-language support
  • Whisper-based models via ONNX Runtime
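
The exact transcription entry points are version-specific; the sketch below only illustrates the shape of a batch transcription call (`SpeechTranscribing`, `transcribe(audioURL:language:)`, and the result type are assumed names, not the SDK's):

```swift
import Foundation

/// Illustrative STT abstraction; the real SDK exposes its own types.
struct TranscriptionResult {
    let text: String
    let language: String
}

protocol SpeechTranscribing {
    /// Batch transcription of a recorded audio file.
    func transcribe(audioURL: URL, language: String) async throws -> TranscriptionResult
}

func transcribeRecording(with stt: some SpeechTranscribing, file: URL) async throws {
    let result = try await stt.transcribe(audioURL: file, language: "en")
    print("Transcript (\(result.language)): \(result.text)")
}
```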

Text-to-Speech (TTS)

  • Neural voice synthesis with ONNX models
  • System voices via AVSpeechSynthesizer
  • Streaming audio generation for long text
  • Customizable voice, pitch, rate, and volume
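
The neural ONNX voice path is SDK-specific, but the system-voice path mentioned above goes through Apple's standard AVSpeechSynthesizer, which looks roughly like this (the voice identifier and parameter values are arbitrary examples):

```swift
import AVFoundation

// Keep a strong reference to the synthesizer for the duration of playback.
let synthesizer = AVSpeechSynthesizer()

let utterance = AVSpeechUtterance(string: "Hello from the on-device voice.")
utterance.voice = AVSpeechSynthesisVoice(language: "en-US") // pick a system voice
utterance.rate = AVSpeechUtteranceDefaultSpeechRate         // speaking rate
utterance.pitchMultiplier = 1.1                             // slightly higher pitch
utterance.volume = 0.9                                      // 0.0 ... 1.0

synthesizer.speak(utterance)
```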

Voice Activity Detection (VAD)

  • Energy-based speech detection
  • Configurable sensitivity thresholds
  • Real-time audio stream processing
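
Energy-based detection reduces to comparing a frame's RMS energy against a threshold. A minimal, framework-free sketch of that idea follows; the threshold value is chosen arbitrarily and is not the SDK's default:

```swift
import Foundation

/// Minimal energy-based voice activity detector:
/// a frame counts as speech when its RMS energy exceeds a threshold.
struct EnergyVAD {
    /// Sensitivity threshold on RMS energy (for samples normalized to -1.0 ... 1.0).
    var threshold: Float = 0.02

    func isSpeech(frame: [Float]) -> Bool {
        guard !frame.isEmpty else { return false }
        let meanSquare = frame.reduce(0) { $0 + $1 * $1 } / Float(frame.count)
        return meanSquare.squareRoot() > threshold
    }
}

// Usage: feed fixed-size frames from the microphone stream.
let vad = EnergyVAD(threshold: 0.03)
let silence = [Float](repeating: 0.001, count: 512)
print(vad.isSpeech(frame: silence)) // false
```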

Voice Agent Pipeline

  • Full VAD → STT → LLM → TTS orchestration
  • Complete voice conversation flow
  • Streaming and batch processing modes
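
Conceptually, the pipeline chains the four stages in order. The sketch below composes the illustrative types from the earlier examples to show that flow; the SDK's own pipeline type and configuration will differ:

```swift
import Foundation

/// Conceptual VAD → STT → LLM → TTS composition using the illustrative
/// types sketched above; not the SDK's real pipeline API.
struct VoiceTurn {
    let vad: EnergyVAD
    let stt: any SpeechTranscribing
    let llm: any TextGenerating
    let speak: (String) -> Void   // e.g. a small AVSpeechSynthesizer wrapper

    func handle(recording: URL, frames: [[Float]]) async throws {
        // 1. Skip turns where no speech was detected.
        guard frames.contains(where: vad.isSpeech(frame:)) else { return }
        // 2. Transcribe the user's utterance.
        let transcript = try await stt.transcribe(audioURL: recording, language: "en")
        // 3. Generate a reply and collect the streamed tokens.
        var reply = ""
        for try await token in llm.generateStream(prompt: transcript.text,
                                                  systemPrompt: nil,
                                                  maxTokens: 128,
                                                  temperature: 0.7) {
            reply += token
        }
        // 4. Speak the reply.
        speak(reply)
    }
}
```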

Model Management

  • Automatic model discovery and catalog sync
  • Download with progress tracking (download, extract, validate stages)
  • In-memory model storage with file system caching
  • Framework-specific model assignment
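
The download, extract, and validate stages typically surface as a progress callback or async sequence. A hypothetical sketch of that shape (the enum cases and method names are assumptions, not the SDK's documented API):

```swift
import Foundation

/// Hypothetical progress events for the three model-preparation stages.
enum ModelProgress {
    case downloading(fraction: Double)
    case extracting
    case validating
    case ready(localURL: URL)
}

protocol ModelManaging {
    /// Starts downloading a catalog model and reports stage-by-stage progress.
    func download(modelID: String) -> AsyncThrowingStream<ModelProgress, Error>
}

func fetchModel(_ id: String, using manager: some ModelManaging) async throws -> URL {
    for try await progress in manager.download(modelID: id) {
        switch progress {
        case .downloading(let fraction): print("Downloading \(Int(fraction * 100))%")
        case .extracting:                print("Extracting…")
        case .validating:                print("Validating…")
        case .ready(let url):            return url
        }
    }
    throw URLError(.cannotLoadFromNetwork) // stream ended without a ready model
}
```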

System Requirements

Platform    Minimum Version
iOS         17.0+
macOS       14.0+
tvOS        17.0+
watchOS     10.0+

Swift Version: 5.9+
Xcode: 15.2+

Some optional modules have higher runtime requirements:

  • Apple Foundation Models (RunAnywhereAppleAI): iOS 26+ / macOS 26+ at runtime

SDK Modules

Module               Purpose
RunAnywhere          Core SDK (required)
RunAnywhereLlamaCPP  LLM text generation with GGUF models
RunAnywhereONNX      STT/TTS/VAD via ONNX Runtime
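
Assuming the SDK is distributed via Swift Package Manager (the repository URL and product names below are assumptions derived from the module names above, so check the package's own manifest), a consuming Package.swift would declare the supported platforms and pick only the modules it needs:

```swift
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    name: "MyVoiceApp",
    platforms: [
        .iOS(.v17), .macOS(.v14), .tvOS(.v17), .watchOS(.v10)
    ],
    dependencies: [
        // Hypothetical repository URL; use the SDK's published location.
        .package(url: "https://github.com/RunAnywhere/runanywhere-swift", from: "1.0.0")
    ],
    targets: [
        .target(
            name: "MyVoiceApp",
            dependencies: [
                // Core SDK plus only the backends this app needs.
                .product(name: "RunAnywhere", package: "runanywhere-swift"),
                .product(name: "RunAnywhereLlamaCPP", package: "runanywhere-swift"),
                .product(name: "RunAnywhereONNX", package: "runanywhere-swift")
            ]
        )
    ]
)
```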

Next Steps