Overview
The RunAnywhere Swift SDK is a production-grade, on-device AI SDK for Apple platforms. It lets developers run AI models directly on Apple devices without requiring network connectivity for inference, ensuring minimal latency and maximum privacy for your users. The SDK provides a unified interface to multiple AI capabilities:
- LLM – Text generation with streaming support and structured output
- STT – Speech-to-text transcription with multiple backends
- TTS – Neural and system voice synthesis
- VAD – Real-time voice activity detection
Key Capabilities
- Multi-backend architecture – Choose from LlamaCPP (GGUF models), ONNX Runtime, or Apple Foundation Models
- Metal acceleration – GPU-accelerated inference on Apple Silicon
- Event-driven design – Subscribe to SDK events for reactive UI updates
- Production-ready – Built-in analytics, logging, device registration, and model lifecycle management
Core Philosophy
On-Device First
All AI inference runs locally, ensuring low latency and data privacy. Once models are
downloaded, no network connection is required for inference.
Plugin Architecture
Backend engines are optional modules—include only what you need. This keeps your app binary size
minimal.
Privacy by Design
Audio and text data never leaves the device unless explicitly configured. Only anonymous
analytics are collected by default.
Event-Driven
Subscribe to SDK events for reactive UI updates and observability. Track generation progress,
model loading, and errors in real-time.
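As a rough sketch of what event-driven consumption can look like, the snippet below iterates over an assumed async event stream. The `RunAnywhereSDK` type, its `events` property, and the case names are illustrative assumptions, not the confirmed API surface.

```swift
import RunAnywhere

// Hypothetical sketch: type, property, and case names below are
// assumptions standing in for the SDK's real event API.
func observeSDKEvents(_ sdk: RunAnywhereSDK) async {
    for await event in sdk.events {
        switch event {
        case .modelLoadProgress(let fraction):
            print("Loading model: \(Int(fraction * 100))%")
        case .generationToken(let token):
            print(token, terminator: "")   // stream tokens as they arrive
        case .error(let error):
            print("SDK error: \(error)")
        default:
            break
        }
    }
}
```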
Features
Language Models (LLM)
- On-device text generation with streaming support
- Structured output generation with the Generatable protocol
- System prompts and customizable generation parameters
- Support for thinking/reasoning models with token extraction
- Multiple framework backends (LlamaCPP, Apple Foundation Models)
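To make the streaming model concrete, here is a minimal sketch; `LLMSession`, `generateStream`, and the options initializer are assumed names, so check the SDK reference for the real generation API.

```swift
import RunAnywhere

// Hypothetical sketch: `LLMSession` and `generateStream` are assumed names.
func summarize(_ text: String, with session: LLMSession) async throws -> String {
    var output = ""
    let stream = session.generateStream(
        prompt: "Summarize: \(text)",
        options: .init(systemPrompt: "You are concise.", maxTokens: 256)
    )
    for try await token in stream {
        output += token   // tokens arrive incrementally
    }
    return output
}
```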
Speech-to-Text (STT)
- Real-time streaming transcription
- Batch audio transcription
- Multi-language support
- Whisper-based models via ONNX Runtime
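A batch transcription call might look like the sketch below; `STTService` and `transcribe(audioURL:language:)` are assumed names used for illustration only.

```swift
import Foundation
import RunAnywhere

// Hypothetical sketch of batch transcription; names are assumptions.
func transcribeRecording(at url: URL, using stt: STTService) async throws {
    let result = try await stt.transcribe(audioURL: url, language: "en")
    print("Transcript: \(result.text)")
}
```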
Text-to-Speech (TTS)
- Neural voice synthesis with ONNX models
- System voices via AVSpeechSynthesizer
- Streaming audio generation for long text
- Customizable voice, pitch, rate, and volume
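The customizable parameters above might be passed along the lines of this sketch; `TTSService` and its option labels are assumptions, not the confirmed API.

```swift
import RunAnywhere

// Hypothetical sketch: `TTSService` and the option labels are assumed
// names illustrating the voice, rate, pitch, and volume parameters.
func speak(_ text: String, using tts: TTSService) async throws {
    try await tts.speak(
        text,
        options: .init(voice: "default", rate: 1.0, pitch: 1.0, volume: 0.8)
    )
}
```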
Voice Activity Detection (VAD)
- Energy-based speech detection
- Configurable sensitivity thresholds
- Real-time audio stream processing
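The energy-based approach itself is straightforward: compute RMS energy over a frame of PCM samples and compare it with a sensitivity threshold. The self-contained function below shows only that core technique, not the SDK's actual API.

```swift
import Foundation

// Energy-based speech detection: RMS energy of a PCM frame vs. a threshold.
func isSpeech(frame: [Float], threshold: Float = 0.02) -> Bool {
    guard !frame.isEmpty else { return false }
    let meanSquare = frame.reduce(0) { $0 + $1 * $1 } / Float(frame.count)
    return meanSquare.squareRoot() > threshold
}
```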
Voice Agent Pipeline
- Full VAD → STT → LLM → TTS orchestration
- Complete voice conversation flow
- Streaming and batch processing modes
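Conceptually, driving a full conversation turn could look like the sketch below; `VoiceAgent`, `conversation()`, and the turn fields are assumed names for the orchestration, not the SDK's documented types.

```swift
import RunAnywhere

// Hypothetical sketch of the VAD → STT → LLM → TTS loop; all names assumed.
func runVoiceConversation(agent: VoiceAgent) async throws {
    for try await turn in agent.conversation() {
        print("User: \(turn.transcript)")     // STT output after VAD gating
        print("Assistant: \(turn.response)")  // LLM reply, spoken via TTS
    }
}
```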
Model Management
- Automatic model discovery and catalog sync
- Download with progress tracking (download, extract, validate stages)
- In-memory model storage with file system caching
- Framework-specific model assignment
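Staged progress reporting might be consumed as in this sketch; `ModelManager`, `download(modelID:)`, and the stage cases are illustrative assumptions mirroring the download, extract, and validate stages listed above.

```swift
import RunAnywhere

// Hypothetical sketch of staged download progress; names are assumptions.
func fetchModel(id: String, manager: ModelManager) async throws {
    for try await progress in manager.download(modelID: id) {
        switch progress.stage {
        case .downloading: print("Downloading: \(Int(progress.fraction * 100))%")
        case .extracting:  print("Extracting…")
        case .validating:  print("Validating…")
        }
    }
}
```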
System Requirements
| Platform | Minimum Version |
|---|---|
| iOS | 17.0+ |
| macOS | 14.0+ |
| tvOS | 17.0+ |
| watchOS | 10.0+ |
Some optional modules have higher runtime requirements:
- Apple Foundation Models (RunAnywhereAppleAI): iOS 26+ / macOS 26+ at runtime
SDK Modules
| Module | Purpose |
|---|---|
| RunAnywhere | Core SDK (required) |
| RunAnywhereLlamaCPP | LLM text generation with GGUF models |
| RunAnywhereONNX | STT/TTS/VAD via ONNX Runtime |
| RunAnywhereAppleAI | LLM via Apple Foundation Models |
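Because backends are separate SwiftPM products, a typical manifest pulls in only the modules an app needs. The sketch below assumes a standard Swift Package Manager setup; the repository URL and version are placeholders, not confirmed coordinates.

```swift
// swift-tools-version: 5.9
import PackageDescription

// Sketch of a manifest that includes the core SDK plus one backend.
// Replace the URL and version with the SDK's published coordinates.
let package = Package(
    name: "MyApp",
    platforms: [.iOS(.v17), .macOS(.v14)],
    dependencies: [
        .package(url: "https://github.com/RunAnywhere/runanywhere-swift", from: "1.0.0")
    ],
    targets: [
        .target(
            name: "MyApp",
            dependencies: [
                .product(name: "RunAnywhere", package: "runanywhere-swift"),
                // Add backends selectively, e.g. only the LlamaCPP engine:
                .product(name: "RunAnywhereLlamaCPP", package: "runanywhere-swift")
            ]
        )
    ]
)
```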