
Overview

The RunAnywhere Flutter SDK is a production-grade, on-device AI SDK for Flutter applications. It lets developers run AI models directly on iOS and Android devices without network connectivity for inference, which keeps latency low and user data on the device. The SDK exposes language models, speech-to-text, text-to-speech, and voice activity detection through a unified interface.

Key Capabilities

  • Multi-backend architecture – Choose from LlamaCPP (GGUF models) or ONNX Runtime
  • Cross-platform – Single codebase for iOS and Android
  • Dart-native – Built with async/await and Streams for reactive programming
  • Production-ready – Built-in analytics, logging, and model lifecycle management

Core Philosophy

All AI inference runs locally, ensuring low latency and data privacy. Once models are downloaded, no network connection is required for inference.
Backend engines ship as separate packages, so you include only what you need. This keeps your app bundle size minimal.
Audio and text data never leave the device unless explicitly configured. Only anonymous analytics are collected by default.
Subscribe to SDK events for reactive UI updates and observability.
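
To illustrate the event-subscription idea, here is a minimal sketch built only on `dart:async`. The event classes (`SdkEvent`, `ModelDownloadProgress`, `ModelLoaded`) and the `describe` helper are hypothetical names invented for this example; they are not the SDK's actual event types, which this page does not list.

```dart
import 'dart:async';

// Hypothetical event types, for illustration only (not the SDK's API).
sealed class SdkEvent {}

class ModelDownloadProgress extends SdkEvent {
  ModelDownloadProgress(this.fraction);
  final double fraction; // 0.0 to 1.0
}

class ModelLoaded extends SdkEvent {
  ModelLoaded(this.modelId);
  final String modelId;
}

/// Turns an event into a status line a UI could display.
String describe(SdkEvent event) => switch (event) {
      ModelDownloadProgress(:final fraction) =>
        'Downloading: ${(fraction * 100).round()}%',
      ModelLoaded(:final modelId) => 'Loaded $modelId',
    };

Future<void> main() async {
  // A broadcast controller lets several listeners (UI, logging) subscribe.
  final events = StreamController<SdkEvent>.broadcast();
  final subscription = events.stream.map(describe).listen(print);

  events.add(ModelDownloadProgress(0.5));
  events.add(ModelLoaded('example-model'));

  await events.close();
  await subscription.cancel();
}
```

In a Flutter widget, the same stream could feed a `StreamBuilder` so the UI rebuilds as events arrive.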

Features

Language Models (LLM)

  • On-device text generation with streaming support
  • Dart Stream-based token streaming
  • System prompts and customizable generation parameters
  • Support for thinking/reasoning models
  • LlamaCPP backend for GGUF models
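
The Stream-based token streaming listed above can be sketched with plain Dart. This example does not call the SDK: `fakeTokenStream` is a hypothetical stand-in for a model, and `collect` is an assumed helper name. It only shows how a `Stream<String>` of tokens might be consumed with `await for`.

```dart
import 'dart:async';

/// Hypothetical stand-in for a model: emits space-separated tokens one
/// at a time. This is NOT the RunAnywhere API.
Stream<String> fakeTokenStream(String reply) async* {
  for (final token in reply.split(' ')) {
    // Simulate per-token generation latency.
    await Future<void>.delayed(const Duration(milliseconds: 10));
    yield token;
  }
}

/// Collects streamed tokens into the full response, calling [onToken]
/// with the partial text after each token (for example, to update a UI).
Future<String> collect(
  Stream<String> tokens, {
  void Function(String partial)? onToken,
}) async {
  final buffer = StringBuffer();
  await for (final token in tokens) {
    if (buffer.isNotEmpty) buffer.write(' ');
    buffer.write(token);
    onToken?.call(buffer.toString());
  }
  return buffer.toString();
}

Future<void> main() async {
  final text = await collect(
    fakeTokenStream('Hello from an on-device model'),
    onToken: print, // Prints the growing partial response.
  );
  print('done: $text');
}
```

Because the consumer only depends on `Stream<String>`, the fake source could be swapped for a real token stream without changing `collect`.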

Speech-to-Text (STT)

  • Real-time streaming transcription
  • Batch audio transcription
  • Multi-language support
  • Whisper-based models via ONNX Runtime

Text-to-Speech (TTS)

  • Neural voice synthesis with Piper TTS
  • System voices via platform TTS
  • Streaming audio generation for long text
  • Customizable voice, pitch, rate, and volume

Voice Activity Detection (VAD)

  • Energy-based speech detection with Silero VAD
  • Configurable sensitivity thresholds
  • Real-time audio stream processing
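
As a rough illustration of energy-based detection with a configurable threshold, the sketch below marks audio frames as speech when their RMS energy meets a threshold. It is a plain energy detector written for this page, not Silero VAD and not the SDK's implementation; the function names and the default threshold of 0.02 are assumptions.

```dart
import 'dart:math' as math;

/// Root-mean-square energy of one frame of PCM samples in [-1.0, 1.0].
double rmsEnergy(List<double> frame) {
  if (frame.isEmpty) return 0.0;
  final sumSquares = frame.fold<double>(0.0, (sum, s) => sum + s * s);
  return math.sqrt(sumSquares / frame.length);
}

/// Marks each frame as speech when its RMS energy meets [threshold].
/// A lower threshold makes the detector more sensitive.
List<bool> detectSpeech(
  List<List<double>> frames, {
  double threshold = 0.02, // Assumed default, tune per microphone.
}) =>
    [for (final frame in frames) rmsEnergy(frame) >= threshold];

void main() {
  // 160 samples is 10 ms of audio at a 16 kHz sample rate.
  final silence = List<double>.filled(160, 0.001);
  final speech = List<double>.generate(160, (i) => 0.3 * math.sin(i / 5));
  print(detectSpeech([silence, speech, silence]));
}
```

Raising `threshold` trades missed quiet speech for fewer false triggers on background noise, which is the kind of tuning a sensitivity setting exposes.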

Voice Agent Pipeline

  • Full VAD → STT → LLM → TTS orchestration
  • Complete voice conversation flow
  • Push-to-talk and hands-free modes
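
The VAD → STT → LLM → TTS flow can be sketched as four pluggable stages. Everything here is hypothetical: the typedefs, `runTurn`, and the stub stages in `main` are invented for illustration and do not reflect the SDK's voice agent API. The sketch only shows the order of the hand-offs in one conversational turn.

```dart
import 'dart:async';

// Hypothetical stage signatures; real backends would sit behind each one.
typedef Vad = bool Function(List<double> audio);
typedef Stt = Future<String> Function(List<double> audio);
typedef Llm = Stream<String> Function(String prompt);
typedef Tts = Future<List<double>> Function(String text);

/// Runs one conversational turn: VAD -> STT -> LLM -> TTS.
/// Returns synthesized audio, or null when no speech was detected.
Future<List<double>?> runTurn(
  List<double> audio, {
  required Vad vad,
  required Stt stt,
  required Llm llm,
  required Tts tts,
}) async {
  if (!vad(audio)) return null; // Skip silence without waking later stages.
  final transcript = await stt(audio);
  final reply = await llm(transcript).join(); // Gather streamed tokens.
  return await tts(reply);
}

Future<void> main() async {
  // Stub stages stand in for real models.
  final audio = await runTurn(
    [0.2, -0.3, 0.25],
    vad: (a) => a.any((s) => s.abs() > 0.1),
    stt: (a) async => 'hello',
    llm: (prompt) => Stream.fromIterable(['You said: ', prompt]),
    tts: (text) async => List<double>.filled(text.length, 0.0),
  );
  print('Synthesized ${audio?.length} samples');
}
```

In push-to-talk mode the VAD gate could be replaced by a function that always returns true, since the user marks the speech boundaries explicitly.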

System Requirements

Platform   Minimum Version
Flutter    3.10.0+
Dart       3.0.0+
iOS        14.0+
Android    API 24 (Android 7.0+)

ARM64 devices are recommended for best performance. Metal GPU acceleration on iOS and NEON SIMD instructions on Android provide significant speedups over unaccelerated CPU inference.

Package Composition

Package                Size        Purpose
runanywhere            ~5 MB       Core SDK (required)
runanywhere_llamacpp   ~15-25 MB   LLM text generation with GGUF models
runanywhere_onnx       ~50-70 MB   STT/TTS/VAD via ONNX Runtime

Architecture

Starter Example

Flutter Starter Example: a complete working example with LLM chat, STT, TTS, and Voice Agent demos.

Next Steps