Overview
This guide covers best practices for building performant, reliable, and user-friendly AI applications with the RunAnywhere SDK.
Model Selection
Choose the Right Model Size
| Model Size | RAM Required | Use Case | Speed |
|---|---|---|---|
| 360M–500M (Q8) | ~500MB | Quick responses, chat | Very Fast |
| 1B–3B (Q4/Q6) | 1–2GB | Balanced quality/speed | Fast |
| 7B (Q4) | 4–5GB | High quality | Slower |
// For chat applications - use smaller, faster models
await LlamaCPP.addModel({
id: 'smollm2-360m',
name: 'SmolLM2 360M',
url: 'https://huggingface.co/.../SmolLM2-360M.Q8_0.gguf',
memoryRequirement: 500_000_000,
})
// For quality-critical tasks - use larger models
await LlamaCPP.addModel({
id: 'qwen-1.5b',
name: 'Qwen 1.5B',
url: 'https://huggingface.co/.../qwen-1.5b-q4_k_m.gguf',
memoryRequirement: 1_500_000_000,
})
Start with smaller models during development for faster iteration. Switch to larger models when
quality becomes critical.
Quantization Trade-offs
| Quantization | Quality | Size | Speed |
|---|---|---|---|
| Q8_0 | Best | Largest | Slower |
| Q6_K | Great | Large | Fast |
| Q4_K_M | Good | Medium | Faster |
| Q4_0 | Acceptable | Small | Fastest |
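As an illustrative (not official) heuristic, the table can be turned into a small picker keyed on how much RAM the device can spare. The thresholds below are assumptions — tune them for your models and devices:

```typescript
// Hypothetical helper: map available RAM to a quantization level from the
// table above. Thresholds are illustrative, not part of the SDK.
type Quant = 'Q8_0' | 'Q6_K' | 'Q4_K_M' | 'Q4_0'

function pickQuantization(freeRamBytes: number): Quant {
  if (freeRamBytes >= 4_000_000_000) return 'Q8_0' // plenty of headroom: best quality
  if (freeRamBytes >= 2_000_000_000) return 'Q6_K'
  if (freeRamBytes >= 1_000_000_000) return 'Q4_K_M'
  return 'Q4_0' // tight memory: smallest and fastest
}
```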
Memory Management
Unload Unused Models
// Unload LLM when not in use
await RunAnywhere.unloadModel()
// Unload STT when not needed
await RunAnywhere.unloadSTTModel()
// Unload TTS when done
await RunAnywhere.unloadTTSModel()
Handle App Lifecycle
import { useEffect } from 'react'
import { AppState, AppStateStatus } from 'react-native'
import { RunAnywhere } from '@runanywhere/core'
function useModelLifecycle(modelId: string) {
useEffect(() => {
const subscription = AppState.addEventListener('change', (state: AppStateStatus) => {
if (state === 'background') {
// Free memory when app backgrounds
RunAnywhere.unloadModel()
} else if (state === 'active') {
// Optionally reload when app returns
// This depends on your UX requirements
}
})
return () => subscription.remove()
}, [modelId])
}
Check Memory Before Loading
async function safeLoadModel(modelId: string): Promise<boolean> {
const modelInfo = await RunAnywhere.getModelInfo(modelId)
const storage = await RunAnywhere.getStorageInfo()
if (!modelInfo?.memoryRequired) {
return false
}
// Check if we have enough free memory (with 20% buffer)
const requiredWithBuffer = modelInfo.memoryRequired * 1.2
if (storage.freeSpace < requiredWithBuffer) {
console.warn('Insufficient memory for model')
return false
}
await RunAnywhere.loadModel(modelInfo.localPath!)
return true
}
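Building on this, you can fall back through candidate models from most to least capable. A sketch, where `tryLoad` would wrap the `safeLoadModel` function above:

```typescript
// Hedged sketch: try each model id in order until one loads successfully.
// `tryLoad` is injected so the helper stays testable and SDK-agnostic.
async function loadFirstAvailable(
  ids: string[],
  tryLoad: (id: string) => Promise<boolean>,
): Promise<string | null> {
  for (const id of ids) {
    if (await tryLoad(id)) return id
  }
  return null // nothing fits on this device
}
```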
Use Streaming for Better UX
// ❌ Bad: User waits for entire response
const result = await RunAnywhere.generate(prompt, { maxTokens: 500 })
setResponse(result.text)
// ✅ Good: User sees response as it's generated
const stream = await RunAnywhere.generateStream(prompt, { maxTokens: 500 })
for await (const token of stream.stream) {
setResponse((prev) => prev + token)
}
Limit Token Generation
// For quick responses
const quick = await RunAnywhere.generate(prompt, {
maxTokens: 100, // Short responses
temperature: 0.5, // Faster sampling
})
// For detailed responses
const detailed = await RunAnywhere.generate(prompt, {
maxTokens: 500,
temperature: 0.7,
})
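One way to keep these presets consistent across call sites is a small helper; the names and values below simply mirror the two snippets above:

```typescript
// Centralize generation presets so every call site uses the same values.
type ResponseMode = 'quick' | 'detailed'

function generationOptions(mode: ResponseMode) {
  return mode === 'quick'
    ? { maxTokens: 100, temperature: 0.5 } // short, fast replies
    : { maxTokens: 500, temperature: 0.7 } // longer, more varied replies
}
```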
Pre-Download Models
Download models during onboarding for a better user experience:
async function downloadModels(onProgress: (percent: number) => void) {
const models = ['smollm2-360m', 'whisper-tiny-en', 'piper-en-lessac']
for (let i = 0; i < models.length; i++) {
await RunAnywhere.downloadModel(models[i], (progress) => {
const overallProgress = (i + progress.progress) / models.length
onProgress(overallProgress * 100)
})
}
}
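The overall-progress arithmetic above, factored into a pure helper for testability (`index` is the model currently downloading, `fraction` its own 0–1 progress):

```typescript
// Combine per-model progress into a single 0–100 percentage across all models.
function overallPercent(index: number, fraction: number, total: number): number {
  return ((index + fraction) / total) * 100
}
```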
Error Handling
Always Handle Errors Gracefully
async function generateSafely(prompt: string, retried = false): Promise<string> {
  try {
    const result = await RunAnywhere.generate(prompt)
    return result.text
  } catch (error) {
    if (isSDKError(error)) {
      switch (error.code) {
        case SDKErrorCode.modelNotLoaded:
          // Load the model and retry once; the flag guards against infinite recursion
          if (!retried) {
            await loadDefaultModel()
            return generateSafely(prompt, true)
          }
          return 'Sorry, I could not load the model. Please try again.'
        case SDKErrorCode.insufficientMemory:
          // Use a smaller model
          return 'I need to use a smaller model. Please try again.'
        case SDKErrorCode.generationCancelled:
          return '' // Expected, don't show error
        default:
          return 'Sorry, I encountered an error. Please try again.'
      }
    }
    return 'An unexpected error occurred.'
  }
}
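The load-and-retry pattern above generalizes to any recoverable failure. A minimal sketch of a single-retry wrapper (names are illustrative, not part of the SDK):

```typescript
// Run `fn`; on the first failure, run `recover` (e.g. load the default model)
// and try exactly once more. A second failure propagates to the caller.
async function withRetry<T>(
  fn: () => Promise<T>,
  recover: () => Promise<void>,
): Promise<T> {
  try {
    return await fn()
  } catch {
    await recover()
    return await fn()
  }
}
```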
Provide User Feedback
// Show loading states
const [status, setStatus] = useState<'idle' | 'loading' | 'generating' | 'error'>('idle')
async function generate() {
  setStatus('generating')
  try {
    const result = await RunAnywhere.generate(prompt)
    setStatus('idle')
    return result
  } catch (error) {
    setStatus('error')
    throw error
  }
}
User Experience
Show Progress During Downloads
import React, { useEffect, useState } from 'react'
import { View, Text, ActivityIndicator, StyleSheet } from 'react-native'
import { RunAnywhere, DownloadState } from '@runanywhere/core'

export function ModelDownloader({ modelId }: { modelId: string }) {
  const [progress, setProgress] = useState(0)
  const [status, setStatus] = useState<DownloadState>('queued')

  useEffect(() => {
    // Kick off the download when the component mounts
    RunAnywhere.downloadModel(modelId, (p) => {
      setProgress(p.progress * 100)
      setStatus(p.state)
    })
  }, [modelId])

  return (
    <View style={styles.container}>
      {status === 'downloading' && (
        <>
          <ActivityIndicator />
          <Text>Downloading: {progress.toFixed(0)}%</Text>
        </>
      )}
      {status === 'extracting' && (
        <>
          <ActivityIndicator />
          <Text>Setting up model...</Text>
        </>
      )}
      {status === 'completed' && <Text>✅ Ready to use!</Text>}
    </View>
  )
}

const styles = StyleSheet.create({
  container: { padding: 16, alignItems: 'center' },
})
Add Typing Indicators
import React, { useEffect, useRef } from 'react'
import { Animated, StyleSheet, View } from 'react-native'

// React Native's StyleSheet has no animationDelay property, so each dot
// pulses its opacity with the Animated API instead.
function Dot({ delay }: { delay: number }) {
  const opacity = useRef(new Animated.Value(0.3)).current

  useEffect(() => {
    const animation = Animated.sequence([
      Animated.delay(delay), // stagger the dots
      Animated.loop(
        Animated.sequence([
          Animated.timing(opacity, { toValue: 1, duration: 300, useNativeDriver: true }),
          Animated.timing(opacity, { toValue: 0.3, duration: 300, useNativeDriver: true }),
        ]),
      ),
    ])
    animation.start()
    return () => animation.stop()
  }, [delay, opacity])

  return <Animated.Text style={[styles.dot, { opacity }]}>•</Animated.Text>
}

export function TypingIndicator() {
  return (
    <View style={styles.container}>
      <Dot delay={0} />
      <Dot delay={200} />
      <Dot delay={400} />
    </View>
  )
}

const styles = StyleSheet.create({
  container: { flexDirection: 'row', padding: 8 },
  dot: { fontSize: 24, marginHorizontal: 2 },
})
Graceful Degradation
async function getAIResponse(prompt: string): Promise<string> {
// Try on-device first
if (await RunAnywhere.isModelLoaded()) {
try {
return (await RunAnywhere.generate(prompt)).text
} catch {
// Fall through to fallback
}
}
// Fallback to simpler response
return "I'm still setting up. Please try again in a moment."
}
Security & Privacy
Never Log Sensitive Data
// ❌ Bad: Logging user prompts
console.log('User prompt:', prompt) // leaks user content to device logs
logger.debug('Generating response for:', prompt)
// ✅ Good: Log only metadata
logger.debug('Generating response', { promptLength: prompt.length })
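A tiny sanitizer helper makes it hard to leak prompt text by accident — log calls take the metadata object, never the prompt itself (a sketch; extend the fields as needed):

```typescript
// Derive safe-to-log metadata from a prompt without including its content.
function promptMetadata(prompt: string) {
  return {
    promptLength: prompt.length,
    wordCount: prompt.trim().split(/\s+/).filter(Boolean).length,
  }
}
```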
Use Development Mode Appropriately
// Development: Full logging, no auth
await RunAnywhere.initialize({
environment: SDKEnvironment.Development,
})
// Production: Minimal logging, with auth
await RunAnywhere.initialize({
environment: SDKEnvironment.Production,
apiKey: process.env.RUNANYWHERE_API_KEY,
})
Disable Telemetry If Needed
// Telemetry is only in Production mode
// Use Development mode to disable all telemetry
await RunAnywhere.initialize({
environment: SDKEnvironment.Development, // No telemetry
})
Testing
Use Smaller Models for Testing
// In tests, use the smallest available model
const testModel = 'smollm2-360m' // Fast for CI/CD
beforeAll(async () => {
await RunAnywhere.initialize({ environment: SDKEnvironment.Development })
LlamaCPP.register()
// Download and load small model for tests
})
Mock for Unit Tests
// Create a mock for unit tests
jest.mock('@runanywhere/core', () => ({
RunAnywhere: {
generate: jest.fn().mockResolvedValue({
text: 'Mock response',
tokensUsed: 10,
latencyMs: 100,
}),
chat: jest.fn().mockResolvedValue('Mock response'),
},
}))
iOS: react-native-screens Crashes
On iOS with New Architecture in RN 0.83+, react-native-screens crashes with errors like -[RCTView setColor:]. Use @react-navigation/stack (JS-based) instead of @react-navigation/native-stack, and mock react-native-screens in your Metro config:
resolveRequest: (context, moduleName, platform) => {
if (platform === 'ios' && moduleName === 'react-native-screens') {
return { filePath: './src/react-native-screens-mock.js', type: 'sourceFile' }
}
return context.resolveRequest(context, moduleName, platform)
}
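For context, here is a sketch of where that override lives in a full metro.config.js, assuming the standard @react-native/metro-config setup; `require.resolve` is used so Metro gets an absolute path to the mock:

```javascript
// metro.config.js — hedged sketch; merge with your existing config
const { getDefaultConfig, mergeConfig } = require('@react-native/metro-config')

const config = {
  resolver: {
    resolveRequest: (context, moduleName, platform) => {
      // Swap react-native-screens for a local mock on iOS only
      if (platform === 'ios' && moduleName === 'react-native-screens') {
        return {
          filePath: require.resolve('./src/react-native-screens-mock.js'),
          type: 'sourceFile',
        }
      }
      return context.resolveRequest(context, moduleName, platform)
    },
  },
}

module.exports = mergeConfig(getDefaultConfig(__dirname), config)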
iOS vs Android New Architecture
New Architecture should be enabled on Android (newArchEnabled=true in gradle.properties) but disabled on iOS (set new_arch_enabled: false in Podfile). This avoids crashes with several native modules on iOS.
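Concretely, the two settings look like this. The exact Podfile keyword varies across React Native template versions, so treat the Ruby snippet as a sketch to adapt:

```properties
# android/gradle.properties
newArchEnabled=true
```

```ruby
# ios/Podfile — inside the target block (keyword may differ by RN version)
use_react_native!(
  :path => config[:reactNativePath],
  :new_arch_enabled => false,
)
```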
Custom Native Audio Module
Third-party audio recording libraries like react-native-audio-recorder-player crash on iOS New Architecture. Build a custom native audio module using AVAudioRecorder (iOS/Swift) and AudioRecord (Android/Kotlin) that returns base64-encoded WAV data directly.
Audio is passed as base64-encoded strings between JavaScript and native code. RunAnywhere.transcribe() accepts base64-encoded WAV audio input. RunAnywhere.synthesize() returns base64-encoded float32 PCM audio. Use RunAnywhere.Audio.createWavFromPCMFloat32() to convert to a playable WAV file.
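As an illustration of the PCM format, this sketch decodes the base64 float32 PCM returned by synthesize() into a Float32Array for inspection (e.g. duration = samples.length / sampleRate). It uses Node's Buffer; in React Native you would need a base64 polyfill, or simply hand the string to RunAnywhere.Audio.createWavFromPCMFloat32() as described above:

```typescript
// Decode base64-encoded float32 PCM into samples. The copy into a fresh
// buffer guarantees the 4-byte alignment Float32Array requires; throws if
// the payload length is not a multiple of 4.
function decodePcmFloat32(base64: string): Float32Array {
  const bytes = Buffer.from(base64, 'base64')
  const copy = new Uint8Array(bytes)
  return new Float32Array(copy.buffer)
}
```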
NitroModules Manual Setup
react-native-nitro-modules requires manual configuration on both platforms:
- iOS: Add the NitroModules pod manually in Podfile
- Android: Include it as a Gradle project in settings.gradle
Duplicate Native Libraries (Android)
Add pickFirsts for libc++_shared.so, libjsi.so, libfbjni.so, libfolly_runtime.so, and libreactnative.so in your app’s build.gradle packagingOptions to resolve conflicts.
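A sketch of what that looks like in android/app/build.gradle; glob patterns are one common way to write the rules:

```groovy
// Keep the first copy of each shared library and drop the duplicates.
android {
  packagingOptions {
    pickFirst '**/libc++_shared.so'
    pickFirst '**/libjsi.so'
    pickFirst '**/libfbjni.so'
    pickFirst '**/libfolly_runtime.so'
    pickFirst '**/libreactnative.so'
  }
}
```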
Node.js Path Fix (Android Studio)
Android Studio doesn’t inherit terminal PATH. Add explicit Node.js binary search paths in your root build.gradle to fix node not found errors during Gradle builds.
Summary Checklist
- Choose appropriate model size for your use case
- Use streaming for better perceived performance
- Unload models when not in use
- Handle all error cases gracefully
- Show progress during downloads and loading
- Pre-download models during onboarding
- Test on actual target devices
- Use Development mode to iterate quickly
- Never log sensitive user data
- Provide clear feedback for all operations