Generate images on-device with Stable Diffusion and CoreML
The Diffusion module enables on-device image generation using Apple’s ml-stable-diffusion framework. Models run as CoreML packages, leveraging the Apple Neural Engine for hardware-accelerated inference.
- CoreML backend — Uses Apple’s ml-stable-diffusion for optimized on-device inference
- Progress callbacks — Step-by-step progress updates during generation
- Cancellation — Interrupt generation at any step
- Safety checker — Optional NSFW content filtering
First-time model loading triggers CoreML compilation, which can take 5–15 minutes depending on
the device. Subsequent loads use the compiled cache and are significantly faster.
```swift
public struct DiffusionGenerationOptions {
    public let prompt: String        // Text description of the desired image
    public let width: UInt32         // Image width in pixels (default: 512)
    public let height: UInt32        // Image height in pixels (default: 512)
    public let steps: Int            // Number of denoising steps (default: 20)
    public let guidanceScale: Double // Prompt adherence strength (default: 7.5)

    public init(
        prompt: String,
        width: UInt32 = 512,
        height: UInt32 = 512,
        steps: Int = 20,
        guidanceScale: Double = 7.5
    )
}
```
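A usage sketch: the struct below mirrors the declaration above (the memberwise initializer body, not shown there, is filled in as the obvious assignment) so the example compiles standalone.

```swift
// Mirror of DiffusionGenerationOptions from the SDK; the initializer
// body is assumed to be plain memberwise assignment.
struct DiffusionGenerationOptions {
    let prompt: String
    let width: UInt32
    let height: UInt32
    let steps: Int
    let guidanceScale: Double

    init(prompt: String,
         width: UInt32 = 512,
         height: UInt32 = 512,
         steps: Int = 20,
         guidanceScale: Double = 7.5) {
        self.prompt = prompt
        self.width = width
        self.height = height
        self.steps = steps
        self.guidanceScale = guidanceScale
    }
}

// Defaults: 512×512, 20 steps, guidance 7.5
let defaults = DiffusionGenerationOptions(prompt: "a watercolor fox in a misty forest")

// Override only what you need, e.g. a slower, higher-quality pass
let highQuality = DiffusionGenerationOptions(
    prompt: "a watercolor fox in a misty forest",
    steps: 30,
    guidanceScale: 8.5
)
```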
| Parameter | Range | Guidance |
|---|---|---|
| `steps` | 10–50 | More steps = higher quality but slower. 20 is a good default. |
| `guidanceScale` | 1.0–20.0 | Higher = more prompt adherence. 7.0–8.5 works well for most prompts. |
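To keep user-supplied values inside these ranges, a small clamping helper is enough. The function name below is hypothetical, not part of the SDK:

```swift
// Hypothetical helper: clamp steps and guidanceScale into the
// recommended ranges from the table (steps 10–50, guidance 1.0–20.0).
func clampedDiffusionParameters(
    steps: Int,
    guidanceScale: Double
) -> (steps: Int, guidanceScale: Double) {
    (steps: min(max(steps, 10), 50),
     guidanceScale: min(max(guidanceScale, 1.0), 20.0))
}
```

For example, `clampedDiffusionParameters(steps: 5, guidanceScale: 25)` yields `(10, 20.0)`.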
```swift
do {
    let result = try await RunAnywhere.generateImage(
        prompt: prompt,
        options: options
    ) { update in
        // Return false to cancel generation at the current step
        return !isCancelled
    }
    guard let image = UIImage(data: result.imageData) else {
        print("Failed to decode generated image data")
        return
    }
    // Use `image` ...
} catch let error as SDKError {
    switch error.code {
    case .notInitialized:
        print("Load a diffusion model before generating images")
    case .modelNotFound:
        print("Diffusion model not found — download it first")
    case .cancelled:
        print("Image generation was cancelled")
    case .processingFailed:
        print("Generation failed: \(error.message)")
    case .outOfMemory:
        print("Not enough memory — try reduceMemory: true")
    default:
        print("Diffusion error: \(error)")
    }
}
```
The first time a CoreML diffusion model loads, it compiles the model for the target device’s Neural Engine. This takes 5–15 minutes. Show a clear progress indicator and explain the wait to users. Subsequent loads use the compiled cache and are fast.
Enable reduceMemory on constrained devices
Stable Diffusion models require ~2GB of RAM. Set reduceMemory: true in DiffusionConfiguration
to lower peak memory usage at the cost of some speed. This prevents OOM crashes on older devices.
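A configuration sketch — only `reduceMemory` is documented above; check the `DiffusionConfiguration` initializer in your SDK version for the full parameter list:

```swift
// Sketch of a memory-constrained setup; any fields beyond reduceMemory
// are assumptions about the DiffusionConfiguration initializer.
let config = DiffusionConfiguration(
    reduceMemory: true  // lower peak RAM (~2GB baseline) at some speed cost
)
```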
Match resolution to model variant
Always use the native resolution for your model variant: 512×512 for SD 1.5/2.0 and 1024×1024 for
SDXL. Non-native resolutions produce distorted or low-quality results.
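One way to avoid hard-coding sizes is to derive them from the model variant. The enum below is a hypothetical helper, not an SDK type:

```swift
// Hypothetical helper mapping a model variant to its native resolution.
enum DiffusionModelVariant {
    case sd15, sd20, sdxl

    /// Native (width, height) in pixels; non-native sizes degrade output.
    var nativeResolution: (width: UInt32, height: UInt32) {
        switch self {
        case .sd15, .sd20: return (512, 512)
        case .sdxl:        return (1024, 1024)
        }
    }
}
```

Pass `variant.nativeResolution` into your generation options instead of literal sizes, so switching model variants cannot silently produce distorted output.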
Provide a cancel mechanism
Image generation can take 30–120 seconds on mobile. Give users a way to abort without waiting: return false from the progress handler or call cancelImageGeneration().
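One sketch of the progress-handler approach, using a small thread-safe flag (the `CancellationFlag` type below is hypothetical, not part of the SDK):

```swift
import Foundation

// Hypothetical thread-safe cancellation flag for the progress handler.
final class CancellationFlag {
    private let lock = NSLock()
    private var cancelled = false

    func cancel() {
        lock.lock(); defer { lock.unlock() }
        cancelled = true
    }

    var isCancelled: Bool {
        lock.lock(); defer { lock.unlock() }
        return cancelled
    }
}

let flag = CancellationFlag()
// Wire a Cancel button to flag.cancel(), then pass
// { _ in !flag.isCancelled } as the progress handler: generation
// stops at the next denoising step once cancel() has been called.
```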
Use 20–25 steps for most prompts
Below 15 steps, images are noticeably noisy. Above 30 steps, quality improvement plateaus. 20–25
steps gives the best quality-to-latency ratio on Apple Silicon.
Keep guidance scale between 7 and 9
A guidance scale of 7.0–8.5 produces coherent images that follow the prompt. Values above 12 often produce oversaturated, artifact-heavy results.