Generate images on-device with Stable Diffusion and CoreML
The Diffusion module enables on-device image generation using Apple’s ml-stable-diffusion framework. Models run as CoreML packages, leveraging the Apple Neural Engine for hardware-accelerated inference.
- CoreML backend — Uses Apple’s ml-stable-diffusion for optimized on-device inference
- Progress callbacks — Step-by-step progress updates during generation
- Cancellation — Interrupt generation at any step
- Safety checker — Optional NSFW content filtering
First-time model loading triggers CoreML compilation, which can take 5–15 minutes depending on
the device. Subsequent loads use the compiled cache and are significantly faster.
```swift
public struct DiffusionGenerationOptions {
    public let prompt: String        // Text description of the desired image
    public let width: UInt32         // Image width in pixels (default: 512)
    public let height: UInt32        // Image height in pixels (default: 512)
    public let steps: Int            // Number of denoising steps (default: 20)
    public let guidanceScale: Double // Prompt adherence strength (default: 7.5)

    public init(
        prompt: String,
        width: UInt32 = 512,
        height: UInt32 = 512,
        steps: Int = 20,
        guidanceScale: Double = 7.5
    )
}
```
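A usage sketch: the struct below mirrors the declaration above (the memberwise initializer body, not shown there, is filled in as the obvious assignment) so the example compiles standalone.

```swift
// Mirror of DiffusionGenerationOptions from the SDK; the initializer
// body is assumed to be plain memberwise assignment.
struct DiffusionGenerationOptions {
    let prompt: String
    let width: UInt32
    let height: UInt32
    let steps: Int
    let guidanceScale: Double

    init(prompt: String,
         width: UInt32 = 512,
         height: UInt32 = 512,
         steps: Int = 20,
         guidanceScale: Double = 7.5) {
        self.prompt = prompt
        self.width = width
        self.height = height
        self.steps = steps
        self.guidanceScale = guidanceScale
    }
}

// Defaults: 512×512, 20 steps, guidance 7.5
let defaults = DiffusionGenerationOptions(prompt: "a watercolor fox in a misty forest")

// Override only what you need, e.g. a slower, higher-quality pass
let highQuality = DiffusionGenerationOptions(
    prompt: "a watercolor fox in a misty forest",
    steps: 30,
    guidanceScale: 8.5
)
```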
| Parameter | Range | Guidance |
|---|---|---|
| `steps` | 10–50 | More steps = higher quality but slower. 20 is a good default. |
| `guidanceScale` | 1.0–20.0 | Higher = more prompt adherence. 7.0–8.5 works well for most prompts. |
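To keep user-supplied values inside these ranges, a small clamping helper is enough. The function name below is hypothetical, not part of the SDK:

```swift
// Hypothetical helper: clamp steps and guidanceScale into the
// recommended ranges from the table (steps 10–50, guidance 1.0–20.0).
func clampedDiffusionParameters(
    steps: Int,
    guidanceScale: Double
) -> (steps: Int, guidanceScale: Double) {
    (steps: min(max(steps, 10), 50),
     guidanceScale: min(max(guidanceScale, 1.0), 20.0))
}
```

For example, `clampedDiffusionParameters(steps: 5, guidanceScale: 25)` yields `(10, 20.0)`.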
```swift
do {
    let result = try await RunAnywhere.generateImage(
        prompt: prompt,
        options: options
    ) { update in
        // Return false to cancel generation at the current step
        return !isCancelled
    }
    guard let image = UIImage(data: result.imageData) else {
        print("Failed to decode generated image data")
        return
    }
    // Use `image` ...
} catch let error as SDKError {
    switch error.code {
    case .notInitialized:
        print("Load a diffusion model before generating images")
    case .modelNotFound:
        print("Diffusion model not found — download it first")
    case .cancelled:
        print("Image generation was cancelled")
    case .processingFailed:
        print("Generation failed: \(error.message)")
    case .outOfMemory:
        print("Not enough memory — try reduceMemory: true")
    default:
        print("Diffusion error: \(error)")
    }
}
```
The first time a CoreML diffusion model loads, it compiles the model for the target device’s Neural Engine. This takes 5–15 minutes. Show a clear progress indicator and explain the wait to users. Subsequent loads use the compiled cache and are fast.
Enable reduceMemory on constrained devices
Stable Diffusion models require ~2GB of RAM. Set reduceMemory: true in DiffusionConfiguration
to lower peak memory usage at the cost of some speed. This prevents OOM crashes on older devices.
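A configuration sketch — only `reduceMemory` is documented above; check the `DiffusionConfiguration` initializer in your SDK version for the full parameter list:

```swift
// Sketch of a memory-constrained setup; any fields beyond reduceMemory
// are assumptions about the DiffusionConfiguration initializer.
let config = DiffusionConfiguration(
    reduceMemory: true  // lower peak RAM (~2GB baseline) at some speed cost
)
```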
Match resolution to model variant
Always use the native resolution for your model variant: 512×512 for SD 1.5/2.0 and 1024×1024 for
SDXL. Non-native resolutions produce distorted or low-quality results.
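One way to avoid hard-coding sizes is to derive them from the model variant. The enum below is a hypothetical helper, not an SDK type:

```swift
// Hypothetical helper mapping a model variant to its native resolution.
enum DiffusionModelVariant {
    case sd15, sd20, sdxl

    /// Native (width, height) in pixels; non-native sizes degrade output.
    var nativeResolution: (width: UInt32, height: UInt32) {
        switch self {
        case .sd15, .sd20: return (512, 512)
        case .sdxl:        return (1024, 1024)
        }
    }
}
```

Pass `variant.nativeResolution` into your generation options instead of literal sizes, so switching model variants cannot silently produce distorted output.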
Provide a cancel mechanism
Image generation can take 30–120 seconds on mobile. Give users a way to abort without waiting: return false from the progress handler or call cancelImageGeneration().
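One sketch of the progress-handler approach, using a small thread-safe flag (the `CancellationFlag` type below is hypothetical, not part of the SDK):

```swift
import Foundation

// Hypothetical thread-safe cancellation flag for the progress handler.
final class CancellationFlag {
    private let lock = NSLock()
    private var cancelled = false

    func cancel() {
        lock.lock(); defer { lock.unlock() }
        cancelled = true
    }

    var isCancelled: Bool {
        lock.lock(); defer { lock.unlock() }
        return cancelled
    }
}

let flag = CancellationFlag()
// Wire a Cancel button to flag.cancel(), then pass
// { _ in !flag.isCancelled } as the progress handler: generation
// stops at the next denoising step once cancel() has been called.
```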
Use 20–25 steps for most prompts
Below 15 steps, images are noticeably noisy. Above 30 steps, quality improvement plateaus. 20–25
steps gives the best quality-to-latency ratio on Apple Silicon.
Keep guidance scale between 7 and 9
A guidance scale of 7.0–8.5 produces coherent images that follow the prompt. Values above 12 often produce oversaturated, artifact-heavy results.