The Diffusion module enables on-device image generation using Apple’s ml-stable-diffusion framework. Models run as CoreML packages, leveraging the Apple Neural Engine for hardware-accelerated inference.

Overview

The image generation pipeline:
  • CoreML backend — Uses Apple’s ml-stable-diffusion for optimized on-device inference
  • Progress callbacks — Step-by-step progress updates during generation
  • Cancellation — Interrupt generation at any step
  • Safety checker — Optional NSFW content filtering
First-time model loading triggers CoreML compilation, which can take 5–15 minutes depending on the device. Subsequent loads use the compiled cache and are significantly faster.

Basic Usage

import RunAnywhere

// 1. Load diffusion model
let models = try await RunAnywhere.availableModels()
let model = models.first(where: { $0.id == "sd15-coreml-palettized" })!
try await RunAnywhere.loadDiffusionModel(
    modelPath: model.localPath!.path,
    modelId: "sd15-coreml-palettized",
    modelName: "Stable Diffusion 1.5",
    configuration: DiffusionConfiguration(modelVariant: .sd15)
)

// 2. Generate an image
let options = DiffusionGenerationOptions(
    prompt: "A beautiful sunset over mountains",
    width: 512,
    height: 512,
    steps: 20,
    guidanceScale: 7.5
)

let result = try await RunAnywhere.generateImage(
    prompt: "A beautiful sunset over mountains",
    options: options
) { update in
    print("Step \(update.currentStep)/\(update.totalSteps) (\(Int(update.progress * 100))%)")
    return true  // return true to continue, false to cancel
}

// 3. Use the result
let imageData = result.imageData  // Raw image Data
print("Generated in \(result.generationTimeMs)ms")
print("Size: \(result.width)x\(result.height)")
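The imageData field holds raw encoded bytes. As a minimal illustration of persisting them (the saveGeneratedImage helper below is not an SDK API, just a sketch using Foundation):

```swift
import Foundation

// Hypothetical helper (not part of the SDK): write generated image bytes,
// e.g. result.imageData, to the temporary directory and return the file URL.
// Swap in your caches directory if you want the file to persist longer.
func saveGeneratedImage(_ imageData: Data, named name: String) throws -> URL {
    let dir = FileManager.default.temporaryDirectory
    let fileURL = dir.appendingPathComponent(name).appendingPathExtension("png")
    try imageData.write(to: fileURL, options: .atomic)
    return fileURL
}
```

On iOS you would more often decode the bytes directly, e.g. `UIImage(data: result.imageData)`, as the SwiftUI example later in this page does.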

Setup

Register a Diffusion Model

Diffusion models use CoreML and are distributed as .zip archives. Use registerModel with the .coreml framework:
import RunAnywhere

RunAnywhere.registerModel(
    id: "sd15-coreml-palettized",
    name: "Stable Diffusion 1.5",
    url: URL(string: "https://huggingface.co/apple/coreml-stable-diffusion-v1-5-palettized/resolve/main/coreml-stable-diffusion-v1-5-palettized_original_compiled.zip")!,
    framework: .coreml,
    modality: .imageGeneration,
    artifactType: .archive(.zip, structure: .nestedDirectory),
    memoryRequirement: 2_000_000_000
)
Use .coreml framework and .imageGeneration modality — not .llamaCpp. The model is a .zip archive containing compiled CoreML model packages.
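memoryRequirement is specified in bytes. A quick way to sanity-check the value you register (illustrative only, not an SDK call) is Foundation's ByteCountFormatter:

```swift
import Foundation

// Illustrative: render a byte count like the memoryRequirement above
// in human-readable form for logs or a pre-download confirmation UI.
func formatMemoryRequirement(_ bytes: Int64) -> String {
    ByteCountFormatter.string(fromByteCount: bytes, countStyle: .memory)
}

let label = formatMemoryRequirement(2_000_000_000)
```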

Download the Model

let models = try await RunAnywhere.availableModels()
let model = models.first(where: { $0.id == "sd15-coreml-palettized" })!

// Download if not already cached
if model.localPath == nil {
    try await RunAnywhere.downloadModel(model.id) { progress in
        print("Download: \(Int(progress * 100))%")
    }
}

Configure and Load

let config = DiffusionConfiguration(
    modelVariant: .sd15,
    enableSafetyChecker: true,
    reduceMemory: true
)

try await RunAnywhere.loadDiffusionModel(
    modelPath: model.localPath!.path,
    modelId: "sd15-coreml-palettized",
    modelName: "Stable Diffusion 1.5",
    configuration: config
)

if await RunAnywhere.isDiffusionModelLoaded {
    print("Diffusion model ready")
}
Set reduceMemory: true on devices with limited RAM. This trades some speed for significantly lower peak memory usage during generation.
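Whether to set reduceMemory can be decided at runtime from device RAM. One possible heuristic (the 6 GiB threshold is an assumption for illustration, not an SDK recommendation) keys off ProcessInfo:

```swift
import Foundation

// Illustrative heuristic: enable reduced-memory mode on devices with less
// than ~6 GiB of RAM, where Stable Diffusion's ~2 GB peak leaves little headroom.
func shouldReduceMemory(
    physicalMemory: UInt64 = ProcessInfo.processInfo.physicalMemory
) -> Bool {
    physicalMemory < 6 * 1_073_741_824  // 6 GiB in bytes
}
```

You would then build the configuration as `DiffusionConfiguration(modelVariant: .sd15, enableSafetyChecker: true, reduceMemory: shouldReduceMemory())`.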

API Reference

DiffusionConfiguration

public struct DiffusionConfiguration {
    public let modelVariant: DiffusionModelVariant  // .sd15, .sd20, .sdxl
    public let enableSafetyChecker: Bool            // NSFW filter (default: true)
    public let reduceMemory: Bool                   // Lower memory mode (default: false)

    public init(
        modelVariant: DiffusionModelVariant = .sd15,
        enableSafetyChecker: Bool = true,
        reduceMemory: Bool = false
    )
}

DiffusionModelVariant

| Variant | Description | Recommended Resolution |
|---------|-------------|------------------------|
| .sd15 | Stable Diffusion 1.5 | 512×512 |
| .sd20 | Stable Diffusion 2.0 | 512×512 or 768×768 |
| .sdxl | Stable Diffusion XL | 1024×1024 |
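If you derive generation resolution from the variant rather than hard-coding it, a small lookup works well. The sketch below uses a local enum standing in for DiffusionModelVariant, since the exact SDK declaration may differ:

```swift
// Local stand-in for DiffusionModelVariant, for illustration only.
enum Variant {
    case sd15, sd20, sdxl
}

// Native resolutions from the table above; non-native sizes degrade output.
func nativeResolution(for variant: Variant) -> (width: UInt32, height: UInt32) {
    switch variant {
    case .sd15: return (512, 512)
    case .sd20: return (768, 768)   // 512×512 is also supported
    case .sdxl: return (1024, 1024)
    }
}
```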

DiffusionGenerationOptions

public struct DiffusionGenerationOptions {
    public let prompt: String          // Text description of the desired image
    public let width: UInt32           // Image width in pixels (default: 512)
    public let height: UInt32          // Image height in pixels (default: 512)
    public let steps: Int              // Number of denoising steps (default: 20)
    public let guidanceScale: Double   // Prompt adherence strength (default: 7.5)

    public init(
        prompt: String,
        width: UInt32 = 512,
        height: UInt32 = 512,
        steps: Int = 20,
        guidanceScale: Double = 7.5
    )
}
| Parameter | Range | Guidance |
|-----------|-------|----------|
| steps | 10–50 | More steps = higher quality but slower. 20 is a good default. |
| guidanceScale | 1.0–20.0 | Higher = more prompt adherence. 7.0–8.5 works well for most prompts. |
| width / height | Must match model variant | SD 1.5 → 512×512, SDXL → 1024×1024 |
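The SDK snippets above pass user input straight into DiffusionGenerationOptions, so a caller-side clamp to the documented ranges is a cheap safeguard. These helpers are hypothetical, not SDK APIs:

```swift
// Clamp user-supplied values to the ranges documented above before
// building DiffusionGenerationOptions. Illustrative helpers only.
func clampedSteps(_ steps: Int) -> Int {
    min(max(steps, 10), 50)
}

func clampedGuidanceScale(_ scale: Double) -> Double {
    min(max(scale, 1.0), 20.0)
}
```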

Model Operations

| Method | Description |
|--------|-------------|
| RunAnywhere.loadDiffusionModel(modelPath:modelId:modelName:configuration:) async throws | Load and compile a diffusion model |
| RunAnywhere.isDiffusionModelLoaded: Bool (async) | Whether a diffusion model is currently loaded |
| RunAnywhere.cancelImageGeneration() async throws | Cancel any in-progress image generation |

generateImage

func generateImage(
    prompt: String,
    options: DiffusionGenerationOptions,
    progressHandler: @escaping (DiffusionProgressUpdate) -> Bool
) async throws -> DiffusionResult
| Parameter | Type | Description |
|-----------|------|-------------|
| prompt | String | Text prompt for generation |
| options | DiffusionGenerationOptions | Generation configuration |
| progressHandler | (DiffusionProgressUpdate) -> Bool | Called each step. Return true to continue, false to cancel. |
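The progress handler may run off the main thread (the SwiftUI example later on this page hops to the main actor for exactly this reason), so wiring a cancel button to it needs a thread-safe flag. A minimal sketch; this type is not part of the SDK:

```swift
import Foundation

// Thread-safe cancellation flag: the UI calls cancel(), and the progress
// handler returns !flag.isCancelled to stop generation at the next step.
final class CancellationFlag: @unchecked Sendable {
    private let lock = NSLock()
    private var cancelled = false

    var isCancelled: Bool {
        lock.lock()
        defer { lock.unlock() }
        return cancelled
    }

    func cancel() {
        lock.lock()
        defer { lock.unlock() }
        cancelled = true
    }
}
```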

DiffusionProgressUpdate

| Property | Type | Description |
|----------|------|-------------|
| progress | Double | Overall progress from 0.0 to 1.0 |
| currentStep | Int | Current denoising step |
| totalSteps | Int | Total steps to complete |
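The progress value can also drive a remaining-time estimate. This simple extrapolation (illustrative; it assumes roughly constant per-step cost) scales elapsed time by the remaining fraction:

```swift
import Foundation

// Estimate remaining seconds by extrapolating elapsed time over progress.
// Returns nil until the first progress update arrives.
func estimatedSecondsRemaining(elapsed: TimeInterval, progress: Double) -> TimeInterval? {
    guard progress > 0, progress <= 1 else { return nil }
    return elapsed * (1 - progress) / progress
}
```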

DiffusionResult

| Property | Type | Description |
|----------|------|-------------|
| imageData | Data | Raw image data (PNG/JPEG) |
| width | UInt32 | Image width in pixels |
| height | UInt32 | Image height in pixels |
| generationTimeMs | UInt64 | Total generation time in milliseconds |
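generationTimeMs pairs naturally with the step count when tuning the steps parameter. A hypothetical helper (not an SDK API) for a per-step latency metric:

```swift
// Average denoising-step latency, from a result's generationTimeMs and the
// `steps` value used in DiffusionGenerationOptions. Illustrative only.
func millisecondsPerStep(generationTimeMs: UInt64, steps: Int) -> Double {
    Double(generationTimeMs) / Double(max(steps, 1))
}
```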

Examples

Complete SwiftUI Image Generator

import SwiftUI
import RunAnywhere

@MainActor @Observable
class ImageGenViewModel {
    var prompt = ""
    var generatedImage: UIImage?
    var isGenerating = false
    var isModelLoaded = false
    var progress: Double = 0
    var currentStep = 0
    var totalSteps = 0
    var statusText = "Loading model..."
    var generationTime: UInt64 = 0

    var steps: Int = 20
    var guidanceScale: Double = 7.5

    func loadModel() async {
        do {
            let models = try await RunAnywhere.availableModels()
            guard let model = models.first(where: { $0.id == "sd15-coreml-palettized" }) else {
                statusText = "Model not registered"
                return
            }

            if model.localPath == nil {
                statusText = "Downloading model..."
                try await RunAnywhere.downloadModel(model.id) { progress in
                    Task { @MainActor in
                        self.statusText = "Downloading: \(Int(progress * 100))%"
                    }
                }
            }

            statusText = "Compiling model (first time may take several minutes)..."

            let config = DiffusionConfiguration(
                modelVariant: .sd15,
                enableSafetyChecker: true,
                reduceMemory: true
            )

            try await RunAnywhere.loadDiffusionModel(
                modelPath: model.localPath!.path,
                modelId: "sd15-coreml-palettized",
                modelName: "Stable Diffusion 1.5",
                configuration: config
            )

            isModelLoaded = await RunAnywhere.isDiffusionModelLoaded
            statusText = isModelLoaded ? "Ready" : "Failed to load"
        } catch {
            statusText = "Error: \(error.localizedDescription)"
        }
    }

    func generate() async {
        guard !prompt.isEmpty else { return }

        isGenerating = true
        progress = 0
        generatedImage = nil

        do {
            let options = DiffusionGenerationOptions(
                prompt: prompt,
                width: 512,
                height: 512,
                steps: steps,
                guidanceScale: guidanceScale
            )

            let result = try await RunAnywhere.generateImage(
                prompt: prompt,
                options: options
            ) { [weak self] update in
                Task { @MainActor in
                    self?.progress = update.progress
                    self?.currentStep = update.currentStep
                    self?.totalSteps = update.totalSteps
                }
                return true
            }

            generationTime = result.generationTimeMs
            generatedImage = UIImage(data: result.imageData)
        } catch {
            statusText = "Generation failed: \(error.localizedDescription)"
        }

        isGenerating = false
    }

    func cancel() async {
        try? await RunAnywhere.cancelImageGeneration()
        isGenerating = false
    }
}

struct ImageGenView: View {
    @State private var viewModel = ImageGenViewModel()

    var body: some View {
        NavigationStack {
            ScrollView {
                VStack(spacing: 20) {
                    // Status indicator
                    HStack {
                        Circle()
                            .fill(viewModel.isModelLoaded ? .green : .orange)
                            .frame(width: 8, height: 8)
                        Text(viewModel.statusText)
                            .font(.caption)
                            .foregroundStyle(.secondary)
                        Spacer()
                    }

                    // Generated image display
                    if let image = viewModel.generatedImage {
                        Image(uiImage: image)
                            .resizable()
                            .scaledToFit()
                            .clipShape(RoundedRectangle(cornerRadius: 12))
                            .shadow(radius: 4)

                        Text("Generated in \(viewModel.generationTime)ms")
                            .font(.caption)
                            .foregroundStyle(.secondary)
                    } else if viewModel.isGenerating {
                        VStack(spacing: 12) {
                            ProgressView(value: viewModel.progress)
                                .progressViewStyle(.linear)
                            Text("Step \(viewModel.currentStep)/\(viewModel.totalSteps)")
                                .font(.caption.monospaced())
                                .foregroundStyle(.secondary)
                        }
                        .padding()
                        .background(.ultraThinMaterial)
                        .clipShape(RoundedRectangle(cornerRadius: 8))
                    } else {
                        RoundedRectangle(cornerRadius: 12)
                            .fill(.quaternary)
                            .frame(height: 300)
                            .overlay {
                                VStack(spacing: 8) {
                                    Image(systemName: "photo.artframe")
                                        .font(.largeTitle)
                                    Text("Enter a prompt to generate")
                                }
                                .foregroundStyle(.secondary)
                            }
                    }

                    // Prompt input
                    TextField("Describe the image you want...", text: $viewModel.prompt, axis: .vertical)
                        .textFieldStyle(.roundedBorder)
                        .lineLimit(2...5)

                    // Generation settings
                    VStack(alignment: .leading, spacing: 8) {
                        HStack {
                            Text("Steps: \(viewModel.steps)")
                                .font(.caption)
                            Slider(value: Binding(
                                get: { Double(viewModel.steps) },
                                set: { viewModel.steps = Int($0) }
                            ), in: 10...50, step: 5)
                        }
                        HStack {
                            Text("Guidance: \(viewModel.guidanceScale, specifier: "%.1f")")
                                .font(.caption)
                            Slider(value: $viewModel.guidanceScale, in: 1...20, step: 0.5)
                        }
                    }

                    // Action buttons
                    HStack {
                        Button("Generate") {
                            Task { await viewModel.generate() }
                        }
                        .buttonStyle(.borderedProminent)
                        .disabled(!viewModel.isModelLoaded || viewModel.prompt.isEmpty || viewModel.isGenerating)

                        if viewModel.isGenerating {
                            Button("Cancel", role: .destructive) {
                                Task { await viewModel.cancel() }
                            }
                            .buttonStyle(.bordered)
                        }
                    }

                    // Save button
                    if let image = viewModel.generatedImage {
                        Button("Save to Photos") {
                            // Requires the NSPhotoLibraryAddUsageDescription key in Info.plist
                            UIImageWriteToSavedPhotosAlbum(image, nil, nil, nil)
                        }
                        .buttonStyle(.bordered)
                    }
                }
                .padding()
            }
            .navigationTitle("Image Generation")
            .task { await viewModel.loadModel() }
        }
    }
}

Prompt Engineering Helpers

struct PromptTemplates {
    static func photorealistic(_ subject: String) -> String {
        "\(subject), photorealistic, 8k, highly detailed, professional photography, natural lighting"
    }

    static func illustration(_ subject: String) -> String {
        "\(subject), digital illustration, vibrant colors, clean lines, artstation trending"
    }

    static func portrait(_ subject: String) -> String {
        "\(subject), portrait photography, shallow depth of field, studio lighting, sharp focus"
    }
}

// Usage
let options = DiffusionGenerationOptions(
    prompt: PromptTemplates.photorealistic("a golden retriever in a field of sunflowers"),
    width: 512,
    height: 512,
    steps: 25,
    guidanceScale: 8.0
)

Error Handling

do {
    let result = try await RunAnywhere.generateImage(
        prompt: prompt,
        options: options
    ) { update in
        // isCancelled: an app-level flag set by your cancel UI
        return !isCancelled
    }

    guard let image = UIImage(data: result.imageData) else {
        print("Failed to decode generated image data")
        return
    }
} catch let error as SDKError {
    switch error.code {
    case .notInitialized:
        print("Load a diffusion model before generating images")
    case .modelNotFound:
        print("Diffusion model not found — download it first")
    case .cancelled:
        print("Image generation was cancelled")
    case .processingFailed:
        print("Generation failed: \(error.message)")
    case .outOfMemory:
        print("Not enough memory — try reduceMemory: true")
    default:
        print("Diffusion error: \(error)")
    }
}

Best Practices

The first time a CoreML diffusion model loads, it compiles the model for the target device’s Neural Engine. This takes 5–15 minutes. Show a clear progress indicator and explain the wait to users. Subsequent loads use the compiled cache and are fast.
Stable Diffusion models require ~2GB of RAM. Set reduceMemory: true in DiffusionConfiguration to lower peak memory usage at the cost of some speed. This prevents OOM crashes on older devices.
Always use the native resolution for your model variant: 512×512 for SD 1.5/2.0 and 1024×1024 for SDXL. Non-native resolutions produce distorted or low-quality results.
Image generation can take 30–120 seconds on mobile. Always give users a way to abort: return false from the progress handler or call cancelImageGeneration() so they never have to wait for a generation they no longer want.
Below 15 steps, images are noticeably noisy. Above 30 steps, quality improvement plateaus. 20–25 steps gives the best quality-to-latency ratio on Apple Silicon.
A guidance scale of 7.0–8.5 produces coherent images that follow the prompt. Values above 12 often produce oversaturated, artifact-heavy results.

Supported Models

| Model | Framework | Size | Resolution | Notes |
|-------|-----------|------|------------|-------|
| Stable Diffusion 1.5 | CoreML | ~2GB | 512×512 | Best balance of quality and speed |
| Stable Diffusion 2.0 | CoreML | ~2.5GB | 512×512 / 768×768 | Improved image quality |
| Stable Diffusion XL | CoreML | ~6GB | 1024×1024 | Highest quality, requires more memory |