Overview

LoRA (Low-Rank Adaptation) lets you apply lightweight fine-tuned adapters to a loaded base model at runtime. Swap adapters instantly without reloading the full model — perfect for switching between domain-specific behaviors like medical QA, creative writing, or code generation.

Package Imports

import com.runanywhere.sdk.public.RunAnywhere
import com.runanywhere.sdk.public.extensions.LLM.LoRAAdapterConfig
import com.runanywhere.sdk.public.extensions.LLM.LoRAAdapterInfo
import com.runanywhere.sdk.public.extensions.loadLoraAdapter
import com.runanywhere.sdk.public.extensions.removeLoraAdapter
import com.runanywhere.sdk.public.extensions.clearLoraAdapters
import com.runanywhere.sdk.public.extensions.getLoadedLoraAdapters
import com.runanywhere.sdk.public.extensions.checkLoraCompatibility

Basic Usage

// 1. Load a base model first
RunAnywhere.loadLLMModel("qwen-0.5b")

// 2. Check compatibility
val compat = RunAnywhere.checkLoraCompatibility("/path/to/adapter.gguf")
if (!compat.isCompatible) {
    println("Incompatible: ${compat.error}")
    return
}

// 3. Apply a LoRA adapter
RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(
    path = "/path/to/adapter.gguf",
    scale = 1.0f
))

// 4. Generate with the adapter applied
val result = RunAnywhere.generate("What are the symptoms of diabetes?")
println(result.text)

A base LLM model must be loaded before applying LoRA adapters. Calling loadLoraAdapter() without a loaded model will throw an SDKError.

Adapter Scale

The scale parameter controls how strongly the adapter affects generation:
Scale   Effect
0.0     No effect (adapter loaded but inactive)
0.5     Half strength (subtle influence)
1.0     Full strength (default)
> 1.0   Amplified effect (use with caution)

// Subtle adapter influence
RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(
    path = "/path/to/style-adapter.gguf",
    scale = 0.3f
))
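
The API shown here has no in-place way to change a loaded adapter's scale, so one approach (an assumption, not a documented pattern) is to remove the adapter and reload it at the new strength. Note that each load recreates the context, so this is not free:

```kotlin
// Hypothetical helper (not part of the SDK): change an adapter's
// effective scale by removing it and reloading it at the new strength.
suspend fun setAdapterScale(path: String, newScale: Float) {
    RunAnywhere.removeLoraAdapter(path)  // drop the current instance
    RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(
        path = path,
        scale = newScale                 // reapply at the new strength
    ))
}
```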

Stacking Multiple Adapters

You can apply multiple LoRA adapters simultaneously:
// Apply medical knowledge adapter
RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(
    path = "/path/to/medical-qa.gguf",
    scale = 1.0f
))

// Stack a conversational style adapter on top
RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(
    path = "/path/to/conversational.gguf",
    scale = 0.5f
))

// Both adapters are now active
val adapters = RunAnywhere.getLoadedLoraAdapters()
println("Active adapters: ${adapters.size}") // 2
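
A stacked adapter can also be removed individually by the path it was loaded with, leaving the rest of the stack active (paths here are the illustrative ones from above):

```kotlin
// Drop only the conversational style adapter
RunAnywhere.removeLoraAdapter("/path/to/conversational.gguf")

// The medical adapter remains loaded
val remaining = RunAnywhere.getLoadedLoraAdapters()
println("Active adapters: ${remaining.size}") // 1
```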

API Reference

LoRAAdapterConfig

@Serializable
data class LoRAAdapterConfig(
    val path: String,        // Path to LoRA adapter GGUF file (required)
    val scale: Float = 1.0f  // Scale factor (0.0 to 1.0+)
)
Parameter   Type     Default    Description
path        String   required   Path to the LoRA adapter .gguf file. Must not be blank.
scale       Float    1.0f       How strongly the adapter affects output.

LoRAAdapterInfo

@Serializable
data class LoRAAdapterInfo(
    val path: String,     // Path used when loading
    val scale: Float,     // Active scale factor
    val applied: Boolean  // Whether applied to current context
)

LoraCompatibilityResult

data class LoraCompatibilityResult(
    val isCompatible: Boolean,
    val error: String? = null
)

loadLoraAdapter

suspend fun RunAnywhere.loadLoraAdapter(config: LoRAAdapterConfig)
Loads and applies a LoRA adapter to the currently loaded model. Context is recreated internally. Throws: SDKError if no model is loaded or loading fails.

removeLoraAdapter

suspend fun RunAnywhere.removeLoraAdapter(path: String)
Removes a specific LoRA adapter by the path used when loading. Throws: SDKError if adapter not found or removal fails.

clearLoraAdapters

suspend fun RunAnywhere.clearLoraAdapters()
Removes all loaded LoRA adapters and restores the base model behavior.
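
For example, this can be used to contrast adapted output with base-model output on the same prompt (a sketch using the generate API shown earlier):

```kotlin
val prompt = "What are the symptoms of diabetes?"

val adapted = RunAnywhere.generate(prompt)   // adapters still active
RunAnywhere.clearLoraAdapters()              // restore base-model behavior
val baseline = RunAnywhere.generate(prompt)  // base model only

println(adapted.text)
println(baseline.text)
```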

getLoadedLoraAdapters

suspend fun RunAnywhere.getLoadedLoraAdapters(): List<LoRAAdapterInfo>
Returns information about all currently loaded adapters.
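
Each entry is a LoRAAdapterInfo, so the active stack can be inspected like this:

```kotlin
RunAnywhere.getLoadedLoraAdapters().forEach { info ->
    println("${info.path}: scale=${info.scale}, applied=${info.applied}")
}
```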

checkLoraCompatibility

fun RunAnywhere.checkLoraCompatibility(loraPath: String): LoraCompatibilityResult
Checks if a LoRA adapter file is compatible with the currently loaded model. Always call this before loadLoraAdapter() to avoid runtime errors.

Adapter Catalog

Register LoRA adapters in a catalog for discovery and management:
import com.runanywhere.sdk.public.extensions.LoraAdapterCatalogEntry
import com.runanywhere.sdk.public.extensions.registerLoraAdapter
import com.runanywhere.sdk.public.extensions.loraAdaptersForModel
import com.runanywhere.sdk.public.extensions.allRegisteredLoraAdapters

// Register adapter metadata
RunAnywhere.registerLoraAdapter(LoraAdapterCatalogEntry(
    id = "medical-qa-v1",
    name = "Medical QA Adapter",
    description = "Fine-tuned for medical question answering",
    downloadUrl = "https://huggingface.co/your-org/medical-qa-lora/resolve/main/adapter.gguf",
    filename = "medical-qa.gguf",
    compatibleModelIds = listOf("qwen-0.5b", "qwen-1.5b"),
    fileSize = 25_000_000L,
    defaultScale = 1.0f
))

// Find adapters for a specific model
val adapters = RunAnywhere.loraAdaptersForModel("qwen-0.5b")
adapters.forEach { println("${it.name}: ${it.description}") }

// List all registered adapters
val all = RunAnywhere.allRegisteredLoraAdapters()

LoraAdapterCatalogEntry

Parameter           Type          Description
id                  String        Unique identifier
name                String        Display name
description         String        What the adapter does
downloadUrl         String        URL to download the adapter file
filename            String        Local filename
compatibleModelIds  List<String>  Base model IDs this adapter works with
fileSize            Long          File size in bytes (default: 0)
defaultScale        Float         Recommended scale factor (default: 1.0)
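
Catalog lookup and adapter loading can be combined; a minimal sketch, assuming adapter files have already been downloaded to a local directory (`adaptersDir` and the path construction are assumptions — the catalog stores metadata, it does not download files for you):

```kotlin
// Sketch: pick the first catalog adapter registered for a model and
// apply it at its recommended scale.
suspend fun applyCatalogAdapter(modelId: String, adaptersDir: String) {
    val entry = RunAnywhere.loraAdaptersForModel(modelId).firstOrNull() ?: return

    // Hypothetical convention: downloaded adapters live under adaptersDir
    val localPath = "$adaptersDir/${entry.filename}"

    val compat = RunAnywhere.checkLoraCompatibility(localPath)
    if (!compat.isCompatible) {
        println("Skipping ${entry.name}: ${compat.error}")
        return
    }

    RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(
        path = localPath,
        scale = entry.defaultScale
    ))
}
```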

Examples

Swapping Adapters in a ViewModel

class ChatViewModel : ViewModel() {
    private var currentDomain: String? = null

    fun switchDomain(domain: String, adapterPath: String) {
        viewModelScope.launch {
            // Remove previous adapter
            RunAnywhere.clearLoraAdapters()

            // Apply new domain adapter
            RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(
                path = adapterPath,
                scale = 1.0f
            ))

            currentDomain = domain
        }
    }

    fun generate(prompt: String) {
        viewModelScope.launch {
            val result = RunAnywhere.generate(prompt)
            _response.value = result.text
        }
    }
}

Adapter A/B Testing

suspend fun compareAdapters(
    prompt: String,
    adapterA: String,
    adapterB: String
): Pair<String, String> {
    // Test adapter A
    RunAnywhere.clearLoraAdapters()
    RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(path = adapterA))
    val resultA = RunAnywhere.generate(prompt)

    // Test adapter B
    RunAnywhere.clearLoraAdapters()
    RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(path = adapterB))
    val resultB = RunAnywhere.generate(prompt)

    return Pair(resultA.text, resultB.text)
}

Error Handling

try {
    RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(
        path = "/path/to/adapter.gguf",
        scale = 1.0f
    ))
} catch (e: SDKError) {
    when {
        e.message?.contains("not initialized") == true ->
            println("SDK not initialized")
        e.message?.contains("not loaded") == true ->
            println("Load a base model first")
        else ->
            println("LoRA error: ${e.message}")
    }
}

Performance Tips

  • Check compatibility first — always call checkLoraCompatibility() before loading to avoid runtime errors
  • Adapter loading recreates context — expect ~100-300 ms of latency when loading or removing adapters
  • KV cache is cleared — conversation history is reset when adapters change
  • Adapters are lightweight — typically 10-50 MB vs. multi-GB base models
  • Scale tuning — start with 1.0 and adjust down if the adapter's influence is too strong
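
Because the KV cache is cleared whenever adapters change, a chat UI may want to replay prior turns after a swap. A minimal sketch, assuming conversation history is kept as a simple list of turns and folded back into the prompt:

```kotlin
suspend fun switchAdapterAndResume(
    adapterPath: String,
    history: List<String>,  // prior user/assistant turns, oldest first
    nextPrompt: String
): String {
    RunAnywhere.clearLoraAdapters()
    RunAnywhere.loadLoraAdapter(LoRAAdapterConfig(path = adapterPath))

    // The KV cache was cleared by the swap, so fold prior turns
    // back into the prompt before generating.
    val fullPrompt = (history + nextPrompt).joinToString("\n")
    return RunAnywhere.generate(fullPrompt).text
}
```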
