Use generate() for full control over text generation; the returned result includes detailed performance metrics.
val result = RunAnywhere.generate(
    prompt = "Write a haiku about Kotlin programming",
    options = LLMGenerationOptions(
        maxTokens = 50,
        temperature = 1.0f,
        topP = 0.9f,
        stopSequences = listOf("###")
    )
)

println("Response: ${result.text}")
println("Model: ${result.modelUsed}")
println("Tokens: ${result.tokensUsed}")
println("Speed: ${result.tokensPerSecond} tok/s")
println("Latency: ${result.latencyMs}ms")

// For reasoning models (e.g., models with thinking capability)
result.thinkingContent?.let { thinking ->
    println("Reasoning: $thinking")
}

LLMGenerationResult

The result object contains comprehensive metrics:
Property | Type | Description
---------|------|------------
text | String | Generated response text
thinkingContent | String? | Reasoning content (for thinking models)
inputTokens | Int | Number of prompt tokens
tokensUsed | Int | Number of output tokens
modelUsed | String | Model ID used for generation
latencyMs | Double | Total generation time in milliseconds
tokensPerSecond | Double | Generation speed in tokens per second
timeToFirstTokenMs | Double? | Time to first token in milliseconds (streaming only)
framework | String? | Inference framework used
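
As a sketch, these metrics fold naturally into a single log line after each call. The logGenerationMetrics helper below is hypothetical; it uses only the properties listed in the table, skipping the optional fields when they are null.

fun logGenerationMetrics(result: LLMGenerationResult) {
    val summary = buildString {
        append("${result.modelUsed}: ")
        append("${result.inputTokens} in / ${result.tokensUsed} out, ")
        append("%.1f tok/s, %.0f ms".format(result.tokensPerSecond, result.latencyMs))
        // Optional fields are null when not applicable (e.g., non-streaming calls)
        result.timeToFirstTokenMs?.let { append(", TTFT %.0f ms".format(it)) }
        result.framework?.let { append(" [$it]") }
    }
    println(summary)
}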

LLMGenerationOptions

Customize generation behavior:
data class LLMGenerationOptions(
    val maxTokens: Int = 100,                      // Maximum tokens to generate
    val temperature: Float = 0.8f,                 // Sampling temperature (0.0-2.0); higher = more random
    val topP: Float = 1.0f,                        // Nucleus sampling cutoff (0.0-1.0)
    val stopSequences: List<String> = emptyList(), // Strings that end generation when emitted
    val streamingEnabled: Boolean = false,         // Emit tokens incrementally instead of all at once
    val systemPrompt: String? = null               // System prompt that steers model behavior
)
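
systemPrompt is the one option not exercised elsewhere on this page; below is a sketch of using it to steer the model's tone. The prompt strings are illustrative only.

val reply = RunAnywhere.generate(
    prompt = "Explain Kotlin coroutines in one paragraph",
    options = LLMGenerationOptions(
        maxTokens = 150,
        temperature = 0.4f,   // keep the answer focused
        systemPrompt = "You are a concise Kotlin tutor. Answer in plain language."
    )
)
println(reply.text)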

Example: Creative Writing

val story = RunAnywhere.generate(
    prompt = "Write a short story about a robot learning to paint",
    options = LLMGenerationOptions(
        maxTokens = 500,
        temperature = 1.2f,   // Higher = more creative
        topP = 0.95f
    )
)

Example: Factual Response

val facts = RunAnywhere.generate(
    prompt = "List the planets in our solar system",
    options = LLMGenerationOptions(
        maxTokens = 200,
        temperature = 0.1f,   // Lower = more deterministic
        topP = 0.5f
    )
)

Cancel Generation

Cancel an ongoing generation:
// Start generation in a coroutine
val job = lifecycleScope.launch {
    val result = RunAnywhere.generate(longPrompt, options)
}

// Cancel if needed
cancelButton.setOnClickListener {
    RunAnywhere.cancelGeneration()
    job.cancel()
}
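
Cancellation can also be driven by a timeout. The sketch below assumes generate() is a suspend function (as the coroutine usage above implies) and pairs withTimeoutOrNull from kotlinx.coroutines with the explicit cancelGeneration() call:

import kotlinx.coroutines.withTimeoutOrNull

lifecycleScope.launch {
    // withTimeoutOrNull cancels the block and returns null after 10 seconds
    val result = withTimeoutOrNull(10_000L) {
        RunAnywhere.generate(longPrompt, options)
    }
    if (result == null) {
        // The coroutine was cancelled; also stop the underlying generation
        RunAnywhere.cancelGeneration()
    }
}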