Use generate() for full control over text generation with detailed performance metrics.
val result = RunAnywhere.generate(
    prompt = "Write a haiku about Kotlin programming",
    options = LLMGenerationOptions(
        maxTokens = 50,
        temperature = 1.0f,
        topP = 0.9f,
        stopSequences = listOf("###")
    )
)

println("Response: ${result.text}")
println("Model: ${result.modelUsed}")
println("Tokens: ${result.tokensUsed}")
println("Speed: ${result.tokensPerSecond} tok/s")
println("Latency: ${result.latencyMs}ms")

// For reasoning models (e.g., models with thinking capability)
result.thinkingContent?.let { thinking ->
    println("Reasoning: $thinking")
}
LLMGenerationResult
The result object contains comprehensive metrics:
| Property | Type | Description |
|---|---|---|
| text | String | Generated response text |
| thinkingContent | String? | Reasoning content (for thinking models) |
| inputTokens | Int | Number of prompt tokens |
| tokensUsed | Int | Number of output tokens |
| modelUsed | String | Model ID used for generation |
| latencyMs | Double | Total generation time in milliseconds |
| tokensPerSecond | Double | Generation speed |
| timeToFirstTokenMs | Double? | Time to first token (streaming) |
| framework | String? | Inference framework used |
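Three of the properties above are nullable, so code that logs or displays metrics needs a fallback for each. The sketch below shows one way to do that. `GenerationResultSketch` is a local stand-in that copies the documented property names and types so the example compiles without the SDK, and `summarize` is a hypothetical helper, not part of RunAnywhere. The sample values are made up for illustration.

```kotlin
// Local stand-in mirroring the documented LLMGenerationResult properties.
// It is NOT the SDK class; it only lets this sketch compile on its own.
data class GenerationResultSketch(
    val text: String,
    val thinkingContent: String?,
    val inputTokens: Int,
    val tokensUsed: Int,
    val modelUsed: String,
    val latencyMs: Double,
    val tokensPerSecond: Double,
    val timeToFirstTokenMs: Double?,
    val framework: String?
)

// Hypothetical helper: one-line summary with fallbacks for the nullable fields.
fun summarize(r: GenerationResultSketch): String {
    val ttft = r.timeToFirstTokenMs?.let { "${it.toInt()} ms" } ?: "n/a"
    val framework = r.framework ?: "unknown"
    return "${r.modelUsed} [$framework]: ${r.inputTokens} in / ${r.tokensUsed} out, " +
        "${r.latencyMs.toInt()} ms total, TTFT $ttft"
}

fun main() {
    // Illustrative values only; a real result comes from RunAnywhere.generate().
    val sample = GenerationResultSketch(
        text = "Hello",
        thinkingContent = null,
        inputTokens = 12,
        tokensUsed = 40,
        modelUsed = "example-model",
        latencyMs = 800.0,
        tokensPerSecond = 50.0,
        timeToFirstTokenMs = null,
        framework = null
    )
    println(summarize(sample))
}
```

With the real SDK type, the same null handling would apply to `thinkingContent`, `timeToFirstTokenMs`, and `framework`.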
LLMGenerationOptions
Customize generation behavior:
data class LLMGenerationOptions(
    val maxTokens: Int = 100,                      // Maximum tokens to generate
    val temperature: Float = 0.8f,                 // Creativity (0.0-2.0)
    val topP: Float = 1.0f,                        // Nucleus sampling (0.0-1.0)
    val stopSequences: List<String> = emptyList(),
    val streamingEnabled: Boolean = false,
    val systemPrompt: String? = null               // System behavior prompt
)
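The comments above give ranges for `temperature` (0.0-2.0) and `topP` (0.0-1.0), but this page does not say whether the SDK rejects, clamps, or ignores out-of-range values. A minimal sketch of checking the ranges yourself before calling `generate()` follows. `OptionsSketch` is a local copy of the documented fields and defaults so the example runs on its own, and `validate` is a hypothetical helper, not an SDK function. The `maxTokens > 0` check is an assumption added for illustration.

```kotlin
// Local stand-in copying the documented fields and defaults of LLMGenerationOptions.
// It is NOT the SDK class.
data class OptionsSketch(
    val maxTokens: Int = 100,
    val temperature: Float = 0.8f,
    val topP: Float = 1.0f,
    val stopSequences: List<String> = emptyList(),
    val streamingEnabled: Boolean = false,
    val systemPrompt: String? = null
)

// Hypothetical helper: returns a list of problems, empty when the options look sane.
fun validate(o: OptionsSketch): List<String> = buildList {
    // Assumption: a non-positive token budget is never useful.
    if (o.maxTokens <= 0) add("maxTokens must be positive, was ${o.maxTokens}")
    // Ranges taken from the comments in the documented data class.
    if (o.temperature !in 0.0f..2.0f) add("temperature must be in 0.0..2.0, was ${o.temperature}")
    if (o.topP !in 0.0f..1.0f) add("topP must be in 0.0..1.0, was ${o.topP}")
}

fun main() {
    println(validate(OptionsSketch()))                                  // defaults
    println(validate(OptionsSketch(temperature = 2.5f, topP = 1.2f)))   // two problems
}
```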
Example: Creative Writing
val story = RunAnywhere.generate(
    prompt = "Write a short story about a robot learning to paint",
    options = LLMGenerationOptions(
        maxTokens = 500,
        temperature = 1.2f,  // Higher = more creative
        topP = 0.95f
    )
)
Example: Factual Response
val facts = RunAnywhere.generate(
    prompt = "List the planets in our solar system",
    options = LLMGenerationOptions(
        maxTokens = 200,
        temperature = 0.1f,  // Lower = more deterministic
        topP = 0.5f
    )
)
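The two examples above differ mainly in `temperature` and `topP`. To make their effect concrete, here is a toy illustration of how these parameters conventionally work: temperature rescales the logits before softmax, and nucleus (top-p) sampling keeps the smallest set of candidates whose cumulative probability reaches `topP`. This is the textbook technique applied to three made-up logits, not the SDK's actual sampler, and both function names are hypothetical.

```kotlin
import kotlin.math.exp

// Softmax over logits divided by the temperature; lower values sharpen the distribution.
// Temperature 0 is usually handled as plain greedy decoding and is not covered here.
fun softmaxWithTemperature(logits: List<Double>, temperature: Double): List<Double> {
    require(temperature > 0.0) { "temperature must be > 0 in this sketch" }
    val scaled = logits.map { it / temperature }
    val max = scaled.maxOf { it }            // subtract the max for numerical stability
    val exps = scaled.map { exp(it - max) }
    val sum = exps.sum()
    return exps.map { it / sum }
}

// Indices of the most probable candidates whose cumulative probability reaches topP.
fun nucleusIndices(probs: List<Double>, topP: Double): List<Int> {
    val order = probs.indices.sortedByDescending { probs[it] }
    val kept = mutableListOf<Int>()
    var cumulative = 0.0
    for (i in order) {
        kept += i
        cumulative += probs[i]
        if (cumulative >= topP) break
    }
    return kept
}

fun main() {
    val logits = listOf(2.0, 1.0, 0.0)       // made-up scores for three candidate tokens

    val factual = softmaxWithTemperature(logits, temperature = 0.1)
    println("temperature 0.1, topP 0.5  -> candidates ${nucleusIndices(factual, 0.5)}")

    val creative = softmaxWithTemperature(logits, temperature = 1.2)
    println("temperature 1.2, topP 0.95 -> candidates ${nucleusIndices(creative, 0.95)}")
}
```

With the low-temperature settings only the top candidate survives the nucleus filter, while the high-temperature settings leave all three candidates available to sample from.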
Cancel Generation
Cancel an ongoing generation:
// Start generation in a coroutine
val job = lifecycleScope.launch {
    val result = RunAnywhere.generate(longPrompt, options)
}

// Cancel if needed
cancelButton.setOnClickListener {
    RunAnywhere.cancelGeneration()
    job.cancel()
}
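Cancelling a coroutine `Job` surfaces inside the coroutine as a `CancellationException` at the next suspension point, which is where cleanup belongs. The sketch below shows that pattern with plain `kotlinx.coroutines` (it needs the `kotlinx-coroutines-core` dependency). `fakeGenerate` is a hypothetical stand-in for a long-running suspend call. Whether `RunAnywhere.generate()` observes coroutine cancellation in the same way is an assumption this page does not confirm, which may be why the snippet above also calls `cancelGeneration()`.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for a long-running suspend call such as generate().
suspend fun fakeGenerate(steps: Int): String {
    val sb = StringBuilder()
    repeat(steps) { i ->
        delay(50)               // suspension point: cancellation is observed here
        sb.append("tok$i ")
    }
    return sb.toString()
}

// Starts the fake generation, cancels it after about 120 ms,
// and reports whether it ran to completion.
fun runAndCancel(): Boolean = runBlocking {
    var completed = false
    val job = launch {
        try {
            fakeGenerate(steps = 100)
            completed = true
        } catch (e: CancellationException) {
            // Reset UI state or release resources here, then rethrow
            // so the coroutine is still marked as cancelled.
            throw e
        }
    }
    delay(120)
    job.cancelAndJoin()         // request cancellation and wait for the coroutine to end
    completed
}

fun main() {
    println("completed = ${runAndCancel()}")
}
```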