Generate text with detailed metrics including latency, token count, and generation speed.
final result = await RunAnywhere.generate(
  'Explain quantum computing in simple terms',
  options: LLMGenerationOptions(
    maxTokens: 200,
    temperature: 0.7,
  ),
);

print('Response: ${result.text}');
print('Tokens: ${result.tokensUsed}');
print('Speed: ${result.tokensPerSecond.toStringAsFixed(1)} tok/s');
print('Latency: ${result.latencyMs.toStringAsFixed(0)}ms');

LLMGenerationOptions

| Parameter     | Type         | Default | Description                        |
|---------------|--------------|---------|------------------------------------|
| maxTokens     | int          | 100     | Maximum tokens to generate         |
| temperature   | double       | 0.8     | Randomness (0.0–2.0)               |
| topP          | double       | 1.0     | Nucleus sampling parameter         |
| stopSequences | List<String> | []      | Stop generation at these sequences |
| systemPrompt  | String?      | null    | System prompt for context          |
const options = LLMGenerationOptions(
  maxTokens: 256,
  temperature: 0.7,
  topP: 0.95,
  stopSequences: ['END', '###'],
  systemPrompt: 'You are a helpful coding assistant.',
);
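The configured options are passed to generate the same way as in the first example; a minimal sketch (the prompt string here is only illustrative):

final result = await RunAnywhere.generate(
  'Write a Dart function that reverses a string',
  options: options,
);

// Generation stops as soon as 'END' or '###' appears in the output.
print(result.text);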

LLMGenerationResult

| Property           | Type    | Description                     |
|--------------------|---------|---------------------------------|
| text               | String  | Generated text                  |
| thinkingContent    | String? | Thinking content (if supported) |
| inputTokens        | int     | Number of input tokens          |
| tokensUsed         | int     | Number of output tokens         |
| modelUsed          | String  | Model ID used                   |
| latencyMs          | double  | Total latency in milliseconds   |
| tokensPerSecond    | double  | Generation speed                |
| timeToFirstTokenMs | double? | Time to first token             |
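Each property is read directly from the returned LLMGenerationResult; a minimal sketch that logs the remaining metrics (the prompt is illustrative, and timeToFirstTokenMs is guarded because it is nullable):

final result = await RunAnywhere.generate('List three uses of on-device LLMs');

print('Model: ${result.modelUsed}');
print('Input tokens: ${result.inputTokens}, output tokens: ${result.tokensUsed}');
if (result.timeToFirstTokenMs != null) {
  print('First token after ${result.timeToFirstTokenMs!.toStringAsFixed(0)}ms');
}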

Thinking Models

Some models support “thinking” tokens for chain-of-thought reasoning:
LlamaCpp.addModel(
  id: 'qwen-cot',
  name: 'Qwen CoT',
  url: '...',
  supportsThinking: true,  // Enable thinking token parsing
);

final result = await RunAnywhere.generate('Solve: 2x + 5 = 15');

if (result.thinkingContent != null) {
  print('Reasoning: ${result.thinkingContent}');
}
print('Answer: ${result.text}');
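If the model was registered without supportsThinking, thinkingContent is null, so keep the null check shown above before reading it.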

See Also