Stream tokens as they are generated for responsive UX
Stream tokens in real time for a responsive user experience. Streaming is ideal for chat interfaces, where users expect text to appear progressively.
```dart
final streamResult = await RunAnywhere.generateStream(
  'Tell me a story about a robot',
  options: LLMGenerationOptions(maxTokens: 500),
);

// Display tokens as they arrive
await for (final token in streamResult.stream) {
  stdout.write(token); // Real-time output
  // Or update UI: setState(() => _response += token);
}

// Get final metrics after streaming completes
final metrics = await streamResult.result;
print('\n\nGenerated ${metrics.tokensUsed} tokens');
print('Speed: ${metrics.tokensPerSecond.toStringAsFixed(1)} tok/s');
```
```dart
final streamResult = await RunAnywhere.generateStream('Long story...');

// Start a timer to cancel after 5 seconds
Future.delayed(Duration(seconds: 5), () {
  streamResult.cancel();
  print('Generation cancelled');
});

// Or cancel via the static method
await RunAnywhere.cancelGeneration();
```
- **Use streaming for chat interfaces** — Users perceive the app as more responsive when they see tokens appear progressively, even if total generation time is the same.
- **Update UI incrementally** — Append tokens to your state as they arrive.
- **Show a cancel button** — Let users stop long generations.
- **Handle cancellation gracefully** — The stream completes when cancelled.
- **Get final metrics** — Always await `result` for accurate performance data.
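The incremental-update and cancel-button practices above can be combined in a single widget. The following is a minimal sketch, assuming the `RunAnywhere.generateStream` API shown earlier; the widget and field names (`StreamingReplyView`, `_response`) are illustrative, not part of the SDK:

```dart
import 'dart:async';
import 'package:flutter/material.dart';

/// Chat-style view: appends tokens as they arrive and shows a
/// cancel button while generation is in flight.
class StreamingReplyView extends StatefulWidget {
  const StreamingReplyView({super.key, required this.prompt});
  final String prompt;

  @override
  State<StreamingReplyView> createState() => _StreamingReplyViewState();
}

class _StreamingReplyViewState extends State<StreamingReplyView> {
  String _response = '';
  StreamSubscription<String>? _subscription;
  dynamic _streamResult; // holds the object returned by generateStream

  Future<void> _start() async {
    _streamResult = await RunAnywhere.generateStream(widget.prompt);
    _subscription = _streamResult.stream.listen(
      // Update UI incrementally: append each token to state
      (token) => setState(() => _response += token),
      // Handle cancellation gracefully: the stream completes either
      // way, so onDone covers both normal and cancelled endings
      onDone: () => setState(() => _subscription = null),
    );
  }

  void _cancel() => _streamResult?.cancel();

  @override
  Widget build(BuildContext context) {
    return Column(
      children: [
        Text(_response),
        if (_subscription != null)
          TextButton(onPressed: _cancel, child: const Text('Cancel')),
      ],
    );
  }
}
```

Using a `StreamSubscription` instead of `await for` keeps the build method responsive and makes it easy to track whether generation is still active.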