# `requestStreaming` (PRO)
`requestStreaming` requests a streaming response from the Assistant. Instead of returning a complete result at once, the Assistant emits chunks incrementally as the model generates output.

This enables:

- Real-time UI updates (typing effect)
- Low-latency handling of long responses
- Progressive rendering of results
- Streaming logs and intermediate output handling

The API returns a `ReadableStream<StreamChunk>`, which can be consumed using `for await ... of`.
## API Definition
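The shape of the call can be sketched as follows. This is a reconstruction from the parameter descriptions below, not the official type definitions; interface names and the usage-chunk fields (`inputTokens`, `outputTokens`) are assumptions.

```typescript
// Hedged sketch of the requestStreaming surface, reconstructed from the
// prose in this page. Names and field shapes are assumptions.

type Provider =
  | "openai"
  | "gemini"
  | "anthropic"
  | "deepseek"
  | "openrouter"
  | { custom: string };

// Content variants (Text / Image / Document) are described below;
// modeled here as an opaque placeholder.
type MessageContent = unknown;

interface MessageItem {
  role: "user" | "assistant";
  content: MessageContent;
}

interface RequestStreamingOptions {
  systemPrompt?: string | null;          // replaces the default prompt entirely
  messages: MessageItem | MessageItem[]; // required
  provider?: Provider;
  modelId?: string;
}

interface StreamTextChunk { type: "text"; text: string }
interface StreamReasoningChunk { type: "reasoning"; text: string }
interface StreamUsageChunk {
  type: "usage";
  inputTokens?: number;        // assumed field name
  outputTokens?: number;       // assumed field name
  totalCost?: number | null;   // null when the provider exposes no pricing data
}

type StreamChunk = StreamTextChunk | StreamReasoningChunk | StreamUsageChunk;

// Resolves to the chunk stream once the request is accepted.
declare function requestStreaming(
  options: RequestStreamingOptions
): Promise<ReadableStream<StreamChunk>>;
```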
## Parameters
### `options.systemPrompt` (optional)

- Type: `string | null`
- Specifies the system prompt for this request.
- If omitted, the default Assistant system prompt is used.
- If provided:
  - It fully replaces the default system prompt.
  - Assistant Tools are not available.

Typical use cases:

- Defining a strict role (e.g. reviewer, translator, summarizer)
- Enforcing output tone or behavior
- Running the model without built-in tools
### `options.messages` (required)

- Type: `MessageItem | MessageItem[]`
- Represents the conversation context sent to the model.

#### `MessageItem`

- `role`:
  - `"user"`: user input
  - `"assistant"`: previous assistant messages (for context)

#### `MessageContent` Types

- Text
- Image
- Document
### `options.provider` (optional)

- Type: `Provider`
- Specifies the AI provider.
- If omitted, the currently configured default provider is used.
- Supported values: `"openai"`, `"gemini"`, `"anthropic"`, `"deepseek"`, `"openrouter"`, `{ custom: string }`
### `options.modelId` (optional)

- Type: `string`
- Specifies the model ID.
- Must match a model actually supported by the selected provider.
- If omitted, the provider's default model is used.
## Return Value

Once the returned promise resolves, you receive a `ReadableStream<StreamChunk>` that can be consumed asynchronously.
## StreamChunk Types

The stream may emit the following chunk types.

### `StreamTextChunk`

- Represents user-visible generated text.
- Multiple chunks concatenated form the final response.

### `StreamReasoningChunk`

- Represents intermediate reasoning produced by the model.
- Availability and granularity depend on the provider and model.

### `StreamUsageChunk`

Notes:

- Typically emitted once near the end of the stream.
- Some providers may omit certain fields.
- `totalCost` may be `null` if the provider does not expose pricing data.
## Examples
### Example 1: Basic streaming request
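A minimal consumption loop can be sketched as follows. The real call would be `await requestStreaming({ messages: ... })`; since the returned `ReadableStream<StreamChunk>` is async-iterable, the mock below uses an async generator so the snippet is self-contained. Chunk shapes follow the descriptions above and are assumptions.

```typescript
// Sketch: concatenate streamed text chunks into the final response.
// mockStream() stands in for:
//   const stream = await requestStreaming({ messages: { role: "user", content: "Say hello" } });

type StreamChunk =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "usage"; totalCost: number | null };

async function* mockStream(): AsyncGenerator<StreamChunk> {
  yield { type: "text", text: "Hello" };
  yield { type: "text", text: ", world" };
  yield { type: "usage", totalCost: null };
}

async function collectText(stream: AsyncIterable<StreamChunk>): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    // Render each text chunk immediately for a typing effect;
    // concatenated, they form the complete response.
    if (chunk.type === "text") full += chunk.text;
  }
  return full;
}
```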
### Example 2: Handling text, reasoning, and usage separately
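Routing each chunk type to its own destination can be sketched with a `switch` on the chunk's discriminant. As above, an async generator mocks the stream, and the `outputTokens` field on the usage chunk is an assumed name.

```typescript
// Sketch: fan out text / reasoning / usage chunks into separate buckets.

type StreamChunk =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "usage"; outputTokens?: number; totalCost: number | null };

interface StreamResult {
  text: string;
  reasoning: string;
  usage: { outputTokens?: number; totalCost: number | null } | null;
}

async function consume(stream: AsyncIterable<StreamChunk>): Promise<StreamResult> {
  const result: StreamResult = { text: "", reasoning: "", usage: null };
  for await (const chunk of stream) {
    switch (chunk.type) {
      case "text": // user-visible output: render immediately
        result.text += chunk.text;
        break;
      case "reasoning": // may be absent depending on provider/model
        result.reasoning += chunk.text;
        break;
      case "usage": // typically arrives once, near the end
        result.usage = { outputTokens: chunk.outputTokens, totalCost: chunk.totalCost };
        break;
    }
  }
  return result;
}

async function* mockStream(): AsyncGenerator<StreamChunk> {
  yield { type: "reasoning", text: "Planning the answer..." };
  yield { type: "text", text: "42" };
  yield { type: "usage", outputTokens: 1, totalCost: null };
}
```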
### Example 3: Streaming with document input
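A request with a document attachment can be sketched as follows. The content-part field names (`type`, `name`, `data`) are hypothetical; consult the `MessageContent` reference for the real shapes.

```typescript
// Sketch: build a messages array that pairs a text instruction with a
// document content part. Field names here are assumptions.

interface TextContent { type: "text"; text: string }
interface DocumentContent { type: "document"; name: string; data: string /* e.g. base64 */ }
type MessageContent = TextContent | DocumentContent;
interface MessageItem { role: "user" | "assistant"; content: MessageContent[] }

const messages: MessageItem[] = [
  {
    role: "user",
    content: [
      { type: "text", text: "Summarize the attached report." },
      { type: "document", name: "report.pdf", data: "<base64-encoded bytes>" },
    ],
  },
];

// The request itself (hypothetical call shape):
//   const stream = await requestStreaming({
//     systemPrompt: "You are a concise summarizer.",
//     messages,
//     provider: "anthropic",
//   });
//   for await (const chunk of stream) {
//     if (chunk.type === "text") render(chunk.text);
//   }
```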
## Usage Notes and Best Practices

- Streams must be consumed sequentially; do not read concurrently.
- For UI scenarios:
  - Render `text` chunks immediately.
  - Keep `reasoning` for debugging or developer modes.
  - Process `usage` after completion.
- If you no longer need the output, stop consuming the stream to avoid unnecessary cost.
- Not all providers/models emit `reasoning` or `usage`.
- Do not assume a chunk represents a complete sentence; chunk sizes vary.
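The "stop consuming" note can be sketched as follows: exiting a `for await` loop early stops pulling from the stream (for a web `ReadableStream` this cancels it; with a manual reader you would call `reader.cancel()`). The mock below is an async generator standing in for the stream, with a counter showing that no further chunks are produced after the break.

```typescript
// Sketch: abandon the stream after N chunks. Breaking out of for await
// stops the producer, which is where a real client would abort the request.

type StreamChunk = { type: "text"; text: string };

let produced = 0;

async function* mockStream(): AsyncGenerator<StreamChunk> {
  try {
    for (let i = 0; i < 100; i++) {
      produced++;
      yield { type: "text", text: `chunk ${i} ` };
    }
  } finally {
    // Runs when the consumer stops early (break / cancel); a real client
    // would release the network connection here.
  }
}

async function takeFirst(n: number): Promise<string> {
  let out = "";
  let seen = 0;
  for await (const chunk of mockStream()) {
    out += chunk.text;
    if (++seen >= n) break; // stop consuming: no further chunks are produced
  }
  return out;
}
```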
