requestStreaming PRO

requestStreaming requests a streaming response from the Assistant. Instead of returning a complete result at once, the Assistant emits chunks incrementally as the model generates output.

This enables:

  • Real-time UI updates (typing effect)
  • Low-latency handling of long responses
  • Progressive rendering of results
  • Streaming logs and intermediate output handling

The API returns a ReadableStream<StreamChunk>, which can be consumed using for await ... of.


API Definition

function requestStreaming(options: {
  systemPrompt?: string | null
  messages: MessageItem | MessageItem[]
  provider?: Provider
  modelId?: string
}): Promise<ReadableStream<StreamChunk>>

Parameters

options.systemPrompt (optional)

  • Type: string | null

  • Specifies the system prompt for this request.

  • If omitted:

    • The default Assistant system prompt is used.
  • If provided:

    • It fully replaces the default system prompt.
    • Assistant Tools are not available.

Typical use cases:

  • Defining a strict role (e.g. reviewer, translator, summarizer)
  • Enforcing output tone or behavior
  • Running the model without built-in tools
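For example, a minimal sketch that pins the model to a translator role (the prompt wording is illustrative; remember that built-in tools are unavailable once systemPrompt is set):

const stream = await Assistant.requestStreaming({
  // Replaces the default system prompt entirely; Assistant Tools are disabled.
  systemPrompt: "You are a translator. Render the user's input in French and return only the translation.",
  messages: { role: "user", content: "Good morning, everyone." }
})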

options.messages

  • Type: MessageItem | MessageItem[]
  • Required
  • Represents the conversation context sent to the model.

MessageItem

type MessageItem = {
  role: "user" | "assistant"
  content: MessageContent | MessageContent[]
}
  • role

    • "user": user input
    • "assistant": previous assistant messages (for context)

MessageContent Types

Text

type MessageTextContent =
  | string
  | { type: "text"; content: string }

Image

type MessageImageContent = {
  type: "image"
  content: string // data:image/...;base64,...
}
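For instance, a sketch of building an image message from a local file, assuming a Node.js runtime (the file path is hypothetical):

import { readFile } from "node:fs/promises"

const bytes = await readFile("diagram.png")

const stream = await Assistant.requestStreaming({
  messages: {
    role: "user",
    content: [
      { type: "text", content: "What does this diagram show?" },
      // Encode the bytes as a base64 data URL, as the type requires
      { type: "image", content: `data:image/png;base64,${bytes.toString("base64")}` }
    ]
  }
})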

Document

type MessageDocumentContent = {
  type: "document"
  content: {
    mediaType: string
    data: string // base64
  }
}

options.provider (optional)

  • Type: Provider

  • Specifies the AI provider.

  • If omitted, the currently configured default provider is used.

  • Supported values:

    • "openai"
    • "gemini"
    • "anthropic"
    • "deepseek"
    • "openrouter"
    • { custom: string }

options.modelId (optional)

  • Type: string
  • Specifies the model ID.
  • Must match a model actually supported by the selected provider.
  • If omitted, the provider’s default model is used.
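A sketch combining both options; the model ID shown is illustrative and must be one the selected provider actually supports:

const stream = await Assistant.requestStreaming({
  messages: { role: "user", content: "Summarize RFC 9114 in two sentences." },
  provider: "anthropic",
  // Hypothetical ID; check your provider's current model list
  modelId: "claude-sonnet-4"
})

// A self-hosted or other custom provider uses the object form:
// provider: { custom: "my-gateway" }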

Return Value

Promise<ReadableStream<StreamChunk>>

Once resolved, you receive a stream that can be consumed asynchronously.


StreamChunk Types

The stream may emit the following chunk types.


StreamTextChunk

type StreamTextChunk = {
  type: "text"
  content: string
}
  • Represents user-visible generated text.
  • Concatenating the text chunks in order yields the final response.

StreamReasoningChunk

type StreamReasoningChunk = {
  type: "reasoning"
  content: string
}
  • Represents intermediate reasoning produced by the model.
  • Availability and granularity depend on the provider and model.

StreamUsageChunk

type StreamUsageChunk = {
  type: "usage"
  content: {
    totalCost: number | null
    cacheReadTokens: number | null
    cacheWriteTokens: number | null
    inputTokens: number
    outputTokens: number
  }
}

Notes:

  • Typically emitted once near the end of the stream.
  • Some providers may omit certain fields.
  • totalCost may be null if the provider does not expose pricing data.
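Because several fields are nullable, guard before formatting them. A small sketch inside a consuming loop:

for await (const chunk of stream) {
  if (chunk.type === "usage") {
    const { totalCost, inputTokens, outputTokens } = chunk.content
    console.log(`tokens: ${inputTokens} in / ${outputTokens} out`)
    // totalCost may be null if the provider exposes no pricing data
    if (totalCost !== null) console.log(`cost: $${totalCost.toFixed(6)}`)
  }
}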

Examples

Example 1: Basic streaming request

const stream = await Assistant.requestStreaming({
  messages: {
    role: "user",
    content: "Tell me a short science fiction story."
  },
  provider: "openai"
})

let result = ""

for await (const chunk of stream) {
  if (chunk.type === "text") {
    result += chunk.content
    console.log(chunk.content)
  }
}

Example 2: Handling text, reasoning, and usage separately

const stream = await Assistant.requestStreaming({
  systemPrompt: "You are a precise technical writing assistant.",
  messages: [
    {
      role: "user",
      content: "Explain what HTTP/3 is."
    }
  ]
})

let answer = ""
let reasoningLog = ""
let usage: StreamUsageChunk["content"] | null = null

for await (const chunk of stream) {
  switch (chunk.type) {
    case "text":
      answer += chunk.content
      break

    case "reasoning":
      reasoningLog += chunk.content
      break

    case "usage":
      usage = chunk.content
      break
  }
}

console.log(answer)
console.log(usage)

Example 3: Streaming with document input

const stream = await Assistant.requestStreaming({
  messages: [
    {
      role: "user",
      content: [
        { type: "text", content: "Summarize the key points of this document." },
        {
          type: "document",
          content: {
            mediaType: "application/pdf",
            data: "JVBERi0xLjQKJcfs..."
          }
        }
      ]
    }
  ],
  provider: "anthropic"
})

for await (const chunk of stream) {
  if (chunk.type === "text") {
    console.log(chunk.content)
  }
}

Usage Notes and Best Practices

  • Streams must be consumed sequentially; do not read concurrently.

  • For UI scenarios:

    • Render text chunks immediately.
    • Keep reasoning for debugging or developer modes.
    • Process usage after completion.
  • If you no longer need the output, stop consuming the stream to avoid unnecessary cost (see the cancellation sketch after this list).

  • Not all providers/models emit reasoning or usage.

  • Do not assume a chunk represents a complete sentence; chunk sizes vary.
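A sketch of stopping early. In runtimes where ReadableStream is async-iterable (e.g. Node.js and recent browsers), breaking out of the for await ... of loop cancels the underlying stream:

let text = ""

for await (const chunk of stream) {
  if (chunk.type === "text") text += chunk.content
  // Stop once we have enough output; exiting the loop cancels the stream
  if (text.length > 2000) break
}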