Assistant PRO

Requires Scripting PRO

The Assistant module provides a set of APIs for requesting structured JSON data from an intelligent assistant. It can be used for automation tasks such as extracting billing information, classifying expenses, parsing text, or recognizing image content.


isAvailable Variable

Description

Indicates whether the Assistant API is currently available.

  • This status depends on the selected AI provider and whether a valid API key has been configured.
  • If no valid API key is provided, the Assistant API will be unavailable.
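
For example, you can check this flag before making a request. The following is a minimal sketch; the message and control flow are illustrative:

// Guard Assistant calls behind the availability flag.
if (!Assistant.isAvailable) {
  console.log("Assistant API is unavailable. Configure an AI provider and API key first.")
} else {
  // Safe to call Assistant.requestStructuredData(...) here.
}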

requestStructuredData Method

Description

The requestStructuredData method allows you to send a natural language prompt and receive a structured data response that conforms to a defined JSON Schema. It supports two forms:

  1. Text-only input
  2. Input with images — allows the model to analyze both textual and visual data

Syntax 1: Text Input Version

function requestStructuredData<R>(
  prompt: string,
  schema: JSONSchemaArray | JSONSchemaObject,
  options?: {
    provider: "openai" | "gemini" | "anthropic" | "deepseek" | "openrouter" | { custom: string }
    modelId?: string
  }
): Promise<R>

Syntax 2: Image Input Version

function requestStructuredData<R>(
  prompt: string,
  images: string[],
  schema: JSONSchemaArray | JSONSchemaObject,
  options?: {
    provider: "openai" | "gemini" | "anthropic" | "deepseek" | "openrouter" | { custom: string }
    modelId?: string
  }
): Promise<R>

Parameters

prompt (string)

The natural language prompt describing what should be parsed or extracted. Example:

“Please extract the amount, date, category, and location from the following bill.”

images (string[])

An array of input images for the assistant to process. Each item must be a Base64-encoded data URI string, such as:

data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...

  • Multiple images are supported, but avoid providing too many to prevent request timeouts or large payloads.
  • Useful for tasks like invoice OCR, document extraction, or image-based scene analysis.
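
A minimal sketch of building such a data URI, reusing the UIImage helpers shown (commented out) in the image example later on this page; the file path is a placeholder:

// Encode a local image as a JPEG Base64 data URI (path is a placeholder).
const base64Data = UIImage.fromFile("/path/to/invoice.png").toJPEGBase64String(0.6)
const imageDataUri = `data:image/jpeg;base64,${base64Data}`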

schema (JSONSchemaArray | JSONSchemaObject)

Defines the structure of the expected JSON output (see “JSON Schema Definition” below).

options (optional)

  • provider — specifies the AI provider to use:

    Value                 Description
    "openai"              Use OpenAI models (e.g., GPT-4, GPT-4 Turbo)
    "gemini"              Use Google Gemini models
    "anthropic"           Use Anthropic Claude models
    "deepseek"            Use DeepSeek models
    "openrouter"          Use the OpenRouter multi-model platform
    { custom: string }    Specify a custom API provider name, such as your own backend service
  • modelId — specifies the model ID (e.g., "gpt-4-turbo", "gemini-1.5-pro", "claude-3-opus"). If not provided, the default model for the selected provider will be used.


Return Value

Returns a Promise that resolves to the structured data matching the defined schema, typed as R.
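
Because the method is generic, you can pass a type parameter so the resolved value is typed in your script. A minimal sketch, assuming a hypothetical BillData interface that mirrors the bill schema used in the example below (prompt and schema are defined as in that example):

// Hypothetical result type matching the bill schema shown below.
interface BillData {
  totalAmount: number
  category: string
  date?: string
  location?: string
}

const bill = await Assistant.requestStructuredData<BillData>(prompt, schema)
console.log(bill.totalAmount) // typed as number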


JSON Schema Definition

The schema parameter defines the structure of the expected data.

JSONSchemaType

type JSONSchemaType = JSONSchemaPrimitive | JSONSchemaArray | JSONSchemaObject

Primitive Type

type JSONSchemaPrimitive = {
  type: "string" | "number" | "boolean"
  required?: boolean
  description: string
}

Array Type

type JSONSchemaArray = {
  type: "array"
  items: JSONSchemaType
  required?: boolean
  description: string
}

Object Type

type JSONSchemaObject = {
  type: "object"
  properties: Record<string, JSONSchemaType>
  required?: boolean
  description: string
}
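
These types can be nested. For example, here is a sketch of a schema describing a list of line items; the field names are illustrative:

// Illustrative schema: an array of line-item objects.
const lineItemsSchema: JSONSchemaArray = {
  type: "array",
  description: "All line items on the receipt",
  items: {
    type: "object",
    description: "A single line item",
    properties: {
      name: { type: "string", required: true, description: "Item name" },
      price: { type: "number", required: true, description: "Item price" }
    }
  }
}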

Example: Extracting Bill Information (Text Input)

Suppose you have a bill text and want to extract the amount, date, category, and location:

const someBillDetails = `
- Amount: $15.00
- Date: 2024-03-11 14:30
- Location: City Center Parking
- Category: Parking
`

const prompt = `Please parse the following bill information and output structured data: ${someBillDetails}`

const schema: JSONSchemaObject = {
  type: "object",
  description: "Bill information",
  properties: {
    totalAmount: {
      type: "number",
      required: true,
      description: "Total bill amount"
    },
    category: {
      type: "string",
      required: true,
      description: "Bill category"
    },
    date: {
      type: "string",
      required: false,
      description: "Bill date"
    },
    location: {
      type: "string",
      required: false,
      description: "Bill location"
    }
  }
}

const data = await Assistant.requestStructuredData(
  prompt,
  schema,
  {
    provider: "openai",
    modelId: "gpt-4-turbo"
  }
)

console.log(data)

Possible Output

{
  "totalAmount": 15.00,
  "category": "Parking",
  "date": "2024-03-11 14:30",
  "location": "City Center Parking"
}

Example: Parsing Invoice Information from Images

The following example demonstrates how to use requestStructuredData with image input to extract structured invoice data:

const prompt = "Please extract the total amount, date, and merchant name from the following images."

// const base64Data = UIImage.fromFile("/path/to/image.png").toJPEGBase64String(0.6)
// const base64Image = `data:image/jpeg;base64,${base64Data}`

const images = [
  "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...", // First invoice
  "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQ..."       // Second invoice
]

const schema: JSONSchemaObject = {
  type: "object",
  properties: {
    total: { type: "number", description: "Invoice total amount", required: true },
    date: { type: "string", description: "Invoice date" },
    merchant: { type: "string", description: "Merchant name" }
  },
  description: "Invoice information"
}

const result = await Assistant.requestStructuredData(
  prompt,
  images,
  schema,
  { provider: "gemini", modelId: "gemini-1.5-pro" }
)

console.log(result)

Possible Output:

{
  "total": 268.5,
  "date": "2024-12-01",
  "merchant": "Shenzhen Youxuan Supermarket"
}

Usage Notes

  1. Ensure the schema is well-defined. The returned data must match the defined schema; otherwise, parsing may fail.

  2. Use the required field appropriately. Fields that must always be present should have required: true; optional fields may omit it.

  3. Select the provider and modelId carefully. If you need a specific model (e.g., GPT-4, Gemini Pro), specify it explicitly in options.

  4. Supports OpenRouter and Custom Providers

    • "openrouter" allows using multiple models via the OpenRouter platform.
    • { custom: "your-provider" } lets you use your own backend AI service.
  5. Supports Image Input for Multimodal Models

    • Only some models (e.g., GPT-4 Turbo, Gemini 1.5 Pro) support image input.
    • Each image must be a valid Base64 data:image/...;base64, string.
    • Avoid passing too many images at once.
  6. Add proper error handling, for example:

try {
  const result = await Assistant.requestStructuredData(prompt, schema, {
    provider: { custom: "my-ai-backend" },
    modelId: "my-custom-model"
  })
  console.log("Parsed result:", result)
} catch (err) {
  console.error("Parsing failed:", err)
}