Image generation in Meerkat is exposed as the session-scoped generate_image builtin tool. The model asks for an image during a normal turn, Meerkat routes the request to a configured image provider, stores generated bytes in the realm blob store, and returns durable image references back into the session.
This is not a standalone image/generate RPC. Use it by running a session with builtins enabled and telling the agent to call generate_image when an image artifact is required.
Requirements
- A runtime-backed session surface with builtins enabled. The CLI default --tools safe enables builtins; --tools none hides generate_image.
- At least one configured image provider. OpenAI uses RKAT_OPENAI_API_KEY or OPENAI_API_KEY; Azure OpenAI uses AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, and AZURE_OPENAI_IMAGE_GENERATION_DEPLOYMENT; Gemini uses RKAT_GEMINI_API_KEY, GEMINI_API_KEY, or GOOGLE_API_KEY.
- A blob store. CLI, REST, RPC, MCP, and persistent SDK-backed sessions wire this automatically through the runtime-backed surface.
count must be 1. Multiple image output is rejected today with unsupported_count; run multiple operations if you need several variants.
generate_image is separate from view_image. view_image reads an existing file into the model as image input. generate_image creates a new assistant-owned image and stores it as a blob. When the model needs to save a generated blob to disk during the same run, it can call blob_save_file.
CLI quickstart
The safest way to force a first turn to use image generation is to allow only the image tool for that turn:
rkat run \
  --model gpt-5.5 \
  --allow-tool generate_image \
  "Use generate_image to create a square PNG of a cozy tabby cat by a sunlit window. Return the blob id and a one-sentence caption."
When the tool succeeds, the tool result includes images[].blob_ref.blob_id. Save the generated image with:
rkat blob get <blob_id> --output cat.png
You can also ask the agent to save the image itself:
rkat run "Create a PNG infographic about today's top news and save it to top-news-infographic.png" --yolo -m gpt-5.5
Inspect the stored payload instead of writing raw bytes:
rkat blob get <blob_id> --json
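When scripting around the CLI, the two blob commands above can be assembled programmatically. The following is a minimal sketch that only builds the argument list; the helper name blob_get_command is hypothetical, and it uses only the rkat blob get flags shown above (--output and --json).

```python
import shlex


def blob_get_command(blob_id, output=None):
    """Build the argv for `rkat blob get` (hypothetical helper, not part of Meerkat).

    With `output`, the command writes raw bytes to that path;
    without it, the command asks for the stored payload as JSON.
    """
    command = ["rkat", "blob", "get", blob_id]
    if output is not None:
        command += ["--output", output]
    else:
        command.append("--json")
    return command


# Placeholder id: substitute the value from images[].blob_ref.blob_id.
argv = blob_get_command("<blob_id>", output="cat.png")
print(shlex.join(argv))
```

To actually run the command, pass the list to subprocess.run(argv, check=True) on a machine where rkat is installed.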
Request Shape
The tool accepts one top-level field, request. For normal use, pass the simple request shape:
{
  "request": {
    "intent": "generate",
    "prompt": "a cozy tabby cat by a sunlit window",
    "size": "1024x1024",
    "quality": "auto",
    "format": "png",
    "count": 1
  }
}
This is the Meerkat generate_image tool input, not a raw OpenAI Responses API request. Meerkat lowers universal fields such as size, quality, and format into the selected provider’s native request. For OpenAI, format becomes the provider-side output_format option.
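If you assemble tool inputs in code, for example in a test harness, the simple request shape can be produced by a small builder. This is a sketch only: build_generate_request is a hypothetical helper name, and the defaults simply mirror the example values above.

```python
import json


def build_generate_request(prompt, size="1024x1024", quality="auto", image_format="png"):
    """Return a generate_image tool input in the simple request shape (hypothetical helper)."""
    return {
        "request": {
            "intent": "generate",
            "prompt": prompt,
            "size": size,
            "quality": quality,
            "format": image_format,
            "count": 1,  # only 1 is supported today
        }
    }


payload = build_generate_request("a cozy tabby cat by a sunlit window")
print(json.dumps(payload, indent=2))
```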
For edits, set intent to edit, provide an instruction, and include at least one source_images entry:
{
  "request": {
    "intent": "edit",
    "instruction": "replace the background with a plain white studio backdrop and keep the subject unchanged",
    "source_images": [
      {
        "kind": "assistant_image",
        "image_id": "01985f61-8f7b-7000-8000-000000000001"
      }
    ],
    "format": "png"
  }
}
Reference an existing blob when the source image did not come from a previous generate_image result:
{
  "kind": "blob",
  "blob_ref": {
    "blob_id": "sha256:...",
    "media_type": "image/png"
  }
}
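The two source-image reference kinds shown above can be combined into an edit request as follows. This is an illustrative sketch; the helper names (assistant_image_ref, blob_image_ref, build_edit_request) are hypothetical, and the blob id is left as the elided placeholder from the example.

```python
def assistant_image_ref(image_id):
    """Reference an image produced by an earlier generate_image result."""
    return {"kind": "assistant_image", "image_id": image_id}


def blob_image_ref(blob_id, media_type):
    """Reference an existing blob that did not come from generate_image."""
    return {"kind": "blob", "blob_ref": {"blob_id": blob_id, "media_type": media_type}}


def build_edit_request(instruction, source_images, image_format="png"):
    """Return an edit request; edits need an instruction and at least one source image."""
    if not source_images:
        raise ValueError("edit requests need at least one source_images entry")
    return {
        "request": {
            "intent": "edit",
            "instruction": instruction,
            "source_images": list(source_images),
            "format": image_format,
        }
    }


edit = build_edit_request(
    "replace the background with a plain white studio backdrop and keep the subject unchanged",
    [blob_image_ref("sha256:...", "image/png")],  # placeholder blob id
)
```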
Fields
| Field | Values | Notes |
|---|---|---|
| intent | generate, edit | Defaults to generate when prompt is present. |
| prompt | string | Required for generate. Object form { "content": "..." } is also accepted. |
| instruction | string | Required for edit. Object form { "content": "..." } is also accepted. |
| source_images | array | Required for edit; use assistant_image, blob, transcript_block, or provider_native references. |
| reference_images | array | Optional image references for generation. Provider support varies. |
| size | auto, 1024x1024, 1024x1536, 1536x1024, WIDTHxHEIGHT | Custom sizes are parsed and then accepted or rejected by the selected provider. |
| quality | auto, low, medium, high | Provider support varies. |
| format | auto, png, jpeg, jpg, webp | jpg is accepted as jpeg. |
| count / n | 1 | Only 1 is supported today. |
| target | auto or provider/model target | Use the shorthand provider and model fields unless you need the canonical tagged target shape. |
| provider | openai, gemini, google | Forces a provider default image model. google is a Gemini alias. |
| model | provider-owned model id | Requires provider unless Meerkat can infer the owning provider from the model id. |
| provider_params | object | Provider-specific options. Unknown fields are rejected by provider profiles that define a closed parameter shape. |
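Several of the field rules in the table can be checked on the client before a request is sent. The sketch below is a pre-check under stated assumptions, not Meerkat's own validator: normalize_request is a hypothetical name, string-form prompt and instruction are assumed, and raising when intent cannot be inferred is this sketch's choice rather than documented behavior.

```python
def normalize_request(request):
    """Apply the documented field rules to the inner request object (client-side sketch)."""
    req = dict(request)

    # count and n are aliases; only 1 is supported today.
    count = req.get("count", req.get("n", 1))
    req.pop("n", None)
    if count != 1:
        raise ValueError("unsupported_count: run one image operation per variant")
    req["count"] = 1

    # intent defaults to generate when a prompt is present.
    intent = req.get("intent") or ("generate" if "prompt" in req else None)
    if intent == "generate":
        if not req.get("prompt"):
            raise ValueError("prompt is required for generate")
    elif intent == "edit":
        if not req.get("instruction"):
            raise ValueError("instruction is required for edit")
        if not req.get("source_images"):
            raise ValueError("source_images is required for edit")
    else:
        # Assumption of this sketch: refuse when intent is missing or unknown.
        raise ValueError("intent must be generate or edit")
    req["intent"] = intent

    # jpg is accepted as jpeg; google is a Gemini alias.
    if req.get("format") == "jpg":
        req["format"] = "jpeg"
    if req.get("provider") == "google":
        req["provider"] = "gemini"
    return req
```

For example, normalize_request({"prompt": "a red kite", "format": "jpg", "n": 1}) returns a request with intent generate, format jpeg, and count 1.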
Target Selection
By default, target is auto.
- If the current session model belongs to a provider with an image profile, auto uses that provider’s default image target.
- If the current session provider does not support image generation, set provider explicitly.
- To force a model, pass both provider and model.
OpenAI default:
{
  "request": {
    "intent": "generate",
    "prompt": "a clean product render of a matte black desk lamp",
    "provider": "openai",
    "size": "1024x1024",
    "quality": "auto",
    "format": "webp"
  }
}
Gemini default:
{
  "request": {
    "intent": "generate",
    "prompt": "a widescreen storyboard frame of a rover crossing red dunes",
    "provider": "gemini",
    "size": "1536x1024",
    "provider_params": {
      "aspect_ratio": "16:9",
      "image_size": "2K"
    }
  }
}
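The selection rules in this section can be pictured as a small resolution function. This is a simplified model for illustration, not Meerkat's resolver: resolve_target and the lookup tables are hypothetical, the default model ids are the ones this page lists as current, and the sketch does not attempt to infer a provider from a bare model id.

```python
# Default image models as documented on this page; they may change between releases.
IMAGE_DEFAULTS = {
    "openai": "gpt-image-2",
    "gemini": "gemini-3.1-flash-image-preview",
}
PROVIDER_ALIASES = {"google": "gemini"}


def resolve_target(session_provider, provider=None, model=None):
    """Return (provider, model) for an image request, modelling target: auto."""
    if provider is None:
        if model is not None:
            # Simplification: the real runtime may infer the owning provider.
            raise ValueError("pass provider together with model in this sketch")
        provider = session_provider  # auto: follow the session's provider
    provider = PROVIDER_ALIASES.get(provider, provider)
    if provider not in IMAGE_DEFAULTS:
        raise LookupError("unsupported_target: set provider explicitly")
    return provider, model or IMAGE_DEFAULTS[provider]
```

Calling resolve_target("openai") yields the OpenAI default, while a session on a provider without an image profile raises until provider is set explicitly.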
Provider Options
OpenAI
provider: "openai" uses the OpenAI image default, currently gpt-image-2.
Meerkat fields and OpenAI lowering:
| Field | Values | Notes |
|---|---|---|
| size | auto, 1024x1024, 1024x1536, 1536x1024, WIDTHxHEIGHT | Universal Meerkat field. Lowered to OpenAI size. |
| quality | auto, low, medium, high | Universal Meerkat field. Lowered to OpenAI quality. |
| format | auto, png, jpeg, jpg, webp | Universal Meerkat field. Lowered to OpenAI output_format; jpg normalizes to jpeg. |
| provider_params.background | auto, opaque | Advanced OpenAI override. gpt-image-2 does not support transparent backgrounds. Use an opaque background and post-process if you need transparency. |
| provider_params.output_compression | 0 to 100 | Advanced OpenAI override. Applies only when the universal format is jpeg or webp. |
| provider_params.moderation | auto, low | Advanced GPT Image moderation strictness override. Omit it for OpenAI’s default filtering. |
| provider_params.action | auto, generate, edit | Advanced hosted Responses image-tool override. Normal Meerkat calls should use top-level intent and omit action; Images API routes reject it. |
| provider_params.reasoning_effort | none, low, medium, high, xhigh | Hosted gpt-image-2 route only. Lowered to OpenAI reasoning.effort. |
| provider_params.web_search | true, false, null, or object | Hosted gpt-image-2 route only. true/{} enables OpenAI web_search; object values may include OpenAI web-search options such as search_context_size, filters, user_location, external_web_access, and return_token_budget. |
Model behavior:
- gpt-image-2 uses the hosted Responses image tool internally, but callers still pass Meerkat’s generate_image request shape.
- On the azure_openai backend, hosted image generation requires the backend option image_generation_deployment (for example gpt-image-2). The session model remains the Azure text deployment name, and Meerkat selects the image deployment with Azure’s Responses image-generation header.
- gpt-image-2 accepts flexible WIDTHxHEIGHT sizes when they satisfy OpenAI’s constraints: both edges are multiples of 16 px, the longer edge is at most 3840 px, the aspect ratio is at most 3:1, and total pixels are between 655,360 and 8,294,400. Common values include 1024x1024, 1536x1024, 1024x1536, 2048x2048, 2048x1152, 3840x2160, and 2160x3840.
- Do not copy raw Responses API JSON into provider_params. Use Meerkat’s universal size, quality, format, and intent fields, and reserve provider_params for the advanced OpenAI overrides above.
- For image-only requests that depend on fresh or current information, prefer passing provider_params.web_search to generate_image over doing a separate search turn first.
- Do not pass background: "transparent" for gpt-image-2; OpenAI rejects it. Do not pass input_fidelity; the OpenAI adapter in Meerkat 0.6.6 uses a closed provider-params shape and rejects unknown fields.
- Other OpenAI-owned gpt-image* or dall-e* models use the Images API path.
- The Images API path currently rejects edit requests, reference images, action, reasoning_effort, and web_search.
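The gpt-image-2 size constraints listed above are mechanical enough to check before sending a request. The following is a minimal client-side sketch; check_gpt_image_2_size is a hypothetical helper, and the provider remains the final authority on which sizes it accepts.

```python
def check_gpt_image_2_size(size):
    """Return the list of violated size constraints; an empty list means none were found."""
    if size == "auto":
        return []
    width_text, sep, height_text = size.lower().partition("x")
    if not (sep and width_text.isdecimal() and height_text.isdecimal()):
        return ["size must be auto or WIDTHxHEIGHT"]

    width, height = int(width_text), int(height_text)
    long_edge, short_edge = max(width, height), min(width, height)
    problems = []
    if width % 16 or height % 16:
        problems.append("both edges must be multiples of 16 px")
    if long_edge > 3840:
        problems.append("max edge is 3840 px")
    if long_edge > 3 * short_edge:
        problems.append("aspect ratio must be at most 3:1")
    if not 655_360 <= width * height <= 8_294_400:
        problems.append("total pixels must be between 655,360 and 8,294,400")
    return problems


for candidate in ["1024x1024", "3840x2160", "4096x1024", "512x512"]:
    print(candidate, check_gpt_image_2_size(candidate) or "ok")
```

All of the common values listed above pass these checks; 4096x1024 fails on both the edge limit and the aspect ratio, and 512x512 falls below the minimum pixel count.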
Gemini
provider: "gemini" and provider: "google" use the Gemini image default, currently gemini-3.1-flash-image-preview.
Supported Gemini image models:
- gemini-3.1-flash-image-preview
- gemini-3-pro-image-preview
- gemini-2.5-flash-image
Gemini provider_params:
| Field | Values | Notes |
|---|---|---|
| aspect_ratio | 1:1, 16:9, 9:16, square1x1, landscape16x9, portrait9x16 | Overrides the universal size to aspect-ratio mapping. |
| image_size | 1K, 2K, 4K, one_k, two_k, four_k | Ignored for gemini-2.5-flash-image, which does not use an explicit image-size field. |
Gemini image generation runs through a scoped image-model turn internally. The user and model still use one generate_image operation; they do not need to call switch_turn.
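As an illustration of the Gemini provider_params table, the sketch below canonicalizes the named values before a request is built. It rests on an assumption this page does not state outright: that square1x1, landscape16x9, and portrait9x16 are aliases for 1:1, 16:9, and 9:16, and that one_k, two_k, and four_k are aliases for 1K, 2K, and 4K. The helper normalize_gemini_params is hypothetical.

```python
# Assumed alias pairs, inferred from the value names in the table above.
ASPECT_ALIASES = {"square1x1": "1:1", "landscape16x9": "16:9", "portrait9x16": "9:16"}
SIZE_ALIASES = {"one_k": "1K", "two_k": "2K", "four_k": "4K"}


def normalize_gemini_params(model, params):
    """Return a copy of provider_params with assumed aliases resolved (client-side sketch)."""
    out = dict(params)
    if "aspect_ratio" in out:
        out["aspect_ratio"] = ASPECT_ALIASES.get(out["aspect_ratio"], out["aspect_ratio"])
    if "image_size" in out:
        if model == "gemini-2.5-flash-image":
            # Documented as ignored for this model, so drop it to avoid confusion.
            del out["image_size"]
        else:
            out["image_size"] = SIZE_ALIASES.get(out["image_size"], out["image_size"])
    return out
```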
Result Shape
generate_image returns structured JSON:
{
  "operation_id": "01985f61-8f7b-7000-8000-000000000002",
  "terminal": { "terminal": "generated" },
  "images": [
    {
      "image_id": "01985f61-8f7b-7000-8000-000000000003",
      "blob_ref": {
        "blob_id": "sha256:...",
        "media_type": "image/png"
      },
      "media_type": "image/png",
      "width": 1024,
      "height": 1024
    }
  ],
  "provider_text": { "disposition": "not_emitted" },
  "revised_prompt": { "disposition": "not_requested" },
  "native_metadata": {
    "provider": "open_ai",
    "target_model": "gpt-5.4",
    "response_id": "resp_..."
  },
  "warnings": []
}
Important fields:
| Field | Meaning |
|---|---|
| operation_id | Runtime identity for this image operation. |
| terminal | Final state: generated, empty_result, denied, refused_by_provider, safety_filtered, failed, cancelled, timeout, or scoped_restore_failed. |
| images | Durable assistant image references. Use blob_ref.blob_id with blob_save_file inside a run or rkat blob get outside the run. |
| provider_text | Captured provider-side text when the image backend emits text alongside images. |
| revised_prompt | Provider-revised prompt when the backend returns one. |
| native_metadata | Provider-specific metadata such as response ids and target model. |
| warnings | Non-terminal warnings such as fewer returned images, provider execution failure, or blob commit failure. |
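A caller that receives this result as JSON text will usually check the terminal state first and then collect the blob ids. The sketch below assumes the result has exactly the shape documented above; extract_blob_ids is a hypothetical helper, and the sample is a trimmed copy of the documented result with its placeholder blob id kept as-is.

```python
import json


def extract_blob_ids(result):
    """Return the blob ids from a generate_image result, or raise if it did not generate."""
    terminal = result.get("terminal", {}).get("terminal")
    if terminal != "generated":
        raise RuntimeError(f"image operation ended as {terminal!r}")
    return [image["blob_ref"]["blob_id"] for image in result.get("images", [])]


sample = json.loads("""
{
  "operation_id": "01985f61-8f7b-7000-8000-000000000002",
  "terminal": { "terminal": "generated" },
  "images": [
    {
      "image_id": "01985f61-8f7b-7000-8000-000000000003",
      "blob_ref": { "blob_id": "sha256:...", "media_type": "image/png" },
      "media_type": "image/png",
      "width": 1024,
      "height": 1024
    }
  ],
  "warnings": []
}
""")

blob_ids = extract_blob_ids(sample)
print(blob_ids)
```

Each returned id can then be passed to rkat blob get outside the run, or to blob_save_file inside it.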
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| denied with unsupported_target | No configured image provider owns the requested target, or auto resolved to a non-image provider. | Set provider: "openai" or provider: "gemini" and verify credentials with rkat doctor. |
| denied with unsupported_count | count or n was greater than 1. | Run one image operation per variant. |
| denied with projection_unsupported | The selected backend cannot represent the request or provider params. | Remove unsupported params, avoid edits/reference images on the OpenAI Images API path, or use the provider default target. |
| denied with realtime_transport_conflict | The operation would require an internal scoped image-model turn while a realtime binding is active. | Run image generation from a non-realtime session or choose an image target that does not require a scoped override. |
| safety_filtered or refused_by_provider | The provider rejected the prompt or output. | Revise the prompt and retry. |
| failed with provider_execution_failed | Provider call failed after planning. | Check provider credentials, model access, rate limits, and network connectivity. |
| failed with blob_commit_failed | Image bytes were returned but could not be stored. | Check the realm’s blob-store backend and disk permissions. |