Image generation in Meerkat is exposed as the session-scoped generate_image builtin tool. The model asks for an image during a normal turn, Meerkat routes the request to a configured image provider, stores generated bytes in the realm blob store, and returns durable image references back into the session.
This is not a standalone image/generate RPC. Use it by running a session with builtins enabled and telling the agent to call generate_image when an image artifact is required.
Requirements
- A runtime-backed session surface with builtins enabled. The CLI default --tools safe enables builtins; --tools none hides generate_image.
- At least one configured image provider. OpenAI uses RKAT_OPENAI_API_KEY or OPENAI_API_KEY; Gemini uses RKAT_GEMINI_API_KEY, GEMINI_API_KEY, or GOOGLE_API_KEY.
- A blob store. CLI, REST, RPC, MCP, and persistent SDK-backed sessions wire this automatically through the runtime-backed surface.
count must be 1. Requests for more than one image are currently denied with unsupported_count; run one operation per variant if you need several.
generate_image is separate from view_image. view_image reads an existing file into the model as image input. generate_image creates a new assistant-owned image and stores it as a blob.
CLI quickstart
The safest way to force a first turn to use image generation is to allow only the image tool for that turn:
rkat run \
  --model gpt-5.5 \
  --allow-tool generate_image \
  "Use generate_image to create a square PNG of a cozy tabby cat by a sunlit window. Return the blob id and a one-sentence caption."
When the tool succeeds, the tool result includes images[].blob_ref.blob_id. Save the generated image with:
rkat blob get <blob_id> --output cat.png
Inspect the stored payload instead of writing raw bytes:
rkat blob get <blob_id> --json
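If you script this step, the blob ids can be read from the tool result and turned into rkat blob get invocations. The following is a minimal Python sketch, assuming the tool result has already been parsed into a dict with the shape shown under Result Shape. The helper blob_get_commands, the extension mapping, and the output file naming are hypothetical and not part of Meerkat; the sketch only builds argument lists and does not run the CLI.

```python
from __future__ import annotations

import shlex

# Hypothetical mapping from media type to file extension (illustration only).
EXTENSIONS = {"image/png": "png", "image/jpeg": "jpg", "image/webp": "webp"}


def blob_get_commands(result: dict, stem: str = "image") -> list[list[str]]:
    """Build one `rkat blob get` argv list per image in a generate_image result."""
    commands = []
    for index, image in enumerate(result.get("images", [])):
        blob_ref = image["blob_ref"]
        ext = EXTENSIONS.get(blob_ref.get("media_type", ""), "bin")
        output = f"{stem}-{index}.{ext}"
        commands.append(["rkat", "blob", "get", blob_ref["blob_id"], "--output", output])
    return commands


# Tool result trimmed to the fields used here; the blob id is a placeholder.
result = {
    "images": [
        {"blob_ref": {"blob_id": "sha256:example", "media_type": "image/png"}}
    ]
}

for argv in blob_get_commands(result, stem="cat"):
    # Print the command instead of executing it.
    print(shlex.join(argv))
```

Each argv list can be passed to subprocess.run once the placeholder blob id is replaced with a real one.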
Request Shape
The tool accepts one top-level field, request. For normal use, pass the simple request shape:
{
  "request": {
    "intent": "generate",
    "prompt": "a cozy tabby cat by a sunlit window",
    "size": "1024x1024",
    "quality": "auto",
    "format": "png",
    "count": 1
  }
}
For edits, set intent to edit, provide an instruction, and include at least one source_images entry:
{
  "request": {
    "intent": "edit",
    "instruction": "make the background transparent and keep the subject unchanged",
    "source_images": [
      {
        "kind": "assistant_image",
        "image_id": "01985f61-8f7b-7000-8000-000000000001"
      }
    ],
    "format": "png"
  }
}
Reference an existing blob when the source image did not come from a previous generate_image result:
{
  "kind": "blob",
  "blob_ref": {
    "blob_id": "sha256:...",
    "media_type": "image/png"
  }
}
Fields
| Field | Values | Notes |
|---|---|---|
| intent | generate, edit | Defaults to generate when prompt is present. |
| prompt | string | Required for generate. Object form { "content": "..." } is also accepted. |
| instruction | string | Required for edit. Object form { "content": "..." } is also accepted. |
| source_images | array | Required for edit; use assistant_image, blob, transcript_block, or provider_native references. |
| reference_images | array | Optional image references for generation. Provider support varies. |
| size | auto, 1024x1024, 1024x1536, 1536x1024, WIDTHxHEIGHT | Custom sizes are parsed and then accepted or rejected by the selected provider. |
| quality | auto, low, medium, high | Provider support varies. |
| format | auto, png, jpeg, jpg, webp | jpg is accepted as jpeg. |
| count / n | 1 | Only 1 is supported today. |
| target | auto or provider/model target | Use the shorthand provider and model fields unless you need the canonical tagged target shape. |
| provider | openai, gemini, google | Forces a provider default image model. google is a Gemini alias. |
| model | provider-owned model id | Requires provider unless Meerkat can infer the owning provider from the model id. |
| provider_params | object | Provider-specific options. Unknown fields are rejected by provider profiles that define a closed parameter shape. |
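Several of the field rules above can be checked on the client before a request reaches the tool. The Python sketch below is illustrative only: normalize_request is a hypothetical helper, the error messages are invented for the example, and Meerkat's own validation may apply the rules in a different order or report them differently.

```python
from __future__ import annotations


def normalize_request(request: dict) -> dict:
    """Apply the documented field rules to a simple-shape request dict."""
    out = dict(request)

    # `n` is an alias for `count`; only 1 is supported today.
    count = out.pop("count", None)
    alias = out.pop("n", None)
    if count is None:
        count = alias if alias is not None else 1
    if count != 1:
        raise ValueError("unsupported_count: only count=1 is supported")
    out["count"] = 1

    # intent defaults to "generate" when a prompt is present.
    if "intent" not in out and "prompt" in out:
        out["intent"] = "generate"

    intent = out.get("intent")
    if intent == "generate":
        if not out.get("prompt"):
            raise ValueError("generate requires prompt")
    elif intent == "edit":
        if not out.get("instruction"):
            raise ValueError("edit requires instruction")
        if not out.get("source_images"):
            raise ValueError("edit requires at least one source_images entry")
    else:
        raise ValueError(f"unknown intent: {intent!r}")

    # jpg is accepted as jpeg.
    if out.get("format") == "jpg":
        out["format"] = "jpeg"
    return out


print(normalize_request({"prompt": "a cozy tabby cat", "format": "jpg", "n": 1}))
```

Wrap the returned dict in {"request": ...} to match the tool's top-level shape.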
Target Selection
By default, target is auto.
- If the current session model belongs to a provider with an image profile, auto uses that provider’s default image target.
- If the current session provider does not support image generation, set provider explicitly.
- To force a model, pass both provider and model.
OpenAI default:
{
  "request": {
    "intent": "generate",
    "prompt": "a clean product render of a matte black desk lamp",
    "provider": "openai",
    "format": "png",
    "provider_params": {
      "background": "transparent"
    }
  }
}
Gemini default:
{
  "request": {
    "intent": "generate",
    "prompt": "a widescreen storyboard frame of a rover crossing red dunes",
    "provider": "gemini",
    "size": "1536x1024",
    "provider_params": {
      "aspect_ratio": "16:9",
      "image_size": "2K"
    }
  }
}
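The selection rules in this section can be approximated as a small lookup. This Python sketch is a simplified reading of those rules, not Meerkat's resolver: resolve_target and the session_provider argument are hypothetical, the default model ids are the ones this page lists under Provider Options, and inference of the provider from a bare model id is left out.

```python
from __future__ import annotations

# Provider defaults as documented on this page; they may change between releases.
DEFAULT_IMAGE_MODELS = {
    "openai": "gpt-image-2",
    "gemini": "gemini-3.1-flash-image-preview",
}
# google is documented as a Gemini alias.
PROVIDER_ALIASES = {"google": "gemini"}


def resolve_target(request: dict, session_provider: str) -> tuple[str, str]:
    """Return (provider, model), or raise when no image provider applies."""
    # An explicit provider wins; otherwise auto falls back to the session provider.
    provider = request.get("provider") or session_provider
    provider = PROVIDER_ALIASES.get(provider, provider)
    if provider not in DEFAULT_IMAGE_MODELS:
        # Corresponds to the documented unsupported_target denial.
        raise LookupError(f"unsupported_target: {provider!r} has no image profile")
    # An explicit model overrides the provider default.
    return provider, request.get("model") or DEFAULT_IMAGE_MODELS[provider]


# "example-provider" stands in for a session provider without an image profile.
print(resolve_target({"provider": "google"}, session_provider="example-provider"))
print(resolve_target({}, session_provider="openai"))
```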
Provider Options
OpenAI
provider: "openai" uses the OpenAI image default, currently gpt-image-2.
OpenAI provider_params:
| Field | Values | Notes |
|---|---|---|
| background | auto, transparent, opaque | transparent is model-dependent. |
| output_compression | 0 to 100 | Applies where the selected OpenAI backend supports it. |
| moderation | auto, low | OpenAI moderation mode. |
| action | auto, generate, edit | Applies only to the hosted Responses image tool. Images API requests reject action. |
Model behavior:
- gpt-image-2 uses the hosted Responses image tool.
- Other OpenAI-owned gpt-image* or dall-e* models use the Images API path.
- The Images API path currently rejects edit requests, reference images, and action.
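The routing described above can be sketched as a prefix check plus a pre-flight list of features the Images API path rejects. This is an illustrative Python sketch: the function names and the backend labels responses_image_tool and images_api are hypothetical, and the real routing lives inside Meerkat's provider profiles.

```python
from __future__ import annotations


def openai_backend(model: str) -> str:
    """Pick the backend label for an OpenAI image model id."""
    if model == "gpt-image-2":
        return "responses_image_tool"
    if model.startswith(("gpt-image", "dall-e")):
        return "images_api"
    raise LookupError(f"not an OpenAI image model: {model!r}")


def images_api_conflicts(request: dict) -> list[str]:
    """List request features that the Images API path currently rejects."""
    conflicts = []
    if request.get("intent") == "edit":
        conflicts.append("edit intent")
    if request.get("reference_images"):
        conflicts.append("reference_images")
    if "action" in request.get("provider_params", {}):
        conflicts.append("provider_params.action")
    return conflicts


request = {
    "intent": "edit",
    "instruction": "remove the background",
    "provider_params": {"action": "edit"},
}
if openai_backend("dall-e-3") == "images_api":
    print(images_api_conflicts(request))
```

A non-empty list suggests the request would be denied on that path, so switching to the provider default target or dropping the listed features is the likely fix.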
Gemini
provider: "gemini" and provider: "google" use the Gemini image default, currently gemini-3.1-flash-image-preview.
Supported Gemini image models:
- gemini-3.1-flash-image-preview
- gemini-3-pro-image-preview
- gemini-2.5-flash-image
Gemini provider_params:
| Field | Values | Notes |
|---|---|---|
| aspect_ratio | 1:1, 16:9, 9:16, square1x1, landscape16x9, portrait9x16 | Overrides the universal size to aspect-ratio mapping. |
| image_size | 1K, 2K, 4K, one_k, two_k, four_k | Ignored for gemini-2.5-flash-image, which does not use an explicit image-size field. |
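The two spellings listed for each Gemini parameter value can be folded into one canonical form before sending. The Python sketch below assumes that square1x1, landscape16x9, and portrait9x16 are aliases of 1:1, 16:9, and 9:16, and that one_k, two_k, and four_k are aliases of 1K, 2K, and 4K. The table lists both spellings but does not state that pairing, so treat it as an assumption. normalize_gemini_params is a hypothetical helper.

```python
from __future__ import annotations

# Assumed alias pairs (see the note above); not a documented mapping.
ASPECT_RATIO_ALIASES = {
    "square1x1": "1:1",
    "landscape16x9": "16:9",
    "portrait9x16": "9:16",
}
IMAGE_SIZE_ALIASES = {"one_k": "1K", "two_k": "2K", "four_k": "4K"}


def normalize_gemini_params(model: str, params: dict) -> dict:
    """Canonicalize aspect_ratio and image_size; other keys pass through untouched."""
    out = dict(params)
    if "aspect_ratio" in out:
        ratio = out["aspect_ratio"]
        out["aspect_ratio"] = ASPECT_RATIO_ALIASES.get(ratio, ratio)
    if "image_size" in out:
        if model == "gemini-2.5-flash-image":
            # Documented: this model does not use an explicit image-size field.
            del out["image_size"]
        else:
            size = out["image_size"]
            out["image_size"] = IMAGE_SIZE_ALIASES.get(size, size)
    return out


print(normalize_gemini_params(
    "gemini-3-pro-image-preview",
    {"aspect_ratio": "landscape16x9", "image_size": "two_k"},
))
```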
Gemini image generation runs through a scoped image-model turn internally. The user and model still use one generate_image operation; they do not need to call switch_turn.
Result Shape
generate_image returns structured JSON:
{
  "operation_id": "01985f61-8f7b-7000-8000-000000000002",
  "terminal": { "terminal": "generated" },
  "images": [
    {
      "image_id": "01985f61-8f7b-7000-8000-000000000003",
      "blob_ref": {
        "blob_id": "sha256:...",
        "media_type": "image/png"
      },
      "media_type": "image/png",
      "width": 1024,
      "height": 1024
    }
  ],
  "provider_text": { "disposition": "not_emitted" },
  "revised_prompt": { "disposition": "not_requested" },
  "native_metadata": {
    "provider": "open_ai",
    "target_model": "gpt-5.4",
    "response_id": "resp_..."
  },
  "warnings": []
}
Important fields:
| Field | Meaning |
|---|---|
| operation_id | Runtime identity for this image operation. |
| terminal | Final state: generated, empty_result, denied, refused_by_provider, safety_filtered, failed, cancelled, timeout, or scoped_restore_failed. |
| images | Durable assistant image references. Use blob_ref.blob_id with rkat blob get. |
| provider_text | Captured provider-side text when the image backend emits text alongside images. |
| revised_prompt | Provider-revised prompt when the backend returns one. |
| native_metadata | Provider-specific metadata such as response ids and target model. |
| warnings | Non-terminal warnings such as fewer returned images, provider execution failure, or blob commit failure. |
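A caller that consumes this result usually needs to know two things: whether the operation reached generated, and whether any warnings came with it. The Python sketch below shows one way to check both, assuming the nested terminal.terminal layout from the example result. check_result is a hypothetical helper, and the layout of non-success terminals (for example where a denial reason is carried) is not shown on this page, so the sketch reads only the state name.

```python
from __future__ import annotations

# Terminal states as listed in the table above.
TERMINAL_STATES = {
    "generated", "empty_result", "denied", "refused_by_provider",
    "safety_filtered", "failed", "cancelled", "timeout", "scoped_restore_failed",
}


def check_result(result: dict) -> tuple[bool, list]:
    """Return (succeeded, warnings) for a parsed generate_image result."""
    state = result["terminal"]["terminal"]
    if state not in TERMINAL_STATES:
        raise ValueError(f"unexpected terminal state: {state!r}")
    # Warnings are reported separately from the terminal state.
    return state == "generated", list(result.get("warnings", []))


ok, warnings = check_result({"terminal": {"terminal": "generated"}, "warnings": []})
print(ok, warnings)
```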
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| denied with unsupported_target | No configured image provider owns the requested target, or auto resolved to a non-image provider. | Set provider: "openai" or provider: "gemini" and verify credentials with rkat doctor. |
| denied with unsupported_count | count or n was greater than 1. | Run one image operation per variant. |
| denied with projection_unsupported | The selected backend cannot represent the request or provider params. | Remove unsupported params, avoid edits/reference images on the OpenAI Images API path, or use the provider default target. |
| denied with realtime_transport_conflict | The operation would require an internal scoped image-model turn while a realtime binding is active. | Run image generation from a non-realtime session or choose an image target that does not require a scoped override. |
| safety_filtered or refused_by_provider | The provider rejected the prompt or output. | Revise the prompt and retry. |
| failed with provider_execution_failed | Provider call failed after planning. | Check provider credentials, model access, rate limits, and network connectivity. |
| failed with blob_commit_failed | Image bytes were returned but could not be stored. | Check the realm’s blob-store backend and disk permissions. |
See also
- Tools for how builtins are enabled and scoped.
- Built-in tools reference for the concise parameter reference.
- Providers for provider credentials and model catalog behavior.
- docs/architecture/assistant-image-generation-substrate.md for the internal design record.