Image generation

Image generation in Meerkat is exposed as the session-scoped generate_image builtin tool. The model asks for an image during a normal turn, Meerkat routes the request to a configured image provider, stores the generated bytes in the realm blob store, and returns durable image references to the session. There is no standalone image/generate RPC. To use it, run a session with builtins enabled and tell the agent to call generate_image when an image artifact is required.

Requirements

A runtime-backed session surface with builtins enabled. The CLI default --tools safe enables builtins; --tools none hides generate_image.
At least one configured image provider. OpenAI uses RKAT_OPENAI_API_KEY or OPENAI_API_KEY; Gemini uses RKAT_GEMINI_API_KEY, GEMINI_API_KEY, or GOOGLE_API_KEY.
A blob store. CLI, REST, RPC, MCP, and persistent SDK-backed sessions wire this automatically through the runtime-backed surface.
A count of 1. Multi-image output is currently rejected with unsupported_count; run one operation per variant if you need several.
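
The credential requirement above can be checked before starting a session. The following is a minimal sketch and not part of Meerkat: the helper name `configured_image_providers` is hypothetical, and only the environment variable names come from this page. It inspects the environment only, so credentials supplied through any other configuration source are not detected, and it does not verify that a key is valid.

```python
import os
from typing import Mapping, Optional

# Environment variables listed in the requirements above, per provider.
PROVIDER_ENV_VARS = {
    "openai": ("RKAT_OPENAI_API_KEY", "OPENAI_API_KEY"),
    "gemini": ("RKAT_GEMINI_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY"),
}


def configured_image_providers(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return providers with at least one non-empty credential variable set.

    Hypothetical helper for illustration; it does not contact any provider.
    """
    env = os.environ if env is None else env
    return [
        provider
        for provider, names in PROVIDER_ENV_VARS.items()
        if any(env.get(name, "").strip() for name in names)
    ]


if __name__ == "__main__":
    found = configured_image_providers()
    print("configured image providers:", ", ".join(found) or "none")
```

An empty result suggests generate_image would have no provider to route to from this environment.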

generate_image is separate from view_image. view_image reads an existing file into the model as image input. generate_image creates a new assistant-owned image and stores it as a blob.

CLI quickstart

The safest way to force a first turn to use image generation is to allow only the image tool for that turn:

rkat run \
  --model gpt-5.5 \
  --allow-tool generate_image \
  "Use generate_image to create a square PNG of a cozy tabby cat by a sunlit window. Return the blob id and a one-sentence caption."

When the tool succeeds, the tool result includes images[].blob_ref.blob_id. Save the generated image with:

rkat blob get <blob_id> --output cat.png

Inspect the stored payload instead of writing raw bytes:

rkat blob get <blob_id> --json

Request Shape

The tool accepts one top-level field, request. For normal use, pass the simple request shape:

{
  "request": {
    "intent": "generate",
    "prompt": "a cozy tabby cat by a sunlit window",
    "size": "1024x1024",
    "quality": "auto",
    "format": "png",
    "count": 1
  }
}

For edits, set intent to edit, provide an instruction, and include at least one source_images entry:

{
  "request": {
    "intent": "edit",
    "instruction": "make the background transparent and keep the subject unchanged",
    "source_images": [
      {
        "kind": "assistant_image",
        "image_id": "01985f61-8f7b-7000-8000-000000000001"
      }
    ],
    "format": "png"
  }
}

When the source image did not come from a previous generate_image result, reference an existing blob with a source_images entry of kind blob:

{
  "kind": "blob",
  "blob_ref": {
    "blob_id": "sha256:...",
    "media_type": "image/png"
  }
}
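
In normal use the agent builds this payload itself, but constructing it in code can help when writing tests or embedding an exact request in a prompt. The sketch below assembles the generate and edit shapes shown above. All function names are hypothetical; only the JSON field names and values come from this page.

```python
import json
from typing import Any


def generate_request(prompt: str, *, size: str = "auto", quality: str = "auto",
                     image_format: str = "png") -> dict[str, Any]:
    """Build the simple generate payload."""
    return {
        "request": {
            "intent": "generate",
            "prompt": prompt,
            "size": size,
            "quality": quality,
            "format": image_format,
            "count": 1,  # only 1 is supported today
        }
    }


def assistant_image_source(image_id: str) -> dict[str, str]:
    """Reference an image produced by an earlier generate_image result."""
    return {"kind": "assistant_image", "image_id": image_id}


def blob_source(blob_id: str, media_type: str) -> dict[str, Any]:
    """Reference an existing blob that did not come from generate_image."""
    return {"kind": "blob", "blob_ref": {"blob_id": blob_id, "media_type": media_type}}


def edit_request(instruction: str, source_images: list[dict[str, Any]],
                 *, image_format: str = "png") -> dict[str, Any]:
    """Build an edit payload; edits need at least one source image."""
    if not source_images:
        raise ValueError("edit requests require at least one source_images entry")
    return {
        "request": {
            "intent": "edit",
            "instruction": instruction,
            "source_images": list(source_images),
            "format": image_format,
        }
    }


if __name__ == "__main__":
    payload = edit_request(
        "make the background transparent and keep the subject unchanged",
        [assistant_image_source("01985f61-8f7b-7000-8000-000000000001")],
    )
    print(json.dumps(payload, indent=2))
```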

Fields

Field	Values	Notes
`intent`	`generate`, `edit`	Defaults to `generate` when `prompt` is present.
`prompt`	string	Required for `generate`. Object form `{ "content": "..." }` is also accepted.
`instruction`	string	Required for `edit`. Object form `{ "content": "..." }` is also accepted.
`source_images`	array	Required for `edit`; use `assistant_image`, `blob`, `transcript_block`, or `provider_native` references.
`reference_images`	array	Optional image references for generation. Provider support varies.
`size`	`auto`, `1024x1024`, `1024x1536`, `1536x1024`, `WIDTHxHEIGHT`	Custom sizes are parsed and then accepted or rejected by the selected provider.
`quality`	`auto`, `low`, `medium`, `high`	Provider support varies.
`format`	`auto`, `png`, `jpeg`, `jpg`, `webp`	`jpg` is accepted as `jpeg`.
`count` / `n`	`1`	Only `1` is supported today.
`target`	`auto` or provider/model target	Use the shorthand `provider` and `model` fields unless you need the canonical tagged target shape.
`provider`	`openai`, `gemini`, `google`	Forces a provider default image model. `google` is a Gemini alias.
`model`	provider-owned model id	Requires `provider` unless Meerkat can infer the owning provider from the model id.
`provider_params`	object	Provider-specific options. Unknown fields are rejected by provider profiles that define a closed parameter shape.
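
The rules in the table can be approximated with a small client-side pre-check. This is a sketch under stated assumptions, not Meerkat's validator: `check_request` and `canonical_format` are hypothetical names, the error strings are illustrative, and provider-specific acceptance (custom sizes, quality support, `provider_params`) is left to the selected provider as described above.

```python
import re
from typing import Any

QUALITIES = ("auto", "low", "medium", "high")
FORMATS = ("auto", "png", "jpeg", "jpg", "webp")
SIZE_PATTERN = re.compile(r"\d+x\d+")  # WIDTHxHEIGHT, e.g. 1024x1536


def canonical_format(fmt: str) -> str:
    """`jpg` is accepted as `jpeg`; other values pass through unchanged."""
    return "jpeg" if fmt == "jpg" else fmt


def check_request(request: dict[str, Any]) -> list[str]:
    """Return problems found in the inner `request` object (empty if none)."""
    problems: list[str] = []

    # intent defaults to generate when prompt is present.
    intent = request.get("intent")
    if intent is None and "prompt" in request:
        intent = "generate"

    if intent == "generate":
        if not request.get("prompt"):
            problems.append("generate requires prompt")
    elif intent == "edit":
        if not request.get("instruction"):
            problems.append("edit requires instruction")
        if not request.get("source_images"):
            problems.append("edit requires at least one source_images entry")
    else:
        problems.append(f"unknown or missing intent: {intent!r}")

    # count and n are aliases; only 1 is supported today.
    if request.get("count", request.get("n", 1)) != 1:
        problems.append("unsupported_count: only 1 is supported today")

    size = request.get("size")
    if size is not None and size != "auto" and not SIZE_PATTERN.fullmatch(str(size)):
        problems.append(f"size must be auto or WIDTHxHEIGHT, got {size!r}")

    quality = request.get("quality")
    if quality is not None and quality not in QUALITIES:
        problems.append(f"unknown quality: {quality!r}")

    fmt = request.get("format")
    if fmt is not None and fmt not in FORMATS:
        problems.append(f"unknown format: {fmt!r}")

    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass.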

Target Selection

By default, target is auto.

If the current session model belongs to a provider with an image profile, auto uses that provider’s default image target.
If the current session provider does not support image generation, set provider explicitly.
To force a model, pass model together with provider. provider can be omitted only when Meerkat can infer the owning provider from the model id.

OpenAI default:

{
  "request": {
    "intent": "generate",
    "prompt": "a clean product render of a matte black desk lamp",
    "provider": "openai",
    "format": "png",
    "provider_params": {
      "background": "transparent"
    }
  }
}

Gemini default:

{
  "request": {
    "intent": "generate",
    "prompt": "a widescreen storyboard frame of a rover crossing red dunes",
    "provider": "gemini",
    "size": "1536x1024",
    "provider_params": {
      "aspect_ratio": "16:9",
      "image_size": "2K"
    }
  }
}
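
The selection rules in this section can be summarized as a small function. This is an illustrative sketch, not Meerkat's resolver: `resolve_target` is a hypothetical name, the default model ids are the ones this page lists as current under Provider Options and may change, and inference of the provider from a bare model id is deliberately left out.

```python
from typing import Optional

# Provider defaults as listed under Provider Options (subject to change).
DEFAULT_IMAGE_MODELS = {
    "openai": "gpt-image-2",
    "gemini": "gemini-3.1-flash-image-preview",
}
PROVIDER_ALIASES = {"google": "gemini"}  # google is a Gemini alias


def resolve_target(session_provider: str, provider: Optional[str] = None,
                   model: Optional[str] = None) -> Optional[tuple[str, str]]:
    """Return (provider, image_model), or None where Meerkat would deny
    the request with unsupported_target.

    Assumption: with target left as auto, the session's own provider is
    tried first; an explicit provider always wins.
    """
    chosen = provider if provider is not None else session_provider
    chosen = PROVIDER_ALIASES.get(chosen, chosen)
    if chosen not in DEFAULT_IMAGE_MODELS:
        return None  # no image profile: set provider explicitly
    return chosen, model or DEFAULT_IMAGE_MODELS[chosen]
```

For example, a session on a provider without an image profile resolves to None until provider is set, which mirrors the second rule above.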

Provider Options

OpenAI

provider: "openai" uses the OpenAI image default, currently gpt-image-2. OpenAI provider_params:

Field	Values	Notes
`background`	`auto`, `transparent`, `opaque`	`transparent` is model-dependent.
`output_compression`	`0` to `100`	Applies where the selected OpenAI backend supports it.
`moderation`	`auto`, `low`	OpenAI moderation mode.
`action`	`auto`, `generate`, `edit`	Applies only to the hosted Responses image tool. Images API requests reject `action`.

Model behavior:

gpt-image-2 uses the hosted Responses image tool.
Other OpenAI-owned gpt-image* or dall-e* models use the Images API path.
The Images API path currently rejects edit requests, reference images, and action.
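
The model behavior above amounts to a routing decision followed by a restriction check. The sketch below is a hedged illustration: the function names and the path labels `responses_image_tool` and `images_api` are hypothetical, and the prefix match is one reading of the gpt-image* and dall-e* patterns.

```python
from typing import Any, Optional


def openai_image_path(model: str) -> Optional[str]:
    """Return the backend path for an OpenAI image model, or None if unknown."""
    if model == "gpt-image-2":
        return "responses_image_tool"
    if model.startswith(("gpt-image", "dall-e")):
        return "images_api"
    return None


def images_api_rejections(request: dict[str, Any]) -> list[str]:
    """List the reasons the Images API path would reject this request."""
    reasons: list[str] = []
    if request.get("intent") == "edit":
        reasons.append("edit requests are rejected")
    if request.get("reference_images"):
        reasons.append("reference images are rejected")
    if "action" in request.get("provider_params", {}):
        reasons.append("action is rejected")
    return reasons
```

A non-empty rejection list on the Images API path corresponds to the projection_unsupported denial described under Troubleshooting; switching to the provider default target avoids it.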

Gemini

provider: "gemini" and provider: "google" use the Gemini image default, currently gemini-3.1-flash-image-preview. Supported Gemini image models:

gemini-3.1-flash-image-preview
gemini-3-pro-image-preview
gemini-2.5-flash-image

Gemini provider_params:

Field	Values	Notes
`aspect_ratio`	`1:1`, `16:9`, `9:16`, `square1x1`, `landscape16x9`, `portrait9x16`	Overrides the universal `size` to aspect-ratio mapping.
`image_size`	`1K`, `2K`, `4K`, `one_k`, `two_k`, `four_k`	Ignored for `gemini-2.5-flash-image`, which does not use an explicit image-size field.

Gemini image generation runs through a scoped image-model turn internally. The user and model still use one generate_image operation; they do not need to call switch_turn.
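
The alias values in the Gemini provider_params table can be folded into one spelling before comparison or logging. This sketch assumes that each alias pairs with the short form its name suggests (for example `landscape16x9` with `16:9`) and treats the short form as canonical; Meerkat's internal canonical form is not documented here. `normalize_gemini_params` is a hypothetical helper that performs no validation and drops any other keys.

```python
ASPECT_RATIO_ALIASES = {
    "square1x1": "1:1",
    "landscape16x9": "16:9",
    "portrait9x16": "9:16",
}
IMAGE_SIZE_ALIASES = {"one_k": "1K", "two_k": "2K", "four_k": "4K"}

# This model does not use an explicit image-size field, so image_size is ignored.
MODELS_WITHOUT_IMAGE_SIZE = {"gemini-2.5-flash-image"}


def normalize_gemini_params(model: str, params: dict[str, str]) -> dict[str, str]:
    """Return aspect_ratio and image_size in short form for the given model."""
    out: dict[str, str] = {}
    if "aspect_ratio" in params:
        ratio = params["aspect_ratio"]
        out["aspect_ratio"] = ASPECT_RATIO_ALIASES.get(ratio, ratio)
    if "image_size" in params and model not in MODELS_WITHOUT_IMAGE_SIZE:
        size = params["image_size"]
        out["image_size"] = IMAGE_SIZE_ALIASES.get(size, size)
    return out
```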

Result Shape

generate_image returns structured JSON:

{
  "operation_id": "01985f61-8f7b-7000-8000-000000000002",
  "terminal": { "terminal": "generated" },
  "images": [
    {
      "image_id": "01985f61-8f7b-7000-8000-000000000003",
      "blob_ref": {
        "blob_id": "sha256:...",
        "media_type": "image/png"
      },
      "media_type": "image/png",
      "width": 1024,
      "height": 1024
    }
  ],
  "provider_text": { "disposition": "not_emitted" },
  "revised_prompt": { "disposition": "not_requested" },
  "native_metadata": {
    "provider": "open_ai",
    "target_model": "gpt-5.4",
    "response_id": "resp_..."
  },
  "warnings": []
}

Important fields:

Field	Meaning
`operation_id`	Runtime identity for this image operation.
`terminal`	Final state: `generated`, `empty_result`, `denied`, `refused_by_provider`, `safety_filtered`, `failed`, `cancelled`, `timeout`, or `scoped_restore_failed`.
`images`	Durable assistant image references. Use `blob_ref.blob_id` with `rkat blob get`.
`provider_text`	Captured provider-side text when the image backend emits text alongside images.
`revised_prompt`	Provider-revised prompt when the backend returns one.
`native_metadata`	Provider-specific metadata such as response ids and target model.
`warnings`	Non-terminal warnings such as fewer returned images, provider execution failure, or blob commit failure.
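
A caller that receives this result as JSON can turn it into download commands. The sketch below is a minimal example under stated assumptions: `blob_get_commands` and the extension mapping are hypothetical, while the `rkat blob get <blob_id> --output <file>` form and the result field names come from this page.

```python
import shlex
from typing import Any

# Illustrative mapping from media type to file extension.
EXTENSIONS = {"image/png": "png", "image/jpeg": "jpg", "image/webp": "webp"}


def blob_get_commands(result: dict[str, Any], stem: str = "image") -> list[str]:
    """Return one `rkat blob get` command per image, or [] unless generated."""
    if result.get("terminal", {}).get("terminal") != "generated":
        return []
    commands = []
    for index, image in enumerate(result.get("images", []), start=1):
        blob_id = image["blob_ref"]["blob_id"]
        ext = EXTENSIONS.get(image.get("media_type", ""), "bin")
        filename = f"{stem}-{index}.{ext}"
        # Quote both arguments so the command is safe to paste into a shell.
        commands.append(
            f"rkat blob get {shlex.quote(blob_id)} --output {shlex.quote(filename)}"
        )
    return commands
```

Checking `terminal` first matters because a non-generated result may carry an empty `images` array alongside warnings.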

Troubleshooting

Symptom	Cause	Fix
`denied` with `unsupported_target`	No configured image provider owns the requested target, or `auto` resolved to a non-image provider.	Set `provider: "openai"` or `provider: "gemini"` and verify credentials with `rkat doctor`.
`denied` with `unsupported_count`	`count` or `n` was greater than `1`.	Run one image operation per variant.
`denied` with `projection_unsupported`	The selected backend cannot represent the request or provider params.	Remove unsupported params, avoid edits/reference images on the OpenAI Images API path, or use the provider default target.
`denied` with `realtime_transport_conflict`	The operation would require an internal scoped image-model turn while a realtime binding is active.	Run image generation from a non-realtime session or choose an image target that does not require a scoped override.
`safety_filtered` or `refused_by_provider`	The provider rejected the prompt or output.	Revise the prompt and retry.
`failed` with `provider_execution_failed`	Provider call failed after planning.	Check provider credentials, model access, rate limits, and network connectivity.
`failed` with `blob_commit_failed`	Image bytes were returned but could not be stored.	Check the realm’s blob-store backend and disk permissions.

Related Docs

Tools for how builtins are enabled and scoped.
Built-in tools reference for the concise parameter reference.
Providers for provider credentials and model catalog behavior.
docs/architecture/assistant-image-generation-substrate.md for the internal design record.
