Image generation in Meerkat is exposed as the session-scoped generate_image builtin tool. The model asks for an image during a normal turn, Meerkat routes the request to a configured image provider, stores the generated bytes in the realm blob store, and returns durable image references to the session.

This is not a standalone image/generate RPC. To use it, run a session with builtins enabled and tell the agent to call generate_image when an image artifact is required.

Requirements

  • A runtime-backed session surface with builtins enabled. The CLI default --tools safe enables builtins; --tools none hides generate_image.
  • At least one configured image provider. OpenAI uses RKAT_OPENAI_API_KEY or OPENAI_API_KEY; Gemini uses RKAT_GEMINI_API_KEY, GEMINI_API_KEY, or GOOGLE_API_KEY.
  • A blob store. CLI, REST, RPC, MCP, and persistent SDK-backed sessions wire this automatically through the runtime-backed surface.
  • count must be 1. Multiple image output is rejected today with unsupported_count; run multiple operations if you need several variants.
generate_image is separate from view_image. view_image reads an existing file into the model as image input. generate_image creates a new assistant-owned image and stores it as a blob.
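
As a quick local pre-check of the credential requirement above, the following Python sketch reports which image providers have at least one of the listed environment variables set. The function name configured_image_providers is hypothetical and not part of Meerkat. The sketch only tests for presence, because this page does not state an order of precedence between the variables. rkat doctor remains the way to verify credentials.

```python
import os

# Environment variables per provider, as listed in the requirements above.
PROVIDER_KEY_VARS = {
    "openai": ("RKAT_OPENAI_API_KEY", "OPENAI_API_KEY"),
    "gemini": ("RKAT_GEMINI_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY"),
}


def configured_image_providers(env=None):
    """Return providers for which at least one credential variable is non-empty.

    Hypothetical helper: it checks presence only and does not validate keys.
    """
    env = os.environ if env is None else env
    return [
        provider
        for provider, names in PROVIDER_KEY_VARS.items()
        if any(env.get(name) for name in names)
    ]


if __name__ == "__main__":
    found = configured_image_providers()
    print("image providers with credentials:", ", ".join(found) or "none")
```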

CLI quickstart

The safest way to force a first turn to use image generation is to allow only the image tool for that turn:
rkat run \
  --model gpt-5.5 \
  --allow-tool generate_image \
  "Use generate_image to create a square PNG of a cozy tabby cat by a sunlit window. Return the blob id and a one-sentence caption."
When the tool succeeds, the tool result includes images[].blob_ref.blob_id. Save the generated image with:
rkat blob get <blob_id> --output cat.png
Inspect the stored payload instead of writing raw bytes:
rkat blob get <blob_id> --json
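
If the tool result has been captured as JSON, the blob ids can be turned into rkat blob get invocations with a few lines of Python. This is a minimal sketch: how the tool result is captured is left open, and the helper name blob_get_commands and the output file naming are assumptions. It only builds argument lists using the --output flag shown above and does not run anything.

```python
import json

# Assumed mapping from media type to a file extension for the output path.
EXTENSIONS = {"image/png": "png", "image/jpeg": "jpg", "image/webp": "webp"}


def blob_get_commands(tool_result_json, stem="image"):
    """Build one `rkat blob get <blob_id> --output <file>` argv per image."""
    result = json.loads(tool_result_json)
    commands = []
    for index, image in enumerate(result.get("images", [])):
        blob_ref = image["blob_ref"]
        extension = EXTENSIONS.get(blob_ref.get("media_type"), "bin")
        output = f"{stem}-{index}.{extension}"
        commands.append(["rkat", "blob", "get", blob_ref["blob_id"], "--output", output])
    return commands
```

Each returned list can be passed to subprocess.run once you are satisfied with the file names.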

Request Shape

The tool accepts one top-level field, request. For normal use, pass the simple request shape:
{
  "request": {
    "intent": "generate",
    "prompt": "a cozy tabby cat by a sunlit window",
    "size": "1024x1024",
    "quality": "auto",
    "format": "png",
    "count": 1
  }
}
For edits, set intent to edit, provide an instruction, and include at least one source_images entry:
{
  "request": {
    "intent": "edit",
    "instruction": "make the background transparent and keep the subject unchanged",
    "source_images": [
      {
        "kind": "assistant_image",
        "image_id": "01985f61-8f7b-7000-8000-000000000001"
      }
    ],
    "format": "png"
  }
}
Reference an existing blob when the source image did not come from a previous generate_image result:
{
  "kind": "blob",
  "blob_ref": {
    "blob_id": "sha256:...",
    "media_type": "image/png"
  }
}
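
The generate and edit shapes above can be assembled programmatically. The following sketch builds the request envelope and applies the rules stated on this page (prompt for generate, instruction plus at least one source image for edit, count of 1). The function build_request is hypothetical; Meerkat performs its own validation, and this helper only catches obvious mistakes before the agent is asked to call the tool.

```python
def build_request(intent="generate", *, prompt=None, instruction=None,
                  source_images=None, **options):
    """Return a {"request": {...}} payload for generate_image.

    Hypothetical helper; extra keyword options (size, format, provider, ...)
    are passed through unchanged.
    """
    if intent == "generate":
        if not prompt:
            raise ValueError("generate requires a prompt")
        body = {"intent": "generate", "prompt": prompt}
    elif intent == "edit":
        if not instruction:
            raise ValueError("edit requires an instruction")
        if not source_images:
            raise ValueError("edit requires at least one source_images entry")
        body = {
            "intent": "edit",
            "instruction": instruction,
            "source_images": list(source_images),
        }
    else:
        raise ValueError(f"unknown intent: {intent!r}")

    # Only a single image per operation is supported today.
    for key in ("count", "n"):
        if options.get(key, 1) != 1:
            raise ValueError("unsupported_count: only 1 image per operation")

    body.update(options)
    return {"request": body}
```

For example, build_request(prompt="a cozy tabby cat by a sunlit window", size="1024x1024", format="png") yields a payload with the same fields as the first example in this section, minus quality and count.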

Fields

| Field | Values | Notes |
| --- | --- | --- |
| intent | generate, edit | Defaults to generate when prompt is present. |
| prompt | string | Required for generate. Object form { "content": "..." } is also accepted. |
| instruction | string | Required for edit. Object form { "content": "..." } is also accepted. |
| source_images | array | Required for edit; use assistant_image, blob, transcript_block, or provider_native references. |
| reference_images | array | Optional image references for generation. Provider support varies. |
| size | auto, 1024x1024, 1024x1536, 1536x1024, WIDTHxHEIGHT | Custom sizes are parsed and then accepted or rejected by the selected provider. |
| quality | auto, low, medium, high | Provider support varies. |
| format | auto, png, jpeg, jpg, webp | jpg is accepted as jpeg. |
| count / n | 1 | Only 1 is supported today. |
| target | auto or provider/model target | Use the shorthand provider and model fields unless you need the canonical tagged target shape. |
| provider | openai, gemini, google | Forces a provider default image model. google is a Gemini alias. |
| model | provider-owned model id | Requires provider unless Meerkat can infer the owning provider from the model id. |
| provider_params | object | Provider-specific options. Unknown fields are rejected by provider profiles that define a closed parameter shape. |
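
To illustrate the size and format rules in the table, here is a small Python sketch that normalizes format (treating jpg as jpeg) and parses a WIDTHxHEIGHT size. The names normalize_format and parse_size are hypothetical, and this is not Meerkat's parser. A size that parses here can still be rejected by the selected provider.

```python
import re

# Format values from the table above, after folding jpg into jpeg.
FORMATS = {"auto", "png", "jpeg", "webp"}


def normalize_format(value):
    """Map jpg to jpeg and reject values outside the documented set."""
    if value == "jpg":
        return "jpeg"
    if value not in FORMATS:
        raise ValueError(f"unsupported format: {value!r}")
    return value


def parse_size(value):
    """Return None for auto, otherwise a (width, height) tuple of ints."""
    if value == "auto":
        return None
    match = re.fullmatch(r"([0-9]+)x([0-9]+)", value)
    if match is None:
        raise ValueError(f"size must be auto or WIDTHxHEIGHT, got {value!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if width == 0 or height == 0:
        raise ValueError("width and height must be positive")
    return width, height
```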

Target Selection

By default, target is auto.
  • If the current session model belongs to a provider with an image profile, auto uses that provider’s default image target.
  • If the current session provider does not support image generation, set provider explicitly.
  • To force a model, pass both provider and model.
OpenAI default:
{
  "request": {
    "intent": "generate",
    "prompt": "a clean product render of a matte black desk lamp",
    "provider": "openai",
    "format": "png",
    "provider_params": {
      "background": "transparent"
    }
  }
}
Gemini default:
{
  "request": {
    "intent": "generate",
    "prompt": "a widescreen storyboard frame of a rover crossing red dunes",
    "provider": "gemini",
    "size": "1536x1024",
    "provider_params": {
      "aspect_ratio": "16:9",
      "image_size": "2K"
    }
  }
}
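
The selection rules in this section can be sketched as a small lookup. This is an illustration under assumptions, not Meerkat's resolver: the function resolve_target is hypothetical, the default models are the ones this page names as current defaults, and the sketch does not attempt to infer a provider from a bare model id.

```python
# Provider defaults as documented on this page; they may change.
DEFAULT_IMAGE_MODELS = {
    "openai": "gpt-image-2",
    "gemini": "gemini-3.1-flash-image-preview",
}
PROVIDER_ALIASES = {"google": "gemini"}


def resolve_target(session_provider, provider=None, model=None):
    """Return (provider, model) for an image request.

    session_provider is the provider that owns the current session model.
    Raises LookupError("unsupported_target") when no image provider applies.
    """
    if model is not None and provider is None:
        # Meerkat may infer the owner from the model id; this sketch does not.
        raise ValueError("pass provider together with model in this sketch")
    chosen = provider if provider is not None else session_provider
    chosen = PROVIDER_ALIASES.get(chosen, chosen)
    if chosen not in DEFAULT_IMAGE_MODELS:
        raise LookupError("unsupported_target")
    return chosen, model or DEFAULT_IMAGE_MODELS[chosen]
```

With a session provider that has no image profile, resolve_target raises unless provider is set explicitly, which mirrors the second rule above.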

Provider Options

OpenAI

provider: "openai" uses the OpenAI image default, currently gpt-image-2.
OpenAI provider_params:
| Field | Values | Notes |
| --- | --- | --- |
| background | auto, transparent, opaque | transparent is model-dependent. |
| output_compression | 0 to 100 | Applies where the selected OpenAI backend supports it. |
| moderation | auto, low | OpenAI moderation mode. |
| action | auto, generate, edit | Applies only to the hosted Responses image tool. Images API requests reject action. |
Model behavior:
  • gpt-image-2 uses the hosted Responses image tool.
  • Other OpenAI-owned gpt-image* or dall-e* models use the Images API path.
  • The Images API path currently rejects edit requests, reference images, and action.
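
The routing described above can be expressed as a pre-flight check. In this sketch the backend labels responses_image_tool and images_api, and both function names, are hypothetical; only the model-name rules and the three Images API restrictions come from this page.

```python
def openai_backend(model):
    """Classify an OpenAI image model by the path this page says it uses."""
    if model == "gpt-image-2":
        return "responses_image_tool"
    if model.startswith(("gpt-image", "dall-e")):
        return "images_api"
    raise ValueError(f"not an OpenAI image model: {model!r}")


def images_api_conflicts(request):
    """List the parts of a request that the Images API path currently rejects."""
    conflicts = []
    if request.get("intent") == "edit":
        conflicts.append("edit")
    if request.get("reference_images"):
        conflicts.append("reference_images")
    if "action" in request.get("provider_params", {}):
        conflicts.append("action")
    return conflicts
```

A non-empty result from images_api_conflicts for a model on the Images API path suggests the request would be denied, so either drop those parts or use the provider default target.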

Gemini

provider: "gemini" and provider: "google" use the Gemini image default, currently gemini-3.1-flash-image-preview.
Supported Gemini image models:
  • gemini-3.1-flash-image-preview
  • gemini-3-pro-image-preview
  • gemini-2.5-flash-image
Gemini provider_params:
| Field | Values | Notes |
| --- | --- | --- |
| aspect_ratio | 1:1, 16:9, 9:16, square1x1, landscape16x9, portrait9x16 | Overrides the universal size-to-aspect-ratio mapping. |
| image_size | 1K, 2K, 4K, one_k, two_k, four_k | Ignored for gemini-2.5-flash-image, which does not use an explicit image-size field. |
Gemini image generation runs through a scoped image-model turn internally. The user and model still use one generate_image operation; they do not need to call switch_turn.
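
The Gemini provider_params can be normalized before a request is built. The sketch below assumes that the named spellings are aliases of the ratio and size forms (for example, landscape16x9 for 16:9 and two_k for 2K), which the table suggests but does not state outright. The function normalize_gemini_params is hypothetical.

```python
# Assumed alias pairs; the table lists both spellings as accepted values.
ASPECT_RATIOS = {
    "1:1": "1:1", "square1x1": "1:1",
    "16:9": "16:9", "landscape16x9": "16:9",
    "9:16": "9:16", "portrait9x16": "9:16",
}
IMAGE_SIZES = {
    "1K": "1K", "one_k": "1K",
    "2K": "2K", "two_k": "2K",
    "4K": "4K", "four_k": "4K",
}


def normalize_gemini_params(model, params):
    """Return provider_params in a single spelling, dropping ignored fields."""
    unknown = set(params) - {"aspect_ratio", "image_size"}
    if unknown:
        raise ValueError(f"unknown Gemini provider_params: {sorted(unknown)}")
    normalized = {}
    if "aspect_ratio" in params:
        if params["aspect_ratio"] not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect_ratio: {params['aspect_ratio']!r}")
        normalized["aspect_ratio"] = ASPECT_RATIOS[params["aspect_ratio"]]
    if "image_size" in params:
        if params["image_size"] not in IMAGE_SIZES:
            raise ValueError(f"unsupported image_size: {params['image_size']!r}")
        # gemini-2.5-flash-image does not use an explicit image-size field.
        if model != "gemini-2.5-flash-image":
            normalized["image_size"] = IMAGE_SIZES[params["image_size"]]
    return normalized
```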

Result Shape

generate_image returns structured JSON:
{
  "operation_id": "01985f61-8f7b-7000-8000-000000000002",
  "terminal": { "terminal": "generated" },
  "images": [
    {
      "image_id": "01985f61-8f7b-7000-8000-000000000003",
      "blob_ref": {
        "blob_id": "sha256:...",
        "media_type": "image/png"
      },
      "media_type": "image/png",
      "width": 1024,
      "height": 1024
    }
  ],
  "provider_text": { "disposition": "not_emitted" },
  "revised_prompt": { "disposition": "not_requested" },
  "native_metadata": {
    "provider": "open_ai",
    "target_model": "gpt-5.4",
    "response_id": "resp_..."
  },
  "warnings": []
}
Important fields:
| Field | Meaning |
| --- | --- |
| operation_id | Runtime identity for this image operation. |
| terminal | Final state: generated, empty_result, denied, refused_by_provider, safety_filtered, failed, cancelled, timeout, or scoped_restore_failed. |
| images | Durable assistant image references. Use blob_ref.blob_id with rkat blob get. |
| provider_text | Captured provider-side text when the image backend emits text alongside images. |
| revised_prompt | Provider-revised prompt when the backend returns one. |
| native_metadata | Provider-specific metadata such as response ids and target model. |
| warnings | Non-terminal warnings such as fewer returned images, provider execution failure, or blob commit failure. |
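
A caller that receives this JSON usually needs to check the terminal state before using the images. The following sketch reads the fields shown in the example result. The function summarize_result and the shape of its return value are hypothetical, and it treats only generated as success.

```python
import json

# Terminal states listed in the table above.
KNOWN_TERMINALS = {
    "generated", "empty_result", "denied", "refused_by_provider",
    "safety_filtered", "failed", "cancelled", "timeout",
    "scoped_restore_failed",
}


def summarize_result(raw):
    """Reduce a generate_image result to its state, image list, and warnings."""
    result = json.loads(raw)
    state = result["terminal"]["terminal"]
    if state not in KNOWN_TERMINALS:
        raise ValueError(f"unexpected terminal state: {state!r}")
    images = [
        (image["image_id"], image.get("width"), image.get("height"))
        for image in result.get("images", [])
    ]
    return {
        "ok": state == "generated",
        "state": state,
        "images": images,
        "warnings": result.get("warnings", []),
    }
```

Checking warnings even when ok is true is worthwhile, since warnings are reported separately from the terminal state.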

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| denied with unsupported_target | No configured image provider owns the requested target, or auto resolved to a non-image provider. | Set provider: "openai" or provider: "gemini" and verify credentials with rkat doctor. |
| denied with unsupported_count | count or n was greater than 1. | Run one image operation per variant. |
| denied with projection_unsupported | The selected backend cannot represent the request or provider params. | Remove unsupported params, avoid edits/reference images on the OpenAI Images API path, or use the provider default target. |
| denied with realtime_transport_conflict | The operation would require an internal scoped image-model turn while a realtime binding is active. | Run image generation from a non-realtime session or choose an image target that does not require a scoped override. |
| safety_filtered or refused_by_provider | The provider rejected the prompt or output. | Revise the prompt and retry. |
| failed with provider_execution_failed | Provider call failed after planning. | Check provider credentials, model access, rate limits, and network connectivity. |
| failed with blob_commit_failed | Image bytes were returned but could not be stored. | Check the realm’s blob-store backend and disk permissions. |

See also

  • Tools for how builtins are enabled and scoped.
  • Built-in tools reference for the concise parameter reference.
  • Providers for provider credentials and model catalog behavior.
  • docs/architecture/assistant-image-generation-substrate.md for the internal design record.