Image generation in Meerkat is exposed as the session-scoped generate_image builtin tool. The model asks for an image during a normal turn, Meerkat routes the request to a configured image provider, stores generated bytes in the realm blob store, and returns durable image references back into the session. This is not a standalone image/generate RPC. Use it by running a session with builtins enabled and telling the agent to call generate_image when an image artifact is required.

Requirements

  • A runtime-backed session surface with builtins enabled. The CLI default --tools safe enables builtins; --tools none hides generate_image.
  • At least one configured image provider. OpenAI uses RKAT_OPENAI_API_KEY or OPENAI_API_KEY; Azure OpenAI uses AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, and AZURE_OPENAI_IMAGE_GENERATION_DEPLOYMENT; Gemini uses RKAT_GEMINI_API_KEY, GEMINI_API_KEY, or GOOGLE_API_KEY.
  • A blob store. CLI, REST, RPC, MCP, and persistent SDK-backed sessions wire this automatically through the runtime-backed surface.
  • count must be 1. Requests for more than one image are rejected today with unsupported_count; run one operation per variant if you need several.

generate_image is separate from view_image. view_image reads an existing file into the model as image input. generate_image creates a new assistant-owned image and stores it as a blob. When the model needs to save a generated blob to disk during the same run, it can call blob_save_file.
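
The credential requirements above can be pre-checked before starting a session. The following is a minimal sketch, assuming credentials are supplied only through the environment variables listed in the requirements; `configured_image_providers` is a hypothetical helper, not part of Meerkat, and it does not replace `rkat doctor`.

```python
import os
from typing import Mapping

# Environment variable names taken from the requirements list above.
OPENAI_KEYS = ("RKAT_OPENAI_API_KEY", "OPENAI_API_KEY")
AZURE_REQUIRED = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_IMAGE_GENERATION_DEPLOYMENT",
)
GEMINI_KEYS = ("RKAT_GEMINI_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY")


def configured_image_providers(env: Mapping[str, str]) -> list[str]:
    """Hypothetical helper: list providers whose variables are set and non-empty."""

    def is_set(name: str) -> bool:
        return bool(env.get(name, "").strip())

    providers = []
    # OpenAI needs any one of its key variables.
    if any(is_set(name) for name in OPENAI_KEYS):
        providers.append("openai")
    # Azure OpenAI needs all three variables together.
    if all(is_set(name) for name in AZURE_REQUIRED):
        providers.append("azure_openai")
    # Gemini needs any one of its key variables.
    if any(is_set(name) for name in GEMINI_KEYS):
        providers.append("gemini")
    return providers


if __name__ == "__main__":
    print(configured_image_providers(os.environ))
```

An empty list suggests that generate_image would have no provider to route to, although Meerkat may also read credentials from configuration that this sketch does not inspect.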

CLI quickstart

The safest way to force a first turn to use image generation is to allow only the image tool for that turn:
rkat run \
  --model gpt-5.5 \
  --allow-tool generate_image \
  "Use generate_image to create a square PNG of a cozy tabby cat by a sunlit window. Return the blob id and a one-sentence caption."
When the tool succeeds, the tool result includes images[].blob_ref.blob_id. Save the generated image with:
rkat blob get <blob_id> --output cat.png
You can also ask the agent to save the image itself:
rkat run "Create a PNG infographic about today's top news and save it to top-news-infographic.png" --yolo -m gpt-5.5
Inspect the stored payload instead of writing raw bytes:
rkat blob get <blob_id> --json

Request Shape

The tool accepts one top-level field, request. For normal use, pass the simple request shape:
{
  "request": {
    "intent": "generate",
    "prompt": "a cozy tabby cat by a sunlit window",
    "size": "1024x1024",
    "quality": "auto",
    "format": "png",
    "count": 1
  }
}
This is the Meerkat generate_image tool input, not a raw OpenAI Responses API request. Meerkat lowers universal fields such as size, quality, and format into the selected provider’s native request. For OpenAI, format becomes the provider-side output_format option.

For edits, set intent to edit, provide an instruction, and include at least one source_images entry:
{
  "request": {
    "intent": "edit",
    "instruction": "replace the background with a plain white studio backdrop and keep the subject unchanged",
    "source_images": [
      {
        "kind": "assistant_image",
        "image_id": "01985f61-8f7b-7000-8000-000000000001"
      }
    ],
    "format": "png"
  }
}
Reference an existing blob when the source image did not come from a previous generate_image result:
{
  "kind": "blob",
  "blob_ref": {
    "blob_id": "sha256:...",
    "media_type": "image/png"
  }
}

Fields

| Field | Values | Notes |
| --- | --- | --- |
| intent | generate, edit | Defaults to generate when prompt is present. |
| prompt | string | Required for generate. Object form { "content": "..." } is also accepted. |
| instruction | string | Required for edit. Object form { "content": "..." } is also accepted. |
| source_images | array | Required for edit; use assistant_image, blob, transcript_block, or provider_native references. |
| reference_images | array | Optional image references for generation. Provider support varies. |
| size | auto, 1024x1024, 1024x1536, 1536x1024, WIDTHxHEIGHT | Custom sizes are parsed and then accepted or rejected by the selected provider. |
| quality | auto, low, medium, high | Provider support varies. |
| format | auto, png, jpeg, jpg, webp | jpg is accepted as jpeg. |
| count / n | 1 | Only 1 is supported today. |
| target | auto or provider/model target | Use the shorthand provider and model fields unless you need the canonical tagged target shape. |
| provider | openai, gemini, google | Forces a provider default image model. google is a Gemini alias. |
| model | provider-owned model id | Requires provider unless Meerkat can infer the owning provider from the model id. |
| provider_params | object | Provider-specific options. Unknown fields are rejected by provider profiles that define a closed parameter shape. |
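
The field rules above can be applied on the client before the request reaches the tool. The following is a minimal sketch, assuming the caller always states intent explicitly; `build_image_request` and its `image_format` parameter are hypothetical names, and Meerkat still performs its own validation.

```python
from typing import Any, Optional

# Allowed values copied from the Fields table above.
QUALITIES = ("auto", "low", "medium", "high")
FORMATS = ("auto", "png", "jpeg", "jpg", "webp")


def build_image_request(
    intent: str = "generate",
    prompt: Optional[str] = None,
    instruction: Optional[str] = None,
    source_images: Optional[list[dict[str, Any]]] = None,
    size: str = "auto",
    quality: str = "auto",
    image_format: str = "auto",
    count: int = 1,
) -> dict[str, Any]:
    """Hypothetical builder for the generate_image tool input."""
    if count != 1:
        raise ValueError("count must be 1; run one operation per variant")
    if quality not in QUALITIES:
        raise ValueError(f"unsupported quality: {quality!r}")
    if image_format not in FORMATS:
        raise ValueError(f"unsupported format: {image_format!r}")

    request: dict[str, Any] = {
        "intent": intent,
        "size": size,
        "quality": quality,
        # jpg is accepted as jpeg, so normalizing here is optional.
        "format": "jpeg" if image_format == "jpg" else image_format,
        "count": count,
    }
    if intent == "generate":
        if not prompt:
            raise ValueError("prompt is required for generate")
        request["prompt"] = prompt
    elif intent == "edit":
        if not instruction:
            raise ValueError("instruction is required for edit")
        if not source_images:
            raise ValueError("edit needs at least one source_images entry")
        request["instruction"] = instruction
        request["source_images"] = source_images
    else:
        raise ValueError(f"unsupported intent: {intent!r}")
    return {"request": request}
```

Calling `build_image_request(prompt="a cozy tabby cat by a sunlit window", size="1024x1024", image_format="png")` yields a payload with the same fields as the simple request shape shown earlier. Size is passed through unchecked because custom sizes are accepted or rejected by the selected provider.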

Target Selection

By default, target is auto.
  • If the current session model belongs to a provider with an image profile, auto uses that provider’s default image target.
  • If the current session provider does not support image generation, set provider explicitly.
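  • To force a specific model, pass model together with provider. provider can be omitted only when Meerkat can infer the owning provider from the model id.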
OpenAI default:
{
  "request": {
    "intent": "generate",
    "prompt": "a clean product render of a matte black desk lamp",
    "provider": "openai",
    "size": "1024x1024",
    "quality": "auto",
    "format": "webp"
  }
}
Gemini default:
{
  "request": {
    "intent": "generate",
    "prompt": "a widescreen storyboard frame of a rover crossing red dunes",
    "provider": "gemini",
    "size": "1536x1024",
    "provider_params": {
      "aspect_ratio": "16:9",
      "image_size": "2K"
    }
  }
}

Provider Options

OpenAI

provider: "openai" uses the OpenAI image default, currently gpt-image-2. Meerkat fields and OpenAI lowering:
| Field | Values | Notes |
| --- | --- | --- |
| size | auto, 1024x1024, 1024x1536, 1536x1024, WIDTHxHEIGHT | Universal Meerkat field. Lowered to OpenAI size. |
| quality | auto, low, medium, high | Universal Meerkat field. Lowered to OpenAI quality. |
| format | auto, png, jpeg, jpg, webp | Universal Meerkat field. Lowered to OpenAI output_format; jpg normalizes to jpeg. |
| provider_params.background | auto, opaque | Advanced OpenAI override. gpt-image-2 does not support transparent backgrounds. Use an opaque background and post-process if you need transparency. |
| provider_params.output_compression | 0 to 100 | Advanced OpenAI override. Applies only when the universal format is jpeg or webp. |
| provider_params.moderation | auto, low | Advanced GPT Image moderation strictness override. Omit it for OpenAI’s default filtering. |
| provider_params.action | auto, generate, edit | Advanced hosted Responses image-tool override. Normal Meerkat calls should use top-level intent and omit action; Images API routes reject it. |
| provider_params.reasoning_effort | none, low, medium, high, xhigh | Hosted gpt-image-2 route only. Lowered to OpenAI reasoning.effort. |
| provider_params.web_search | true, false, null, or object | Hosted gpt-image-2 route only. true/{} enables OpenAI web_search; object values may include OpenAI web-search options such as search_context_size, filters, user_location, external_web_access, and return_token_budget. |
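
The provider_params rows above describe a closed set of keys, which a caller can check before sending a request. The following is a minimal sketch, assuming output_compression is an integer; `check_openai_provider_params` is a hypothetical helper, and the real checks live in Meerkat’s OpenAI adapter.

```python
from typing import Any

# Allowed values copied from the provider_params rows of the table above.
ALLOWED_VALUES = {
    "background": ("auto", "opaque"),
    "moderation": ("auto", "low"),
    "action": ("auto", "generate", "edit"),
    "reasoning_effort": ("none", "low", "medium", "high", "xhigh"),
}
ALLOWED_KEYS = set(ALLOWED_VALUES) | {"output_compression", "web_search"}


def check_openai_provider_params(params: dict[str, Any], image_format: str) -> list[str]:
    """Hypothetical pre-check: return findings, or an empty list if none."""
    findings = [f"unknown field: {key}" for key in sorted(set(params) - ALLOWED_KEYS)]

    for key, allowed in ALLOWED_VALUES.items():
        if key in params and params[key] not in allowed:
            findings.append(f"{key} must be one of {', '.join(allowed)}")

    if "output_compression" in params:
        value = params["output_compression"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 0 <= value <= 100:
            findings.append("output_compression must be an integer from 0 to 100")
        elif image_format not in ("jpeg", "jpg", "webp"):
            findings.append("output_compression applies only when format is jpeg or webp")

    if "web_search" in params:
        value = params["web_search"]
        if value is not None and not isinstance(value, (bool, dict)):
            findings.append("web_search must be true, false, null, or an object")

    return findings
```

For example, `check_openai_provider_params({"background": "transparent"}, "png")` reports the background value, because only auto and opaque are listed for that key.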
Model behavior:
  • gpt-image-2 uses the hosted Responses image tool internally, but callers still pass Meerkat’s generate_image request shape.
  • On the azure_openai backend, hosted image generation requires backend option image_generation_deployment (for example gpt-image-2). The session model remains the Azure text deployment name, and Meerkat selects the image deployment with Azure’s Responses image-generation header.
  • gpt-image-2 accepts flexible WIDTHxHEIGHT sizes when they satisfy OpenAI’s constraints: both edges are multiples of 16 px, max edge is 3840 px, aspect ratio is at most 3:1, and total pixels are between 655,360 and 8,294,400. Common values include 1024x1024, 1536x1024, 1024x1536, 2048x2048, 2048x1152, 3840x2160, and 2160x3840.
  • Do not copy raw Responses API JSON into provider_params. Use Meerkat’s universal size, quality, format, and intent fields, and reserve provider_params for the advanced OpenAI overrides above.
  • For image-only requests that depend on fresh or current information, prefer passing provider_params.web_search to generate_image over running a separate search turn first.
  • Do not pass background: "transparent" for gpt-image-2; OpenAI rejects it. Do not pass input_fidelity either; the OpenAI adapter in Meerkat 0.6.6 uses a closed provider-params shape and rejects unknown fields.
  • Other OpenAI-owned gpt-image* or dall-e* models use the Images API path.
  • The Images API path currently rejects edit requests, reference images, action, reasoning_effort, and web_search.
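
The gpt-image-2 size constraints listed above are simple arithmetic, so a custom WIDTHxHEIGHT value can be checked before the request is sent. The following is a minimal sketch, assuming the limits stated in this section; `check_gpt_image_2_size` is a hypothetical helper, and the provider remains the final authority on which sizes it accepts.

```python
import re

SIZE_PATTERN = re.compile(r"(\d+)x(\d+)")

MAX_EDGE = 3840
MIN_PIXELS = 655_360
MAX_PIXELS = 8_294_400


def check_gpt_image_2_size(size: str) -> list[str]:
    """Hypothetical pre-check: return violated constraints, or an empty list."""
    if size == "auto":
        return []
    match = SIZE_PATTERN.fullmatch(size)
    if match is None:
        return [f"not in WIDTHxHEIGHT form: {size!r}"]
    width, height = int(match.group(1)), int(match.group(2))
    if width == 0 or height == 0:
        return ["width and height must be positive"]

    problems = []
    if width % 16 or height % 16:
        problems.append("both edges must be multiples of 16 px")
    if max(width, height) > MAX_EDGE:
        problems.append(f"max edge is {MAX_EDGE} px")
    # Integer comparison avoids floating-point rounding at exactly 3:1.
    if max(width, height) > 3 * min(width, height):
        problems.append("aspect ratio must be at most 3:1")
    if not MIN_PIXELS <= width * height <= MAX_PIXELS:
        problems.append(f"total pixels must be between {MIN_PIXELS} and {MAX_PIXELS}")
    return problems
```

Each of the common values listed above passes this check; 3840x2160 sits exactly on the upper pixel limit of 8,294,400.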

Gemini

provider: "gemini" and provider: "google" use the Gemini image default, currently gemini-3.1-flash-image-preview. Supported Gemini image models:
  • gemini-3.1-flash-image-preview
  • gemini-3-pro-image-preview
  • gemini-2.5-flash-image
Gemini provider_params:
| Field | Values | Notes |
| --- | --- | --- |
| aspect_ratio | 1:1, 16:9, 9:16, square1x1, landscape16x9, portrait9x16 | Overrides the universal size-to-aspect-ratio mapping. |
| image_size | 1K, 2K, 4K, one_k, two_k, four_k | Ignored for gemini-2.5-flash-image, which does not use an explicit image-size field. |
Gemini image generation runs through a scoped image-model turn internally. The user and model still use one generate_image operation; they do not need to call switch_turn.
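
The Gemini options above can be assembled with a small guard around the listed values. The following is a minimal sketch, assuming the value lists in this section are exhaustive; `gemini_provider_params` is a hypothetical helper. It leaves out image_size for gemini-2.5-flash-image because that field is ignored there.

```python
from typing import Optional

# Values copied from the Gemini section above.
SUPPORTED_MODELS = (
    "gemini-3.1-flash-image-preview",
    "gemini-3-pro-image-preview",
    "gemini-2.5-flash-image",
)
ASPECT_RATIOS = ("1:1", "16:9", "9:16", "square1x1", "landscape16x9", "portrait9x16")
IMAGE_SIZES = ("1K", "2K", "4K", "one_k", "two_k", "four_k")


def gemini_provider_params(
    model: str,
    aspect_ratio: Optional[str] = None,
    image_size: Optional[str] = None,
) -> dict[str, str]:
    """Hypothetical helper: build provider_params for a Gemini image model."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported Gemini image model: {model!r}")

    params: dict[str, str] = {}
    if aspect_ratio is not None:
        if aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
        params["aspect_ratio"] = aspect_ratio
    if image_size is not None:
        if image_size not in IMAGE_SIZES:
            raise ValueError(f"unsupported image_size: {image_size!r}")
        # gemini-2.5-flash-image ignores image_size, so do not send it.
        if model != "gemini-2.5-flash-image":
            params["image_size"] = image_size
    return params
```

The returned dictionary can be placed under provider_params in a request that also sets provider to gemini and, when a non-default model is wanted, model.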

Result Shape

generate_image returns structured JSON:
{
  "operation_id": "01985f61-8f7b-7000-8000-000000000002",
  "terminal": { "terminal": "generated" },
  "images": [
    {
      "image_id": "01985f61-8f7b-7000-8000-000000000003",
      "blob_ref": {
        "blob_id": "sha256:...",
        "media_type": "image/png"
      },
      "media_type": "image/png",
      "width": 1024,
      "height": 1024
    }
  ],
  "provider_text": { "disposition": "not_emitted" },
  "revised_prompt": { "disposition": "not_requested" },
  "native_metadata": {
    "provider": "open_ai",
    "target_model": "gpt-5.4",
    "response_id": "resp_..."
  },
  "warnings": []
}
Important fields:
| Field | Meaning |
| --- | --- |
| operation_id | Runtime identity for this image operation. |
| terminal | Final state: generated, empty_result, denied, refused_by_provider, safety_filtered, failed, cancelled, timeout, or scoped_restore_failed. |
| images | Durable assistant image references. Use blob_ref.blob_id with blob_save_file inside a run or rkat blob get outside the run. |
| provider_text | Captured provider-side text when the image backend emits text alongside images. |
| revised_prompt | Provider-revised prompt when the backend returns one. |
| native_metadata | Provider-specific metadata such as response ids and target model. |
| warnings | Non-terminal warnings such as fewer returned images, provider execution failure, or blob commit failure. |
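
A caller that receives this JSON can read the terminal state first and then collect blob ids for download. The following is a minimal sketch, assuming the result has already been parsed into a dictionary; `blob_ids_from_result` and `blob_get_command` are hypothetical helpers, and the blob id in the usage note is a placeholder.

```python
from typing import Any


def blob_ids_from_result(result: dict[str, Any]) -> list[str]:
    """Hypothetical helper: return blob ids only when the operation generated images."""
    state = result.get("terminal", {}).get("terminal")
    if state != "generated":
        raise RuntimeError(f"image operation ended as {state!r}")
    return [image["blob_ref"]["blob_id"] for image in result.get("images", [])]


def blob_get_command(blob_id: str, output_path: str) -> list[str]:
    """Build the argument list for the `rkat blob get` command from the CLI quickstart."""
    return ["rkat", "blob", "get", blob_id, "--output", output_path]


if __name__ == "__main__":
    # Placeholder result; a real blob_id is a full content hash.
    sample = {
        "terminal": {"terminal": "generated"},
        "images": [
            {"blob_ref": {"blob_id": "sha256:EXAMPLE", "media_type": "image/png"}}
        ],
        "warnings": [],
    }
    for blob_id in blob_ids_from_result(sample):
        print(" ".join(blob_get_command(blob_id, "cat.png")))
```

The argument list can be passed to `subprocess.run` when rkat is installed. Checking terminal before reading images keeps denied, failed, and filtered operations from being mistaken for empty successes.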

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| denied with unsupported_target | No configured image provider owns the requested target, or auto resolved to a non-image provider. | Set provider: "openai" or provider: "gemini" and verify credentials with rkat doctor. |
| denied with unsupported_count | count or n was greater than 1. | Run one image operation per variant. |
| denied with projection_unsupported | The selected backend cannot represent the request or provider params. | Remove unsupported params, avoid edits/reference images on the OpenAI Images API path, or use the provider default target. |
| denied with realtime_transport_conflict | The operation would require an internal scoped image-model turn while a realtime binding is active. | Run image generation from a non-realtime session or choose an image target that does not require a scoped override. |
| safety_filtered or refused_by_provider | The provider rejected the prompt or output. | Revise the prompt and retry. |
| failed with provider_execution_failed | Provider call failed after planning. | Check provider credentials, model access, rate limits, and network connectivity. |
| failed with blob_commit_failed | Image bytes were returned but could not be stored. | Check the realm’s blob-store backend and disk permissions. |