# Image generation

> Generate or edit assistant-owned images through the session-scoped generate_image tool.

Image generation in Meerkat is exposed as the session-scoped `generate_image` builtin tool. The model asks for an image during a normal turn, Meerkat routes the request to a configured image provider, stores the generated bytes in the realm blob store, and returns durable image references to the session.

This is not a standalone `image/generate` RPC. Use it by running a session with builtins enabled and telling the agent to call `generate_image` when an image artifact is required.

## Requirements

* A runtime-backed session surface with builtins enabled. The CLI default `--tools safe` enables builtins; `--tools none` hides `generate_image`.
* At least one configured image provider:
  * OpenAI uses `RKAT_OPENAI_API_KEY` or `OPENAI_API_KEY`.
  * Azure OpenAI uses `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`, and `AZURE_OPENAI_IMAGE_GENERATION_DEPLOYMENT`.
  * Gemini uses `RKAT_GEMINI_API_KEY`, `GEMINI_API_KEY`, or `GOOGLE_API_KEY`.
* A blob store. CLI, REST, RPC, MCP, and persistent SDK-backed sessions wire this automatically through the runtime-backed surface.
* `count` must be `1`. Requests for more than one image are currently rejected with `unsupported_count`; run one operation per variant if you need several.

<Note>
  `generate_image` is separate from `view_image`. `view_image` reads an existing file into the model as image input. `generate_image` creates a new assistant-owned image and stores it as a blob. When the model needs to save a generated blob to disk during the same run, it can call `blob_save_file`.
</Note>

## CLI quickstart

The most reliable way to make the first turn use image generation is to allow only the image tool for that turn:

```bash theme={null}
rkat run \
  --model gpt-5.5 \
  --allow-tool generate_image \
  "Use generate_image to create a square PNG of a cozy tabby cat by a sunlit window. Return the blob id and a one-sentence caption."
```

When the tool succeeds, the tool result includes `images[].blob_ref.blob_id`. Save the generated image with:

```bash theme={null}
rkat blob get <blob_id> --output cat.png
```

You can also ask the agent to save the image itself:

```bash theme={null}
rkat run "Create a PNG infographic about today's top news and save it to top-news-infographic.png" --yolo -m gpt-5.5
```

To inspect the stored payload as JSON instead of writing raw bytes, pass `--json`:

```bash theme={null}
rkat blob get <blob_id> --json
```
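
The two `rkat blob get` invocations above can also be driven from a script. The following is a minimal Python sketch, assuming `rkat` is on `PATH`. `blob_get_command` and `fetch_blob` are hypothetical helper names, and only the `--output` and `--json` flags shown above are used.

```python
import subprocess
from typing import List, Optional


def blob_get_command(blob_id: str, output: Optional[str] = None) -> List[str]:
    """Build the argv for `rkat blob get` (hypothetical helper)."""
    if not blob_id:
        raise ValueError("blob_id must be a non-empty string")
    argv = ["rkat", "blob", "get", blob_id]
    # --output writes raw bytes to a file; --json prints the stored payload.
    argv += ["--output", output] if output else ["--json"]
    return argv


def fetch_blob(blob_id: str, output: Optional[str] = None) -> str:
    """Run the command and return stdout; raises CalledProcessError on failure."""
    completed = subprocess.run(
        blob_get_command(blob_id, output),
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

Building the argument list separately from running it keeps the command easy to log or unit-test without invoking the CLI.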

## Request Shape

The tool accepts one top-level field, `request`. For normal use, pass the simple request shape:

```json theme={null}
{
  "request": {
    "intent": "generate",
    "prompt": "a cozy tabby cat by a sunlit window",
    "size": "1024x1024",
    "quality": "auto",
    "format": "png",
    "count": 1
  }
}
```

This is the Meerkat `generate_image` tool input, not a raw OpenAI Responses API request. Meerkat lowers universal fields such as `size`, `quality`, and `format` into the selected provider's native request. For OpenAI, `format` becomes the provider-side `output_format` option.
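
A caller that assembles tool input programmatically might build the simple shape as follows. This is only an illustrative sketch: `build_generate_request` is a hypothetical helper, not part of Meerkat, and it pins `count` to `1` because that is the only value supported today.

```python
from typing import Any, Dict


def build_generate_request(
    prompt: str,
    size: str = "auto",
    quality: str = "auto",
    image_format: str = "auto",
) -> Dict[str, Any]:
    """Return the tool input: one top-level `request` object (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt is required when intent is 'generate'")
    return {
        "request": {
            "intent": "generate",
            "prompt": prompt,
            "size": size,
            "quality": quality,
            "format": image_format,
            "count": 1,  # only 1 is supported today
        }
    }
```

Calling `build_generate_request("a cozy tabby cat by a sunlit window", size="1024x1024", image_format="png")` yields the same structure as the JSON example above.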

For edits, set `intent` to `edit`, provide an `instruction`, and include at least one `source_images` entry:

```json theme={null}
{
  "request": {
    "intent": "edit",
    "instruction": "replace the background with a plain white studio backdrop and keep the subject unchanged",
    "source_images": [
      {
        "kind": "assistant_image",
        "image_id": "01985f61-8f7b-7000-8000-000000000001"
      }
    ],
    "format": "png"
  }
}
```

Reference an existing blob when the source image did not come from a previous `generate_image` result:

```json theme={null}
{
  "kind": "blob",
  "blob_ref": {
    "blob_id": "sha256:...",
    "media_type": "image/png"
  }
}
```
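
The two source-image reference kinds shown above can be combined in one edit request. The sketch below is a hedged illustration; `assistant_image_ref`, `blob_source_ref`, and `build_edit_request` are hypothetical helper names, and only the `assistant_image` and `blob` kinds from the examples above are covered.

```python
from typing import Any, Dict, List


def assistant_image_ref(image_id: str) -> Dict[str, Any]:
    """Reference an image returned by an earlier generate_image result."""
    return {"kind": "assistant_image", "image_id": image_id}


def blob_source_ref(blob_id: str, media_type: str) -> Dict[str, Any]:
    """Reference an existing blob that did not come from generate_image."""
    return {
        "kind": "blob",
        "blob_ref": {"blob_id": blob_id, "media_type": media_type},
    }


def build_edit_request(
    instruction: str,
    source_images: List[Dict[str, Any]],
    image_format: str = "png",
) -> Dict[str, Any]:
    """Return tool input for an edit; both checks mirror the text above."""
    if not instruction.strip():
        raise ValueError("instruction is required when intent is 'edit'")
    if not source_images:
        raise ValueError("edit needs at least one source_images entry")
    return {
        "request": {
            "intent": "edit",
            "instruction": instruction,
            "source_images": list(source_images),
            "format": image_format,
        }
    }
```

Keeping the reference constructors separate makes it harder to mix up the `image_id` and `blob_ref` fields, which belong to different `kind` values.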

### Fields

| Field              | Values                                                        | Notes                                                                                                             |
| ------------------ | ------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- |
| `intent`           | `generate`, `edit`                                            | Defaults to `generate` when `prompt` is present.                                                                  |
| `prompt`           | string                                                        | Required for `generate`. Object form `{ "content": "..." }` is also accepted.                                     |
| `instruction`      | string                                                        | Required for `edit`. Object form `{ "content": "..." }` is also accepted.                                         |
| `source_images`    | array                                                         | Required for `edit`; use `assistant_image`, `blob`, `transcript_block`, or `provider_native` references.          |
| `reference_images` | array                                                         | Optional image references for generation. Provider support varies.                                                |
| `size`             | `auto`, `1024x1024`, `1024x1536`, `1536x1024`, `WIDTHxHEIGHT` | Custom sizes are parsed and then accepted or rejected by the selected provider.                                   |
| `quality`          | `auto`, `low`, `medium`, `high`                               | Provider support varies.                                                                                          |
| `format`           | `auto`, `png`, `jpeg`, `jpg`, `webp`                          | `jpg` is accepted as `jpeg`.                                                                                      |
| `count` / `n`      | `1`                                                           | Only `1` is supported today.                                                                                      |
| `target`           | `auto` or provider/model target                               | Use the shorthand `provider` and `model` fields unless you need the canonical tagged target shape.                |
| `provider`         | `openai`, `gemini`, `google`                                  | Forces a provider default image model. `google` is a Gemini alias.                                                |
| `model`            | provider-owned model id                                       | Requires `provider` unless Meerkat can infer the owning provider from the model id.                               |
| `provider_params`  | object                                                        | Provider-specific options. Unknown fields are rejected by provider profiles that define a closed parameter shape. |
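
The value constraints in the table can be checked on the client before a request is sent. The following is a rough sketch, not Meerkat's own validation: `check_request` is a hypothetical helper, it only inspects fields that are present, and the provider may still reject a request that passes these checks.

```python
import re
from typing import Any, Dict, List

ALLOWED = {
    "intent": {"generate", "edit"},
    "quality": {"auto", "low", "medium", "high"},
    "format": {"auto", "png", "jpeg", "jpg", "webp"},
}
SIZE_PATTERN = re.compile(r"^(auto|[1-9]\d*x[1-9]\d*)$")


def check_request(request: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the inner `request` object."""
    problems: List[str] = []
    intent = request.get("intent")
    if intent is None and "prompt" in request:
        intent = "generate"  # documented default when prompt is present
    if intent not in ALLOWED["intent"]:
        problems.append("intent must be 'generate' or 'edit'")
    if intent == "generate" and not request.get("prompt"):
        problems.append("prompt is required for generate")
    if intent == "edit":
        if not request.get("instruction"):
            problems.append("instruction is required for edit")
        if not request.get("source_images"):
            problems.append("source_images needs at least one entry for edit")
    for field in ("quality", "format"):
        if field in request and request[field] not in ALLOWED[field]:
            problems.append(f"{field} must be one of {sorted(ALLOWED[field])}")
    if "size" in request and not SIZE_PATTERN.match(str(request["size"])):
        problems.append("size must be 'auto' or WIDTHxHEIGHT")
    for field in ("count", "n"):
        if field in request and request[field] != 1:
            problems.append(f"{field} must be 1")
    return problems
```

An empty list means the request matches the table; each string otherwise names one field to fix.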

## Target Selection

By default, `target` is `auto`.

* If the current session model belongs to a provider with an image profile, `auto` uses that provider's default image target.
* If the current session provider does not support image generation, set `provider` explicitly.
* To force a model, pass both `provider` and `model`.
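
The selection rules above can be pictured with a small sketch. This is not Meerkat's implementation: `resolve_target` is a hypothetical function, the default model ids are the ones quoted elsewhere on this page and may change, and inference of the owning provider from a bare model id is deliberately left out.

```python
from typing import Optional, Tuple

# Provider defaults as stated on this page; treat them as a snapshot.
DEFAULT_IMAGE_MODEL = {
    "openai": "gpt-image-2",
    "gemini": "gemini-3.1-flash-image-preview",
}
PROVIDER_ALIASES = {"google": "gemini"}


def resolve_target(
    session_provider: str,
    provider: Optional[str] = None,
    model: Optional[str] = None,
) -> Tuple[str, str]:
    """Pick (provider, model) for an image request (illustrative only)."""
    # An explicit provider wins; otherwise `auto` falls back to the session's.
    chosen = provider or session_provider
    chosen = PROVIDER_ALIASES.get(chosen, chosen)
    if chosen not in DEFAULT_IMAGE_MODEL:
        raise ValueError(
            f"unsupported_target: {chosen!r} has no image profile; set provider explicitly"
        )
    # An explicit model wins; otherwise use the provider's default image model.
    return chosen, model or DEFAULT_IMAGE_MODEL[chosen]
```

With a session on a provider that has no image profile, the sketch raises unless `provider` is passed, which mirrors the second rule in the list.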

OpenAI default:

```json theme={null}
{
  "request": {
    "intent": "generate",
    "prompt": "a clean product render of a matte black desk lamp",
    "provider": "openai",
    "size": "1024x1024",
    "quality": "auto",
    "format": "webp"
  }
}
```

Gemini default:

```json theme={null}
{
  "request": {
    "intent": "generate",
    "prompt": "a widescreen storyboard frame of a rover crossing red dunes",
    "provider": "gemini",
    "size": "1536x1024",
    "provider_params": {
      "aspect_ratio": "16:9",
      "image_size": "2K"
    }
  }
}
```

## Provider Options

### OpenAI

`provider: "openai"` uses the OpenAI image default, currently `gpt-image-2`.

Meerkat fields and OpenAI lowering:

| Field                                | Values                                                        | Notes                                                                                                                                                                                                                                      |
| ------------------------------------ | ------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `size`                               | `auto`, `1024x1024`, `1024x1536`, `1536x1024`, `WIDTHxHEIGHT` | Universal Meerkat field. Lowered to OpenAI `size`.                                                                                                                                                                                         |
| `quality`                            | `auto`, `low`, `medium`, `high`                               | Universal Meerkat field. Lowered to OpenAI `quality`.                                                                                                                                                                                      |
| `format`                             | `auto`, `png`, `jpeg`, `jpg`, `webp`                          | Universal Meerkat field. Lowered to OpenAI `output_format`; `jpg` normalizes to `jpeg`.                                                                                                                                                    |
| `provider_params.background`         | `auto`, `opaque`                                              | Advanced OpenAI override. `gpt-image-2` does not support transparent backgrounds. Use an opaque background and post-process if you need transparency.                                                                                      |
| `provider_params.output_compression` | `0` to `100`                                                  | Advanced OpenAI override. Applies only when the universal `format` is `jpeg` or `webp`.                                                                                                                                                    |
| `provider_params.moderation`         | `auto`, `low`                                                 | Advanced GPT Image moderation strictness override. Omit it for OpenAI's default filtering.                                                                                                                                                 |
| `provider_params.action`             | `auto`, `generate`, `edit`                                    | Advanced hosted Responses image-tool override. Normal Meerkat calls should use top-level `intent` and omit `action`; Images API routes reject it.                                                                                          |
| `provider_params.reasoning_effort`   | `none`, `low`, `medium`, `high`, `xhigh`                      | Hosted `gpt-image-2` route only. Lowered to OpenAI `reasoning.effort`.                                                                                                                                                                     |
| `provider_params.web_search`         | `true`, `false`, `null`, or object                            | Hosted `gpt-image-2` route only. `true`/`{}` enables OpenAI `web_search`; object values may include OpenAI web-search options such as `search_context_size`, `filters`, `user_location`, `external_web_access`, and `return_token_budget`. |

Model behavior:

* `gpt-image-2` uses the hosted Responses image tool internally, but callers still pass Meerkat's `generate_image` request shape.
* On the `azure_openai` backend, hosted image generation requires backend option `image_generation_deployment` (for example `gpt-image-2`). The session model remains the Azure text deployment name, and Meerkat selects the image deployment with Azure's Responses image-generation header.
* `gpt-image-2` accepts flexible `WIDTHxHEIGHT` sizes when they satisfy OpenAI's constraints: both edges are multiples of 16 px, max edge is 3840 px, aspect ratio is at most 3:1, and total pixels are between 655,360 and 8,294,400. Common values include `1024x1024`, `1536x1024`, `1024x1536`, `2048x2048`, `2048x1152`, `3840x2160`, and `2160x3840`.
* Do not copy raw Responses API JSON into `provider_params`. Use Meerkat's universal `size`, `quality`, `format`, and `intent` fields, and reserve `provider_params` for the advanced OpenAI overrides above.
* For image-only requests that depend on fresh or current information, prefer passing `provider_params.web_search` to `generate_image` over running a separate search turn first.
* Do not pass `background: "transparent"` for `gpt-image-2`; OpenAI rejects it. Do not pass `input_fidelity` either: the OpenAI adapter in Meerkat 0.6.6 uses a closed provider-params shape and rejects unknown fields.
* Other OpenAI-owned `gpt-image*` or `dall-e*` models use the Images API path.
* The Images API path currently rejects edit requests, reference images, `action`, `reasoning_effort`, and `web_search`.
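
The `gpt-image-2` size constraints listed above are simple arithmetic and can be pre-checked before sending a custom `WIDTHxHEIGHT` value. The sketch below is a hedged illustration: `gpt_image_2_size_problems` is a hypothetical helper, and it assumes the pixel bounds are inclusive, which matches `3840x2160` (exactly 8,294,400 pixels) appearing among the common values.

```python
from typing import List, Tuple


def parse_size(size: str) -> Tuple[int, int]:
    """Split 'WIDTHxHEIGHT' into integers; raises ValueError if malformed."""
    width_text, _, height_text = size.partition("x")
    return int(width_text), int(height_text)


def gpt_image_2_size_problems(size: str) -> List[str]:
    """Return the constraints from this page that `size` violates."""
    if size == "auto":
        return []
    width, height = parse_size(size)
    problems: List[str] = []
    if width % 16 or height % 16:
        problems.append("both edges must be multiples of 16 px")
    if max(width, height) > 3840:
        problems.append("max edge is 3840 px")
    # Integer form of long/short <= 3, which avoids float rounding.
    if max(width, height) > 3 * min(width, height):
        problems.append("aspect ratio must be at most 3:1")
    if not 655_360 <= width * height <= 8_294_400:
        problems.append("total pixels must be between 655,360 and 8,294,400")
    return problems
```

For example, `512x512` fails only the pixel-count check (262,144 pixels), while `4096x1024` fails both the max-edge and aspect-ratio checks.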

### Gemini

`provider: "gemini"` and `provider: "google"` use the Gemini image default, currently `gemini-3.1-flash-image-preview`.

Supported Gemini image models:

* `gemini-3.1-flash-image-preview`
* `gemini-3-pro-image-preview`
* `gemini-2.5-flash-image`

Gemini `provider_params`:

| Field          | Values                                                              | Notes                                                                                  |
| -------------- | ------------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
| `aspect_ratio` | `1:1`, `16:9`, `9:16`, `square1x1`, `landscape16x9`, `portrait9x16` | Overrides the universal `size` to aspect-ratio mapping.                                |
| `image_size`   | `1K`, `2K`, `4K`, `one_k`, `two_k`, `four_k`                        | Ignored for `gemini-2.5-flash-image`, which does not use an explicit image-size field. |

Gemini image generation runs through a scoped image-model turn internally. The user and model still use one `generate_image` operation; they do not need to call `switch_turn`.
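
A caller can assemble Gemini `provider_params` from the values in the table above. The following is a minimal sketch under stated assumptions: `gemini_provider_params` is a hypothetical helper, and dropping `image_size` for `gemini-2.5-flash-image` is a client-side choice based on the note that the field is ignored for that model.

```python
from typing import Dict, Optional

ASPECT_RATIOS = {"1:1", "16:9", "9:16", "square1x1", "landscape16x9", "portrait9x16"}
IMAGE_SIZES = {"1K", "2K", "4K", "one_k", "two_k", "four_k"}
MODELS_WITHOUT_IMAGE_SIZE = {"gemini-2.5-flash-image"}


def gemini_provider_params(
    model: str,
    aspect_ratio: Optional[str] = None,
    image_size: Optional[str] = None,
) -> Dict[str, str]:
    """Build a provider_params object using only the documented values."""
    params: Dict[str, str] = {}
    if aspect_ratio is not None:
        if aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
        params["aspect_ratio"] = aspect_ratio
    if image_size is not None:
        if image_size not in IMAGE_SIZES:
            raise ValueError(f"unsupported image_size: {image_size!r}")
        # This model ignores image_size, so omit it rather than send a no-op.
        if model not in MODELS_WITHOUT_IMAGE_SIZE:
            params["image_size"] = image_size
    return params
```

The returned object goes under `request.provider_params`, alongside `provider: "gemini"`.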

## Result Shape

`generate_image` returns structured JSON:

```json theme={null}
{
  "operation_id": "01985f61-8f7b-7000-8000-000000000002",
  "terminal": { "terminal": "generated" },
  "images": [
    {
      "image_id": "01985f61-8f7b-7000-8000-000000000003",
      "blob_ref": {
        "blob_id": "sha256:...",
        "media_type": "image/png"
      },
      "media_type": "image/png",
      "width": 1024,
      "height": 1024
    }
  ],
  "provider_text": { "disposition": "not_emitted" },
  "revised_prompt": { "disposition": "not_requested" },
  "native_metadata": {
    "provider": "open_ai",
    "target_model": "gpt-5.4",
    "response_id": "resp_..."
  },
  "warnings": []
}
```

Important fields:

| Field             | Meaning                                                                                                                                                     |
| ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `operation_id`    | Runtime identity for this image operation.                                                                                                                  |
| `terminal`        | Final state: `generated`, `empty_result`, `denied`, `refused_by_provider`, `safety_filtered`, `failed`, `cancelled`, `timeout`, or `scoped_restore_failed`. |
| `images`          | Durable assistant image references. Use `blob_ref.blob_id` with `blob_save_file` inside a run or `rkat blob get` outside the run.                           |
| `provider_text`   | Captured provider-side text when the image backend emits text alongside images.                                                                             |
| `revised_prompt`  | Provider-revised prompt when the backend returns one.                                                                                                       |
| `native_metadata` | Provider-specific metadata such as response ids and target model.                                                                                           |
| `warnings`        | Non-terminal warnings such as fewer returned images, provider execution failure, or blob commit failure.                                                    |
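
Code that consumes the tool result typically checks `terminal` before reading `images`. The following is a small sketch based only on the result shape shown above; `saved_blob_ids` is a hypothetical helper name, and treating every state other than `generated` as an error is an assumption of this example.

```python
from typing import Any, Dict, List


def saved_blob_ids(result: Dict[str, Any]) -> List[str]:
    """Return blob ids from a generate_image result, or raise if it did not succeed."""
    state = result.get("terminal", {}).get("terminal")
    if state != "generated":
        raise RuntimeError(f"image operation ended as {state!r}")
    # Each image carries a durable blob reference usable with `rkat blob get`.
    return [image["blob_ref"]["blob_id"] for image in result.get("images", [])]
```

Each returned id can be passed to `rkat blob get` outside the run or to `blob_save_file` inside it.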

## Troubleshooting

| Symptom                                     | Cause                                                                                               | Fix                                                                                                                        |
| ------------------------------------------- | --------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- |
| `denied` with `unsupported_target`          | No configured image provider owns the requested target, or `auto` resolved to a non-image provider. | Set `provider: "openai"` or `provider: "gemini"` and verify credentials with `rkat doctor`.                                |
| `denied` with `unsupported_count`           | `count` or `n` was greater than `1`.                                                                | Run one image operation per variant.                                                                                       |
| `denied` with `projection_unsupported`      | The selected backend cannot represent the request or provider params.                               | Remove unsupported params, avoid edits/reference images on the OpenAI Images API path, or use the provider default target. |
| `denied` with `realtime_transport_conflict` | The operation would require an internal scoped image-model turn while a realtime binding is active. | Run image generation from a non-realtime session or choose an image target that does not require a scoped override.        |
| `safety_filtered` or `refused_by_provider`  | The provider rejected the prompt or output.                                                         | Revise the prompt and retry.                                                                                               |
| `failed` with `provider_execution_failed`   | Provider call failed after planning.                                                                | Check provider credentials, model access, rate limits, and network connectivity.                                           |
| `failed` with `blob_commit_failed`          | Image bytes were returned but could not be stored.                                                  | Check the realm's blob-store backend and disk permissions.                                                                 |

## Related Docs

* [Tools](/concepts/tools) for how builtins are enabled and scoped.
* [Built-in tools reference](/reference/builtin-tools) for the concise parameter reference and the model-facing tool contract.
* [Providers](/concepts/providers) for provider credentials and model catalog behavior.
