# Providers

> Provider-agnostic: same tools, same sessions across Anthropic, OpenAI, Gemini, and configured self-hosted models.

Meerkat is provider-agnostic at the session/config/runtime model level. You can switch providers by changing the model name, and a configured self-hosted alias such as `gemma-4-31b` behaves like any other model ID in the runtime. Tool visibility and multimodal behavior still depend on model capabilities, so provider/model differences can affect the effective tool surface.

<Note>
  This page is the concept layer for provider abstraction. Use [Auth](/guides/auth) and [Self-hosting models](/guides/self-hosting-models) for setup workflows, and use reference pages for exact capability and contract details.
</Note>

## Provider setup

<Tabs>
  <Tab title="Anthropic">
    ```bash theme={null}
    export ANTHROPIC_API_KEY="sk-ant-..."
    ```

    | Model               | Context | Max output | Best for                                       |
    | ------------------- | ------- | ---------- | ---------------------------------------------- |
    | `claude-fable-5`    | 1M      | 128K       | Most capable Anthropic model (premium pricing) |
    | `claude-opus-4-8`   | 1M      | 128K       | Default Anthropic recommendation               |
    | `claude-sonnet-4-6` | 1M      | 64K        | Balanced performance and cost                  |
    | `claude-sonnet-4-5` | 200K    | 64K        | Legacy supported Sonnet                        |

    ```toml config.toml (active realm) theme={null}
    [agent]
    model = "claude-opus-4-8"
    max_tokens_per_turn = 16384
    ```
  </Tab>

  <Tab title="OpenAI">
    ```bash theme={null}
    export OPENAI_API_KEY="sk-..."
    ```

    | Model            | Context | Best for                                        |
    | ---------------- | ------- | ----------------------------------------------- |
    | `gpt-5.5`        | 1.05M   | Default OpenAI recommendation                   |
    | `gpt-5.5-pro`    | 1.05M   | Long-running pro-tier reasoning                 |
    | `gpt-5.4`        | 1.05M   | Supported GPT-5 fallback                        |
    | `gpt-5.4-mini`   | 128K    | Smaller fast-path model with compaction support |
    | `gpt-5.3-codex`  | 400K    | Coding-focused workflows                        |
    | `gpt-realtime-2` | 128K    | Live audio sessions                             |

    ```toml config.toml (active realm) theme={null}
    [agent]
    model = "gpt-5.5"
    max_tokens_per_turn = 8192
    ```
  </Tab>

  <Tab title="Gemini">
    ```bash theme={null}
    export GOOGLE_API_KEY="AIza..."
    ```

    | Model                           | Context | Best for                                             |
    | ------------------------------- | ------- | ---------------------------------------------------- |
    | `gemini-3.5-flash`              | 1M      | Default Gemini recommendation, stable fast reasoning |
    | `gemini-3.1-pro-preview`        | 1M      | Advanced preview reasoning                           |
    | `gemini-3.1-flash-lite-preview` | 1M      | Lightweight fast-path tasks                          |

    ```toml config.toml (active realm) theme={null}
    [agent]
    model = "gemini-3.5-flash"
    max_tokens_per_turn = 8192
    ```
  </Tab>

  <Tab title="Self-hosted">
    Configure an OpenAI-compatible local or remote server, then register aliases under `self_hosted.models`.

    ```toml config.toml (active realm) theme={null}
    [self_hosted.servers.ollama]
    transport = "openai_compatible"
    base_url = "http://127.0.0.1:11434"
    api_style = "chat_completions"

    [self_hosted.models.gemma-4-31b]
    server = "ollama"
    remote_model = "gemma4:31b"
    display_name = "Gemma 4 31B"
    family = "gemma-4"
    tier = "supported"
    context_window = 256000
    max_output_tokens = 8192
    vision = true
    image_tool_results = true
    inline_video = false
    supports_temperature = true
    supports_thinking = true
    supports_reasoning = true
    ```

    Running a self-hosted alias also requires a realm binding for
    `provider = "self_hosted"` (credentials are realm-owned, not server-entry
    fields). See [Self-hosting models](/guides/self-hosting-models).
  </Tab>
</Tabs>

## Environment variables

| Variable                                                      | Fallback                                            | Provider         |
| ------------------------------------------------------------- | --------------------------------------------------- | ---------------- |
| `RKAT_ANTHROPIC_API_KEY`                                      | `ANTHROPIC_API_KEY`                                 | Anthropic Claude |
| `RKAT_OPENAI_API_KEY`                                         | `OPENAI_API_KEY`                                    | OpenAI GPT       |
| `RKAT_AZURE_OPENAI_API_KEY` plus `RKAT_AZURE_OPENAI_ENDPOINT` | `AZURE_OPENAI_API_KEY` plus `AZURE_OPENAI_ENDPOINT` | Azure OpenAI     |
| `RKAT_GEMINI_API_KEY`                                         | `GEMINI_API_KEY`, `GOOGLE_API_KEY`                  | Google Gemini    |

<Note>
  The `RKAT_*` variants take precedence over provider-native names, so you can run Meerkat with dedicated keys separate from other tools.
</Note>
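The precedence rule can be pictured as an ordered lookup. The following is a minimal sketch, not Meerkat's implementation: `resolve_api_key` and `KEY_LOOKUP_ORDER` are hypothetical names, the relative order of the two Gemini fallbacks is assumed from the table's listing, and treating an empty value as unset is also an assumption.

```python
import os

# Lookup order per provider: the RKAT_* name first, then provider-native names.
# Assumption: the two Gemini fallbacks are tried in the order the table lists them.
KEY_LOOKUP_ORDER = {
    "anthropic": ["RKAT_ANTHROPIC_API_KEY", "ANTHROPIC_API_KEY"],
    "openai": ["RKAT_OPENAI_API_KEY", "OPENAI_API_KEY"],
    "gemini": ["RKAT_GEMINI_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY"],
}


def resolve_api_key(provider, env=None):
    """Return (variable_name, value) for the first non-empty variable, or None."""
    env = os.environ if env is None else env
    for name in KEY_LOOKUP_ORDER[provider]:
        value = env.get(name)
        if value:  # assumption: an empty string counts as unset
            return name, value
    return None
```

With both `RKAT_ANTHROPIC_API_KEY` and `ANTHROPIC_API_KEY` set, the sketch returns the `RKAT_*` value, which is what lets a dedicated Meerkat key coexist with a key used by other tools.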

Self-hosted servers do not use the shared provider env vars above. Server entries carry connection facts only; credentials are owned by a realm auth binding for `provider = "self_hosted"` (`auth_method` of `none`, `api_key`, or `static_bearer`). See [Self-hosting models](/guides/self-hosting-models) for the realm binding shape.

## Image generation providers

The `generate_image` builtin uses provider-specific image profiles behind a single Meerkat request shape. The active session model does not have to be an image model; image operations can route to a provider default or a forced image target while preserving the original session identity.

| Provider | Default image target             | Notes                                                                                                                          |
| -------- | -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| OpenAI   | `gpt-image-2`                    | Uses the hosted Responses image tool by default. Other OpenAI-owned `gpt-image*` or `dall-e*` targets use the Images API path. |
| Gemini   | `gemini-3.1-flash-image-preview` | Also accepts provider alias `google`. Gemini image targets run through an internal scoped image-model turn.                    |

Request `provider_params` are provider-specific and do not replace Meerkat's universal image fields. Use top-level `size`, `quality`, `format`, and `intent`; the OpenAI adapter lowers `format` to provider-side `output_format`.

For the current `gpt-image-2` default, public callers should only need `background`, `output_compression`, `moderation`, and the hosted-tool-only `action`, `reasoning_effort`, and `web_search`:

* Use `background: "auto"` or `"opaque"` (not `"transparent"`).
* Use `output_compression` only with `format: "jpeg"` or `"webp"`.
* Usually omit `action`.
* Omit `input_fidelity`, because Meerkat rejects unknown OpenAI image provider params.

Gemini accepts `aspect_ratio` and `image_size`. See [Image generation](/guides/image-generation) for the exact request shape and troubleshooting.
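The OpenAI rules above can be expressed as a small pre-flight check. This is a hedged sketch rather than Meerkat's validator: `check_openai_image_request` is a hypothetical helper, and the allow-list is assumed to be exactly the parameters named in this section.

```python
# Assumption: the accepted OpenAI image provider params are the ones named above.
ALLOWED_OPENAI_IMAGE_PARAMS = {
    "background", "output_compression", "moderation",
    "action", "reasoning_effort", "web_search",
}


def check_openai_image_request(fmt, provider_params):
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    # Unknown params (for example input_fidelity) are reported, not ignored.
    for key in sorted(provider_params):
        if key not in ALLOWED_OPENAI_IMAGE_PARAMS:
            problems.append(f"unknown provider param: {key}")
    # Only "auto" and "opaque" backgrounds are accepted.
    if "background" in provider_params and provider_params["background"] not in ("auto", "opaque"):
        problems.append('background must be "auto" or "opaque"')
    # Compression only applies to lossy output formats.
    if "output_compression" in provider_params and fmt not in ("jpeg", "webp"):
        problems.append("output_compression requires format jpeg or webp")
    return problems
```

Running a check like this on the caller side surfaces a rejected parameter before the request reaches the provider adapter.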

## SDK feature flags

When using Meerkat as a Rust library, enable only the providers you need:

| Feature         | Description                           | Default |
| --------------- | ------------------------------------- | ------- |
| `anthropic`     | Anthropic Claude support              | Yes     |
| `openai`        | OpenAI GPT support                    | Yes     |
| `gemini`        | Google Gemini support                 | Yes     |
| `all-providers` | All LLM providers (convenience alias) | No      |

<CodeGroup>
  ```toml Anthropic only (smallest binary) theme={null}
  meerkat = { version = "0.7.11", features = ["anthropic", "jsonl-store"] }
  ```

  ```toml All providers theme={null}
  meerkat = { version = "0.7.11", features = ["all-providers", "jsonl-store"] }
  ```
</CodeGroup>

## Provider parameters

Provider-specific options can be passed via the `--param` CLI flag or `provider_params` in the SDK. The tabs below list the parameters each provider accepts.

Provider-native web search is on by default for catalog models that support it.
Disable it in config with the matching `provider_tools.<provider>` search toggle,
or for a single CLI run with `rkat run --no-web-search "..."`.

<Tabs>
  <Tab title="Anthropic">
    | Parameter         | Description                                  |
    | ----------------- | -------------------------------------------- |
    | `thinking_budget` | Token budget for extended thinking (integer) |
    | `top_k`           | Top-k sampling parameter (integer)           |

    ```bash theme={null}
    rkat run --model claude-sonnet-4-6 --param thinking_budget=10000 "Solve this problem"
    ```
  </Tab>

  <Tab title="OpenAI">
    | Parameter          | Values                                   | Description                                                                                 |
    | ------------------ | ---------------------------------------- | ------------------------------------------------------------------------------------------- |
    | `reasoning_effort` | `none`, `low`, `medium`, `high`, `xhigh` | Reasoning effort for GPT-5-era OpenAI models and other compatible OpenAI reasoning surfaces |
    | `seed`             | Integer                                  | Seed for deterministic outputs                                                              |

    ```bash theme={null}
    rkat run --model gpt-5.5 --param reasoning_effort=high "Prove this theorem"
    ```
  </Tab>

  <Tab title="Gemini">
    | Parameter         | Description                                  |
    | ----------------- | -------------------------------------------- |
    | `thinking_budget` | Token budget for extended thinking (integer) |
    | `top_k`           | Top-k sampling parameter (integer)           |

    ```bash theme={null}
    rkat run --model gemini-3.5-flash --param thinking_budget=5000 "Analyze this data"
    ```
  </Tab>
</Tabs>

## Model catalog

Meerkat ships a curated built-in model catalog in the `meerkat-models` crate (`meerkat-core` owns only the vocabulary types and `ModelCatalog` mechanics) and merges it with any configured self-hosted aliases into one effective runtime registry used for capability detection, provider resolution, and catalog responses.

Query the catalog programmatically from any surface:

* **CLI**: `rkat models`
* **RPC**: `models/catalog`
* **REST**: `GET /models/catalog`
* **MCP**: `meerkat_models_catalog`

Configured self-hosted aliases appear under the `self_hosted` provider group and include their backing `server_id`.

For Gemma 4 specifically, prefer `chat_completions` as the default OpenAI-compatible interface. It is the clearest common path for tool calling across Ollama, LM Studio, and vLLM, while reasoning-trace semantics still vary by server.

## Provider resolution

The provider for a model is resolved by exact match against the config-backed
model registry (the compiled-in catalog merged with any configured self-hosted
aliases). There is no name-prefix inference: a model ID resolves to the
provider recorded for its catalog entry (or its self-hosted alias config). An
uncatalogued ID is rejected rather than guessed at, even a prefix-shaped one
like `gpt-unknown-preview` or `claude-unknown-preview`.

A configured self-hosted alias such as `gemma-4-31b` resolves by exact model ID
match, so it works without `--provider`.

You can still override the resolved provider with `--provider` on the CLI or `provider` in API requests.
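As a rough illustration of exact-match resolution, the sketch below merges a tiny stand-in catalog with a self-hosted alias and rejects anything else. The dictionaries and `resolve_provider` are hypothetical; the real catalog lives in the `meerkat-models` crate, and how an explicit override interacts with uncatalogued IDs is not specified here, so the sketch simply lets the override win.

```python
# Illustrative entries only; not the full built-in catalog.
BUILTIN_CATALOG = {
    "claude-opus-4-8": "anthropic",
    "gpt-5.5": "openai",
    "gemini-3.5-flash": "gemini",
}
SELF_HOSTED_ALIASES = {"gemma-4-31b": "self_hosted"}


def resolve_provider(model_id, override=None):
    """Exact-match lookup with no prefix inference; unknown IDs are rejected."""
    if override is not None:
        # Assumption: an explicit provider override short-circuits the lookup.
        return override
    registry = {**BUILTIN_CATALOG, **SELF_HOSTED_ALIASES}
    if model_id not in registry:
        raise KeyError(f"unknown model id: {model_id}")
    return registry[model_id]
```

Here `resolve_provider("gemma-4-31b")` returns `"self_hosted"`, while `resolve_provider("gpt-unknown-preview")` raises instead of guessing `openai` from the prefix.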

## Model fallback chain

Factory-built agents can use an ordered model fallback chain when the active
model reaches a typed, recoverable LLM failure boundary. This protects long
sessions from provider-side disruptions such as removed models, rate limits,
provider overload, auth failures, and context-window overflow.

Fallback is enabled by default. With no explicit chain, Meerkat builds a
catalog-owned backup order from the configured provider defaults and the global
catalog default, excluding the active model and duplicate targets. You can
replace that order in realm config:

```toml config.toml theme={null}
[model_fallback]
enabled = true

[[model_fallback.chain]]
model = "claude-opus-4-8"
provider = "anthropic"

[[model_fallback.chain]]
model = "gpt-5.5"
provider = "openai"
auth_binding = { realm = "dev", binding = "openai_oauth" }

[[model_fallback.chain]]
model = "gemma-4-31b"
provider = "self_hosted"
auth_binding = { realm = "lab", binding = "ollama" }
```

`provider` is optional when the model exists in the effective registry.
`auth_binding` is optional; when omitted, the target resolves through the same
provider-runtime registry used for normal session creation. The same
model/provider with a different `auth_binding` is a distinct fallback target,
so custom chains can fail over to another credential realm. For the catalog
default chain, Meerkat keeps candidates inside the selected non-env realm when
that realm has a matching provider binding; unavailable catalog-default
candidates are skipped rather than silently bleeding into another realm. For an
explicit custom chain, an unavailable target is a configuration error.
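The default-chain construction and the role of `auth_binding` in target identity can be sketched as follows. This is an assumption-laden illustration: `Target` and `build_default_chain` are hypothetical names, the candidate order (provider defaults first, then the global default) is assumed, and the sketch compares whole targets when excluding the active one.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    model: str
    provider: str
    auth_binding: Optional[str] = None  # part of a target's identity


def build_default_chain(active, provider_defaults, global_default):
    """Order candidates, dropping the active target and any duplicates."""
    chain = []
    seen = {active}
    # Assumption: provider defaults come first, the global default last.
    for candidate in [*provider_defaults, global_default]:
        if candidate in seen:
            continue
        seen.add(candidate)
        chain.append(candidate)
    return chain
```

Because `auth_binding` is a field of the tuple, `Target("gpt-5.5", "openai")` and `Target("gpt-5.5", "openai", "dev/openai_oauth")` compare as different targets, which mirrors how a custom chain can list the same model twice under different credentials.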

When a switch is applied, Meerkat updates the session LLM identity, request
policy, auth lease, provider parameters, output-token ceiling, and capability
base filter before retrying. The agent receives a hidden system notice with the
source model, fallback model, failure reason, skipped targets, active model
limits, and any tools hidden by the fallback model's capabilities.

For structured-output extraction turns, fallback keeps the extraction request
deterministic: Meerkat reapplies the output schema for the new provider and
keeps provider-native web search/grounding disabled on the retry.

Capability changes are expected. For example, falling back from a 1M-context
vision model to a 128K local model can clamp output tokens, hide image-result
tools, and skip the local target entirely if the failure was a context overflow
larger than that target's context window. Later turns remain sticky to the
active fallback model until the session is explicitly hot-swapped or rebuilt.
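The skip-and-clamp behavior described above can be sketched as a walk over the chain. Everything here is illustrative: `pick_fallback`, the model names, and the token limits are hypothetical and are not taken from the Meerkat catalog.

```python
# Hypothetical targets with made-up limits, in chain order.
EXAMPLE_CHAIN = [
    {"model": "local-128k", "context_window": 128_000, "max_output_tokens": 8_192},
    {"model": "cloud-1m", "context_window": 1_000_000, "max_output_tokens": 64_000},
]


def pick_fallback(chain, required_context_tokens, requested_output_tokens):
    """Return (model, clamped_output_tokens, skipped_models), or None if nothing fits."""
    skipped = []
    for target in chain:
        # A target whose window cannot hold the context is skipped entirely.
        if required_context_tokens > target["context_window"]:
            skipped.append(target["model"])
            continue
        # Output tokens are clamped to the target's ceiling.
        clamped = min(requested_output_tokens, target["max_output_tokens"])
        return target["model"], clamped, skipped
    return None
```

With a 300,000-token context, the sketch skips `local-128k` and selects `cloud-1m`; with a 50,000-token context it selects `local-128k` and clamps a 16,384-token output request to 8,192.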

Meerkat does not fall back on call/network timeouts, and it suppresses
cross-model fallback for any retryable error after user-visible text or
reasoning stream output has been emitted. Ordinary same-model retry policy can
still apply when the recovery authority permits it.
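The eligibility rules in this section can be condensed into one decision function. This is a simplified sketch: the failure-kind labels and `should_switch_model` are hypothetical, and the sketch suppresses fallback for every failure kind once output has streamed, which is stricter than the "retryable error" wording above.

```python
# Failure kinds named in this section; the string labels are illustrative.
FALLBACK_ELIGIBLE = {
    "model_removed", "rate_limited", "provider_overloaded",
    "auth_failed", "context_overflow",
}


def should_switch_model(failure_kind, output_already_streamed):
    """Decide whether a failure may trigger cross-model fallback."""
    if failure_kind == "timeout":
        return False  # call/network timeouts never trigger fallback
    if output_already_streamed:
        return False  # the user has already seen text or reasoning output
    return failure_kind in FALLBACK_ELIGIBLE
```

A `False` result only rules out switching models; same-model retries are a separate decision.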

## See also

* [Auth & bindings](/concepts/auth-and-bindings)
* [Self-hosting models](/guides/self-hosting-models)
* [Built-in tools reference](/reference/builtin-tools)
* [Capability matrix](/reference/capability-matrix)
