Meerkat is provider-agnostic at the session/config/runtime model level. You can switch providers by changing the model name, and a configured self-hosted alias such as gemma-4-31b behaves like any other model ID in the runtime. Tool visibility and multimodal behavior still depend on model capabilities, so provider/model differences can affect the effective tool surface.
This page is the concept layer for provider abstraction. Use Auth and Self-hosting models for setup workflows, and use reference pages for exact capability and contract details.

Provider setup

export ANTHROPIC_API_KEY="sk-ant-..."
| Model | Context | Max output | Best for |
| --- | --- | --- | --- |
| claude-fable-5 | 1M | 128K | Most capable Anthropic model (premium pricing) |
| claude-opus-4-8 | 1M | 128K | Default Anthropic recommendation |
| claude-sonnet-4-6 | 1M | 64K | Balanced performance and cost |
| claude-sonnet-4-5 | 200K | 64K | Legacy supported Sonnet |
config.toml (active realm)
[agent]
model = "claude-opus-4-8"
max_tokens_per_turn = 16384

Environment variables

| Variable | Fallback | Provider |
| --- | --- | --- |
| RKAT_ANTHROPIC_API_KEY | ANTHROPIC_API_KEY | Anthropic Claude |
| RKAT_OPENAI_API_KEY | OPENAI_API_KEY | OpenAI GPT |
| RKAT_AZURE_OPENAI_API_KEY plus RKAT_AZURE_OPENAI_ENDPOINT | AZURE_OPENAI_API_KEY plus AZURE_OPENAI_ENDPOINT | Azure OpenAI |
| RKAT_GEMINI_API_KEY | GEMINI_API_KEY, GOOGLE_API_KEY | Google Gemini |
The RKAT_* variants take precedence over provider-native names, so you can run Meerkat with dedicated keys separate from other tools.
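The precedence rule can be pictured with a small Python sketch. This is not Meerkat's implementation: the function name `resolve_api_key` is hypothetical, the order between GEMINI_API_KEY and GOOGLE_API_KEY is assumed from the order in the table, and treating an empty value as unset is also an assumption. Azure OpenAI is left out because it needs a key and an endpoint together.

```python
import os
from typing import Mapping, Optional

# Lookup order per provider: the RKAT_* name first, then the provider-native
# fallbacks. Variable names come from the table above; everything else in this
# sketch is illustrative.
KEY_LOOKUP_ORDER = {
    "anthropic": ["RKAT_ANTHROPIC_API_KEY", "ANTHROPIC_API_KEY"],
    "openai": ["RKAT_OPENAI_API_KEY", "OPENAI_API_KEY"],
    "gemini": ["RKAT_GEMINI_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY"],
}


def resolve_api_key(
    provider: str, env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Return the first non-empty key in lookup order, or None if none is set."""
    env = os.environ if env is None else env
    for name in KEY_LOOKUP_ORDER[provider]:
        value = env.get(name)
        if value:  # assumption: an empty string counts as unset
            return value
    return None
```

Passing an explicit mapping instead of reading `os.environ` makes the precedence easy to check in isolation.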
Self-hosted servers do not use the shared provider env vars above. Server entries carry connection facts only; credentials are owned by a realm auth binding for provider = "self_hosted" (auth_method of none, api_key, or static_bearer). See Self-hosting models for the realm binding shape.

Image generation providers

The generate_image builtin uses provider-specific image profiles behind a single Meerkat request shape. The active session model does not have to be an image model; image operations can route to a provider default or a forced image target while preserving the original session identity.
| Provider | Default image target | Notes |
| --- | --- | --- |
| OpenAI | gpt-image-2 | Uses the hosted Responses image tool by default. Other OpenAI-owned gpt-image* or dall-e* targets use the Images API path. |
| Gemini | gemini-3.1-flash-image-preview | Also accepts provider alias google. Gemini image targets run through an internal scoped image-model turn. |
Request provider_params are provider-specific and do not replace Meerkat’s universal image fields. Use the top-level size, quality, format, and intent fields; the OpenAI adapter lowers format to the provider-side output_format. With the current gpt-image-2 default, public callers should need only background, output_compression, and moderation, plus the hosted-tool-only action, reasoning_effort, and web_search. Set background to "auto" or "opaque" (not "transparent"), use output_compression only with format: "jpeg" or "webp", and usually omit action. Omit input_fidelity, because Meerkat rejects unknown OpenAI image provider params. Gemini accepts aspect_ratio and image_size. See Image generation for the exact request shape and troubleshooting.
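The OpenAI guidance above can be turned into a client-side pre-flight check. The following Python sketch is only an illustration: `check_openai_image_request` is a hypothetical helper, and the allowlist is assumed to be exactly the parameters named in the paragraph above, which may not match Meerkat's real validation.

```python
from typing import Any, List, Mapping, Optional

# Assumption: the accepted OpenAI image provider params are exactly the ones
# named in the text above. Meerkat's actual allowlist may differ.
ALLOWED_OPENAI_PARAMS = {
    "background",
    "output_compression",
    "moderation",
    "action",
    "reasoning_effort",
    "web_search",
}


def check_openai_image_request(
    fmt: Optional[str], provider_params: Mapping[str, Any]
) -> List[str]:
    """Return a list of problems; an empty list means the request follows the guidance."""
    problems = []
    # Unknown params (for example input_fidelity) are reported first.
    for name in sorted(set(provider_params) - ALLOWED_OPENAI_PARAMS):
        problems.append(f"unknown provider param: {name}")
    background = provider_params.get("background")
    if background is not None and background not in ("auto", "opaque"):
        problems.append("background must be auto or opaque")
    if "output_compression" in provider_params and fmt not in ("jpeg", "webp"):
        problems.append("output_compression requires format jpeg or webp")
    return problems
```

A check like this only mirrors the documented rules; the server remains the authority on what is accepted.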

SDK feature flags

When using Meerkat as a Rust library, enable only the providers you need:
| Feature | Description | Default |
| --- | --- | --- |
| anthropic | Anthropic Claude support | Yes |
| openai | OpenAI GPT support | Yes |
| gemini | Google Gemini support | Yes |
| all-providers | All LLM providers (convenience alias) | No |
meerkat = { version = "0.7.11", features = ["anthropic", "jsonl-store"] }

Provider parameters

Provider-specific options can be passed via the --param CLI flag or provider_params in the SDK:
| Parameter | Description |
| --- | --- |
| thinking_budget | Token budget for extended thinking (integer) |
| top_k | Top-k sampling parameter (integer) |
rkat run --model claude-sonnet-4-6 --param thinking_budget=10000 "Solve this problem"
Provider-native web search is on by default for catalog models that support it. Disable it in config with the matching provider_tools.<provider> search toggle, or for a single CLI run with rkat run --no-web-search "...".

Model catalog

Meerkat ships a curated built-in model catalog in the meerkat-models crate (meerkat-core owns only the vocabulary types and ModelCatalog mechanics) and merges it with any configured self-hosted aliases into one effective runtime registry used for capability detection, provider resolution, and catalog responses. Query the catalog programmatically from any surface:
  • CLI: rkat models
  • RPC: models/catalog
  • REST: GET /models/catalog
  • MCP: meerkat_models_catalog
Configured self-hosted aliases appear under the self_hosted provider group and include their backing server_id. For Gemma 4 specifically, prefer chat_completions as the default OpenAI-compatible interface. It is the clearest common path for tool calling across Ollama, LM Studio, and vLLM, while reasoning-trace semantics still vary by server.

Provider resolution

The provider for a model is resolved by exact match against the config-backed model registry (the compiled-in catalog merged with any configured self-hosted aliases). There is no name-prefix inference: a model id resolves to the provider recorded for its catalog entry (or its self-hosted alias config), and an uncatalogued id — even a prefix-shaped one like gpt-unknown-preview or claude-unknown-preview — is rejected rather than guessed at. A configured self-hosted alias such as gemma-4-31b resolves by exact model ID match, so it works without --provider. You can still override this with --provider on the CLI or provider in API requests.
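The exact-match rule can be sketched in a few lines of Python. This is a simplified model, not Meerkat's code: `BUILTIN_CATALOG`, `build_registry`, and `resolve_provider` are hypothetical names, the catalog holds only a few entries taken from this page, and letting an explicit override win unconditionally is an assumption.

```python
from typing import Dict, Iterable, Optional

# Hypothetical stand-in for the compiled-in catalog (model id -> provider).
BUILTIN_CATALOG: Dict[str, str] = {
    "claude-opus-4-8": "anthropic",
    "claude-sonnet-4-6": "anthropic",
    "gpt-5.5": "openai",
}


def build_registry(self_hosted_aliases: Iterable[str]) -> Dict[str, str]:
    """Merge the built-in catalog with configured self-hosted aliases."""
    registry = dict(BUILTIN_CATALOG)
    for alias in self_hosted_aliases:
        registry[alias] = "self_hosted"
    return registry


def resolve_provider(
    model: str, registry: Dict[str, str], override: Optional[str] = None
) -> str:
    """Resolve by exact id match; never infer a provider from a name prefix."""
    if override is not None:  # assumption: an explicit provider always wins
        return override
    try:
        return registry[model]
    except KeyError:
        # Prefix-shaped ids such as gpt-unknown-preview are rejected, not guessed.
        raise LookupError(f"unknown model id: {model!r}") from None
```

With a registry built from `["gemma-4-31b"]`, the alias resolves to `self_hosted` without any override, while `gpt-unknown-preview` raises `LookupError`.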

Model fallback chain

Factory-built agents can use an ordered model fallback chain when the active model reaches a typed, recoverable LLM failure boundary. This protects long sessions from provider-side disruptions such as removed models, rate limits, provider overload, auth failures, and context-window overflow. Fallback is enabled by default. With no explicit chain, Meerkat builds a catalog-owned backup order from the configured provider defaults and the global catalog default, excluding the active model and duplicate targets. You can replace that order in realm config:
config.toml
[model_fallback]
enabled = true

[[model_fallback.chain]]
model = "claude-opus-4-8"
provider = "anthropic"

[[model_fallback.chain]]
model = "gpt-5.5"
provider = "openai"
auth_binding = { realm = "dev", binding = "openai_oauth" }

[[model_fallback.chain]]
model = "gemma-4-31b"
provider = "self_hosted"
auth_binding = { realm = "lab", binding = "ollama" }
provider is optional when the model exists in the effective registry. auth_binding is optional; when omitted, the target resolves through the same provider-runtime registry used for normal session creation. The same model/provider with a different auth_binding is a distinct fallback target, so custom chains can fail over to another credential realm.
For the catalog default chain, Meerkat keeps candidates inside the selected non-env realm when that realm has a matching provider binding; unavailable catalog-default candidates are skipped rather than silently bleeding into another realm. For an explicit custom chain, an unavailable target is a configuration error.
When a switch is applied, Meerkat updates the session LLM identity, request policy, auth lease, provider parameters, output-token ceiling, and capability base filter before retrying. The agent receives a hidden system notice with the source model, fallback model, failure reason, skipped targets, active model limits, and any tools hidden by the fallback model’s capabilities.
For structured-output extraction turns, fallback keeps the extraction request deterministic: Meerkat reapplies the output schema for the new provider and keeps provider-native web search/grounding disabled on the retry.
Capability changes are expected. For example, falling back from a 1M-context vision model to a 128K local model can clamp output tokens and hide image-result tools. If the failure was a context overflow larger than the local target’s context window, Meerkat skips that target entirely. Later turns remain sticky to the active fallback model until the session is explicitly hot-swapped or rebuilt.
Meerkat does not fall back on call/network timeouts, and it suppresses cross-model fallback for any retryable error after user-visible text or reasoning stream output has been emitted. Ordinary same-model retry policy can still apply when the recovery authority permits it.
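Two of the rules in this section, building the default order and skipping a target that is too small after a context overflow, can be sketched in Python. This is a simplified model under stated assumptions: `Target`, `default_chain`, and `next_target` are hypothetical names, the failure labels are invented for illustration, and the 128K window given to gemma-4-31b is a placeholder rather than a documented value.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Target:
    model: str
    provider: str
    context_window: int  # tokens
    # Same model/provider with a different binding compares as a distinct target.
    auth_binding: Optional[str] = None


def default_chain(
    active: Target, provider_defaults: Sequence[Target], global_default: Target
) -> List[Target]:
    """Provider defaults, then the global default, minus the active model and duplicates."""
    chain: List[Target] = []
    for candidate in [*provider_defaults, global_default]:
        if candidate == active or candidate in chain:
            continue
        chain.append(candidate)
    return chain


def next_target(
    chain: Sequence[Target], failure: str, required_context: int = 0
) -> Optional[Target]:
    """Pick the first usable target; after an overflow, skip windows that are too small."""
    for target in chain:
        if failure == "context_overflow" and target.context_window < required_context:
            continue
        return target
    return None
```

For a 150K-token request that overflowed, a 128K target is skipped and the next larger one is chosen; for a rate-limit failure, the first target in the chain is used.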
