Provider-agnostic: same tools, same sessions across Anthropic, OpenAI, Gemini, and configured self-hosted models.
Meerkat is provider-agnostic at the session/config/runtime model level. You can switch providers by changing the model name, and a configured self-hosted alias such as gemma-4-31b behaves like any other model ID in the runtime. Tool visibility and multimodal behavior still depend on model capabilities, so provider/model differences can affect the effective tool surface.
This page is the concept layer for provider abstraction. Use Auth and Self-hosting models for setup workflows, and use reference pages for exact capability and contract details.
Running a self-hosted alias also requires a realm binding for
provider = "self_hosted" (credentials are realm-owned, not server-entry
fields). See Self-hosting models.
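| Provider | Provider-native variables | Meerkat-specific variables |
| --- | --- | --- |
| Azure OpenAI | AZURE_OPENAI_API_KEY plus AZURE_OPENAI_ENDPOINT | RKAT_AZURE_OPENAI_API_KEY plus RKAT_AZURE_OPENAI_ENDPOINT |
| Google Gemini | GEMINI_API_KEY, GOOGLE_API_KEY | RKAT_GEMINI_API_KEY |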
The RKAT_* variants take precedence over provider-native names, so you can run Meerkat with dedicated keys separate from other tools.
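The precedence rule above can be pictured with a minimal Python sketch. This is an illustration, not Meerkat's implementation: the provider labels, the `resolve_api_key` helper, and the relative order of GEMINI_API_KEY and GOOGLE_API_KEY are assumptions made for the example; only the "RKAT_* first" rule comes from the text.

```python
from typing import Mapping, Optional

# Lookup order per provider: RKAT_* names first, then provider-native names.
# Provider labels are hypothetical, and the relative order of GEMINI_API_KEY
# and GOOGLE_API_KEY is an assumption.
KEY_LOOKUP_ORDER = {
    "gemini": ["RKAT_GEMINI_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY"],
    "azure_openai": ["RKAT_AZURE_OPENAI_API_KEY", "AZURE_OPENAI_API_KEY"],
}


def resolve_api_key(provider: str, env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty key for `provider`, or None if none is set."""
    for name in KEY_LOOKUP_ORDER.get(provider, []):
        value = env.get(name)
        if value:
            return value
    return None


# Both names are set; the Meerkat-specific one wins.
env = {"GEMINI_API_KEY": "shared-key", "RKAT_GEMINI_API_KEY": "meerkat-key"}
print(resolve_api_key("gemini", env))  # meerkat-key
```

In practice you would pass `os.environ` as `env`; a plain dict is used here so the example has no side effects.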
Self-hosted servers do not use the shared provider env vars above. Server entries carry connection facts only; credentials are owned by a realm auth binding for provider = "self_hosted" (auth_method of none, api_key, or static_bearer). See Self-hosting models for the realm binding shape.
The generate_image builtin uses provider-specific image profiles behind a single Meerkat request shape. The active session model does not have to be an image model; image operations can route to a provider default or a forced image target while preserving the original session identity.
| Provider | Default image target | Notes |
| --- | --- | --- |
| OpenAI | gpt-image-2 | Uses the hosted Responses image tool by default. Other OpenAI-owned gpt-image* or dall-e* targets use the Images API path. |
| Gemini | gemini-3.1-flash-image-preview | Also accepts provider alias google. Gemini image targets run through an internal scoped image-model turn. |
Request provider_params are provider-specific and do not replace Meerkat’s universal image fields. Use top-level size, quality, format, and intent; the OpenAI adapter lowers format to provider-side output_format. For the current gpt-image-2 default, public callers should only need background, output_compression, moderation, hosted-tool-only action, hosted-tool-only reasoning_effort, and hosted-tool-only web_search; use background: "auto" or "opaque" (not "transparent"), use output_compression only with format: "jpeg" or "webp", usually omit action, and omit input_fidelity because Meerkat rejects unknown OpenAI image provider params. Gemini accepts aspect_ratio and image_size. See Image generation for the exact request shape and troubleshooting.
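The OpenAI constraints above can be expressed as a small pre-flight check. This is a hedged sketch rather than Meerkat's validator: `check_openai_image_request` is a hypothetical helper, and treating the listed params as the complete allow-list is an assumption based on the statement that unknown OpenAI image provider params are rejected.

```python
from typing import Any, Dict, List

# Params the text lists for the gpt-image-2 default. Treating this as the
# complete allow-list is an assumption made for this sketch.
ALLOWED_OPENAI_IMAGE_PARAMS = {
    "background", "output_compression", "moderation",
    "action", "reasoning_effort", "web_search",
}


def check_openai_image_request(fmt: str, provider_params: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the request looks fine."""
    problems = []
    # Unknown params (for example input_fidelity) are reported, not ignored.
    for name in sorted(provider_params):
        if name not in ALLOWED_OPENAI_IMAGE_PARAMS:
            problems.append(f"unknown provider param: {name}")
    # background may be "auto" or "opaque", but not "transparent".
    background = provider_params.get("background")
    if background is not None and background not in ("auto", "opaque"):
        problems.append('background must be "auto" or "opaque"')
    # output_compression only applies to jpeg or webp output.
    if "output_compression" in provider_params and fmt not in ("jpeg", "webp"):
        problems.append("output_compression requires format jpeg or webp")
    return problems


print(check_openai_image_request("png", {"output_compression": 80, "input_fidelity": "high"}))
```

Here `fmt` stands for the top-level format field, which stays outside provider_params as described above.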
Provider-specific options can be passed via the --param CLI flag or provider_params in the SDK. The parameters each provider accepts are listed below.

Provider-native web search is on by default for catalog models that support it. Disable it in config with the matching provider_tools.<provider> search toggle, or for a single CLI run with rkat run --no-web-search "...".
Anthropic

| Parameter | Description |
| --- | --- |
| thinking_budget | Token budget for extended thinking (integer) |
| top_k | Top-k sampling parameter (integer) |

    rkat run --model claude-sonnet-4-6 --param thinking_budget=10000 "Solve this problem"

OpenAI

| Parameter | Values | Description |
| --- | --- | --- |
| reasoning_effort | none, low, medium, high, xhigh | Reasoning effort for GPT-5-era OpenAI models and other compatible OpenAI reasoning surfaces |
| seed | Integer | Seed for deterministic outputs |

    rkat run --model gpt-5.5 --param reasoning_effort=high "Prove this theorem"

Gemini

| Parameter | Description |
| --- | --- |
| thinking_budget | Token budget for extended thinking (integer) |
| top_k | Top-k sampling parameter (integer) |

    rkat run --model gemini-3.5-flash --param thinking_budget=5000 "Analyze this data"
Meerkat ships a curated built-in model catalog in the meerkat-models crate (meerkat-core owns only the vocabulary types and ModelCatalog mechanics) and merges it with any configured self-hosted aliases into one effective runtime registry used for capability detection, provider resolution, and catalog responses.

Query the catalog programmatically from any surface:
CLI: rkat models
RPC: models/catalog
REST: GET /models/catalog
MCP: meerkat_models_catalog
Configured self-hosted aliases appear under the self_hosted provider group and include their backing server_id.

For Gemma 4 specifically, prefer chat_completions as the default OpenAI-compatible interface. It is the clearest common path for tool calling across Ollama, LM Studio, and vLLM, while reasoning-trace semantics still vary by server.
The provider for a model is resolved by exact match against the config-backed model registry (the compiled-in catalog merged with any configured self-hosted aliases). There is no name-prefix inference: a model id resolves to the provider recorded for its catalog entry (or its self-hosted alias config), and an uncatalogued id, even a prefix-shaped one like gpt-unknown-preview or claude-unknown-preview, is rejected rather than guessed at.

A configured self-hosted alias such as gemma-4-31b resolves by exact model ID match, so it works without --provider.

You can still override this with --provider on the CLI or provider in API requests.
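The exact-match rule can be illustrated with a short Python sketch. It is a simplified model of the behavior described above, not the real registry: `resolve_provider`, `UnknownModelError`, the provider labels anthropic and openai, and the choice to let an explicit override bypass the lookup are all assumptions made for the example.

```python
from typing import Dict, Optional


class UnknownModelError(ValueError):
    """Raised when a model id has no entry in the effective registry."""


# Hypothetical registry contents: built-in catalog entries merged with
# configured self-hosted aliases. Provider labels are illustrative.
BUILTIN_CATALOG: Dict[str, str] = {
    "claude-sonnet-4-6": "anthropic",
    "gpt-5.5": "openai",
    "gemini-3.5-flash": "gemini",
}
SELF_HOSTED_ALIASES: Dict[str, str] = {"gemma-4-31b": "self_hosted"}
REGISTRY: Dict[str, str] = {**BUILTIN_CATALOG, **SELF_HOSTED_ALIASES}


def resolve_provider(model_id: str, override: Optional[str] = None) -> str:
    """Resolve a provider by exact model id; never infer from a name prefix."""
    if override is not None:
        # Assumption: an explicit provider override is used as given.
        return override
    try:
        return REGISTRY[model_id]
    except KeyError:
        raise UnknownModelError(f"model not in registry: {model_id}") from None


print(resolve_provider("gemma-4-31b"))  # self_hosted
```

Calling `resolve_provider("gpt-unknown-preview")` raises `UnknownModelError` in this sketch, mirroring the rule that prefix-shaped ids are rejected rather than guessed at.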
Factory-built agents can use an ordered model fallback chain when the active
model reaches a typed, recoverable LLM failure boundary. This protects long
sessions from provider-side disruptions such as removed models, rate limits,
provider overload, auth failures, and context-window overflow.

Fallback is enabled by default. With no explicit chain, Meerkat builds a
catalog-owned backup order from the configured provider defaults and the global
catalog default, excluding the active model and duplicate targets. You can
replace that order in realm config:
provider is optional when the model exists in the effective registry.
auth_binding is optional; when omitted, the target resolves through the same
provider-runtime registry used for normal session creation. The same
model/provider with a different auth_binding is a distinct fallback target,
so custom chains can fail over to another credential realm. For the catalog
default chain, Meerkat keeps candidates inside the selected non-env realm when
that realm has a matching provider binding; unavailable catalog-default
candidates are skipped rather than silently bleeding into another realm. For an
explicit custom chain, an unavailable target is a configuration error.

When a switch is applied, Meerkat updates the session LLM identity, request
policy, auth lease, provider parameters, output-token ceiling, and capability
base filter before retrying. The agent receives a hidden system notice with the
source model, fallback model, failure reason, skipped targets, active model
limits, and any tools hidden by the fallback model’s capabilities.

For structured-output extraction turns, fallback keeps the extraction request
deterministic: Meerkat reapplies the output schema for the new provider and
keeps provider-native web search/grounding disabled on the retry.

Capability changes are expected. For example, falling back from a 1M-context
vision model to a 128K local model can clamp output tokens, hide image-result
tools, and skip the local target entirely if the failure was a context overflow
larger than that target’s context window. Later turns remain sticky to the
active fallback model until the session is explicitly hot-swapped or rebuilt.

Meerkat does not fall back on call/network timeouts, and it suppresses
cross-model fallback for any retryable error after user-visible text or
reasoning stream output has been emitted. Ordinary same-model retry policy can
still apply when the recovery authority permits it.
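The selection rules in this section can be summarized in a rough Python sketch. It is a simplified model under stated assumptions, not Meerkat's runtime: `Target`, `default_chain`, `next_target`, the failure labels, and the model names are hypothetical, and the sketch ignores realms, auth bindings, and same-model retry policy.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Target:
    model: str           # hypothetical model id
    context_window: int  # tokens


def default_chain(active: str, candidates: Iterable[Target]) -> List[Target]:
    """Keep candidate order, dropping the active model and duplicate targets."""
    seen = {active}
    chain = []
    for target in candidates:
        if target.model not in seen:
            seen.add(target.model)
            chain.append(target)
    return chain


def next_target(chain: List[Target], failure: str, stream_output_emitted: bool,
                needed_context: int = 0) -> Optional[Target]:
    """Pick the next fallback target, or None when no switch should happen."""
    # No cross-model fallback on timeouts or once stream output was emitted
    # (simplified: the text scopes the latter to retryable errors).
    if failure == "timeout" or stream_output_emitted:
        return None
    for target in chain:
        # After a context overflow, skip targets whose window is too small.
        if failure == "context_overflow" and needed_context > target.context_window:
            continue
        return target
    return None


chain = default_chain("cloud-1m", [
    Target("cloud-1m", 1_000_000),
    Target("local-128k", 128_000),
    Target("cloud-backup", 1_000_000),
])
print(next_target(chain, "context_overflow", False, needed_context=300_000))
```

In this run the 128K target is skipped because the overflowing request needs more context than it offers, so the sketch selects the larger backup target.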