# CLI configuration

> Realm selection, storage behavior, environment variables, MCP config, and exit codes for rkat.

## Global runtime scope flags

All CLI commands accept realm scope flags:

* `--realm <id>`
* `--isolated`
* `--instance <id>`
* `--realm-backend <sqlite|jsonl|memory>` (creation hint only)
* `--state-root <path>`
* `--context-root <path>`

These flags determine which realm's config and session state a command uses.

## Default realm behavior

| Command surface                                     | Default when `--realm` is omitted         |
| --------------------------------------------------- | ----------------------------------------- |
| `rkat run`, `rkat run --resume`, `rkat session ...` | Workspace-derived stable realm (`ws-...`) |
| `rkat mob ...`                                      | Workspace-derived stable realm (`ws-...`) |
| `rkat-rpc`                                          | New opaque realm (`realm-...`)            |

To share state between the CLI and `rkat-rpc`, pass the same explicit `--realm` to both.

For CLI commands, the default context root is the current directory. The default
state root is `<context-root>/.rkat/realms`.

Use `--verbose` on `rkat run` to print the active realm, context root, and
physical realm root. See the [realm guide](/guides/realms) for the full
identity-vs-storage model.

## Environment variables

Required API keys (at least one):

| Variable                                            | Provider         |
| --------------------------------------------------- | ---------------- |
| `ANTHROPIC_API_KEY`                                 | Anthropic Claude |
| `OPENAI_API_KEY`                                    | OpenAI GPT       |
| `AZURE_OPENAI_API_KEY` plus `AZURE_OPENAI_ENDPOINT` | Azure OpenAI     |
| `GOOGLE_API_KEY` / `GEMINI_API_KEY`                 | Google Gemini    |

See [providers](/concepts/providers) for full key precedence (`RKAT_*` variants included).

## Config files

Canonical runtime config for CLI commands is realm-scoped
(`<context-root>/.rkat/realms/<realm>/config.toml` by default).

Compatibility files still exist:

| Scope   | Path                  |
| ------- | --------------------- |
| User    | `~/.rkat/config.toml` |
| Project | `.rkat/config.toml`   |

These are useful for templates (`rkat init`) and compatibility workflows, but realm config is the runtime source of truth for CLI/RPC/REST/MCP surfaces.

Common tool gates:

```toml
[tools]
builtins_enabled = false
shell_enabled = false
schedule_enabled = true
workgraph_enabled = false
```

## Custom model registry entries

Uncatalogued models served by a first-party API provider are declared once
under `[models.<id>]`, next to the per-provider default model strings. One
definition feeds provider inference, compaction scaling, capability gates, and
call timeouts:

```toml
[models]
anthropic = "claude-opus-4-8"

[models.claude-internal-preview]
provider = "anthropic"          # required: anthropic | openai | gemini
display_name = "Claude Internal Preview"
context_window = 500000          # drives compaction scaling
max_output_tokens = 16384
vision = true                    # capability flags default to false
web_search = false
call_timeout_secs = 900
```

`provider` must parse as one of the closed set of provider names (`anthropic`,
`openai`, `gemini`) and fails closed on unknown names. Self-hosted models belong
under `[self_hosted.models]` (below), not `[models]`. The same `[models.<id>]`
table shape is accepted inside a mob definition (`mob.toml`) for mob-scoped models.

## Self-hosted model config

Self-hosted models are defined directly in realm config:

```toml
[self_hosted.servers.local]
transport = "openai_compatible"
base_url = "http://127.0.0.1:11434"
api_style = "chat_completions"

[self_hosted.models.gemma-4-31b]
server = "local"
remote_model = "gemma4:31b"
display_name = "Gemma 4 31B"
family = "gemma-4"
tier = "supported"
context_window = 256000
max_output_tokens = 8192
vision = true
image_tool_results = true
inline_video = false
supports_temperature = true
supports_thinking = true
supports_reasoning = true
call_timeout_secs = 600
```

`transport` is currently `openai_compatible` only.

`api_style` chooses the upstream API shape:

* `chat_completions` is the safest default for Ollama, LM Studio, and vLLM in current Meerkat docs and examples
* `responses` should be treated as an advanced/server-specific path you validate explicitly before depending on it

For Gemma 4, prefer `chat_completions` unless you have verified a server-specific `responses` workflow you want to use.
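
As a minimal sketch of the `responses` path, a second server entry can sit next to a `chat_completions` one so the advanced shape stays opt-in. The server name `experimental` and the `base_url` are hypothetical placeholders, and this assumes more than one entry is allowed under `[self_hosted.servers]`; only the `transport`, `base_url`, and `api_style` keys come from the example above.

```toml
# Hypothetical server entry: the name and URL are placeholders, not defaults.
[self_hosted.servers.experimental]
transport = "openai_compatible"      # currently the only supported transport
base_url = "http://127.0.0.1:8000"   # assumed local endpoint
api_style = "responses"              # validate against your server before depending on it
```

A model would opt in by setting `server = "experimental"` in its `[self_hosted.models.<id>]` table.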

`supports_thinking` and `supports_reasoning` describe the behavior you intend Meerkat to expose through that configured transport. Gemma 4 models themselves are reasoning-capable, but some servers expose those capabilities with provider-specific conventions.

Server entries carry connection facts only. The legacy `bearer_token` and
`bearer_token_env` server fields are rejected at config parse time. Credentials
and the connection itself are owned by a realm binding for
`provider = "self_hosted"` (with an `auth_method` of `none`, `api_key`, or
`static_bearer`); a self-hosted server with no realm binding fails closed at
run time. See [Self-hosting models](/guides/self-hosting-models) for the
realm binding shape.

## Model fallback config

Model fallback is enabled by default for factory-built agents. If the active
model fails at a recoverable LLM boundary, Meerkat can move through an ordered
backup chain and retry with the next viable model:

```toml
[model_fallback]
enabled = true

[[model_fallback.chain]]
model = "claude-opus-4-8"
provider = "anthropic"

[[model_fallback.chain]]
model = "gpt-5.5"
provider = "openai"
auth_binding = { realm = "dev", binding = "openai_oauth" }
```

Omit `chain` to use the catalog default chain. Set `enabled = false` to fail
closed on the original model error. Set `use_catalog_default_chain = true` in a
higher-precedence config layer when you need to undo an inherited disabled or
custom policy.
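
A minimal sketch of the override case, assuming the fragment lives in a config layer that takes precedence over the one carrying the disabled or custom policy. It uses only the `use_catalog_default_chain` key named above; whether other `[model_fallback]` keys must accompany it is not confirmed here.

```toml
# Higher-precedence layer: undo an inherited `enabled = false` or custom chain
# and return to the catalog default chain.
[model_fallback]
use_catalog_default_chain = true
```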

Fallback targets include the auth binding identity. Use `auth_binding` on an
explicit target to retry the same model/provider through a different configured
credential. The catalog default chain only uses provider bindings available in
the selected non-env realm.
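
To illustrate retrying through a different credential, the chain below lists the same model and provider twice with different `auth_binding` values. The second binding name, `openai_api_key`, is hypothetical; substitute a binding that actually exists in your realm.

```toml
[model_fallback]
enabled = true

# First attempt at this target goes through the OAuth binding.
[[model_fallback.chain]]
model = "gpt-5.5"
provider = "openai"
auth_binding = { realm = "dev", binding = "openai_oauth" }

# Same model and provider, retried through a different credential.
# "openai_api_key" is a hypothetical binding name.
[[model_fallback.chain]]
model = "gpt-5.5"
provider = "openai"
auth_binding = { realm = "dev", binding = "openai_api_key" }
```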

Fallback can change the effective model surface. Smaller backup models can
reduce max output tokens, hide tools gated by model capabilities, and be skipped
for context overflow if their context window is too small. The agent receives a
hidden system notice when a switch happens, including the reason and any hidden
tools.

Fallback is only attempted before user-visible stream output has been emitted.
Structured-output extraction retries keep their schema, and provider-native web
search remains disabled after a fallback switch.

## Session storage layout

CLI realm storage lives under the active state root. By default, that is
project-local:

```text
<context-root>/.rkat/realms/<realm>/
```

Pass `--state-root <path>` to use a different parent directory.

Important files:

* `realm_manifest.json` (backend pinning)
* `config.toml` (realm config)
* `config_state.json` (config generation CAS state)
* `sessions.sqlite3` (when backend is sqlite)
* `sessions_jsonl/` (when backend is jsonl)
* `mobs/` (per-mob `<mob_id>.db` SQLite storage plus `realm_profiles.db`)

`--realm-backend` is a creation hint. After first realm creation, backend selection is pinned by `realm_manifest.json`.
This applies to both session storage and `rkat mob` command behavior in that realm.

When SQLite support is compiled in, new persistent realms default to `sqlite`.

## MCP configuration

MCP servers are configured separately from realm runtime state:

| Scope   | Path               |
| ------- | ------------------ |
| User    | `~/.rkat/mcp.toml` |
| Project | `.rkat/mcp.toml`   |

Project servers override user servers with the same name.

## Exit codes

| Code | Meaning          |
| ---- | ---------------- |
| 0    | Success          |
| 1    | Internal error   |
| 2    | Budget exhausted |

Richer session, provider, and runtime failures are mostly reported through structured error payloads and stderr text rather than through a large exit-code taxonomy. In particular, conditions where session persistence or compaction is disabled are informational at the CLI transport layer and do not map to dedicated non-zero exit codes.

## See also

* [Realms](/concepts/realms)
* [CLI commands](/cli/commands)
* [Configuration](/concepts/configuration)
