Part I: Architectural Principles
The agent loop is infrastructure, not application
Meerkat treats the LLM execution loop the way a database engine treats storage: as a composable, embeddable primitive with no opinions about what you build on top. The entire architecture flows from this.
Trait contracts own the architecture
meerkat-core defines nine trait contracts (AgentLlmClient, AgentToolDispatcher, AgentSessionStore, SessionService, Compactor, MemoryStore, HookEngine, SkillEngine, SkillSource) and contains zero I/O dependencies. Everything else — providers, storage backends, tool registries — lives in separate crates that implement these traits.
This is a hard constraint, not an aspiration. The core agent loop is testable in-memory without mocks, embeddable in any Rust binary, and provably independent of any specific LLM provider or storage backend.
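As a rough illustration of why an I/O-free core is testable in memory, here is a minimal sketch. It assumes heavily simplified, synchronous stand-ins for two of the contracts; the trait names `LlmClient` and `ToolDispatcher`, the `run_turn` function, and the `tool:` reply convention are all hypothetical and not taken from meerkat-core.

```rust
// Hypothetical, simplified stand-ins for the core contracts.
// They are synchronous here only to keep the sketch std-only.
pub trait LlmClient {
    /// Returns the model reply; a reply of the form "tool:<name>" requests a tool call.
    fn complete(&self, prompt: &str) -> String;
}

pub trait ToolDispatcher {
    fn dispatch(&self, name: &str, input: &str) -> Result<String, String>;
}

/// The "core loop": it depends only on the traits and performs no I/O itself.
pub fn run_turn<C: LlmClient, T: ToolDispatcher>(
    client: &C,
    tools: &T,
    prompt: &str,
) -> Result<String, String> {
    let reply = client.complete(prompt);
    match reply.strip_prefix("tool:") {
        Some(name) => tools.dispatch(name, prompt),
        None => Ok(reply),
    }
}

/// In-memory client that always answers with a fixed script.
pub struct ScriptedClient(pub &'static str);

impl LlmClient for ScriptedClient {
    fn complete(&self, _prompt: &str) -> String {
        self.0.to_string()
    }
}

/// In-memory dispatcher with a single tool.
pub struct UppercaseTool;

impl ToolDispatcher for UppercaseTool {
    fn dispatch(&self, name: &str, input: &str) -> Result<String, String> {
        match name {
            "upper" => Ok(input.to_uppercase()),
            other => Err(format!("unknown tool: {other}")),
        }
    }
}
```

Because both implementations are plain structs, a unit test can drive `run_turn` directly with no network, filesystem, or mocking framework involved.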
Surfaces are skins, not authorities
CLI, REST, JSON-RPC, and MCP Server all follow the same path: AgentFactory consolidates provider resolution, tool dispatch wiring, comms setup, hook resolution, and skill injection into a single construction pipeline.
The important 0.5 distinction is ownership:
- runtime-backed surfaces own keep_alive, Queue/Steer routing, comms drain, and commit/cancel semantics
- SessionService is the substrate seam they route through
- EphemeralSessionService remains useful, but it is not a co-equal product path for runtime semantics
Composition over configuration
Optional capabilities are genuinely optional, not feature-flagged defaults that are always present:
- CommsRuntime is Option<Arc<dyn CommsRuntime>>: when None, the loop skips comms entirely
- HookEngine is Option<Arc<dyn HookEngine>>: no hooks configured, no overhead
- Compactor, MemoryStore: same pattern
Agent<C, T, S> generic parameters make this explicit at the type level. Tool dispatch uses CompositeDispatcher to layer builtins, MCP, external, and policy filters — each layer is independently testable and swappable.
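The optional-capability pattern can be sketched as follows. This is an illustration only: the single-method `HookEngine` trait, the `before_turn` method, and the non-generic `Agent` struct are simplifying assumptions, not the real definitions.

```rust
use std::sync::Arc;

// Hypothetical one-method version of a capability trait.
pub trait HookEngine {
    fn before_turn(&self, prompt: &str) -> String;
}

pub struct Agent {
    // None means the capability is absent, not "present but disabled".
    hooks: Option<Arc<dyn HookEngine>>,
}

impl Agent {
    pub fn new(hooks: Option<Arc<dyn HookEngine>>) -> Self {
        Self { hooks }
    }

    /// With no hook engine configured, the prompt passes through untouched.
    pub fn prepare(&self, prompt: &str) -> String {
        match &self.hooks {
            Some(engine) => engine.before_turn(prompt),
            None => prompt.to_string(),
        }
    }
}

/// Example hook used only for this sketch.
pub struct PrefixHook;

impl HookEngine for PrefixHook {
    fn before_turn(&self, prompt: &str) -> String {
        format!("[hooked] {prompt}")
    }
}
```

The `None` branch does no dynamic dispatch at all, which is the sense in which an unconfigured capability costs nothing beyond the `Option` check.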
Sessions are first-class, persistence is optional
SessionService defines the substrate lifecycle. Production surfaces (CLI, REST, RPC, MCP) use PersistentSessionService under a RuntimeSessionAdapter that owns keep_alive, Queue/Steer routing, comms drain, and ingress admission. EphemeralSessionService is the in-memory substrate for testing and embedded use — it supports Queue-only turns and rejects runtime-owned semantics (Steer, render_metadata). The .rkat/sessions/ files are derived projections materialized by SessionProjector — delete them and replay from the event store to get identical content.
Errors separate mechanism from policy
Three-tier typed errors (ToolError → AgentError → SessionError) capture what happened without prescribing what to do about it. LlmFailureReason distinguishes retryable (RateLimited { retry_after }) from fatal (AuthError) at the type level. The agent loop implements retry logic; callers decide whether to resume or abort. Every error variant carries a stable error_code() -> &'static str for SDK wire formats.
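A minimal sketch of the retryable/fatal split, under stated assumptions: only the two variants named above are modeled, `retry_after` is assumed to be a `Duration`, and the `is_retryable` helper, the code strings, and `call_with_retry` are hypothetical names introduced for illustration.

```rust
use std::time::Duration;

// Only the two variants named in the text; the field type is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub enum LlmFailureReason {
    RateLimited { retry_after: Duration },
    AuthError,
}

impl LlmFailureReason {
    /// Hypothetical helper: classification is decided by the variant alone.
    pub fn is_retryable(&self) -> bool {
        matches!(self, Self::RateLimited { .. })
    }

    /// Stable code for wire formats; the literal strings are illustrative.
    pub fn error_code(&self) -> &'static str {
        match self {
            Self::RateLimited { .. } => "llm_rate_limited",
            Self::AuthError => "llm_auth_error",
        }
    }
}

/// Mechanism only: retry retryable failures up to `max_attempts` and hand
/// anything else back. Whether to resume or abort is left to the caller.
/// A real loop would also wait for `retry_after`; that is omitted here.
pub fn call_with_retry<F>(max_attempts: u32, mut call: F) -> Result<String, LlmFailureReason>
where
    F: FnMut(u32) -> Result<String, LlmFailureReason>,
{
    let mut attempt = 1;
    loop {
        match call(attempt) {
            Ok(value) => return Ok(value),
            Err(e) if e.is_retryable() && attempt < max_attempts => attempt += 1,
            Err(e) => return Err(e),
        }
    }
}
```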
Wire types and domain types are separate concerns
meerkat-contracts owns the wire format (WireRunResult, WireEvent), emits JSON schemas via schemars, and feeds SDK codegen for Python and TypeScript. Domain types in meerkat-core are richer and not constrained by serialization compatibility. The conversion is explicit (From impls), lossy where appropriate, and version-locked — ContractVersion::CURRENT must equal workspace.package.version.
Configuration is layered and declarative
Config resolution follows a strict precedence: Config::default() → file load → env overrides (API keys only) → per-request SessionBuildOptions. No cascading merges. No global mutable state. Running agents are not affected by config file changes mid-turn.
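The precedence order might look roughly like this. The field names, the `FileConfig` and `BuildOptions` types (the latter standing in for SessionBuildOptions), and the `resolve` function are all hypothetical; the sketch only shows each layer overriding the previous one and the result being returned as an owned value.

```rust
// Hypothetical, reduced config; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub model: String,
    pub max_turns: u32,
    pub api_key: Option<String>,
}

impl Default for Config {
    fn default() -> Self {
        Self { model: "default-model".into(), max_turns: 10, api_key: None }
    }
}

/// Values parsed from a config file; None means "not set in the file".
#[derive(Default)]
pub struct FileConfig {
    pub model: Option<String>,
    pub max_turns: Option<u32>,
}

/// Per-request overrides, standing in for SessionBuildOptions.
#[derive(Default)]
pub struct BuildOptions {
    pub model: Option<String>,
}

pub fn resolve(file: FileConfig, env_api_key: Option<String>, opts: BuildOptions) -> Config {
    let mut cfg = Config::default(); // 1. defaults
    if let Some(model) = file.model {
        cfg.model = model; // 2. file load
    }
    if let Some(turns) = file.max_turns {
        cfg.max_turns = turns;
    }
    cfg.api_key = env_api_key; // 3. env overrides: API keys only
    if let Some(model) = opts.model {
        cfg.model = model; // 4. per-request options win last
    }
    cfg // owned snapshot: later file edits cannot reach a running agent
}
```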
Testing is a design constraint, not an afterthought
The three-tier test suite (unit/fast-integration in seconds, integration-real with spawned processes, e2e with live APIs) exists because the architecture makes it possible. Core has no I/O, so unit tests need no mocks. Every trait has a test implementation. The test tiers are named, aliased (cargo rct, cargo unit, cargo int), and enforced by pre-commit hooks.
Part II: Rust Implementation Principles
Ownership topology
The agent loop maintains a clear split: shared immutable infrastructure (Arc<C>, Arc<T>, Arc<S> for client, tools, store) vs exclusively-owned mutable state (session, budget, loop state). Agent::run(&mut self) needs no mutex — the RPC SessionRuntime spawns a dedicated tokio task per session that exclusively owns the Agent. Cancellation works via &mut self state transition, not shared atomics.
Copy-on-write session history
Session messages live in Arc<Vec<Message>>. Session::fork() clones the Arc (O(1)), and Arc::make_mut triggers a copy only on first mutation when the refcount exceeds one. Forked conversation branches get independent session copies cheaply, with no full history duplication at fork time.
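The fork behaviour described above can be reproduced with the standard library alone. In this sketch `Message` is reduced to a string wrapper, and `push`, `len`, and `shares_history_with` are hypothetical helpers added for demonstration.

```rust
use std::sync::Arc;

#[derive(Debug, Clone, PartialEq)]
pub struct Message(pub String);

#[derive(Debug, Clone)]
pub struct Session {
    messages: Arc<Vec<Message>>,
}

impl Session {
    pub fn new() -> Self {
        Self { messages: Arc::new(Vec::new()) }
    }

    /// O(1): only the reference count is incremented.
    pub fn fork(&self) -> Self {
        Self { messages: Arc::clone(&self.messages) }
    }

    /// Copies the history only if another session still shares it.
    pub fn push(&mut self, message: Message) {
        Arc::make_mut(&mut self.messages).push(message);
    }

    pub fn len(&self) -> usize {
        self.messages.len()
    }

    /// True while both sessions point at the same allocation.
    pub fn shares_history_with(&self, other: &Session) -> bool {
        Arc::ptr_eq(&self.messages, &other.messages)
    }
}
```

Right after `fork()` the two sessions share one buffer. The first `push` on either side is what pays for the copy, and only that side gets the new allocation.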
Zero-allocation iteration
ToolCallView<'a> is Copy — it borrows &'a str for id and name, &'a RawValue for args, all from the message buffer. ToolCallIter adapts a slice iterator, filtering for tool-use blocks on the fly. No Vec<ToolCall> is ever materialized. The trait contract returns Arc<[Arc<ToolDef>]> — an arc’d slice, not a Vec — so cloning the tool list is O(1).
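A borrowed, Copy view over tool-use blocks can be sketched like this. To stay std-only, the arguments are held as a raw `&str` instead of `&RawValue`, and the `Block` enum and `tool_calls` function are simplified assumptions, not the real message types.

```rust
// Simplified message content; the real type is richer.
pub enum Block {
    Text(String),
    ToolUse { id: String, name: String, args_json: String },
}

/// Borrowed view: three pointers into the message buffer, no owned data.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ToolCallView<'a> {
    pub id: &'a str,
    pub name: &'a str,
    pub args_json: &'a str, // stand-in for &'a RawValue
}

/// Lazily yields views for tool-use blocks; nothing is collected or cloned.
pub fn tool_calls<'a>(blocks: &'a [Block]) -> impl Iterator<Item = ToolCallView<'a>> + 'a {
    blocks.iter().filter_map(|block| match block {
        Block::ToolUse { id, name, args_json } => Some(ToolCallView {
            id: id.as_str(),
            name: name.as_str(),
            args_json: args_json.as_str(),
        }),
        Block::Text(_) => None,
    })
}
```

A caller can write `for call in tool_calls(&blocks) { ... }` and dispatch each call without an intermediate vector; the views stay valid as long as the underlying blocks are borrowed.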
Deferred parsing via Box<RawValue>
Tool arguments are preserved as unparsed JSON (Box<RawValue>) from provider response through the core loop to the dispatcher. Parsing happens at most once — when the dispatcher calls parse_args::<T>() — and only if the tool actually executes. A custom deserialize_tool_use_args handles the serde buffering quirk where Box<RawValue> can’t deserialize directly from internally-tagged enum content.
Typed enums over serde_json::Value
Provider metadata is a typed enum, not Option<serde_json::Value>. The compiler enforces exhaustive matching. No runtime “does this object have a signature field?” checks.
#[non_exhaustive] allows future provider variants without breaking downstream consumers.
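A sketch of what such an enum could look like. The variant names `ProviderA` and `ProviderB` and their fields are placeholders chosen for illustration; they are not the actual ProviderMeta variants.

```rust
// Placeholder variants; the real enum names concrete providers.
#[derive(Debug, Clone, PartialEq)]
#[non_exhaustive]
pub enum ProviderMeta {
    ProviderA { signature: String },
    ProviderB { response_id: String },
}

/// Inside the defining crate the match must be exhaustive, so adding a
/// variant forces every such function to be revisited at compile time.
pub fn signature(meta: &ProviderMeta) -> Option<&str> {
    match meta {
        ProviderMeta::ProviderA { signature } => Some(signature.as_str()),
        ProviderMeta::ProviderB { .. } => None,
    }
}
```

Downstream crates matching on a `#[non_exhaustive]` enum are required to include a wildcard arm, which is what lets new variants ship without a breaking change.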
Newtype discipline
Semantic types get their own newtypes even when wrapping primitives. BlockKey(usize) can't be confused with a message index. OperationId(Uuid) is distinct from SessionId. SourceUuid and SkillName are separate types in the skills system.
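The effect is easy to see in a small sketch. `BlockKey` comes from the text; `MessageIndex` and `block_at` are hypothetical names used only to show the compiler rejecting a mix-up.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct BlockKey(pub usize);

// Hypothetical second index type, wrapping the same primitive.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct MessageIndex(pub usize);

/// Accepts only a BlockKey; a MessageIndex or a bare usize does not type-check.
pub fn block_at(blocks: &[String], key: BlockKey) -> Option<&str> {
    blocks.get(key.0).map(String::as_str)
}

// block_at(&blocks, MessageIndex(1)); // compile error: expected BlockKey
```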
Serde as a design tool
Each enum's tagging is chosen for the data shape:
- Internally tagged (#[serde(tag = "role")]) for Message: the discriminant lives in its own field, which enables two-stage parsing
- Adjacently tagged (#[serde(tag = "block_type", content = "data")]) for AssistantBlock: compact, and supports custom deserializers on nested fields
- Internally tagged on a provider field (#[serde(tag = "provider")]) for ProviderMeta: each variant carries different provider-specific fields
#[serde(skip_serializing_if = "Option::is_none")] keeps wire payloads minimal. Custom Default impls provide semantic defaults (structured_output_retries: 2, not 0). Custom Debug impls hide sensitive fields, showing .is_some() instead of session content.
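The Default and Debug points need no serde to demonstrate. In this sketch the `RunOptions` struct and its `resume_session` field are hypothetical; only `structured_output_retries: 2` is taken from the text, and the serde attributes are left out to keep the example std-only.

```rust
use std::fmt;

// Hypothetical options struct used only for this sketch.
pub struct RunOptions {
    pub structured_output_retries: u32,
    pub resume_session: Option<String>, // assumed to hold sensitive content
}

impl Default for RunOptions {
    /// Semantic default: retry twice, where a derived Default would give 0.
    fn default() -> Self {
        Self { structured_output_retries: 2, resume_session: None }
    }
}

impl fmt::Debug for RunOptions {
    /// Reports only whether session content is present, never the content.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("RunOptions")
            .field("structured_output_retries", &self.structured_output_retries)
            .field("resume_session", &self.resume_session.is_some())
            .finish()
    }
}
```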
Streaming block assembly
The BlockAssembler handles out-of-order streaming events with a slot-based design. Vec<BlockSlot> is append-only: Pending placeholders inserted at start time are overwritten with Finalized blocks on completion. IndexMap<String, ToolCallBuffer> preserves insertion order for deterministic assembly. The tool call ID is the map key only and is never duplicated in the buffer value. Text deltas coalesce with the previous text block via push_str when possible.
Errors such as DuplicateToolStart are returned to the caller, never swallowed.
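The slot idea can be sketched with the standard library. This is a simplified stand-in, not the real assembler: a `HashMap` plus a stored slot index replaces IndexMap, arguments are plain strings, the method names and the `StreamAssemblyError` and `UnknownToolCall` names are hypothetical, and duplicate detection only covers calls that are still in flight.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Block {
    Text(String),
    ToolUse { id: String, name: String, args_json: String },
}

#[derive(Debug, PartialEq)]
enum BlockSlot {
    Pending,
    Finalized(Block),
}

#[derive(Debug, PartialEq)]
pub enum StreamAssemblyError {
    DuplicateToolStart(String),
    UnknownToolCall(String), // hypothetical variant for this sketch
}

// The tool call id is the map key only; it is not stored in the buffer.
struct ToolCallBuffer {
    slot: usize,
    name: String,
    args_json: String,
}

#[derive(Default)]
pub struct BlockAssembler {
    slots: Vec<BlockSlot>,
    tools: HashMap<String, ToolCallBuffer>,
}

impl BlockAssembler {
    /// Appends to the previous text block when it is the latest slot.
    pub fn on_text_delta(&mut self, delta: &str) {
        if let Some(BlockSlot::Finalized(Block::Text(text))) = self.slots.last_mut() {
            text.push_str(delta);
        } else {
            self.slots.push(BlockSlot::Finalized(Block::Text(delta.to_string())));
        }
    }

    /// Reserves the block's position with a Pending placeholder.
    pub fn on_tool_start(&mut self, id: &str, name: &str) -> Result<(), StreamAssemblyError> {
        if self.tools.contains_key(id) {
            return Err(StreamAssemblyError::DuplicateToolStart(id.to_string()));
        }
        self.slots.push(BlockSlot::Pending);
        let buffer = ToolCallBuffer {
            slot: self.slots.len() - 1,
            name: name.to_string(),
            args_json: String::new(),
        };
        self.tools.insert(id.to_string(), buffer);
        Ok(())
    }

    pub fn on_tool_delta(&mut self, id: &str, delta: &str) -> Result<(), StreamAssemblyError> {
        let buffer = self
            .tools
            .get_mut(id)
            .ok_or_else(|| StreamAssemblyError::UnknownToolCall(id.to_string()))?;
        buffer.args_json.push_str(delta);
        Ok(())
    }

    /// Overwrites the reserved placeholder with the finished block.
    pub fn on_tool_end(&mut self, id: &str) -> Result<(), StreamAssemblyError> {
        let buffer = self
            .tools
            .remove(id)
            .ok_or_else(|| StreamAssemblyError::UnknownToolCall(id.to_string()))?;
        self.slots[buffer.slot] = BlockSlot::Finalized(Block::ToolUse {
            id: id.to_string(),
            name: buffer.name,
            args_json: buffer.args_json,
        });
        Ok(())
    }

    /// Returns finalized blocks in start order; unfinished slots are dropped.
    pub fn finish(self) -> Vec<Block> {
        self.slots
            .into_iter()
            .filter_map(|slot| match slot {
                BlockSlot::Finalized(block) => Some(block),
                BlockSlot::Pending => None,
            })
            .collect()
    }
}
```

Because the placeholder is pushed when the tool call starts, text that arrives while the call is still streaming lands after it, so the final block order follows start order and not completion order.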
Generic type erasure at boundaries
Agent<C, T, S> keeps generics for monomorphization in tests and single-provider deployments. The core loop impl block uses C: AgentLlmClient + ?Sized, so it works with both concrete types and trait objects. Boxing to DynAgent happens only at surface boundaries — inside the agent loop, dispatch is direct.
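The `?Sized` bound is what lets one impl block serve both cases. The sketch below assumes a single generic parameter and a synchronous one-method trait for brevity; the `DynAgent` alias shown here is a guess at the shape, and `Echo`, `concrete_agent`, and `erased_agent` are hypothetical.

```rust
use std::sync::Arc;

pub trait AgentLlmClient {
    fn complete(&self, prompt: &str) -> String;
}

// C may be a concrete type or `dyn AgentLlmClient`.
pub struct Agent<C: AgentLlmClient + ?Sized> {
    turns: u32,
    client: Arc<C>,
}

// One impl block covers both the monomorphized and the type-erased agent.
impl<C: AgentLlmClient + ?Sized> Agent<C> {
    pub fn new(client: Arc<C>) -> Self {
        Self { turns: 0, client }
    }

    pub fn run(&mut self, prompt: &str) -> String {
        self.turns += 1;
        self.client.complete(prompt)
    }
}

/// Assumed shape of the erased alias used at surface boundaries.
pub type DynAgent = Agent<dyn AgentLlmClient>;

pub struct Echo;

impl AgentLlmClient for Echo {
    fn complete(&self, prompt: &str) -> String {
        format!("echo: {prompt}")
    }
}

/// Statically dispatched: `complete` resolves to Echo at compile time.
pub fn concrete_agent() -> Agent<Echo> {
    Agent::new(Arc::new(Echo))
}

/// Type-erased: the provider is chosen at runtime, same loop code.
pub fn erased_agent() -> DynAgent {
    let client: Arc<dyn AgentLlmClient> = Arc::new(Echo);
    Agent::new(client)
}
```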
Async without interior mutability
The agent loop runs as &mut self, with exclusive ownership and no mutex. The RPC SessionRuntime achieves this by giving each session a dedicated tokio task that owns its Agent. Commands flow in via channels; events stream out as notifications. Cancellation is a state machine transition on &mut self, not a shared AtomicBool.
The emit_event! macro broadcasts to both an EventTap (local subscribers) and an optional mpsc::Sender<AgentEvent> (external channel). When the receiver drops, it sets a local event_stream_open flag and logs once — no panics, no repeated error spam.
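The owner-task pattern can be approximated without an async runtime. This sketch swaps the tokio task for an OS thread and `std::sync::mpsc` channels so that it stays std-only; the `Command` enum, `LoopState`, `spawn_session`, and `ask` are hypothetical, and cancellation here only takes effect between turns.

```rust
use std::sync::mpsc;
use std::thread;

pub enum Command {
    Run { prompt: String, reply: mpsc::Sender<String> },
    Cancel,
}

#[derive(Debug, PartialEq)]
enum LoopState {
    Idle,
    Cancelled,
}

pub struct Agent {
    state: LoopState,
    turns: u32,
}

impl Agent {
    fn run(&mut self, prompt: &str) -> String {
        if self.state == LoopState::Cancelled {
            return "cancelled".to_string();
        }
        self.turns += 1;
        format!("turn {}: {}", self.turns, prompt)
    }

    /// Cancellation is a plain state transition on exclusively owned data.
    fn cancel(&mut self) {
        self.state = LoopState::Cancelled;
    }
}

/// Spawns a worker that is the sole owner of its Agent. No Mutex or atomic
/// is needed because nothing else can reach the Agent. The worker returns
/// the number of completed turns when the command channel closes.
pub fn spawn_session() -> (mpsc::Sender<Command>, thread::JoinHandle<u32>) {
    let (tx, rx) = mpsc::channel::<Command>();
    let handle = thread::spawn(move || {
        let mut agent = Agent { state: LoopState::Idle, turns: 0 };
        for command in rx {
            match command {
                Command::Run { prompt, reply } => {
                    // A dropped receiver is tolerated, not treated as fatal.
                    let _ = reply.send(agent.run(&prompt));
                }
                Command::Cancel => agent.cancel(),
            }
        }
        agent.turns
    });
    (tx, handle)
}

/// Sends one prompt and waits for the reply.
pub fn ask(tx: &mpsc::Sender<Command>, prompt: &str) -> String {
    let (reply, rx) = mpsc::channel();
    tx.send(Command::Run { prompt: prompt.to_string(), reply })
        .expect("session worker should be alive");
    rx.recv().expect("session worker should reply")
}
```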
Feature gating at the type level
Optional crates are #[cfg(feature)]-gated at the type level: PersistentSessionService exists only with session-store, DefaultCompactor only with session-compaction. The facade re-exports only what features enable, so downstream code can't accidentally depend on disabled components. Fallback implementations are always available.
Trait composition with graceful degradation
Optional trait methods provide default implementations that return Err(Unsupported(...)). Required methods define the minimal contract. This is better than downcasting: the trait is the interface, and optional features degrade at the type level rather than at runtime.
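A minimal sketch of the default-method pattern. The method names `start_turn` and `steer`, the payload of `Unsupported`, and `EphemeralService` are illustrative assumptions and not the real SessionService signature.

```rust
#[derive(Debug, PartialEq)]
pub enum SessionError {
    Unsupported(&'static str),
}

pub trait SessionService {
    /// Required: every implementation must support queued turns.
    fn start_turn(&mut self, prompt: &str) -> Result<String, SessionError>;

    /// Optional: implementations that cannot steer inherit this default.
    fn steer(&mut self, _hint: &str) -> Result<(), SessionError> {
        Err(SessionError::Unsupported("steer"))
    }
}

/// In-memory service that implements only the required method.
pub struct EphemeralService;

impl SessionService for EphemeralService {
    fn start_turn(&mut self, prompt: &str) -> Result<String, SessionError> {
        Ok(format!("queued: {prompt}"))
    }
}
```

Callers program against the trait alone and receive a typed `Unsupported` error where a capability is missing, so no downcast to a concrete service type is needed.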
Error propagation across crate boundaries
The three error tiers are connected by From impls for ergonomic ? chaining. Each tier captures minimal context (tool name, timeout duration), not full stack traces. thiserror generates the Display implementations. Every variant has a stable error_code() for wire protocols.
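The chaining can be sketched with hand-written impls. Each tier is reduced to a single variant, the Display impl is written by hand where thiserror would normally generate it, and the variant fields, the code string, and the three functions are hypothetical.

```rust
use std::fmt;
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum ToolError {
    Timeout { tool: String, after: Duration },
}

#[derive(Debug, PartialEq)]
pub enum AgentError {
    Tool(ToolError),
}

#[derive(Debug, PartialEq)]
pub enum SessionError {
    Agent(AgentError),
}

// These two impls are what make `?` work across the tiers.
impl From<ToolError> for AgentError {
    fn from(e: ToolError) -> Self {
        AgentError::Tool(e)
    }
}

impl From<AgentError> for SessionError {
    fn from(e: AgentError) -> Self {
        SessionError::Agent(e)
    }
}

// Written by hand here; a derive macro would normally produce this.
impl fmt::Display for ToolError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ToolError::Timeout { tool, after } => {
                write!(f, "tool {tool} timed out after {after:?}")
            }
        }
    }
}

impl SessionError {
    /// Stable code for wire protocols; the string is illustrative.
    pub fn error_code(&self) -> &'static str {
        match self {
            SessionError::Agent(AgentError::Tool(ToolError::Timeout { .. })) => "tool_timeout",
        }
    }
}

/// Always times out, to exercise the error path.
pub fn call_tool(name: &str) -> Result<String, ToolError> {
    Err(ToolError::Timeout { tool: name.to_string(), after: Duration::from_secs(30) })
}

pub fn run_agent(tool: &str) -> Result<String, AgentError> {
    Ok(call_tool(tool)?) // ToolError -> AgentError via From
}

pub fn run_session(tool: &str) -> Result<String, SessionError> {
    Ok(run_agent(tool)?) // AgentError -> SessionError via From
}
```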
