Live channels in Meerkat provide low-latency audio and text streaming through the live/* JSON-RPC methods. To use a live channel, create a session on a realtime-capable model (for example gpt-realtime-2) and call live/open to start the channel explicitly. Channel creation, status, refresh, input, interruption, truncation, and close are all caller-initiated through the live surface. This guide covers how to open live channels, send input, and observe channel state.

What this guide is for

Use this guide when you want to:
  • open a live audio/text channel on a session
  • send audio or text input to a live channel
  • understand live channel lifecycle and capabilities
  • reason about live channels in the context of the normal session model
ModelCapabilities.realtime remains the capability bit that gates whether live/open succeeds. The --live-ws <addr> flag on rkat-rpc enables the WebSocket listener required for audio transport.

Mental model

A session has exactly one conversation history. The session’s LLM client is the active delivery mechanism for that history. Most models deliver via request/response (e.g. Anthropic claude-opus-4-7, OpenAI gpt-5.5); a small class delivers via a persistent bidirectional socket (e.g. OpenAI gpt-realtime-2). A realtime-capable model changes only how the model is reached: the session still owns history, tools, context, and turn boundaries.
        +--------------------------+
        |   Session (history,      |
        |   tools, context)        |
        +--------------------------+
                     |
                     | session.model = "gpt-realtime-2"
                     v
        +--------------------------+
        | Live channel (caller-    |  audio ingress/egress,
        | initiated via live/open) |  committed into session
        +--------------------------+  at turn boundaries
Key invariants:
  • One canonical history. The session is the source of conversational truth. Audio chunks commit into the same history as non-live turns, at turn boundaries via live/commit_input.
  • Capability gates channel open. ModelCapabilities.realtime is the signal that determines whether live/open succeeds. No channel is opened automatically.
  • Channel lifecycle is caller-initiated. Call live/open to start a channel, live/close to end it, and live/send_input / live/commit_input / live/interrupt / live/truncate to control flow.
  • Refresh without close/reopen. live/refresh applies mutable session config (instructions, tools, audio format) to an open channel without interrupting audio flow. Identity swaps (model/provider) require live/close + live/open.
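
The refresh-versus-reopen rule in the last invariant can be sketched as a small decision helper. This is an illustration only: the `SessionConfig` dataclass, its field names, and `plan_live_update` are hypothetical and not part of the Meerkat API. Only the method names live/refresh, live/close, and live/open come from this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionConfig:
    """Hypothetical snapshot of the settings relevant to a live channel."""
    model: str
    provider: str
    instructions: str = ""
    tools: tuple[str, ...] = ()
    audio_format: str = ""


def plan_live_update(current: SessionConfig, desired: SessionConfig) -> list[str]:
    """Return the live/* calls needed to move an open channel to `desired`."""
    if current == desired:
        return []  # nothing to apply
    if (current.model, current.provider) != (desired.model, desired.provider):
        # Identity swap: the channel must be closed and reopened.
        return ["live/close", "live/open"]
    # Only mutable config (instructions, tools, audio format) changed.
    return ["live/refresh"]


base = SessionConfig(model="gpt-realtime-2", provider="openai")
brief = SessionConfig(model="gpt-realtime-2", provider="openai", instructions="Be brief.")
print(plan_live_update(base, brief))
```

Keeping the identity fields (model, provider) separate from the mutable ones makes the close/reopen case explicit instead of leaving it to be discovered at call time.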

Realtime-capable models

ModelCapabilities.realtime: bool is set per model in the curated catalog (meerkat_core::model_profile, re-exported by meerkat-models). Capability is catalog data, not prefix inference:
  • OpenAI: gpt-realtime-2 — the only realtime-capable model in the current catalog.
  • Gemini: reserved for future *-live* endpoints — no production models today.
  • Anthropic: no realtime-capable models today.
  • Self-hosted: realtime = false by default.
Use GET /models/catalog (REST) or models/catalog (RPC) to inspect which models advertise realtime == true in the running runtime.
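
Once the catalog has been fetched, picking out the realtime-capable entries is a simple filter. The sketch below assumes the response has already been parsed into a list of dicts, each with an `id` and a `capabilities` object carrying a `realtime` boolean. That shape is an assumption for illustration, so check the actual payload returned by models/catalog in your runtime.

```python
from typing import Any


def realtime_model_ids(catalog: list[dict[str, Any]]) -> list[str]:
    """Return IDs of catalog entries that advertise realtime == true.

    The entry shape ({"id": ..., "capabilities": {"realtime": bool}}) is
    assumed for illustration; it is not taken from the Meerkat schema.
    """
    return [
        entry["id"]
        for entry in catalog
        if entry.get("capabilities", {}).get("realtime") is True
    ]


# Hand-written sample data, not real catalog output.
sample_catalog = [
    {"id": "gpt-realtime-2", "capabilities": {"realtime": True}},
    {"id": "gpt-5.5", "capabilities": {"realtime": False}},
    {"id": "claude-opus-4-7", "capabilities": {}},
]
print(realtime_model_ids(sample_catalog))
```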

Opening a live channel

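A live channel is opened by creating a session on a realtime-capable model and calling live/open. There are two ways to put the session on such a model: name the model explicitly at session creation, or inherit it from a configured default.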

At session creation

Create a session on a realtime-capable model with session/create, then call live/open on the new session. The session/create request:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "session/create",
  "params": {
    "prompt": "Let's talk.",
    "model": "gpt-realtime-2",
    "provider": "openai"
  }
}
live/open returns a LiveOpenResult containing the transport bootstrap (e.g. WebSocket URL), WireLiveChannelCapabilities, and WireLiveContinuityMode. The --live-ws <addr> flag must be set on rkat-rpc for WebSocket transport.
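
The session/create request above has a live/open counterpart that this guide does not show. A sketch of building it follows. The `session_id` parameter name is an assumption inferred from the Python client's live_open(session.id) call, and the session ID value is a placeholder; `jsonrpc_request` is a hypothetical helper.

```python
import json


def jsonrpc_request(request_id: int, method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params},
        indent=2,
    )


# "session_id" is an assumed parameter name; "<session-id>" is a placeholder
# for the ID returned by session/create.
live_open = jsonrpc_request(2, "live/open", {"session_id": "<session-id>"})
print(live_open)
```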

Configuration defaults

The session’s default model can be set in config (~/.rkat/config.toml or project-local) via default_model. Any session created without an explicit model parameter inherits the configured default, so setting default_model = "gpt-realtime-2" makes such sessions eligible for live/open without naming the model each time. The channel itself must still be opened explicitly. See the Configuration guide.

Observing channel status

Use live/status to read the current state of a live channel:
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "live/status",
  "params": {"channel_id": "ch_01936f8a-..."}
}

Live channel methods

Method             Purpose
live/open          Open a live channel on a session; returns LiveOpenResult with transport bootstrap and capabilities
live/status        Read live channel status
live/send_input    Send an input chunk (audio/text) to a live channel
live/commit_input  Commit pending input on a live channel (turn boundary)
live/interrupt     Interrupt the assistant turn on a live channel (barge-in)
live/truncate      Truncate assistant output at a client-tracked playback cursor
live/refresh       Apply mutable session config (instructions/tools/audio) to an open channel
live/close         Close a live channel

live/open requires rkat-rpc to be started with the --live-ws <addr> flag for WebSocket transport. Without it the live/* methods are not registered.

End-to-end example

import asyncio

from meerkat import MeerkatClient


async def main() -> None:
    # Placeholder input: 100 ms of silence, assuming 16-bit PCM at 24 kHz mono.
    # Replace with real captured audio.
    audio_data = bytes(2400 * 2)

    async with MeerkatClient() as client:
        await client.connect(realm_id="team-alpha")

        # 1. Create a session on a realtime-capable model.
        session = await client.create_session(
            prompt="Ready for live audio.",
            model="gpt-realtime-2",
            provider="openai",
        )

        # 2. Open a live channel (returns channel_id + transport bootstrap).
        result = await client.live_open(session.id)
        channel_id = result["channel_id"]
        # result["transport"] contains the WebSocket URL + token
        # result["capabilities"] describes supported input/output modalities

        # 3. Send audio input and commit at turn boundaries.
        #    24000 and 1 are presumably the sample rate (Hz) and channel count.
        await client.live_send_input_audio(channel_id, audio_data, 24000, 1)
        await client.live_commit_input(channel_id)

        # 4. Close the channel when done.
        await client.live_close(channel_id)

        # 5. The session's history accumulates committed turns, same as a text session.
        history = await client.read_session_history(session.id)


asyncio.run(main())
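
Step 3 sends one buffer, but captured audio usually arrives as a longer stream that has to be split before sending. A minimal sketch of that chunking follows, assuming 16-bit PCM (2 bytes per sample). The sample format, the 100 ms chunk size, and the `chunk_pcm` helper are assumptions, not values this guide specifies. The defaults 24000 and 1 mirror the arguments passed to live_send_input_audio above, taken here to be sample rate and channel count.

```python
from collections.abc import Iterator

BYTES_PER_SAMPLE = 2  # assumes 16-bit PCM


def chunk_pcm(
    audio: bytes, sample_rate: int = 24000, channels: int = 1, chunk_ms: int = 100
) -> Iterator[bytes]:
    """Yield consecutive chunks of at most `chunk_ms` milliseconds of PCM audio."""
    frame_bytes = BYTES_PER_SAMPLE * channels
    chunk_bytes = (sample_rate * chunk_ms // 1000) * frame_bytes
    if chunk_bytes <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for start in range(0, len(audio), chunk_bytes):
        yield audio[start : start + chunk_bytes]


# 250 ms of silence at 24 kHz mono: 6000 samples, 12000 bytes.
silence = bytes(6000 * BYTES_PER_SAMPLE)
chunks = list(chunk_pcm(silence))
print([len(c) for c in chunks])
```

Each chunk would then go through live_send_input_audio, followed by a single live_commit_input at the turn boundary.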

Live channels and mobs

Each mob member has its own session, so live channels are per-member by construction. To make a member live-capable, set its profile’s model to a realtime-capable model (for example in the MobDefinition TOML):
id = "live-demo"

[profiles.host]
model = "gpt-realtime-2"
provider = "openai"
peer_description = "Live audio host"
Open a live channel against the member’s session after the member is spawned.

Limitations and known gaps

  • OpenAI Realtime API only. The shipped provider integration is OpenAI’s Realtime API (gpt-realtime-2). Azure OpenAI (azure_openai) and other providers are not yet wired into the live transport layer.
  • One live channel per session. A session has at most one live channel at a time. For per-member live channels in mobs, open channels against individual member sessions.
  • Idle sessions cannot host a channel. Start a turn or spawn the member first.
  • Identity swaps require close/reopen. live/refresh applies config-only changes (instructions, tools, audio format); model or provider swaps require live/close + live/open.

See also