flynn/docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md
2026-02-26 17:01:22 -08:00


# Gateway Sessions and Queueing (Agent Execution Model)
This document explains how the gateway maps WebSocket clients onto durable sessions, and how work is serialised per session so agent execution stays coherent under concurrent requests.
If you only want the protocol surface, see `docs/api/PROTOCOL.md`.
## Key Ideas
- A WebSocket client gets a `connectionId`.
- Each connection is attached to a `sessionId`.
- Agent work is queued per `sessionId` (FIFO), not per connection.
- Sessions persist in SQLite via `SessionManager` even if clients disconnect.
- On the first message of a session, the orchestrator can inject session-start memory (`user/profile` + `user/working`) when `memory.user_namespace` is configured.
- Once dequeued, message routing may execute the native orchestrator path or an optional external backend path (`claude_code`, `opencode`, `codex`, `gemini`, `pi_embedded`) depending on agent/backend config.
- Runtime backend mode can be overridden manually through the `/runtime` command fast path (`status`, `activate pi`, `deactivate pi`, `use config`); the override is persisted in preferences, and `/backend` remains a compatibility alias.
- `flynn tui` now attaches to this same gateway command path for `/runtime ...`, and auto-starts or attaches to the daemon and gateway when needed.
- Backend routing outcomes are auditable via `backend.route` / `backend.success` / `backend.fallback`, which enables offline canary evaluation without changing gateway protocol methods.
- Run lifecycle and cancel intent, along with reaction decisions, are emitted to audit logs and aggregated into `system.metrics` counters (`runStates`, `cancelLatencyMs`, `reactions`) for dashboards.
- Reaction matching is deterministic (priority + cooldown + recursion guard) before intent/agent routing.
- `subagent.*` tools create child orchestrators scoped to the parent conversation (`subagent:<parentSessionId>:<childId>`), with idle TTL cleanup, a per-child queue mode (`followup|interrupt`), and session budgets (turn/token/timeout). This is tool-loop behavior, not a separate gateway RPC session lane.
- Browser workflow reliability primitives (`browser.wait_for/assert/extract/checkpoint.*`) execute in the same queued session lane and apply browser-config guardrails (domain allowlist/high-risk confirmation, bounded retries, workflow step budget).
- Companion `node.*` registration is per WebSocket connection; after a reconnect, clients must re-register capabilities before invoking node RPC methods (or use the runtime client's reconnect state replay to restore registration, status, location, and push state automatically).
- Canvas artifacts are persisted per session under the gateway data directory for UI recovery across restarts.
- TTS output is best-effort; synthesis failures fall back to text-only responses.
## Component Map
```mermaid
flowchart LR
  subgraph CFG[Config + Runtime Policy]
    QP["server.queue policy\nmode/cap/overflow/overrides"]
    BM["backend runtime mode\nconfig_default|force_native|force_pi_embedded"]
  end
  subgraph GW[Gateway Process]
    TUI["TUI client\n(runtime command forwarding)"]
    WS["WebSocket connection\n(connectionId)"]
    GS[GatewayServer]
    LQ["LaneQueue\nper-session FIFO"]
    SB["SessionBridge\nconnectionId -> sessionId -> AgentOrchestrator"]
    AQ["AuditLogger\nqueue.preempt events"]
    MC["MetricsCollector\nrun states + cancel latency + reactions"]
    UI[Web UI Dashboard]
  end
  subgraph CORE[Flynn Core]
    SM["SessionManager\nin-memory cache + SQLite"]
    SS["SessionStore\nSQLite tables"]
    AO[AgentOrchestrator / External Backends]
  end
  TUI --> WS
  WS --> GS
  QP --> GS
  BM --> GS
  GS --> LQ
  GS --> SB
  LQ --> AQ
  GS --> MC
  MC --> UI
  SB --> AO
  SB --> SM
  SM --> SS
```
## Session IDs (What Actually Gets Stored)
The durable session ID stored by `SessionManager` is:
`<frontend>:<userId>`
For the gateway:
- `SessionBridge.connect()` assigns a `connectionId` (UUID).
- It defaults the connection's `sessionId` to `ws:<connectionId>`.
- It then calls `SessionManager.getSession('ws', sessionId)`, so the gateway-level `sessionId` fills the `<userId>` slot of the durable ID.
That means gateway sessions are stored as:
- `ws:ws:<connectionId>`
This is expected: the gateway adds its own namespace, and the session manager namespaces again by frontend.
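The double prefix is easier to see when the two namespacing steps are written out. The following is a minimal sketch, not the actual implementation: `gatewaySessionId` and `durableSessionId` are hypothetical helpers standing in for the logic in `SessionBridge.connect()` and `SessionManager.getSession()`, and the connection ID is a made-up example value.

```typescript
// Hypothetical stand-in for the default assigned in SessionBridge.connect().
function gatewaySessionId(connectionId: string): string {
  return `ws:${connectionId}`;
}

// Hypothetical stand-in for how SessionManager builds its durable key:
// <frontend>:<userId>.
function durableSessionId(frontend: string, userId: string): string {
  return `${frontend}:${userId}`;
}

// Example value only; real connection IDs are UUIDs generated per connection.
const connectionId = "123e4567-e89b-12d3-a456-426614174000";
const sessionId = gatewaySessionId(connectionId);
const storedKey = durableSessionId("ws", sessionId);

console.log(storedKey); // ws:ws:123e4567-e89b-12d3-a456-426614174000
```

Anything that parses stored session keys should therefore expect two `ws:` segments for gateway sessions rather than treating the second one as a bug.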
Key files:
- `src/gateway/session-bridge.ts`
- `src/session/manager.ts`
## Per-Session FIFO Queueing (LaneQueue)
`agent.send` uses a lane ID derived from the session:
- lane = `SessionBridge.getSessionId(connectionId)` (preferred)
- fallback lane = `connectionId` (only if session lookup fails)
Within a lane:
- Only one request executes at a time.
- Later requests queue (FIFO) and start after the active request finishes.
- `interrupt` mode enforces "latest wins": any queued backlog is superseded and the active run is asked to cancel so the newest request becomes the next (or immediate) execution.
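The "latest wins" rule can be pictured as a pure admission decision made when a request arrives. This is a sketch under assumptions, not the `LaneQueue` code: the `LaneState` and `AdmitResult` shapes and the `admit` function are hypothetical, and `followup` is assumed here to be the name of the plain FIFO mode.

```typescript
// Assumption: "followup" is plain FIFO and "interrupt" is latest-wins.
type QueueMode = "followup" | "interrupt";

interface LaneState {
  active: string | null; // request ID currently executing, if any
  queued: string[]; // request IDs waiting, oldest first
}

interface AdmitResult {
  superseded: string[]; // queued requests dropped in favor of the new one
  cancelActive: boolean; // whether the active run should be asked to cancel
  startsImmediately: boolean;
}

function admit(lane: LaneState, requestId: string, mode: QueueMode): AdmitResult {
  // In interrupt mode the whole backlog is superseded by the new request.
  const superseded = mode === "interrupt" ? lane.queued.splice(0) : [];
  // The active run is only asked to cancel; it may take a moment to stop.
  const cancelActive = mode === "interrupt" && lane.active !== null;

  if (lane.active === null) {
    lane.active = requestId;
    return { superseded, cancelActive, startsImmediately: true };
  }
  lane.queued.push(requestId);
  return { superseded, cancelActive, startsImmediately: false };
}
```

With an active run and a backlog, an `interrupt` request leaves the new request as the only queued item, which is what makes it "next" once the active run stops. On an idle lane it simply starts immediately.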
Across lanes:
- Independent sessions run in parallel.
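One common way to get "serial within a lane, parallel across lanes" is a promise chain per lane ID. The sketch below illustrates that technique only; `SketchLaneQueue` is a hypothetical name, and the real `LaneQueue` may be structured differently and also covers cancellation, caps, and overflow policy, which are left out here.

```typescript
type Task<T> = () => Promise<T>;

// Hypothetical minimal queue, not the implementation in src/gateway/lane-queue.ts.
class SketchLaneQueue {
  // One promise chain per lane; the stored promise settles when the lane's
  // most recently enqueued task has finished.
  private tails = new Map<string, Promise<void>>();

  enqueue<T>(laneId: string, task: Task<T>): Promise<T> {
    const tail = this.tails.get(laneId) ?? Promise.resolve();
    // Start only after everything already queued in this lane has settled.
    const run = tail.then(() => task());
    // The stored tail never rejects, so one failed task cannot block the lane.
    const settled: Promise<void> = run.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(laneId, settled);
    // Drop the bookkeeping entry once the lane goes idle.
    void settled.then(() => {
      if (this.tails.get(laneId) === settled) this.tails.delete(laneId);
    });
    return run;
  }
}
```

Two calls with the same `laneId` run strictly one after the other, while a call on a different `laneId` starts without waiting for either. The caller still receives each task's own result or rejection through the returned promise.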
Key files:
- `src/gateway/lane-queue.ts`
- `src/gateway/handlers/agent.ts`
## Cancellation Semantics
`agent.cancel` performs two separate actions:
1. Cancels any queued (not-yet-started) work in the lane (`LaneQueue.cancel(laneId)`).
2. Requests cancellation of the active agent operation (`AgentOrchestrator.cancel()` via `SessionBridge.cancel()`).
Important:
- Cancellation is best-effort for the currently running work: it stops at the next safe point in the agent loop.
- Queued work is deterministically rejected.
- The gateway streams `run_state` events (`start`, `cancel_requested`, `cancelled`, `complete`, `error`) so clients can render lifecycle state without parsing assistant text.
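The difference between the deterministic and best-effort halves of cancellation can be shown with a toy model. Everything below is an illustrative assumption: `cancelLane`, `runSteps`, and the `Lane`, `ActiveRun`, and `QueuedRequest` shapes are hypothetical, the "safe point" is modelled as the boundary between steps, and the exact order in which the real gateway emits `run_state` events may differ.

```typescript
type RunState = "start" | "cancel_requested" | "cancelled" | "complete" | "error";
type Emit = (runId: string, state: RunState) => void;

interface QueuedRequest {
  id: string;
  reject: (reason: string) => void;
}

interface ActiveRun {
  id: string;
  cancelRequested: boolean;
}

interface Lane {
  queued: QueuedRequest[];
  active: ActiveRun | null;
}

// Mirrors the two actions of agent.cancel (hypothetical shape).
function cancelLane(lane: Lane, emit: Emit): number {
  // 1. Queued work never started, so it can be rejected deterministically.
  const dropped = lane.queued.splice(0);
  for (const request of dropped) request.reject("cancelled before start");

  // 2. Active work is only flagged; the agent loop decides when to stop.
  if (lane.active !== null && !lane.active.cancelRequested) {
    lane.active.cancelRequested = true;
    emit(lane.active.id, "cancel_requested");
  }
  return dropped.length;
}

// A toy agent loop whose safe point is the boundary between steps.
function runSteps(run: ActiveRun, steps: Array<() => void>, emit: Emit): RunState {
  emit(run.id, "start");
  for (const step of steps) {
    if (run.cancelRequested) {
      emit(run.id, "cancelled");
      return "cancelled";
    }
    try {
      step();
    } catch {
      emit(run.id, "error");
      return "error";
    }
  }
  emit(run.id, "complete");
  return "complete";
}
```

In this model a cancel that arrives during a step lets that step finish and stops before the next one. A cancel that arrives during the final step still ends in `complete`, which is one reason a client should rely on the streamed state rather than assume every cancel request ends in `cancelled`.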
Key files:
- `src/gateway/handlers/agent.ts`
- `src/backends/native/orchestrator.ts`