# Gateway Sessions and Queueing (Agent Execution Model)

This document explains how the gateway maps WebSocket clients onto durable sessions, and how work is serialised per session so agent execution stays coherent under concurrent requests.

If you only want the protocol surface, see `docs/api/PROTOCOL.md`.

## Key Ideas

- A WebSocket client gets a `connectionId`.
- Each connection is attached to a `sessionId`.
- Agent work is queued per `sessionId` (FIFO), not per connection.
- Sessions persist in SQLite via `SessionManager` even if clients disconnect.
- On the first message of a session, the orchestrator can inject session-start memory (`user/profile` + `user/working`) when `memory.user_namespace` is configured.
- Once a message is dequeued, routing may execute the native orchestrator path or an optional external backend path (`claude_code`, `opencode`, `codex`, `gemini`, `pi_embedded`), depending on agent/backend config.
- Runtime backend mode can be overridden manually via the `/runtime` command fast-path (`status`, `activate pi`, `deactivate pi`, `use config`); the override is persisted in preferences (`/backend` remains a compatibility alias).
- `flynn tui` attaches to this same gateway command path for `/runtime ...` and auto-starts or attaches to the daemon and gateway when needed.
- Backend routing outcomes are auditable via `backend.route` / `backend.success` / `backend.fallback`, which enables offline canary evaluation without changing gateway protocol methods.
- Run lifecycle/cancel intent and reaction decisions are emitted to audit logs and aggregated into `system.metrics` counters (`runStates`, `cancelLatencyMs`, `reactions`) for dashboards.
- Reaction matching is deterministic (priority + cooldown + recursion guard) and runs before intent/agent routing.
- `subagent.*` tools create child orchestrators scoped to the parent conversation (`subagent:<parentSessionId>:<childId>`) with idle TTL cleanup, per-child queue mode (`followup|interrupt`), and session budgets (turn/token/timeout); this is tool-loop behavior, not a separate gateway RPC session lane.
- Browser workflow reliability primitives (`browser.wait_for/assert/extract/checkpoint.*`) execute in the same queued session lane and apply browser-config guardrails (domain allowlist/high-risk confirmation, bounded retries, workflow step budget).
- Companion `node.*` registration is per WebSocket connection; reconnects must re-register capabilities before invoking node RPC methods (or use runtime-client reconnect state replay to restore registration, status, location, and push token automatically).
- Companion packaging/bootstrap can be generated offline via `flynn companion --export-bootstrap <path|->`, which emits resolved gateway/node/runtime settings without opening a WebSocket session.
- Companion release artifacts can be generated via `flynn companion --export-release-bundle <dir>`, producing bootstrap JSON + launcher + README + `CHECKSUMS.sha256` + `RELEASE_MANIFEST.json` for installable shell distribution workflows.
- Generated launchers validate `CHECKSUMS.sha256` before invoking `flynn companion`, reducing accidental tampered-bundle launches.
- Companion release-bundle exports can optionally be signed (`--signing-key`, `--signing-key-id`) to emit `CHECKSUMS.sha256.sig` for distribution trust verification.
- Companion release bundles can be verified before install via `flynn companion --verify-release-bundle <dir>` with optional signature-key checks.
- Companion packaging automation is available via `pnpm companion:bundle -- --output <dir> ...`, which builds and verifies the release bundle in one pass.
- Companion platform starter scaffolds can be generated via `flynn companion --export-shell-template <dir>` for macOS/iOS/Android reference app bootstrapping, including iOS/Android runtime skeletons that issue `node.register`, `node.status.set`, `node.location.set`, `node.push_token.set`, and `agent.send`.
- Companion reference app directories can be regenerated via `pnpm companion:reference-apps -- --output apps/companion` for repo-shipped starter surfaces, including a runnable `macos-app` Swift Package menu-bar scaffold.
- Companion reference-app sync can be enforced with `pnpm companion:reference-apps:check`, which regenerates the apps and fails if the diff shows drift.
- CI workflow `.github/workflows/companion-release-bundle.yml` mirrors this pipeline for manual artifact generation/upload.
- CI workflow `.github/workflows/companion-reference-apps-check.yml` enforces reference-app generator sync on pull requests.
- Audit phase-0 live telemetry snapshots can be regenerated with `pnpm audit:phase0-baseline:live` (channel-origin anonymized sample JSONL + summary JSON/markdown artifacts).
- Backend-scoped channel snapshots can be regenerated with `pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native` (`--backend` filtering via `backend.route` timelines).
- Gateway-origin phase-0 windows (including cancel-path samples) can be captured with `pnpm audit:phase0-baseline:live:gateway` (auto-detect latest cancel window) or `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...` for explicit bounds.
- `pnpm audit:phase0-baseline:live:refresh` runs channel + gateway + backend-scoped (`pi_embedded` and `native`) capture paths in one command.
- `pnpm audit:phase0-baseline:live:drift` checks backend-scoped artifact freshness/drift gates (including optional reaction match/skip rate thresholds) and writes `phase0_baseline_live_backend_drift_<UTC-date>.md/.json`; `pnpm audit:phase0-baseline:live:refresh:drift` chains refresh + drift checks for scheduled cadence runs.
- `pnpm audit:phase0-baseline:live:refresh:drift:rolling` performs the same chain using one UTC timestamp tag (`YYYY-MM-DD-HHMMSS`) across channel/gateway/backend/drift outputs so each cadence run preserves a distinct comparison point.
- `pnpm audit:phase0-baseline:live:prune` (dry-run) and `pnpm audit:phase0-baseline:live:prune:apply` (delete) manage retention of rolling-tag artifacts to control artifact growth while preserving the newest snapshots per family.
- `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` combines rolling refresh+drift with retention apply for one-command cron scheduling using a shared `TAG`; adjust retention depth with the non-negative integer `KEEP_PER_FAMILY`, and use the generated `phase0_baseline_live_prune_<tag>.md/.json` artifacts for retention audit traceability.
- Retention management also covers rolling prune-report artifacts (`phase0_baseline_live_prune_<rolling-tag>.md/.json`) as a first-class family.
- `audit:phase0-baseline:live*` package scripts omit fixed tags so scheduled runs automatically roll to current UTC-date artifact tags.
- Companion CLI supports one-shot shell bootstrap metadata for live sessions (`--app-version`/`--status-text`, `--latitude`/`--longitude`, `--push-token`) so desktop/mobile wrappers can initialize node status/location/push in a single launch flow.
- Canvas artifacts are persisted per session under the gateway data directory for UI recovery across restarts.
- TTS output is best-effort with ordered provider fallback + per-provider cooldown tracking; synthesis failures still fall back to text-only responses.
- Talk mode voice sessions share the same cancel/replace semantics as text lanes (`/stop`, interrupt mode preemption), including spoken `stop`/`cancel` mapping while talk mode is active.
- Setup/onboarding UX adds post-save live readiness checks (model/channel/memory/automation) and a guided first-success task flow, which makes the zero-to-first-automation path more reliable before sustained gateway use.

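
To make the reaction-matching bullet above more concrete, here is a minimal TypeScript sketch of deterministic matching with priority ordering, per-rule cooldowns, and a recursion guard. The names `ReactionRule`, `MatchContext`, `matchReaction`, and `MAX_REACTION_DEPTH` are hypothetical and not taken from the Flynn source; the tie-break by rule ID and the depth limit of 1 are assumptions chosen only to keep the outcome reproducible.

```typescript
// Hypothetical shapes for illustration; not the Flynn implementation.
interface ReactionRule {
  id: string;
  priority: number; // higher priority is tried first
  cooldownMs: number; // minimum gap between two firings of this rule
  matches: (text: string) => boolean;
}

interface MatchContext {
  nowMs: number;
  depth: number; // how many reactions led to the current message
  lastFiredMs: Map<string, number>; // rule id -> time of last firing
}

const MAX_REACTION_DEPTH = 1; // assumed recursion limit

function matchReaction(
  rules: ReactionRule[],
  text: string,
  ctx: MatchContext,
): ReactionRule | undefined {
  // Recursion guard: output produced by a reaction does not trigger another one.
  if (ctx.depth >= MAX_REACTION_DEPTH) return undefined;

  // Order by priority, then by id, so ties always resolve the same way.
  const ordered = [...rules].sort(
    (a, b) => b.priority - a.priority || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0),
  );

  for (const rule of ordered) {
    const last = ctx.lastFiredMs.get(rule.id);
    const coolingDown = last !== undefined && ctx.nowMs - last < rule.cooldownMs;
    if (coolingDown || !rule.matches(text)) continue;
    ctx.lastFiredMs.set(rule.id, ctx.nowMs); // record the firing for future cooldown checks
    return rule;
  }
  return undefined; // nothing matched: fall through to intent/agent routing
}
```

With two matching rules, the higher-priority rule fires first; while it is cooling down the lower-priority rule is selected instead, and once both are cooling down the message falls through to normal routing.
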
## Component Map

```mermaid
flowchart LR
  subgraph CFG[Config + Runtime Policy]
    QP["server.queue policy\nmode/cap/overflow/overrides"]
    BM["backend runtime mode\nconfig_default|force_native|force_pi_embedded"]
  end

  subgraph GW[Gateway Process]
    TUI["TUI client\n(runtime command forwarding)"]
    WS["WebSocket connection\n(connectionId)"]
    GS[GatewayServer]
    LQ["LaneQueue\nper-session FIFO"]
    SB["SessionBridge\nconnectionId -> sessionId -> AgentOrchestrator"]
    AQ["AuditLogger\nqueue.preempt events"]
    MC["MetricsCollector\nrun states + cancel latency + reactions"]
    UI[Web UI Dashboard]
  end

  subgraph CORE[Flynn Core]
    SM["SessionManager\nin-memory cache + SQLite"]
    SS["SessionStore\nSQLite tables"]
    AO["AgentOrchestrator / External Backends"]
  end

  TUI --> WS
  WS --> GS
  QP --> GS
  BM --> GS
  GS --> LQ
  GS --> SB
  LQ --> AQ
  GS --> MC
  MC --> UI

  SB --> AO
  SB --> SM
  SM --> SS
```

## Session IDs (What Actually Gets Stored)

The durable session ID stored by `SessionManager` has the form:

`<frontend>:<userId>`

For the gateway:

- `SessionBridge.connect()` assigns a `connectionId` (UUID).
- It defaults the connection's `sessionId` to `ws:<connectionId>`.
- It then calls `SessionManager.getSession('ws', sessionId)`.

That means gateway sessions are stored as:

- `ws:ws:<connectionId>`

The double prefix is expected: the gateway adds its own `ws:` namespace, and the session manager then namespaces again by frontend (`ws`).

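
The double prefix is easy to reproduce by hand. The sketch below composes the stored key from the two rules stated above; `storedSessionKey` and `defaultGatewaySessionId` are hypothetical helper names used only for illustration and are not the `SessionBridge` or `SessionManager` API.

```typescript
// Storage rule from the text: durable ID = `<frontend>:<userId>`.
function storedSessionKey(frontend: string, userId: string): string {
  return `${frontend}:${userId}`;
}

// Gateway default from the text: the connection's sessionId is `ws:<connectionId>`.
function defaultGatewaySessionId(connectionId: string): string {
  return `ws:${connectionId}`;
}

const connectionId = "3f2a"; // placeholder; the real value is a UUID
const sessionId = defaultGatewaySessionId(connectionId); // "ws:3f2a"
const storedAs = storedSessionKey("ws", sessionId); // "ws:ws:3f2a"
```
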
Key files:

- `src/gateway/session-bridge.ts`
- `src/session/manager.ts`

## Per-Session FIFO Queueing (LaneQueue)

`agent.send` uses a lane ID derived from the session:

- lane = `SessionBridge.getSessionId(connectionId)` (preferred)
- fallback lane = `connectionId` (only if session lookup fails)

Within a lane:

- Only one request executes at a time.
- Later requests queue (FIFO) and start after the active request finishes.
- `interrupt` mode enforces "latest wins": any queued backlog is superseded and the active run is asked to cancel, so the newest request becomes the next (or immediate) execution.

Across lanes:

- Independent sessions run in parallel.

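
The lane rules above can be pictured as a small state machine. This is a minimal sketch under stated assumptions, not the contents of `lane-queue.ts`: `LaneQueueSketch`, `SubmitResult`, and `resolveLaneId` are hypothetical names, the default mode name `followup` is borrowed from the subagent queue modes listed under Key Ideas, and the real queue presumably drives asynchronous work rather than returning instructions to the caller.

```typescript
type QueueMode = "followup" | "interrupt"; // "followup" as the default is an assumption

interface Lane<T> {
  active?: T; // at most one request executes per lane
  pending: T[]; // FIFO backlog
}

interface SubmitResult<T> {
  startNow: boolean; // the caller should begin executing this request
  superseded: T[]; // queued requests dropped by an interrupt
  cancelActive: boolean; // the caller should ask the active run to cancel
}

// Lane ID derivation: prefer the session ID, fall back to the connection ID.
function resolveLaneId(
  connectionId: string,
  getSessionId: (id: string) => string | undefined,
): string {
  return getSessionId(connectionId) ?? connectionId;
}

class LaneQueueSketch<T> {
  private lanes = new Map<string, Lane<T>>();

  submit(laneId: string, request: T, mode: QueueMode = "followup"): SubmitResult<T> {
    const lane: Lane<T> = this.lanes.get(laneId) ?? { pending: [] };
    this.lanes.set(laneId, lane);

    if (lane.active === undefined) {
      lane.active = request; // idle lane: run immediately
      return { startNow: true, superseded: [], cancelActive: false };
    }
    if (mode === "interrupt") {
      const superseded = lane.pending; // "latest wins": drop the backlog
      lane.pending = [request];
      return { startNow: false, superseded, cancelActive: true };
    }
    lane.pending.push(request); // default: wait in FIFO order
    return { startNow: false, superseded: [], cancelActive: false };
  }

  // Call when the active request completes or acknowledges cancellation.
  finish(laneId: string): T | undefined {
    const lane = this.lanes.get(laneId);
    if (!lane) return undefined;
    lane.active = lane.pending.shift();
    if (lane.active === undefined) this.lanes.delete(laneId);
    return lane.active; // next request to start, if any
  }
}
```

Because each lane keeps its own `active` slot, a request submitted to one session starts immediately even while another session is busy, which is the cross-lane parallelism described above.
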
Key files:

- `src/gateway/lane-queue.ts`
- `src/gateway/handlers/agent.ts`

## Cancellation Semantics

`agent.cancel` performs two separate actions:

1. Cancels any queued (not-yet-started) work in the lane (`LaneQueue.cancel(laneId)`).
2. Requests cancellation of the active agent operation (`AgentOrchestrator.cancel()` via `SessionBridge.cancel()`).

Important:

- Cancellation is best-effort for the currently running work: it stops at the next safe point in the agent loop.
- Queued work is deterministically rejected.
- The gateway streams `run_state` events (`start`, `cancel_requested`, `cancelled`, `complete`, `error`) so clients can render lifecycle state without parsing assistant text.

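
The difference between the two actions can be sketched as follows. This is an illustration only: `RunSketch` and `cancelQueued` are hypothetical names, the "safe point" is modelled as a check between steps, and the real orchestrator may place its checks differently. The `RunState` values are the `run_state` event names listed above.

```typescript
type RunState = "start" | "cancel_requested" | "cancelled" | "complete" | "error";

// Action 1: queued work has not started, so it can always be rejected.
function cancelQueued(pending: string[]): string[] {
  return pending.splice(0, pending.length); // empties the backlog, returns what was rejected
}

// Action 2: the active run only stops when it next checks the flag.
class RunSketch {
  private cancelRequested = false;
  readonly events: RunState[] = [];

  requestCancel(): void {
    if (this.cancelRequested) return;
    this.cancelRequested = true;
    this.events.push("cancel_requested");
  }

  run(steps: Array<() => void>): RunState {
    this.events.push("start");
    for (const step of steps) {
      // Safe point: checked before each step, never in the middle of one.
      if (this.cancelRequested) return this.end("cancelled");
      try {
        step();
      } catch {
        return this.end("error");
      }
    }
    // A cancel requested during the final step arrives too late to take effect.
    return this.end("complete");
  }

  private end(state: RunState): RunState {
    this.events.push(state);
    return state;
  }
}
```

A cancel requested during the second of three steps lets that step finish and skips the third, and the recorded events are `start`, `cancel_requested`, `cancelled`. A cancel requested during the last step still ends in `complete`, which is why the running side is described as best-effort.
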
Key files:

- `src/gateway/handlers/agent.ts`
- `src/backends/native/orchestrator.ts`