docs(observability): document phase-0 telemetry and baseline workflow

William Valentin
2026-02-25 09:22:56 -08:00
parent 0b8f7c7299
commit 8b5266c66c
8 changed files with 179 additions and 1 deletions
+10
@@ -1494,6 +1494,16 @@ Session lifecycle now includes proactive context maintenance events:
- `session.checkpoint` when proactive checkpoint summaries are written to memory
- `session.auto_compact` when proactive critical-threshold auto-compaction runs
Phase 0 baseline observability adds:
- `run.state` (start/complete/cancel_requested/cancelled/error)
- `run.cancel` (cancel intent/ack + latency)
- `reaction.match` / `reaction.skip` (reaction decision outcomes + skip reasons)
Baseline summaries can be generated from audit logs:
```bash
pnpm audit:phase0-baseline --audit ~/.local/share/flynn/audit.log --since 2026-02-25T00:00:00Z --format markdown
```
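The `--since` bound is an ISO-8601 timestamp, while the generated baseline JSON records the same bound as epoch milliseconds in `filters.since_ms`. The sketch below shows that conversion. `sinceToMs` is a hypothetical helper written for illustration and is not taken from the summarizer script.

```typescript
// Hypothetical helper (not part of the Flynn codebase): convert a `--since`
// ISO-8601 timestamp into epoch milliseconds.
function sinceToMs(since: string): number {
  const ms = Date.parse(since);
  if (Number.isNaN(ms)) {
    // Reject values Date.parse cannot interpret instead of filtering on NaN.
    throw new Error(`invalid --since timestamp: ${since}`);
  }
  return ms;
}

// The timestamp used in the command above.
const sinceMs: number = sinceToMs("2026-02-25T00:00:00Z");
// sinceMs is 1771977600000, the value stored under `filters.since_ms`
// in the baseline JSON artifact generated for the same window.
```

Passing an explicit `Z` suffix keeps the window boundary in UTC regardless of the host time zone.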
## Gateway Lock
Single-client mode for the WebSocket gateway. When enabled, only one WebSocket connection is allowed at a time. Additional connections are rejected with close code `4003`.
+53
@@ -352,6 +352,59 @@ Useful for proactive compaction monitoring and operator dashboards.
}
```
#### `system.metrics`
Return an aggregated snapshot of gateway metrics (used by the dashboard).
Includes run-state counters, cancel latency samples, and reaction decision counters.
**Request:**
```json
{
"id": 11,
"method": "system.metrics"
}
```
**Response:**
```json
{
"id": 11,
"result": {
"messagesProcessed": 120,
"errors": 2,
"activeRequests": 1,
"uptime": 3600,
"modelCalls": {
"total": 15,
"avgLatency": 420,
"errorRate": 0.07,
"recentCalls": []
},
"runStates": {
"start": 25,
"complete": 22,
"cancel_requested": 1,
"cancelled": 1,
"error": 1
},
"cancelLatencyMs": {
"sampleCount": 4,
"samples": [120, 240, 310, 95]
},
"reactions": {
"matched": 6,
"skipped": 3,
"skipReasons": {
"no_match": 2,
"no_rules": 1
}
},
"queueDepth": 0
}
}
```
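A client that polls `system.metrics` may want to reduce the raw `cancelLatencyMs.samples` array to a few summary numbers. The following is a minimal sketch under stated assumptions: the `CancelLatencyMs` interface mirrors only the fields shown in the response example, while `LatencySummary`, `nearestRank`, and `summarizeCancelLatency` are hypothetical names introduced here, and nearest-rank is one percentile convention chosen for illustration, not necessarily the one the dashboard uses.

```typescript
// Shape of `cancelLatencyMs` as shown in the response example above.
interface CancelLatencyMs {
  sampleCount: number;
  samples: number[];
}

// Hypothetical summary shape; not part of the gateway protocol.
interface LatencySummary {
  count: number;
  minMs: number;
  maxMs: number;
  avgMs: number;
  p50Ms: number;
  p95Ms: number;
}

// Nearest-rank percentile over an ascending, non-empty array.
function nearestRank(sorted: number[], pct: number): number {
  const rawRank = Math.ceil((pct / 100) * sorted.length);
  const rank = Math.min(sorted.length, Math.max(1, rawRank));
  return sorted[rank - 1] as number;
}

// Returns null when there are no samples, so callers can render "n/a".
function summarizeCancelLatency(latency: CancelLatencyMs): LatencySummary | null {
  if (latency.samples.length === 0) {
    return null;
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...latency.samples].sort((a, b) => a - b);
  const sum = sorted.reduce((acc, v) => acc + v, 0);
  return {
    count: sorted.length,
    minMs: sorted[0] as number,
    maxMs: sorted[sorted.length - 1] as number,
    avgMs: sum / sorted.length,
    p50Ms: nearestRank(sorted, 50),
    p95Ms: nearestRank(sorted, 95),
  };
}

// Samples taken from the response example above.
const exampleSummary = summarizeCancelLatency({
  sampleCount: 4,
  samples: [120, 240, 310, 95],
});
// exampleSummary: count 4, minMs 95, maxMs 310, avgMs 191.25, p50Ms 120, p95Ms 310
```

With only a handful of samples, high percentiles collapse to the maximum, so `p95Ms` is mostly useful once the sample count grows.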
#### `system.localBackends`
Return status for user-level local LLM backend daemons (for example `ollama.service` and `llama-server.service`).
+3
@@ -27,6 +27,7 @@ flowchart LR
subgraph HOST[Host (Flynn Daemon)]
CA[ChannelAdapters]
GW[Gateway\nHTTP + WS JSON-RPC + Web UI]
MC[MetricsCollector\nrun state + cancel latency + reactions]
RT[Routing\ncreateMessageRouter()]
PF[Preferences\n~/.local/share/flynn/preferences.json\nmodelTier + backendMode]
SM[SessionManager\nSQLite]
@@ -60,6 +61,7 @@ flowchart LR
CH --> CA
GW --> RT
GW --> MC
CA --> RT
RT --> SM
RT --> OR
@@ -147,6 +149,7 @@ Key files:
- Model routing: `src/models/router.ts`
- Tool policy + execution: `src/tools/policy.ts`, `src/tools/executor.ts`
- Canary backend telemetry summarization (offline evaluation): `src/audit/backendCanarySummary.ts`, `scripts/summarize-backend-canary.ts`
- Phase 0 baseline telemetry summarization: `src/audit/phase0BaselineSummary.ts`, `scripts/summarize-phase0-baseline.ts`
## Component Graph (Agent-Safety Boundary)
@@ -14,6 +14,7 @@ If you only want the protocol surface, see `docs/api/PROTOCOL.md`.
- Runtime backend mode can be overridden manually via `/runtime` command fast-path (`status`, `activate pi`, `deactivate pi`, `use config`) and is persisted in preferences (`/backend` remains a compatibility alias).
- `flynn tui` now attaches to this same gateway command path for `/runtime ...` and auto-starts/attaches daemon+gateway when needed.
- Backend routing outcomes are auditable via `backend.route` / `backend.success` / `backend.fallback`, which enables offline canary evaluation without changing gateway protocol methods.
- Run lifecycle/cancel intent and reaction decisions are emitted to audit logs and aggregated into `system.metrics` counters (`runStates`, `cancelLatencyMs`, `reactions`) for dashboards.
## Component Map
@@ -31,6 +32,8 @@ flowchart LR
LQ[LaneQueue\nper-session FIFO]
SB[SessionBridge\nconnectionId -> sessionId -> AgentOrchestrator]
AQ[AuditLogger\nqueue.preempt events]
MC[MetricsCollector\nrun states + cancel latency + reactions]
UI[Web UI Dashboard]
end
subgraph CORE[Flynn Core]
@@ -46,6 +49,8 @@ flowchart LR
GS --> LQ
GS --> SB
LQ --> AQ
GS --> MC
MC --> UI
SB --> AO
SB --> SM
@@ -149,6 +149,8 @@ Add or extend report tooling to summarize phase-0 telemetry slices:
## Ticket 0.5 — Docs + Diagram + State Sync
Status: completed (2026-02-25)
### Scope
Document new observability fields and baseline workflow:
@@ -0,0 +1,44 @@
{
"generated_at": "2026-02-25T17:20:35.391Z",
"event_count": 0,
"filters": {
"since_ms": 1771977600000
},
"options": {
"maxSessions": 20,
"maxChannels": 20,
"maxSkipReasons": 10
},
"summary": {
"event_counts": {
"run_state": 0,
"run_cancel": 0,
"reaction_match": 0,
"reaction_skip": 0
},
"run_outcomes": {
"overall": {
"total_outcomes": 0,
"complete": 0,
"cancelled": 0,
"error": 0,
"cancel_requested": 0,
"start": 0,
"completion_rate_pct": null,
"cancel_rate_pct": null,
"error_rate_pct": null
},
"by_channel": [],
"by_session": []
},
"cancel_latency_ms": null,
"reactions": {
"matched": 0,
"skipped": 0,
"total": 0,
"match_rate_pct": null,
"skip_rate_pct": null,
"skip_reasons": []
}
}
}
@@ -0,0 +1,43 @@
# Phase 0 Baseline Telemetry Summary
- Run state events: 0
- Run cancel events: 0
- Reaction matches: 0
- Reaction skips: 0
## Run Outcomes (Overall)
- Total outcomes: 0
- Complete: 0 (n/a)
- Cancelled: 0 (n/a)
- Errors: 0 (n/a)
- Cancel requested: 0
- Starts: 0
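
The `n/a` entries above correspond to the `null` values of `completion_rate_pct`, `cancel_rate_pct`, and `error_rate_pct` in the JSON artifact: with zero outcomes there is no denominator. The sketch below illustrates one way such rates could be derived. It assumes that `total_outcomes` is the sum of `complete`, `cancelled`, and `error` (the empty sample window cannot confirm this) and that percentages are rounded to two decimals. `pct` and `runOutcomeRates` are hypothetical names, not the functions in `src/audit/phase0BaselineSummary.ts`.

```typescript
// Run-state counters, using the keys shown in the baseline JSON artifact.
interface RunStateCounts {
  start: number;
  complete: number;
  cancel_requested: number;
  cancelled: number;
  error: number;
}

// Hypothetical result shape mirroring the artifact's rate field names.
interface RunOutcomeRates {
  total_outcomes: number;
  completion_rate_pct: number | null;
  cancel_rate_pct: number | null;
  error_rate_pct: number | null;
}

// Percentage rounded to two decimals, or null when there is no denominator.
function pct(part: number, total: number): number | null {
  if (total === 0) {
    return null;
  }
  return Math.round((part / total) * 10000) / 100;
}

function runOutcomeRates(counts: RunStateCounts): RunOutcomeRates {
  // Assumption: only terminal states count as outcomes.
  const total = counts.complete + counts.cancelled + counts.error;
  return {
    total_outcomes: total,
    completion_rate_pct: pct(counts.complete, total),
    cancel_rate_pct: pct(counts.cancelled, total),
    error_rate_pct: pct(counts.error, total),
  };
}

// Empty window, as in this baseline: every rate is null (rendered as n/a).
const emptyWindow = runOutcomeRates({
  start: 0,
  complete: 0,
  cancel_requested: 0,
  cancelled: 0,
  error: 0,
});
```

Returning `null` rather than `0` keeps an empty window distinguishable from a window in which every run failed.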
## Run Outcomes by Channel
| Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| _none_ | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 0 |
## Run Outcomes by Session
| Session | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| _none_ | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 0 |
## Cancel Latency
- No cancel latency samples.
## Reaction Decisions
- Matched: 0 (n/a)
- Skipped: 0 (n/a)
### Skip Reasons
| Reason | Count | Percent |
| --- | ---: | ---: |
| _none_ | 0 | 0.00% |
+19 -1
@@ -63,6 +63,23 @@
],
"test_status": "pnpm test:run src/audit/phase0BaselineSummary.test.ts passing"
},
"phase0-ticket-0.5-docs-diagram-state-sync": {
"status": "completed",
"date": "2026-02-25",
"updated": "2026-02-25",
"summary": "Updated protocol/docs/diagrams for phase-0 telemetry fields, documented baseline workflow, and generated initial phase-0 baseline artifacts (empty sample window).",
"files_modified": [
"README.md",
"docs/api/PROTOCOL.md",
"docs/architecture/AGENT_DIAGRAM.md",
"docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md",
"docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
"docs/plans/artifacts/phase0_baseline_2026-02-25.md",
"docs/plans/artifacts/phase0_baseline_2026-02-25.json",
"docs/plans/state.json"
],
"test_status": "pnpm audit:phase0-baseline --audit ~/.local/share/flynn/audit.log --since 2026-02-25T00:00:00Z --format markdown/json (0 events in window)"
},
"phase0-instrumentation-ticket-checklist": {
"status": "completed",
"date": "2026-02-25",
@@ -6677,7 +6694,8 @@
"deeper_surfaces_phase0_ticket_02": "completed — gateway + daemon routing emit run lifecycle/cancel telemetry and reaction match/skip audit events with filter summaries and cancellation latency, plus focused tests",
"deeper_surfaces_phase0_ticket_03": "completed — gateway metrics now track run-state outcomes, cancel latency samples, and reaction decision counters with routing/gateway emitters",
"deeper_surfaces_phase0_ticket_04": "completed — added phase-0 baseline summary tooling for run outcomes, cancel latency, and reaction decisions with markdown/json CLI output",
"next_up": "Implement Ticket 0.5 from docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, and generated baseline artifacts",
"next_up": "Exercise gateway + channel sessions to emit run/reaction events, then regenerate phase-0 baseline artifacts with real samples",
"pi_embedded_canary_spike": "completed — added optional pi_embedded backend adapter, canary-safe no-tools routing guard, backend success/fallback latency audit events, and docs/diagram updates while native remains default",
"pi_embedded_evaluation_phase": "completed — final decision rollback (applied in runtime config): Window A failed latency/fallback gates (p50 +259ms, p95 +5695ms, fallback 25%, categories: pi_module_interface/empty_assistant_text); Window B remained sample-insufficient; controlled probes verified guard coverage (pi_no_tools_mode/capability_query/attachments_present each hit once)",
"pi_embedded_manual_mode": "completed — added persisted runtime backend controls for manual Pi activation/deactivation (`/runtime` preferred, `/backend` alias; `status`, `activate pi`, `deactivate pi`, `use config`) while keeping config-driven default routing",