docs(observability): document phase-0 telemetry and baseline workflow
@@ -1494,6 +1494,16 @@ Session lifecycle now includes proactive context maintenance events:
- `session.checkpoint` when proactive checkpoint summaries are written to memory
- `session.auto_compact` when proactive critical-threshold auto-compaction runs

Phase 0 baseline observability adds:
- `run.state` (start/complete/cancel_requested/cancelled/error)
- `run.cancel` (cancel intent/ack + latency)
- `reaction.match` / `reaction.skip` (reaction decision outcomes + skip reasons)

Baseline summaries can be generated from audit logs:
```bash
pnpm audit:phase0-baseline --audit ~/.local/share/flynn/audit.log --since 2026-02-25T00:00:00Z --format markdown
```
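
The baseline summary reports outcome rates as percentages, and the empty-window artifact in this commit shows them as `null` rather than `0` when no outcomes fall in the window. A minimal sketch of that convention follows. It assumes that total outcomes are the sum of `complete`, `cancelled`, and `error`; the `OutcomeCounts` type and the `pct` and `outcomeRates` helpers are hypothetical names for illustration, not the code in `src/audit/phase0BaselineSummary.ts`.

```typescript
// Hypothetical input shape; field names mirror the run.state outcomes above.
interface OutcomeCounts {
  complete: number;
  cancelled: number;
  error: number;
}

interface OutcomeRates {
  total_outcomes: number;
  completion_rate_pct: number | null;
  cancel_rate_pct: number | null;
  error_rate_pct: number | null;
}

// Percentage rounded to two decimals, or null when there is no denominator.
function pct(part: number, total: number): number | null {
  if (total === 0) return null;
  return Math.round((part / total) * 10000) / 100;
}

// Assumption: terminal outcomes are complete + cancelled + error.
function outcomeRates(counts: OutcomeCounts): OutcomeRates {
  const total = counts.complete + counts.cancelled + counts.error;
  return {
    total_outcomes: total,
    completion_rate_pct: pct(counts.complete, total),
    cancel_rate_pct: pct(counts.cancelled, total),
    error_rate_pct: pct(counts.error, total),
  };
}
```

Returning `null` for an empty window keeps "no data" distinguishable from a real 0% rate when the summary is rendered as `n/a`.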

## Gateway Lock

Single-client mode for the WebSocket gateway. When enabled, only one WebSocket connection is allowed at a time. Additional connections are rejected with close code `4003`.
@@ -352,6 +352,59 @@ Useful for proactive compaction monitoring and operator dashboards.
}
```

#### `system.metrics`

Return aggregated gateway metrics snapshot (used by the dashboard).

Includes run-state counters, cancel latency samples, and reaction decision counters.

**Request:**
```json
{
  "id": 11,
  "method": "system.metrics"
}
```

**Response:**
```json
{
  "id": 11,
  "result": {
    "messagesProcessed": 120,
    "errors": 2,
    "activeRequests": 1,
    "uptime": 3600,
    "modelCalls": {
      "total": 15,
      "avgLatency": 420,
      "errorRate": 0.07,
      "recentCalls": []
    },
    "runStates": {
      "start": 25,
      "complete": 22,
      "cancel_requested": 1,
      "cancelled": 1,
      "error": 1
    },
    "cancelLatencyMs": {
      "sampleCount": 4,
      "samples": [120, 240, 310, 95]
    },
    "reactions": {
      "matched": 6,
      "skipped": 3,
      "skipReasons": {
        "no_match": 2,
        "no_rules": 1
      }
    },
    "queueDepth": 0
  }
}
```
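
A dashboard client might reduce `cancelLatencyMs.samples` to a few headline numbers. The sketch below types only the `cancelLatencyMs` fields shown in the response above and computes count, mean, and max. `CancelLatency`, `LatencySummary`, and `summarizeCancelLatency` are hypothetical names; whether the gateway or the Web UI computes such aggregates itself is not stated in this document.

```typescript
// Mirrors the cancelLatencyMs object in the example response.
interface CancelLatency {
  sampleCount: number;
  samples: number[];
}

// Hypothetical client-side summary shape.
interface LatencySummary {
  count: number;
  meanMs: number;
  maxMs: number;
}

// Returns null when there are no samples, so callers can show "n/a".
function summarizeCancelLatency(latency: CancelLatency): LatencySummary | null {
  const { samples } = latency;
  if (samples.length === 0) return null;
  const sum = samples.reduce((acc, ms) => acc + ms, 0);
  return {
    count: samples.length,
    meanMs: sum / samples.length,
    maxMs: Math.max(...samples),
  };
}
```

With the four samples from the example response, this yields a mean of 191.25 ms and a max of 310 ms.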

#### `system.localBackends`

Return status for user-level local LLM backend daemons (for example `ollama.service` and `llama-server.service`).
@@ -27,6 +27,7 @@ flowchart LR
  subgraph HOST[Host (Flynn Daemon)]
    CA[ChannelAdapters]
    GW[Gateway\nHTTP + WS JSON-RPC + Web UI]
    MC[MetricsCollector\nrun state + cancel latency + reactions]
    RT[Routing\ncreateMessageRouter()]
    PF[Preferences\n~/.local/share/flynn/preferences.json\nmodelTier + backendMode]
    SM[SessionManager\nSQLite]
@@ -60,6 +61,7 @@ flowchart LR
  CH --> CA
  GW --> RT
  GW --> MC
  CA --> RT
  RT --> SM
  RT --> OR
@@ -147,6 +149,7 @@ Key files:
- Model routing: `src/models/router.ts`
- Tool policy + execution: `src/tools/policy.ts`, `src/tools/executor.ts`
- Canary backend telemetry summarization (offline evaluation): `src/audit/backendCanarySummary.ts`, `scripts/summarize-backend-canary.ts`
- Phase 0 baseline telemetry summarization: `src/audit/phase0BaselineSummary.ts`, `scripts/summarize-phase0-baseline.ts`

## Component Graph (Agent-Safety Boundary)
@@ -14,6 +14,7 @@ If you only want the protocol surface, see `docs/api/PROTOCOL.md`.
- Runtime backend mode can be overridden manually via `/runtime` command fast-path (`status`, `activate pi`, `deactivate pi`, `use config`) and is persisted in preferences (`/backend` remains a compatibility alias).
- `flynn tui` now attaches to this same gateway command path for `/runtime ...` and auto-starts/attaches daemon+gateway when needed.
- Backend routing outcomes are auditable via `backend.route` / `backend.success` / `backend.fallback`, which enables offline canary evaluation without changing gateway protocol methods.
- Run lifecycle/cancel intent and reaction decisions are emitted to audit logs, and aggregated into `system.metrics` counters (runStates, cancelLatencyMs, reactions) for dashboards.
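
The aggregation described in the last bullet can be sketched as a fold over audit events into the `runStates` and `reactions` counters. The event shape used here (`type`, `state`, `reason`) and the `aggregate` function are assumptions for illustration; the actual audit record fields and the `MetricsCollector` API may differ.

```typescript
type RunState = "start" | "complete" | "cancel_requested" | "cancelled" | "error";

// Hypothetical audit event shapes, named after the documented event types.
type AuditEvent =
  | { type: "run.state"; state: RunState }
  | { type: "reaction.match" }
  | { type: "reaction.skip"; reason: string };

// Counter layout follows the runStates and reactions fields of system.metrics.
interface Counters {
  runStates: Record<RunState, number>;
  reactions: {
    matched: number;
    skipped: number;
    skipReasons: Record<string, number>;
  };
}

function aggregate(events: AuditEvent[]): Counters {
  const counters: Counters = {
    runStates: { start: 0, complete: 0, cancel_requested: 0, cancelled: 0, error: 0 },
    reactions: { matched: 0, skipped: 0, skipReasons: {} },
  };
  for (const event of events) {
    switch (event.type) {
      case "run.state":
        counters.runStates[event.state] += 1;
        break;
      case "reaction.match":
        counters.reactions.matched += 1;
        break;
      case "reaction.skip": {
        // Count the skip and tally its reason.
        counters.reactions.skipped += 1;
        const seen = counters.reactions.skipReasons[event.reason] ?? 0;
        counters.reactions.skipReasons[event.reason] = seen + 1;
        break;
      }
    }
  }
  return counters;
}
```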

## Component Map
@@ -31,6 +32,8 @@ flowchart LR
    LQ[LaneQueue\nper-session FIFO]
    SB[SessionBridge\nconnectionId -> sessionId -> AgentOrchestrator]
    AQ[AuditLogger\nqueue.preempt events]
    MC[MetricsCollector\nrun states + cancel latency + reactions]
    UI[Web UI Dashboard]
  end

  subgraph CORE[Flynn Core]
@@ -46,6 +49,8 @@ flowchart LR
  GS --> LQ
  GS --> SB
  LQ --> AQ
  GS --> MC
  MC --> UI

  SB --> AO
  SB --> SM
@@ -149,6 +149,8 @@ Add or extend report tooling to summarize phase-0 telemetry slices:
## Ticket 0.5 — Docs + Diagram + State Sync

Status: completed (2026-02-25)

### Scope

Document new observability fields and baseline workflow:
@@ -0,0 +1,44 @@
{
  "generated_at": "2026-02-25T17:20:35.391Z",
  "event_count": 0,
  "filters": {
    "since_ms": 1771977600000
  },
  "options": {
    "maxSessions": 20,
    "maxChannels": 20,
    "maxSkipReasons": 10
  },
  "summary": {
    "event_counts": {
      "run_state": 0,
      "run_cancel": 0,
      "reaction_match": 0,
      "reaction_skip": 0
    },
    "run_outcomes": {
      "overall": {
        "total_outcomes": 0,
        "complete": 0,
        "cancelled": 0,
        "error": 0,
        "cancel_requested": 0,
        "start": 0,
        "completion_rate_pct": null,
        "cancel_rate_pct": null,
        "error_rate_pct": null
      },
      "by_channel": [],
      "by_session": []
    },
    "cancel_latency_ms": null,
    "reactions": {
      "matched": 0,
      "skipped": 0,
      "total": 0,
      "match_rate_pct": null,
      "skip_rate_pct": null,
      "skip_reasons": []
    }
  }
}
@@ -0,0 +1,43 @@
# Phase 0 Baseline Telemetry Summary

- Run state events: 0
- Run cancel events: 0
- Reaction matches: 0
- Reaction skips: 0

## Run Outcomes (Overall)

- Total outcomes: 0
- Complete: 0 (n/a)
- Cancelled: 0 (n/a)
- Errors: 0 (n/a)
- Cancel requested: 0
- Starts: 0

## Run Outcomes by Channel

| Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| _none_ | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 0 |

## Run Outcomes by Session

| Session | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| _none_ | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 0 |

## Cancel Latency

- No cancel latency samples.

## Reaction Decisions

- Matched: 0 (n/a)
- Skipped: 0 (n/a)

### Skip Reasons

| Reason | Count | Percent |
| --- | ---: | ---: |
| _none_ | 0 | 0.00% |
@@ -63,6 +63,23 @@
      ],
      "test_status": "pnpm test:run src/audit/phase0BaselineSummary.test.ts passing"
    },
    "phase0-ticket-0.5-docs-diagram-state-sync": {
      "status": "completed",
      "date": "2026-02-25",
      "updated": "2026-02-25",
      "summary": "Updated protocol/docs/diagrams for phase-0 telemetry fields, documented baseline workflow, and generated initial phase-0 baseline artifacts (empty sample window).",
      "files_modified": [
        "README.md",
        "docs/api/PROTOCOL.md",
        "docs/architecture/AGENT_DIAGRAM.md",
        "docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md",
        "docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
        "docs/plans/artifacts/phase0_baseline_2026-02-25.md",
        "docs/plans/artifacts/phase0_baseline_2026-02-25.json",
        "docs/plans/state.json"
      ],
      "test_status": "pnpm audit:phase0-baseline --audit ~/.local/share/flynn/audit.log --since 2026-02-25T00:00:00Z --format markdown/json (0 events in window)"
    },
    "phase0-instrumentation-ticket-checklist": {
      "status": "completed",
      "date": "2026-02-25",
@@ -6677,7 +6694,8 @@
"deeper_surfaces_phase0_ticket_02": "completed — gateway + daemon routing emit run lifecycle/cancel telemetry and reaction match/skip audit events with filter summaries and cancellation latency, plus focused tests",
"deeper_surfaces_phase0_ticket_03": "completed — gateway metrics now track run-state outcomes, cancel latency samples, and reaction decision counters with routing/gateway emitters",
"deeper_surfaces_phase0_ticket_04": "completed — added phase-0 baseline summary tooling for run outcomes, cancel latency, and reaction decisions with markdown/json CLI output",
"next_up": "Implement Ticket 0.5 from docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, and generated baseline artifacts",
"next_up": "Exercise gateway + channel sessions to emit run/reaction events, then regenerate phase-0 baseline artifacts with real samples",
"pi_embedded_canary_spike": "completed — added optional pi_embedded backend adapter, canary-safe no-tools routing guard, backend success/fallback latency audit events, and docs/diagram updates while native remains default",
"pi_embedded_evaluation_phase": "completed — final decision rollback (applied in runtime config): Window A failed latency/fallback gates (p50 +259ms, p95 +5695ms, fallback 25%, categories: pi_module_interface/empty_assistant_text); Window B remained sample-insufficient; controlled probes verified guard coverage (pi_no_tools_mode/capability_query/attachments_present each hit once)",
"pi_embedded_manual_mode": "completed — added persisted runtime backend controls for manual Pi activation/deactivation (`/runtime` preferred, `/backend` alias; `status`, `activate pi`, `deactivate pi`, `use config`) while keeping config-driven default routing",