docs(observability): capture gateway-origin phase0 live baseline window

William Valentin
2026-02-26 23:50:28 -08:00
parent 4b07a1f166
commit 5a34e986bf
10 changed files with 200 additions and 12 deletions
+12
@@ -1629,6 +1629,18 @@ Live baseline artifacts (sample JSONL + JSON/Markdown summaries) can be captured
pnpm audit:phase0-baseline:live
```
Gateway-origin windows can be captured separately (for example, when validating cancel paths) by restricting the source and the time window:
```bash
node --import tsx/esm scripts/capture-phase0-live-baseline.ts \
--audit ~/.local/share/flynn/audit.log \
--source gateway \
--since <epoch_ms_or_iso> \
--until <epoch_ms_or_iso> \
--sample-out docs/plans/artifacts/phase0_baseline_live_gateway_<tag>.jsonl \
--summary-json-out docs/plans/artifacts/phase0_baseline_live_gateway_<tag>.json \
--summary-md-out docs/plans/artifacts/phase0_baseline_live_gateway_<tag>.md
```
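As a rough illustration of what restricting by source and time window means, the sketch below filters audit JSONL lines the way the flags above suggest. It is not the script's actual implementation: `AuditLine`, `WindowFilter`, `toEpochMs`, and `filterWindow` are hypothetical names, the event shape is inferred from the sample JSONL artifact in this commit, and treating both bounds as inclusive is an assumption.

```typescript
// Hypothetical event shape, inferred from the committed sample JSONL (not the script's real types).
interface AuditLine {
  event_type: string;
  timestamp: number; // epoch milliseconds
  event?: { source?: string; session_id?: string };
}

// Hypothetical filter options mirroring --source / --since / --until.
interface WindowFilter {
  source: string;
  sinceMs: number;
  untilMs: number;
}

// Accept either epoch milliseconds or an ISO-8601 string, as <epoch_ms_or_iso> suggests.
function toEpochMs(value: string): number {
  const ms = /^\d+$/.test(value) ? Number(value) : Date.parse(value);
  if (Number.isNaN(ms)) {
    throw new Error(`unparseable time bound: ${value}`);
  }
  return ms;
}

// Keep only events from the requested source whose timestamp falls inside the window.
// Assumption: both bounds are inclusive.
function filterWindow(jsonl: string, filter: WindowFilter): AuditLine[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as AuditLine)
    .filter(
      (entry) =>
        entry.event?.source === filter.source &&
        entry.timestamp >= filter.sinceMs &&
        entry.timestamp <= filter.untilMs,
    );
}
```

Under these assumptions, `toEpochMs("2026-02-27T07:47:20.693Z")` and `toEpochMs("1772178440693")` yield the same bound, so either form could be passed to `--since`.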
## Gateway Lock
Single-client mode for the WebSocket gateway. When enabled, only one WebSocket connection is allowed at a time. Additional connections are rejected with close code `4003`.
+1 -1
@@ -23,7 +23,7 @@ The gateway provides:
- **HTTP Server**: Serves static dashboard and handles webhook endpoints
- **Node Capability Negotiation**: Optional companion-node role/capability registration
Operational note: onboarding (`flynn setup` / `flynn onboard`) now runs post-save live readiness checks (model/channel/memory/automation) and prints a guided first-success task flow. Companion CLI now also supports bootstrap-manifest export (`flynn companion --export-bootstrap <path|->`), release-bundle export (`--export-release-bundle <dir>` with optional `--signing-key`/`--signing-key-id` signature output), release-bundle verification (`--verify-release-bundle <dir>` with optional `--verify-signing-key`/`--verify-signing-key-id`/`--require-signature`), platform shell-template export (`--export-shell-template <dir>`), plus richer shell bootstrap flags for status/location/push (`--app-version`, `--latitude/--longitude`, `--push-token`, etc.) for desktop/mobile app packaging without changing JSON-RPC method/event shapes. Audit observability now includes a live phase-0 baseline capture flow (`pnpm audit:phase0-baseline:live`) that emits anonymized run/reaction sample artifacts.
Operational note: onboarding (`flynn setup` / `flynn onboard`) now runs post-save live readiness checks (model/channel/memory/automation) and prints a guided first-success task flow. Companion CLI now also supports bootstrap-manifest export (`flynn companion --export-bootstrap <path|->`), release-bundle export (`--export-release-bundle <dir>` with optional `--signing-key`/`--signing-key-id` signature output), release-bundle verification (`--verify-release-bundle <dir>` with optional `--verify-signing-key`/`--verify-signing-key-id`/`--require-signature`), platform shell-template export (`--export-shell-template <dir>`), plus richer shell bootstrap flags for status/location/push (`--app-version`, `--latitude/--longitude`, `--push-token`, etc.) for desktop/mobile app packaging without changing JSON-RPC method/event shapes. Audit observability now includes live phase-0 baseline capture flows: `pnpm audit:phase0-baseline:live` for channel-origin windows and `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...` for gateway-origin windows.
### Execution Model (Sessions + Per-Session Queue)
+2 -1
@@ -166,7 +166,8 @@ Gateway streaming UX signals:
- `.github/workflows/companion-release-bundle.yml` provides CI artifact generation for companion release bundles using the same build-and-verify pipeline.
- `.github/workflows/companion-reference-apps-check.yml` enforces reference-app generator sync in CI.
- `flynn companion` can bootstrap status/location/push metadata on connect (`node.status.set` + optional `node.location.set`/`node.push_token.set`) so thin companion shells can register operational context in one launch.
- `pnpm audit:phase0-baseline:live` captures anonymized live run/reaction baseline artifacts from real audit logs to replace probe-only telemetry samples.
- `pnpm audit:phase0-baseline:live` captures anonymized channel-origin live run/reaction baseline artifacts from real audit logs.
- `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...` captures gateway-origin baseline windows (including cancel-path telemetry) as separate artifacts.
- Canvas artifacts are persisted by the gateway so session UI surfaces can recover after daemon restarts.
- TTS synthesis uses an ordered provider chain with health cooldown tracking; if all providers fail, replies degrade to text-only without dropping the response.
- Talk mode accepts spoken/text `stop`/`cancel` while active and maps it onto the same `/stop` run-control cancellation path used for text sessions.
@@ -31,7 +31,8 @@ If you only want the protocol surface, see `docs/api/PROTOCOL.md`.
- Companion reference-app sync can be enforced with `pnpm companion:reference-apps:check` (regenerate + diff fail on drift).
- CI workflow `.github/workflows/companion-release-bundle.yml` mirrors this pipeline for manual artifact generation/upload.
- CI workflow `.github/workflows/companion-reference-apps-check.yml` enforces reference-app generator sync on pull requests.
- Audit phase-0 live telemetry snapshots can be regenerated with `pnpm audit:phase0-baseline:live` (anonymized sample JSONL + summary JSON/markdown artifacts).
- Audit phase-0 live telemetry snapshots can be regenerated with `pnpm audit:phase0-baseline:live` (channel-origin anonymized sample JSONL + summary JSON/markdown artifacts).
- Gateway-origin phase-0 windows (including cancel-path samples) can be captured with `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...`.
- Companion CLI supports one-shot shell bootstrap metadata for live sessions (`--app-version`/`--status-text`, `--latitude`/`--longitude`, `--push-token`) so desktop/mobile wrappers can initialize node status/location/push in a single launch flow.
- Canvas artifacts are persisted per session under the gateway data directory for UI recovery across restarts.
- TTS output is best-effort with ordered provider fallback + per-provider cooldown tracking; synthesis failures still fall back to text-only responses.
@@ -203,7 +203,7 @@ Phase 0 is complete when:
2. A baseline summary artifact is generated and committed under `docs/plans/artifacts/`.
3. No user-visible response behavior changed compared to pre-phase baseline.
Follow-up status (2026-02-27): live channel-session artifacts now exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs); gateway-origin live samples remain a future slice.
Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`.
## Subagent Model Assignment Plan
@@ -1,7 +1,7 @@
{
"generated_at": "2026-02-27T07:39:16.384Z",
"generated_at": "2026-02-27T07:49:58.821Z",
"source_audit_path": "~/.local/share/flynn/audit.log",
"source_event_count": 88,
"source_event_count": 94,
"sampled_event_count": 88,
"filters": {
"sources": [
@@ -0,0 +1,94 @@
{
"generated_at": "2026-02-27T07:47:41.346Z",
"source_audit_path": "~/.local/share/flynn/audit.log",
"source_event_count": 6,
"sampled_event_count": 6,
"filters": {
"since_ms": 1772178440693,
"until_ms": 1772178442694,
"sources": [
"gateway"
],
"exclude_session_substrings": [
"probe"
],
"anonymized_identifiers": true
},
"options": {
"sources": [
"gateway"
],
"maxSessions": 20,
"maxChannels": 20,
"maxSkipReasons": 10
},
"summary": {
"event_counts": {
"run_state": 5,
"run_cancel": 1,
"reaction_match": 0,
"reaction_skip": 0
},
"run_outcomes": {
"overall": {
"total_outcomes": 2,
"complete": 1,
"cancelled": 1,
"error": 0,
"cancel_requested": 1,
"start": 2,
"completion_rate_pct": 50,
"cancel_rate_pct": 50,
"error_rate_pct": 0
},
"by_channel": [
{
"key": "ws",
"stats": {
"total_outcomes": 2,
"complete": 1,
"cancelled": 1,
"error": 0,
"cancel_requested": 1,
"start": 2,
"completion_rate_pct": 50,
"cancel_rate_pct": 50,
"error_rate_pct": 0
}
}
],
"by_session": [
{
"key": "session_67024a716ed2",
"stats": {
"total_outcomes": 2,
"complete": 1,
"cancelled": 1,
"error": 0,
"cancel_requested": 1,
"start": 2,
"completion_rate_pct": 50,
"cancel_rate_pct": 50,
"error_rate_pct": 0
}
}
]
},
"cancel_latency_ms": {
"count": 1,
"avg_ms": 0,
"p50_ms": 0,
"p95_ms": 0,
"min_ms": 0,
"max_ms": 0
},
"reactions": {
"matched": 0,
"skipped": 0,
"total": 0,
"match_rate_pct": null,
"skip_rate_pct": null,
"skip_reasons": []
}
}
}
@@ -0,0 +1,6 @@
{"level":"info","event_type":"run.state","event":{"session_id":"session_67024a716ed2","channel":"ws","sender":"sender_776ae1d44923","source":"gateway","state":"start","request_id":"request_e7102f267cb2"},"timestamp":1772178441693}
{"level":"info","event_type":"run.cancel","event":{"session_id":"session_67024a716ed2","channel":"ws","sender":"sender_776ae1d44923","source":"gateway","requested":true,"acknowledged":true,"request_id":"request_61e1e51ee26b","latency_ms":0},"timestamp":1772178441693}
{"level":"info","event_type":"run.state","event":{"session_id":"session_67024a716ed2","channel":"ws","sender":"sender_776ae1d44923","source":"gateway","state":"cancel_requested","request_id":"request_61e1e51ee26b","duration_ms":0},"timestamp":1772178441694}
{"level":"info","event_type":"run.state","event":{"session_id":"session_67024a716ed2","channel":"ws","sender":"sender_776ae1d44923","source":"gateway","state":"cancelled","request_id":"request_e7102f267cb2","duration_ms":1},"timestamp":1772178441694}
{"level":"info","event_type":"run.state","event":{"session_id":"session_67024a716ed2","channel":"ws","sender":"sender_776ae1d44923","source":"gateway","state":"start","request_id":"request_897903dab9c4"},"timestamp":1772178441694}
{"level":"info","event_type":"run.state","event":{"session_id":"session_67024a716ed2","channel":"ws","sender":"sender_776ae1d44923","source":"gateway","state":"complete","request_id":"request_897903dab9c4","duration_ms":0},"timestamp":1772178441694}
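The outcome counts and rates in the gateway summary can be reproduced from the `state` values of the sampled `run.state` events. The sketch below is one way to do that arithmetic, not the summary tooling's actual code: `RunStats`, `pct`, and `summarizeRunStates` are hypothetical names, the field names mirror the committed summary JSON, and returning `null` for a zero denominator is an assumption borrowed from how the reaction rates are reported.

```typescript
// Hypothetical type for illustration; field names mirror the committed summary JSON.
interface RunStats {
  total_outcomes: number;
  complete: number;
  cancelled: number;
  error: number;
  cancel_requested: number;
  start: number;
  completion_rate_pct: number | null;
  cancel_rate_pct: number | null;
  error_rate_pct: number | null;
}

// Percentage of `part` in `total`; null when there is nothing to divide by (assumption).
function pct(part: number, total: number): number | null {
  return total === 0 ? null : (part / total) * 100;
}

// Count terminal outcomes (complete, cancelled, error) separately from
// non-terminal states (start, cancel_requested), then derive the rates.
function summarizeRunStates(states: string[]): RunStats {
  const count = (state: string): number =>
    states.filter((s) => s === state).length;
  const complete = count("complete");
  const cancelled = count("cancelled");
  const error = count("error");
  const total = complete + cancelled + error;
  return {
    total_outcomes: total,
    complete,
    cancelled,
    error,
    cancel_requested: count("cancel_requested"),
    start: count("start"),
    completion_rate_pct: pct(complete, total),
    cancel_rate_pct: pct(cancelled, total),
    error_rate_pct: pct(error, total),
  };
}

// States of the five run.state events in the sample, in order.
const gatewayStats = summarizeRunStates([
  "start",
  "cancel_requested",
  "cancelled",
  "start",
  "complete",
]);
console.log(gatewayStats.total_outcomes, gatewayStats.completion_rate_pct); // prints: 2 50
```

Only `complete`, `cancelled`, and `error` count toward `total_outcomes` here, which is why two starts and one cancel request still produce two outcomes.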
@@ -0,0 +1,50 @@
# Phase 0 Baseline Telemetry Summary
- Run state events: 5
- Run cancel events: 1
- Reaction matches: 0
- Reaction skips: 0
- Sources: gateway
## Run Outcomes (Overall)
- Total outcomes: 2
- Complete: 1 (50.00%)
- Cancelled: 1 (50.00%)
- Errors: 0 (0.00%)
- Cancel requested: 1
- Starts: 2
## Run Outcomes by Channel
| Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| ws | 2 | 1 | 1 | 0 | 50.00% | 50.00% | 0.00% | 1 | 2 |
## Run Outcomes by Session
| Session | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| session_67024a716ed2 | 2 | 1 | 1 | 0 | 50.00% | 50.00% | 0.00% | 1 | 2 |
## Cancel Latency
- Count: 1
- Avg: 0ms
- P50: 0ms
- P95: 0ms
- Min: 0ms
- Max: 0ms
## Reaction Decisions
- Matched: 0 (n/a)
- Skipped: 0 (n/a)
### Skip Reasons
| Reason | Count | Percent |
| --- | ---: | ---: |
| _none_ | 0 | 0.00% |
+30 -6
@@ -67,7 +67,7 @@
"status": "completed",
"date": "2026-02-25",
"updated": "2026-02-27",
"summary": "Updated protocol/docs/diagrams for phase-0 telemetry fields, documented baseline workflow, and replaced the original probe-only baseline workflow with anonymized live channel-session audit artifacts (`phase0_baseline_live_2026-02-27.*`).",
"summary": "Updated protocol/docs/diagrams for phase-0 telemetry fields, documented baseline workflow, replaced the original probe-only baseline workflow with anonymized live channel-session audit artifacts (`phase0_baseline_live_2026-02-27.*`), and added a second gateway-origin live window artifact set (`phase0_baseline_live_gateway_2026-02-27.*`) including cancel-path telemetry.",
"files_modified": [
"README.md",
"docs/api/PROTOCOL.md",
@@ -77,15 +77,18 @@
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.jsonl",
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.json",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.jsonl",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.json",
"docs/plans/state.json"
],
"test_status": "pnpm audit:phase0-baseline:live + pnpm test:run src/audit/phase0LiveBaseline.test.ts src/audit/phase0BaselineSummary.test.ts + pnpm typecheck passing"
"test_status": "pnpm audit:phase0-baseline:live + node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source gateway --since 1772178440693 --until 1772178442694 --sample-out docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.jsonl --summary-json-out docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.json --summary-md-out docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.md + pnpm test:run src/audit/phase0LiveBaseline.test.ts src/audit/phase0BaselineSummary.test.ts + pnpm typecheck passing"
},
"phase0-live-baseline-capture-tooling": {
"status": "completed",
"date": "2026-02-27",
"updated": "2026-02-27",
"summary": "Added a dedicated live phase-0 baseline capture flow that reads audit logs, filters run/reaction telemetry, excludes probe sessions, anonymizes session/sender/request IDs, and writes sample + summary artifacts for operational refreshes.",
"summary": "Added a dedicated live phase-0 baseline capture flow that reads audit logs, filters run/reaction telemetry, excludes probe sessions, anonymizes session/sender/request IDs, and writes sample + summary artifacts for operational refreshes across both channel-origin and gateway-origin windows.",
"files_modified": [
"src/audit/phase0LiveBaseline.ts",
"src/audit/phase0LiveBaseline.test.ts",
@@ -99,9 +102,30 @@
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.jsonl",
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.json",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.jsonl",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.json",
"docs/plans/state.json"
],
"test_status": "pnpm audit:phase0-baseline:live + pnpm test:run src/audit/phase0LiveBaseline.test.ts src/audit/phase0BaselineSummary.test.ts + pnpm typecheck passing"
"test_status": "pnpm audit:phase0-baseline:live + node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source gateway --since 1772178440693 --until 1772178442694 --sample-out docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.jsonl --summary-json-out docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.json --summary-md-out docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.md + pnpm test:run src/audit/phase0LiveBaseline.test.ts src/audit/phase0BaselineSummary.test.ts + pnpm typecheck passing"
},
"phase0-live-baseline-gateway-window": {
"status": "completed",
"date": "2026-02-27",
"updated": "2026-02-27",
"summary": "Captured and committed a gateway-origin live phase-0 baseline window with explicit cancel-path coverage (`run.cancel`, `run.state.cancel_requested`, `run.state.cancelled`) plus a paired completion sample, producing anonymized `phase0_baseline_live_gateway_2026-02-27.*` artifacts.",
"files_modified": [
"README.md",
"docs/api/PROTOCOL.md",
"docs/architecture/AGENT_DIAGRAM.md",
"docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md",
"docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.jsonl",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.json",
"docs/plans/state.json"
],
"test_status": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source gateway --since 1772178440693 --until 1772178442694 --sample-out docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.jsonl --summary-json-out docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.json --summary-md-out docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.md + pnpm test:run src/audit/phase0LiveBaseline.test.ts src/audit/phase0BaselineSummary.test.ts + pnpm typecheck passing"
},
"phase0-instrumentation-ticket-checklist": {
"status": "completed",
@@ -7288,8 +7312,8 @@
"deeper_surfaces_phase0_ticket_02": "completed — gateway + daemon routing emit run lifecycle/cancel telemetry and reaction match/skip audit events with filter summaries and cancellation latency, plus focused tests",
"deeper_surfaces_phase0_ticket_03": "completed — gateway metrics now track run-state outcomes, cancel latency samples, and reaction decision counters with routing/gateway emitters",
"deeper_surfaces_phase0_ticket_04": "completed — added phase-0 baseline summary tooling for run outcomes, cancel latency, and reaction decisions with markdown/json CLI output",
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, and generated anonymized live baseline artifacts from real channel audit traffic (probe-only artifact workflow superseded)",
"next_up": "Capture a gateway-origin live phase-0 baseline sample (including run.cancel/cancelled paths) and append as a second live artifact window alongside the channel sample",
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, and generated anonymized live baseline artifacts for both channel-origin and gateway-origin traffic (including cancel-path coverage)",
"next_up": "Phase-0 baseline windows now cover channel and gateway sources; keep both artifact windows refreshed on cadence before additional run-control/reaction semantic changes.",
"pi_embedded_canary_spike": "completed — added optional pi_embedded backend adapter, canary-safe no-tools routing guard, backend success/fallback latency audit events, and docs/diagram updates while native remains default",
"pi_embedded_evaluation_phase": "completed — final decision rollback (applied in runtime config): Window A failed latency/fallback gates (p50 +259ms, p95 +5695ms, fallback 25%, categories: pi_module_interface/empty_assistant_text); Window B remained sample-insufficient; controlled probes verified guard coverage (pi_no_tools_mode/capability_query/attachments_present each hit once)",
"pi_embedded_manual_mode": "completed — added persisted runtime backend controls for manual Pi activation/deactivation (`/runtime` preferred, `/backend` alias; `status`, `activate pi`, `deactivate pi`, `use config`) while keeping config-driven default routing",