feat(audit): refresh all phase0 live windows in cadence run

William Valentin
2026-02-27 09:36:22 -08:00
parent e905fe1d56
commit 55f1a3dd7b
19 changed files with 189 additions and 128 deletions
+1 -1
@@ -1635,7 +1635,7 @@ pnpm audit:phase0-baseline:live:pi
 pnpm audit:phase0-baseline:live:native
 ```
-One-shot refresh for both channel + gateway live windows:
+One-shot refresh for all live baseline windows (channel, gateway, backend-scoped `pi_embedded`, backend-scoped `native`):
 ```bash
 pnpm audit:phase0-baseline:live:refresh
 ```
+1 -1
@@ -23,7 +23,7 @@ The gateway provides:
 - **HTTP Server**: Serves static dashboard and handles webhook endpoints
 - **Node Capability Negotiation**: Optional companion-node role/capability registration
-Operational note: onboarding (`flynn setup` / `flynn onboard`) now runs post-save live readiness checks (model/channel/memory/automation) and prints a guided first-success task flow. Companion CLI now also supports bootstrap-manifest export (`flynn companion --export-bootstrap <path|->`), release-bundle export (`--export-release-bundle <dir>` with optional `--signing-key`/`--signing-key-id` signature output), release-bundle verification (`--verify-release-bundle <dir>` with optional `--verify-signing-key`/`--verify-signing-key-id`/`--require-signature`), platform shell-template export (`--export-shell-template <dir>`), plus richer shell bootstrap flags for status/location/push (`--app-version`, `--latitude/--longitude`, `--push-token`, etc.) for desktop/mobile app packaging without changing JSON-RPC method/event shapes. Audit observability now includes live phase-0 baseline capture flows: `pnpm audit:phase0-baseline:live` for channel-origin windows, backend-scoped variants (`pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native`) via `--backend`, `pnpm audit:phase0-baseline:live:gateway` (auto-detected cancel window) for gateway-origin windows, `pnpm audit:phase0-baseline:live:refresh` for one-shot refresh of both windows, and `pnpm audit:phase0-baseline:live:drift` for backend artifact freshness/drift gates (writing `phase0_baseline_live_backend_drift_<UTC-date>.md/.json` reports). These scripts default to current UTC-date tags unless `--tag` is explicitly provided.
+Operational note: onboarding (`flynn setup` / `flynn onboard`) now runs post-save live readiness checks (model/channel/memory/automation) and prints a guided first-success task flow. Companion CLI now also supports bootstrap-manifest export (`flynn companion --export-bootstrap <path|->`), release-bundle export (`--export-release-bundle <dir>` with optional `--signing-key`/`--signing-key-id` signature output), release-bundle verification (`--verify-release-bundle <dir>` with optional `--verify-signing-key`/`--verify-signing-key-id`/`--require-signature`), platform shell-template export (`--export-shell-template <dir>`), plus richer shell bootstrap flags for status/location/push (`--app-version`, `--latitude/--longitude`, `--push-token`, etc.) for desktop/mobile app packaging without changing JSON-RPC method/event shapes. Audit observability now includes live phase-0 baseline capture flows: `pnpm audit:phase0-baseline:live` for channel-origin windows, backend-scoped variants (`pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native`) via `--backend`, `pnpm audit:phase0-baseline:live:gateway` (auto-detected cancel window) for gateway-origin windows, `pnpm audit:phase0-baseline:live:refresh` for one-shot refresh of all live windows (channel + gateway + backend-scoped), and `pnpm audit:phase0-baseline:live:drift` for backend artifact freshness/drift gates (writing `phase0_baseline_live_backend_drift_<UTC-date>.md/.json` reports). These scripts default to current UTC-date tags unless `--tag` is explicitly provided.
 ### Execution Model (Sessions + Per-Session Queue)
+1 -1
@@ -169,7 +169,7 @@ Gateway streaming UX signals:
 - `pnpm audit:phase0-baseline:live` captures anonymized channel-origin live run/reaction baseline artifacts from real audit logs.
 - `pnpm audit:phase0-baseline:live:pi` and `pnpm audit:phase0-baseline:live:native` capture backend-scoped channel windows using `backend.route` timelines.
 - `pnpm audit:phase0-baseline:live:gateway` captures gateway-origin baseline windows by auto-selecting the latest cancel/cancelled session window (or use `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...` for explicit windows).
-- `pnpm audit:phase0-baseline:live:refresh` runs both channel + gateway capture commands in one step for cadence refreshes.
+- `pnpm audit:phase0-baseline:live:refresh` runs channel + gateway + backend-scoped (`pi_embedded` and `native`) capture commands in one cadence step.
 - `pnpm audit:phase0-baseline:live:drift` evaluates backend-scoped artifact freshness/drift gates and writes `docs/plans/artifacts/phase0_baseline_live_backend_drift_<UTC-date>.md/.json`; `pnpm audit:phase0-baseline:live:refresh:drift` runs capture + drift checks in one cadence step.
 - `audit:phase0-baseline:live*` scripts are cadence-safe by default (UTC-date tags auto-generated unless explicitly overridden).
 - Canvas artifacts are persisted by the gateway so session UI surfaces can recover after daemon restarts.
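The cadence-safe tagging mentioned in the list above (UTC-date tags unless explicitly overridden) can be sketched roughly as follows. This is an illustration under assumptions, not the repository's code: `resolveTag` and `artifactBase` are hypothetical helper names. The only behavior taken from the docs is that an explicit `--tag` wins and the current UTC date (`YYYY-MM-DD`) is used otherwise.

```typescript
// Hypothetical helper: pick the artifact tag for a capture run.
// An explicit tag (e.g. from `--tag`) takes precedence; otherwise use the UTC date.
function resolveTag(explicitTag: string | undefined, now: Date = new Date()): string {
  if (explicitTag !== undefined && explicitTag.trim() !== "") {
    return explicitTag.trim();
  }
  // toISOString() always renders in UTC, so the first 10 characters are YYYY-MM-DD.
  return now.toISOString().slice(0, 10);
}

// Hypothetical helper: build an artifact base name such as
// `phase0_baseline_live_gateway_2026-02-27` from a prefix and a tag.
function artifactBase(prefix: string, tag: string): string {
  return `${prefix}_${tag}`;
}
```

Because the date is taken in UTC, a scheduled run late in the evening in a US timezone would already roll over to the next day's tag.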
@@ -34,7 +34,7 @@ If you only want the protocol surface, see `docs/api/PROTOCOL.md`.
 - Audit phase-0 live telemetry snapshots can be regenerated with `pnpm audit:phase0-baseline:live` (channel-origin anonymized sample JSONL + summary JSON/markdown artifacts).
 - Backend-scoped channel snapshots can be regenerated with `pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native` (`--backend` filtering via `backend.route` timelines).
 - Gateway-origin phase-0 windows (including cancel-path samples) can be captured with `pnpm audit:phase0-baseline:live:gateway` (auto-detect latest cancel window) or `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...` for explicit bounds.
-- `pnpm audit:phase0-baseline:live:refresh` runs both capture paths to refresh channel + gateway artifacts in one command.
+- `pnpm audit:phase0-baseline:live:refresh` runs channel + gateway + backend-scoped (`pi_embedded` and `native`) capture paths in one command.
 - `pnpm audit:phase0-baseline:live:drift` checks backend-scoped artifact freshness/drift gates and writes `phase0_baseline_live_backend_drift_<UTC-date>.md/.json`; `pnpm audit:phase0-baseline:live:refresh:drift` chains refresh + drift checks for scheduled cadence runs.
 - `audit:phase0-baseline:live*` package scripts now omit fixed tags so scheduled runs automatically roll to current UTC-date artifact tags.
 - Companion CLI supports one-shot shell bootstrap metadata for live sessions (`--app-version`/`--status-text`, `--latitude`/`--longitude`, `--push-token`) so desktop/mobile wrappers can initialize node status/location/push in a single launch flow.
@@ -203,7 +203,7 @@ Phase 0 is complete when:
 2. A baseline summary artifact is generated and committed under `docs/plans/artifacts/`.
 3. No user-visible response behavior changed compared to pre-phase baseline.
-Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), both windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (scheduling example included in README), backend-scoped channel windows are now available via `pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native`, and backend artifact freshness/drift checks are now available via `pnpm audit:phase0-baseline:live:drift` (or chained with `pnpm audit:phase0-baseline:live:refresh:drift`) with drift report artifacts written to `docs/plans/artifacts/phase0_baseline_live_backend_drift_<UTC-date>.{md,json}`.
+Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), all live windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (channel + gateway + backend-scoped `pi`/`native`; scheduling example included in README), backend artifact freshness/drift checks are now available via `pnpm audit:phase0-baseline:live:drift` (or chained with `pnpm audit:phase0-baseline:live:refresh:drift`) with drift report artifacts written to `docs/plans/artifacts/phase0_baseline_live_backend_drift_<UTC-date>.{md,json}`.
 ## Subagent Model Assignment Plan
@@ -1,8 +1,8 @@
{ {
"generated_at": "2026-02-27T16:46:42.576Z", "generated_at": "2026-02-27T17:36:01.625Z",
"source_audit_path": "~/.local/share/flynn/audit.log", "source_audit_path": "~/.local/share/flynn/audit.log",
"source_event_count": 110, "source_event_count": 115,
"sampled_event_count": 104, "sampled_event_count": 109,
"filters": { "filters": {
"sources": [ "sources": [
"channel" "channel"
@@ -22,19 +22,19 @@
}, },
"summary": { "summary": {
"event_counts": { "event_counts": {
"run_state": 65, "run_state": 68,
"run_cancel": 0, "run_cancel": 0,
"reaction_match": 0, "reaction_match": 0,
"reaction_skip": 39 "reaction_skip": 41
}, },
"run_outcomes": { "run_outcomes": {
"overall": { "overall": {
"total_outcomes": 27, "total_outcomes": 28,
"complete": 27, "complete": 28,
"cancelled": 0, "cancelled": 0,
"error": 0, "error": 0,
"cancel_requested": 0, "cancel_requested": 0,
"start": 38, "start": 40,
"completion_rate_pct": 100, "completion_rate_pct": 100,
"cancel_rate_pct": 0, "cancel_rate_pct": 0,
"error_rate_pct": 0 "error_rate_pct": 0
@@ -43,12 +43,12 @@
{ {
"key": "gmail", "key": "gmail",
"stats": { "stats": {
"total_outcomes": 25, "total_outcomes": 26,
"complete": 25, "complete": 26,
"cancelled": 0, "cancelled": 0,
"error": 0, "error": 0,
"cancel_requested": 0, "cancel_requested": 0,
"start": 25, "start": 26,
"completion_rate_pct": 100, "completion_rate_pct": 100,
"cancel_rate_pct": 0, "cancel_rate_pct": 0,
"error_rate_pct": 0 "error_rate_pct": 0
@@ -62,7 +62,7 @@
"cancelled": 0, "cancelled": 0,
"error": 0, "error": 0,
"cancel_requested": 0, "cancel_requested": 0,
"start": 13, "start": 14,
"completion_rate_pct": 100, "completion_rate_pct": 100,
"cancel_rate_pct": 0, "cancel_rate_pct": 0,
"error_rate_pct": 0 "error_rate_pct": 0
@@ -112,6 +112,20 @@
"error_rate_pct": 0 "error_rate_pct": 0
} }
}, },
{
"key": "session_f6304f25e43b",
"stats": {
"total_outcomes": 2,
"complete": 2,
"cancelled": 0,
"error": 0,
"cancel_requested": 0,
"start": 2,
"completion_rate_pct": 100,
"cancel_rate_pct": 0,
"error_rate_pct": 0
}
},
{ {
"key": "session_33469de5a1ee", "key": "session_33469de5a1ee",
"stats": { "stats": {
@@ -322,20 +336,6 @@
"error_rate_pct": 0 "error_rate_pct": 0
} }
}, },
{
"key": "session_f6304f25e43b",
"stats": {
"total_outcomes": 1,
"complete": 1,
"cancelled": 0,
"error": 0,
"cancel_requested": 0,
"start": 1,
"completion_rate_pct": 100,
"cancel_rate_pct": 0,
"error_rate_pct": 0
}
},
{ {
"key": "session_fd6536fa5ff4", "key": "session_fd6536fa5ff4",
"stats": { "stats": {
@@ -355,14 +355,14 @@
"cancel_latency_ms": null, "cancel_latency_ms": null,
"reactions": { "reactions": {
"matched": 0, "matched": 0,
"skipped": 39, "skipped": 41,
"total": 39, "total": 41,
"match_rate_pct": 0, "match_rate_pct": 0,
"skip_rate_pct": 100, "skip_rate_pct": 100,
"skip_reasons": [ "skip_reasons": [
{ {
"reason": "no_rules", "reason": "no_rules",
"count": 39, "count": 41,
"pct": 100 "pct": 100
} }
] ]
@@ -102,3 +102,8 @@
{"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772208000012} {"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772208000012}
{"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"start","request_id":"request_a3bafbb93755"},"timestamp":1772208000013} {"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"start","request_id":"request_a3bafbb93755"},"timestamp":1772208000013}
{"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"complete","request_id":"request_a3bafbb93755","duration_ms":35239},"timestamp":1772208035252} {"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"complete","request_id":"request_a3bafbb93755","duration_ms":35239},"timestamp":1772208035252}
{"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772211257454}
{"level":"info","event_type":"run.state","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","state":"start","request_id":"request_607c64c2760f"},"timestamp":1772211257454}
{"level":"info","event_type":"run.state","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","state":"complete","request_id":"request_607c64c2760f","duration_ms":3870},"timestamp":1772211261324}
{"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_534570702ea5","channel":"cron","sender":"sender_552aeb8f1b32","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772211600036}
{"level":"info","event_type":"run.state","event":{"session_id":"session_534570702ea5","channel":"cron","sender":"sender_552aeb8f1b32","source":"channel","state":"start","request_id":"request_c0a9fc76c188"},"timestamp":1772211600036}
@@ -1,27 +1,27 @@
# Phase 0 Baseline Telemetry Summary # Phase 0 Baseline Telemetry Summary
- Run state events: 65 - Run state events: 68
- Run cancel events: 0 - Run cancel events: 0
- Reaction matches: 0 - Reaction matches: 0
- Reaction skips: 39 - Reaction skips: 41
- Sources: channel - Sources: channel
## Run Outcomes (Overall) ## Run Outcomes (Overall)
- Total outcomes: 27 - Total outcomes: 28
- Complete: 27 (100.00%) - Complete: 28 (100.00%)
- Cancelled: 0 (0.00%) - Cancelled: 0 (0.00%)
- Errors: 0 (0.00%) - Errors: 0 (0.00%)
- Cancel requested: 0 - Cancel requested: 0
- Starts: 38 - Starts: 40
## Run Outcomes by Channel ## Run Outcomes by Channel
| Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts | | Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| gmail | 25 | 25 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 25 | | gmail | 26 | 26 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 26 |
| cron | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 13 | | cron | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 14 |
## Run Outcomes by Session ## Run Outcomes by Session
@@ -30,6 +30,7 @@
| session_2f2f1e414e81 | 5 | 5 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 5 | | session_2f2f1e414e81 | 5 | 5 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 5 |
| session_f4d8ddc04194 | 3 | 3 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 3 | | session_f4d8ddc04194 | 3 | 3 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 3 |
| session_eabc3c2a91b9 | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 2 | | session_eabc3c2a91b9 | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 2 |
| session_f6304f25e43b | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 2 |
| session_33469de5a1ee | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_33469de5a1ee | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_3ffb2e631ab1 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_3ffb2e631ab1 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_4d9e843358a3 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_4d9e843358a3 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
@@ -45,7 +46,6 @@
| session_cb9a69d8a362 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_cb9a69d8a362 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_e0a2a17b7329 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_e0a2a17b7329 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_ea839415979e | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_ea839415979e | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_f6304f25e43b | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_fd6536fa5ff4 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_fd6536fa5ff4 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
## Cancel Latency ## Cancel Latency
@@ -55,11 +55,11 @@
## Reaction Decisions ## Reaction Decisions
- Matched: 0 (0.00%) - Matched: 0 (0.00%)
- Skipped: 39 (100.00%) - Skipped: 41 (100.00%)
### Skip Reasons ### Skip Reasons
| Reason | Count | Percent | | Reason | Count | Percent |
| --- | ---: | ---: | | --- | ---: | ---: |
| no_rules | 39 | 100.00% | | no_rules | 41 | 100.00% |
@@ -1,5 +1,5 @@
{ {
"generated_at": "2026-02-27T17:04:49.009Z", "generated_at": "2026-02-27T17:36:02.803Z",
"artifacts_dir": "/home/will/lab/flynn/docs/plans/artifacts", "artifacts_dir": "/home/will/lab/flynn/docs/plans/artifacts",
"backends": [ "backends": [
"pi_embedded", "pi_embedded",
@@ -29,15 +29,15 @@
"candidate": { "candidate": {
"tag": "2026-02-27", "tag": "2026-02-27",
"path": "/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.json", "path": "/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.json",
"generated_at": "2026-02-27T16:45:18.488Z" "generated_at": "2026-02-27T17:36:02.214Z"
}, },
"baseline": null, "baseline": null,
"comparison": { "comparison": {
"baseline": null, "baseline": null,
"candidate": { "candidate": {
"source_event_count": 110, "source_event_count": 115,
"sampled_event_count": 56, "sampled_event_count": 59,
"run_total_outcomes": 25, "run_total_outcomes": 26,
"completion_rate_pct": 100, "completion_rate_pct": 100,
"cancel_rate_pct": 0, "cancel_rate_pct": 0,
"error_rate_pct": 0, "error_rate_pct": 0,
@@ -59,7 +59,7 @@
"freshness": { "freshness": {
"enabled": true, "enabled": true,
"pass": true, "pass": true,
"actual_age_hours": 0.33, "actual_age_hours": 0,
"threshold_hours": 36 "threshold_hours": 36
}, },
"drift_gate": { "drift_gate": {
@@ -68,7 +68,7 @@
{ {
"criterion": "candidate_sampled_events", "criterion": "candidate_sampled_events",
"pass": true, "pass": true,
"actual": "56", "actual": "59",
"threshold": ">= 10" "threshold": ">= 10"
}, },
{ {
@@ -116,21 +116,21 @@
"candidate": { "candidate": {
"tag": "2026-02-27", "tag": "2026-02-27",
"path": "/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.json", "path": "/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.json",
"generated_at": "2026-02-27T16:45:18.490Z" "generated_at": "2026-02-27T17:36:02.514Z"
}, },
"baseline": null, "baseline": null,
"comparison": { "comparison": {
"baseline": null, "baseline": null,
"candidate": { "candidate": {
"source_event_count": 110, "source_event_count": 115,
"sampled_event_count": 13, "sampled_event_count": 15,
"run_total_outcomes": 2, "run_total_outcomes": 2,
"completion_rate_pct": 100, "completion_rate_pct": 100,
"cancel_rate_pct": 0, "cancel_rate_pct": 0,
"error_rate_pct": 0, "error_rate_pct": 0,
"cancel_latency_p95_ms": null, "cancel_latency_p95_ms": null,
"reaction_match_rate_pct": null, "reaction_match_rate_pct": 0,
"reaction_skip_rate_pct": null "reaction_skip_rate_pct": 100
}, },
"deltas": { "deltas": {
"sampled_event_count_pct": null, "sampled_event_count_pct": null,
@@ -146,7 +146,7 @@
"freshness": { "freshness": {
"enabled": true, "enabled": true,
"pass": true, "pass": true,
"actual_age_hours": 0.33, "actual_age_hours": 0,
"threshold_hours": 36 "threshold_hours": 36
}, },
"drift_gate": { "drift_gate": {
@@ -155,7 +155,7 @@
{ {
"criterion": "candidate_sampled_events", "criterion": "candidate_sampled_events",
"pass": true, "pass": true,
"actual": "13", "actual": "15",
"threshold": ">= 10" "threshold": ">= 10"
}, },
{ {
@@ -1,6 +1,6 @@
# Phase-0 Backend Drift Check # Phase-0 Backend Drift Check
Generated at: 2026-02-27T17:04:49.009Z Generated at: 2026-02-27T17:36:02.803Z
Artifacts: /home/will/lab/flynn/docs/plans/artifacts Artifacts: /home/will/lab/flynn/docs/plans/artifacts
Backends: pi_embedded, native Backends: pi_embedded, native
Freshness max age (hours): 36 Freshness max age (hours): 36
@@ -19,9 +19,9 @@ Overall gate: PASS
## pi_embedded ## pi_embedded
- status: PASS - status: PASS
- candidate: tag=2026-02-27 file=/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.json - candidate: tag=2026-02-27 file=/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.json
- candidate generated_at: 2026-02-27T16:45:18.488Z - candidate generated_at: 2026-02-27T17:36:02.214Z
- baseline: none - baseline: none
- candidate snapshot: sampled=56 outcomes=25 completion=100% cancel=0% error=0% cancel_p95_ms=n/a - candidate snapshot: sampled=59 outcomes=26 completion=100% cancel=0% error=0% cancel_p95_ms=n/a
- deltas: - deltas:
sampled_event_count_pct=n/a sampled_event_count_pct=n/a
run_total_outcomes_pct=n/a run_total_outcomes_pct=n/a
@@ -31,9 +31,9 @@ Overall gate: PASS
cancel_latency_p95_ms=n/a cancel_latency_p95_ms=n/a
reaction_match_rate_pp=n/a reaction_match_rate_pp=n/a
reaction_skip_rate_pp=n/a reaction_skip_rate_pp=n/a
- freshness gate: PASS (age_hours=0.33 threshold=36) - freshness gate: PASS (age_hours=0 threshold=36)
- drift gate: PASS - drift gate: PASS
PASS candidate_sampled_events actual=56 threshold=>= 10 PASS candidate_sampled_events actual=59 threshold=>= 10
PASS sampled_events_drop_pct actual=n/a threshold=<= 80 PASS sampled_events_drop_pct actual=n/a threshold=<= 80
PASS run_outcomes_drop_pct actual=n/a threshold=<= 80 PASS run_outcomes_drop_pct actual=n/a threshold=<= 80
PASS completion_rate_drop_pp actual=n/a threshold=<= 35 PASS completion_rate_drop_pp actual=n/a threshold=<= 35
@@ -44,9 +44,9 @@ Overall gate: PASS
## native ## native
- status: PASS - status: PASS
- candidate: tag=2026-02-27 file=/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.json - candidate: tag=2026-02-27 file=/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.json
- candidate generated_at: 2026-02-27T16:45:18.490Z - candidate generated_at: 2026-02-27T17:36:02.514Z
- baseline: none - baseline: none
- candidate snapshot: sampled=13 outcomes=2 completion=100% cancel=0% error=0% cancel_p95_ms=n/a - candidate snapshot: sampled=15 outcomes=2 completion=100% cancel=0% error=0% cancel_p95_ms=n/a
- deltas: - deltas:
sampled_event_count_pct=n/a sampled_event_count_pct=n/a
run_total_outcomes_pct=n/a run_total_outcomes_pct=n/a
@@ -56,9 +56,9 @@ Overall gate: PASS
cancel_latency_p95_ms=n/a cancel_latency_p95_ms=n/a
reaction_match_rate_pp=n/a reaction_match_rate_pp=n/a
reaction_skip_rate_pp=n/a reaction_skip_rate_pp=n/a
- freshness gate: PASS (age_hours=0.33 threshold=36) - freshness gate: PASS (age_hours=0 threshold=36)
- drift gate: PASS - drift gate: PASS
PASS candidate_sampled_events actual=13 threshold=>= 10 PASS candidate_sampled_events actual=15 threshold=>= 10
PASS sampled_events_drop_pct actual=n/a threshold=<= 80 PASS sampled_events_drop_pct actual=n/a threshold=<= 80
PASS run_outcomes_drop_pct actual=n/a threshold=<= 80 PASS run_outcomes_drop_pct actual=n/a threshold=<= 80
PASS completion_rate_drop_pp actual=n/a threshold=<= 35 PASS completion_rate_drop_pp actual=n/a threshold=<= 35
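The freshness gate values in this report are consistent with a simple age calculation. The sketch below assumes the age is the gap between the drift report's `generated_at` and the candidate artifact's `generated_at`, rounded to two decimals, and that the gate passes when the age is at or below the threshold. `freshnessGate` is a hypothetical name, and both the rounding and the `<=` comparison are assumptions inferred from the numbers rather than from the script.

```typescript
// Mirrors the `freshness` object seen in the drift report JSON.
interface FreshnessGate {
  enabled: boolean;
  pass: boolean;
  actual_age_hours: number;
  threshold_hours: number;
}

// Hypothetical helper: compare a candidate artifact's age against a max-age threshold.
function freshnessGate(
  candidateGeneratedAt: string, // ISO-8601 timestamp of the candidate artifact
  checkedAt: string,            // ISO-8601 timestamp of the drift check
  thresholdHours: number,
): FreshnessGate {
  const ageMs = Date.parse(checkedAt) - Date.parse(candidateGeneratedAt);
  const msPerHour = 60 * 60 * 1000;
  // Assumed rounding: two decimal places, matching values like 0.33 in the report.
  const ageHours = Math.round((ageMs / msPerHour) * 100) / 100;
  return {
    enabled: true,
    pass: ageHours <= thresholdHours, // assumed inclusive comparison
    actual_age_hours: ageHours,
    threshold_hours: thresholdHours,
  };
}
```

With the previous report's timestamps (candidate `16:45:18.488Z`, check `17:04:49.009Z`) this gives 0.33 hours. After the refresh the candidate is under a second old, which rounds to 0.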
@@ -1,8 +1,8 @@
{ {
"generated_at": "2026-02-27T16:45:18.490Z", "generated_at": "2026-02-27T17:36:02.514Z",
"source_audit_path": "~/.local/share/flynn/audit.log", "source_audit_path": "~/.local/share/flynn/audit.log",
"source_event_count": 110, "source_event_count": 115,
"sampled_event_count": 13, "sampled_event_count": 15,
"filters": { "filters": {
"sources": [ "sources": [
"channel" "channel"
@@ -14,7 +14,7 @@
"probe" "probe"
], ],
"anonymized_identifiers": true, "anonymized_identifiers": true,
"backend_route_event_count": 127 "backend_route_event_count": 129
}, },
"options": { "options": {
"sources": [ "sources": [
@@ -26,10 +26,10 @@
}, },
"summary": { "summary": {
"event_counts": { "event_counts": {
"run_state": 13, "run_state": 14,
"run_cancel": 0, "run_cancel": 0,
"reaction_match": 0, "reaction_match": 0,
"reaction_skip": 0 "reaction_skip": 1
}, },
"run_outcomes": { "run_outcomes": {
"overall": { "overall": {
@@ -38,7 +38,7 @@
"cancelled": 0, "cancelled": 0,
"error": 0, "error": 0,
"cancel_requested": 0, "cancel_requested": 0,
"start": 11, "start": 12,
"completion_rate_pct": 100, "completion_rate_pct": 100,
"cancel_rate_pct": 0, "cancel_rate_pct": 0,
"error_rate_pct": 0 "error_rate_pct": 0
@@ -52,7 +52,7 @@
"cancelled": 0, "cancelled": 0,
"error": 0, "error": 0,
"cancel_requested": 0, "cancel_requested": 0,
"start": 11, "start": 12,
"completion_rate_pct": 100, "completion_rate_pct": 100,
"cancel_rate_pct": 0, "cancel_rate_pct": 0,
"error_rate_pct": 0 "error_rate_pct": 0
@@ -172,6 +172,20 @@
"error_rate_pct": null "error_rate_pct": null
} }
}, },
{
"key": "session_534570702ea5",
"stats": {
"total_outcomes": 0,
"complete": 0,
"cancelled": 0,
"error": 0,
"cancel_requested": 0,
"start": 1,
"completion_rate_pct": null,
"cancel_rate_pct": null,
"error_rate_pct": null
}
},
{ {
"key": "session_683372f346c3", "key": "session_683372f346c3",
"stats": { "stats": {
@@ -219,11 +233,17 @@
"cancel_latency_ms": null, "cancel_latency_ms": null,
"reactions": { "reactions": {
"matched": 0, "matched": 0,
"skipped": 0, "skipped": 1,
"total": 0, "total": 1,
"match_rate_pct": null, "match_rate_pct": 0,
"skip_rate_pct": null, "skip_rate_pct": 100,
"skip_reasons": [] "skip_reasons": [
{
"reason": "no_rules",
"count": 1,
"pct": 100
}
]
} }
} }
} }
@@ -11,3 +11,5 @@
{"level":"info","event_type":"run.state","event":{"session_id":"session_a3f64a8e3c1e","channel":"cron","sender":"sender_a31bd6d4a95a","source":"channel","state":"start","request_id":"request_fc572d83d4c6"},"timestamp":1772182800034} {"level":"info","event_type":"run.state","event":{"session_id":"session_a3f64a8e3c1e","channel":"cron","sender":"sender_a31bd6d4a95a","source":"channel","state":"start","request_id":"request_fc572d83d4c6"},"timestamp":1772182800034}
{"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"start","request_id":"request_a3bafbb93755"},"timestamp":1772208000013} {"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"start","request_id":"request_a3bafbb93755"},"timestamp":1772208000013}
{"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"complete","request_id":"request_a3bafbb93755","duration_ms":35239},"timestamp":1772208035252} {"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"complete","request_id":"request_a3bafbb93755","duration_ms":35239},"timestamp":1772208035252}
{"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_534570702ea5","channel":"cron","sender":"sender_552aeb8f1b32","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772211600036}
{"level":"info","event_type":"run.state","event":{"session_id":"session_534570702ea5","channel":"cron","sender":"sender_552aeb8f1b32","source":"channel","state":"start","request_id":"request_c0a9fc76c188"},"timestamp":1772211600036}
@@ -1,9 +1,9 @@
# Phase 0 Baseline Telemetry Summary # Phase 0 Baseline Telemetry Summary
- Run state events: 13 - Run state events: 14
- Run cancel events: 0 - Run cancel events: 0
- Reaction matches: 0 - Reaction matches: 0
- Reaction skips: 0 - Reaction skips: 1
- Sources: channel - Sources: channel
@@ -14,13 +14,13 @@
- Cancelled: 0 (0.00%) - Cancelled: 0 (0.00%)
- Errors: 0 (0.00%) - Errors: 0 (0.00%)
- Cancel requested: 0 - Cancel requested: 0
- Starts: 11 - Starts: 12
## Run Outcomes by Channel ## Run Outcomes by Channel
| Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts | | Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| cron | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 11 | | cron | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 12 |
## Run Outcomes by Session ## Run Outcomes by Session
@@ -34,6 +34,7 @@
| session_494cb3b392af | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 | | session_494cb3b392af | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
| session_49b700741e03 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 | | session_49b700741e03 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
| session_4cd8ba5e6df5 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 | | session_4cd8ba5e6df5 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
| session_534570702ea5 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
| session_683372f346c3 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 | | session_683372f346c3 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
| session_a3f64a8e3c1e | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 | | session_a3f64a8e3c1e | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
| session_ffcee254d546 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 | | session_ffcee254d546 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
@@ -44,12 +45,12 @@
## Reaction Decisions ## Reaction Decisions
- Matched: 0 (n/a) - Matched: 0 (0.00%)
- Skipped: 0 (n/a) - Skipped: 1 (100.00%)
### Skip Reasons ### Skip Reasons
| Reason | Count | Percent | | Reason | Count | Percent |
| --- | ---: | ---: | | --- | ---: | ---: |
| _none_ | 0 | 0.00% | | no_rules | 1 | 100.00% |
@@ -1,8 +1,8 @@
{ {
"generated_at": "2026-02-27T16:45:18.488Z", "generated_at": "2026-02-27T17:36:02.214Z",
"source_audit_path": "~/.local/share/flynn/audit.log", "source_audit_path": "~/.local/share/flynn/audit.log",
"source_event_count": 110, "source_event_count": 115,
"sampled_event_count": 56, "sampled_event_count": 59,
"filters": { "filters": {
"sources": [ "sources": [
"channel" "channel"
@@ -14,7 +14,7 @@
"probe" "probe"
], ],
"anonymized_identifiers": true, "anonymized_identifiers": true,
"backend_route_event_count": 127 "backend_route_event_count": 129
}, },
"options": { "options": {
"sources": [ "sources": [
@@ -26,19 +26,19 @@
}, },
"summary": { "summary": {
"event_counts": { "event_counts": {
"run_state": 42, "run_state": 44,
"run_cancel": 0, "run_cancel": 0,
"reaction_match": 0, "reaction_match": 0,
"reaction_skip": 14 "reaction_skip": 15
}, },
"run_outcomes": { "run_outcomes": {
"overall": { "overall": {
"total_outcomes": 25, "total_outcomes": 26,
"complete": 25, "complete": 26,
"cancelled": 0, "cancelled": 0,
"error": 0, "error": 0,
"cancel_requested": 0, "cancel_requested": 0,
"start": 17, "start": 18,
"completion_rate_pct": 100, "completion_rate_pct": 100,
"cancel_rate_pct": 0, "cancel_rate_pct": 0,
"error_rate_pct": 0 "error_rate_pct": 0
@@ -47,12 +47,12 @@
{ {
"key": "gmail", "key": "gmail",
"stats": { "stats": {
"total_outcomes": 25, "total_outcomes": 26,
"complete": 25, "complete": 26,
"cancelled": 0, "cancelled": 0,
"error": 0, "error": 0,
"cancel_requested": 0, "cancel_requested": 0,
"start": 17, "start": 18,
"completion_rate_pct": 100, "completion_rate_pct": 100,
"cancel_rate_pct": 0, "cancel_rate_pct": 0,
"error_rate_pct": 0 "error_rate_pct": 0
@@ -102,6 +102,20 @@
"error_rate_pct": 0 "error_rate_pct": 0
} }
}, },
{
"key": "session_f6304f25e43b",
"stats": {
"total_outcomes": 2,
"complete": 2,
"cancelled": 0,
"error": 0,
"cancel_requested": 0,
"start": 2,
"completion_rate_pct": 100,
"cancel_rate_pct": 0,
"error_rate_pct": 0
}
},
{ {
"key": "session_33469de5a1ee", "key": "session_33469de5a1ee",
"stats": { "stats": {
@@ -284,20 +298,6 @@
"error_rate_pct": 0 "error_rate_pct": 0
} }
}, },
{
"key": "session_f6304f25e43b",
"stats": {
"total_outcomes": 1,
"complete": 1,
"cancelled": 0,
"error": 0,
"cancel_requested": 0,
"start": 1,
"completion_rate_pct": 100,
"cancel_rate_pct": 0,
"error_rate_pct": 0
}
},
{ {
"key": "session_fd6536fa5ff4", "key": "session_fd6536fa5ff4",
"stats": { "stats": {
@@ -317,14 +317,14 @@
"cancel_latency_ms": null, "cancel_latency_ms": null,
"reactions": { "reactions": {
"matched": 0, "matched": 0,
"skipped": 14, "skipped": 15,
"total": 14, "total": 15,
"match_rate_pct": 0, "match_rate_pct": 0,
"skip_rate_pct": 100, "skip_rate_pct": 100,
"skip_reasons": [ "skip_reasons": [
{ {
"reason": "no_rules", "reason": "no_rules",
"count": 14, "count": 15,
"pct": 100 "pct": 100
} }
] ]
@@ -54,3 +54,6 @@
{"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_2f2f1e414e81","channel":"gmail","sender":"sender_323cedc3233a","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772206157229} {"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_2f2f1e414e81","channel":"gmail","sender":"sender_323cedc3233a","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772206157229}
{"level":"info","event_type":"run.state","event":{"session_id":"session_2f2f1e414e81","channel":"gmail","sender":"sender_323cedc3233a","source":"channel","state":"start","request_id":"request_ab73d670c119"},"timestamp":1772206157229} {"level":"info","event_type":"run.state","event":{"session_id":"session_2f2f1e414e81","channel":"gmail","sender":"sender_323cedc3233a","source":"channel","state":"start","request_id":"request_ab73d670c119"},"timestamp":1772206157229}
{"level":"info","event_type":"run.state","event":{"session_id":"session_2f2f1e414e81","channel":"gmail","sender":"sender_323cedc3233a","source":"channel","state":"complete","request_id":"request_ab73d670c119","duration_ms":3850},"timestamp":1772206161079} {"level":"info","event_type":"run.state","event":{"session_id":"session_2f2f1e414e81","channel":"gmail","sender":"sender_323cedc3233a","source":"channel","state":"complete","request_id":"request_ab73d670c119","duration_ms":3850},"timestamp":1772206161079}
{"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772211257454}
{"level":"info","event_type":"run.state","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","state":"start","request_id":"request_607c64c2760f"},"timestamp":1772211257454}
{"level":"info","event_type":"run.state","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","state":"complete","request_id":"request_607c64c2760f","duration_ms":3870},"timestamp":1772211261324}
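The three events added for `session_f6304f25e43b` in this hunk account for the count changes in the regenerated summaries (run state 42 to 44, reaction skips 14 to 15, starts 17 to 18, completes 25 to 26). As a minimal cross-check sketch, assuming only the `event_type`, `event.state`, and `event.reason` fields visible in these JSONL lines: `tallyAuditLines` and the `Tally` shape are hypothetical names for illustration, not the repository's summarizer.

```typescript
// Hypothetical helper, not the repository's summary tooling:
// tallies audit events by type, run state, and skip reason.
interface AuditEvent {
  event_type: string;
  event: { state?: string; reason?: string };
}

interface Tally {
  runState: number;
  reactionSkip: number;
  starts: number;
  completes: number;
  skipReasons: Record<string, number>;
}

function tallyAuditLines(lines: string[]): Tally {
  const tally: Tally = { runState: 0, reactionSkip: 0, starts: 0, completes: 0, skipReasons: {} };
  for (const line of lines) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines in the JSONL file
    const parsed = JSON.parse(trimmed) as AuditEvent;
    if (parsed.event_type === "run.state") {
      tally.runState += 1;
      if (parsed.event.state === "start") tally.starts += 1;
      if (parsed.event.state === "complete") tally.completes += 1;
    } else if (parsed.event_type === "reaction.skip") {
      tally.reactionSkip += 1;
      const reason = parsed.event.reason ?? "unknown";
      tally.skipReasons[reason] = (tally.skipReasons[reason] ?? 0) + 1;
    }
  }
  return tally;
}

// The three added events, trimmed to the fields this sketch reads.
const addedEvents = [
  '{"event_type":"reaction.skip","event":{"reason":"no_rules"}}',
  '{"event_type":"run.state","event":{"state":"start"}}',
  '{"event_type":"run.state","event":{"state":"complete"}}',
];
console.log(tallyAuditLines(addedEvents));
```

Running the same tally over a whole `.jsonl` artifact should reproduce the "Run state events" and "Reaction skips" lines of the matching `.md` summary, provided the artifact was captured with the same filters.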
@@ -1,26 +1,26 @@
# Phase 0 Baseline Telemetry Summary # Phase 0 Baseline Telemetry Summary
- Run state events: 42 - Run state events: 44
- Run cancel events: 0 - Run cancel events: 0
- Reaction matches: 0 - Reaction matches: 0
- Reaction skips: 14 - Reaction skips: 15
- Sources: channel - Sources: channel
## Run Outcomes (Overall) ## Run Outcomes (Overall)
- Total outcomes: 25 - Total outcomes: 26
- Complete: 25 (100.00%) - Complete: 26 (100.00%)
- Cancelled: 0 (0.00%) - Cancelled: 0 (0.00%)
- Errors: 0 (0.00%) - Errors: 0 (0.00%)
- Cancel requested: 0 - Cancel requested: 0
- Starts: 17 - Starts: 18
## Run Outcomes by Channel ## Run Outcomes by Channel
| Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts | | Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| gmail | 25 | 25 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 17 | | gmail | 26 | 26 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 18 |
## Run Outcomes by Session ## Run Outcomes by Session
@@ -29,6 +29,7 @@
| session_2f2f1e414e81 | 5 | 5 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 5 | | session_2f2f1e414e81 | 5 | 5 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 5 |
| session_f4d8ddc04194 | 3 | 3 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 3 | | session_f4d8ddc04194 | 3 | 3 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 3 |
| session_eabc3c2a91b9 | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_eabc3c2a91b9 | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_f6304f25e43b | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 2 |
| session_33469de5a1ee | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_33469de5a1ee | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_3ffb2e631ab1 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 0 | | session_3ffb2e631ab1 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 0 |
| session_4d9e843358a3 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 0 | | session_4d9e843358a3 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 0 |
@@ -42,7 +43,6 @@
| session_cb9a69d8a362 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_cb9a69d8a362 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_e0a2a17b7329 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_e0a2a17b7329 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_ea839415979e | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_ea839415979e | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_f6304f25e43b | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
| session_fd6536fa5ff4 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 | | session_fd6536fa5ff4 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
## Cancel Latency ## Cancel Latency
@@ -52,11 +52,11 @@
## Reaction Decisions ## Reaction Decisions
- Matched: 0 (0.00%) - Matched: 0 (0.00%)
- Skipped: 14 (100.00%) - Skipped: 15 (100.00%)
### Skip Reasons ### Skip Reasons
| Reason | Count | Percent | | Reason | Count | Percent |
| --- | ---: | ---: | | --- | ---: | ---: |
| no_rules | 14 | 100.00% | | no_rules | 15 | 100.00% |
@@ -1,5 +1,5 @@
{ {
"generated_at": "2026-02-27T16:46:42.880Z", "generated_at": "2026-02-27T17:36:01.922Z",
"source_audit_path": "~/.local/share/flynn/audit.log", "source_audit_path": "~/.local/share/flynn/audit.log",
"source_event_count": 6, "source_event_count": 6,
"sampled_event_count": 6, "sampled_event_count": 6,
+31 -1
@@ -234,6 +234,36 @@
], ],
"test_status": "pnpm audit:phase0-baseline:live:drift + pnpm test:run src/audit/phase0BaselineDrift.test.ts + pnpm typecheck passing" "test_status": "pnpm audit:phase0-baseline:live:drift + pnpm test:run src/audit/phase0BaselineDrift.test.ts + pnpm typecheck passing"
}, },
"phase0-live-baseline-refresh-full-window": {
"status": "completed",
"date": "2026-02-27",
"updated": "2026-02-27",
"summary": "Expanded `pnpm audit:phase0-baseline:live:refresh` to regenerate all live windows in one command (channel, gateway, backend-scoped `pi_embedded`, backend-scoped `native`) so scheduled `refresh:drift` runs keep backend artifacts fresh for baseline-vs-prior comparisons.",
"files_modified": [
"package.json",
"README.md",
"docs/api/PROTOCOL.md",
"docs/architecture/AGENT_DIAGRAM.md",
"docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md",
"docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.jsonl",
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.json",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.jsonl",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.json",
"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.jsonl",
"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.json",
"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.jsonl",
"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.json",
"docs/plans/artifacts/phase0_baseline_live_backend_drift_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_backend_drift_2026-02-27.json",
"docs/plans/state.json"
],
"test_status": "pnpm audit:phase0-baseline:live:refresh:drift + pnpm test:run src/audit/phase0BaselineDrift.test.ts + pnpm typecheck passing"
},
"phase0-instrumentation-ticket-checklist": { "phase0-instrumentation-ticket-checklist": {
"status": "completed", "status": "completed",
"date": "2026-02-25", "date": "2026-02-25",
@@ -7420,7 +7450,7 @@
"deeper_surfaces_phase0_ticket_03": "completed — gateway metrics now track run-state outcomes, cancel latency samples, and reaction decision counters with routing/gateway emitters", "deeper_surfaces_phase0_ticket_03": "completed — gateway metrics now track run-state outcomes, cancel latency samples, and reaction decision counters with routing/gateway emitters",
"deeper_surfaces_phase0_ticket_04": "completed — added phase-0 baseline summary tooling for run outcomes, cancel latency, and reaction decisions with markdown/json CLI output", "deeper_surfaces_phase0_ticket_04": "completed — added phase-0 baseline summary tooling for run outcomes, cancel latency, and reaction decisions with markdown/json CLI output",
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, generated anonymized live baseline artifacts for channel/gateway/backend-scoped (pi/native) windows, and added backend artifact freshness/drift gates with persisted drift reports (`phase0_baseline_live_backend_drift_<UTC-date>.{md,json}`)", "deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, generated anonymized live baseline artifacts for channel/gateway/backend-scoped (pi/native) windows, and added backend artifact freshness/drift gates with persisted drift reports (`phase0_baseline_live_backend_drift_<UTC-date>.{md,json}`)",
"next_up": "Run scheduled `pnpm audit:phase0-baseline:live:refresh:drift` in each active environment and collect at least one additional UTC-date drift artifact so baseline-vs-prior comparisons become active before tightening thresholds or changing additional run-control/reaction semantics.", "next_up": "Run scheduled `pnpm audit:phase0-baseline:live:refresh:drift` in each active environment (now refreshing channel + gateway + backend-scoped windows together) and collect at least one additional UTC-date drift artifact so baseline-vs-prior comparisons become active before tightening thresholds or changing additional run-control/reaction semantics.",
"pi_embedded_canary_spike": "completed — added optional pi_embedded backend adapter, canary-safe no-tools routing guard, backend success/fallback latency audit events, and docs/diagram updates while native remains default", "pi_embedded_canary_spike": "completed — added optional pi_embedded backend adapter, canary-safe no-tools routing guard, backend success/fallback latency audit events, and docs/diagram updates while native remains default",
"pi_embedded_evaluation_phase": "completed — final decision rollback (applied in runtime config): Window A failed latency/fallback gates (p50 +259ms, p95 +5695ms, fallback 25%, categories: pi_module_interface/empty_assistant_text); Window B remained sample-insufficient; controlled probes verified guard coverage (pi_no_tools_mode/capability_query/attachments_present each hit once)", "pi_embedded_evaluation_phase": "completed — final decision rollback (applied in runtime config): Window A failed latency/fallback gates (p50 +259ms, p95 +5695ms, fallback 25%, categories: pi_module_interface/empty_assistant_text); Window B remained sample-insufficient; controlled probes verified guard coverage (pi_no_tools_mode/capability_query/attachments_present each hit once)",
"pi_embedded_manual_mode": "completed — added persisted runtime backend controls for manual Pi activation/deactivation (`/runtime` preferred, `/backend` alias; `status`, `activate pi`, `deactivate pi`, `use config`) while keeping config-driven default routing", "pi_embedded_manual_mode": "completed — added persisted runtime backend controls for manual Pi activation/deactivation (`/runtime` preferred, `/backend` alias; `status`, `activate pi`, `deactivate pi`, `use config`) while keeping config-driven default routing",
+1 -1
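The `audit:phase0-baseline:live:refresh` script changed in this file chains four capture scripts with `&&`, so they run in order and the chain stops at the first non-zero exit. A minimal sketch of that fail-fast behaviour, assuming a hypothetical `ScriptRunner` callback in place of real `pnpm` invocations (`runRefreshChain` is an illustrative name, not something in the repository):

```typescript
// Hypothetical sketch: models the fail-fast `&&` chain behind the refresh script.
// `ScriptRunner` stands in for invoking `pnpm <script>` and returns its exit code.
type ScriptRunner = (script: string) => number;

// Order taken from the updated package.json entry.
const REFRESH_STEPS: readonly string[] = [
  "audit:phase0-baseline:live",
  "audit:phase0-baseline:live:gateway",
  "audit:phase0-baseline:live:pi",
  "audit:phase0-baseline:live:native",
];

function runRefreshChain(run: ScriptRunner): { ran: string[]; exitCode: number } {
  const ran: string[] = [];
  for (const step of REFRESH_STEPS) {
    ran.push(step);
    const exitCode = run(step);
    // Like `&&` in a shell, stop as soon as one step fails.
    if (exitCode !== 0) return { ran, exitCode };
  }
  return { ran, exitCode: 0 };
}

// Simulate a failure in the gateway capture: the pi and native steps never run.
const failingRun = runRefreshChain((script) => (script.endsWith(":gateway") ? 1 : 0));
console.log(failingRun.ran, failingRun.exitCode);
```

One consequence worth keeping in mind: because the backend-scoped captures now come last in the chain, a failure in the channel or gateway capture leaves the `pi_embedded` and `native` artifacts unrefreshed, and `refresh:drift` never reaches the drift check.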
@@ -26,7 +26,7 @@
"audit:phase0-baseline:live:pi": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source channel --backend pi_embedded --exclude-session-substring probe", "audit:phase0-baseline:live:pi": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source channel --backend pi_embedded --exclude-session-substring probe",
"audit:phase0-baseline:live:native": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source channel --backend native --exclude-session-substring probe", "audit:phase0-baseline:live:native": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source channel --backend native --exclude-session-substring probe",
"audit:phase0-baseline:live:gateway": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source gateway --auto-gateway-cancel-window", "audit:phase0-baseline:live:gateway": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source gateway --auto-gateway-cancel-window",
"audit:phase0-baseline:live:refresh": "pnpm audit:phase0-baseline:live && pnpm audit:phase0-baseline:live:gateway", "audit:phase0-baseline:live:refresh": "pnpm audit:phase0-baseline:live && pnpm audit:phase0-baseline:live:gateway && pnpm audit:phase0-baseline:live:pi && pnpm audit:phase0-baseline:live:native",
"audit:phase0-baseline:live:drift": "node --import tsx/esm scripts/check-phase0-baseline-backend-drift.ts --artifacts-dir docs/plans/artifacts --backend pi_embedded,native --max-age-hours 36 --min-candidate-sampled-events 10 --max-sampled-events-drop-pct 80 --max-run-outcomes-drop-pct 80 --max-completion-rate-drop-pp 35 --max-cancel-rate-increase-pp 25 --max-error-rate-increase-pp 25 --max-cancel-latency-p95-increase-ms 6000 --write-default-artifacts", "audit:phase0-baseline:live:drift": "node --import tsx/esm scripts/check-phase0-baseline-backend-drift.ts --artifacts-dir docs/plans/artifacts --backend pi_embedded,native --max-age-hours 36 --min-candidate-sampled-events 10 --max-sampled-events-drop-pct 80 --max-run-outcomes-drop-pct 80 --max-completion-rate-drop-pp 35 --max-cancel-rate-increase-pp 25 --max-error-rate-increase-pp 25 --max-cancel-latency-p95-increase-ms 6000 --write-default-artifacts",
"audit:phase0-baseline:live:refresh:drift": "pnpm audit:phase0-baseline:live:refresh && pnpm audit:phase0-baseline:live:drift", "audit:phase0-baseline:live:refresh:drift": "pnpm audit:phase0-baseline:live:refresh && pnpm audit:phase0-baseline:live:drift",
"audit:backend-canary:probes": "node --import tsx/esm scripts/run-pi-canary-guard-probes.ts", "audit:backend-canary:probes": "node --import tsx/esm scripts/run-pi-canary-guard-probes.ts",