feat(audit): add phase0 backend drift and freshness gates

This commit is contained in:
William Valentin
2026-02-27 09:01:43 -08:00
parent 68cdc2cf8b
commit 20224f1601
10 changed files with 1157 additions and 7 deletions
@@ -203,7 +203,7 @@ Phase 0 is complete when:
2. A baseline summary artifact is generated and committed under `docs/plans/artifacts/`.
3. No user-visible response behavior changed compared to pre-phase baseline.
Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), both windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (scheduling example included in README), and backend-scoped channel windows are now available via `pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native`.
Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), both windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (scheduling example included in README), backend-scoped channel windows are now available via `pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native`, and backend artifact freshness/drift checks are now available via `pnpm audit:phase0-baseline:live:drift` (or chained with `pnpm audit:phase0-baseline:live:refresh:drift`).
## Subagent Model Assignment Plan
+22 -3
View File
@@ -196,6 +196,25 @@
],
"test_status": "pnpm audit:phase0-baseline:live:pi + pnpm audit:phase0-baseline:live:native + pnpm test:run src/audit/phase0LiveBaseline.test.ts src/audit/phase0BaselineSummary.test.ts + pnpm typecheck passing"
},
"phase0-live-baseline-backend-drift-monitoring": {
"status": "completed",
"date": "2026-02-27",
"updated": "2026-02-27",
"summary": "Implemented backend-scoped phase-0 baseline drift/freshness gates for cadence monitoring by adding artifact comparison helpers, threshold evaluation, and an auto-discovery CLI (`check-phase0-baseline-backend-drift.ts`) with package scripts for standalone checks and chained refresh+drift runs.",
"files_modified": [
"src/audit/phase0BaselineDrift.ts",
"src/audit/phase0BaselineDrift.test.ts",
"scripts/check-phase0-baseline-backend-drift.ts",
"package.json",
"README.md",
"docs/api/PROTOCOL.md",
"docs/architecture/AGENT_DIAGRAM.md",
"docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md",
"docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
"docs/plans/state.json"
],
"test_status": "pnpm test:run src/audit/phase0BaselineDrift.test.ts + pnpm audit:phase0-baseline:live:drift + pnpm typecheck passing"
},
"phase0-instrumentation-ticket-checklist": {
"status": "completed",
"date": "2026-02-25",
@@ -7345,7 +7364,7 @@
}
},
"overall_progress": {
"total_test_count": 2589,
"total_test_count": 2590,
"all_tests_passing": true,
"p0_completion": "3/3 (100%)",
"p1_completion": "4/4 (100%)",
@@ -7381,8 +7400,8 @@
"deeper_surfaces_phase0_ticket_02": "completed — gateway + daemon routing emit run lifecycle/cancel telemetry and reaction match/skip audit events with filter summaries and cancellation latency, plus focused tests",
"deeper_surfaces_phase0_ticket_03": "completed — gateway metrics now track run-state outcomes, cancel latency samples, and reaction decision counters with routing/gateway emitters",
"deeper_surfaces_phase0_ticket_04": "completed — added phase-0 baseline summary tooling for run outcomes, cancel latency, and reaction decisions with markdown/json CLI output",
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, and generated anonymized live baseline artifacts for channel, gateway, and backend-scoped (pi/native) traffic windows",
"next_up": "Apply scheduled `pnpm audit:phase0-baseline:live:refresh` in each active environment and monitor backend-scoped (`pi_embedded` vs `native`) artifact freshness/drift over at least one full cadence cycle before additional run-control/reaction semantic changes.",
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, generated anonymized live baseline artifacts for channel/gateway/backend-scoped (pi/native) windows, and added backend artifact freshness/drift gates (`pnpm audit:phase0-baseline:live:drift`)",
"next_up": "Run scheduled `pnpm audit:phase0-baseline:live:refresh:drift` in each active environment and observe at least one full cadence cycle before tightening drift thresholds or changing additional run-control/reaction semantics.",
"pi_embedded_canary_spike": "completed — added optional pi_embedded backend adapter, canary-safe no-tools routing guard, backend success/fallback latency audit events, and docs/diagram updates while native remains default",
"pi_embedded_evaluation_phase": "completed — final decision rollback (applied in runtime config): Window A failed latency/fallback gates (p50 +259ms, p95 +5695ms, fallback 25%, categories: pi_module_interface/empty_assistant_text); Window B remained sample-insufficient; controlled probes verified guard coverage (pi_no_tools_mode/capability_query/attachments_present each hit once)",
"pi_embedded_manual_mode": "completed — added persisted runtime backend controls for manual Pi activation/deactivation (`/runtime` preferred, `/backend` alias; `status`, `activate pi`, `deactivate pi`, `use config`) while keeping config-driven default routing",