feat(audit): retain rolling phase0 prune reports
@@ -203,7 +203,7 @@ Phase 0 is complete when:
 2. A baseline summary artifact is generated and committed under `docs/plans/artifacts/`.
 3. No user-visible response behavior changed compared to pre-phase baseline.
 
-Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), all live windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (channel + gateway + backend-scoped `pi`/`native`; scheduling example included in README), backend artifact freshness/drift checks are now available via `pnpm audit:phase0-baseline:live:drift` (or chained with `pnpm audit:phase0-baseline:live:refresh:drift`) with drift report artifacts written to `docs/plans/artifacts/phase0_baseline_live_backend_drift_<tag>.{md,json}`, cadence runs can preserve distinct timestamped comparison points via `pnpm audit:phase0-baseline:live:refresh:drift:rolling`, rolling-tag retention can be managed via `pnpm audit:phase0-baseline:live:prune` (dry-run) / `pnpm audit:phase0-baseline:live:prune:apply` with prune report artifacts written to `phase0_baseline_live_prune_<tag>.{md,json}`, and one-command cadence scheduling is available via `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` (`KEEP_PER_FAMILY` optional override).
+Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), all live windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (channel + gateway + backend-scoped `pi`/`native`; scheduling example included in README), backend artifact freshness/drift checks are now available via `pnpm audit:phase0-baseline:live:drift` (or chained with `pnpm audit:phase0-baseline:live:refresh:drift`) with drift report artifacts written to `docs/plans/artifacts/phase0_baseline_live_backend_drift_<tag>.{md,json}`, cadence runs can preserve distinct timestamped comparison points via `pnpm audit:phase0-baseline:live:refresh:drift:rolling`, rolling-tag retention can be managed via `pnpm audit:phase0-baseline:live:prune` (dry-run) / `pnpm audit:phase0-baseline:live:prune:apply` with prune report artifacts written to `phase0_baseline_live_prune_<tag>.{md,json}` (and retained as a managed rolling family), and one-command cadence scheduling is available via `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` (`KEEP_PER_FAMILY` optional override).
 
 ## Subagent Model Assignment Plan
 
@@ -1,5 +1,5 @@
 {
-"generated_at": "2026-02-27T18:48:13.082Z",
+"generated_at": "2026-02-27T18:55:02.289Z",
 "artifacts_dir": "/home/will/lab/flynn/docs/plans/artifacts",
 "keep_per_family": 8,
 "apply": false,
@@ -261,6 +261,18 @@
 "family": "gateway",
 "tag": "2026-02-27-175943",
 "tag_timestamp_ms": 1772215183000
+},
+{
+"file_name": "phase0_baseline_live_prune_2026-02-27-184726.json",
+"family": "prune",
+"tag": "2026-02-27-184726",
+"tag_timestamp_ms": 1772218046000
+},
+{
+"file_name": "phase0_baseline_live_prune_2026-02-27-184726.md",
+"family": "prune",
+"tag": "2026-02-27-184726",
+"tag_timestamp_ms": 1772218046000
 }
 ],
 "remove": [],
@@ -294,6 +306,12 @@
 "total_tags": 3,
 "keep_tags": 3,
 "remove_tags": 0
+},
+{
+"family": "prune",
+"total_tags": 1,
+"keep_tags": 1,
+"remove_tags": 0
 }
 ]
 }
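The `tag` and `tag_timestamp_ms` pairs in the report above are consistent with reading a rolling tag as a UTC `YYYY-MM-DD-HHMMSS` stamp: `2026-02-27-184726` corresponds to `1772218046000`. The TypeScript below is a minimal sketch of that mapping, not the code in `src/audit/phase0BaselineArtifactRetention.ts`, which this diff does not show. The function name, the regular expression, the accepted extensions, and the `channel` fallback family name are all assumptions made for illustration.

```typescript
// Hypothetical shape mirroring the report fields file_name/family/tag/tag_timestamp_ms.
interface RollingArtifact {
  fileName: string;
  family: string;
  tag: string;
  tagTimestampMs: number;
}

// Assumed naming scheme: phase0_baseline_live[_<family>]_<YYYY-MM-DD-HHMMSS>.<ext>
const ROLLING_NAME =
  /^phase0_baseline_live(?:_([a-z_]+))?_(\d{4})-(\d{2})-(\d{2})-(\d{2})(\d{2})(\d{2})\.(?:md|json|jsonl)$/;

function parseRollingArtifact(fileName: string): RollingArtifact | null {
  const m = ROLLING_NAME.exec(fileName);
  // Date-only tags such as `..._2026-02-27.md` do not match and are left unmanaged.
  if (!m) return null;
  const [, family, y, mo, d, h, mi, s] = m;
  return {
    fileName,
    // "channel" is a placeholder for files with no family infix (assumption).
    family: family ?? "channel",
    tag: `${y}-${mo}-${d}-${h}${mi}${s}`,
    // The tag is interpreted as UTC; months are zero-based in Date.UTC.
    tagTimestampMs: Date.UTC(+y, +mo - 1, +d, +h, +mi, +s),
  };
}
```

Under these assumptions, `parseRollingArtifact("phase0_baseline_live_prune_2026-02-27-184726.json")` yields family `prune` with the same tag and timestamp the report lists, while the date-only `phase0_baseline_live_prune_2026-02-27.json` is ignored.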
@@ -3,7 +3,7 @@
 Artifacts dir: /home/will/lab/flynn/docs/plans/artifacts
 Keep per family: 8
 Mode: dry-run
-Keep files: 42
+Keep files: 44
 Remove files: 0
 
 ## Families
@@ -12,6 +12,7 @@ Remove files: 0
 - backend_pi_embedded: tags total=3 keep=3 remove=0
 - backend_native: tags total=3 keep=3 remove=0
 - backend_drift: tags total=3 keep=3 remove=0
+- prune: tags total=1 keep=1 remove=0
 
 ## Remove List
 - none
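The report above counts tags per family while the keep and remove totals count files, which suggests retention is decided per tag and then applied to every file sharing that tag. The sketch below shows one way such a plan could be computed. It is an illustration under that assumption, not the implementation in this commit; `planPrune`, `formatFamilyLine`, and the interface names are hypothetical, and `keepPerFamily` is assumed to be a non-negative integer.

```typescript
interface TaggedFile {
  fileName: string;
  family: string;
  tag: string;
  tagTimestampMs: number;
}

interface FamilySummary {
  family: string;
  totalTags: number;
  keepTags: number;
  removeTags: number;
}

interface PrunePlan {
  keep: TaggedFile[];
  remove: TaggedFile[];
  families: FamilySummary[];
}

// Newest timestamp among the files that share one tag.
function newest(group: TaggedFile[]): number {
  return Math.max(...group.map((f) => f.tagTimestampMs));
}

function planPrune(files: TaggedFile[], keepPerFamily: number): PrunePlan {
  // family -> tag -> files sharing that tag (for example .md and .json)
  const byFamily = new Map<string, Map<string, TaggedFile[]>>();
  for (const file of files) {
    const tags = byFamily.get(file.family) ?? new Map<string, TaggedFile[]>();
    byFamily.set(file.family, tags);
    const group = tags.get(file.tag) ?? [];
    tags.set(file.tag, group);
    group.push(file);
  }

  const plan: PrunePlan = { keep: [], remove: [], families: [] };
  byFamily.forEach((tags, family) => {
    // Sort tag groups newest first, keep the first keepPerFamily, remove the rest.
    const groups = Array.from(tags.values()).sort((a, b) => newest(b) - newest(a));
    groups.forEach((group, index) => {
      (index < keepPerFamily ? plan.keep : plan.remove).push(...group);
    });
    const keepTags = Math.min(keepPerFamily, groups.length);
    plan.families.push({
      family,
      totalTags: groups.length,
      keepTags,
      removeTags: groups.length - keepTags,
    });
  });
  return plan;
}

// Renders a summary in the same shape as the "Families" lines of the report.
function formatFamilyLine(s: FamilySummary): string {
  return `- ${s.family}: tags total=${s.totalTags} keep=${s.keepTags} remove=${s.removeTags}`;
}
```

With the two `prune` files from this commit (one tag, `.md` and `.json`) and a limit of 8, this plan keeps both files and reports one tag kept, matching the new `prune` line in the report.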
+38
-1
@@ -348,6 +348,43 @@
 ],
 "test_status": "pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune + pnpm audit:phase0-baseline:live:prune + pnpm test:run src/audit/phase0BaselineArtifactRetention.test.ts + pnpm typecheck passing"
 },
+"phase0-live-baseline-prune-report-retention": {
+"status": "completed",
+"date": "2026-02-27",
+"updated": "2026-02-27",
+"summary": "Extended rolling artifact retention to include rolling prune-report artifacts (`phase0_baseline_live_prune_<rolling-tag>.md/.json`) so retention fully covers generated cadence outputs and prevents prune-report accumulation.",
+"files_modified": [
+"src/audit/phase0BaselineArtifactRetention.ts",
+"src/audit/phase0BaselineArtifactRetention.test.ts",
+"scripts/prune-phase0-baseline-artifacts.ts",
+"package.json",
+"README.md",
+"docs/api/PROTOCOL.md",
+"docs/architecture/AGENT_DIAGRAM.md",
+"docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md",
+"docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
+"docs/plans/artifacts/phase0_baseline_live_2026-02-27-184726.jsonl",
+"docs/plans/artifacts/phase0_baseline_live_2026-02-27-184726.md",
+"docs/plans/artifacts/phase0_baseline_live_2026-02-27-184726.json",
+"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27-184726.jsonl",
+"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27-184726.md",
+"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27-184726.json",
+"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27-184726.jsonl",
+"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27-184726.md",
+"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27-184726.json",
+"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27-184726.jsonl",
+"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27-184726.md",
+"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27-184726.json",
+"docs/plans/artifacts/phase0_baseline_live_backend_drift_2026-02-27-184726.md",
+"docs/plans/artifacts/phase0_baseline_live_backend_drift_2026-02-27-184726.json",
+"docs/plans/artifacts/phase0_baseline_live_prune_2026-02-27-184726.md",
+"docs/plans/artifacts/phase0_baseline_live_prune_2026-02-27-184726.json",
+"docs/plans/artifacts/phase0_baseline_live_prune_2026-02-27.md",
+"docs/plans/artifacts/phase0_baseline_live_prune_2026-02-27.json",
+"docs/plans/state.json"
+],
+"test_status": "pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune + pnpm audit:phase0-baseline:live:prune + pnpm test:run src/audit/phase0BaselineArtifactRetention.test.ts + pnpm typecheck passing"
+},
 "phase0-instrumentation-ticket-checklist": {
 "status": "completed",
 "date": "2026-02-25",
@@ -7533,7 +7570,7 @@
 "deeper_surfaces_phase0_ticket_02": "completed — gateway + daemon routing emit run lifecycle/cancel telemetry and reaction match/skip audit events with filter summaries and cancellation latency, plus focused tests",
 "deeper_surfaces_phase0_ticket_03": "completed — gateway metrics now track run-state outcomes, cancel latency samples, and reaction decision counters with routing/gateway emitters",
 "deeper_surfaces_phase0_ticket_04": "completed — added phase-0 baseline summary tooling for run outcomes, cancel latency, and reaction decisions with markdown/json CLI output",
-"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, generated anonymized live baseline artifacts for channel/gateway/backend-scoped (pi/native) windows, added backend freshness/drift gates with persisted drift reports, added rolling timestamp-tag cadence runs for immediate baseline-vs-prior drift comparisons, added rolling artifact retention tooling (`live:prune` with prune reports), and added one-command cadence refresh+drift+retention apply (`live:refresh:drift:rolling:prune`)",
+"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, generated anonymized live baseline artifacts for channel/gateway/backend-scoped (pi/native) windows, added backend freshness/drift gates with persisted drift reports, added rolling timestamp-tag cadence runs for immediate baseline-vs-prior drift comparisons, added rolling artifact retention tooling (`live:prune` with prune reports retained as a managed family), and added one-command cadence refresh+drift+retention apply (`live:refresh:drift:rolling:prune`)",
 "next_up": "Run scheduled `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` in each active environment for at least one full cadence cycle (tuning `KEEP_PER_FAMILY` as needed), then tighten drift thresholds based on observed variance before additional run-control/reaction semantic changes.",
 "pi_embedded_canary_spike": "completed — added optional pi_embedded backend adapter, canary-safe no-tools routing guard, backend success/fallback latency audit events, and docs/diagram updates while native remains default",
 "pi_embedded_evaluation_phase": "completed — final decision rollback (applied in runtime config): Window A failed latency/fallback gates (p50 +259ms, p95 +5695ms, fallback 25%, categories: pi_module_interface/empty_assistant_text); Window B remained sample-insufficient; controlled probes verified guard coverage (pi_no_tools_mode/capability_query/attachments_present each hit once)",