feat(audit): retain rolling phase0 prune reports
@@ -23,7 +23,7 @@ The gateway provides:
- **HTTP Server**: Serves static dashboard and handles webhook endpoints
- **Node Capability Negotiation**: Optional companion-node role/capability registration
Operational note: onboarding (`flynn setup` / `flynn onboard`) now runs post-save live readiness checks (model/channel/memory/automation) and prints a guided first-success task flow. Companion CLI now also supports bootstrap-manifest export (`flynn companion --export-bootstrap <path|->`), release-bundle export (`--export-release-bundle <dir>` with optional `--signing-key`/`--signing-key-id` signature output), release-bundle verification (`--verify-release-bundle <dir>` with optional `--verify-signing-key`/`--verify-signing-key-id`/`--require-signature`), platform shell-template export (`--export-shell-template <dir>`), plus richer shell bootstrap flags for status/location/push (`--app-version`, `--latitude/--longitude`, `--push-token`, etc.) for desktop/mobile app packaging without changing JSON-RPC method/event shapes. Audit observability now includes live phase-0 baseline capture flows: `pnpm audit:phase0-baseline:live` for channel-origin windows, backend-scoped variants (`pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native`) via `--backend`, `pnpm audit:phase0-baseline:live:gateway` (auto-detected cancel window) for gateway-origin windows, `pnpm audit:phase0-baseline:live:refresh` for one-shot refresh of all live windows (channel + gateway + backend-scoped), `pnpm audit:phase0-baseline:live:drift` for backend artifact freshness/drift gates (writing `phase0_baseline_live_backend_drift_<tag>.md/.json` reports), `pnpm audit:phase0-baseline:live:refresh:drift:rolling` for cadence runs that stamp each capture with a unique UTC timestamp tag (`YYYY-MM-DD-HHMMSS`) so drift comparisons can immediately use a prior snapshot, `pnpm audit:phase0-baseline:live:prune` / `pnpm audit:phase0-baseline:live:prune:apply` for rolling-tag artifact retention management (writing `phase0_baseline_live_prune_<tag>.md/.json` reports), and `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` for one-command cadence refresh+drift+retention apply 
(`KEEP_PER_FAMILY` override supported for retention depth). These scripts default to current UTC-date tags unless `--tag` is explicitly provided.
Operational note: onboarding (`flynn setup` / `flynn onboard`) now runs post-save live readiness checks (model/channel/memory/automation) and prints a guided first-success task flow. Companion CLI now also supports bootstrap-manifest export (`flynn companion --export-bootstrap <path|->`), release-bundle export (`--export-release-bundle <dir>` with optional `--signing-key`/`--signing-key-id` signature output), release-bundle verification (`--verify-release-bundle <dir>` with optional `--verify-signing-key`/`--verify-signing-key-id`/`--require-signature`), platform shell-template export (`--export-shell-template <dir>`), plus richer shell bootstrap flags for status/location/push (`--app-version`, `--latitude/--longitude`, `--push-token`, etc.) for desktop/mobile app packaging without changing JSON-RPC method/event shapes. Audit observability now includes live phase-0 baseline capture flows: `pnpm audit:phase0-baseline:live` for channel-origin windows, backend-scoped variants (`pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native`) via `--backend`, `pnpm audit:phase0-baseline:live:gateway` (auto-detected cancel window) for gateway-origin windows, `pnpm audit:phase0-baseline:live:refresh` for one-shot refresh of all live windows (channel + gateway + backend-scoped), `pnpm audit:phase0-baseline:live:drift` for backend artifact freshness/drift gates (writing `phase0_baseline_live_backend_drift_<tag>.md/.json` reports), `pnpm audit:phase0-baseline:live:refresh:drift:rolling` for cadence runs that stamp each capture with a unique UTC timestamp tag (`YYYY-MM-DD-HHMMSS`) so drift comparisons can immediately use a prior snapshot, `pnpm audit:phase0-baseline:live:prune` / `pnpm audit:phase0-baseline:live:prune:apply` for rolling-tag artifact retention management (writing `phase0_baseline_live_prune_<tag>.md/.json` reports and retaining those prune reports as part of managed rolling families), and `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` for one-command cadence refresh+drift+retention apply (`KEEP_PER_FAMILY` override supported for retention depth). These scripts default to current UTC-date tags unless `--tag` is explicitly provided.
### Execution Model (Sessions + Per-Session Queue)
@@ -174,6 +174,7 @@ Gateway streaming UX signals:
- `pnpm audit:phase0-baseline:live:refresh:drift:rolling` runs the same full refresh+drift flow with a shared UTC timestamp tag (`YYYY-MM-DD-HHMMSS`) so each cadence run keeps distinct backend/drift artifacts for immediate baseline-vs-prior comparisons.
- `pnpm audit:phase0-baseline:live:prune` provides dry-run retention planning for rolling-tag artifacts; `pnpm audit:phase0-baseline:live:prune:apply` deletes older rolling snapshots while keeping the newest tags per artifact family.
- `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` chains rolling refresh+drift with retention apply for one-command scheduled cadence runs (`KEEP_PER_FAMILY` controls retention depth), and now writes prune reports tagged to the same rolling run (`phase0_baseline_live_prune_<tag>.md/.json`).
- Rolling retention families now include cadence prune reports themselves (`phase0_baseline_live_prune_<rolling-tag>.md/.json`) to prevent unbounded prune-report growth.
- `audit:phase0-baseline:live*` scripts are cadence-safe by default (UTC-date tags auto-generated unless explicitly overridden).
- Canvas artifacts are persisted by the gateway so session UI surfaces can recover after daemon restarts.
- TTS synthesis uses an ordered provider chain with health cooldown tracking; if all providers fail, replies degrade to text-only without dropping the response.
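The retention bullets above describe keeping the newest tags per artifact family and removing older rolling snapshots. A minimal sketch of that planning step follows. It is an illustration only, not the code in `src/audit/phase0BaselineArtifactRetention.ts`: the `planRetention` function and both interfaces are hypothetical names, and only the entry field names mirror those in the generated prune report JSON.

```typescript
// Hypothetical entry shape; field names mirror the prune report JSON.
interface ArtifactEntry {
  file_name: string;
  family: string;
  tag: string;
  tag_timestamp_ms: number;
}

// Hypothetical plan shape: files to keep and files to remove.
interface RetentionPlan {
  keep: ArtifactEntry[];
  remove: ArtifactEntry[];
}

// Keep every file whose tag is among the newest `keepPerFamily` tags of its
// family; everything else lands in `remove`. Nothing is deleted here, which
// corresponds to a dry run; an "apply" step would delete the `remove` files.
function planRetention(entries: ArtifactEntry[], keepPerFamily: number): RetentionPlan {
  if (!Number.isInteger(keepPerFamily) || keepPerFamily < 1) {
    throw new RangeError("keepPerFamily must be a positive integer");
  }

  // Group entries by artifact family.
  const byFamily = new Map<string, ArtifactEntry[]>();
  for (const entry of entries) {
    const bucket = byFamily.get(entry.family) ?? [];
    bucket.push(entry);
    byFamily.set(entry.family, bucket);
  }

  const plan: RetentionPlan = { keep: [], remove: [] };
  byFamily.forEach((bucket) => {
    // Several files (.md, .json, .jsonl) can share one tag, so rank distinct tags.
    const tagTimes = new Map<string, number>();
    for (const entry of bucket) {
      tagTimes.set(entry.tag, entry.tag_timestamp_ms);
    }
    const newestFirst = Array.from(tagTimes.entries())
      .sort((a, b) => b[1] - a[1])
      .map(([tag]) => tag);
    const keepTags = new Set(newestFirst.slice(0, keepPerFamily));

    for (const entry of bucket) {
      (keepTags.has(entry.tag) ? plan.keep : plan.remove).push(entry);
    }
  });
  return plan;
}
```

Ranking distinct tags rather than individual files keeps the `.md` and `.json` outputs of one cadence run together, so a family never retains half of a snapshot.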
@@ -39,6 +39,7 @@ If you only want the protocol surface, see `docs/api/PROTOCOL.md`.
- `pnpm audit:phase0-baseline:live:refresh:drift:rolling` performs the same chain using one UTC timestamp tag (`YYYY-MM-DD-HHMMSS`) across channel/gateway/backend/drift outputs so each cadence run preserves a distinct comparison point.
- `pnpm audit:phase0-baseline:live:prune` (dry-run) and `pnpm audit:phase0-baseline:live:prune:apply` (delete) manage retention of rolling-tag artifacts to control artifact growth while preserving newest snapshots per family.
- `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` combines rolling refresh+drift with retention apply for one-command cron scheduling; adjust retention depth with `KEEP_PER_FAMILY` and use generated `phase0_baseline_live_prune_<tag>.md/.json` artifacts for retention audit traceability.
- Retention management also covers rolling prune-report artifacts (`phase0_baseline_live_prune_<rolling-tag>.md/.json`) as a first-class family.
- `audit:phase0-baseline:live*` package scripts now omit fixed tags so scheduled runs automatically roll to current UTC-date artifact tags.
- Companion CLI supports one-shot shell bootstrap metadata for live sessions (`--app-version`/`--status-text`, `--latitude`/`--longitude`, `--push-token`) so desktop/mobile wrappers can initialize node status/location/push in a single launch flow.
- Canvas artifacts are persisted per session under the gateway data directory for UI recovery across restarts.
@@ -203,7 +203,7 @@ Phase 0 is complete when:
2. A baseline summary artifact is generated and committed under `docs/plans/artifacts/`.
3. No user-visible response behavior changed compared to pre-phase baseline.
Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), all live windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (channel + gateway + backend-scoped `pi`/`native`; scheduling example included in README), backend artifact freshness/drift checks are now available via `pnpm audit:phase0-baseline:live:drift` (or chained with `pnpm audit:phase0-baseline:live:refresh:drift`) with drift report artifacts written to `docs/plans/artifacts/phase0_baseline_live_backend_drift_<tag>.{md,json}`, cadence runs can preserve distinct timestamped comparison points via `pnpm audit:phase0-baseline:live:refresh:drift:rolling`, rolling-tag retention can be managed via `pnpm audit:phase0-baseline:live:prune` (dry-run) / `pnpm audit:phase0-baseline:live:prune:apply` with prune report artifacts written to `phase0_baseline_live_prune_<tag>.{md,json}`, and one-command cadence scheduling is available via `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` (`KEEP_PER_FAMILY` optional override).
Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), all live windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (channel + gateway + backend-scoped `pi`/`native`; scheduling example included in README), backend artifact freshness/drift checks are now available via `pnpm audit:phase0-baseline:live:drift` (or chained with `pnpm audit:phase0-baseline:live:refresh:drift`) with drift report artifacts written to `docs/plans/artifacts/phase0_baseline_live_backend_drift_<tag>.{md,json}`, cadence runs can preserve distinct timestamped comparison points via `pnpm audit:phase0-baseline:live:refresh:drift:rolling`, rolling-tag retention can be managed via `pnpm audit:phase0-baseline:live:prune` (dry-run) / `pnpm audit:phase0-baseline:live:prune:apply` with prune report artifacts written to `phase0_baseline_live_prune_<tag>.{md,json}` (and retained as a managed rolling family), and one-command cadence scheduling is available via `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` (`KEEP_PER_FAMILY` optional override).
## Subagent Model Assignment Plan
@@ -1,5 +1,5 @@
{
"generated_at": "2026-02-27T18:48:13.082Z",
"generated_at": "2026-02-27T18:55:02.289Z",
"artifacts_dir": "/home/will/lab/flynn/docs/plans/artifacts",
"keep_per_family": 8,
"apply": false,
@@ -261,6 +261,18 @@
"family": "gateway",
"tag": "2026-02-27-175943",
"tag_timestamp_ms": 1772215183000
},
{
"file_name": "phase0_baseline_live_prune_2026-02-27-184726.json",
"family": "prune",
"tag": "2026-02-27-184726",
"tag_timestamp_ms": 1772218046000
},
{
"file_name": "phase0_baseline_live_prune_2026-02-27-184726.md",
"family": "prune",
"tag": "2026-02-27-184726",
"tag_timestamp_ms": 1772218046000
}
],
"remove": [],
@@ -294,6 +306,12 @@
"total_tags": 3,
"keep_tags": 3,
"remove_tags": 0
},
{
"family": "prune",
"total_tags": 1,
"keep_tags": 1,
"remove_tags": 0
}
]
}
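The report entries above pair each rolling `tag` with a `tag_timestamp_ms`. As a minimal sketch of how those two fields relate, assuming the tag is read as a UTC `YYYY-MM-DD-HHMMSS` stamp, the hypothetical helper `rollingTagToTimestampMs` below (not taken from the repository's retention module) reproduces the values shown in this report:

```typescript
// Matches rolling tags of the form YYYY-MM-DD-HHMMSS.
const ROLLING_TAG = /^(\d{4})-(\d{2})-(\d{2})-(\d{2})(\d{2})(\d{2})$/;

// Hypothetical helper: converts a rolling tag to epoch milliseconds (UTC).
// Returns null for anything that is not a valid rolling tag; in this sketch
// that includes date-only tags such as "2026-02-27".
function rollingTagToTimestampMs(tag: string): number | null {
  const m = ROLLING_TAG.exec(tag);
  if (m === null) return null;

  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);
  const hour = Number(m[4]);
  const minute = Number(m[5]);
  const second = Number(m[6]);

  const ms = Date.UTC(year, month - 1, day, hour, minute, second);

  // Date.UTC silently normalizes out-of-range parts (month 13, hour 24, ...),
  // so round-trip the components and reject tags that do not survive intact.
  const d = new Date(ms);
  const intact =
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day &&
    d.getUTCHours() === hour &&
    d.getUTCMinutes() === minute &&
    d.getUTCSeconds() === second;
  return intact ? ms : null;
}

// rollingTagToTimestampMs("2026-02-27-175943") → 1772215183000
// rollingTagToTimestampMs("2026-02-27-184726") → 1772218046000
```

Because the tag is fixed-width and zero-padded, sorting tags lexicographically gives the same order as sorting by the derived timestamp.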
@@ -3,7 +3,7 @@
Artifacts dir: /home/will/lab/flynn/docs/plans/artifacts
Keep per family: 8
Mode: dry-run
Keep files: 42
Keep files: 44
Remove files: 0

## Families
@@ -12,6 +12,7 @@ Remove files: 0
- backend_pi_embedded: tags total=3 keep=3 remove=0
- backend_native: tags total=3 keep=3 remove=0
- backend_drift: tags total=3 keep=3 remove=0
- prune: tags total=1 keep=1 remove=0

## Remove List
- none
+38
-1
@@ -348,6 +348,43 @@
],
"test_status": "pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune + pnpm audit:phase0-baseline:live:prune + pnpm test:run src/audit/phase0BaselineArtifactRetention.test.ts + pnpm typecheck passing"
},
"phase0-live-baseline-prune-report-retention": {
"status": "completed",
"date": "2026-02-27",
"updated": "2026-02-27",
"summary": "Extended rolling artifact retention to include rolling prune-report artifacts (`phase0_baseline_live_prune_<rolling-tag>.md/.json`) so retention fully covers generated cadence outputs and prevents prune-report accumulation.",
"files_modified": [
"src/audit/phase0BaselineArtifactRetention.ts",
"src/audit/phase0BaselineArtifactRetention.test.ts",
"scripts/prune-phase0-baseline-artifacts.ts",
"package.json",
"README.md",
"docs/api/PROTOCOL.md",
"docs/architecture/AGENT_DIAGRAM.md",
"docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md",
"docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
"docs/plans/artifacts/phase0_baseline_live_2026-02-27-184726.jsonl",
"docs/plans/artifacts/phase0_baseline_live_2026-02-27-184726.md",
"docs/plans/artifacts/phase0_baseline_live_2026-02-27-184726.json",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27-184726.jsonl",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27-184726.md",
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27-184726.json",
"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27-184726.jsonl",
"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27-184726.md",
"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27-184726.json",
"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27-184726.jsonl",
"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27-184726.md",
"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27-184726.json",
"docs/plans/artifacts/phase0_baseline_live_backend_drift_2026-02-27-184726.md",
"docs/plans/artifacts/phase0_baseline_live_backend_drift_2026-02-27-184726.json",
"docs/plans/artifacts/phase0_baseline_live_prune_2026-02-27-184726.md",
"docs/plans/artifacts/phase0_baseline_live_prune_2026-02-27-184726.json",
"docs/plans/artifacts/phase0_baseline_live_prune_2026-02-27.md",
"docs/plans/artifacts/phase0_baseline_live_prune_2026-02-27.json",
"docs/plans/state.json"
],
"test_status": "pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune + pnpm audit:phase0-baseline:live:prune + pnpm test:run src/audit/phase0BaselineArtifactRetention.test.ts + pnpm typecheck passing"
},
"phase0-instrumentation-ticket-checklist": {
"status": "completed",
"date": "2026-02-25",
@@ -7533,7 +7570,7 @@
"deeper_surfaces_phase0_ticket_02": "completed — gateway + daemon routing emit run lifecycle/cancel telemetry and reaction match/skip audit events with filter summaries and cancellation latency, plus focused tests",
"deeper_surfaces_phase0_ticket_03": "completed — gateway metrics now track run-state outcomes, cancel latency samples, and reaction decision counters with routing/gateway emitters",
"deeper_surfaces_phase0_ticket_04": "completed — added phase-0 baseline summary tooling for run outcomes, cancel latency, and reaction decisions with markdown/json CLI output",
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, generated anonymized live baseline artifacts for channel/gateway/backend-scoped (pi/native) windows, added backend freshness/drift gates with persisted drift reports, added rolling timestamp-tag cadence runs for immediate baseline-vs-prior drift comparisons, added rolling artifact retention tooling (`live:prune` with prune reports), and added one-command cadence refresh+drift+retention apply (`live:refresh:drift:rolling:prune`)",
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, generated anonymized live baseline artifacts for channel/gateway/backend-scoped (pi/native) windows, added backend freshness/drift gates with persisted drift reports, added rolling timestamp-tag cadence runs for immediate baseline-vs-prior drift comparisons, added rolling artifact retention tooling (`live:prune` with prune reports retained as a managed family), and added one-command cadence refresh+drift+retention apply (`live:refresh:drift:rolling:prune`)",
"next_up": "Run scheduled `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` in each active environment for at least one full cadence cycle (tuning `KEEP_PER_FAMILY` as needed), then tighten drift thresholds based on observed variance before additional run-control/reaction semantic changes.",
"pi_embedded_canary_spike": "completed — added optional pi_embedded backend adapter, canary-safe no-tools routing guard, backend success/fallback latency audit events, and docs/diagram updates while native remains default",
"pi_embedded_evaluation_phase": "completed — final decision rollback (applied in runtime config): Window A failed latency/fallback gates (p50 +259ms, p95 +5695ms, fallback 25%, categories: pi_module_interface/empty_assistant_text); Window B remained sample-insufficient; controlled probes verified guard coverage (pi_no_tools_mode/capability_query/attachments_present each hit once)",