docs(audit): add phase0 baseline cadence scheduling runbook

William Valentin
2026-02-27 00:43:37 -08:00
parent 826df1d35b
commit 4880d757c5
9 changed files with 31 additions and 8 deletions
+6
@@ -1634,6 +1634,12 @@ One-shot refresh for both channel + gateway live windows:
pnpm audit:phase0-baseline:live:refresh
```
Cadence scheduling (example: every 6 hours via host cron):
```bash
0 */6 * * * cd /path/to/flynn && pnpm audit:phase0-baseline:live:refresh >> ~/.local/share/flynn/phase0_baseline_refresh.log 2>&1
```
`audit:phase0-baseline:live*` scripts now default to the current UTC date tag when `--tag` is omitted.
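The default-tag behavior described above can be sketched roughly as follows. This is an illustration only, not the code in `scripts/capture-phase0-live-baseline.ts`; the helper names `utcDateTag` and `resolveTag` are hypothetical, and the `YYYY-MM-DD` shape is assumed from the existing `2026-02-27` artifact tags.

```typescript
// Hypothetical helpers; the real capture script may resolve tags differently.

// Date#toISOString() always renders in UTC ("YYYY-MM-DDTHH:mm:ss.sssZ"),
// so the first 10 characters are the UTC calendar date.
function utcDateTag(now: Date = new Date()): string {
  return now.toISOString().slice(0, 10);
}

// An explicit, non-empty tag wins; otherwise fall back to the current UTC date.
function resolveTag(explicitTag: string | undefined, now: Date = new Date()): string {
  return explicitTag !== undefined && explicitTag.trim() !== ""
    ? explicitTag
    : utcDateTag(now);
}

// A run at 16:43 local time on 2026-02-27 in a UTC-08:00 zone
// is already 2026-02-28 in UTC.
const lateAfternoonPacific = new Date("2026-02-27T16:43:37-08:00");
console.log(resolveTag(undefined, lateAfternoonPacific)); // prints "2026-02-28"
console.log(resolveTag("2026-02-27", lateAfternoonPacific)); // prints "2026-02-27"
```

Host cron typically evaluates its schedule in the host's local time zone while the tag is a UTC date, so under this assumption a late-day run on a host behind UTC would write artifacts tagged with the next calendar day.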
Gateway-origin windows can be captured separately (for example when validating cancel paths):
```bash
pnpm audit:phase0-baseline:live:gateway
+1 -1
@@ -23,7 +23,7 @@ The gateway provides:
- **HTTP Server**: Serves static dashboard and handles webhook endpoints
- **Node Capability Negotiation**: Optional companion-node role/capability registration
Operational note: onboarding (`flynn setup` / `flynn onboard`) now runs post-save live readiness checks (model/channel/memory/automation) and prints a guided first-success task flow. Companion CLI now also supports bootstrap-manifest export (`flynn companion --export-bootstrap <path|->`), release-bundle export (`--export-release-bundle <dir>` with optional `--signing-key`/`--signing-key-id` signature output), release-bundle verification (`--verify-release-bundle <dir>` with optional `--verify-signing-key`/`--verify-signing-key-id`/`--require-signature`), platform shell-template export (`--export-shell-template <dir>`), plus richer shell bootstrap flags for status/location/push (`--app-version`, `--latitude/--longitude`, `--push-token`, etc.) for desktop/mobile app packaging without changing JSON-RPC method/event shapes.
Audit observability now includes live phase-0 baseline capture flows: `pnpm audit:phase0-baseline:live` for channel-origin windows, `pnpm audit:phase0-baseline:live:gateway` (auto-detected cancel window) for gateway-origin windows, and `pnpm audit:phase0-baseline:live:refresh` for one-shot refresh of both windows. These scripts default to current UTC-date tags unless `--tag` is explicitly provided.
### Execution Model (Sessions + Per-Session Queue)
+1
@@ -169,6 +169,7 @@ Gateway streaming UX signals:
- `pnpm audit:phase0-baseline:live` captures anonymized channel-origin live run/reaction baseline artifacts from real audit logs.
- `pnpm audit:phase0-baseline:live:gateway` captures gateway-origin baseline windows by auto-selecting the latest cancel/cancelled session window (or use `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...` for explicit windows).
- `pnpm audit:phase0-baseline:live:refresh` runs both channel + gateway capture commands in one step for cadence refreshes.
- `audit:phase0-baseline:live*` scripts are cadence-safe by default (UTC-date tags auto-generated unless explicitly overridden).
- Canvas artifacts are persisted by the gateway so session UI surfaces can recover after daemon restarts.
- TTS synthesis uses an ordered provider chain with health cooldown tracking; if all providers fail, replies degrade to text-only without dropping the response.
- Talk mode accepts spoken/text `stop`/`cancel` while active and maps it onto the same `/stop` run-control cancellation path used for text sessions.
@@ -34,6 +34,7 @@ If you only want the protocol surface, see `docs/api/PROTOCOL.md`.
- Audit phase-0 live telemetry snapshots can be regenerated with `pnpm audit:phase0-baseline:live` (channel-origin anonymized sample JSONL + summary JSON/markdown artifacts).
- Gateway-origin phase-0 windows (including cancel-path samples) can be captured with `pnpm audit:phase0-baseline:live:gateway` (auto-detect latest cancel window) or `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...` for explicit bounds.
- `pnpm audit:phase0-baseline:live:refresh` runs both capture paths to refresh channel + gateway artifacts in one command.
- `audit:phase0-baseline:live*` package scripts now omit fixed tags so scheduled runs automatically roll to current UTC-date artifact tags.
- Companion CLI supports one-shot shell bootstrap metadata for live sessions (`--app-version`/`--status-text`, `--latitude`/`--longitude`, `--push-token`) so desktop/mobile wrappers can initialize node status/location/push in a single launch flow.
- Canvas artifacts are persisted per session under the gateway data directory for UI recovery across restarts.
- TTS output is best-effort with ordered provider fallback + per-provider cooldown tracking; synthesis failures still fall back to text-only responses.
@@ -203,7 +203,7 @@ Phase 0 is complete when:
2. A baseline summary artifact is generated and committed under `docs/plans/artifacts/`.
3. No user-visible response behavior changed compared to pre-phase baseline.
Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), and both windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (scheduling example included in README).
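Once the refresh runs on a schedule, one way to confirm the artifacts are actually being regenerated is to compare the `generated_at` field of a summary JSON against the expected cadence. The sketch below is an assumption-laden illustration: `artifactAgeMs`, `isFresh`, and the "two missed cycles" threshold are hypothetical and not part of the repository; only the `generated_at` field name comes from the summary artifacts.

```typescript
// Hypothetical freshness check; only `generated_at` is taken from the real artifacts.
interface BaselineSummary {
  generated_at: string; // ISO-8601 UTC timestamp, e.g. "2026-02-27T08:43:10.518Z"
}

// Age of an artifact in milliseconds, given the raw summary JSON text.
function artifactAgeMs(summaryJson: string, now: Date = new Date()): number {
  const summary = JSON.parse(summaryJson) as BaselineSummary;
  const generatedAt = Date.parse(summary.generated_at);
  if (Number.isNaN(generatedAt)) {
    throw new Error(`unparseable generated_at: ${String(summary.generated_at)}`);
  }
  return now.getTime() - generatedAt;
}

// An artifact counts as fresh if it is no older than the allowed age.
function isFresh(summaryJson: string, maxAgeMs: number, now: Date = new Date()): boolean {
  return artifactAgeMs(summaryJson, now) <= maxAgeMs;
}

// Assumed policy: with a 6-hour cadence, tolerate up to two missed cycles.
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;
const sampleSummary = JSON.stringify({ generated_at: "2026-02-27T08:43:10.518Z" });
console.log(isFresh(sampleSummary, 2 * SIX_HOURS_MS, new Date("2026-02-27T12:00:00Z"))); // prints true
console.log(isFresh(sampleSummary, 2 * SIX_HOURS_MS, new Date("2026-02-28T12:00:00Z"))); // prints false
```

In practice the JSON text would be read from the summary file under `docs/plans/artifacts/`; the threshold is a judgment call and should track whatever cadence the host scheduler actually uses.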
## Subagent Model Assignment Plan
@@ -1,5 +1,5 @@
{
- "generated_at": "2026-02-27T07:55:30.862Z",
+ "generated_at": "2026-02-27T08:43:10.518Z",
"source_audit_path": "~/.local/share/flynn/audit.log",
"source_event_count": 94,
"sampled_event_count": 88,
@@ -1,5 +1,5 @@
{
- "generated_at": "2026-02-27T07:55:31.178Z",
+ "generated_at": "2026-02-27T08:43:10.946Z",
"source_audit_path": "~/.local/share/flynn/audit.log",
"source_event_count": 6,
"sampled_event_count": 6,
+17 -2
@@ -133,7 +133,7 @@
"status": "completed",
"date": "2026-02-27",
"updated": "2026-02-27",
- "summary": "Automated gateway live-window capture by adding auto-detection of the latest gateway cancel/cancelled window (`--auto-gateway-cancel-window`) plus a one-shot refresh command that regenerates both channel and gateway artifacts together (`pnpm audit:phase0-baseline:live:refresh`).",
+ "summary": "Automated gateway live-window capture by adding auto-detection of the latest gateway cancel/cancelled window (`--auto-gateway-cancel-window`) plus a one-shot refresh command that regenerates both channel and gateway artifacts together (`pnpm audit:phase0-baseline:live:refresh`). Phase-0 live package scripts now omit fixed tags so scheduled runs default to UTC-date artifact tags.",
"files_modified": [
"src/audit/phase0GatewayWindow.ts",
"src/audit/phase0GatewayWindow.test.ts",
@@ -150,6 +150,21 @@
],
"test_status": "pnpm audit:phase0-baseline:live:refresh + pnpm test:run src/audit/phase0GatewayWindow.test.ts src/audit/phase0LiveBaseline.test.ts src/audit/phase0BaselineSummary.test.ts + pnpm typecheck passing"
},
"phase0-live-baseline-cadence-runbook": {
"status": "completed",
"date": "2026-02-27",
"updated": "2026-02-27",
"summary": "Documented host-level operational cadence scheduling for phase-0 baseline refresh with a concrete cron example and clarified that package scripts auto-generate UTC-date tags when `--tag` is omitted.",
"files_modified": [
"README.md",
"docs/api/PROTOCOL.md",
"docs/architecture/AGENT_DIAGRAM.md",
"docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md",
"docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
"docs/plans/state.json"
],
"test_status": "documentation/package-script runbook update only; validated via pnpm audit:phase0-baseline:live:refresh + pnpm typecheck"
},
"phase0-instrumentation-ticket-checklist": {
"status": "completed",
"date": "2026-02-25",
@@ -7336,7 +7351,7 @@
"deeper_surfaces_phase0_ticket_03": "completed — gateway metrics now track run-state outcomes, cancel latency samples, and reaction decision counters with routing/gateway emitters",
"deeper_surfaces_phase0_ticket_04": "completed — added phase-0 baseline summary tooling for run outcomes, cancel latency, and reaction decisions with markdown/json CLI output",
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, and generated anonymized live baseline artifacts for both channel-origin and gateway-origin traffic (including cancel-path coverage)",
- "next_up": "Phase-0 baseline refresh flow is now automated for channel + gateway windows (`pnpm audit:phase0-baseline:live:refresh`); next step is scheduling this command on an operational cadence before additional run-control/reaction semantic changes.",
+ "next_up": "Apply `pnpm audit:phase0-baseline:live:refresh` to the host scheduler (cron/systemd timer) in each active environment and monitor artifact freshness over at least one full cadence cycle before additional run-control/reaction semantic changes.",
"pi_embedded_canary_spike": "completed — added optional pi_embedded backend adapter, canary-safe no-tools routing guard, backend success/fallback latency audit events, and docs/diagram updates while native remains default",
"pi_embedded_evaluation_phase": "completed — final decision rollback (applied in runtime config): Window A failed latency/fallback gates (p50 +259ms, p95 +5695ms, fallback 25%, categories: pi_module_interface/empty_assistant_text); Window B remained sample-insufficient; controlled probes verified guard coverage (pi_no_tools_mode/capability_query/attachments_present each hit once)",
"pi_embedded_manual_mode": "completed — added persisted runtime backend controls for manual Pi activation/deactivation (`/runtime` preferred, `/backend` alias; `status`, `activate pi`, `deactivate pi`, `use config`) while keeping config-driven default routing",
+2 -2
@@ -22,8 +22,8 @@
"config:profiles:check": "node scripts/generate-config-profiles.mjs --check",
"audit:backend-canary": "node --import tsx/esm scripts/summarize-backend-canary.ts",
"audit:phase0-baseline": "node --import tsx/esm scripts/summarize-phase0-baseline.ts",
- "audit:phase0-baseline:live": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source channel --exclude-session-substring probe --tag 2026-02-27",
- "audit:phase0-baseline:live:gateway": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source gateway --auto-gateway-cancel-window --tag 2026-02-27",
+ "audit:phase0-baseline:live": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source channel --exclude-session-substring probe",
+ "audit:phase0-baseline:live:gateway": "node --import tsx/esm scripts/capture-phase0-live-baseline.ts --audit ~/.local/share/flynn/audit.log --source gateway --auto-gateway-cancel-window",
"audit:phase0-baseline:live:refresh": "pnpm audit:phase0-baseline:live && pnpm audit:phase0-baseline:live:gateway",
"audit:backend-canary:probes": "node --import tsx/esm scripts/run-pi-canary-guard-probes.ts",
"companion:bundle": "node --import tsx/esm scripts/build-companion-release-bundle.ts",