feat(audit): refresh all phase0 live windows in cadence run
@@ -23,7 +23,7 @@ The gateway provides:
 - **HTTP Server**: Serves static dashboard and handles webhook endpoints
 - **Node Capability Negotiation**: Optional companion-node role/capability registration

-Operational note: onboarding (`flynn setup` / `flynn onboard`) now runs post-save live readiness checks (model/channel/memory/automation) and prints a guided first-success task flow. Companion CLI now also supports bootstrap-manifest export (`flynn companion --export-bootstrap <path|->`), release-bundle export (`--export-release-bundle <dir>` with optional `--signing-key`/`--signing-key-id` signature output), release-bundle verification (`--verify-release-bundle <dir>` with optional `--verify-signing-key`/`--verify-signing-key-id`/`--require-signature`), platform shell-template export (`--export-shell-template <dir>`), plus richer shell bootstrap flags for status/location/push (`--app-version`, `--latitude/--longitude`, `--push-token`, etc.) for desktop/mobile app packaging without changing JSON-RPC method/event shapes. Audit observability now includes live phase-0 baseline capture flows: `pnpm audit:phase0-baseline:live` for channel-origin windows, backend-scoped variants (`pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native`) via `--backend`, `pnpm audit:phase0-baseline:live:gateway` (auto-detected cancel window) for gateway-origin windows, `pnpm audit:phase0-baseline:live:refresh` for one-shot refresh of both windows, and `pnpm audit:phase0-baseline:live:drift` for backend artifact freshness/drift gates (writing `phase0_baseline_live_backend_drift_<UTC-date>.md/.json` reports). These scripts default to current UTC-date tags unless `--tag` is explicitly provided.
+Operational note: onboarding (`flynn setup` / `flynn onboard`) now runs post-save live readiness checks (model/channel/memory/automation) and prints a guided first-success task flow. Companion CLI now also supports bootstrap-manifest export (`flynn companion --export-bootstrap <path|->`), release-bundle export (`--export-release-bundle <dir>` with optional `--signing-key`/`--signing-key-id` signature output), release-bundle verification (`--verify-release-bundle <dir>` with optional `--verify-signing-key`/`--verify-signing-key-id`/`--require-signature`), platform shell-template export (`--export-shell-template <dir>`), plus richer shell bootstrap flags for status/location/push (`--app-version`, `--latitude/--longitude`, `--push-token`, etc.) for desktop/mobile app packaging without changing JSON-RPC method/event shapes. Audit observability now includes live phase-0 baseline capture flows: `pnpm audit:phase0-baseline:live` for channel-origin windows, backend-scoped variants (`pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native`) via `--backend`, `pnpm audit:phase0-baseline:live:gateway` (auto-detected cancel window) for gateway-origin windows, `pnpm audit:phase0-baseline:live:refresh` for one-shot refresh of all live windows (channel + gateway + backend-scoped), and `pnpm audit:phase0-baseline:live:drift` for backend artifact freshness/drift gates (writing `phase0_baseline_live_backend_drift_<UTC-date>.md/.json` reports). These scripts default to current UTC-date tags unless `--tag` is explicitly provided.

 ### Execution Model (Sessions + Per-Session Queue)

@@ -169,7 +169,7 @@ Gateway streaming UX signals:
 - `pnpm audit:phase0-baseline:live` captures anonymized channel-origin live run/reaction baseline artifacts from real audit logs.
 - `pnpm audit:phase0-baseline:live:pi` and `pnpm audit:phase0-baseline:live:native` capture backend-scoped channel windows using `backend.route` timelines.
 - `pnpm audit:phase0-baseline:live:gateway` captures gateway-origin baseline windows by auto-selecting the latest cancel/cancelled session window (or use `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...` for explicit windows).
-- `pnpm audit:phase0-baseline:live:refresh` runs both channel + gateway capture commands in one step for cadence refreshes.
+- `pnpm audit:phase0-baseline:live:refresh` runs channel + gateway + backend-scoped (`pi_embedded` and `native`) capture commands in one cadence step.
 - `pnpm audit:phase0-baseline:live:drift` evaluates backend-scoped artifact freshness/drift gates and writes `docs/plans/artifacts/phase0_baseline_live_backend_drift_<UTC-date>.md/.json`; `pnpm audit:phase0-baseline:live:refresh:drift` runs capture + drift checks in one cadence step.
 - `audit:phase0-baseline:live*` scripts are cadence-safe by default (UTC-date tags auto-generated unless explicitly overridden).
 - Canvas artifacts are persisted by the gateway so session UI surfaces can recover after daemon restarts.
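
The notes in this hunk say the `audit:phase0-baseline:live*` scripts fall back to a UTC-date tag unless one is given explicitly, and that the drift check writes `phase0_baseline_live_backend_drift_<UTC-date>.md/.json`. A minimal sketch of that naming rule follows. The helper names `utcDateTag` and `driftReportNames` are hypothetical and not taken from the repository; only the `YYYY-MM-DD` tag shape and the file-name pattern come from the text and artifacts in this commit.

```typescript
// Hypothetical helpers: only the tag shape and the file-name pattern are
// grounded in the commit; the function names are assumptions.

// Derive the default tag from the current time in UTC (YYYY-MM-DD).
function utcDateTag(now: Date = new Date()): string {
  // toISOString() always renders in UTC, e.g. "2026-02-27T17:36:02.803Z".
  return now.toISOString().slice(0, 10);
}

// Build the drift report file names; an explicit tag overrides the default.
function driftReportNames(
  tag?: string,
  now: Date = new Date(),
): { md: string; json: string } {
  const resolved = tag ?? utcDateTag(now);
  const base = `phase0_baseline_live_backend_drift_${resolved}`;
  return { md: `${base}.md`, json: `${base}.json` };
}

const names = driftReportNames(undefined, new Date("2026-02-27T17:36:02.803Z"));
console.log(names.md, names.json);
```

Deriving the tag from UTC rather than local time keeps scheduled runs on different hosts writing to the same artifact name for a given day.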
@@ -34,7 +34,7 @@ If you only want the protocol surface, see `docs/api/PROTOCOL.md`.
 - Audit phase-0 live telemetry snapshots can be regenerated with `pnpm audit:phase0-baseline:live` (channel-origin anonymized sample JSONL + summary JSON/markdown artifacts).
 - Backend-scoped channel snapshots can be regenerated with `pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native` (`--backend` filtering via `backend.route` timelines).
 - Gateway-origin phase-0 windows (including cancel-path samples) can be captured with `pnpm audit:phase0-baseline:live:gateway` (auto-detect latest cancel window) or `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...` for explicit bounds.
-- `pnpm audit:phase0-baseline:live:refresh` runs both capture paths to refresh channel + gateway artifacts in one command.
+- `pnpm audit:phase0-baseline:live:refresh` runs channel + gateway + backend-scoped (`pi_embedded` and `native`) capture paths in one command.
 - `pnpm audit:phase0-baseline:live:drift` checks backend-scoped artifact freshness/drift gates and writes `phase0_baseline_live_backend_drift_<UTC-date>.md/.json`; `pnpm audit:phase0-baseline:live:refresh:drift` chains refresh + drift checks for scheduled cadence runs.
 - `audit:phase0-baseline:live*` package scripts now omit fixed tags so scheduled runs automatically roll to current UTC-date artifact tags.
 - Companion CLI supports one-shot shell bootstrap metadata for live sessions (`--app-version`/`--status-text`, `--latitude`/`--longitude`, `--push-token`) so desktop/mobile wrappers can initialize node status/location/push in a single launch flow.
@@ -203,7 +203,7 @@ Phase 0 is complete when:
 2. A baseline summary artifact is generated and committed under `docs/plans/artifacts/`.
 3. No user-visible response behavior changed compared to pre-phase baseline.

-Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), both windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (scheduling example included in README), backend-scoped channel windows are now available via `pnpm audit:phase0-baseline:live:pi` / `pnpm audit:phase0-baseline:live:native`, and backend artifact freshness/drift checks are now available via `pnpm audit:phase0-baseline:live:drift` (or chained with `pnpm audit:phase0-baseline:live:refresh:drift`) with drift report artifacts written to `docs/plans/artifacts/phase0_baseline_live_backend_drift_<UTC-date>.{md,json}`.
+Follow-up status (2026-02-27): live channel-session artifacts exist under `docs/plans/artifacts/phase0_baseline_live_2026-02-27.*` via `pnpm audit:phase0-baseline:live` (anonymized IDs), and a second gateway-origin live window (including `run.cancel` + `cancel_requested`/`cancelled`) exists under `docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.*`. Gateway window refreshes can now run via `pnpm audit:phase0-baseline:live:gateway` (auto-selected cancel window), all live windows can be refreshed together with `pnpm audit:phase0-baseline:live:refresh` (channel + gateway + backend-scoped `pi`/`native`; scheduling example included in README), backend artifact freshness/drift checks are now available via `pnpm audit:phase0-baseline:live:drift` (or chained with `pnpm audit:phase0-baseline:live:refresh:drift`) with drift report artifacts written to `docs/plans/artifacts/phase0_baseline_live_backend_drift_<UTC-date>.{md,json}`.

 ## Subagent Model Assignment Plan

@@ -1,8 +1,8 @@
 {
-"generated_at": "2026-02-27T16:46:42.576Z",
+"generated_at": "2026-02-27T17:36:01.625Z",
 "source_audit_path": "~/.local/share/flynn/audit.log",
-"source_event_count": 110,
-"sampled_event_count": 104,
+"source_event_count": 115,
+"sampled_event_count": 109,
 "filters": {
 "sources": [
 "channel"
@@ -22,19 +22,19 @@
 },
 "summary": {
 "event_counts": {
-"run_state": 65,
+"run_state": 68,
 "run_cancel": 0,
 "reaction_match": 0,
-"reaction_skip": 39
+"reaction_skip": 41
 },
 "run_outcomes": {
 "overall": {
-"total_outcomes": 27,
-"complete": 27,
+"total_outcomes": 28,
+"complete": 28,
 "cancelled": 0,
 "error": 0,
 "cancel_requested": 0,
-"start": 38,
+"start": 40,
 "completion_rate_pct": 100,
 "cancel_rate_pct": 0,
 "error_rate_pct": 0
@@ -43,12 +43,12 @@
 {
 "key": "gmail",
 "stats": {
-"total_outcomes": 25,
-"complete": 25,
+"total_outcomes": 26,
+"complete": 26,
 "cancelled": 0,
 "error": 0,
 "cancel_requested": 0,
-"start": 25,
+"start": 26,
 "completion_rate_pct": 100,
 "cancel_rate_pct": 0,
 "error_rate_pct": 0
@@ -62,7 +62,7 @@
 "cancelled": 0,
 "error": 0,
 "cancel_requested": 0,
-"start": 13,
+"start": 14,
 "completion_rate_pct": 100,
 "cancel_rate_pct": 0,
 "error_rate_pct": 0
@@ -112,6 +112,20 @@
 "error_rate_pct": 0
 }
 },
+{
+"key": "session_f6304f25e43b",
+"stats": {
+"total_outcomes": 2,
+"complete": 2,
+"cancelled": 0,
+"error": 0,
+"cancel_requested": 0,
+"start": 2,
+"completion_rate_pct": 100,
+"cancel_rate_pct": 0,
+"error_rate_pct": 0
+}
+},
 {
 "key": "session_33469de5a1ee",
 "stats": {
@@ -322,20 +336,6 @@
 "error_rate_pct": 0
 }
 },
-{
-"key": "session_f6304f25e43b",
-"stats": {
-"total_outcomes": 1,
-"complete": 1,
-"cancelled": 0,
-"error": 0,
-"cancel_requested": 0,
-"start": 1,
-"completion_rate_pct": 100,
-"cancel_rate_pct": 0,
-"error_rate_pct": 0
-}
-},
 {
 "key": "session_fd6536fa5ff4",
 "stats": {
@@ -355,14 +355,14 @@
 "cancel_latency_ms": null,
 "reactions": {
 "matched": 0,
-"skipped": 39,
-"total": 39,
+"skipped": 41,
+"total": 41,
 "match_rate_pct": 0,
 "skip_rate_pct": 100,
 "skip_reasons": [
 {
 "reason": "no_rules",
-"count": 39,
+"count": 41,
 "pct": 100
 }
 ]
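
The `run_outcomes` stats in this summary pair raw counts with `completion_rate_pct`, `cancel_rate_pct`, and `error_rate_pct`. The sketch below shows one way such rates could be derived. It assumes that `total_outcomes` is the sum of `complete`, `cancelled`, and `error` (consistent with the values in this commit, where `start` is clearly not counted), that rates become `null` when there are no outcomes (as the backend-scoped artifacts show for sessions with only a `start`), and that values are rounded to two decimals. The names `OutcomeCounts` and `outcomeRates` are hypothetical.

```typescript
// Hypothetical types and names; the field names mirror the summary JSON,
// but the derivation itself is an assumption, not the script's actual code.
interface OutcomeCounts {
  complete: number;
  cancelled: number;
  error: number;
}

interface OutcomeRates {
  total_outcomes: number;
  completion_rate_pct: number | null;
  cancel_rate_pct: number | null;
  error_rate_pct: number | null;
}

// Percentage rounded to two decimals, or null when the denominator is zero.
function pct(part: number, total: number): number | null {
  if (total === 0) return null;
  return Math.round((part / total) * 10000) / 100;
}

function outcomeRates(counts: OutcomeCounts): OutcomeRates {
  const total = counts.complete + counts.cancelled + counts.error;
  return {
    total_outcomes: total,
    completion_rate_pct: pct(counts.complete, total),
    cancel_rate_pct: pct(counts.cancelled, total),
    error_rate_pct: pct(counts.error, total),
  };
}

console.log(outcomeRates({ complete: 28, cancelled: 0, error: 0 }));
console.log(outcomeRates({ complete: 0, cancelled: 0, error: 0 }));
```

Returning `null` instead of `0` for an empty denominator keeps "no data" distinguishable from "zero percent" in downstream reports.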
@@ -102,3 +102,8 @@
 {"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772208000012}
 {"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"start","request_id":"request_a3bafbb93755"},"timestamp":1772208000013}
 {"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"complete","request_id":"request_a3bafbb93755","duration_ms":35239},"timestamp":1772208035252}
+{"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772211257454}
+{"level":"info","event_type":"run.state","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","state":"start","request_id":"request_607c64c2760f"},"timestamp":1772211257454}
+{"level":"info","event_type":"run.state","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","state":"complete","request_id":"request_607c64c2760f","duration_ms":3870},"timestamp":1772211261324}
+{"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_534570702ea5","channel":"cron","sender":"sender_552aeb8f1b32","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772211600036}
+{"level":"info","event_type":"run.state","event":{"session_id":"session_534570702ea5","channel":"cron","sender":"sender_552aeb8f1b32","source":"channel","state":"start","request_id":"request_c0a9fc76c188"},"timestamp":1772211600036}
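
The sampled JSONL lines in this hunk carry an `event_type` such as `run.state` or `reaction.skip`, while the summary JSON reports counts under keys like `run_state` and `reaction_skip`. A rough sketch of how such counts could be tallied from a JSONL sample follows. The function name `countEventTypes` and the dot-to-underscore key mapping are assumptions inferred from the artifacts, and the sample events are trimmed to a few fields for brevity.

```typescript
// Hypothetical tally over JSONL audit samples. The event shape is reduced to
// the one field this sketch needs; real lines carry more fields.
interface AuditLine {
  event_type: string;
}

function countEventTypes(jsonl: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines and a trailing newline
    const parsed = JSON.parse(trimmed) as AuditLine;
    // Assumed mapping: "run.state" -> "run_state", as in the summary keys.
    const key = parsed.event_type.replace(/\./g, "_");
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

const sample = [
  '{"level":"debug","event_type":"reaction.skip","event":{"reason":"no_rules"},"timestamp":1772211257454}',
  '{"level":"info","event_type":"run.state","event":{"state":"start"},"timestamp":1772211257454}',
  '{"level":"info","event_type":"run.state","event":{"state":"complete"},"timestamp":1772211261324}',
].join("\n");

console.log(countEventTypes(sample));
```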
@@ -1,27 +1,27 @@
 # Phase 0 Baseline Telemetry Summary

-- Run state events: 65
+- Run state events: 68
 - Run cancel events: 0
 - Reaction matches: 0
-- Reaction skips: 39
+- Reaction skips: 41

 - Sources: channel

 ## Run Outcomes (Overall)

-- Total outcomes: 27
-- Complete: 27 (100.00%)
+- Total outcomes: 28
+- Complete: 28 (100.00%)
 - Cancelled: 0 (0.00%)
 - Errors: 0 (0.00%)
 - Cancel requested: 0
-- Starts: 38
+- Starts: 40

 ## Run Outcomes by Channel

 | Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
 | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
-| gmail | 25 | 25 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 25 |
-| cron | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 13 |
+| gmail | 26 | 26 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 26 |
+| cron | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 14 |

 ## Run Outcomes by Session

@@ -30,6 +30,7 @@
 | session_2f2f1e414e81 | 5 | 5 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 5 |
 | session_f4d8ddc04194 | 3 | 3 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 3 |
 | session_eabc3c2a91b9 | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 2 |
+| session_f6304f25e43b | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 2 |
 | session_33469de5a1ee | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
 | session_3ffb2e631ab1 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
 | session_4d9e843358a3 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
@@ -45,7 +46,6 @@
 | session_cb9a69d8a362 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
 | session_e0a2a17b7329 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
 | session_ea839415979e | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
-| session_f6304f25e43b | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
 | session_fd6536fa5ff4 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |

 ## Cancel Latency
@@ -55,11 +55,11 @@
 ## Reaction Decisions

 - Matched: 0 (0.00%)
-- Skipped: 39 (100.00%)
+- Skipped: 41 (100.00%)

 ### Skip Reasons

 | Reason | Count | Percent |
 | --- | ---: | ---: |
-| no_rules | 39 | 100.00% |
+| no_rules | 41 | 100.00% |

@@ -1,5 +1,5 @@
 {
-"generated_at": "2026-02-27T17:04:49.009Z",
+"generated_at": "2026-02-27T17:36:02.803Z",
 "artifacts_dir": "/home/will/lab/flynn/docs/plans/artifacts",
 "backends": [
 "pi_embedded",
@@ -29,15 +29,15 @@
 "candidate": {
 "tag": "2026-02-27",
 "path": "/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.json",
-"generated_at": "2026-02-27T16:45:18.488Z"
+"generated_at": "2026-02-27T17:36:02.214Z"
 },
 "baseline": null,
 "comparison": {
 "baseline": null,
 "candidate": {
-"source_event_count": 110,
-"sampled_event_count": 56,
-"run_total_outcomes": 25,
+"source_event_count": 115,
+"sampled_event_count": 59,
+"run_total_outcomes": 26,
 "completion_rate_pct": 100,
 "cancel_rate_pct": 0,
 "error_rate_pct": 0,
@@ -59,7 +59,7 @@
 "freshness": {
 "enabled": true,
 "pass": true,
-"actual_age_hours": 0.33,
+"actual_age_hours": 0,
 "threshold_hours": 36
 },
 "drift_gate": {
@@ -68,7 +68,7 @@
 {
 "criterion": "candidate_sampled_events",
 "pass": true,
-"actual": "56",
+"actual": "59",
 "threshold": ">= 10"
 },
 {
@@ -116,21 +116,21 @@
 "candidate": {
 "tag": "2026-02-27",
 "path": "/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.json",
-"generated_at": "2026-02-27T16:45:18.490Z"
+"generated_at": "2026-02-27T17:36:02.514Z"
 },
 "baseline": null,
 "comparison": {
 "baseline": null,
 "candidate": {
-"source_event_count": 110,
-"sampled_event_count": 13,
+"source_event_count": 115,
+"sampled_event_count": 15,
 "run_total_outcomes": 2,
 "completion_rate_pct": 100,
 "cancel_rate_pct": 0,
 "error_rate_pct": 0,
 "cancel_latency_p95_ms": null,
-"reaction_match_rate_pct": null,
-"reaction_skip_rate_pct": null
+"reaction_match_rate_pct": 0,
+"reaction_skip_rate_pct": 100
 },
 "deltas": {
 "sampled_event_count_pct": null,
@@ -146,7 +146,7 @@
 "freshness": {
 "enabled": true,
 "pass": true,
-"actual_age_hours": 0.33,
+"actual_age_hours": 0,
 "threshold_hours": 36
 },
 "drift_gate": {
@@ -155,7 +155,7 @@
 {
 "criterion": "candidate_sampled_events",
 "pass": true,
-"actual": "13",
+"actual": "15",
 "threshold": ">= 10"
 },
 {
@@ -1,6 +1,6 @@
 # Phase-0 Backend Drift Check

-Generated at: 2026-02-27T17:04:49.009Z
+Generated at: 2026-02-27T17:36:02.803Z
 Artifacts: /home/will/lab/flynn/docs/plans/artifacts
 Backends: pi_embedded, native
 Freshness max age (hours): 36
@@ -19,9 +19,9 @@ Overall gate: PASS
 ## pi_embedded
 - status: PASS
 - candidate: tag=2026-02-27 file=/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.json
-- candidate generated_at: 2026-02-27T16:45:18.488Z
+- candidate generated_at: 2026-02-27T17:36:02.214Z
 - baseline: none
-- candidate snapshot: sampled=56 outcomes=25 completion=100% cancel=0% error=0% cancel_p95_ms=n/a
+- candidate snapshot: sampled=59 outcomes=26 completion=100% cancel=0% error=0% cancel_p95_ms=n/a
 - deltas:
 sampled_event_count_pct=n/a
 run_total_outcomes_pct=n/a
@@ -31,9 +31,9 @@ Overall gate: PASS
 cancel_latency_p95_ms=n/a
 reaction_match_rate_pp=n/a
 reaction_skip_rate_pp=n/a
-- freshness gate: PASS (age_hours=0.33 threshold=36)
+- freshness gate: PASS (age_hours=0 threshold=36)
 - drift gate: PASS
-PASS candidate_sampled_events actual=56 threshold=>= 10
+PASS candidate_sampled_events actual=59 threshold=>= 10
 PASS sampled_events_drop_pct actual=n/a threshold=<= 80
 PASS run_outcomes_drop_pct actual=n/a threshold=<= 80
 PASS completion_rate_drop_pp actual=n/a threshold=<= 35
@@ -44,9 +44,9 @@ Overall gate: PASS
 ## native
 - status: PASS
 - candidate: tag=2026-02-27 file=/home/will/lab/flynn/docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.json
-- candidate generated_at: 2026-02-27T16:45:18.490Z
+- candidate generated_at: 2026-02-27T17:36:02.514Z
 - baseline: none
-- candidate snapshot: sampled=13 outcomes=2 completion=100% cancel=0% error=0% cancel_p95_ms=n/a
+- candidate snapshot: sampled=15 outcomes=2 completion=100% cancel=0% error=0% cancel_p95_ms=n/a
 - deltas:
 sampled_event_count_pct=n/a
 run_total_outcomes_pct=n/a
@@ -56,9 +56,9 @@ Overall gate: PASS
 cancel_latency_p95_ms=n/a
 reaction_match_rate_pp=n/a
 reaction_skip_rate_pp=n/a
-- freshness gate: PASS (age_hours=0.33 threshold=36)
+- freshness gate: PASS (age_hours=0 threshold=36)
 - drift gate: PASS
-PASS candidate_sampled_events actual=13 threshold=>= 10
+PASS candidate_sampled_events actual=15 threshold=>= 10
 PASS sampled_events_drop_pct actual=n/a threshold=<= 80
 PASS run_outcomes_drop_pct actual=n/a threshold=<= 80
 PASS completion_rate_drop_pp actual=n/a threshold=<= 35
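
The drift report above records a freshness gate (`age_hours` against a 36-hour threshold) and a `candidate_sampled_events` criterion with threshold `>= 10`. The sketch below illustrates how those two checks could be evaluated. It assumes the age is the gap between the report's `generated_at` and the candidate artifact's `generated_at`, rounded to two decimals, which reproduces the `0.33` and `0` values in this commit; the inclusive `<=` comparison and the names `ageHours`, `freshnessGate`, and `sampledEventsGate` are assumptions rather than the script's actual implementation.

```typescript
// Hypothetical gate helpers; thresholds (36 h, >= 10 events) come from the
// report, while the rounding and comparison operators are assumptions.
function ageHours(candidateGeneratedAt: string, checkedAt: string): number {
  const ms = Date.parse(checkedAt) - Date.parse(candidateGeneratedAt);
  return Math.round((ms / 3600000) * 100) / 100; // two-decimal hours
}

function freshnessGate(
  candidateGeneratedAt: string,
  checkedAt: string,
  thresholdHours = 36,
): { pass: boolean; actual_age_hours: number; threshold_hours: number } {
  const actual = ageHours(candidateGeneratedAt, checkedAt);
  return {
    pass: actual <= thresholdHours,
    actual_age_hours: actual,
    threshold_hours: thresholdHours,
  };
}

function sampledEventsGate(sampled: number, minimum = 10): boolean {
  return sampled >= minimum;
}

// Values taken from the old and new sides of the pi_embedded report.
console.log(freshnessGate("2026-02-27T16:45:18.488Z", "2026-02-27T17:04:49.009Z"));
console.log(freshnessGate("2026-02-27T17:36:02.214Z", "2026-02-27T17:36:02.803Z"));
console.log(sampledEventsGate(59));
```

The drop of `age_hours` from 0.33 to 0 is what one would expect once the refresh step regenerates the backend-scoped artifacts immediately before the drift check runs.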
@@ -1,8 +1,8 @@
 {
-"generated_at": "2026-02-27T16:45:18.490Z",
+"generated_at": "2026-02-27T17:36:02.514Z",
 "source_audit_path": "~/.local/share/flynn/audit.log",
-"source_event_count": 110,
-"sampled_event_count": 13,
+"source_event_count": 115,
+"sampled_event_count": 15,
 "filters": {
 "sources": [
 "channel"
@@ -14,7 +14,7 @@
 "probe"
 ],
 "anonymized_identifiers": true,
-"backend_route_event_count": 127
+"backend_route_event_count": 129
 },
 "options": {
 "sources": [
@@ -26,10 +26,10 @@
 },
 "summary": {
 "event_counts": {
-"run_state": 13,
+"run_state": 14,
 "run_cancel": 0,
 "reaction_match": 0,
-"reaction_skip": 0
+"reaction_skip": 1
 },
 "run_outcomes": {
 "overall": {
@@ -38,7 +38,7 @@
 "cancelled": 0,
 "error": 0,
 "cancel_requested": 0,
-"start": 11,
+"start": 12,
 "completion_rate_pct": 100,
 "cancel_rate_pct": 0,
 "error_rate_pct": 0
@@ -52,7 +52,7 @@
 "cancelled": 0,
 "error": 0,
 "cancel_requested": 0,
-"start": 11,
+"start": 12,
 "completion_rate_pct": 100,
 "cancel_rate_pct": 0,
 "error_rate_pct": 0
@@ -172,6 +172,20 @@
 "error_rate_pct": null
 }
 },
+{
+"key": "session_534570702ea5",
+"stats": {
+"total_outcomes": 0,
+"complete": 0,
+"cancelled": 0,
+"error": 0,
+"cancel_requested": 0,
+"start": 1,
+"completion_rate_pct": null,
+"cancel_rate_pct": null,
+"error_rate_pct": null
+}
+},
 {
 "key": "session_683372f346c3",
 "stats": {
@@ -219,11 +233,17 @@
 "cancel_latency_ms": null,
 "reactions": {
 "matched": 0,
-"skipped": 0,
-"total": 0,
-"match_rate_pct": null,
-"skip_rate_pct": null,
-"skip_reasons": []
+"skipped": 1,
+"total": 1,
+"match_rate_pct": 0,
+"skip_rate_pct": 100,
+"skip_reasons": [
+{
+"reason": "no_rules",
+"count": 1,
+"pct": 100
+}
+]
 }
 }
 }
@@ -11,3 +11,5 @@
 {"level":"info","event_type":"run.state","event":{"session_id":"session_a3f64a8e3c1e","channel":"cron","sender":"sender_a31bd6d4a95a","source":"channel","state":"start","request_id":"request_fc572d83d4c6"},"timestamp":1772182800034}
 {"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"start","request_id":"request_a3bafbb93755"},"timestamp":1772208000013}
 {"level":"info","event_type":"run.state","event":{"session_id":"session_5ae4ad331184","channel":"cron","sender":"sender_a912a223d950","source":"channel","state":"complete","request_id":"request_a3bafbb93755","duration_ms":35239},"timestamp":1772208035252}
+{"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_534570702ea5","channel":"cron","sender":"sender_552aeb8f1b32","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772211600036}
+{"level":"info","event_type":"run.state","event":{"session_id":"session_534570702ea5","channel":"cron","sender":"sender_552aeb8f1b32","source":"channel","state":"start","request_id":"request_c0a9fc76c188"},"timestamp":1772211600036}
@@ -1,9 +1,9 @@
 # Phase 0 Baseline Telemetry Summary

-- Run state events: 13
+- Run state events: 14
 - Run cancel events: 0
 - Reaction matches: 0
-- Reaction skips: 0
+- Reaction skips: 1

 - Sources: channel

@@ -14,13 +14,13 @@
 - Cancelled: 0 (0.00%)
 - Errors: 0 (0.00%)
 - Cancel requested: 0
-- Starts: 11
+- Starts: 12

 ## Run Outcomes by Channel

 | Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
 | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
-| cron | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 11 |
+| cron | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 12 |

 ## Run Outcomes by Session

@@ -34,6 +34,7 @@
 | session_494cb3b392af | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
 | session_49b700741e03 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
 | session_4cd8ba5e6df5 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
+| session_534570702ea5 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
 | session_683372f346c3 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
 | session_a3f64a8e3c1e | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
 | session_ffcee254d546 | 0 | 0 | 0 | 0 | n/a | n/a | n/a | 0 | 1 |
@@ -44,12 +45,12 @@

 ## Reaction Decisions

-- Matched: 0 (n/a)
-- Skipped: 0 (n/a)
+- Matched: 0 (0.00%)
+- Skipped: 1 (100.00%)

 ### Skip Reasons

 | Reason | Count | Percent |
 | --- | ---: | ---: |
-| _none_ | 0 | 0.00% |
+| no_rules | 1 | 100.00% |

@@ -1,8 +1,8 @@
 {
-"generated_at": "2026-02-27T16:45:18.488Z",
+"generated_at": "2026-02-27T17:36:02.214Z",
 "source_audit_path": "~/.local/share/flynn/audit.log",
-"source_event_count": 110,
-"sampled_event_count": 56,
+"source_event_count": 115,
+"sampled_event_count": 59,
 "filters": {
 "sources": [
 "channel"
@@ -14,7 +14,7 @@
 "probe"
 ],
 "anonymized_identifiers": true,
-"backend_route_event_count": 127
+"backend_route_event_count": 129
 },
 "options": {
 "sources": [
@@ -26,19 +26,19 @@
 },
 "summary": {
 "event_counts": {
-"run_state": 42,
+"run_state": 44,
 "run_cancel": 0,
 "reaction_match": 0,
-"reaction_skip": 14
+"reaction_skip": 15
 },
 "run_outcomes": {
 "overall": {
-"total_outcomes": 25,
-"complete": 25,
+"total_outcomes": 26,
+"complete": 26,
 "cancelled": 0,
 "error": 0,
 "cancel_requested": 0,
-"start": 17,
+"start": 18,
 "completion_rate_pct": 100,
 "cancel_rate_pct": 0,
 "error_rate_pct": 0
@@ -47,12 +47,12 @@
 {
 "key": "gmail",
 "stats": {
-"total_outcomes": 25,
-"complete": 25,
+"total_outcomes": 26,
+"complete": 26,
 "cancelled": 0,
 "error": 0,
 "cancel_requested": 0,
-"start": 17,
+"start": 18,
 "completion_rate_pct": 100,
 "cancel_rate_pct": 0,
 "error_rate_pct": 0
@@ -102,6 +102,20 @@
 "error_rate_pct": 0
 }
 },
+{
+"key": "session_f6304f25e43b",
+"stats": {
+"total_outcomes": 2,
+"complete": 2,
+"cancelled": 0,
+"error": 0,
+"cancel_requested": 0,
+"start": 2,
+"completion_rate_pct": 100,
+"cancel_rate_pct": 0,
+"error_rate_pct": 0
+}
+},
 {
 "key": "session_33469de5a1ee",
 "stats": {
@@ -284,20 +298,6 @@
 "error_rate_pct": 0
 }
 },
-{
-"key": "session_f6304f25e43b",
-"stats": {
-"total_outcomes": 1,
-"complete": 1,
-"cancelled": 0,
-"error": 0,
-"cancel_requested": 0,
-"start": 1,
-"completion_rate_pct": 100,
-"cancel_rate_pct": 0,
-"error_rate_pct": 0
-}
-},
 {
 "key": "session_fd6536fa5ff4",
 "stats": {
@@ -317,14 +317,14 @@
 "cancel_latency_ms": null,
 "reactions": {
 "matched": 0,
-"skipped": 14,
-"total": 14,
+"skipped": 15,
+"total": 15,
 "match_rate_pct": 0,
 "skip_rate_pct": 100,
 "skip_reasons": [
 {
 "reason": "no_rules",
-"count": 14,
+"count": 15,
 "pct": 100
 }
 ]
@@ -54,3 +54,6 @@
 {"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_2f2f1e414e81","channel":"gmail","sender":"sender_323cedc3233a","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772206157229}
 {"level":"info","event_type":"run.state","event":{"session_id":"session_2f2f1e414e81","channel":"gmail","sender":"sender_323cedc3233a","source":"channel","state":"start","request_id":"request_ab73d670c119"},"timestamp":1772206157229}
 {"level":"info","event_type":"run.state","event":{"session_id":"session_2f2f1e414e81","channel":"gmail","sender":"sender_323cedc3233a","source":"channel","state":"complete","request_id":"request_ab73d670c119","duration_ms":3850},"timestamp":1772206161079}
+{"level":"debug","event_type":"reaction.skip","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","reason":"no_rules","candidate_count":0},"timestamp":1772211257454}
+{"level":"info","event_type":"run.state","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","state":"start","request_id":"request_607c64c2760f"},"timestamp":1772211257454}
+{"level":"info","event_type":"run.state","event":{"session_id":"session_f6304f25e43b","channel":"gmail","sender":"sender_311c7608cc58","source":"channel","state":"complete","request_id":"request_607c64c2760f","duration_ms":3870},"timestamp":1772211261324}
@@ -1,26 +1,26 @@
 # Phase 0 Baseline Telemetry Summary

-- Run state events: 42
+- Run state events: 44
 - Run cancel events: 0
 - Reaction matches: 0
-- Reaction skips: 14
+- Reaction skips: 15

 - Sources: channel

 ## Run Outcomes (Overall)

-- Total outcomes: 25
-- Complete: 25 (100.00%)
+- Total outcomes: 26
+- Complete: 26 (100.00%)
 - Cancelled: 0 (0.00%)
 - Errors: 0 (0.00%)
 - Cancel requested: 0
-- Starts: 17
+- Starts: 18

 ## Run Outcomes by Channel

 | Channel | Outcomes | Complete | Cancelled | Error | Complete % | Cancel % | Error % | Cancel Req | Starts |
 | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
-| gmail | 25 | 25 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 17 |
+| gmail | 26 | 26 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 18 |

 ## Run Outcomes by Session

@@ -29,6 +29,7 @@
|
||||
| session_2f2f1e414e81 | 5 | 5 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 5 |
|
||||
| session_f4d8ddc04194 | 3 | 3 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 3 |
|
||||
| session_eabc3c2a91b9 | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
|
||||
| session_f6304f25e43b | 2 | 2 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 2 |
|
||||
| session_33469de5a1ee | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
|
||||
| session_3ffb2e631ab1 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 0 |
|
||||
| session_4d9e843358a3 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 0 |
|
||||
@@ -42,7 +43,6 @@
|
||||
| session_cb9a69d8a362 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
|
||||
| session_e0a2a17b7329 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
|
||||
| session_ea839415979e | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
|
||||
| session_f6304f25e43b | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
|
||||
| session_fd6536fa5ff4 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% | 0 | 1 |
|
||||
|
||||
## Cancel Latency
|
||||
@@ -52,11 +52,11 @@
|
||||
## Reaction Decisions
|
||||
|
||||
- Matched: 0 (0.00%)
|
||||
- Skipped: 14 (100.00%)
|
||||
- Skipped: 15 (100.00%)
|
||||
|
||||
### Skip Reasons
|
||||
|
||||
| Reason | Count | Percent |
|
||||
| --- | ---: | ---: |
|
||||
| no_rules | 14 | 100.00% |
|
||||
| no_rules | 15 | 100.00% |
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
{
|
||||
"generated_at": "2026-02-27T16:46:42.880Z",
|
||||
"generated_at": "2026-02-27T17:36:01.922Z",
|
||||
"source_audit_path": "~/.local/share/flynn/audit.log",
|
||||
"source_event_count": 6,
|
||||
"sampled_event_count": 6,
|
||||
|
||||
+31
-1
@@ -234,6 +234,36 @@
|
||||
],
|
||||
"test_status": "pnpm audit:phase0-baseline:live:drift + pnpm test:run src/audit/phase0BaselineDrift.test.ts + pnpm typecheck passing"
|
||||
},
|
||||
"phase0-live-baseline-refresh-full-window": {
|
||||
"status": "completed",
|
||||
"date": "2026-02-27",
|
||||
"updated": "2026-02-27",
|
||||
"summary": "Expanded `pnpm audit:phase0-baseline:live:refresh` to regenerate all live windows in one command (channel, gateway, backend-scoped `pi_embedded`, backend-scoped `native`) so scheduled `refresh:drift` runs keep backend artifacts fresh for baseline-vs-prior comparisons.",
|
||||
"files_modified": [
|
||||
"package.json",
|
||||
"README.md",
|
||||
"docs/api/PROTOCOL.md",
|
||||
"docs/architecture/AGENT_DIAGRAM.md",
|
||||
"docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md",
|
||||
"docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md",
|
||||
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.jsonl",
|
||||
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.md",
|
||||
"docs/plans/artifacts/phase0_baseline_live_2026-02-27.json",
|
||||
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.jsonl",
|
||||
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.md",
|
||||
"docs/plans/artifacts/phase0_baseline_live_gateway_2026-02-27.json",
|
||||
"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.jsonl",
|
||||
"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.md",
|
||||
"docs/plans/artifacts/phase0_baseline_live_backend_pi_embedded_2026-02-27.json",
|
||||
"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.jsonl",
|
||||
"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.md",
|
||||
"docs/plans/artifacts/phase0_baseline_live_backend_native_2026-02-27.json",
|
||||
"docs/plans/artifacts/phase0_baseline_live_backend_drift_2026-02-27.md",
|
||||
"docs/plans/artifacts/phase0_baseline_live_backend_drift_2026-02-27.json",
|
||||
"docs/plans/state.json"
|
||||
],
|
||||
"test_status": "pnpm audit:phase0-baseline:live:refresh:drift + pnpm test:run src/audit/phase0BaselineDrift.test.ts + pnpm typecheck passing"
|
||||
},
|
||||
"phase0-instrumentation-ticket-checklist": {
|
||||
"status": "completed",
|
||||
"date": "2026-02-25",
|
||||
@@ -7420,7 +7450,7 @@
|
||||
"deeper_surfaces_phase0_ticket_03": "completed — gateway metrics now track run-state outcomes, cancel latency samples, and reaction decision counters with routing/gateway emitters",
|
||||
"deeper_surfaces_phase0_ticket_04": "completed — added phase-0 baseline summary tooling for run outcomes, cancel latency, and reaction decisions with markdown/json CLI output",
|
||||
"deeper_surfaces_phase0_ticket_05": "completed — documented phase-0 telemetry fields/workflow, refreshed architecture/protocol docs, generated anonymized live baseline artifacts for channel/gateway/backend-scoped (pi/native) windows, and added backend artifact freshness/drift gates with persisted drift reports (`phase0_baseline_live_backend_drift_<UTC-date>.{md,json}`)",
|
||||
"next_up": "Run scheduled `pnpm audit:phase0-baseline:live:refresh:drift` in each active environment and collect at least one additional UTC-date drift artifact so baseline-vs-prior comparisons become active before tightening thresholds or changing additional run-control/reaction semantics.",
|
||||
"next_up": "Run scheduled `pnpm audit:phase0-baseline:live:refresh:drift` in each active environment (now refreshing channel + gateway + backend-scoped windows together) and collect at least one additional UTC-date drift artifact so baseline-vs-prior comparisons become active before tightening thresholds or changing additional run-control/reaction semantics.",
|
||||
"pi_embedded_canary_spike": "completed — added optional pi_embedded backend adapter, canary-safe no-tools routing guard, backend success/fallback latency audit events, and docs/diagram updates while native remains default",
|
||||
"pi_embedded_evaluation_phase": "completed — final decision rollback (applied in runtime config): Window A failed latency/fallback gates (p50 +259ms, p95 +5695ms, fallback 25%, categories: pi_module_interface/empty_assistant_text); Window B remained sample-insufficient; controlled probes verified guard coverage (pi_no_tools_mode/capability_query/attachments_present each hit once)",
|
||||
"pi_embedded_manual_mode": "completed — added persisted runtime backend controls for manual Pi activation/deactivation (`/runtime` preferred, `/backend` alias; `status`, `activate pi`, `deactivate pi`, `use config`) while keeping config-driven default routing",
|
||||
|
||||