docs(plan): add pi canary evaluation gate and tracking (diagram review no topology change)

William Valentin
2026-02-23 22:26:35 -08:00
parent afddd1ba7a
commit 4f88e047fd
7 changed files with 153 additions and 2 deletions
@@ -363,6 +363,19 @@ backends:
`pi_embedded` is intended for canary migration cohorts. In spike mode (`no_tools_mode: true`), Flynn keeps tool-oriented turns on the `native` backend and routes only plain-text turns to Pi.
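The spike-mode routing rule can be sketched as a small function. This is only an illustration of the rule as described above: the `Turn` type, the `needsTools` flag, and the `routeSpikeTurn` name are hypothetical and are not taken from Flynn's code, and how Flynn actually classifies a turn as tool-oriented is not covered here.

```typescript
// Hypothetical types and names, for illustration only.
type Backend = "native" | "pi_embedded";

interface Turn {
  text: string;
  // Assumption: some upstream step has already decided whether the turn
  // is tool-oriented. The real classification logic is not shown.
  needsTools: boolean;
}

// Spike mode (`no_tools_mode: true`): tool-oriented turns stay on native,
// plain-text turns go to Pi.
function routeSpikeTurn(turn: Turn): Backend {
  return turn.needsTools ? "native" : "pi_embedded";
}
```

Under these assumptions, a turn flagged as tool-oriented resolves to `native`, and any other turn resolves to `pi_embedded`.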
To evaluate canary performance from audit logs, run:
```bash
pnpm audit:backend-canary \
--audit ~/.local/share/flynn/audit.log \
--backend pi_embedded \
--baseline native \
--session telegram:8367012007 \
--format markdown
```
Phase-2 evaluation checklist and decision template: `docs/plans/pi_embedded_evaluation.md`.
When `args` is non-empty:
- use `{prompt}` in an argument to inject the full generated prompt directly into argv.
- if `{prompt}` is not present, Flynn appends backend-specific prompt args.
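The two rules above can be sketched as follows. This is a minimal illustration, not Flynn's implementation: `buildArgv` and `fallbackPromptArgs` are hypothetical names, and the `-p` flag in the usage note is a made-up stand-in for whatever backend-specific prompt args Flynn actually appends. The sketch only models the non-empty `args` case.

```typescript
const PLACEHOLDER = "{prompt}";

// Hypothetical helper: builds the final argv for a backend command.
// `fallbackPromptArgs` stands in for the backend-specific prompt args.
function buildArgv(
  args: string[],
  prompt: string,
  fallbackPromptArgs: (prompt: string) => string[],
): string[] {
  const hasPlaceholder = args.some((arg) => arg.includes(PLACEHOLDER));

  if (hasPlaceholder) {
    // split/join substitutes every occurrence literally, so characters
    // such as `$` in the prompt are not treated as replacement patterns.
    return args.map((arg) => arg.split(PLACEHOLDER).join(prompt));
  }

  // No placeholder anywhere: keep args unchanged and append the
  // backend-specific prompt args.
  return [...args, ...fallbackPromptArgs(prompt)];
}
```

For example, with the hypothetical fallback `(p) => ["-p", p]`, `buildArgv(["--message", "{prompt}"], "hi", ...)` yields `["--message", "hi"]`, while `buildArgv(["--json"], "hi", ...)` yields `["--json", "-p", "hi"]`.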