8.9 KiB
8.9 KiB
Flynn Deeper End-User Surfaces + Integrated Behavior Stack Plan
Date: 2026-02-25
Status: proposed roadmap
Scope: deepen assistant "product feel" (behavior semantics + user-facing surfaces) without rewriting Flynn core architecture
Summary
This plan adopts a balanced hybrid strategy:
- Improve behavior semantics first where correctness risk is highest (interrupt/cancel/run control).
- In parallel, ship selective deeper user surfaces (companion, canvas persistence, voice continuity).
- Land each slice with explicit observability gates so rollout decisions are data-driven.
Why This Plan
Flynn already has strong foundations:
- Queue + session orchestration:
src/gateway/lane-queue.ts,src/gateway/session-bridge.ts - Multi-path routing and backend fallback:
src/daemon/routing.ts - Companion RPC foundation:
src/companion/runtimeClient.ts,src/gateway/protocol.ts - Canvas API baseline:
src/gateway/handlers/canvas.ts,src/gateway/canvas-store.ts - Voice in/out primitives:
src/models/media.ts,src/models/tts.ts - Reactions baseline:
src/automation/reactions.ts
Largest remaining gap vs OpenClaw-like "assistant feel" is integration behavior across those systems, not missing foundational architecture.
Goals and Success Criteria
- Deterministic active-run control under bursty traffic
- Rich, safe proactive behavior stack
- Durable end-user surfaces for companion/canvas/voice
- Measurable reliability improvements across canary phases
Quantitative success gates:
- Cancel-to-ack p95 <= 500ms on gateway sessions.
- Duplicate assistant responses caused by run preemption: 0 in integration tests.
- Reaction false-positive rate <= 3% in canary logs.
- Companion reconnect success >= 99% in soak tests.
- Canvas artifact persistence survives daemon restart in integration tests.
- Voice failures degrade to text-only replies with no dropped responses.
Out of Scope
- Full native macOS/iOS/Android app suite in this phase set.
- Broad protocol redesign or protocol-version breaking changes.
- Pi backend expansion (kept separate from this roadmap until re-approval).
Workstreams and Complexity
| Workstream | Complexity | Main Risk |
|---|---|---|
| Run-control semantics unification | High | race conditions and cancellation ordering |
| Reactions + proactive behavior v2 | Medium-High | noisy or looping automation |
| Companion + canvas + voice deepening | High | cross-surface consistency and restart behavior |
| Rollout hardening + observability | Medium | incomplete canary signals |
Phase 0 - Baseline Instrumentation and Guardrails
Duration: 3-5 days
Execution checklist:
docs/plans/2026-02-25-phase0-instrumentation-ticket-checklist.md
Deliverables
- Add baseline event and metric instrumentation for:
- run-state transitions,
- cancellation path timings,
- reaction match/skip reasons,
- surface delivery outcomes.
- Define canary gate calculator inputs for each later phase.
Files
src/audit/types.tssrc/audit/logger.tssrc/gateway/metrics.tsdocs/api/PROTOCOL.md(event semantics if needed)docs/plans/state.json
Acceptance
- Baseline report generated before behavior changes.
- No runtime behavior changes in this phase, only observability.
Phase 1 - Run-Control Semantics + Gateway UX Signals
Duration: 2-3 weeks
Objectives
- Enforce "latest wins" semantics when queue policy is
interrupt. - Align cancellation behavior between gateway and channel router paths.
- Expose user-visible run lifecycle state in gateway events/UI.
Implementation
- Queue semantics hardening:
src/gateway/lane-queue.ts- ensure queued + active behavior rules are explicit and testable.
- Active run cancellation wiring:
src/gateway/handlers/agent.tssrc/gateway/session-bridge.tssrc/daemon/routing.ts(activeRunsparity behavior).
- Event surface:
- add additive
run_stateevent insrc/gateway/protocol.ts - consume/render in
src/gateway/ui/pages/chat.js.
- add additive
Test Plan
src/gateway/lane-queue.test.ts: preemption ordering, overflow with interrupt, debounce edge cases.src/gateway/handlers/agent.test.ts: interrupt + active cancel + queued supersede flows.src/daemon/routing.test.ts: channel-path cancellation parity.src/gateway/ui/pages/chat.test.ts:run_staterendering and transitions.
Acceptance
- Cancel-to-ack p95 <= 500ms.
- Zero duplicate final responses in integration suite.
- Backward compatibility for clients ignoring
run_state.
Phase 2 - Reactions and Proactive Behavior Stack V2
Duration: 2 weeks
Objectives
- Replace first-match reaction behavior with deterministic priority + cooldown semantics.
- Keep announce delivery safe and auditable.
- Prevent recursion/looping behavior.
Config/API Additions (backward-compatible)
Extend automation.reactions[] in src/config/schema.ts with:
priority(number, default100)cooldown_ms(number, default0)stop_on_match(boolean, defaulttrue)
Existing fields remain valid and unchanged.
Implementation
- Reaction engine expansion:
src/automation/reactions.ts(or splitreactionEngine.tsif needed).
- Routing integration:
src/daemon/routing.tsdeterministic reaction resolution.
- Delivery consistency:
src/automation/cron.tssrc/automation/webhooks.ts- preserve
delivery_modesemantics and audit metadata.
Test Plan
src/automation/reactions.test.ts:- priority conflict resolution,
- cooldown suppression,
- metadata and template rendering.
src/daemon/routing.test.ts:- reaction trigger integration and command-path exclusion.
src/automation/cron.test.ts/src/automation/webhooks.test.ts:- announce/isolation metadata correctness.
Acceptance
- False-positive match rate <= 3% in canary.
- No reaction recursion loops.
- Deterministic rule selection under overlap.
Phase 3 - Deeper Surfaces: Companion, Canvas Durability, Voice Continuity
Duration: 3-4 weeks
Objectives
- Upgrade companion from heartbeat-only utility to reliable daily-use surface.
- Make canvas artifacts durable across restart.
- Improve voice continuity behavior around cancellation and fallbacks.
Implementation
- Companion hardening:
src/cli/companion.tssrc/companion/runtimeClient.tssrc/gateway/handlers/node.ts- focus on reconnect and subscription resilience.
- Canvas persistence:
src/gateway/canvas-store.ts(durable backing instead of in-memory only)src/gateway/handlers/canvas.ts- UI rendering/inspection in
src/gateway/ui/pages/chat.js(or dedicated canvas page).
- Voice continuity:
src/daemon/routing.ts(talk-mode + cancellation + output behavior)src/models/tts.ts- channel adapter output checks where required.
Test Plan
- Companion integration tests for reconnect and event continuity.
- Canvas store tests for restart durability and eviction policy.
- Voice tests for TTS errors, fallback to text, and interrupted runs.
Acceptance
- Companion reconnect success >= 99% in soak.
- Canvas survives daemon restart in integration suite.
- Voice path never drops assistant reply when TTS fails.
Phase 4 - Rollout, Hardening, and Operator Readiness
Duration: 1 week
Deliverables
- Canary rollout plan by feature flag/surface.
- Explicit rollback playbook.
- Operator docs and architecture/protocol docs synchronized.
Documentation Updates (required in same PRs)
README.mddocs/api/PROTOCOL.mddocs/architecture/AGENT_DIAGRAM.mddocs/architecture/GATEWAY_SESSIONS_AND_QUEUE.mddocs/plans/state.json
Execution and Commit Structure
- Branch:
feature/deeper-surfaces-integrated-behavior-stack - Atomic commits per slice:
- implementation + tests + docs +
state.jsontogether
- implementation + tests + docs +
- Rebase onto
mainbefore merge. - Fast-forward merge only.
Model-Tier Delegation Plan for Implementation Work
claude-haiku-4.5:- mechanical schema/test/doc updates.
claude-sonnet-4.6:- default implementation tasks across queue/routing/companion/canvas.
claude-opus-4.6:- concurrency semantics review, failure-mode design, and rollout gate design.
Risks and Mitigations
- Risk: preemption races create duplicate or orphaned replies.
Mitigation: run-state event model + deterministic cancellation tests. - Risk: proactive rules become noisy.
Mitigation: priority/cooldown/stop semantics + canary thresholds. - Risk: deeper surfaces drift from core behavior semantics.
Mitigation: shared gateway protocol contracts and integration tests across surfaces.
Default Decisions Locked
- Keep gateway protocol backward-compatible (additive only).
- Prioritize behavior reliability before broadening platform count.
- Use companion CLI/runtime path as first deep-surface target.
- Keep Pi expansion out of this roadmap until separate canary re-approval.