# Flynn Personal Assistant Productization Plan (Post-Gap Rebaseline)

Date: 2026-02-26
Status: completed roadmap (2026-02-27)
Scope: ship the remaining product-layer capabilities that make Flynn feel like a daily autonomous personal assistant

## Rebaseline (What Is Already Done)

Completion update (2026-02-27): all roadmap phases are now implemented, including companion app-surface packaging outputs (bootstrap export, signed/verified release bundles, reference app starter surfaces including a runnable macOS menu-bar Swift Package scaffold plus iOS/Android runtime shell skeletons, and CI artifact workflow), voice reliability, browser workflow reliability, and onboarding first-success improvements.

The following were previously treated as gaps but are already implemented in Flynn:

1. Broader channels are present (`src/channels/index.ts` includes Signal, Matrix, Teams, BlueBubbles, LINE, Feishu, Zalo, Mattermost, Google Chat).
2. Guided onboarding is present (`flynn setup` and `flynn onboard` in `src/cli/setup.ts` and `src/cli/onboard.ts`).
3. Browser automation baseline is present (`browser.navigate/click/type/screenshot/content/eval` in `src/tools/builtin/browser/tools.ts`).
4. Companion protocol/runtime foundation is present (`src/companion/runtimeClient.ts`, `src/companion/platformClients.ts`).
5. Talk mode + wake phrase baseline is present (`src/daemon/routing.ts`, `audio.talk_mode` schema support).
6. Subagent sessions now include queue/budget controls, transcript export, and session inspection UX (`subagent.*`, `/subagents`).

## Remaining Product Gaps (Now)

1. No shipped end-user companion apps (desktop/mobile) despite protocol readiness.
2. Voice UX is functional but not yet a polished, end-to-end daily-driver experience across surfaces.
3. Browser tools exist but lack task-level reliability primitives (checkpoints/retries/guardrails) for autonomous workflows.
4. Onboarding lacks a "first success" guided path that validates real integrations live during setup.


## Product Goal

Within 8-10 weeks, ship a stable "Personal Assistant Mode" that supports:

1. Always-available access from primary user surfaces (desktop + mobile companion path).
2. Reliable hands-free capture/reply loops with graceful degradation.
3. Safe browser-assisted task execution for common personal workflows.
4. First-run setup that gets a new user from zero to first successful automated task in under 15 minutes.

## Success Metrics

1. `activation_rate_7d` (setup complete -> first automated action within 7 days): >= 65%.
2. Companion reconnect success in soak: >= 99.5%.
3. Voice turn completion (reply delivered as audio or text fallback): >= 99.9%.
4. Browser workflow completion on canonical scripts: >= 90% with explicit failure reasons.
5. Time-to-first-successful-automation from `flynn setup`: median <= 15 minutes.

## Roadmap

## Phase 1 (Weeks 1-2): Companion MVP Surfaces (Desktop First)

### Deliverables

1. Ship a macOS menu-bar companion reference app using existing runtime protocol.
2. Ship a minimal mobile companion shell (iOS + Android) for registration, status, push token, and message handoff.
3. Add signed release artifacts and installation docs.

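
To illustrate what checking release artifacts against a checksum manifest could involve, here is a minimal TypeScript sketch. It assumes a `sha256sum`-style manifest (`<hex digest>  <filename>` per line) and bundle contents already loaded into memory; the function and type names (`sha256Hex`, `parseChecksums`, `verifyBundle`, `ChecksumResult`) are hypothetical and are not taken from the Flynn codebase.

```typescript
import { createHash } from "node:crypto";

// Hypothetical result shape for one manifest entry (illustration only).
type ChecksumResult = { file: string; ok: boolean; reason?: string };

// Hex-encoded SHA-256 digest of a file's contents.
function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Parse `sha256sum`-style lines: "<64 hex chars>  <filename>".
// ASSUMPTION: the manifest format is chosen for illustration.
function parseChecksums(manifest: string): Record<string, string> {
  const entries: Record<string, string> = {};
  for (const line of manifest.split(/\r?\n/)) {
    const match = /^([0-9a-f]{64})\s+\*?(.+)$/i.exec(line.trim());
    if (match) {
      entries[match[2]] = match[1].toLowerCase();
    }
  }
  return entries;
}

// Compare every manifest entry against the in-memory bundle contents.
function verifyBundle(
  manifest: string,
  files: Record<string, string | Uint8Array>,
): ChecksumResult[] {
  const expected = parseChecksums(manifest);
  return Object.keys(expected).map((file): ChecksumResult => {
    const contents = files[file];
    if (contents === undefined) {
      return { file, ok: false, reason: "missing from bundle" };
    }
    if (sha256Hex(contents) !== expected[file]) {
      return { file, ok: false, reason: "digest mismatch" };
    }
    return { file, ok: true };
  });
}
```

A caller would treat the bundle as installable only if every returned entry has `ok: true`. Verifying a detached signature over the manifest is a separate step that this sketch does not cover.
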

Status update (2026-02-27):

1. Companion bootstrap-manifest export is now available via `flynn companion --export-bootstrap ` as a packaging contract for desktop/mobile shells.
2. `flynn companion --export-release-bundle ` now emits bundle artifacts (bootstrap JSON + launcher + README + `CHECKSUMS.sha256` + `RELEASE_MANIFEST.json`, optional `CHECKSUMS.sha256.sig` with `--signing-key`).
3. `flynn companion --verify-release-bundle ` now validates checksum/signature artifacts before install.
4. `pnpm companion:bundle -- --output ...` now provides one-pass build-and-verify automation.
5. `.github/workflows/companion-release-bundle.yml` provides CI artifact build/verify/upload.
6. `flynn companion --export-shell-template ` now emits macOS/iOS/Android starter shell templates including iOS/Android runtime skeletons for register/status/location/push/handoff flows.
7. `pnpm companion:reference-apps` now regenerates in-repo macOS/iOS/Android reference app starter directories plus `apps/companion/macos-app` runnable menu-bar scaffold.
8. `flynn companion` supports one-shot status/location/push bootstrap flags (`--app-version`, `--latitude/--longitude`, `--push-token`) so thin shells can initialize companion metadata in a single run.

### Implementation Anchors

1. `src/companion/runtimeClient.ts`
2. `src/companion/platformClients.ts`
3. `src/cli/companion.ts`
4. `src/gateway/handlers/node.ts`
5. `docs/api/PROTOCOL.md`

### Tests

1. Extend `src/companion/platformClients.integration.test.ts` for reconnect, background wake, and token refresh flows.
2. Add end-to-end gateway fixture tests for node lifecycle transitions and push-token updates.

### Exit Criteria

1. A user can install companion, pair, receive assistant response notifications, and reopen after disconnect without manual repair.

## Phase 2 (Weeks 3-4): Voice Daily-Driver Reliability

### Deliverables

1. Unify wake/talk controls across TUI, gateway UI, and companion surfaces.
2. Add robust TTS provider fallback policy with per-provider health tracking.
3. Add interruption-safe voice run control (cancel/replace behavior) consistent with text runs.

### Implementation Anchors

1. `src/daemon/routing.ts`
2. `src/models/tts.ts`
3. `src/gateway/handlers/agent.ts`
4. `src/gateway/protocol.ts`
5. `src/gateway/ui/pages/chat.js`

### Tests

1. Expand `src/daemon/routing.test.ts` with concurrent voice/text interruption cases.
2. Add provider-failure matrix tests for TTS fallback.
3. Add protocol/UI tests for voice run-state rendering.

### Exit Criteria

1. No dropped assistant replies when voice synthesis fails; response falls back to text deterministically.

## Phase 3 (Weeks 5-7): Browser Task Automation Reliability Layer

### Deliverables

1. Add browser workflow primitives: `browser.wait_for`, `browser.assert`, `browser.extract`, and retry wrappers.
2. Add task checkpoints with resumable execution state for long workflows.
3. Add guardrails: domain allowlists, explicit high-risk confirmation hooks, and bounded execution budgets.

### Implementation Anchors

1. `src/tools/builtin/browser/tools.ts`
2. `src/tools/executor.ts`
3. `src/tools/policy.ts`
4. `src/config/schema.ts`
5. `src/hooks/*`

### Tests

1. Add deterministic browser tool tests for retry/checkpoint/error classification.
2. Add policy tests for domain budget/confirm behavior.
3. Add integration tests that replay canonical user tasks (form fill, booking-like flow, account portal navigation).

### Exit Criteria

1. Canonical browser workflows pass at >= 90% in CI replay suite with auditable failures.

## Phase 4 (Weeks 8-10): Onboarding 2.0 + First-Success Funnel

### Deliverables

1. Add "Personal Assistant Mode" wizard preset focused on practical defaults.
2. Add live connectivity checks during setup (model, channel, memory, automation).
3. Add a post-setup guided first task that confirms end-to-end assistant operation.

### Implementation Anchors

1. `src/cli/setup.ts`
2. `src/cli/onboard.ts`
3. `src/cli/doctor.ts`
4. `README.md`
5. `docs/architecture/AGENT_DIAGRAM.md`
6. `docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md`
7. `docs/api/PROTOCOL.md`

### Tests

1. Extend `src/cli/setup/integration.test.ts` with live-check and first-task paths.
2. Add regression coverage for failed-check remediation prompts.

### Exit Criteria

1. New user reaches first successful automated task from clean install in median <= 15 minutes.

## Cross-Cutting Controls (All Phases)

1. Feature flags for each phase with canary rollout and rollback paths.
2. Audit event coverage for all new autonomous/voice/browser behaviors.
3. Every implementation PR must include tests, docs, and `docs/plans/state.json` updates.
4. Diagram review/update is mandatory for code changes affecting flow semantics.

## Execution Order and Parallelization

1. Run Phase 1 and Phase 2 in parallel after finalizing shared run-state contracts.
2. Start Phase 3 once Phase 2 interruption semantics are stable.
3. Start Phase 4 onboarding work in parallel with late Phase 3 once APIs stabilize.

## Top Risks and Mitigations

1. Risk: companion client reliability drifts from gateway contract. Mitigation: contract tests pinned to `docs/api/PROTOCOL.md` event schema.
2. Risk: voice experience appears flaky under provider variability. Mitigation: deterministic fallback policy + provider health scoring + explicit user-visible degrade messaging.
3. Risk: browser autonomy creates brittle flows. Mitigation: checkpoint/retry primitives, strict policy defaults, and explicit risk confirmation hooks.
4. Risk: roadmap spread too wide. Mitigation: desktop-first companion scope, canonical-task suite, and hard phase exit gates before expansion.

## Definition of Done (Roadmap Complete)

1. Companion desktop and mobile shells are shippable with documented install/run paths.
2. Voice and text paths share deterministic run-control semantics and pass reliability gates.
3. Browser automations run through a resilient workflow layer, not raw primitive chaining only.
4. Onboarding produces measurable first-success outcomes and reduced drop-off.
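
As an illustration of the Phase 3 guardrails (domain allowlists, high-risk confirmation hooks, bounded execution budgets), a per-step policy check might look like the sketch below. Every type and function name here (`BrowserPolicy`, `BrowserStep`, `Decision`, `evaluateStep`) is hypothetical and not taken from `src/tools/policy.ts`, and treating `browser.eval` as high-risk in the usage lines is an assumption made only for the example.

```typescript
// Hypothetical types for illustration; not Flynn's actual policy schema.
type Decision =
  | { action: "allow" }
  | { action: "confirm"; reason: string }
  | { action: "deny"; reason: string };

interface BrowserPolicy {
  allowedDomains: string[]; // lowercase hosts; a parent domain also covers its subdomains
  maxSteps: number;         // bounded execution budget per workflow
  highRiskTools: string[];  // tools that need explicit user confirmation
}

interface BrowserStep {
  tool: string; // e.g. "browser.navigate"
  url: string;  // page the step acts on
}

// Exact host match, or a true subdomain of an allowlisted domain.
function hostAllowed(host: string, allowed: string[]): boolean {
  return allowed.some((d) => host === d || host.endsWith("." + d));
}

// Checks run in order: budget, URL validity, allowlist, then risk level.
function evaluateStep(
  policy: BrowserPolicy,
  step: BrowserStep,
  stepsUsed: number,
): Decision {
  if (stepsUsed >= policy.maxSteps) {
    return { action: "deny", reason: "step budget exhausted" };
  }
  let host: string;
  try {
    host = new URL(step.url).hostname; // the URL parser lowercases the host
  } catch {
    return { action: "deny", reason: "invalid URL" };
  }
  if (!hostAllowed(host, policy.allowedDomains)) {
    return { action: "deny", reason: "domain not allowlisted: " + host };
  }
  if (policy.highRiskTools.includes(step.tool)) {
    return { action: "confirm", reason: step.tool + " requires confirmation" };
  }
  return { action: "allow" };
}

// Usage with an assumed policy.
const examplePolicy: BrowserPolicy = {
  allowedDomains: ["example.com"],
  maxSteps: 25,
  highRiskTools: ["browser.eval"],
};
const exampleDecision = evaluateStep(
  examplePolicy,
  { tool: "browser.navigate", url: "https://mail.example.com/inbox" },
  0,
);
console.log(exampleDecision.action);
```

Returning a reason string with every `deny` or `confirm` decision keeps failures auditable, which matches the plan's requirement for explicit failure reasons on browser workflows.
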