Flynn Personal Assistant Productization Plan (Post-Gap Rebaseline)
Date: 2026-02-26
Status: completed roadmap (2026-02-27)
Scope: ship the remaining product-layer capabilities that make Flynn feel like a daily autonomous personal assistant
Rebaseline (What Is Already Done)
Completion update (2026-02-27): all roadmap phases are now implemented, including companion app-surface packaging outputs (bootstrap export, signed/verified release bundles, reference app starter surfaces including a runnable macOS menu-bar Swift Package scaffold plus iOS/Android runtime shell skeletons, and CI artifact workflow), voice reliability, browser workflow reliability, and onboarding first-success improvements.
The following were previously treated as gaps but are already implemented in Flynn:
- Broader channels are present (src/channels/index.ts includes Signal, Matrix, Teams, BlueBubbles, LINE, Feishu, Zalo, Mattermost, Google Chat).
- Guided onboarding is present (flynn setup and flynn onboard in src/cli/setup.ts and src/cli/onboard.ts).
- Browser automation baseline is present (browser.navigate/click/type/screenshot/content/eval in src/tools/builtin/browser/tools.ts).
- Companion protocol/runtime foundation is present (src/companion/runtimeClient.ts, src/companion/platformClients.ts).
- Talk mode + wake phrase baseline is present (src/daemon/routing.ts, audio.talk_mode schema support).
- Subagent sessions now include queue/budget controls, transcript export, and session inspection UX (subagent.*, /subagents).
Remaining Product Gaps (Now)
- No shipped end-user companion apps (desktop/mobile) despite protocol readiness.
- Voice UX is functional but not yet a polished, end-to-end daily-driver experience across surfaces.
- Browser tools exist but lack task-level reliability primitives (checkpoints/retries/guardrails) for autonomous workflows.
- Onboarding lacks a "first success" guided path that validates real integrations live during setup.
Product Goal
Within 8-10 weeks, ship a stable "Personal Assistant Mode" that supports:
- Always-available access from primary user surfaces (desktop + mobile companion path).
- Reliable hands-free capture/reply loops with graceful degradation.
- Safe browser-assisted task execution for common personal workflows.
- First-run setup that gets a new user from zero to first successful automated task in under 15 minutes.
Success Metrics
- activation_rate_7d (setup complete -> first automated action within 7 days): >= 65%.
- Companion reconnect success in soak: >= 99.5%.
- Voice turn completion (reply delivered as audio or text fallback): >= 99.9%.
- Browser workflow completion on canonical scripts: >= 90% with explicit failure reasons.
- Time-to-first-successful-automation from flynn setup: median <= 15 minutes.
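The two funnel metrics above can be computed from per-user timestamps. The sketch below is one possible reading, not Flynn's telemetry code: the record shape and field names are hypothetical, and it assumes the time-to-first-automation clock starts when flynn setup starts.

```typescript
// Hypothetical record shape; field names are assumptions, not Flynn's schema.
interface FunnelRecord {
  setupStartedAt: number;          // epoch ms when `flynn setup` was launched
  setupCompletedAt: number;        // epoch ms when setup finished
  firstAutomatedActionAt?: number; // epoch ms of the first automated action, if any
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// activation_rate_7d: share of users whose first automated action
// happened within 7 days of completing setup.
function activationRate7d(records: FunnelRecord[]): number {
  if (records.length === 0) return 0;
  const activated = records.filter(
    (r) =>
      r.firstAutomatedActionAt !== undefined &&
      r.firstAutomatedActionAt >= r.setupCompletedAt &&
      r.firstAutomatedActionAt - r.setupCompletedAt <= SEVEN_DAYS_MS,
  ).length;
  return activated / records.length;
}

// Median minutes from setup start to first automated action,
// over users who reached a first action at all.
function medianMinutesToFirstAutomation(records: FunnelRecord[]): number | undefined {
  const minutes: number[] = [];
  for (const r of records) {
    if (r.firstAutomatedActionAt !== undefined && r.firstAutomatedActionAt >= r.setupStartedAt) {
      minutes.push((r.firstAutomatedActionAt - r.setupStartedAt) / 60000);
    }
  }
  if (minutes.length === 0) return undefined;
  minutes.sort((a, b) => a - b);
  const lower = minutes[Math.floor((minutes.length - 1) / 2)] ?? 0;
  const upper = minutes[Math.floor(minutes.length / 2)] ?? 0;
  return (lower + upper) / 2;
}
```

Users who never reach a first action count against the activation rate but are excluded from the median, so the two numbers should be read together.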
Roadmap
Phase 1 (Weeks 1-2): Companion MVP Surfaces (Desktop First)
Deliverables
- Ship a macOS menu-bar companion reference app using existing runtime protocol.
- Ship a minimal mobile companion shell (iOS + Android) for registration, status, push token, and message handoff.
- Add signed release artifacts and installation docs.
Status update (2026-02-27):
- Companion bootstrap-manifest export is now available via flynn companion --export-bootstrap <path|-> as a packaging contract for desktop/mobile shells.
- flynn companion --export-release-bundle <dir> now emits bundle artifacts (bootstrap JSON + launcher + README + CHECKSUMS.sha256 + RELEASE_MANIFEST.json, optional CHECKSUMS.sha256.sig with --signing-key).
- flynn companion --verify-release-bundle <dir> now validates checksum/signature artifacts before install.
- pnpm companion:bundle -- --output <dir> ... now provides one-pass build-and-verify automation.
- .github/workflows/companion-release-bundle.yml provides CI artifact build/verify/upload.
- flynn companion --export-shell-template <dir> now emits macOS/iOS/Android starter shell templates, including iOS/Android runtime skeletons for register/status/location/push/handoff flows.
- pnpm companion:reference-apps now regenerates in-repo macOS/iOS/Android reference app starter directories plus the apps/companion/macos-app runnable menu-bar scaffold.
- flynn companion supports one-shot status/location/push bootstrap flags (--app-version, --latitude/--longitude, --push-token) so thin shells can initialize companion metadata in a single run.
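For shells that want to reason about the checksum step themselves, the comparison can be sketched as below. This is not the flynn companion --verify-release-bundle implementation: it assumes CHECKSUMS.sha256 uses sha256sum-style lines (64 hex characters, whitespace, relative path), which this plan does not confirm, and it takes already-computed digests as input rather than hashing files or checking the signature.

```typescript
// Assumed line format (sha256sum style): "<64 hex chars>  <relative path>".
interface ChecksumEntry {
  digest: string; // lowercase hex SHA-256
  path: string;   // path relative to the bundle directory
}

function parseChecksums(text: string): ChecksumEntry[] {
  const entries: ChecksumEntry[] = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // tolerate blank lines
    const match = /^([0-9a-fA-F]{64})\s+\*?(.+)$/.exec(line);
    if (match === null) throw new Error(`Malformed checksum line: ${line}`);
    entries.push({ digest: (match[1] ?? "").toLowerCase(), path: match[2] ?? "" });
  }
  return entries;
}

// Compares expected digests with digests computed elsewhere (for example by a
// SHA-256 routine run over each bundle file). Returns a list of problems;
// an empty list means every listed file matched.
function findChecksumProblems(
  expected: ChecksumEntry[],
  actualDigests: Map<string, string>,
): string[] {
  const problems: string[] = [];
  for (const entry of expected) {
    const found = actualDigests.get(entry.path);
    if (found === undefined) {
      problems.push(`missing: ${entry.path}`);
    } else if (found.toLowerCase() !== entry.digest) {
      problems.push(`mismatch: ${entry.path}`);
    }
  }
  return problems;
}
```

Returning a problem list instead of a boolean keeps failures auditable, which matches the plan's preference for explicit failure reasons.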
Implementation Anchors
- src/companion/runtimeClient.ts
- src/companion/platformClients.ts
- src/cli/companion.ts
- src/gateway/handlers/node.ts
- docs/api/PROTOCOL.md
Tests
- Extend src/companion/platformClients.integration.test.ts for reconnect, background wake, and token refresh flows.
- Add end-to-end gateway fixture tests for node lifecycle transitions and push-token updates.
Exit Criteria
- A user can install companion, pair, receive assistant response notifications, and reopen after disconnect without manual repair.
Phase 2 (Weeks 3-4): Voice Daily-Driver Reliability
Deliverables
- Unify wake/talk controls across TUI, gateway UI, and companion surfaces.
- Add robust TTS provider fallback policy with per-provider health tracking.
- Add interruption-safe voice run control (cancel/replace behavior) consistent with text runs.
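One way to picture the fallback policy with per-provider health tracking is the sketch below. Every name in it (TtsProvider, ProviderHealth, deliverVoiceReply) is hypothetical and not taken from src/models/tts.ts; synthesis is synchronous here only to keep the example short, and a real policy would probably re-probe unhealthy providers after a cooldown.

```typescript
// Hypothetical provider interface; real synthesis calls would be async.
interface TtsProvider {
  name: string;
  synthesize(text: string): Uint8Array; // throws on failure
}

type VoiceReply =
  | { kind: "audio"; provider: string; audio: Uint8Array }
  | { kind: "text"; text: string; reason: string };

// Tracks consecutive failures per provider; a provider is skipped once it
// reaches the failure limit (this sketch has no cooldown or recovery probe).
class ProviderHealth {
  private failures: Record<string, number> = {};
  private maxFailures: number;

  constructor(maxFailures: number = 3) {
    this.maxFailures = maxFailures;
  }

  recordSuccess(name: string): void {
    this.failures[name] = 0;
  }

  recordFailure(name: string): void {
    this.failures[name] = (this.failures[name] ?? 0) + 1;
  }

  isHealthy(name: string): boolean {
    return (this.failures[name] ?? 0) < this.maxFailures;
  }
}

// Tries providers in their configured order so the fallback is deterministic,
// and always returns a reply: audio if any provider works, otherwise text.
function deliverVoiceReply(
  text: string,
  providers: TtsProvider[],
  health: ProviderHealth,
): VoiceReply {
  const errors: string[] = [];
  for (const provider of providers) {
    if (!health.isHealthy(provider.name)) {
      errors.push(`${provider.name}: skipped (unhealthy)`);
      continue;
    }
    try {
      const audio = provider.synthesize(text);
      health.recordSuccess(provider.name);
      return { kind: "audio", provider: provider.name, audio };
    } catch (err) {
      health.recordFailure(provider.name);
      errors.push(`${provider.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  // Never drop the reply: degrade to text with an explicit reason.
  return { kind: "text", text, reason: errors.join("; ") || "no TTS providers configured" };
}
```

Because the text variant carries a reason, the surface can show user-visible degrade messaging instead of failing silently.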
Implementation Anchors
- src/daemon/routing.ts
- src/models/tts.ts
- src/gateway/handlers/agent.ts
- src/gateway/protocol.ts
- src/gateway/ui/pages/chat.js
Tests
- Expand src/daemon/routing.test.ts with concurrent voice/text interruption cases.
- Add provider-failure matrix tests for TTS fallback.
- Add protocol/UI tests for voice run-state rendering.
Exit Criteria
- No dropped assistant replies when voice synthesis fails; response falls back to text deterministically.
Phase 3 (Weeks 5-7): Browser Task Automation Reliability Layer
Deliverables
- Add browser workflow primitives: browser.wait_for, browser.assert, browser.extract, and retry wrappers.
- Add task checkpoints with resumable execution state for long workflows.
- Add guardrails: domain allowlists, explicit high-risk confirmation hooks, and bounded execution budgets.
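The checkpoint and retry deliverables can be sketched as a small runner over serializable state. The types and function names below are hypothetical, not the executor's real API, and steps are synchronous for brevity, whereas real browser steps would be async.

```typescript
// Hypothetical shapes for a resumable workflow.
type StepOutcome =
  | { ok: true }
  | { ok: false; retryable: boolean; reason: string };

interface Step {
  name: string;
  run: () => StepOutcome;
}

// Plain data, so it can be persisted and reloaded to resume a long workflow.
interface WorkflowCheckpoint {
  nextStepIndex: number;  // first step that has not completed yet
  attemptsOnStep: number; // failed attempts already spent on that step
  status: "running" | "done" | "failed";
  failureReason?: string;
}

function initialCheckpoint(): WorkflowCheckpoint {
  return { nextStepIndex: 0, attemptsOnStep: 0, status: "running" };
}

// Runs steps from the checkpoint onward, retrying retryable failures up to
// maxAttemptsPerStep and calling `save` after every state change.
function runFromCheckpoint(
  steps: Step[],
  start: WorkflowCheckpoint,
  maxAttemptsPerStep: number,
  save: (checkpoint: WorkflowCheckpoint) => void,
): WorkflowCheckpoint {
  let cp: WorkflowCheckpoint = { ...start };
  while (cp.status === "running") {
    const step = steps[cp.nextStepIndex];
    if (step === undefined) {
      cp = { ...cp, status: "done" }; // ran past the last step
    } else {
      const outcome = step.run();
      if (outcome.ok) {
        cp = { ...cp, nextStepIndex: cp.nextStepIndex + 1, attemptsOnStep: 0 };
      } else {
        const attempts = cp.attemptsOnStep + 1;
        const exhausted = !outcome.retryable || attempts >= maxAttemptsPerStep;
        cp = exhausted
          ? { ...cp, attemptsOnStep: attempts, status: "failed", failureReason: `${step.name}: ${outcome.reason}` }
          : { ...cp, attemptsOnStep: attempts };
      }
    }
    save(cp);
  }
  return cp;
}
```

To resume after a failure, a caller would reload the saved checkpoint, set status back to "running", and call runFromCheckpoint again; completed steps are not re-run because nextStepIndex is preserved.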
Implementation Anchors
- src/tools/builtin/browser/tools.ts
- src/tools/executor.ts
- src/tools/policy.ts
- src/config/schema.ts
- src/hooks/*
Tests
- Add deterministic browser tool tests for retry/checkpoint/error classification.
- Add policy tests for domain budget/confirm behavior.
- Add integration tests that replay canonical user tasks (form fill, booking-like flow, account portal navigation).
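The domain, budget, and confirm behavior that the policy tests would exercise might look like the following. It is a sketch under assumptions: BrowserPolicy and its fields are illustrative rather than the real config schema, and treating browser.eval as high-risk is an example choice, not something this plan specifies.

```typescript
// Hypothetical policy shape; field names are not Flynn's config schema.
interface BrowserPolicy {
  allowedDomains: string[];  // "example.com" also allows its subdomains
  highRiskActions: string[]; // actions that require explicit confirmation
  maxActionsPerRun: number;  // bounded execution budget
}

type PolicyDecision =
  | { verdict: "allow" }
  | { verdict: "confirm"; reason: string }
  | { verdict: "deny"; reason: string };

function isDomainAllowed(url: string, allowedDomains: string[]): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // unparseable URLs are never allowed
  }
  return allowedDomains.some((d) => {
    const domain = d.toLowerCase();
    // Require an exact match or a dot boundary so "notexample.com"
    // does not pass for "example.com".
    return host === domain || host.endsWith("." + domain);
  });
}

// Checks run in order: budget first, then domain, then risk level.
function evaluateAction(
  policy: BrowserPolicy,
  action: string,
  url: string,
  actionsUsed: number,
): PolicyDecision {
  if (actionsUsed >= policy.maxActionsPerRun) {
    return { verdict: "deny", reason: "execution budget exhausted" };
  }
  if (!isDomainAllowed(url, policy.allowedDomains)) {
    return { verdict: "deny", reason: `domain not allowlisted: ${url}` };
  }
  if (policy.highRiskActions.indexOf(action) !== -1) {
    return { verdict: "confirm", reason: `high-risk action: ${action}` };
  }
  return { verdict: "allow" };
}
```

Denying by default on parse errors and exhausted budgets keeps the strict-policy-defaults mitigation intact even when inputs are malformed.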
Exit Criteria
- Canonical browser workflows pass at >= 90% in CI replay suite with auditable failures.
Phase 4 (Weeks 8-10): Onboarding 2.0 + First-Success Funnel
Deliverables
- Add "Personal Assistant Mode" wizard preset focused on practical defaults.
- Add live connectivity checks during setup (model, channel, memory, automation).
- Add a post-setup guided first task that confirms end-to-end assistant operation.
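The live connectivity checks and their remediation prompts could be aggregated roughly as follows. The SetupCheck shape and the report format are hypothetical, and the probes are synchronous here for brevity, whereas real model or channel probes would be async.

```typescript
type SetupArea = "model" | "channel" | "memory" | "automation";

// Hypothetical check shape; `run` stands in for a real live probe.
interface SetupCheck {
  area: SetupArea;
  run: () => { ok: boolean; detail: string };
  remediation: string; // what to suggest when the check fails
}

interface SetupReport {
  ready: boolean;  // true only if every check passed
  lines: string[]; // one human-readable line per check
}

function runSetupChecks(checks: SetupCheck[]): SetupReport {
  const lines: string[] = [];
  let ready = true;
  for (const check of checks) {
    let ok = false;
    let detail = "check did not run";
    try {
      const result = check.run();
      ok = result.ok;
      detail = result.detail;
    } catch (err) {
      // A probe that throws is reported as a failure, not a crash.
      detail = err instanceof Error ? err.message : String(err);
    }
    if (ok) {
      lines.push(`[ok] ${check.area}: ${detail}`);
    } else {
      ready = false;
      lines.push(`[fail] ${check.area}: ${detail}. Try: ${check.remediation}`);
    }
  }
  return { ready, lines };
}
```

The guided first task would only be offered when ready is true; otherwise the failing lines double as remediation prompts.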
Implementation Anchors
- src/cli/setup.ts
- src/cli/onboard.ts
- src/cli/doctor.ts
- README.md
- docs/architecture/AGENT_DIAGRAM.md
- docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md
- docs/api/PROTOCOL.md
Tests
- Extend src/cli/setup/integration.test.ts with live-check and first-task paths.
- Add regression coverage for failed-check remediation prompts.
Exit Criteria
- New user reaches first successful automated task from clean install in median <= 15 minutes.
Cross-Cutting Controls (All Phases)
- Feature flags for each phase with canary rollout and rollback paths.
- Audit event coverage for all new autonomous/voice/browser behaviors.
- Every implementation PR must include tests, docs, and docs/plans/state.json updates.
- Diagram review/update is mandatory for code changes affecting flow semantics.
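A per-phase feature flag with canary rollout and a rollback path can be as small as the sketch below. The PhaseFlag shape is hypothetical, and the bucketing uses a simple polynomial string hash chosen for illustration rather than anything this plan specifies.

```typescript
// Hypothetical flag shape; not Flynn's config schema.
interface PhaseFlag {
  name: string;           // e.g. a per-phase flag name
  enabled: boolean;       // false acts as the rollback switch
  rolloutPercent: number; // 0..100, share of users in the canary
}

// Maps (flag, user) to a stable bucket in [0, 100) using a simple
// polynomial hash kept within unsigned 32-bit range.
function bucketFor(flagName: string, userId: string): number {
  const key = `${flagName}:${userId}`;
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

function isFlagEnabled(flag: PhaseFlag, userId: string): boolean {
  if (!flag.enabled) return false; // rollback: off for everyone
  return bucketFor(flag.name, userId) < flag.rolloutPercent;
}
```

Because the bucket depends only on the flag name and user id, raising rolloutPercent only ever adds users to the canary, and setting enabled to false rolls everyone back at once.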
Execution Order and Parallelization
- Run Phase 1 and Phase 2 in parallel after finalizing shared run-state contracts.
- Start Phase 3 once Phase 2 interruption semantics are stable.
- Start Phase 4 onboarding work in parallel with late Phase 3 once APIs stabilize.
Top Risks and Mitigations
- Risk: companion client reliability drifts from gateway contract.
  Mitigation: contract tests pinned to docs/api/PROTOCOL.md event schema.
- Risk: voice experience appears flaky under provider variability.
  Mitigation: deterministic fallback policy + provider health scoring + explicit user-visible degrade messaging.
- Risk: browser autonomy creates brittle flows.
  Mitigation: checkpoint/retry primitives, strict policy defaults, and explicit risk confirmation hooks.
- Risk: roadmap spread too wide.
  Mitigation: desktop-first companion scope, canonical-task suite, and hard phase exit gates before expansion.
Definition of Done (Roadmap Complete)
- Companion desktop and mobile shells are shippable with documented install/run paths.
- Voice and text paths share deterministic run-control semantics and pass reliability gates.
- Browser automations run through a resilient workflow layer, not raw primitive chaining only.
- Onboarding produces measurable first-success outcomes and reduced drop-off.