diff --git a/.planning/PROJECT.md b/.planning/PROJECT.md
new file mode 100644
index 0000000..80c7488
--- /dev/null
+++ b/.planning/PROJECT.md
@@ -0,0 +1,73 @@
+# Flynn — Operator DX Milestone
+
+## What This Is
+
+A focused quality-of-life milestone for Flynn's operator (you). Flynn is a mature multi-channel AI assistant daemon with 10 model providers, 5 channel adapters, 40+ tools, a web dashboard, automation, sandboxing, and 1077 passing tests. This milestone targets the three biggest friction points in developing and operating Flynn: the monolithic daemon wiring file, lack of multi-environment config management, and limited runtime observability.
+
+## Core Value
+
+Make Flynn easier to reason about, configure, and monitor — so that adding features and diagnosing issues takes minutes, not hours.
+
+## Requirements
+
+### Validated
+
+- ✓ Multi-channel AI assistant daemon with tool loop — existing
+- ✓ 10 model providers (Anthropic, OpenAI, Gemini, Bedrock, GitHub, OpenRouter, Zhipu, xAI, Ollama, llama.cpp) — existing
+- ✓ 5 channel adapters (Telegram, Discord, Slack, WhatsApp, WebChat) — existing
+- ✓ WebSocket gateway with JSON-RPC protocol — existing
+- ✓ Web dashboard SPA (dashboard, chat, sessions, settings, usage) — existing
+- ✓ YAML config with Zod validation and env var expansion — existing
+- ✓ SQLite session persistence with TTL pruning — existing
+- ✓ Memory system with hybrid keyword + vector search — existing
+- ✓ Docker sandboxing per session — existing
+- ✓ Multi-agent routing with per-agent config — existing
+- ✓ Automation: cron, webhooks, heartbeat, Gmail watcher — existing
+- ✓ MCP tool server integration — existing
+- ✓ Skills system (bundled/managed/workspace) — existing
+- ✓ Media pipeline (image analysis, audio transcription, outbound attachments) — existing
+- ✓ Context compaction with memory extraction — existing
+- ✓ Tool policy profiles with allow/deny lists — existing
+- ✓ 1077 tests passing — existing
+
+### Active
+
+- [ ] Decompose daemon/index.ts into focused service modules
+- [ ] Multi-environment config system (base + overlays)
+- [ ] Live operational dashboard with real-time metrics
+
+### Out of Scope
+
+- New channel adapters (Signal, Matrix, Teams, Google Chat) — not the focus of this milestone
+- Companion apps (macOS, iOS, Android) — massive scope, different project
+- Structured logging framework — would complement the dashboard but adds complexity; evaluate after dashboard reveals what metrics matter
+- Agent intelligence features (sub-agent spawning, planning) — separate milestone
+- ESLint / type safety cleanup — worthwhile but not blocking current development
+
+## Context
+
+Flynn has been through rapid feature development (P0-P8, Tiers 1-4 all completed in ~7 days). The codebase grew fast and the wiring layer absorbed complexity. Key context:
+
+- **daemon/index.ts** is 1087 lines — it handles model client creation, channel setup, agent factory, memory initialization, vector indexer, session pruning, lifecycle management, and graceful shutdown. Every new feature touches this file.
+- **Config** is a single YAML file validated by a 400+ line Zod schema. Managing dev vs Docker vs production requires manual YAML duplication. No layering, no environment-specific overrides.
+- **Observability** is currently console.log/error/warn. The web dashboard shows basic stats but no real-time metrics: no message trace, no queue depth, no model call latency, no error stream. Debugging requires reading source and tailing stdout.
+- The existing web dashboard (vanilla JS SPA) is functional and can be extended rather than rewritten.
+
+## Constraints
+
+- **Tech stack**: TypeScript, Node.js >= 22, pnpm. No new frameworks (keep vanilla JS for dashboard).
+- **Backwards compatibility**: Existing config files must continue to work. Decomposition must not change public API or behavior.
+- **Test coverage**: Maintain 1077+ passing tests. New modules need tests.
+- **Single-operator**: This is a personal tool. Don't over-engineer for multi-tenant or team scenarios.
+
+## Key Decisions
+
+| Decision | Rationale | Outcome |
+|----------|-----------|---------|
+| Decompose god file, not rewrite | Preserves working code, reduces risk, can be done incrementally | -- Pending |
+| Config overlays over separate files | Environment-specific overrides are less error-prone than maintaining N complete configs | -- Pending |
+| Extend existing web dashboard | Already works, no framework dependencies, familiar codebase | -- Pending |
+| Skip structured logging for now | Dashboard will reveal what metrics actually matter; avoid premature abstraction | -- Pending |
+
+---
+*Last updated: 2026-02-09 after initialization*