Flynn — Operator DX Milestone

What This Is

A focused quality-of-life milestone for Flynn's operator (you). Flynn is a mature multi-channel AI assistant daemon with 10 model providers, 5 channel adapters, 40+ tools, a web dashboard, automation, sandboxing, and 1077 passing tests. This milestone targets the three biggest friction points in developing and operating Flynn: the monolithic daemon wiring file, lack of multi-environment config management, and limited runtime observability.

Core Value

Make Flynn easier to reason about, configure, and monitor — so that adding features and diagnosing issues takes minutes, not hours.

Requirements

Validated

  • ✓ Multi-channel AI assistant daemon with tool loop — existing
  • ✓ 10 model providers (Anthropic, OpenAI, Gemini, Bedrock, GitHub, OpenRouter, Zhipu, xAI, Ollama, llama.cpp) — existing
  • ✓ 5 channel adapters (Telegram, Discord, Slack, WhatsApp, WebChat) — existing
  • ✓ WebSocket gateway with JSON-RPC protocol — existing
  • ✓ Web dashboard SPA (dashboard, chat, sessions, settings, usage) — existing
  • ✓ YAML config with Zod validation and env var expansion — existing
  • ✓ SQLite session persistence with TTL pruning — existing
  • ✓ Memory system with hybrid keyword + vector search — existing
  • ✓ Docker sandboxing per session — existing
  • ✓ Multi-agent routing with per-agent config — existing
  • ✓ Automation: cron, webhooks, heartbeat, Gmail watcher — existing
  • ✓ MCP tool server integration — existing
  • ✓ Skills system (bundled/managed/workspace) — existing
  • ✓ Media pipeline (image analysis, audio transcription, outbound attachments) — existing
  • ✓ Context compaction with memory extraction — existing
  • ✓ Tool policy profiles with allow/deny lists — existing
  • ✓ 1077 tests passing — existing

Active

  • Decompose daemon/index.ts into focused service modules
  • Multi-environment config system (base + overlays)
  • Live operational dashboard with real-time metrics
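
One way the "base + overlays" item might work is a recursive merge in which each overlay only states the keys it changes. The sketch below is an assumption, not Flynn's actual config loader: `ConfigObject`, `mergeConfig`, and `resolveConfig` are hypothetical names, and it assumes the YAML has already been parsed into plain objects before Zod validation runs on the merged result.

```typescript
// Hypothetical sketch: these names are illustrative, not Flynn's real API.
type ConfigValue = string | number | boolean | null | ConfigValue[] | ConfigObject;
interface ConfigObject {
  [key: string]: ConfigValue;
}

function isPlainObject(value: ConfigValue | undefined): value is ConfigObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge `overlay` onto `base` without mutating either input.
// Nested objects merge key by key; scalars and arrays in the overlay
// replace the base value wholesale.
function mergeConfig(base: ConfigObject, overlay: ConfigObject): ConfigObject {
  const result: ConfigObject = { ...base };
  for (const [key, value] of Object.entries(overlay)) {
    const existing = result[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? mergeConfig(existing, value)
        : value;
  }
  return result;
}

// Apply overlays left to right, so later overlays win.
// With no overlays the base config is returned unchanged.
function resolveConfig(base: ConfigObject, ...overlays: ConfigObject[]): ConfigObject {
  return overlays.reduce((merged, overlay) => mergeConfig(merged, overlay), base);
}
```

Replacing arrays wholesale rather than concatenating them is a design choice made here for predictability; an allow/deny list, for example, would then be restated in full by the overlay that changes it.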

Out of Scope

  • New channel adapters (Signal, Matrix, Teams, Google Chat) — not the focus of this milestone
  • Companion apps (macOS, iOS, Android) — massive scope, different project
  • Structured logging framework — would complement the dashboard but adds complexity; evaluate after dashboard reveals what metrics matter
  • Agent intelligence features (sub-agent spawning, planning) — separate milestone
  • ESLint / type safety cleanup — worthwhile but not blocking current development

Context

Flynn has been through rapid feature development (P0-P8, Tiers 1-4 all completed in ~7 days). The codebase grew fast and the wiring layer absorbed complexity. Key context:

  • daemon/index.ts is 1087 lines — it handles model client creation, channel setup, agent factory, memory initialization, vector indexer, session pruning, lifecycle management, and graceful shutdown. Every new feature touches this file.
  • Config is a single YAML file validated by a 400+ line Zod schema. Managing dev vs Docker vs production requires manual YAML duplication. No layering, no environment-specific overrides.
  • Observability is currently limited to console.log/error/warn calls. The web dashboard shows basic stats but no real-time metrics: no message trace, no queue depth, no model call latency, no error stream. Debugging requires reading source and tailing stdout.
  • The existing web dashboard (vanilla JS SPA) is functional and can be extended rather than rewritten.
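
As a rough illustration of what "focused service modules" could look like, the responsibilities listed for daemon/index.ts might each sit behind a small lifecycle interface, with one registry owning startup order and graceful shutdown. This is a minimal sketch under assumptions: `Service` and `ServiceRegistry` are hypothetical names, not existing Flynn code.

```typescript
// Hypothetical sketch: Service and ServiceRegistry are illustrative
// assumptions, not Flynn's actual module layout.
interface Service {
  readonly name: string;
  start(): Promise<void>;
  stop(): Promise<void>;
}

class ServiceRegistry {
  private readonly registered: Service[] = [];
  private started: Service[] = [];

  register(service: Service): this {
    this.registered.push(service);
    return this;
  }

  // Start services in registration order. If one fails, stop the ones
  // already running (in reverse order) and rethrow the original error.
  async startAll(): Promise<void> {
    for (const service of this.registered) {
      try {
        await service.start();
        this.started.push(service);
      } catch (error) {
        await this.stopAll();
        throw error;
      }
    }
  }

  // Graceful shutdown: stop in reverse start order so dependents go first.
  // Failures are collected rather than thrown, so every started service
  // still gets a chance to stop.
  async stopAll(): Promise<Error[]> {
    const failures: Error[] = [];
    for (const service of [...this.started].reverse()) {
      try {
        await service.stop();
      } catch (error) {
        failures.push(error instanceof Error ? error : new Error(String(error)));
      }
    }
    this.started = [];
    return failures;
  }
}
```

Under this shape, the wiring file would mostly register services and call `startAll()`, and each extracted module could be moved and tested on its own, which fits an incremental decomposition.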

Constraints

  • Tech stack: TypeScript, Node.js >= 22, pnpm. No new frameworks (keep vanilla JS for dashboard).
  • Backwards compatibility: Existing config files must continue to work. Decomposition must not change public API or behavior.
  • Test coverage: Maintain 1077+ passing tests. New modules need tests.
  • Single-operator: This is a personal tool. Don't over-engineer for multi-tenant or team scenarios.

Key Decisions

| Decision | Rationale | Outcome |
|---|---|---|
| Decompose god file, not rewrite | Preserves working code, reduces risk, can be done incrementally | Pending |
| Config overlays over separate files | Environment-specific overrides are less error-prone than maintaining N complete configs | Pending |
| Extend existing web dashboard | Already works, no framework dependencies, familiar codebase | Pending |
| Skip structured logging for now | Dashboard will reveal what metrics actually matter; avoid premature abstraction | Pending |
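
Skipping a logging framework does not rule out lightweight in-memory metrics. As a sketch of how one metric named in Context, model call latency, might be collected for the dashboard, the code below keeps a fixed-size window of recent durations. `RollingLatency` and `timed` are hypothetical names, and the default window size of 100 is an arbitrary assumption.

```typescript
// Hypothetical sketch: RollingLatency and timed are illustrative names only.
class RollingLatency {
  private readonly samples: number[] = [];
  private readonly capacity: number;

  constructor(capacity: number = 100) {
    this.capacity = capacity;
  }

  // Record one duration in milliseconds, evicting the oldest sample when full.
  record(durationMs: number): void {
    this.samples.push(durationMs);
    if (this.samples.length > this.capacity) this.samples.shift();
  }

  // Nearest-rank percentile (p in 0..100) over the current window;
  // undefined when no samples have been recorded.
  percentile(p: number): number | undefined {
    if (this.samples.length === 0) return undefined;
    const sorted = [...this.samples].sort((a, b) => a - b);
    const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
    return sorted[rank - 1];
  }

  snapshot(): { count: number; p50: number | undefined; p95: number | undefined } {
    return {
      count: this.samples.length,
      p50: this.percentile(50),
      p95: this.percentile(95),
    };
  }
}

// Wrap any async call and record how long it took, whether it resolves
// or rejects. Date.now() is wall-clock time; a monotonic clock would be
// more robust against system clock adjustments.
async function timed<T>(metric: RollingLatency, fn: () => Promise<T>): Promise<T> {
  const startedAt = Date.now();
  try {
    return await fn();
  } finally {
    metric.record(Date.now() - startedAt);
  }
}
```

A snapshot like this could be pushed to the existing dashboard over the WebSocket gateway; the same pattern would extend to queue depth or error counts once it is clear which metrics matter.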

Last updated: 2026-02-09 after initialization