flynn/.planning/codebase/ARCHITECTURE.md

# Architecture

**Analysis Date:** 2025-02-09

## Pattern Overview

**Overall:** Multi-channel AI agent daemon with layered pipeline architecture

**Key Characteristics:**
- Message pipeline: Channel Adapter → ChannelRegistry → MessageRouter → AgentOrchestrator → NativeAgent → ModelClient
- Registry/factory pattern for extensible channels, tools, models, and agent configs
- Dependency injection via constructor config objects (no DI container)
- YAML + Zod config-driven feature toggling — most subsystems are optional and activated by config
- Lifecycle-managed daemon with ordered shutdown handlers (LIFO)
- SQLite for session persistence and vector embeddings; filesystem for memory persistence
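
The LIFO shutdown ordering mentioned above can be sketched as follows. This is a minimal illustration, not the daemon's actual lifecycle code: the `Lifecycle` class, `onShutdown`, and the handler names are hypothetical.

```typescript
// Hypothetical sketch of ordered (LIFO) shutdown; names are illustrative only.
type ShutdownHandler = () => void | Promise<void>;

class Lifecycle {
  private readonly handlers: Array<{ name: string; fn: ShutdownHandler }> = [];

  // Register a handler; later registrations run earlier at shutdown.
  onShutdown(name: string, fn: ShutdownHandler): void {
    this.handlers.push({ name, fn });
  }

  // Runs handlers in reverse registration order and returns the names that failed.
  async shutdown(): Promise<string[]> {
    const failed: string[] = [];
    let entry = this.handlers.pop();
    while (entry !== undefined) {
      try {
        await entry.fn();
      } catch {
        // One failing handler must not block the remaining ones.
        failed.push(entry.name);
      }
      entry = this.handlers.pop();
    }
    return failed;
  }
}
```

Reverse order means a subsystem registered late (for example a gateway that depends on the session store) is torn down before the things it depends on.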

## Layers

**CLI Layer:**
- Purpose: Parse commands, load config, bootstrap daemon or TUI
- Location: `src/cli/`
- Contains: Command registrations (commander.js), config loading, TUI entry
- Depends on: Config, Daemon
- Used by: End user via `flynn` binary
- Entry: `src/cli/index.ts` → registers commands: `start`, `tui`, `send`, `sessions`, `doctor`, `config`, `completion`

**Config Layer:**
- Purpose: Load YAML config, validate with Zod, expand `${ENV_VAR}` references
- Location: `src/config/`
- Contains: Zod schema definitions (`schema.ts`), YAML loader with env expansion (`loader.ts`)
- Depends on: `zod`, `yaml`
- Used by: All layers — the `Config` type flows through the entire system
- Key file: `src/config/schema.ts` — single source of truth for all configuration types (409 lines)
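
As a rough illustration of the `${ENV_VAR}` expansion step, the sketch below walks a parsed config value and substitutes environment references. The function name `expandEnv` and the choice to throw on a missing variable are assumptions; the real `loader.ts` may behave differently.

```typescript
// Hypothetical sketch of ${ENV_VAR} expansion; not the actual loader.ts code.
type Env = Record<string, string | undefined>;

function expandEnv(value: unknown, env: Env): unknown {
  if (typeof value === "string") {
    // Replace every ${NAME} with the matching environment value.
    return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name: string) => {
      const resolved = env[name];
      if (resolved === undefined) {
        // Assumption: a missing variable is treated as a hard error.
        throw new Error(`Missing environment variable: ${name}`);
      }
      return resolved;
    });
  }
  if (Array.isArray(value)) {
    return value.map((item) => expandEnv(item, env));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
      out[key] = expandEnv(child, env);
    }
    return out;
  }
  return value; // numbers, booleans, and null pass through unchanged
}
```

In this sketch expansion runs on the parsed YAML tree before schema validation, so the validator only ever sees resolved strings.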

**Channel Layer:**
- Purpose: Uniform messaging abstraction across platforms
- Location: `src/channels/`
- Contains: `ChannelAdapter` interface, `ChannelRegistry`, platform adapters
- Depends on: Platform SDKs (grammy, discord.js, @slack/bolt, whatsapp-web.js)
- Used by: Daemon (message routing), Automation (cron/webhook output)
- Interface: `ChannelAdapter` with `connect()`, `disconnect()`, `send()`, `onMessage()`
- Adapters: `src/channels/telegram/adapter.ts`, `src/channels/discord/adapter.ts`, `src/channels/slack/adapter.ts`, `src/channels/whatsapp/adapter.ts`, `src/channels/webchat/adapter.ts`

**Agent Layer (Backend):**
- Purpose: Core AI agent loop — process messages, execute tools, manage conversation
- Location: `src/backends/native/`
- Contains: `NativeAgent` (tool loop), `AgentOrchestrator` (delegation, compaction, memory)
- Depends on: Models, Tools, Session, Context, Memory
- Used by: Daemon (via message router), Gateway (via SessionBridge)
- Key abstraction: `NativeAgent` runs the tool loop; `AgentOrchestrator` wraps it with orchestration features

**Model Layer:**
- Purpose: Unified interface to LLM providers with tier-based routing and fallback
- Location: `src/models/`
- Contains: `ModelClient` interface, provider implementations, `ModelRouter`, retry logic, cost estimation
- Depends on: Provider SDKs (@anthropic-ai/sdk, openai, @google/generative-ai, ollama, @aws-sdk/client-bedrock-runtime)
- Used by: Agent layer, Gateway SessionBridge
- Providers: `anthropic.ts`, `openai.ts`, `gemini.ts`, `bedrock.ts`, `github.ts`, `local/ollama.ts`, `local/llamacpp.ts`

**Tool Layer:**
- Purpose: Tool registry, execution with policy enforcement and hook checks
- Location: `src/tools/`
- Contains: `ToolRegistry`, `ToolExecutor`, `ToolPolicy`, builtin tool implementations
- Depends on: Hooks (confirmation), Memory, Process/Browser managers
- Used by: Agent layer (tool loop), Gateway (tool execution)
- Three tool patterns:
  - Static: `export const fooTool: Tool` (e.g., `src/tools/builtin/shell.ts`)
  - Factory: `export function createFooTool(dep): Tool` (e.g., `src/tools/builtin/media-send.ts`)
  - Multi-factory: `export function createFooTools(dep): Tool[]` (e.g., `src/tools/builtin/process/index.ts`)
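
The three patterns differ only in how dependencies reach the tool. The sketch below shows one example of each under assumed minimal `Tool` and `ToolResult` shapes; the tool names (`echo`, `time.now`, `counter.*`) and the `Clock` dependency are hypothetical and do not correspond to builtin tools.

```typescript
// Assumed minimal shapes; the real definitions live in src/tools/types.ts.
interface ToolResult {
  success: boolean;
  output?: string;
  error?: string;
}

interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: Record<string, unknown>): Promise<ToolResult>;
}

// Static pattern: no dependencies, so the tool is a plain constant.
const echoTool: Tool = {
  name: "echo",
  description: "Return the given text unchanged",
  inputSchema: { type: "object", properties: { text: { type: "string" } } },
  async execute(args) {
    return { success: true, output: String(args["text"] ?? "") };
  },
};

// Factory pattern: one tool closes over an injected dependency.
interface Clock {
  now(): Date;
}

function createTimeTool(clock: Clock): Tool {
  return {
    name: "time.now",
    description: "Return the current time as an ISO 8601 string",
    inputSchema: { type: "object", properties: {} },
    async execute() {
      return { success: true, output: clock.now().toISOString() };
    },
  };
}

// Multi-factory pattern: several related tools share one dependency.
function createCounterTools(state: { count: number }): Tool[] {
  return [
    {
      name: "counter.increment",
      description: "Increase the shared counter by one",
      inputSchema: { type: "object", properties: {} },
      async execute() {
        state.count += 1;
        return { success: true, output: String(state.count) };
      },
    },
    {
      name: "counter.read",
      description: "Read the shared counter",
      inputSchema: { type: "object", properties: {} },
      async execute() {
        return { success: true, output: String(state.count) };
      },
    },
  ];
}
```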

**Session Layer:**
- Purpose: Persistent conversation history per channel+sender pair
- Location: `src/session/`
- Contains: `SessionStore` (SQLite), `SessionManager` (in-memory cache + store), `ManagedSession`
- Depends on: `better-sqlite3`
- Used by: Agent layer, Gateway SessionBridge

**Memory Layer:**
- Purpose: Persistent knowledge across sessions — namespace-based files + hybrid vector search
- Location: `src/memory/`
- Contains: `MemoryStore` (file-based), `VectorStore` (SQLite-backed embeddings), `HybridSearch`, embedding providers, text chunker
- Depends on: Embedding providers (OpenAI, Gemini, Ollama, llama.cpp, Voyage)
- Used by: Agent layer (memory injection into system prompt), Tools (memory.read/write/search)

**Context Layer:**
- Purpose: Token estimation and conversation compaction
- Location: `src/context/`
- Contains: Token estimator (`tokens.ts`), compaction logic (`compaction.ts`)
- Depends on: Agent layer (delegation for compaction), Memory (extraction)
- Used by: AgentOrchestrator (automatic compaction before each message)

**Gateway Layer:**
- Purpose: WebSocket JSON-RPC server + HTTP static file server + vanilla JS dashboard
- Location: `src/gateway/`
- Contains: `GatewayServer`, `Router`, `SessionBridge`, `LaneQueue`, auth, protocol, static serving, Tailscale Serve
- Depends on: `ws`, Session, Agent, Tools
- Used by: WebChat adapter, TUI (connects as WS client), external dashboard clients
- Protocol: JSON-RPC 2.0 over WebSocket
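
To make the protocol concrete, here is a minimal dispatcher sketch for JSON-RPC 2.0 requests. The numeric error codes are the ones defined by the JSON-RPC 2.0 specification; the `dispatch` function, the `Handler` type, and the parameter shape are assumptions and not the actual `Router` API. Notifications (requests without an `id`) are not treated specially here.

```typescript
// Standard JSON-RPC 2.0 error codes.
const ErrorCode = {
  ParseError: -32700,
  InvalidRequest: -32600,
  MethodNotFound: -32601,
  InternalError: -32603,
} as const;

type JsonRpcId = string | number | null;

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: JsonRpcId;
  result?: unknown;
  error?: { code: number; message: string };
}

// Hypothetical handler signature; the real router may pass more context.
type Handler = (params: unknown) => Promise<unknown>;

function ok(id: JsonRpcId, result: unknown): JsonRpcResponse {
  return { jsonrpc: "2.0", id, result };
}

function fail(id: JsonRpcId, code: number, message: string): JsonRpcResponse {
  return { jsonrpc: "2.0", id, error: { code, message } };
}

async function dispatch(raw: string, handlers: Map<string, Handler>): Promise<JsonRpcResponse> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return fail(null, ErrorCode.ParseError, "Parse error");
  }
  if (typeof parsed !== "object" || parsed === null) {
    return fail(null, ErrorCode.InvalidRequest, "Invalid Request");
  }
  const request = parsed as { id?: JsonRpcId; method?: unknown; params?: unknown };
  const id = request.id ?? null;
  const handler = typeof request.method === "string" ? handlers.get(request.method) : undefined;
  if (handler === undefined) {
    return fail(id, ErrorCode.MethodNotFound, "Method not found");
  }
  try {
    return ok(id, await handler(request.params));
  } catch (err) {
    return fail(id, ErrorCode.InternalError, err instanceof Error ? err.message : "Internal error");
  }
}
```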

**Hooks Layer:**
- Purpose: Pattern-based tool confirmation engine
- Location: `src/hooks/`
- Contains: `HookEngine` with glob-pattern matching for confirm/log/silent actions
- Depends on: Nothing (pure logic)
- Used by: ToolExecutor (checks before execution)
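
A glob-based rule lookup of this kind might look like the sketch below. The `HookRule` shape, first-match-wins ordering, the `silent` default, and treating only `*` as a wildcard are all assumptions made for illustration; the real `HookEngine` may differ.

```typescript
// Hypothetical rule shape; only the three action names come from this document.
type HookAction = "confirm" | "log" | "silent";

interface HookRule {
  pattern: string;
  action: HookAction;
}

// Convert a simple glob into an anchored RegExp. Only `*` is a wildcard here.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+^${}()|[\]\\?]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, ".*"); // then expand the glob wildcard
  return new RegExp(`^${escaped}$`);
}

// Assumption: the first matching rule wins, and unmatched tools run silently.
function resolveAction(
  rules: HookRule[],
  toolName: string,
  fallback: HookAction = "silent",
): HookAction {
  for (const rule of rules) {
    if (globToRegExp(rule.pattern).test(toolName)) {
      return rule.action;
    }
  }
  return fallback;
}
```

Because the function has no I/O and no state, it matches the "pure logic" description above and is easy to unit test.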

**Prompt Layer:**
- Purpose: Assemble system prompt from template files (SOUL.md, AGENTS.md, etc.)
- Location: `src/prompt/`
- Contains: Template search and assembly logic
- Depends on: Filesystem (searches multiple directories)
- Used by: Daemon (system prompt construction at startup)

**MCP Layer:**
- Purpose: Model Context Protocol server management — bridge external MCP tools into the tool registry
- Location: `src/mcp/`
- Contains: `McpClient`, `McpManager`, tool bridging (`bridge.ts`)
- Depends on: `@modelcontextprotocol/sdk`
- Used by: Daemon (starts MCP servers, registers bridged tools)

**Skills Layer:**
- Purpose: Pluggable skill system — bundled, managed (installed), and workspace skills
- Location: `src/skills/`
- Contains: `SkillRegistry`, `SkillInstaller`, skill loader
- Depends on: Filesystem
- Used by: Daemon (loads skills, injects into system prompt)

**Agents Config Layer:**
- Purpose: Named agent configurations with per-agent overrides (model tier, tools, sandbox)
- Location: `src/agents/`
- Contains: `AgentConfigRegistry`, `AgentRouter` (channel+sender → agent config resolution)
- Depends on: Config
- Used by: Daemon message router (selects agent config per session)

**Automation Layer:**
- Purpose: Scheduled tasks, webhooks, heartbeat monitoring, Gmail watching
- Location: `src/automation/`
- Contains: `CronScheduler`, `WebhookHandler`, `HeartbeatMonitor`, `GmailWatcher`
- Depends on: `croner`, `googleapis`, Channels
- Used by: Daemon (registers as channel adapters or standalone monitors)

**Sandbox Layer:**
- Purpose: Docker container isolation for tool execution
- Location: `src/sandbox/`
- Contains: `DockerSandbox`, `SandboxManager`, sandboxed shell/process tools
- Depends on: Docker CLI
- Used by: Daemon message router (replaces shell/process tools with sandboxed variants)

**Frontend Layer (Legacy):**
- Purpose: Direct Telegram bot integration and TUI
- Location: `src/frontends/`
- Contains: Telegram bot handlers with confirmation UI, TUI (minimal readline + fullscreen React/Ink)
- Depends on: grammy (Telegram), ink/react (TUI)
- Used by: CLI commands (`start` uses Telegram frontend, `tui` uses TUI)

## Data Flow

**Inbound Message Processing (Channel → Response):**

1. Platform SDK receives message → Channel adapter normalizes to `InboundMessage`
2. Adapter calls `onMessage()` callback → `ChannelRegistry.handleInbound()` routes to `MessageHandler`
3. `createMessageRouter()` resolves agent config via `AgentRouter.resolve(channel, senderId)`
4. `getOrCreateAgent()` creates/retrieves `AgentOrchestrator` for the session (cached by `channel:sender:agentConfig`)
5. Audio routing: `supportsAudioInput()` checks provider capability — native audio passed through for Gemini/OpenAI/GitHub, transcribed via Whisper for others
6. `orchestrator.process()` → injects memory context → checks compaction → delegates to `NativeAgent.process()`
7. `NativeAgent.toolLoop()` → sends to `ModelRouter.chat()` → model returns response or tool calls
8. If tool calls: `ToolExecutor.execute()` → policy check → hook check → tool execution → loop back to model
9. Final text response returned → reply function sends via adapter → `adapter.send()` → platform SDK
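
Step 4 above amounts to a keyed get-or-create cache. The sketch below assumes the key is a plain `channel:sender:agentConfig` string; the `AgentCache` class and the `Orchestrator` stand-in type are hypothetical and only illustrate the lookup.

```typescript
// Hypothetical stand-in for AgentOrchestrator.
interface Orchestrator {
  key: string;
}

function agentCacheKey(channel: string, senderId: string, agentConfig: string): string {
  return `${channel}:${senderId}:${agentConfig}`;
}

class AgentCache {
  private readonly agents = new Map<string, Orchestrator>();
  private readonly create: (key: string) => Orchestrator;

  constructor(create: (key: string) => Orchestrator) {
    this.create = create;
  }

  // Returns the cached orchestrator for this triple, creating it on first use.
  getOrCreate(channel: string, senderId: string, agentConfig: string): Orchestrator {
    const key = agentCacheKey(channel, senderId, agentConfig);
    let agent = this.agents.get(key);
    if (agent === undefined) {
      agent = this.create(key);
      this.agents.set(key, agent);
    }
    return agent;
  }
}
```

Including the agent config in the key means the same sender gets a separate orchestrator if routing resolves them to a different named agent.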

**Gateway WebSocket Flow:**

1. Client connects via WebSocket → auth check → `SessionBridge.connect()` → `NativeAgent` created
2. Client sends JSON-RPC message → `GatewayServer.handleMessage()` → `Router.dispatch()` → handler
3. `agent.send` handler → `LaneQueue` serializes requests → `SessionBridge` processes via `NativeAgent`
4. Streaming events sent back via WebSocket as JSON-RPC notifications
5. HTTP requests serve static dashboard UI or webhook endpoints
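
The serialization in step 3 can be illustrated with a per-lane promise chain. This is a sketch under assumptions: the class name `SerialLaneQueue`, the `enqueue` signature, and using a string lane key are hypothetical, since the real `LaneQueue` API is not shown here.

```typescript
// Hypothetical per-lane serial queue: tasks in the same lane never overlap.
class SerialLaneQueue {
  private readonly tails = new Map<string, Promise<unknown>>();

  enqueue<T>(lane: string, task: () => Promise<T>): Promise<T> {
    // Stored tails never reject, so chaining with then() is always safe.
    const previous = this.tails.get(lane) ?? Promise.resolve();
    const run = previous.then(() => task());
    // Keep a non-rejecting tail so one failed task does not poison the lane.
    this.tails.set(lane, run.catch(() => undefined));
    return run;
  }
}
```

Tasks in different lanes still run concurrently; only requests that share a lane (for example, one session) are ordered. A production version would also remove idle lanes from the map.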

**Model Routing with Fallback:**

1. `ModelRouter.chat(request, tier)` → tries primary client for requested tier
2. If retry config enabled: `withRetry()` wraps call with exponential backoff
3. On failure → try tier-specific fallbacks (e.g., Anthropic → the same model served through GitHub Models)
4. On failure → try global fallback chain (typically local model)
5. All failures → throw aggregated error
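
The cascade above can be sketched as a retry wrapper plus an ordered chain of clients. The types are reduced to the minimum, and `chatWithFallback`, its `retry` options, and the decision to retry every client in the chain are assumptions; the actual `ModelRouter` may retry only the primary client.

```typescript
// Reduced, assumed shapes for illustration.
interface ChatRequest {
  prompt: string;
}
interface ChatResponse {
  text: string;
}
interface ModelClient {
  chat(request: ChatRequest): Promise<ChatResponse>;
}

// Retry with exponential backoff: waits base, 2x base, 4x base, ... between attempts.
async function withRetry<T>(fn: () => Promise<T>, attempts: number, baseDelayMs: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) {
        await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// chain = [primary, ...tierFallbacks, ...globalFallbacks], tried in order.
async function chatWithFallback(
  chain: ModelClient[],
  request: ChatRequest,
  retry: { attempts: number; baseDelayMs: number },
): Promise<ChatResponse> {
  const messages: string[] = [];
  for (const client of chain) {
    try {
      return await withRetry(() => client.chat(request), retry.attempts, retry.baseDelayMs);
    } catch (err) {
      messages.push(err instanceof Error ? err.message : String(err));
    }
  }
  // Aggregate every failure so the caller can see why each client failed.
  throw new Error(`All model clients failed: ${messages.join("; ")}`);
}
```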

**Compaction Flow:**

1. Before each `process()`, `AgentOrchestrator.compactIfNeeded()` checks token count vs threshold
2. If threshold exceeded → `compactHistory()` splits messages into compactable + recent (keep N turns)
3. Delegates summarization to `fast` tier via `orchestrator.delegate()`
4. Optionally extracts memory facts via separate delegation call
5. Replaces session history with `[summary_message, ...recent_messages]`
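
A simplified version of this flow is sketched below. Several details are assumptions: the four-characters-per-token estimate, counting messages rather than turns for the "recent" window, and passing the summarizer in as a callback in place of `orchestrator.delegate()`. Memory extraction (step 4) is left out.

```typescript
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
}

// Rough heuristic (about 4 characters per token); the real estimator may differ.
function estimateTokens(messages: Message[]): number {
  return messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
}

// Split history into an older compactable part and the most recent messages.
function splitForCompaction(messages: Message[], keepRecent: number) {
  const cut = Math.max(0, messages.length - keepRecent);
  return { compactable: messages.slice(0, cut), recent: messages.slice(cut) };
}

async function compactIfNeeded(
  messages: Message[],
  thresholdTokens: number,
  keepRecent: number,
  summarize: (old: Message[]) => Promise<string>, // stand-in for a fast-tier delegation
): Promise<Message[]> {
  if (estimateTokens(messages) <= thresholdTokens) {
    return messages; // under the threshold, nothing to do
  }
  const { compactable, recent } = splitForCompaction(messages, keepRecent);
  if (compactable.length === 0) {
    return messages; // everything is "recent", so there is nothing to summarize
  }
  const summaryMessage: Message = { role: "system", content: await summarize(compactable) };
  return [summaryMessage, ...recent];
}
```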

**State Management:**
- Session history: SQLite (`~/.local/share/flynn/sessions.db`) + in-memory cache in `SessionManager`
- Memory: Namespace-based markdown files in `~/.local/share/flynn/memory/`
- Vectors: SQLite (`~/.local/share/flynn/vectors.db`) for embeddings
- Config: YAML file at `~/.config/flynn/config.yaml` (read once at startup)

## Key Abstractions

**ModelClient:**
- Purpose: Uniform interface to any LLM provider
- Interface: `chat(request: ChatRequest): Promise<ChatResponse>` + optional `chatStream()`
- Implementations: `src/models/anthropic.ts`, `src/models/openai.ts`, `src/models/gemini.ts`, `src/models/bedrock.ts`, `src/models/github.ts`, `src/models/local/ollama.ts`, `src/models/local/llamacpp.ts`
- Pattern: Each provider wraps its SDK and normalizes to `ChatResponse`

**ModelRouter:**
- Purpose: Tier-based model selection with cascading fallback
- Location: `src/models/router.ts`
- Tiers: `fast`, `default`, `complex`, `local` — each maps to a `ModelClient`
- Implements `ModelClient` interface itself, so consumers don't need to know about tiers

**ChannelAdapter:**
- Purpose: Normalize platform-specific messaging into a common interface
- Interface: `connect()`, `disconnect()`, `send(peerId, msg)`, `onMessage(handler)`
- Location: `src/channels/types.ts`
- Pattern: Each adapter wraps a platform SDK, handles auth/filtering, emits `InboundMessage`
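
For orientation, the sketch below shows the four-method interface with a trivial in-memory adapter. The `InboundMessage` fields, the parameter types, and the `EchoTestAdapter` class are assumptions made for illustration; the authoritative definitions are in `src/channels/types.ts`.

```typescript
// Assumed shapes; field names are illustrative.
interface InboundMessage {
  channel: string;
  senderId: string;
  text: string;
}

type MessageHandler = (msg: InboundMessage) => void | Promise<void>;

interface ChannelAdapter {
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(peerId: string, msg: string): Promise<void>;
  onMessage(handler: MessageHandler): void;
}

// Hypothetical in-memory adapter, useful only for illustration or tests.
class EchoTestAdapter implements ChannelAdapter {
  readonly sent: Array<{ peerId: string; msg: string }> = [];
  private handler: MessageHandler | undefined;
  private connected = false;

  async connect(): Promise<void> {
    this.connected = true;
  }

  async disconnect(): Promise<void> {
    this.connected = false;
  }

  async send(peerId: string, msg: string): Promise<void> {
    if (!this.connected) throw new Error("adapter not connected");
    this.sent.push({ peerId, msg });
  }

  onMessage(handler: MessageHandler): void {
    this.handler = handler;
  }

  // Simulates the platform SDK delivering a message to the adapter.
  async simulateInbound(senderId: string, text: string): Promise<void> {
    await this.handler?.({ channel: "test", senderId, text });
  }
}
```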

**Tool:**
- Purpose: Executable capability exposed to the AI model
- Interface: `{ name, description, inputSchema, execute(args): Promise<ToolResult> }`
- Location: `src/tools/types.ts`
- Registration: tool file → `src/tools/builtin/index.ts` → `src/tools/index.ts` → `src/daemon/index.ts`

**Session:**
- Purpose: Conversation state (message history) for a channel+sender pair
- Interface: `addMessage()`, `getHistory()`, `clear()`, `replaceHistory()`
- Location: `src/session/manager.ts`
- ID format: `channel:senderId` (e.g., `telegram:123456`)

**AgentOrchestrator:**
- Purpose: Wraps NativeAgent with delegation, compaction, memory, usage tracking
- Location: `src/backends/native/orchestrator.ts`
- Key method: `delegate(SubAgentRequest)` — stateless single-turn call to any tier
- Delegation tasks: compaction, memory extraction, classification, tool summarization, complex reasoning

**DaemonContext:**
- Purpose: Holds all initialized subsystems returned by `startDaemon()`
- Location: `src/daemon/index.ts`
- Contains: config, lifecycle, session/model/tool/channel/gateway/mcp/skill/agent registries

## Entry Points

**CLI Binary (`flynn`):**
- Location: `src/cli/index.ts`
- Triggers: `flynn start`, `flynn tui`, `flynn send`, `flynn sessions`, `flynn doctor`, `flynn config`, `flynn completion`
- Responsibilities: Parse args, load config, bootstrap subsystems

**Daemon Start:**
- Location: `src/daemon/index.ts` → `startDaemon(config)`
- Triggers: `flynn start` CLI command
- Responsibilities: Initialize all subsystems in order, wire dependencies, start channel adapters and gateway

**Gateway Server:**
- Location: `src/gateway/server.ts`
- Triggers: HTTP/WS connections on configured port (default 18800)
- Responsibilities: JSON-RPC routing, WebSocket session management, static UI serving, webhook HTTP endpoints

**TUI:**
- Location: `src/frontends/tui/minimal.ts` (readline) and `src/frontends/tui/fullscreen.ts` (React/Ink)
- Triggers: `flynn tui` or `flynn tui --fullscreen`
- Responsibilities: Local interactive chat interface connecting to gateway via WebSocket

## Error Handling

**Strategy:** Catch-and-convert with descriptive context. No global error handler.

**Patterns:**
- Model layer: Retry with exponential backoff → tier fallback → global fallback → throw aggregated error
- Tool execution: `Promise.race` timeout → catch → return `ToolResult { success: false, error: message }`
- Channel adapters: `Promise.allSettled` for start/stop — log per-adapter errors, don't crash
- Daemon: Lifecycle LIFO shutdown handlers — each wrapped in try/catch
- Config: Zod validation throws with structured error messages on invalid config
- Gateway: JSON-RPC error codes (`ParseError`, `MethodNotFound`, `InternalError`)
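
The tool-execution pattern (race against a timeout, then convert any failure into a result object) might look like the following sketch. The function name `executeWithTimeout` and the `output` field are assumptions; only the `success` and `error` fields appear in this document.

```typescript
interface ToolResult {
  success: boolean;
  output?: string;
  error?: string;
}

// Runs a tool body with a deadline and never throws: failures become ToolResult values.
async function executeWithTimeout(run: () => Promise<string>, timeoutMs: number): Promise<ToolResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`Tool timed out after ${timeoutMs}ms`)), timeoutMs);
  });
  try {
    // Whichever settles first wins: the tool itself or the timeout rejection.
    const output = await Promise.race([run(), timeout]);
    return { success: true, output };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { success: false, error: message };
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

Note that `Promise.race` does not cancel the losing promise: a timed-out tool keeps running in the background unless it supports its own cancellation.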

## Cross-Cutting Concerns

**Logging:** `console.log`/`console.error`/`console.warn`/`console.debug` throughout. No structured logging framework. Debug-level messages for model fallback decisions.

**Validation:** Zod for config validation (`src/config/schema.ts`). Tool args are expected to follow the `inputSchema` supplied to the model, but there is no runtime validation of tool args beyond what each tool checks itself.

**Authentication:** Multi-layer:
- Gateway: Bearer token auth + optional Tailscale identity header (`src/gateway/auth.ts`)
- Telegram: `allowed_chat_ids` whitelist
- Discord: `allowed_guild_ids` + `allowed_channel_ids` whitelists
- Slack: `allowed_channel_ids` whitelist + signing secret
- WhatsApp: `allowed_numbers` + `allowed_group_ids` whitelists
- Webhooks: HMAC signature verification (per-webhook secret)
- Pairing: DM pairing codes for unknown senders (`src/channels/pairing.ts`)
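
As an illustration of the webhook check, the sketch below verifies an HMAC signature with Node's built-in `crypto` module. The choice of SHA-256, hex encoding, and the function names are assumptions; the actual handler may use a different algorithm or signature format.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: signatures are hex-encoded HMAC-SHA256 digests of the raw request body.
function signPayload(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

function verifySignature(secret: string, body: string, signatureHex: string): boolean {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (expected.length !== received.length) {
    return false;
  }
  // Constant-time comparison avoids leaking how many leading bytes matched.
  return timingSafeEqual(expected, received);
}
```

The signature must be computed over the raw body bytes as received, before any JSON parsing or re-serialization.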

**Tool Policy:** Profile-based filtering (minimal/messaging/coding/full) + glob-pattern allow/deny lists + per-agent/per-provider overrides (`src/tools/policy.ts`).

**Configuration:** Single YAML file with `${ENV_VAR}` expansion, validated by comprehensive Zod schema. Every subsystem is feature-toggled via config. Default config path: `~/.config/flynn/config.yaml`.

---

*Architecture analysis: 2025-02-09*