From 8717153e4890958da12fc839eeda61b72d988cc1 Mon Sep 17 00:00:00 2001
From: William Valentin
Date: Mon, 2 Feb 2026 20:37:53 -0800
Subject: [PATCH] Add Flynn design document

Initial design for a self-hosted personal AI agent with:
- Telegram + TUI frontends
- Multi-model routing with fallback chain
- Claude Code/OpenCode CLI integration
- Hook-based security for sensitive operations
- Tailscale-only network exposure

Co-Authored-By: Claude Opus 4.5
---
 docs/plans/2026-02-02-flynn-design.md | 342 ++++++++++++++++++++++++++
 1 file changed, 342 insertions(+)
 create mode 100644 docs/plans/2026-02-02-flynn-design.md

diff --git a/docs/plans/2026-02-02-flynn-design.md b/docs/plans/2026-02-02-flynn-design.md
new file mode 100644
index 0000000..92014df
--- /dev/null
+++ b/docs/plans/2026-02-02-flynn-design.md
@@ -0,0 +1,342 @@
+# Flynn Design Document
+
+**Date:** 2026-02-02
+**Status:** Draft
+
+## Overview
+
+Flynn is a self-hosted personal AI agent accessible via Telegram and a local TUI. It runs on your workstation, behind Tailscale, with no internet exposure. You text it like a friend - it can search the web, run commands, query your K8s cluster, manage files, and proactively notify you of events.
+
+## Core Principles
+
+- **Tailscale-only** - Daemon binds to Tailscale interface + localhost, never `0.0.0.0`
+- **Single-user** - Your Telegram chat ID is the only authorized user
+- **Hook-gated** - Sensitive operations require confirmation before execution
+- **Smart routing** - Model selection based on task complexity, with multi-provider fallback
+- **Multi-frontend** - One daemon, multiple interfaces (Telegram, TUI, web later)
+- **Backend-agnostic** - Can delegate to Claude Code CLI, OpenCode CLI, or native agent
+
+## What Flynn Is Not
+
+- Not a multi-user platform
+- Not internet-facing
+- Not a replacement for Claude Code CLI - it complements it with mobile/async access
+
+## Architecture
+
+```
+┌─────────────┐
+│  Telegram   │──┐
+│  (grammY)   │  │
+└─────────────┘  │
+                 │    ┌─────────────────────────────────────┐
+┌─────────────┐  │    │           Flynn Daemon              │
+│    TUI      │──┼───▶│                                     │
+│   (Ink)     │  │    │  ┌───────────┐    ┌──────────────┐  │
+└─────────────┘  │    │  │  Session  │    │ Hook Engine  │  │
+                 │    │  │  Manager  │◀──▶│ (confirm via │  │
+┌─────────────┐  │    │  └───────────┘    │ active front)│  │
+│   Web UI    │──┘    │        │          └──────────────┘  │
+│  (future)   │       │        ▼                            │
+└─────────────┘       │  ┌───────────┐    ┌──────────────┐  │
+                      │  │   Model   │───▶│   Backends   │  │
+                      │  │  Router   │    │ • Claude Code│  │
+                      │  └───────────┘    │ • OpenCode   │  │
+                      │                   │ • Native     │  │
+                      │  ┌───────────┐    └──────────────┘  │
+                      │  │ Notifier  │                      │
+                      │  │ (cron/wh) │                      │
+                      │  └───────────┘                      │
+                      └─────────────────────────────────────┘
+                        binds to Tailscale IP + localhost
+```
+
+### Components
+
+**Flynn Daemon** - Long-running TypeScript/Node.js process exposing an internal API via WebSocket or Unix socket.
+
+**Session Manager** - Shared sessions across frontends. Start a conversation on Telegram, continue in TUI. Persists to SQLite.
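The shared-session behaviour of the Session Manager can be illustrated with a minimal sketch. It assumes sessions are keyed by user rather than by frontend (the first option in the open question on session sharing), and it uses an in-memory `Map` as a stand-in for the SQLite store. All names here (`FrontendName`, `ChatMessage`, `ChatSession`, `SessionManager`) are hypothetical and not part of the design.

```typescript
// Hypothetical types for illustration; none of these names come from the design.
type FrontendName = "telegram" | "tui";

interface ChatMessage {
  frontend: FrontendName;
  role: "user" | "assistant";
  text: string;
}

interface ChatSession {
  id: string;
  messages: ChatMessage[];
}

class SessionManager {
  // Assumption: an in-memory map stands in for the SQLite persistence layer.
  private sessions = new Map<string, ChatSession>();

  // Sessions are keyed by user, not by frontend, so every frontend
  // reads and writes the same conversation history.
  getOrCreate(userId: string): ChatSession {
    let session = this.sessions.get(userId);
    if (session === undefined) {
      session = { id: userId, messages: [] };
      this.sessions.set(userId, session);
    }
    return session;
  }

  append(userId: string, message: ChatMessage): void {
    this.getOrCreate(userId).messages.push(message);
  }

  // Drops the stored history; a later getOrCreate starts a fresh session.
  reset(userId: string): void {
    this.sessions.delete(userId);
  }
}

// Start a conversation on Telegram, continue it in the TUI.
const manager = new SessionManager();
manager.append("owner", { frontend: "telegram", role: "user", text: "check the cluster" });
manager.append("owner", { frontend: "tui", role: "user", text: "now show the failing pods" });
const sharedHistory = manager.getOrCreate("owner").messages;
```

Keying by user keeps the frontends stateless with respect to history. If the "transfer" model were chosen instead, the key would likely need to include the frontend.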
+
+**Model Router** - Routes requests to the appropriate model based on task complexity:
+- Local LLMs (Ollama/llama.cpp) for private tasks, triage, offline
+- Haiku for quick replies
+- Sonnet for general work
+- Opus for complex reasoning
+- Fallback chain: Anthropic → OpenAI → Gemini → Local
+
+**Hook Engine** - Intercepts tool calls before execution. Sensitive operations send confirmation to the active frontend (Telegram or TUI). Non-sensitive operations execute immediately.
+
+**Backends** - Three execution modes:
+- **Claude Code CLI** - Spawns your existing Claude Code setup with all agents/skills
+- **OpenCode CLI** - Alternative agent runner
+- **Native agent** - Lightweight built-in for simple tasks, triage, notifications
+
+**Notifier** - Cron scheduler and webhook listener for proactive messages.
+
+## Frontends
+
+### Telegram
+
+- grammY-based bot
+- Allowlist by chat ID (single user)
+- Voice message transcription via Whisper (local or API)
+- Optional TTS responses
+- Inline keyboard buttons for hook confirmations
+- Commands: `/status`, `/reset`, `/cron`, `/hooks`, `/model`
+
+### TUI
+
+- Hybrid mode: minimal readline by default, Tab to expand full-screen
+- Minimal mode: simple prompt with streaming output
+- Full-screen mode: conversation pane, status bar, tool output
+- Shares session state with Telegram frontend
+- Built with Ink (React for CLI)
+
+## Tool System
+
+### Built-in Tools
+
+| Tool | Hook | Description |
+|------|------|-------------|
+| `shell` | confirm | Execute bash commands |
+| `file.read` | log | Read files |
+| `file.write` | confirm | Write files |
+| `web.search` | log | Search the web |
+| `web.fetch` | log | Fetch URL content |
+| `notify` | silent | Send proactive message |
+| `cron.manage` | confirm | Create/list/delete scheduled tasks |
+
+### MCP Integration
+
+Flynn connects to MCP servers as a client. Server configs defined in `config.yaml`. MCP tool calls go through the Hook Engine - any tool can be gated as sensitive.
+
+### Hook Classification
+
+```yaml
+hooks:
+  confirm:           # Requires confirmation
+    - shell.*
+    - file.write
+    - k8s.mutate
+    - cron.*
+    - backend.claude_code
+  log:               # Executes but logs
+    - web.*
+    - file.read
+  silent:            # No notification
+    - notify
+```
+
+## Network & Security
+
+**Inbound access:**
+- Daemon binds to Tailscale IP + localhost only
+- No internet-facing ports
+- Telegram Bot API is outbound-only (long polling)
+
+**Outbound access:**
+- Full LAN access (other machines, services, NAS, local APIs)
+- Internet access for web search, LLM APIs
+
+**Authentication:**
+- Telegram: chat ID allowlist (hardcoded in config)
+- TUI: localhost only (implicit trust)
+- Future web UI: Tailscale IP only
+
+**Security boundaries:**
+1. Telegram allowlist - only your chat ID
+2. Tailscale - no internet exposure
+3. Hook engine - sensitive ops require confirmation
+4. LAN access controlled by what tools expose
+
+## Configuration
+
+Location: `~/.config/flynn/config.yaml`
+
+```yaml
+# Identity
+telegram:
+  bot_token: ${FLYNN_TELEGRAM_TOKEN}
+  allowed_chat_ids: [123456789]
+
+# Network
+server:
+  tailscale_only: true
+  localhost: true
+  port: 18800
+
+# Model routing
+models:
+  local:
+    provider: ollama  # or llamacpp
+    endpoint: http://localhost:11434
+    model: llama3.2
+    for: [triage, private]
+  fast:
+    provider: anthropic
+    model: claude-haiku
+  default:
+    provider: anthropic
+    model: claude-sonnet
+  complex:
+    provider: anthropic
+    model: claude-opus
+  fallback_chain: [anthropic, openai, gemini, local]
+
+# Backends
+backends:
+  claude_code:
+    enabled: true
+    path: /usr/bin/claude
+  opencode:
+    enabled: true
+    path: /usr/bin/opencode
+  native:
+    enabled: true
+
+# Hooks
+hooks:
+  confirm:
+    - shell.*
+    - file.write
+    - k8s.mutate
+    - cron.*
+    - backend.claude_code
+  log:
+    - web.*
+    - file.read
+  silent:
+    - notify
+
+# MCP servers
+mcp:
+  servers:
+    - name: filesystem
+      command: mcp-filesystem
+      args: ["/home/will"]
+    - name: brave-search
+      command: mcp-brave-search
+```
+
+## Project Structure
+
+```
+flynn/
+├── src/
+│   ├── daemon/
+│   │   ├── index.ts           # Main daemon entry
+│   │   ├── server.ts          # WebSocket/Unix socket API
+│   │   └── session.ts         # Session manager
+│   │
+│   ├── frontends/
+│   │   ├── telegram/
+│   │   │   ├── bot.ts         # grammY bot setup
+│   │   │   ├── handlers.ts    # Message/command handlers
+│   │   │   └── voice.ts       # Transcription/TTS
+│   │   │
+│   │   └── tui/
+│   │       ├── app.ts         # TUI entry (Ink)
+│   │       ├── minimal.ts     # Readline mode
+│   │       └── fullscreen.ts  # Panel mode
+│   │
+│   ├── backends/
+│   │   ├── router.ts          # Backend selection logic
+│   │   ├── claude-code.ts     # Claude Code CLI spawner
+│   │   ├── opencode.ts        # OpenCode CLI spawner
+│   │   └── native/
+│   │       ├── agent.ts       # Built-in lightweight agent
+│   │       └── tools/         # Native tool implementations
+│   │
+│   ├── models/
+│   │   ├── router.ts          # Model selection logic
+│   │   ├── anthropic.ts       # Anthropic API client
+│   │   ├── openai.ts          # OpenAI fallback
+│   │   ├── gemini.ts          # Gemini fallback
+│   │   └── local/
+│   │       ├── ollama.ts      # Ollama client
+│   │       └── llamacpp.ts    # llama.cpp server client
+│   │
+│   ├── hooks/
+│   │   ├── engine.ts          # Hook interception logic
+│   │   └── confirm.ts         # Confirmation flow
+│   │
+│   ├── notify/
+│   │   ├── cron.ts            # Scheduled tasks
+│   │   └── webhook.ts         # Webhook listener
+│   │
+│   └── mcp/
+│       └── client.ts          # MCP server connections
+│
+├── config/
+│   └── default.yaml           # Default config template
+│
+├── package.json
+├── tsconfig.json
+└── README.md
+```
+
+## Key Dependencies
+
+- `grammy` - Telegram bot framework
+- `ink` - React for CLI (TUI)
+- `@anthropic-ai/sdk` - Anthropic API
+- `openai` - OpenAI fallback
+- `@google/generative-ai` - Gemini fallback
+- `ollama` - Ollama client
+- `croner` - Cron scheduling
+- `better-sqlite3` - Session persistence
+- `@modelcontextprotocol/sdk` - MCP client
+
+## Comparison with OpenClaw
+
+| Aspect | OpenClaw | Flynn |
+|--------|----------|-------|
+| Channels | 12+ (WhatsApp, Telegram, Discord, etc.) | Telegram + TUI |
+| Network | Internet-facing optional | Tailscale-only |
+| Security model | Docker sandboxes, complex policies | Hook-based confirmation |
+| Multi-user | Yes, with isolation | Single-user |
+| Backends | Pi agent (custom) | Claude Code, OpenCode, Native |
+| Complexity | High (Gateway, Nodes, Canvas, etc.) | Focused |
+
+Flynn takes OpenClaw's core value proposition (text your AI like a friend) and strips it down to a single-user, security-focused design that integrates with existing CLI agents rather than replacing them.
+
+## Implementation Phases
+
+### Phase 1: Foundation
+- Daemon skeleton with WebSocket API
+- Basic Telegram bot (text only)
+- Native agent with Anthropic API
+- Config loading
+
+### Phase 2: Core Features
+- Model router with fallback chain
+- Hook engine with Telegram confirmations
+- Session persistence
+- Local LLM integration (Ollama)
+
+### Phase 3: TUI
+- Minimal readline mode
+- Full-screen panel mode
+- Shared sessions with Telegram
+
+### Phase 4: Backend Integration
+- Claude Code CLI spawner
+- OpenCode CLI spawner
+- Backend routing logic
+
+### Phase 5: Advanced
+- Voice transcription/TTS
+- Cron scheduler
+- Webhook listener
+- MCP client integration
+- llama.cpp support
+
+## Open Questions
+
+1. **Session sharing granularity** - Should Telegram and TUI share the exact same session, or have separate sessions that can be "transferred"?
+
+2. **Backend selection** - Automatic routing to backends based on task type, or explicit user choice via command?
+
+3. **Voice response default** - When you send a voice message, should Flynn respond with voice by default, or text?
+
+4. **Notification batching** - Should cron/webhook notifications batch or send immediately?
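
The hook classification in this design maps tool names to one of three actions using patterns such as `shell.*`. The sketch below shows one way that lookup might work. The patterns are copied from the `hooks` config, but the matching rules are assumptions: a trailing `.*` is taken to match both the bare name (so `shell` in the tool table is covered by `shell.*`) and any dotted child, and tools that match no pattern fall back to `confirm`. `HookAction`, `hookConfig`, `matches`, and `classify` are hypothetical names.

```typescript
type HookAction = "confirm" | "log" | "silent";

// Patterns copied from the `hooks` section of the design's config.yaml.
const hookConfig: Record<HookAction, string[]> = {
  confirm: ["shell.*", "file.write", "k8s.mutate", "cron.*", "backend.claude_code"],
  log: ["web.*", "file.read"],
  silent: ["notify"],
};

// Assumption: "prefix.*" matches "prefix" itself and "prefix.<anything>";
// every other pattern is an exact match.
function matches(pattern: string, tool: string): boolean {
  if (pattern.endsWith(".*")) {
    const prefix = pattern.slice(0, -2);
    return tool === prefix || tool.startsWith(prefix + ".");
  }
  return pattern === tool;
}

// Assumption: the strictest action is checked first, and a tool that
// matches nothing fails closed to "confirm".
function classify(tool: string): HookAction {
  const order: HookAction[] = ["confirm", "log", "silent"];
  for (const action of order) {
    if (hookConfig[action].some((pattern) => matches(pattern, tool))) {
      return action;
    }
  }
  return "confirm";
}
```

Under these assumptions `classify("cron.manage")` resolves to `confirm` and `classify("web.fetch")` to `log`, which agrees with the built-in tools table. Failing closed is a design choice for the sketch: a newly added MCP tool would prompt for confirmation until it is explicitly listed under `log` or `silent`.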