Implement Phase 3 channel adapters that decouple message sources from the agent via a uniform ChannelAdapter interface and ChannelRegistry.
- Add ChannelAdapter/InboundMessage/OutboundMessage types
- Add ChannelRegistry for adapter lifecycle and message routing
- Add TelegramAdapter (grammy bot, auth middleware, confirmations, chunking)
- Add WebChatAdapter (thin shim over GatewayServer)
- Refactor daemon to use ChannelRegistry with per-channel-per-user agents
- Add config.get/config.patch gateway handlers (Phase 2 loose end)
- Add system.restart gateway handler (Phase 2 loose end)
- Add implementation plans and design docs
Tests: 225 passing (33 new channel adapter + gateway handler tests)
OpenClaw Parity Design: Flynn Feature Roadmap
Overview
Plan to evolve Flynn from a multi-model chat wrapper into a full self-hosted OpenClaw alternative. Bottom-up approach: tools first, then gateway, then channels, then skills/advanced features.
Build approach: Use Sonnet/Haiku via GitHub Copilot for implementation subagents.
Current state: Flynn v0.1.0 covers ~22% of OpenClaw features. Strong in model routing (4 providers, tiered fallback) and session management (SQLite, transfer). Zero tool execution capability.
Phase 1: Agent Tool Framework + Agent Loop
Goal: Turn Flynn from a chatbot into an agent that can execute tools with multi-step reasoning.
1.1 Tool Definition System
src/tools/
├── types.ts # Tool, ToolCall, ToolResult interfaces
├── registry.ts # ToolRegistry: register, lookup, list, serialize for providers
├── executor.ts # ToolExecutor: run tools, enforce hooks, timeout, truncation
├── builtin/
│ ├── shell.ts # shell.exec - run bash commands (cwd, timeout, max output)
│ ├── file-read.ts # file.read - read file contents (path, offset, limit)
│ ├── file-write.ts # file.write - write/create files (path, content)
│ ├── file-edit.ts # file.edit - find-and-replace (path, oldString, newString)
│ ├── file-list.ts # file.list - glob/list directory (pattern, path)
│ └── web-fetch.ts # web.fetch - HTTP GET with markdown conversion (url, format)
Tool interface:
interface Tool {
  name: string; // e.g. "shell.exec"
  description: string; // For the model's tool selection
  inputSchema: Record<string, unknown>; // JSON Schema for parameters
  execute(args: unknown): Promise<ToolResult>;
}
interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}
interface ToolCall {
  id: string; // Provider-assigned call ID
  name: string; // Tool name
  args: unknown; // Parsed arguments
}
ToolRegistry: Collects tools, serializes to Anthropic format ({ name, description, input_schema }) or OpenAI format ({ type: "function", function: { name, description, parameters } }).
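The two serialization targets described above can be sketched as follows. This is a minimal illustration rather than the planned implementation: the method names (register, get, list, toAnthropic, toOpenAI) and the echo tool are assumptions made for the example; only the Tool/ToolResult shapes and the two output formats come from this document.

```typescript
// Tool and ToolResult mirror the interfaces defined in this section.
interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: unknown): Promise<ToolResult>;
}

// Hypothetical registry; the method names are illustrative, not from the design.
class ToolRegistry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  get(toolName: string): Tool | undefined {
    return this.tools.get(toolName);
  }

  list(): Tool[] {
    return Array.from(this.tools.values());
  }

  // Anthropic format: { name, description, input_schema }
  toAnthropic() {
    return this.list().map((t) => ({
      name: t.name,
      description: t.description,
      input_schema: t.inputSchema,
    }));
  }

  // OpenAI format: { type: "function", function: { name, description, parameters } }
  toOpenAI() {
    return this.list().map((t) => ({
      type: "function" as const,
      function: {
        name: t.name,
        description: t.description,
        parameters: t.inputSchema,
      },
    }));
  }
}

// Example tool, invented for this sketch.
const echoTool: Tool = {
  name: "echo",
  description: "Return the input text unchanged",
  inputSchema: {
    type: "object",
    properties: { text: { type: "string" } },
    required: ["text"],
  },
  async execute(args: unknown): Promise<ToolResult> {
    const text = (args as { text?: unknown }).text;
    return typeof text === "string"
      ? { success: true, output: text }
      : { success: false, output: "", error: "text must be a string" };
  },
};

const toolRegistry = new ToolRegistry();
toolRegistry.register(echoTool);
```

One design point worth checking before implementation: to my knowledge both providers restrict tool names to letters, digits, underscores, and hyphens, so dotted names such as shell.exec may need to be mapped (for example to shell_exec) during serialization and mapped back when a tool call arrives. This should be verified against the current provider documentation.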
ToolExecutor: Wraps execution with:
- Hook engine check (confirm/log/silent) before execution
- Configurable timeout (default 30s for shell, 10s for file ops)
- Output truncation (max 50KB, with "truncated" marker)
- Error capture (catches exceptions, returns as ToolResult.error)
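A minimal sketch of the timeout, truncation, and error-capture wrapping listed above. The function name executeWithLimits, its parameters, and the "[truncated]" marker text are assumptions for illustration. The hook engine check is left out, and the limit is counted in characters rather than bytes to keep the example short.

```typescript
interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: unknown): Promise<ToolResult>;
}

// Cut long output and append a marker so the model knows data is missing.
// Note: this counts UTF-16 code units, not bytes (a simplification).
function truncateOutput(output: string, maxChars: number): string {
  if (output.length <= maxChars) return output;
  return output.slice(0, maxChars) + "\n[truncated]";
}

// Hypothetical helper: run a tool with a deadline, truncate its output,
// and convert any thrown exception into ToolResult.error.
async function executeWithLimits(
  tool: Tool,
  args: unknown,
  timeoutMs: number,
  maxChars: number = 50 * 1024,
): Promise<ToolResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });

  try {
    // Promise.race only stops waiting; it does not cancel the tool itself.
    // A real shell tool would also need to kill its child process.
    const result = await Promise.race([tool.execute(args), deadline]);
    return { ...result, output: truncateOutput(result.output, maxChars) };
  } catch (err) {
    return {
      success: false,
      output: "",
      error: err instanceof Error ? err.message : String(err),
    };
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

With this shape, the per-tool defaults mentioned above (30s for shell, 10s for file ops) would simply be different timeoutMs values chosen by the caller.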
1.2 Model Provider Tool Support
Update ChatRequest to accept optional tools: Tool[].
Anthropic (src/models/anthropic.ts):
- Pass tools as the tools parameter to messages.create() / messages.stream()
- Parse tool_use content blocks from response (id, name, input)
- Accept tool_result content blocks in messages (tool_use_id, content)
- Handle stop_reason: "tool_use" to signal tool call response
OpenAI (src/models/openai.ts):
- Pass tools as the tools parameter with type: "function" wrapper
- Parse tool_calls from response choices
- Accept role: "tool" messages with tool_call_id
- Handle finish_reason: "tool_calls" to signal tool call response
Types updates (src/models/types.ts):
- Add ToolCall to ChatResponse
- Add tool_use and tool_result message roles
- Add tools to ChatRequest
- Add tool_calls stream event type
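To illustrate the parsing steps above, the following sketch normalizes both providers' tool-call payloads into the ToolCall shape from section 1.1. The local interfaces cover only the fields used here and are not the SDK types; the function names and the _raw fallback key are assumptions. One difference the sketch makes visible: Anthropic delivers input as an already-parsed object, while OpenAI delivers function.arguments as a JSON string.

```typescript
interface ToolCall {
  id: string;
  name: string;
  args: unknown;
}

// Minimal local shapes, covering only the fields this sketch reads.
interface AnthropicContentBlock {
  type: string; // "text", "tool_use", ...
  text?: string;
  id?: string;
  name?: string;
  input?: unknown;
}

interface OpenAIToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

// Anthropic: tool calls arrive as tool_use blocks inside the content array.
function toolCallsFromAnthropic(content: AnthropicContentBlock[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const block of content) {
    if (block.type === "tool_use" && block.id !== undefined && block.name !== undefined) {
      calls.push({ id: block.id, name: block.name, args: block.input });
    }
  }
  return calls;
}

// OpenAI: tool calls arrive on the message; arguments is a JSON-encoded string.
function toolCallsFromOpenAI(toolCalls: OpenAIToolCall[] | undefined): ToolCall[] {
  return (toolCalls ?? []).map((tc) => {
    let args: unknown;
    try {
      args = JSON.parse(tc.function.arguments);
    } catch {
      // Models can emit malformed JSON. Keeping the raw string under a
      // hypothetical _raw key lets the tool report a useful error.
      args = { _raw: tc.function.arguments };
    }
    return { id: tc.id, name: tc.function.name, args };
  });
}
```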
1.3 Agent Loop
Replace single-turn NativeAgent with iterative tool-use loop:
User message
  -> Model call (with tools in request)
    -> If response contains tool_use:
      -> For each tool call:
        -> Check hooks (confirm/log/silent)
        -> If confirm: wait for approval (Telegram inline keyboard / TUI prompt)
        -> Execute tool via ToolExecutor
        -> Collect ToolResult
      -> Append tool_results to conversation
      -> Loop back to model call
    -> If text response (no tool_use):
      -> Return final response to user
    -> If max iterations reached (default 10):
      -> Return partial response with warning
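The flow above can be sketched as a small loop. This is an illustration under stated assumptions, not the NativeAgent implementation: ChatMessage, ModelResponse, ModelFn, ToolRunner, LoopOptions, and runAgentLoop are hypothetical names, the model is injected as a plain function so it can be mocked, and hook checks are assumed to happen inside the injected tool runner.

```typescript
interface ToolCall { id: string; name: string; args: unknown; }
interface ToolResult { success: boolean; output: string; error?: string; }

// Simplified, hypothetical conversation and response shapes.
type ChatMessage =
  | { role: "user" | "assistant"; content: string; toolCalls?: ToolCall[] }
  | { role: "tool"; toolCallId: string; content: string };

interface ModelResponse { text: string; toolCalls: ToolCall[]; }
type ModelFn = (messages: ChatMessage[]) => Promise<ModelResponse>;
type ToolRunner = (call: ToolCall) => Promise<ToolResult>;

interface LoopOptions {
  maxIterations?: number;      // default 10, as in the flow above
  isAborted?: () => boolean;   // abort flag accessor
}

interface LoopResult { text: string; iterations: number; stoppedEarly: boolean; }

async function runAgentLoop(
  userMessage: string,
  model: ModelFn,
  runTool: ToolRunner,
  options: LoopOptions = {},
): Promise<LoopResult> {
  const maxIterations = options.maxIterations ?? 10;
  const messages: ChatMessage[] = [{ role: "user", content: userMessage }];
  let lastText = "";

  for (let i = 1; i <= maxIterations; i++) {
    // Abort flag is checked before each iteration.
    if (options.isAborted?.()) {
      return { text: lastText, iterations: i - 1, stoppedEarly: true };
    }

    const response = await model(messages);
    lastText = response.text;

    // Plain text response: the loop is finished.
    if (response.toolCalls.length === 0) {
      return { text: response.text, iterations: i, stoppedEarly: false };
    }

    messages.push({ role: "assistant", content: response.text, toolCalls: response.toolCalls });
    for (const call of response.toolCalls) {
      // Abort flag is also checked before each tool execution.
      if (options.isAborted?.()) {
        return { text: lastText, iterations: i, stoppedEarly: true };
      }
      const result = await runTool(call);
      messages.push({
        role: "tool",
        toolCallId: call.id,
        content: result.success ? result.output : `Error: ${result.error ?? "unknown"}`,
      });
    }
  }

  // Iteration cap reached: return what we have, with a warning.
  return {
    text: `${lastText}\n[warning: stopped after ${maxIterations} iterations]`,
    iterations: maxIterations,
    stoppedEarly: true,
  };
}
```

Injecting the model and tool runner as functions matches the testing strategy later in this document, where a mock model returns tool_use and the test verifies execution.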
Streaming during tool execution: Emit status events:
- { event: "tool_start", tool: "shell.exec", args: { command: "ls" } }
- { event: "tool_end", tool: "shell.exec", result: { success: true, output: "..." } }
- These render as status lines in TUI and Telegram
Abort support:
- TUI: Escape key sets abort flag, checked before each tool execution
- Telegram: /cancel command sets abort flag on active session
- Agent loop checks abort flag before each iteration
1.4 Frontend Updates
Telegram:
- Show tool execution as status messages ("Running shell.exec: ls -la...")
- Confirmation buttons already exist, wire to tool executor
- Show final response after tool loop completes
TUI (both modes):
- Show tool calls inline with dimmed formatting
- Show tool results with output (truncated in UI if long)
- Streaming text interleaved with tool status
- Escape aborts the agent loop
1.5 Deliverables
- src/tools/types.ts - Tool, ToolCall, ToolResult interfaces
- src/tools/registry.ts - ToolRegistry with provider serialization
- src/tools/executor.ts - ToolExecutor with hooks, timeout, truncation
- src/tools/builtin/shell.ts - Shell exec tool
- src/tools/builtin/file-read.ts - File read tool
- src/tools/builtin/file-write.ts - File write tool
- src/tools/builtin/file-edit.ts - File edit tool
- src/tools/builtin/file-list.ts - File list/glob tool
- src/tools/builtin/web-fetch.ts - Web fetch tool
- src/models/types.ts - Add tool-related types
- src/models/anthropic.ts - Add tool use support
- src/models/openai.ts - Add tool use support
- src/backends/native/agent.ts - Agent loop with tool execution
- src/frontends/telegram/bot.ts - Tool status messages
- src/frontends/tui/minimal.ts - Tool display in minimal TUI
- src/frontends/tui/components/App.tsx - Tool display in fullscreen TUI
- Tests for all new modules
Phase 2: WebSocket Gateway
Goal: Central control plane that multiple clients connect to. Decouples frontends from the agent.
2.1 Gateway Core
src/gateway/
├── server.ts # WebSocket server (ws library), configurable port
├── protocol.ts # JSON-RPC-style message types
├── router.ts # Routes methods to handlers
├── auth.ts # Token + password auth, Tailscale identity headers
├── session-bridge.ts # Maps WS client connections to sessions
└── handlers/
    ├── agent.ts # agent.send, agent.cancel, agent.status
    ├── sessions.ts # sessions.list, sessions.history, sessions.send
    ├── tools.ts # tools.list, tools.invoke
    ├── config.ts # config.get, config.patch
    └── system.ts # system.health, system.restart
2.2 Protocol
JSON-RPC-like over WebSocket:
Request: { id: number, method: string, params: object }
Response: { id: number, result: object }
Error: { id: number, error: { code: number, message: string } }
Event: { id: number, event: string, data: object } // streaming
Event types for agent.send streaming:
- content - text chunk from model
- tool_start - tool execution beginning
- tool_end - tool execution complete
- thinking - model reasoning (if exposed)
- done - final response
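A sketch of how the four message shapes and method routing could fit together. The type names, the dispatch function, and the handler signature are assumptions for illustration. The error codes -32601 and -32603 are borrowed from JSON-RPC 2.0 (method not found, internal error); this document does not specify which codes the gateway uses.

```typescript
// The four message shapes from the protocol description above.
interface GatewayRequest { id: number; method: string; params: Record<string, unknown>; }
interface GatewayResponse { id: number; result: Record<string, unknown>; }
interface GatewayError { id: number; error: { code: number; message: string }; }
interface GatewayEvent { id: number; event: string; data: Record<string, unknown>; }

// Hypothetical handler signature: params in, result out, with an emit
// callback for streaming events while the request is in flight.
type GatewayHandler = (
  params: Record<string, unknown>,
  emit: (event: string, data: Record<string, unknown>) => void,
) => Promise<Record<string, unknown>>;

// Route one request to its handler. Events are tagged with the request id
// so a client can match streamed chunks to the call that produced them.
async function dispatch(
  req: GatewayRequest,
  handlers: Map<string, GatewayHandler>,
  send: (event: GatewayEvent) => void,
): Promise<GatewayResponse | GatewayError> {
  const handler = handlers.get(req.method);
  if (!handler) {
    // -32601: JSON-RPC 2.0 "method not found" (assumed here, not specified).
    return { id: req.id, error: { code: -32601, message: `Unknown method: ${req.method}` } };
  }
  try {
    const result = await handler(req.params, (event, data) => {
      send({ id: req.id, event, data });
    });
    return { id: req.id, result };
  } catch (err) {
    // -32603: JSON-RPC 2.0 "internal error" (assumed here, not specified).
    const message = err instanceof Error ? err.message : String(err);
    return { id: req.id, error: { code: -32603, message } };
  }
}
```

In a server built on the ws library, send would serialize the event with JSON.stringify and write it to the client socket; that wiring is omitted here.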
2.3 Control UI
Minimal web dashboard served from gateway port:
src/gateway/ui/
├── index.html # Dashboard: sessions, model info, config
├── chat.html # WebChat: WS-connected chat interface
└── assets/ # CSS + minimal JS (no framework, or Preact)
2.4 Daemon Refactor
Gateway becomes the hub:
- Owns session manager, tool registry, model router
- Telegram adapter connects as a gateway client (in-process bridge)
- TUI connects as a gateway client (in-process or WS)
- WebChat connects via WS from browser
2.5 Deliverables
- src/gateway/server.ts - WebSocket server
- src/gateway/protocol.ts - Message type definitions
- src/gateway/router.ts - Method routing
- src/gateway/auth.ts - Authentication
- src/gateway/session-bridge.ts - Client-to-session mapping
- Gateway handlers (agent, sessions, tools, config, system)
- src/gateway/ui/ - Control UI + WebChat
- Refactor daemon to use gateway as hub
- Refactor Telegram as gateway client
- Tests for gateway protocol and handlers
Phase 3: Channel Adapters
Goal: Multi-channel inbox. One assistant accessible from WhatsApp, Telegram, Discord, Slack, and WebChat.
3.1 Channel Adapter Interface
src/channels/
├── types.ts # ChannelAdapter interface
├── registry.ts # ChannelRegistry: load/unload at runtime
├── telegram/ # Refactored from src/frontends/telegram/
├── discord/ # discord.js
├── whatsapp/ # Baileys
├── slack/ # Bolt (Socket Mode)
└── webchat/ # Gateway WS built-in
ChannelAdapter interface:
interface ChannelAdapter {
  name: string;
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(peerId: string, message: OutboundMessage): Promise<void>;
  onMessage(handler: (msg: InboundMessage) => void): void;
}
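To show how an adapter and the registry might interact, here is a sketch with an in-memory adapter. The ChannelAdapter interface is the one above. Everything else is an assumption: this document names InboundMessage and OutboundMessage but does not define their fields, and InMemoryAdapter, InboundHandler, and the ChannelRegistry methods are invented for the example.

```typescript
// Hypothetical message shapes; the design does not define their fields.
interface InboundMessage { channel: string; peerId: string; text: string; }
interface OutboundMessage { text: string; }

interface ChannelAdapter {
  name: string;
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(peerId: string, message: OutboundMessage): Promise<void>;
  onMessage(handler: (msg: InboundMessage) => void): void;
}

// Test double: records outgoing messages instead of calling a chat service.
class InMemoryAdapter implements ChannelAdapter {
  readonly name = "memory";
  readonly sent: Array<{ peerId: string; message: OutboundMessage }> = [];
  private handler: ((msg: InboundMessage) => void) | undefined;
  private connected = false;

  async connect(): Promise<void> { this.connected = true; }
  async disconnect(): Promise<void> { this.connected = false; }

  async send(peerId: string, message: OutboundMessage): Promise<void> {
    if (!this.connected) throw new Error("adapter is not connected");
    this.sent.push({ peerId, message });
  }

  onMessage(handler: (msg: InboundMessage) => void): void {
    this.handler = handler;
  }

  // Simulate a message arriving from the channel.
  receive(peerId: string, text: string): void {
    this.handler?.({ channel: this.name, peerId, text });
  }
}

type InboundHandler = (
  msg: InboundMessage,
  reply: (out: OutboundMessage) => Promise<void>,
) => Promise<void>;

// Hypothetical registry: owns adapter lifecycle and routes inbound messages
// to a single handler, passing a reply function bound to the source channel.
class ChannelRegistry {
  private readonly adapters = new Map<string, ChannelAdapter>();
  private readonly onInbound: InboundHandler;

  constructor(onInbound: InboundHandler) {
    this.onInbound = onInbound;
  }

  async register(adapter: ChannelAdapter): Promise<void> {
    if (this.adapters.has(adapter.name)) {
      throw new Error(`Channel already registered: ${adapter.name}`);
    }
    adapter.onMessage((msg) => {
      this.onInbound(msg, (out) => adapter.send(msg.peerId, out)).catch((err) => {
        console.error(`[${adapter.name}] inbound handler failed:`, err);
      });
    });
    await adapter.connect();
    this.adapters.set(adapter.name, adapter);
  }

  async unregister(channelName: string): Promise<void> {
    const adapter = this.adapters.get(channelName);
    if (!adapter) return;
    await adapter.disconnect();
    this.adapters.delete(channelName);
  }

  names(): string[] {
    return Array.from(this.adapters.keys());
  }
}
```

Because the agent only sees InboundMessage and a reply function, the same handler can serve every channel in the build order below without knowing which one a message came from.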
3.2 Build Order
- Telegram (refactor existing)
- Discord (discord.js)
- WhatsApp (Baileys)
- Slack (Bolt Socket Mode)
- WebChat (gateway built-in)
3.3 Security Per Channel
Every adapter implements:
- DM pairing: unknown senders get pairing code, must be approved via CLI/UI
- Allowlists: channels.<name>.allowFrom config array
- Group mention gating: channels.<name>.groups.*.requireMention
- Rate limiting: per-sender throttle (configurable)
- Message size limits per channel
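The checks above could be combined into a single gate that runs before a message reaches the agent. The sketch below is one possible ordering under stated assumptions: allowFrom and requireMention come from the config keys named above, while maxMessagesPerMinute, maxMessageChars, the IncomingChannelMessage shape, and the Verdict values are hypothetical. DM pairing itself (issuing and approving codes) is not shown; unknown senders are only flagged.

```typescript
interface ChannelSecurityConfig {
  allowFrom: string[];            // channels.<name>.allowFrom
  requireMention: boolean;        // channels.<name>.groups.*.requireMention
  maxMessagesPerMinute: number;   // hypothetical key for the per-sender throttle
  maxMessageChars: number;        // hypothetical key for the size limit
}

// Hypothetical normalized view of an incoming message.
interface IncomingChannelMessage {
  senderId: string;
  text: string;
  isGroup: boolean;
  mentionsBot: boolean;
}

type Verdict =
  | "allow"
  | "pairing_required"
  | "mention_required"
  | "too_large"
  | "rate_limited";

// Returns a stateful check function; state is the per-sender timestamp list
// used for a sliding one-minute window.
function createGate(config: ChannelSecurityConfig) {
  const allowed = new Set(config.allowFrom);
  const recent = new Map<string, number[]>();
  const windowMs = 60000;

  return function check(msg: IncomingChannelMessage, nowMs: number): Verdict {
    // Unknown senders would enter the DM pairing flow.
    if (!allowed.has(msg.senderId)) return "pairing_required";

    // In groups, ignore messages that do not mention the bot.
    if (msg.isGroup && config.requireMention && !msg.mentionsBot) {
      return "mention_required";
    }

    if (msg.text.length > config.maxMessageChars) return "too_large";

    // Sliding window: keep only timestamps from the last minute.
    const windowStart = nowMs - windowMs;
    const times = (recent.get(msg.senderId) ?? []).filter((t) => t > windowStart);
    if (times.length >= config.maxMessagesPerMinute) {
      recent.set(msg.senderId, times);
      return "rate_limited";
    }
    times.push(nowMs);
    recent.set(msg.senderId, times);
    return "allow";
  };
}
```

Passing nowMs explicitly instead of reading the clock inside the gate keeps the throttle deterministic in unit tests.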
3.4 Deliverables
- src/channels/types.ts - ChannelAdapter interface
- src/channels/registry.ts - Channel registry
- src/channels/telegram/ - Refactored Telegram adapter
- src/channels/discord/ - Discord adapter
- src/channels/whatsapp/ - WhatsApp adapter
- src/channels/slack/ - Slack adapter
- src/channels/webchat/ - WebChat adapter
- DM pairing system
- Per-channel security (allowlists, mention gating, rate limiting)
- Tests per adapter
Phase 4: Skills + MCP
Goal: Extensible capability system with community skills and MCP tool servers.
4.1 Skills System
src/skills/
├── types.ts # Skill, SkillManifest interfaces
├── loader.ts # Load SKILL.md + scripts from skill directories
├── registry.ts # Discovery, gating (OS/bin/env checks)
└── installer.ts # Auto-install dependencies (with user confirmation)
Skill directory structure:
~/.flynn/workspace/skills/<name>/
├── SKILL.md # Instructions injected into system prompt
├── manifest.json # Requirements, permissions, dependencies
└── scripts/ # Executable helpers
Three tiers: bundled (shipped with Flynn), managed (installed via CLI), workspace (user-created).
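The gating step in the registry (OS/bin/env checks) might look like the sketch below. The manifest.json schema is not defined in this document, so the requires block and its os, bins, and env fields are assumptions, as are HostInfo and checkSkill. Host facts are passed in as an object so the check can be unit tested without touching the real system.

```typescript
// Hypothetical requirement block inside manifest.json.
interface SkillRequirements {
  os?: string[];    // allowed platforms, e.g. ["linux", "darwin"]
  bins?: string[];  // executables that must be on PATH
  env?: string[];   // environment variables that must be set
}

interface SkillManifest {
  name: string;
  requires?: SkillRequirements;
}

// Facts about the host, injected by the caller.
interface HostInfo {
  platform: string;
  hasBinary(binaryName: string): boolean;
  env: Record<string, string | undefined>;
}

interface GateResult {
  eligible: boolean;
  missing: string[]; // human-readable reasons, empty when eligible
}

function checkSkill(manifest: SkillManifest, host: HostInfo): GateResult {
  const req: SkillRequirements = manifest.requires ?? {};
  const missing: string[] = [];

  if (req.os && req.os.length > 0 && req.os.indexOf(host.platform) === -1) {
    missing.push(`os:${host.platform}`);
  }
  for (const bin of req.bins ?? []) {
    if (!host.hasBinary(bin)) missing.push(`bin:${bin}`);
  }
  for (const key of req.env ?? []) {
    if (!host.env[key]) missing.push(`env:${key}`);
  }

  return { eligible: missing.length === 0, missing };
}
```

Returning the list of unmet requirements, rather than a bare boolean, would give the installer something concrete to offer to fix after user confirmation.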
4.2 MCP Integration
src/mcp/
├── client.ts # MCP client (stdio transport)
├── bridge.ts # Convert MCP tools -> Flynn tool registry entries
└── manager.ts # Lifecycle: start/stop/restart MCP servers per config
Wire the existing mcp.servers config to actually start MCP server processes, discover their tools, and register them in the tool registry. MCP tools appear alongside builtins -- the agent doesn't know the difference.
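A sketch of what the bridge could do once an MCP server is running. McpClientLike is a deliberately minimal, hypothetical interface and not the API of any MCP SDK; the real client would be adapted to it. The tool descriptor fields (name, description, inputSchema) and the result fields (content, isError) follow the MCP tool shapes as I understand them. The mcp.<server>.<tool> naming prefix is an assumption made here to avoid collisions with builtins.

```typescript
interface ToolResult { success: boolean; output: string; error?: string; }
interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: unknown): Promise<ToolResult>;
}

// Fields of an MCP tool descriptor and call result that this sketch reads.
interface McpToolInfo {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
}
interface McpCallResult {
  content: Array<{ type: string; text?: string }>;
  isError?: boolean;
}

// Hypothetical minimal client surface; not an SDK API.
interface McpClientLike {
  listTools(): Promise<McpToolInfo[]>;
  callTool(toolName: string, args: unknown): Promise<McpCallResult>;
}

// Discover a server's tools and wrap each one as a Flynn Tool.
async function bridgeMcpTools(serverName: string, client: McpClientLike): Promise<Tool[]> {
  const infos = await client.listTools();
  return infos.map((info): Tool => ({
    name: `mcp.${serverName}.${info.name}`, // assumed naming scheme
    description: info.description ?? "",
    inputSchema: info.inputSchema,
    async execute(args: unknown): Promise<ToolResult> {
      try {
        const result = await client.callTool(info.name, args);
        // Only text parts are kept; other content types are dropped here.
        const text = result.content
          .filter((part) => part.type === "text")
          .map((part) => part.text ?? "")
          .join("\n");
        return result.isError
          ? { success: false, output: "", error: text }
          : { success: true, output: text };
      } catch (err) {
        return {
          success: false,
          output: "",
          error: err instanceof Error ? err.message : String(err),
        };
      }
    },
  }));
}
```

The returned objects satisfy the same Tool interface as the builtins, which is what lets the agent treat both kinds identically.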
4.3 Deliverables
- Skills types, loader, registry, installer
- MCP client, bridge, manager
- Wire MCP config to runtime
- Bundled skills (at least: web-search, git, system-info)
- Tests
Phase 5: Advanced Features
Build after core is solid. Each is independent.
| Feature | Module | Description |
|---|---|---|
| Cron/scheduling | src/automation/cron.ts | Cron expressions trigger agent messages |
| Webhooks | src/automation/webhooks.ts | Inbound HTTP triggers |
| Browser tool | src/tools/builtin/browser.ts | Playwright headless browser |
| Agent-to-agent | sessions_list/history/send tools | Multi-agent coordination |
| Sub-agents | src/backends/native/subagent.ts | Spawn scoped child sessions |
| Sandboxing | src/sandbox/ | Docker per-session isolation |
| Voice | src/voice/ | STT (Whisper) + TTS (ElevenLabs) |
| Canvas/A2UI | src/gateway/ui/canvas/ | Agent-driven web workspace |
| CLI surface | src/cli/ | flynn gateway, flynn agent, flynn send, etc. |
| Mobile nodes | Separate repos | iOS (Swift) + Android (Kotlin) companion apps |
| Onboarding wizard | src/cli/onboard.ts | Guided setup flow |
| Doctor diagnostics | src/cli/doctor.ts | Config + health validation |
Implementation Notes
Model Selection for Subagents
Use cheaper/faster models via GitHub Copilot for implementation:
- Sonnet: Complex implementation tasks (agent loop, gateway, protocol)
- Haiku: Mechanical tasks (individual tool implementations, adapter boilerplate, tests)
- Opus: Design review, architecture decisions only
Testing Strategy
- Unit tests for all tools (mock filesystem/network)
- Unit tests for tool registry serialization (Anthropic + OpenAI formats)
- Integration tests for agent loop (mock model returns tool_use, verify execution)
- Integration tests for gateway protocol (WS client/server)
- E2E tests for tool execution (real shell commands in temp dirs)
Migration Path
- Phase 1 is additive (no breaking changes to existing code)
- Phase 2 refactors daemon (breaking, but internal only)
- Phase 3 moves Telegram (file rename, adapter interface)
- Each phase is a separate feature branch
Design Version: 1.0
Created: 2026-02-05
Approach: Bottom-up (tools -> gateway -> channels -> skills -> advanced)