# OpenClaw Parity Design: Flynn Feature Roadmap

## Overview

Plan to evolve Flynn from a multi-model chat wrapper into a full self-hosted OpenClaw alternative. Bottom-up approach: tools first, then gateway, then channels, then skills/advanced features.

**Build approach**: Use Sonnet/Haiku via GitHub Copilot for implementation subagents.

**Current state**: Flynn v0.1.0 covers ~22% of OpenClaw features. Strong in model routing (4 providers, tiered fallback) and session management (SQLite, transfer). Zero tool execution capability.

---

## Phase 1: Agent Tool Framework + Agent Loop

**Goal**: Turn Flynn from a chatbot into an agent that can execute tools with multi-step reasoning.

### 1.1 Tool Definition System

```
src/tools/
├── types.ts          # Tool, ToolCall, ToolResult interfaces
├── registry.ts       # ToolRegistry: register, lookup, list, serialize for providers
├── executor.ts       # ToolExecutor: run tools, enforce hooks, timeout, truncation
├── builtin/
│   ├── shell.ts      # shell.exec - run bash commands (cwd, timeout, max output)
│   ├── file-read.ts  # file.read - read file contents (path, offset, limit)
│   ├── file-write.ts # file.write - write/create files (path, content)
│   ├── file-edit.ts  # file.edit - find-and-replace (path, oldString, newString)
│   ├── file-list.ts  # file.list - glob/list directory (pattern, path)
│   └── web-fetch.ts  # web.fetch - HTTP GET with markdown conversion (url, format)
```

**Tool interface**:
```typescript
interface Tool {
  name: string;                          // e.g. "shell.exec"
  description: string;                   // For the model's tool selection
  inputSchema: Record<string, unknown>;  // JSON Schema for parameters
  execute(args: unknown): Promise<ToolResult>;
}

interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

interface ToolCall {
  id: string;     // Provider-assigned call ID
  name: string;   // Tool name
  args: unknown;  // Parsed arguments
}
```

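**ToolRegistry**: Collects tools and serializes them to Anthropic format (`{ name, description, input_schema }`) or OpenAI format (`{ type: "function", function: { name, description, parameters } }`). Both provider APIs restrict tool names to letters, digits, underscores, and hyphens, so dotted names such as `shell.exec` need a provider-safe alias (for example `shell_exec`) at serialization time, with the reverse mapping applied when tool calls come back.
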
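The serialization step can be sketched as below. This is a minimal illustration rather than the planned implementation: the method names `toAnthropic`, `toOpenAI`, and `lookup`, and the helper `providerSafeName` (which swaps dots for underscores), are assumptions introduced for the example. Only the two output shapes come from the description above.

```typescript
// Tool and ToolResult are repeated from the interface above so the sketch stands alone.
interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: unknown): Promise<ToolResult>;
}

interface AnthropicToolDef {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

interface OpenAIToolDef {
  type: "function";
  function: { name: string; description: string; parameters: Record<string, unknown> };
}

// Assumption: dots become underscores so names pass provider-side validation.
function providerSafeName(name: string): string {
  return name.replace(/\./g, "_");
}

class ToolRegistry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Accepts the internal name ("shell.exec") or the provider alias ("shell_exec").
  lookup(name: string): Tool | undefined {
    const direct = this.tools.get(name);
    if (direct) return direct;
    return this.list().find((t) => providerSafeName(t.name) === name);
  }

  list(): Tool[] {
    return Array.from(this.tools.values());
  }

  toAnthropic(): AnthropicToolDef[] {
    return this.list().map((t): AnthropicToolDef => ({
      name: providerSafeName(t.name),
      description: t.description,
      input_schema: t.inputSchema,
    }));
  }

  toOpenAI(): OpenAIToolDef[] {
    return this.list().map((t): OpenAIToolDef => ({
      type: "function",
      function: {
        name: providerSafeName(t.name),
        description: t.description,
        parameters: t.inputSchema,
      },
    }));
  }
}
```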
**ToolExecutor**: Wraps execution with:
- Hook engine check (confirm/log/silent) before execution
- Configurable timeout (default 30s for shell, 10s for file ops)
- Output truncation (max 50KB, with "truncated" marker)
- Error capture (catches exceptions, returns as ToolResult.error)

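A rough sketch of the timeout, truncation, and error-capture wrapping follows. The function names (`withTimeout`, `truncateOutput`, `executeTool`) are hypothetical, the hook check is left out, and the 50KB limit is approximated as a character count rather than a byte count. The timeout only stops waiting; it does not cancel the underlying work, which a real shell tool would also need to do.

```typescript
interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

interface Tool {
  name: string;
  execute(args: unknown): Promise<ToolResult>;
}

// Assumption: the 50KB cap is applied as characters, not encoded bytes.
const MAX_OUTPUT_CHARS = 50 * 1024;
const TRUNCATION_MARKER = "\n[truncated]";

function truncateOutput(output: string, maxChars: number = MAX_OUTPUT_CHARS): string {
  if (output.length <= maxChars) return output;
  return output.slice(0, maxChars) + TRUNCATION_MARKER;
}

// Rejects if the promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Never throws: failures and timeouts come back as ToolResult.error.
async function executeTool(tool: Tool, args: unknown, timeoutMs: number): Promise<ToolResult> {
  try {
    const result = await withTimeout(tool.execute(args), timeoutMs);
    return { ...result, output: truncateOutput(result.output) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { success: false, output: "", error: message };
  }
}
```

Returning failures as values keeps the agent loop simple: every tool call yields a `ToolResult` that can be appended to the conversation, whether it succeeded or not.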
### 1.2 Model Provider Tool Support

Update `ChatRequest` to accept optional `tools: Tool[]`.

**Anthropic** (`src/models/anthropic.ts`):
- Pass tools as `tools` parameter to `messages.create()`/`messages.stream()`
- Parse `tool_use` content blocks from response (id, name, input)
- Accept `tool_result` content blocks in messages (tool_use_id, content)
- Handle `stop_reason: "tool_use"` to signal tool call response

**OpenAI** (`src/models/openai.ts`):
- Pass tools as `tools` parameter with `type: "function"` wrapper
- Parse `tool_calls` from response choices
- Accept `role: "tool"` messages with `tool_call_id`
- Handle `finish_reason: "tool_calls"` to signal tool call response

**Types updates** (`src/models/types.ts`):
- Add `ToolCall` to `ChatResponse`
- Add `tool_use` and `tool_result` message roles
- Add `tools` to `ChatRequest`
- Add `tool_calls` stream event type

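The two response formats can be normalized into the shared `ToolCall` shape roughly as follows. The types below cover only the fields the parsing needs and are hand-written for the example rather than imported from either SDK; `fromAnthropic`, `fromOpenAI`, and the `_raw` fallback for unparseable arguments are assumptions. The one real asymmetry is that Anthropic delivers `input` as an already-parsed object, while OpenAI delivers `function.arguments` as a JSON string.

```typescript
interface ToolCall {
  id: string;
  name: string;
  args: unknown;
}

// Subset of Anthropic response content blocks relevant to tool use.
type AnthropicBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

// Subset of an OpenAI chat completion tool call.
interface OpenAIToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

function fromAnthropic(content: AnthropicBlock[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const block of content) {
    if (block.type === "tool_use") {
      calls.push({ id: block.id, name: block.name, args: block.input });
    }
  }
  return calls;
}

// Assumption: malformed JSON is preserved under `_raw` instead of throwing,
// so the tool can report a validation error back to the model.
function parseArguments(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return { _raw: raw };
  }
}

function fromOpenAI(toolCalls: OpenAIToolCall[] | undefined): ToolCall[] {
  return (toolCalls ?? []).map((call) => ({
    id: call.id,
    name: call.function.name,
    args: parseArguments(call.function.arguments),
  }));
}
```

With both providers reduced to `ToolCall[]`, the agent loop does not need to know which provider produced the response.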
### 1.3 Agent Loop

Replace single-turn NativeAgent with iterative tool-use loop:

```
User message
  -> Model call (with tools in request)
    -> If response contains tool_use:
      -> For each tool call:
        -> Check hooks (confirm/log/silent)
        -> If confirm: wait for approval (Telegram inline keyboard / TUI prompt)
        -> Execute tool via ToolExecutor
        -> Collect ToolResult
      -> Append tool_results to conversation
      -> Loop back to model call
    -> If text response (no tool_use):
      -> Return final response to user
    -> If max iterations reached (default 10):
      -> Return partial response with warning
```

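The flow above can be sketched as a single function. This is an illustration under several assumptions: `callModel` and `runTool` are injected stand-ins for the model router and ToolExecutor, the hook and approval step is assumed to live inside `runTool`, and the `Message` shape and `isAborted` callback are invented for the example. The default of 10 iterations is taken from the flow.

```typescript
interface ToolCall {
  id: string;
  name: string;
  args: unknown;
}

interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

// Hypothetical conversation message shape for this sketch.
type Message =
  | { role: "user" | "assistant"; content: string; toolCalls?: ToolCall[] }
  | { role: "tool"; toolCallId: string; content: string };

interface ModelReply {
  text: string;
  toolCalls: ToolCall[];
}

type CallModel = (messages: Message[]) => Promise<ModelReply>;
type RunTool = (call: ToolCall) => Promise<ToolResult>;

interface LoopOptions {
  maxIterations?: number;
  isAborted?: () => boolean;
}

interface LoopOutcome {
  text: string;
  warning?: string;
}

async function runAgentLoop(
  userText: string,
  callModel: CallModel,
  runTool: RunTool,
  options: LoopOptions = {},
): Promise<LoopOutcome> {
  const maxIterations = options.maxIterations ?? 10;
  const messages: Message[] = [{ role: "user", content: userText }];
  let lastText = "";

  for (let i = 0; i < maxIterations; i++) {
    // Abort flag is checked before each iteration and before each tool run.
    if (options.isAborted?.()) return { text: lastText, warning: "Aborted" };

    const reply = await callModel(messages);
    lastText = reply.text;

    // Plain text with no tool calls is the final answer.
    if (reply.toolCalls.length === 0) return { text: reply.text };

    messages.push({ role: "assistant", content: reply.text, toolCalls: reply.toolCalls });
    for (const call of reply.toolCalls) {
      if (options.isAborted?.()) return { text: lastText, warning: "Aborted" };
      const result = await runTool(call);
      const content = result.success ? result.output : `Error: ${result.error ?? "unknown"}`;
      messages.push({ role: "tool", toolCallId: call.id, content });
    }
  }

  return { text: lastText, warning: `Stopped after ${maxIterations} iterations` };
}
```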
**Streaming during tool execution**: Emit status events:
- `{ event: "tool_start", tool: "shell.exec", args: { command: "ls" } }`
- `{ event: "tool_end", tool: "shell.exec", result: { success: true, output: "..." } }`
- These render as status lines in TUI and Telegram

**Abort support**:
- TUI: Escape key sets abort flag, checked before each tool execution
- Telegram: `/cancel` command sets abort flag on active session
- Agent loop checks abort flag before each iteration

### 1.4 Frontend Updates

**Telegram**:
- Show tool execution as status messages ("Running shell.exec: `ls -la`...")
- Confirmation buttons already exist, wire to tool executor
- Show final response after tool loop completes

**TUI (both modes)**:
- Show tool calls inline with dimmed formatting
- Show tool results with output (truncated in UI if long)
- Streaming text interleaved with tool status
- Escape aborts the agent loop

### 1.5 Deliverables

- [ ] `src/tools/types.ts` - Tool, ToolCall, ToolResult interfaces
- [ ] `src/tools/registry.ts` - ToolRegistry with provider serialization
- [ ] `src/tools/executor.ts` - ToolExecutor with hooks, timeout, truncation
- [ ] `src/tools/builtin/shell.ts` - Shell exec tool
- [ ] `src/tools/builtin/file-read.ts` - File read tool
- [ ] `src/tools/builtin/file-write.ts` - File write tool
- [ ] `src/tools/builtin/file-edit.ts` - File edit tool
- [ ] `src/tools/builtin/file-list.ts` - File list/glob tool
- [ ] `src/tools/builtin/web-fetch.ts` - Web fetch tool
- [ ] `src/models/types.ts` - Add tool-related types
- [ ] `src/models/anthropic.ts` - Add tool use support
- [ ] `src/models/openai.ts` - Add tool use support
- [ ] `src/backends/native/agent.ts` - Agent loop with tool execution
- [ ] `src/frontends/telegram/bot.ts` - Tool status messages
- [ ] `src/frontends/tui/minimal.ts` - Tool display in minimal TUI
- [ ] `src/frontends/tui/components/App.tsx` - Tool display in fullscreen TUI
- [ ] Tests for all new modules

---

## Phase 2: WebSocket Gateway

**Goal**: Central control plane that multiple clients connect to. Decouples frontends from the agent.

### 2.1 Gateway Core

```
src/gateway/
├── server.ts          # WebSocket server (ws library), configurable port
├── protocol.ts        # JSON-RPC-style message types
├── router.ts          # Routes methods to handlers
├── auth.ts            # Token + password auth, Tailscale identity headers
├── session-bridge.ts  # Maps WS client connections to sessions
└── handlers/
    ├── agent.ts       # agent.send, agent.cancel, agent.status
    ├── sessions.ts    # sessions.list, sessions.history, sessions.send
    ├── tools.ts       # tools.list, tools.invoke
    ├── config.ts      # config.get, config.patch
    └── system.ts      # system.health, system.restart
```

### 2.2 Protocol

JSON-RPC-like over WebSocket:

```
Request:  { id: number, method: string, params: object }
Response: { id: number, result: object }
Error:    { id: number, error: { code: number, message: string } }
Event:    { id: number, event: string, data: object }  // streaming
```

Event types for agent.send streaming:
- `content` - text chunk from model
- `tool_start` - tool execution beginning
- `tool_end` - tool execution complete
- `thinking` - model reasoning (if exposed)
- `done` - final response

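The message shapes above translate fairly directly into types plus a small dispatcher. The sketch below makes a few assumptions: the names `RpcRequest`, `parseRequest`, and `dispatch` are invented, the error codes are borrowed from JSON-RPC 2.0 (`-32600` invalid request, `-32601` method not found, `-32603` internal error), and an error for an unparseable frame carries `id: null` because no request ID can be recovered. Streaming events are left out.

```typescript
interface RpcRequest {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

interface RpcResponse {
  id: number;
  result: object;
}

interface RpcError {
  id: number | null; // null when the request ID could not be read (assumption)
  error: { code: number; message: string };
}

type Handler = (params: Record<string, unknown>) => Promise<object>;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns null for anything that is not a well-formed request frame.
function parseRequest(raw: string): RpcRequest | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!isRecord(parsed)) return null;
  const { id, method, params } = parsed;
  if (typeof id !== "number" || typeof method !== "string") return null;
  return { id, method, params: isRecord(params) ? params : {} };
}

async function dispatch(
  raw: string,
  handlers: Map<string, Handler>,
): Promise<RpcResponse | RpcError> {
  const request = parseRequest(raw);
  if (!request) {
    return { id: null, error: { code: -32600, message: "Invalid request" } };
  }
  const handler = handlers.get(request.method);
  if (!handler) {
    return { id: request.id, error: { code: -32601, message: `Method not found: ${request.method}` } };
  }
  try {
    return { id: request.id, result: await handler(request.params) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { id: request.id, error: { code: -32603, message } };
  }
}
```

A router built this way stays transport-agnostic: the WebSocket server only has to pass each text frame to `dispatch` and send back the JSON-encoded result.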
### 2.3 Control UI

Minimal web dashboard served from gateway port:

```
src/gateway/ui/
├── index.html  # Dashboard: sessions, model info, config
├── chat.html   # WebChat: WS-connected chat interface
└── assets/     # CSS + minimal JS (no framework, or Preact)
```

### 2.4 Daemon Refactor

Gateway becomes the hub:
- Owns session manager, tool registry, model router
- Telegram adapter connects as a gateway client (in-process bridge)
- TUI connects as a gateway client (in-process or WS)
- WebChat connects via WS from browser

### 2.5 Deliverables

- [ ] `src/gateway/server.ts` - WebSocket server
- [ ] `src/gateway/protocol.ts` - Message type definitions
- [ ] `src/gateway/router.ts` - Method routing
- [ ] `src/gateway/auth.ts` - Authentication
- [ ] `src/gateway/session-bridge.ts` - Client-to-session mapping
- [ ] Gateway handlers (agent, sessions, tools, config, system)
- [ ] `src/gateway/ui/` - Control UI + WebChat
- [ ] Refactor daemon to use gateway as hub
- [ ] Refactor Telegram as gateway client
- [ ] Tests for gateway protocol and handlers

---

## Phase 3: Channel Adapters

**Goal**: Multi-channel inbox. One assistant accessible from WhatsApp, Telegram, Discord, Slack, and WebChat.

### 3.1 Channel Adapter Interface

```
src/channels/
├── types.ts     # ChannelAdapter interface
├── registry.ts  # ChannelRegistry: load/unload at runtime
├── telegram/    # Refactored from src/frontends/telegram/
├── discord/     # discord.js
├── whatsapp/    # Baileys
├── slack/       # Bolt (Socket Mode)
└── webchat/     # Gateway WS built-in
```

**ChannelAdapter interface**:
```typescript
interface ChannelAdapter {
  name: string;
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(peerId: string, message: OutboundMessage): Promise<void>;
  onMessage(handler: (msg: InboundMessage) => void): void;
}
```

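To make the interface concrete, here is a small in-memory adapter and a registry that routes inbound messages to a reply function. Everything beyond the `ChannelAdapter` interface itself is an assumption for illustration: the fields of `InboundMessage` and `OutboundMessage` (which the interface references but this document does not define), the `LoopbackAdapter` class, and the registry's `load`, `unload`, and `names` methods.

```typescript
// Hypothetical message shapes; the real types may carry more fields.
interface InboundMessage {
  channel: string;
  peerId: string;
  text: string;
}

interface OutboundMessage {
  text: string;
}

interface ChannelAdapter {
  name: string;
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(peerId: string, message: OutboundMessage): Promise<void>;
  onMessage(handler: (msg: InboundMessage) => void): void;
}

// In-memory adapter: records outbound messages instead of talking to a network.
class LoopbackAdapter implements ChannelAdapter {
  readonly name = "loopback";
  readonly sent: Array<{ peerId: string; message: OutboundMessage }> = [];
  private handler: ((msg: InboundMessage) => void) | undefined = undefined;

  async connect(): Promise<void> {}

  async disconnect(): Promise<void> {
    this.handler = undefined;
  }

  async send(peerId: string, message: OutboundMessage): Promise<void> {
    this.sent.push({ peerId, message });
  }

  onMessage(handler: (msg: InboundMessage) => void): void {
    this.handler = handler;
  }

  // Test hook standing in for a message arriving from the platform.
  simulateInbound(peerId: string, text: string): void {
    this.handler?.({ channel: this.name, peerId, text });
  }
}

type ReplyFn = (msg: InboundMessage) => Promise<OutboundMessage>;

class ChannelRegistry {
  private readonly adapters = new Map<string, ChannelAdapter>();
  private readonly reply: ReplyFn;

  constructor(reply: ReplyFn) {
    this.reply = reply;
  }

  async load(adapter: ChannelAdapter): Promise<void> {
    adapter.onMessage((msg) => {
      this.route(adapter, msg).catch((err) => console.error(`[${adapter.name}] routing failed`, err));
    });
    await adapter.connect();
    this.adapters.set(adapter.name, adapter);
  }

  async unload(name: string): Promise<void> {
    const adapter = this.adapters.get(name);
    if (!adapter) return;
    await adapter.disconnect();
    this.adapters.delete(name);
  }

  names(): string[] {
    return Array.from(this.adapters.keys());
  }

  // Replies go back through the adapter the message arrived on.
  private async route(adapter: ChannelAdapter, msg: InboundMessage): Promise<void> {
    const out = await this.reply(msg);
    await adapter.send(msg.peerId, out);
  }
}
```

An adapter like `LoopbackAdapter` is also a convenient test double: registry and routing logic can be exercised without any platform SDK.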
### 3.2 Build Order

1. Telegram (refactor existing)
2. Discord (discord.js)
3. WhatsApp (Baileys)
4. Slack (Bolt Socket Mode)
5. WebChat (gateway built-in)

### 3.3 Security Per Channel

Every adapter implements:
- DM pairing: unknown senders get pairing code, must be approved via CLI/UI
- Allowlists: `channels.<name>.allowFrom` config array
- Group mention gating: `channels.<name>.groups.*.requireMention`
- Rate limiting: per-sender throttle (configurable)
- Message size limits per channel

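One way these checks could compose is a single gate that every inbound message passes through before it reaches the agent. The sketch below is an assumption-heavy illustration: `SenderGate`, the config fields `maxPerMinute` and `maxMessageChars`, and the verdict reasons are all invented names. Only `allowFrom` and `requireMention` mirror config keys listed above. Senders that are not on the allowlist get a `needs_pairing` verdict, which is where the pairing-code flow would start.

```typescript
interface ChannelSecurityConfig {
  allowFrom: string[];      // mirrors channels.<name>.allowFrom
  requireMention: boolean;  // mirrors channels.<name>.groups.*.requireMention
  maxPerMinute: number;     // hypothetical per-sender throttle
  maxMessageChars: number;  // hypothetical size limit
}

interface IncomingMeta {
  senderId: string;
  text: string;
  isGroup: boolean;
  mentionsBot: boolean;
}

type Verdict =
  | { allowed: true }
  | { allowed: false; reason: "needs_pairing" | "no_mention" | "too_long" | "rate_limited" };

class SenderGate {
  private readonly config: ChannelSecurityConfig;
  private readonly allow: Set<string>;
  // Timestamps (ms) of recently accepted messages, keyed by sender.
  private readonly recent = new Map<string, number[]>();

  constructor(config: ChannelSecurityConfig) {
    this.config = config;
    this.allow = new Set(config.allowFrom);
  }

  check(msg: IncomingMeta, now: number = Date.now()): Verdict {
    if (!this.allow.has(msg.senderId)) {
      return { allowed: false, reason: "needs_pairing" };
    }
    if (msg.isGroup && this.config.requireMention && !msg.mentionsBot) {
      return { allowed: false, reason: "no_mention" };
    }
    if (msg.text.length > this.config.maxMessageChars) {
      return { allowed: false, reason: "too_long" };
    }

    // Sliding one-minute window: keep only timestamps newer than the cutoff.
    const windowStart = now - 60000;
    const timestamps = (this.recent.get(msg.senderId) ?? []).filter((t) => t > windowStart);
    if (timestamps.length >= this.config.maxPerMinute) {
      this.recent.set(msg.senderId, timestamps);
      return { allowed: false, reason: "rate_limited" };
    }
    timestamps.push(now);
    this.recent.set(msg.senderId, timestamps);
    return { allowed: true };
  }
}
```

Taking `now` as a parameter keeps the rate limiter deterministic in tests; production code would rely on the `Date.now()` default.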
### 3.4 Deliverables

- [ ] `src/channels/types.ts` - ChannelAdapter interface
- [ ] `src/channels/registry.ts` - Channel registry
- [ ] `src/channels/telegram/` - Refactored Telegram adapter
- [ ] `src/channels/discord/` - Discord adapter
- [ ] `src/channels/whatsapp/` - WhatsApp adapter
- [ ] `src/channels/slack/` - Slack adapter
- [ ] `src/channels/webchat/` - WebChat adapter
- [ ] DM pairing system
- [ ] Per-channel security (allowlists, mention gating, rate limiting)
- [ ] Tests per adapter

---

## Phase 4: Skills + MCP

**Goal**: Extensible capability system with community skills and MCP tool servers.

### 4.1 Skills System

```
src/skills/
├── types.ts      # Skill, SkillManifest interfaces
├── loader.ts     # Load SKILL.md + scripts from skill directories
├── registry.ts   # Discovery, gating (OS/bin/env checks)
└── installer.ts  # Auto-install dependencies (with user confirmation)
```

Skill directory structure:
```
~/.flynn/workspace/skills/<name>/
├── SKILL.md       # Instructions injected into system prompt
├── manifest.json  # Requirements, permissions, dependencies
└── scripts/       # Executable helpers
```

Three tiers: bundled (shipped with Flynn), managed (installed via CLI), workspace (user-created).

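The OS/bin/env gating mentioned for `registry.ts` could look roughly like the check below. The manifest fields (`os`, `requiredBins`, `requiredEnv`) are hypothetical, since the contents of `manifest.json` are not specified here, and host facts are passed in through a `HostInfo` object so the function stays pure and easy to test.

```typescript
// Hypothetical manifest fields; the actual manifest.json schema is not defined here.
interface SkillManifest {
  name: string;
  os?: string[];            // e.g. ["linux", "darwin"]; omitted means any OS
  requiredBins?: string[];  // executables that must be on PATH
  requiredEnv?: string[];   // environment variables that must be set
}

// Host facts are injected rather than read from globals.
interface HostInfo {
  platform: string;
  env: Record<string, string | undefined>;
  hasBinary(name: string): boolean;
}

interface GateResult {
  eligible: boolean;
  missing: string[];
}

function checkSkillRequirements(manifest: SkillManifest, host: HostInfo): GateResult {
  const missing: string[] = [];

  if (manifest.os && manifest.os.length > 0 && manifest.os.indexOf(host.platform) === -1) {
    missing.push(`os:${host.platform}`);
  }
  for (const bin of manifest.requiredBins ?? []) {
    if (!host.hasBinary(bin)) missing.push(`bin:${bin}`);
  }
  for (const key of manifest.requiredEnv ?? []) {
    if (!host.env[key]) missing.push(`env:${key}`);
  }

  return { eligible: missing.length === 0, missing };
}
```

Returning the list of unmet requirements, rather than a bare boolean, gives the installer and the doctor command something specific to report.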
### 4.2 MCP Integration

```
src/mcp/
├── client.ts   # MCP client (stdio transport)
├── bridge.ts   # Convert MCP tools -> Flynn tool registry entries
└── manager.ts  # Lifecycle: start/stop/restart MCP servers per config
```

Wire the existing `mcp.servers` config to actually start MCP server processes, discover their tools, and register them in the tool registry. MCP tools appear alongside builtins -- the agent doesn't know the difference.

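The bridge step can be sketched without committing to a particular MCP client library. In the example below, `McpToolInfo` and `McpCallResult` are simplified versions of what MCP's tool listing and tool call return, `callTool` is a hypothetical function supplied by whatever client ends up in `client.ts`, and prefixing tool names with the server name is an assumption made here to avoid collisions with builtins.

```typescript
interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: unknown): Promise<ToolResult>;
}

// Simplified view of a tool as listed by an MCP server.
interface McpToolInfo {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
}

// Simplified view of an MCP tool call result.
interface McpCallResult {
  content: Array<{ type: string; text?: string }>;
  isError?: boolean;
}

// Hypothetical: provided by the MCP client for one connected server.
type McpCallTool = (name: string, args: unknown) => Promise<McpCallResult>;

function bridgeMcpTools(serverName: string, tools: McpToolInfo[], callTool: McpCallTool): Tool[] {
  return tools.map((info): Tool => ({
    // Assumption: namespacing by server name keeps MCP tools distinct from builtins.
    name: `${serverName}.${info.name}`,
    description: info.description ?? "",
    inputSchema: info.inputSchema,
    async execute(args: unknown): Promise<ToolResult> {
      const result = await callTool(info.name, args);
      // Only text parts are kept; other content types are dropped in this sketch.
      const text = result.content
        .filter((part) => part.type === "text" && typeof part.text === "string")
        .map((part) => part.text ?? "")
        .join("\n");
      return result.isError
        ? { success: false, output: "", error: text }
        : { success: true, output: text };
    },
  }));
}
```

Because the bridged objects satisfy the same `Tool` interface as builtins, they can be registered in the tool registry unchanged, which is what lets the agent treat both kinds identically.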
### 4.3 Deliverables

- [ ] Skills types, loader, registry, installer
- [ ] MCP client, bridge, manager
- [ ] Wire MCP config to runtime
- [ ] Bundled skills (at least: web-search, git, system-info)
- [ ] Tests

---

## Phase 5: Advanced Features

Build after core is solid. Each is independent.

| Feature | Module | Description |
|---------|--------|-------------|
| Cron/scheduling | `src/automation/cron.ts` | Cron expressions trigger agent messages |
| Webhooks | `src/automation/webhooks.ts` | Inbound HTTP triggers |
| Browser tool | `src/tools/builtin/browser.ts` | Playwright headless browser |
| Agent-to-agent | `sessions_list/history/send` tools | Multi-agent coordination |
| Sub-agents | `src/backends/native/subagent.ts` | Spawn scoped child sessions |
| Sandboxing | `src/sandbox/` | Docker per-session isolation |
| Voice | `src/voice/` | STT (Whisper) + TTS (ElevenLabs) |
| Canvas/A2UI | `src/gateway/ui/canvas/` | Agent-driven web workspace |
| CLI surface | `src/cli/` | `flynn gateway`, `flynn agent`, `flynn send`, etc. |
| Mobile nodes | Separate repos | iOS (Swift) + Android (Kotlin) companion apps |
| Onboarding wizard | `src/cli/onboard.ts` | Guided setup flow |
| Doctor diagnostics | `src/cli/doctor.ts` | Config + health validation |

---

## Implementation Notes

### Model Selection for Subagents

Use cheaper/faster models via GitHub Copilot for implementation:
- **Sonnet**: Complex implementation tasks (agent loop, gateway, protocol)
- **Haiku**: Mechanical tasks (individual tool implementations, adapter boilerplate, tests)
- **Opus**: Design review, architecture decisions only

### Testing Strategy

- Unit tests for all tools (mock filesystem/network)
- Unit tests for tool registry serialization (Anthropic + OpenAI formats)
- Integration tests for agent loop (mock model returns tool_use, verify execution)
- Integration tests for gateway protocol (WS client/server)
- E2E tests for tool execution (real shell commands in temp dirs)

### Migration Path

- Phase 1 is additive (no breaking changes to existing code)
- Phase 2 refactors daemon (breaking, but internal only)
- Phase 3 moves Telegram (file rename, adapter interface)
- Each phase is a separate feature branch

---

*Design Version: 1.0*
*Created: 2026-02-05*
*Approach: Bottom-up (tools -> gateway -> channels -> skills -> advanced)*