feat: add channel adapter abstraction with Telegram and WebChat adapters

Implement Phase 3 channel adapters that decouple message sources from
the agent via a uniform ChannelAdapter interface and ChannelRegistry.

- Add ChannelAdapter/InboundMessage/OutboundMessage types
- Add ChannelRegistry for adapter lifecycle and message routing
- Add TelegramAdapter (grammy bot, auth middleware, confirmations, chunking)
- Add WebChatAdapter (thin shim over GatewayServer)
- Refactor daemon to use ChannelRegistry with per-channel-per-user agents
- Add config.get/config.patch gateway handlers (Phase 2 loose end)
- Add system.restart gateway handler (Phase 2 loose end)
- Add implementation plans and design docs

Tests: 225 passing (33 new channel adapter + gateway handler tests)
William Valentin
2026-02-05 20:00:36 -08:00
parent 282a15d2b9
commit aa95f2132c
19 changed files with 4123 additions and 37 deletions
@@ -0,0 +1,378 @@
# OpenClaw Parity Design: Flynn Feature Roadmap
## Overview
Plan to evolve Flynn from a multi-model chat wrapper into a full self-hosted OpenClaw alternative. Bottom-up approach: tools first, then gateway, then channels, then skills/advanced features.
**Build approach**: Use Sonnet/Haiku via GitHub Copilot for implementation subagents.
**Current state**: Flynn v0.1.0 covers ~22% of OpenClaw features. Strong in model routing (4 providers, tiered fallback) and session management (SQLite, transfer). Zero tool execution capability.
---
## Phase 1: Agent Tool Framework + Agent Loop
**Goal**: Turn Flynn from a chatbot into an agent that can execute tools with multi-step reasoning.
### 1.1 Tool Definition System
```
src/tools/
├── types.ts # Tool, ToolCall, ToolResult interfaces
├── registry.ts # ToolRegistry: register, lookup, list, serialize for providers
├── executor.ts # ToolExecutor: run tools, enforce hooks, timeout, truncation
├── builtin/
│   ├── shell.ts # shell.exec - run bash commands (cwd, timeout, max output)
│   ├── file-read.ts # file.read - read file contents (path, offset, limit)
│   ├── file-write.ts # file.write - write/create files (path, content)
│   ├── file-edit.ts # file.edit - find-and-replace (path, oldString, newString)
│   ├── file-list.ts # file.list - glob/list directory (pattern, path)
│   └── web-fetch.ts # web.fetch - HTTP GET with markdown conversion (url, format)
```
**Tool interface**:
```typescript
interface Tool {
  name: string; // e.g. "shell.exec"
  description: string; // For the model's tool selection
  inputSchema: Record<string, unknown>; // JSON Schema for parameters
  execute(args: unknown): Promise<ToolResult>;
}

interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

interface ToolCall {
  id: string; // Provider-assigned call ID
  name: string; // Tool name
  args: unknown; // Parsed arguments
}
```
**ToolRegistry**: Collects tools, serializes to Anthropic format (`{ name, description, input_schema }`) or OpenAI format (`{ type: "function", function: { name, description, parameters } }`).
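The two serialization targets named above can be sketched as plain mapping functions. This is a minimal illustration, not the registry's actual code: `toAnthropicFormat`, `toOpenAIFormat`, and the `demo.echo` tool are hypothetical names, and the output shapes are taken only from the formats quoted in the paragraph above.

```typescript
// Sketch only: function names and the demo tool are assumptions for illustration.
interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: unknown): Promise<ToolResult>;
}

type AnthropicTool = {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
};

type OpenAITool = {
  type: "function";
  function: { name: string; description: string; parameters: Record<string, unknown> };
};

// Anthropic format: { name, description, input_schema }
function toAnthropicFormat(tools: Tool[]): AnthropicTool[] {
  return tools.map((t) => ({
    name: t.name,
    description: t.description,
    input_schema: t.inputSchema,
  }));
}

// OpenAI format: { type: "function", function: { name, description, parameters } }
function toOpenAIFormat(tools: Tool[]): OpenAITool[] {
  return tools.map((t) => ({
    type: "function" as const,
    function: { name: t.name, description: t.description, parameters: t.inputSchema },
  }));
}

// Hypothetical tool used only to exercise the two serializers.
const echoTool: Tool = {
  name: "demo.echo",
  description: "Echo the input back",
  inputSchema: { type: "object", properties: { text: { type: "string" } } },
  execute: async (args) => ({ success: true, output: JSON.stringify(args) }),
};

const anthropicTools = toAnthropicFormat([echoTool]);
const openAITools = toOpenAIFormat([echoTool]);
```

One thing worth checking against the provider docs before relying on this: both providers appear to restrict tool names to letters, digits, underscores, and dashes, so dotted names such as `shell.exec` may need to be mapped to a provider-safe form and back.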
**ToolExecutor**: Wraps execution with:
- Hook engine check (confirm/log/silent) before execution
- Configurable timeout (default 30s for shell, 10s for file ops)
- Output truncation (max 50KB, with "truncated" marker)
- Error capture (catches exceptions, returns as ToolResult.error)
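A rough sketch of the timeout, truncation, and error-capture wrapping listed above. The hook check is left out, `runTool` and `truncateOutput` are hypothetical names, and the 50KB limit is approximated in characters rather than bytes.

```typescript
// Sketch only: names are assumptions; the limit is counted in characters, not bytes.
interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

interface Tool {
  name: string;
  execute(args: unknown): Promise<ToolResult>;
}

const MAX_OUTPUT_CHARS = 50 * 1024;

function truncateOutput(output: string, maxChars: number = MAX_OUTPUT_CHARS): string {
  return output.length <= maxChars ? output : `${output.slice(0, maxChars)}\n[truncated]`;
}

async function runTool(tool: Tool, args: unknown, timeoutMs: number = 30_000): Promise<ToolResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  // Rejects after timeoutMs. Note that this does not cancel the underlying work;
  // it only stops waiting for it.
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${tool.name} timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  try {
    const result = await Promise.race([tool.execute(args), timeout]);
    return { ...result, output: truncateOutput(result.output) };
  } catch (err) {
    // Exceptions and timeouts are reported as a failed ToolResult instead of being rethrown.
    return { success: false, output: "", error: err instanceof Error ? err.message : String(err) };
  } finally {
    clearTimeout(timer);
  }
}
```

Because the race only abandons the pending promise, a real shell tool would still need its own process-level kill on timeout.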
### 1.2 Model Provider Tool Support
Update `ChatRequest` to accept optional `tools: Tool[]`.
**Anthropic** (`src/models/anthropic.ts`):
- Pass tools as `tools` parameter to `messages.create()`/`messages.stream()`
- Parse `tool_use` content blocks from response (id, name, input)
- Accept `tool_result` content blocks in messages (tool_use_id, content)
- Handle `stop_reason: "tool_use"` to signal tool call response
**OpenAI** (`src/models/openai.ts`):
- Pass tools as `tools` parameter with `type: "function"` wrapper
- Parse `tool_calls` from response choices
- Accept `role: "tool"` messages with `tool_call_id`
- Handle `finish_reason: "tool_calls"` to signal tool call response
**Types updates** (`src/models/types.ts`):
- Add `ToolCall` to `ChatResponse`
- Add `tool_use` and `tool_result` message roles
- Add `tools` to `ChatRequest`
- Add `tool_calls` stream event type
### 1.3 Agent Loop
Replace single-turn NativeAgent with iterative tool-use loop:
```
User message
  -> Model call (with tools in request)
    -> If response contains tool_use:
      -> For each tool call:
        -> Check hooks (confirm/log/silent)
        -> If confirm: wait for approval (Telegram inline keyboard / TUI prompt)
        -> Execute tool via ToolExecutor
        -> Collect ToolResult
      -> Append tool_results to conversation
      -> Loop back to model call
    -> If text response (no tool_use):
      -> Return final response to user
    -> If max iterations reached (default 10):
      -> Return partial response with warning
```
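The loop above can be sketched with the model and the tool executor injected as functions, which also makes it easy to test with a scripted model. Everything here is illustrative: `runAgentLoop`, `ModelTurn`, and the `Message` shape are hypothetical, hook confirmation is assumed to happen inside the injected `runTool`, and the abort check mirrors the abort flag described in this section.

```typescript
// Sketch only: all names below are assumptions, not the NativeAgent API.
interface ToolCall { id: string; name: string; args: unknown; }
interface ToolResult { success: boolean; output: string; error?: string; }
interface ModelTurn { text: string; toolCalls: ToolCall[]; }

type Message =
  | { role: "user"; text: string }
  | { role: "assistant"; text: string; toolCalls: ToolCall[] }
  | { role: "tool"; callId: string; result: ToolResult };

type CallModel = (history: Message[]) => Promise<ModelTurn>;
type RunTool = (call: ToolCall) => Promise<ToolResult>;

async function runAgentLoop(
  userText: string,
  callModel: CallModel,
  runTool: RunTool,
  options: { maxIterations?: number; isAborted?: () => boolean } = {},
): Promise<{ text: string; warning?: string }> {
  const { maxIterations = 10, isAborted = () => false } = options;
  const history: Message[] = [{ role: "user", text: userText }];
  let lastText = "";

  for (let i = 0; i < maxIterations; i++) {
    if (isAborted()) return { text: lastText, warning: "aborted" };

    const turn = await callModel(history);
    lastText = turn.text;
    history.push({ role: "assistant", text: turn.text, toolCalls: turn.toolCalls });

    // A plain text response ends the loop.
    if (turn.toolCalls.length === 0) return { text: turn.text };

    // Otherwise run each requested tool and feed the results back to the model.
    for (const call of turn.toolCalls) {
      if (isAborted()) return { text: lastText, warning: "aborted" };
      const result = await runTool(call);
      history.push({ role: "tool", callId: call.id, result });
    }
  }
  return { text: lastText, warning: `max iterations (${maxIterations}) reached` };
}
```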
**Streaming during tool execution**: Emit status events:
- `{ event: "tool_start", tool: "shell.exec", args: { command: "ls" } }`
- `{ event: "tool_end", tool: "shell.exec", result: { success: true, output: "..." } }`
- These render as status lines in TUI and Telegram
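As a small illustration of how these events might become status lines, the sketch below formats the two event shapes shown above. `formatToolStatus` and `preview` are hypothetical helpers, and the exact wording of the status line is an assumption.

```typescript
// Sketch only: event shapes follow the two examples above; helper names are assumptions.
type ToolStatus =
  | { event: "tool_start"; tool: string; args: Record<string, unknown> }
  | { event: "tool_end"; tool: string; result: { success: boolean; output: string; error?: string } };

// Shorten long values so a status line stays readable in a chat or terminal.
function preview(text: string, maxChars: number): string {
  return text.length <= maxChars ? text : `${text.slice(0, maxChars)}...`;
}

function formatToolStatus(status: ToolStatus, maxChars: number = 80): string {
  if (status.event === "tool_start") {
    return `Running ${status.tool}: ${preview(JSON.stringify(status.args), maxChars)}`;
  }
  const { success, output, error } = status.result;
  return success
    ? `${status.tool} finished: ${preview(output, maxChars)}`
    : `${status.tool} failed: ${preview(error ?? "unknown error", maxChars)}`;
}
```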
**Abort support**:
- TUI: Escape key sets abort flag, checked before each tool execution
- Telegram: `/cancel` command sets abort flag on active session
- Agent loop checks abort flag before each iteration
### 1.4 Frontend Updates
**Telegram**:
- Show tool execution as status messages ("Running shell.exec: `ls -la`...")
- Confirmation buttons already exist, wire to tool executor
- Show final response after tool loop completes
**TUI (both modes)**:
- Show tool calls inline with dimmed formatting
- Show tool results with output (truncated in UI if long)
- Streaming text interleaved with tool status
- Escape aborts the agent loop
### 1.5 Deliverables
- [ ] `src/tools/types.ts` - Tool, ToolCall, ToolResult interfaces
- [ ] `src/tools/registry.ts` - ToolRegistry with provider serialization
- [ ] `src/tools/executor.ts` - ToolExecutor with hooks, timeout, truncation
- [ ] `src/tools/builtin/shell.ts` - Shell exec tool
- [ ] `src/tools/builtin/file-read.ts` - File read tool
- [ ] `src/tools/builtin/file-write.ts` - File write tool
- [ ] `src/tools/builtin/file-edit.ts` - File edit tool
- [ ] `src/tools/builtin/file-list.ts` - File list/glob tool
- [ ] `src/tools/builtin/web-fetch.ts` - Web fetch tool
- [ ] `src/models/types.ts` - Add tool-related types
- [ ] `src/models/anthropic.ts` - Add tool use support
- [ ] `src/models/openai.ts` - Add tool use support
- [ ] `src/backends/native/agent.ts` - Agent loop with tool execution
- [ ] `src/frontends/telegram/bot.ts` - Tool status messages
- [ ] `src/frontends/tui/minimal.ts` - Tool display in minimal TUI
- [ ] `src/frontends/tui/components/App.tsx` - Tool display in fullscreen TUI
- [ ] Tests for all new modules
---
## Phase 2: WebSocket Gateway
**Goal**: Central control plane that multiple clients connect to. Decouples frontends from the agent.
### 2.1 Gateway Core
```
src/gateway/
├── server.ts # WebSocket server (ws library), configurable port
├── protocol.ts # JSON-RPC-style message types
├── router.ts # Routes methods to handlers
├── auth.ts # Token + password auth, Tailscale identity headers
├── session-bridge.ts # Maps WS client connections to sessions
└── handlers/
    ├── agent.ts # agent.send, agent.cancel, agent.status
    ├── sessions.ts # sessions.list, sessions.history, sessions.send
    ├── tools.ts # tools.list, tools.invoke
    ├── config.ts # config.get, config.patch
    └── system.ts # system.health, system.restart
```
### 2.2 Protocol
JSON-RPC-like over WebSocket:
```
Request: { id: number, method: string, params: object }
Response: { id: number, result: object }
Error: { id: number, error: { code: number, message: string } }
Event: { id: number, event: string, data: object } // streaming
```
Event types for agent.send streaming:
- `content` - text chunk from model
- `tool_start` - tool execution beginning
- `tool_end` - tool execution complete
- `thinking` - model reasoning (if exposed)
- `done` - final response
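The four message shapes can be written down as types with a small parser and classifier. This is a sketch under assumptions: the interface names, `parseRequest`, and `kindOf` are hypothetical, and malformed input is simply rejected with `null` because the plan does not specify error codes.

```typescript
// Sketch only: type and function names are assumptions based on the shapes above.
interface RpcRequest { id: number; method: string; params: Record<string, unknown>; }
interface RpcResponse { id: number; result: Record<string, unknown>; }
interface RpcError { id: number; error: { code: number; message: string }; }
interface RpcEvent { id: number; event: string; data: Record<string, unknown>; }

type ServerMessage = RpcResponse | RpcError | RpcEvent;

// Parse and validate an incoming request frame; returns null for anything malformed.
function parseRequest(raw: string): RpcRequest | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const v = value as Record<string, unknown>;
  const id = v["id"];
  const method = v["method"];
  const params = v["params"];
  if (typeof id !== "number" || typeof method !== "string") return null;
  if (typeof params !== "object" || params === null || Array.isArray(params)) return null;
  return { id, method, params: params as Record<string, unknown> };
}

// Client side: tell responses, errors, and streaming events apart by their distinguishing key.
function kindOf(msg: ServerMessage): "response" | "error" | "event" {
  if ("error" in msg) return "error";
  if ("event" in msg) return "event";
  return "response";
}
```

Since events carry the `id` of the request that started them, a client can route `content`, `tool_start`, `tool_end`, and `done` events back to the right pending `agent.send` call.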
### 2.3 Control UI
Minimal web dashboard served from gateway port:
```
src/gateway/ui/
├── index.html # Dashboard: sessions, model info, config
├── chat.html # WebChat: WS-connected chat interface
└── assets/ # CSS + minimal JS (no framework, or Preact)
```
### 2.4 Daemon Refactor
Gateway becomes the hub:
- Owns session manager, tool registry, model router
- Telegram adapter connects as a gateway client (in-process bridge)
- TUI connects as a gateway client (in-process or WS)
- WebChat connects via WS from browser
### 2.5 Deliverables
- [ ] `src/gateway/server.ts` - WebSocket server
- [ ] `src/gateway/protocol.ts` - Message type definitions
- [ ] `src/gateway/router.ts` - Method routing
- [ ] `src/gateway/auth.ts` - Authentication
- [ ] `src/gateway/session-bridge.ts` - Client-to-session mapping
- [ ] Gateway handlers (agent, sessions, tools, config, system)
- [ ] `src/gateway/ui/` - Control UI + WebChat
- [ ] Refactor daemon to use gateway as hub
- [ ] Refactor Telegram as gateway client
- [ ] Tests for gateway protocol and handlers
---
## Phase 3: Channel Adapters
**Goal**: Multi-channel inbox. One assistant accessible from WhatsApp, Telegram, Discord, Slack, and WebChat.
### 3.1 Channel Adapter Interface
```
src/channels/
├── types.ts # ChannelAdapter interface
├── registry.ts # ChannelRegistry: load/unload at runtime
├── telegram/ # Refactored from src/frontends/telegram/
├── discord/ # discord.js
├── whatsapp/ # Baileys
├── slack/ # Bolt (Socket Mode)
└── webchat/ # Gateway WS built-in
```
**ChannelAdapter interface**:
```typescript
interface ChannelAdapter {
  name: string;
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(peerId: string, message: OutboundMessage): Promise<void>;
  onMessage(handler: (msg: InboundMessage) => void): void;
}
```
### 3.2 Build Order
1. Telegram (refactor existing)
2. Discord (discord.js)
3. WhatsApp (Baileys)
4. Slack (Bolt Socket Mode)
5. WebChat (gateway built-in)
### 3.3 Security Per Channel
Every adapter implements:
- DM pairing: unknown senders get pairing code, must be approved via CLI/UI
- Allowlists: `channels.<name>.allowFrom` config array
- Group mention gating: `channels.<name>.groups.*.requireMention`
- Rate limiting: per-sender throttle (configurable)
- Message size limits per channel
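Three of these checks (allowlist, per-sender throttle, size limit) can be combined into one gate that an adapter consults before forwarding a message. This is a sketch only: `SenderGate` and the config field names other than `allowFrom` are hypothetical, and DM pairing and mention gating are not covered.

```typescript
// Sketch only: SenderGate and all config fields except allowFrom are assumptions.
interface ChannelSecurityConfig {
  allowFrom: string[];          // allowed sender IDs
  maxMessagesPerWindow: number; // per-sender throttle
  windowMs: number;             // sliding window length
  maxMessageLength: number;     // size limit in characters
}

type GateDecision = "ok" | "not_allowed" | "too_long" | "rate_limited";

class SenderGate {
  private readonly config: ChannelSecurityConfig;
  // Timestamps of recently accepted messages, per sender.
  private readonly recent = new Map<string, number[]>();

  constructor(config: ChannelSecurityConfig) {
    this.config = config;
  }

  check(senderId: string, text: string, now: number = Date.now()): GateDecision {
    if (!this.config.allowFrom.includes(senderId)) return "not_allowed";
    if (text.length > this.config.maxMessageLength) return "too_long";

    // Keep only timestamps inside the sliding window.
    const cutoff = now - this.config.windowMs;
    const timestamps = (this.recent.get(senderId) ?? []).filter((t) => t > cutoff);
    if (timestamps.length >= this.config.maxMessagesPerWindow) {
      this.recent.set(senderId, timestamps);
      return "rate_limited";
    }
    timestamps.push(now);
    this.recent.set(senderId, timestamps);
    return "ok";
  }
}
```

Passing `now` explicitly keeps the throttle deterministic in tests; in production the default `Date.now()` applies.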
### 3.4 Deliverables
- [ ] `src/channels/types.ts` - ChannelAdapter interface
- [ ] `src/channels/registry.ts` - Channel registry
- [ ] `src/channels/telegram/` - Refactored Telegram adapter
- [ ] `src/channels/discord/` - Discord adapter
- [ ] `src/channels/whatsapp/` - WhatsApp adapter
- [ ] `src/channels/slack/` - Slack adapter
- [ ] `src/channels/webchat/` - WebChat adapter
- [ ] DM pairing system
- [ ] Per-channel security (allowlists, mention gating, rate limiting)
- [ ] Tests per adapter
---
## Phase 4: Skills + MCP
**Goal**: Extensible capability system with community skills and MCP tool servers.
### 4.1 Skills System
```
src/skills/
├── types.ts # Skill, SkillManifest interfaces
├── loader.ts # Load SKILL.md + scripts from skill directories
├── registry.ts # Discovery, gating (OS/bin/env checks)
└── installer.ts # Auto-install dependencies (with user confirmation)
```
Skill directory structure:
```
~/.flynn/workspace/skills/<name>/
├── SKILL.md # Instructions injected into system prompt
├── manifest.json # Requirements, permissions, dependencies
└── scripts/ # Executable helpers
```
Three tiers: bundled (shipped with Flynn), managed (installed via CLI), workspace (user-created).
### 4.2 MCP Integration
```
src/mcp/
├── client.ts # MCP client (stdio transport)
├── bridge.ts # Convert MCP tools -> Flynn tool registry entries
└── manager.ts # Lifecycle: start/stop/restart MCP servers per config
```
Wire the existing `mcp.servers` config to actually start MCP server processes, discover their tools, and register them in the tool registry. MCP tools appear alongside builtins -- the agent doesn't know the difference.
### 4.3 Deliverables
- [ ] Skills types, loader, registry, installer
- [ ] MCP client, bridge, manager
- [ ] Wire MCP config to runtime
- [ ] Bundled skills (at least: web-search, git, system-info)
- [ ] Tests
---
## Phase 5: Advanced Features
Build after core is solid. Each is independent.
| Feature | Module | Description |
|---------|--------|-------------|
| Cron/scheduling | `src/automation/cron.ts` | Cron expressions trigger agent messages |
| Webhooks | `src/automation/webhooks.ts` | Inbound HTTP triggers |
| Browser tool | `src/tools/builtin/browser.ts` | Playwright headless browser |
| Agent-to-agent | `sessions_list/history/send` tools | Multi-agent coordination |
| Sub-agents | `src/backends/native/subagent.ts` | Spawn scoped child sessions |
| Sandboxing | `src/sandbox/` | Docker per-session isolation |
| Voice | `src/voice/` | STT (Whisper) + TTS (ElevenLabs) |
| Canvas/A2UI | `src/gateway/ui/canvas/` | Agent-driven web workspace |
| CLI surface | `src/cli/` | `flynn gateway`, `flynn agent`, `flynn send`, etc. |
| Mobile nodes | Separate repos | iOS (Swift) + Android (Kotlin) companion apps |
| Onboarding wizard | `src/cli/onboard.ts` | Guided setup flow |
| Doctor diagnostics | `src/cli/doctor.ts` | Config + health validation |
---
## Implementation Notes
### Model Selection for Subagents
Use cheaper/faster models via GitHub Copilot for implementation:
- **Sonnet**: Complex implementation tasks (agent loop, gateway, protocol)
- **Haiku**: Mechanical tasks (individual tool implementations, adapter boilerplate, tests)
- **Opus**: Design review, architecture decisions only
### Testing Strategy
- Unit tests for all tools (mock filesystem/network)
- Unit tests for tool registry serialization (Anthropic + OpenAI formats)
- Integration tests for agent loop (mock model returns tool_use, verify execution)
- Integration tests for gateway protocol (WS client/server)
- E2E tests for tool execution (real shell commands in temp dirs)
### Migration Path
- Phase 1 is additive (no breaking changes to existing code)
- Phase 2 refactors daemon (breaking, but internal only)
- Phase 3 moves Telegram (file rename, adapter interface)
- Each phase is a separate feature branch
---
*Design Version: 1.0*
*Created: 2026-02-05*
*Approach: Bottom-up (tools -> gateway -> channels -> skills -> advanced)*
@@ -0,0 +1,203 @@
# Phase 3: Channel Adapters — Implementation Plan
## Goal
Introduce a `ChannelAdapter` abstraction that decouples message sources (Telegram, WebChat, future Discord/WhatsApp/Slack) from the agent. Each adapter handles platform-specific I/O and maps messages to a common interface. A `ChannelRegistry` manages adapter lifecycle and routes messages to/from the agent.
## Scope (This Iteration)
1. **Channel types**: `ChannelAdapter` interface, `InboundMessage`, `OutboundMessage`, `ChannelStatus`
2. **Channel registry**: Register, start/stop, route messages, adapter lifecycle
3. **Telegram adapter**: Refactor existing `src/frontends/telegram/` into a `ChannelAdapter`
4. **WebChat adapter**: Wrap the existing gateway WS into a `ChannelAdapter`
5. **Daemon integration**: Replace direct bot/gateway creation with registry-managed adapters
Discord, WhatsApp, and Slack adapters are deferred (require new dependencies + credentials).
## Architecture
```
src/channels/
├── types.ts # ChannelAdapter interface, message types
├── registry.ts # ChannelRegistry: lifecycle + message routing
├── registry.test.ts # Registry tests
├── index.ts # Barrel exports
├── telegram/
│ ├── adapter.ts # TelegramAdapter implements ChannelAdapter
│ ├── adapter.test.ts # Adapter tests
│ └── index.ts # Barrel
└── webchat/
    ├── adapter.ts # WebChatAdapter implements ChannelAdapter
    ├── adapter.test.ts # Adapter tests
    └── index.ts # Barrel
```
The existing `src/frontends/telegram/` code (bot.ts, handlers.ts, confirmations.ts) stays in place and is wrapped by the adapter. The adapter delegates to the existing bot creation logic. No breaking changes to existing code.
## Types Design
### InboundMessage
```typescript
interface InboundMessage {
  id: string; // Platform message ID
  channel: string; // Adapter name: "telegram", "webchat", etc.
  senderId: string; // Platform user ID
  senderName?: string; // Display name (optional)
  text: string; // Message text
  replyTo?: string; // ID of message being replied to
  timestamp: number; // Unix ms
  metadata?: Record<string, unknown>; // Platform-specific extras
}
```
### OutboundMessage
```typescript
interface OutboundMessage {
  text: string; // Response text (markdown)
  replyTo?: string; // Original message ID
  metadata?: Record<string, unknown>; // Platform-specific extras (e.g. parse_mode)
}
```
### ChannelAdapter
```typescript
interface ChannelAdapter {
  readonly name: string;
  readonly status: ChannelStatus;

  /** Start the adapter (connect to platform, begin listening). */
  connect(): Promise<void>;

  /** Stop the adapter (disconnect, clean up). */
  disconnect(): Promise<void>;

  /** Send a message to a specific peer. */
  send(peerId: string, message: OutboundMessage): Promise<void>;

  /** Register the inbound message handler. Called by registry. */
  onMessage(handler: (msg: InboundMessage) => void): void;

  /** Register a tool event handler for displaying tool execution status. */
  onToolEvent?(handler: (peerId: string, event: ToolStatusEvent) => void): void;
}

type ChannelStatus = 'disconnected' | 'connecting' | 'connected' | 'error';

interface ToolStatusEvent {
  type: 'start' | 'end';
  tool: string;
  args?: unknown;
  result?: { success: boolean; output: string; error?: string };
}
```
### ChannelRegistry
```typescript
class ChannelRegistry {
  register(adapter: ChannelAdapter): void;
  unregister(name: string): void;
  get(name: string): ChannelAdapter | undefined;
  list(): ChannelAdapter[];

  /** Start all registered adapters. */
  startAll(): Promise<void>;

  /** Stop all registered adapters. */
  stopAll(): Promise<void>;

  /** Set the message handler that all adapters route to. */
  setMessageHandler(handler: (msg: InboundMessage, reply: (msg: OutboundMessage) => Promise<void>) => Promise<void>): void;
}
```
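One way the registry surface above could be implemented, shown as a minimal sketch rather than the code in this commit. The message types are trimmed to the fields used here, and two behaviors are assumptions: replies are sent to `msg.senderId` as the peer ID, and duplicate adapter names are rejected.

```typescript
// Sketch only: reply routing via senderId and duplicate-name rejection are assumptions.
type ChannelStatus = "disconnected" | "connecting" | "connected" | "error";

interface InboundMessage { id: string; channel: string; senderId: string; text: string; timestamp: number; }
interface OutboundMessage { text: string; replyTo?: string; }

interface ChannelAdapter {
  readonly name: string;
  readonly status: ChannelStatus;
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(peerId: string, message: OutboundMessage): Promise<void>;
  onMessage(handler: (msg: InboundMessage) => void): void;
}

type MessageHandler = (
  msg: InboundMessage,
  reply: (msg: OutboundMessage) => Promise<void>,
) => Promise<void>;

class ChannelRegistry {
  private readonly adapters = new Map<string, ChannelAdapter>();
  private handler: MessageHandler | undefined;

  register(adapter: ChannelAdapter): void {
    if (this.adapters.has(adapter.name)) {
      throw new Error(`Channel "${adapter.name}" is already registered`);
    }
    this.adapters.set(adapter.name, adapter);
    // Route every inbound message to the shared handler, with a reply bound to this adapter.
    adapter.onMessage((msg) => {
      if (!this.handler) return;
      this.handler(msg, (out) => adapter.send(msg.senderId, out)).catch((err) => {
        console.error(`[${adapter.name}] message handler failed:`, err);
      });
    });
  }

  unregister(name: string): void {
    this.adapters.delete(name);
  }

  get(name: string): ChannelAdapter | undefined {
    return this.adapters.get(name);
  }

  list(): ChannelAdapter[] {
    return [...this.adapters.values()];
  }

  async startAll(): Promise<void> {
    await Promise.all(this.list().map((a) => a.connect()));
  }

  async stopAll(): Promise<void> {
    await Promise.all(this.list().map((a) => a.disconnect()));
  }

  setMessageHandler(handler: MessageHandler): void {
    this.handler = handler;
  }
}
```

With `Promise.all`, one adapter failing to connect rejects `startAll()` as a whole; `Promise.allSettled` would be the alternative if adapters should start independently.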
## Telegram Adapter Design
The `TelegramAdapter` wraps the existing `createTelegramBot()` logic:
- `connect()`: Creates grammy Bot, starts long polling
- `disconnect()`: Stops the bot
- `send()`: Calls `bot.api.sendMessage(peerId, text, { parse_mode: 'Markdown' })`
- `onMessage()`: Sets up `bot.on('message:text', ...)` to convert grammy context to `InboundMessage`
- Preserves existing confirmations, commands (/start, /reset, /status, /local, /cloud, /model)
- Preserves chat ID allowlist check as middleware
- Tool status display: adapter handles the `onToolUse` events by posting/editing Telegram messages
Constructor takes:
```typescript
interface TelegramAdapterConfig {
  botToken: string;
  allowedChatIds: number[];
  hookEngine?: HookEngine;
}
```
The adapter does NOT take an agent directly — the registry routes messages to the agent.
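The commit message mentions chunking for the Telegram adapter. Telegram caps a text message at 4096 characters, so long replies have to be split before `send()`. The helper below is a hypothetical sketch of one way to do that, not the adapter's actual code: it prefers to break at a newline, then at a space, and only hard-cuts when neither exists.

```typescript
// Sketch only: chunkMessage is an assumed helper name, not taken from the adapter.
const TELEGRAM_MAX_MESSAGE_LENGTH = 4096;

function chunkMessage(text: string, maxLength: number = TELEGRAM_MAX_MESSAGE_LENGTH): string[] {
  if (maxLength <= 0) throw new RangeError("maxLength must be positive");
  const chunks: string[] = [];
  let remaining = text;

  while (remaining.length > maxLength) {
    // Prefer a newline break, then a space; fall back to a hard cut.
    let cut = remaining.lastIndexOf("\n", maxLength);
    if (cut <= 0) cut = remaining.lastIndexOf(" ", maxLength);
    const hardCut = cut <= 0;
    if (hardCut) cut = maxLength;

    chunks.push(remaining.slice(0, cut));
    // When breaking at a separator, drop that one separator character.
    remaining = remaining.slice(hardCut ? cut : cut + 1);
  }
  if (remaining.length > 0) chunks.push(remaining);
  return chunks;
}
```

Splitting plain text this way can cut through Markdown entities such as code fences, so with `parse_mode: 'Markdown'` a chunk may fail to parse; falling back to plain text for that chunk is one possible mitigation.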
## WebChat Adapter Design
The `WebChatAdapter` is a thin shim since the gateway already handles WS connections.
- `connect()`: No-op (gateway server is already running)
- `disconnect()`: No-op (gateway lifecycle managed by daemon)
- `send()`: Sends via the gateway's WS connection to the peer
- `onMessage()`: Hooks into the gateway's agent.send handler to intercept messages
Constructor takes:
```typescript
interface WebChatAdapterConfig {
  gateway: GatewayServer;
}
```
This adapter is simpler because the gateway already has its own session bridge and agent management. The adapter primarily exists to:
1. Report WebChat as a registered channel in the registry
2. Allow the daemon to manage all channels uniformly
3. Provide status/metrics via a common interface
## Daemon Integration
The daemon currently:
1. Creates a grammy Bot directly
2. Creates a GatewayServer directly
3. Starts both independently
After refactor:
1. Creates a ChannelRegistry
2. Creates TelegramAdapter + WebChatAdapter
3. Registers both with the registry
4. Registry starts all adapters
The message handler in the registry creates per-channel agents via the session manager, same as the existing session bridge pattern.
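The commit message describes the result as per-channel-per-user agents. A minimal sketch of that keying, assuming an agent is cached under a key built from `channel` and `senderId`: `AgentPool`, the `Agent` shape, and the key format are hypothetical, and the real handler goes through the session manager rather than a bare map.

```typescript
// Sketch only: AgentPool, Agent, and the key format are assumptions for illustration.
interface InboundMessage { channel: string; senderId: string; text: string; }

interface Agent {
  handle(text: string): Promise<string>;
}

class AgentPool {
  private readonly agents = new Map<string, Agent>();
  private readonly createAgent: (sessionKey: string) => Agent;

  constructor(createAgent: (sessionKey: string) => Agent) {
    this.createAgent = createAgent;
  }

  // One agent per (channel, user) pair, e.g. "telegram:12345".
  static keyFor(msg: InboundMessage): string {
    return `${msg.channel}:${msg.senderId}`;
  }

  forMessage(msg: InboundMessage): Agent {
    const key = AgentPool.keyFor(msg);
    let agent = this.agents.get(key);
    if (!agent) {
      agent = this.createAgent(key);
      this.agents.set(key, agent);
    }
    return agent;
  }
}
```

Inside the registry's message handler this would be used roughly as `reply({ text: await pool.forMessage(msg).handle(msg.text) })`.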
## Implementation Order
1. `src/channels/types.ts` — Pure types (no runtime)
2. `src/channels/registry.ts` — Registry class
3. `src/channels/registry.test.ts` — Registry unit tests
4. `src/channels/telegram/adapter.ts` — Telegram adapter
5. `src/channels/telegram/adapter.test.ts` — Telegram adapter tests
6. `src/channels/webchat/adapter.ts` — WebChat adapter
7. `src/channels/webchat/adapter.test.ts` — WebChat adapter tests
8. `src/channels/index.ts` + sub-barrel exports
9. `src/daemon/index.ts` — Wire registry
10. Run full test suite
## Existing Code Impact
- `src/frontends/telegram/`: **NOT deleted**. The adapter wraps these existing modules.
- `src/gateway/`: **NOT modified**. WebChat adapter wraps the existing gateway.
- `src/daemon/index.ts`: **Modified** to use ChannelRegistry.
- `src/backends/native/agent.ts`: **NOT modified**. Agent creation happens in the registry message handler.
## Test Strategy
- Unit tests for ChannelRegistry (mock adapters)
- Unit tests for TelegramAdapter (mock grammy Bot)
- Unit tests for WebChatAdapter (mock GatewayServer)
- Existing tests remain unchanged (frontends/telegram, gateway)
---
*Plan Version: 1.0*
*Created: 2026-02-05*
*Parent: docs/plans/2026-02-05-openclaw-parity-design.md Phase 3*