William Valentin aa95f2132c feat: add channel adapter abstraction with Telegram and WebChat adapters

Implement Phase 3 channel adapters that decouple message sources from
the agent via a uniform ChannelAdapter interface and ChannelRegistry.

- Add ChannelAdapter/InboundMessage/OutboundMessage types
- Add ChannelRegistry for adapter lifecycle and message routing
- Add TelegramAdapter (grammy bot, auth middleware, confirmations, chunking)
- Add WebChatAdapter (thin shim over GatewayServer)
- Refactor daemon to use ChannelRegistry with per-channel-per-user agents
- Add config.get/config.patch gateway handlers (Phase 2 loose end)
- Add system.restart gateway handler (Phase 2 loose end)
- Add implementation plans and design docs

Tests: 225 passing (33 new channel adapter + gateway handler tests)

2026-02-05 20:00:36 -08:00

OpenClaw Parity Design: Flynn Feature Roadmap

Overview

Plan to evolve Flynn from a multi-model chat wrapper into a full self-hosted OpenClaw alternative. Bottom-up approach: tools first, then gateway, then channels, then skills/advanced features.

Build approach: Use Sonnet/Haiku via GitHub Copilot for implementation subagents.

Current state: Flynn v0.1.0 covers ~22% of OpenClaw features. Strong in model routing (4 providers, tiered fallback) and session management (SQLite, transfer). Zero tool execution capability.

Phase 1: Agent Tool Framework + Agent Loop

Goal: Turn Flynn from a chatbot into an agent that can execute tools with multi-step reasoning.

1.1 Tool Definition System

src/tools/
├── types.ts          # Tool, ToolCall, ToolResult interfaces
├── registry.ts       # ToolRegistry: register, lookup, list, serialize for providers
├── executor.ts       # ToolExecutor: run tools, enforce hooks, timeout, truncation
├── builtin/
│   ├── shell.ts      # shell.exec - run bash commands (cwd, timeout, max output)
│   ├── file-read.ts  # file.read - read file contents (path, offset, limit)
│   ├── file-write.ts # file.write - write/create files (path, content)
│   ├── file-edit.ts  # file.edit - find-and-replace (path, oldString, newString)
│   ├── file-list.ts  # file.list - glob/list directory (pattern, path)
│   └── web-fetch.ts  # web.fetch - HTTP GET with markdown conversion (url, format)

Tool interface:

interface Tool {
  name: string;                           // e.g. "shell.exec"
  description: string;                    // For the model's tool selection
  inputSchema: Record<string, unknown>;   // JSON Schema for parameters
  execute(args: unknown): Promise<ToolResult>;
}

interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

interface ToolCall {
  id: string;          // Provider-assigned call ID
  name: string;        // Tool name
  args: unknown;       // Parsed arguments
}

ToolRegistry: Collects tools, serializes to Anthropic format ({ name, description, input_schema }) or OpenAI format ({ type: "function", function: { name, description, parameters } }).
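The serialization step above can be sketched as follows. This is a minimal illustration, not Flynn's actual code: the method names `toAnthropic`/`toOpenAI` and the helper `toWireName` are assumptions. The helper exists because both provider APIs appear to restrict tool names to letters, digits, underscores, and hyphens, so dotted names such as `shell.exec` would likely need mapping before they go on the wire; check this against the current provider documentation.

```typescript
// Sketch only: names beyond Tool/ToolResult/ToolRegistry are assumptions.
interface ToolResult { success: boolean; output: string; error?: string }

interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: unknown): Promise<ToolResult>;
}

interface AnthropicToolDef {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

interface OpenAIToolDef {
  type: "function";
  function: { name: string; description: string; parameters: Record<string, unknown> };
}

// Assumption: dots become underscores on the wire ("shell.exec" -> "shell_exec").
function toWireName(name: string): string {
  return name.replace(/\./g, "_");
}

class ToolRegistry {
  private tools = new Map<string, Tool>();

  register(tool: Tool): void {
    if (this.tools.has(tool.name)) throw new Error(`Duplicate tool: ${tool.name}`);
    this.tools.set(tool.name, tool);
  }

  lookup(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  list(): Tool[] {
    return Array.from(this.tools.values());
  }

  // Anthropic format: { name, description, input_schema }
  toAnthropic(): AnthropicToolDef[] {
    return this.list().map((t) => ({
      name: toWireName(t.name),
      description: t.description,
      input_schema: t.inputSchema,
    }));
  }

  // OpenAI format: { type: "function", function: { name, description, parameters } }
  toOpenAI(): OpenAIToolDef[] {
    return this.list().map((t) => ({
      type: "function" as const,
      function: { name: toWireName(t.name), description: t.description, parameters: t.inputSchema },
    }));
  }
}
```

If names are rewritten like this, the registry would also need the reverse mapping when a provider returns a tool call; that part is left out here.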

ToolExecutor: Wraps execution with:

Hook engine check (confirm/log/silent) before execution
Configurable timeout (default 30s for shell, 10s for file ops)
Output truncation (max 50KB, with "truncated" marker)
Error capture (catches exceptions, returns as ToolResult.error)
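A minimal sketch of the four executor responsibilities listed above, under stated assumptions: the `Hook` type, the `runTool` function name, and the `"allow" | "deny"` decision values are hypothetical, and truncation counts UTF-16 code units rather than bytes for brevity.

```typescript
interface ToolResult { success: boolean; output: string; error?: string }

interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: unknown): Promise<ToolResult>;
}

// Hypothetical hook shape: resolves to "deny" to block execution.
type Hook = (tool: Tool, args: unknown) => Promise<"allow" | "deny">;

const MAX_OUTPUT = 50 * 1024; // approximates the 50KB limit in characters

function truncate(output: string, max: number = MAX_OUTPUT): string {
  if (output.length <= max) return output;
  return output.slice(0, max) + "\n[truncated]";
}

async function runTool(
  tool: Tool,
  args: unknown,
  opts: { timeoutMs: number; hook?: Hook },
): Promise<ToolResult> {
  try {
    // 1. Hook check before execution
    if (opts.hook && (await opts.hook(tool, args)) === "deny") {
      return { success: false, output: "", error: "Denied by hook" };
    }
    // 2. Timeout: race the tool against a timer that rejects
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(
        () => reject(new Error(`Timed out after ${opts.timeoutMs}ms`)),
        opts.timeoutMs,
      );
    });
    try {
      const result = await Promise.race([tool.execute(args), timeout]);
      // 3. Output truncation
      return { ...result, output: truncate(result.output) };
    } finally {
      clearTimeout(timer);
    }
  } catch (err) {
    // 4. Error capture: exceptions become ToolResult.error
    return { success: false, output: "", error: err instanceof Error ? err.message : String(err) };
  }
}
```

Note that `Promise.race` only stops waiting for the tool; it does not cancel it. A real shell tool would also need to kill the child process when the timer fires.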

1.2 Model Provider Tool Support

Update ChatRequest to accept optional tools: Tool[].

Anthropic (src/models/anthropic.ts):

Pass tools as tools parameter to messages.create()/messages.stream()
Parse tool_use content blocks from response (id, name, input)
Accept tool_result content blocks in messages (tool_use_id, content)
Handle stop_reason: "tool_use" to signal tool call response

OpenAI (src/models/openai.ts):

Pass tools as tools parameter with type: "function" wrapper
Parse tool_calls from response choices
Accept role: "tool" messages with tool_call_id
Handle finish_reason: "tool_calls" to signal tool call response

Types updates (src/models/types.ts):

Add ToolCall to ChatResponse
Add tool_use and tool_result message roles
Add tools to ChatRequest
Add tool_calls stream event type
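The two provider response shapes described above can be normalized into the shared `ToolCall` type roughly as follows. This is a sketch: the function names `fromAnthropic` and `fromOpenAI` are assumptions, the block and call types are trimmed to the fields named in this section, and the `_unparsed` fallback for malformed argument JSON is a hypothetical choice.

```typescript
interface ToolCall { id: string; name: string; args: unknown }

// Anthropic: tool calls arrive as tool_use content blocks (id, name, input).
type AnthropicBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

// OpenAI: tool calls arrive on the message, with arguments as a JSON string.
interface OpenAIToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

function fromAnthropic(content: AnthropicBlock[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const block of content) {
    if (block.type === "tool_use") {
      calls.push({ id: block.id, name: block.name, args: block.input });
    }
  }
  return calls;
}

function fromOpenAI(toolCalls: OpenAIToolCall[] | undefined): ToolCall[] {
  return (toolCalls ?? []).map((tc) => {
    let args: unknown;
    try {
      args = JSON.parse(tc.function.arguments);
    } catch {
      // Assumption: keep the raw string so the tool can report a clear error.
      args = { _unparsed: tc.function.arguments };
    }
    return { id: tc.id, name: tc.function.name, args };
  });
}
```

With both providers reduced to `ToolCall[]`, the agent loop does not need to know which provider produced the call.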

1.3 Agent Loop

Replace single-turn NativeAgent with iterative tool-use loop:

User message
  -> Model call (with tools in request)
  -> If response contains tool_use:
       -> For each tool call:
            -> Check hooks (confirm/log/silent)
            -> If confirm: wait for approval (Telegram inline keyboard / TUI prompt)
            -> Execute tool via ToolExecutor
            -> Collect ToolResult
       -> Append tool_results to conversation
       -> Loop back to model call
  -> If text response (no tool_use):
       -> Return final response to user
  -> If max iterations reached (default 10):
       -> Return partial response with warning
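The loop above can be sketched as a single function. The shapes here are simplified assumptions for illustration: `ChatMessage`, `ModelReply`, `ModelFn`, and `RunToolFn` are hypothetical names, the model is injected as a function so the loop can be tested with a fake, and hook checks are assumed to live inside `runTool`.

```typescript
interface ToolCall { id: string; name: string; args: unknown }
interface ToolResult { success: boolean; output: string; error?: string }

// Hypothetical conversation and model shapes, reduced to what the loop needs.
type ChatMessage =
  | { role: "user" | "assistant"; content: string; toolCalls?: ToolCall[] }
  | { role: "tool"; toolCallId: string; content: string };

interface ModelReply { text: string; toolCalls: ToolCall[] }
type ModelFn = (messages: ChatMessage[]) => Promise<ModelReply>;
type RunToolFn = (call: ToolCall) => Promise<ToolResult>;

async function agentLoop(
  userText: string,
  model: ModelFn,
  runTool: RunToolFn,
  maxIterations: number = 10,
): Promise<string> {
  const messages: ChatMessage[] = [{ role: "user", content: userText }];
  let lastText = "";

  for (let i = 0; i < maxIterations; i++) {
    const reply = await model(messages);
    lastText = reply.text;

    // Plain text response: the loop is done.
    if (reply.toolCalls.length === 0) return reply.text;

    // Record the assistant turn, then execute each requested tool in order.
    messages.push({ role: "assistant", content: reply.text, toolCalls: reply.toolCalls });
    for (const call of reply.toolCalls) {
      const result = await runTool(call);
      messages.push({
        role: "tool",
        toolCallId: call.id,
        content: result.success ? result.output : `Error: ${result.error ?? "unknown"}`,
      });
    }
  }

  // Iteration budget exhausted: return what we have, with a warning.
  return `${lastText}\n[warning: stopped after ${maxIterations} iterations]`;
}
```

Feeding tool failures back to the model as text, rather than throwing, lets the model retry or explain the failure on the next iteration.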

Streaming during tool execution: the loop emits status events around each tool call:

{ event: "tool_start", tool: "shell.exec", args: { command: "ls" } }
{ event: "tool_end", tool: "shell.exec", result: { success: true, output: "..." } }
Both events render as status lines in the TUI and Telegram

Abort support:

TUI: Escape key sets abort flag, checked before each tool execution
Telegram: /cancel command sets abort flag on active session
Agent loop checks abort flag before each iteration
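A cooperative abort flag of the kind described above might look like this. `AbortFlag` and `runUntilAborted` are hypothetical names; the steps stand in for loop iterations or tool executions. The flag is only checked between steps, so a step already in flight runs to completion.

```typescript
// Hypothetical per-session flag, set by the Escape key or /cancel handler.
class AbortFlag {
  private aborted = false;

  abort(): void {
    this.aborted = true;
  }

  get isAborted(): boolean {
    return this.aborted;
  }
}

async function runUntilAborted(
  steps: Array<() => Promise<string>>,
  flag: AbortFlag,
): Promise<{ outputs: string[]; aborted: boolean }> {
  const outputs: string[] = [];
  for (const step of steps) {
    // Checked before each step, never in the middle of one.
    if (flag.isAborted) return { outputs, aborted: true };
    outputs.push(await step());
  }
  return { outputs, aborted: false };
}
```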

1.4 Frontend Updates

Telegram:

Show tool execution as status messages ("Running shell.exec: ls -la...")
Confirmation buttons already exist, wire to tool executor
Show final response after tool loop completes

TUI (both modes):

Show tool calls inline with dimmed formatting
Show tool results with output (truncated in UI if long)
Streaming text interleaved with tool status
Escape aborts the agent loop

1.5 Deliverables

src/tools/types.ts - Tool, ToolCall, ToolResult interfaces
src/tools/registry.ts - ToolRegistry with provider serialization
src/tools/executor.ts - ToolExecutor with hooks, timeout, truncation
src/tools/builtin/shell.ts - Shell exec tool
src/tools/builtin/file-read.ts - File read tool
src/tools/builtin/file-write.ts - File write tool
src/tools/builtin/file-edit.ts - File edit tool
src/tools/builtin/file-list.ts - File list/glob tool
src/tools/builtin/web-fetch.ts - Web fetch tool
src/models/types.ts - Add tool-related types
src/models/anthropic.ts - Add tool use support
src/models/openai.ts - Add tool use support
src/backends/native/agent.ts - Agent loop with tool execution
src/frontends/telegram/bot.ts - Tool status messages
src/frontends/tui/minimal.ts - Tool display in minimal TUI
src/frontends/tui/components/App.tsx - Tool display in fullscreen TUI
Tests for all new modules

Phase 2: WebSocket Gateway

Goal: Central control plane that multiple clients connect to. Decouples frontends from the agent.

2.1 Gateway Core

src/gateway/
├── server.ts         # WebSocket server (ws library), configurable port
├── protocol.ts       # JSON-RPC-style message types
├── router.ts         # Routes methods to handlers
├── auth.ts           # Token + password auth, Tailscale identity headers
├── session-bridge.ts # Maps WS client connections to sessions
└── handlers/
    ├── agent.ts      # agent.send, agent.cancel, agent.status
    ├── sessions.ts   # sessions.list, sessions.history, sessions.send
    ├── tools.ts      # tools.list, tools.invoke
    ├── config.ts     # config.get, config.patch
    └── system.ts     # system.health, system.restart

2.2 Protocol

JSON-RPC-like over WebSocket:

Request:  { id: number, method: string, params: object }
Response: { id: number, result: object }
Error:    { id: number, error: { code: number, message: string } }
Event:    { id: number, event: string, data: object }  // streaming

Event types for agent.send streaming:

content - text chunk from model
tool_start - tool execution beginning
tool_end - tool execution complete
thinking - model reasoning (if exposed)
done - final response
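The request, response, and error shapes above can be typed and routed roughly as follows. This is a sketch under assumptions: `MethodRouter` and the `Rpc*` type names are hypothetical, handlers are synchronous for brevity, the error codes are borrowed from JSON-RPC 2.0 (the design only says "JSON-RPC-like"), and `id: -1` for unparseable requests is an arbitrary placeholder.

```typescript
interface RpcRequest { id: number; method: string; params: object }
interface RpcResult { id: number; result: object }
interface RpcError { id: number; error: { code: number; message: string } }
type RpcHandler = (params: object) => object;

// Assumption: codes follow JSON-RPC 2.0 conventions.
const PARSE_ERROR = -32700;
const INVALID_REQUEST = -32600;
const METHOD_NOT_FOUND = -32601;
const INTERNAL_ERROR = -32603;

function isRpcRequest(value: unknown): value is RpcRequest {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["id"] === "number" &&
    typeof v["method"] === "string" &&
    typeof v["params"] === "object" &&
    v["params"] !== null
  );
}

class MethodRouter {
  private handlers = new Map<string, RpcHandler>();

  register(method: string, handler: RpcHandler): void {
    this.handlers.set(method, handler);
  }

  // Takes one raw WebSocket text frame and returns the reply to send back.
  dispatch(raw: string): RpcResult | RpcError {
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      return { id: -1, error: { code: PARSE_ERROR, message: "Invalid JSON" } };
    }
    if (!isRpcRequest(parsed)) {
      return { id: -1, error: { code: INVALID_REQUEST, message: "Malformed request" } };
    }
    const handler = this.handlers.get(parsed.method);
    if (!handler) {
      return {
        id: parsed.id,
        error: { code: METHOD_NOT_FOUND, message: `Unknown method: ${parsed.method}` },
      };
    }
    try {
      return { id: parsed.id, result: handler(parsed.params) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { id: parsed.id, error: { code: INTERNAL_ERROR, message } };
    }
  }
}
```

Streaming methods such as agent.send would additionally push Event frames carrying the same `id` before the final response; that path is not shown here.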

2.3 Control UI

Minimal web dashboard served from gateway port:

src/gateway/ui/
├── index.html        # Dashboard: sessions, model info, config
├── chat.html         # WebChat: WS-connected chat interface
└── assets/           # CSS + minimal JS (no framework, or Preact)

2.4 Daemon Refactor

Gateway becomes the hub:

Owns session manager, tool registry, model router
Telegram adapter connects as a gateway client (in-process bridge)
TUI connects as a gateway client (in-process or WS)
WebChat connects via WS from browser

2.5 Deliverables

src/gateway/server.ts - WebSocket server
src/gateway/protocol.ts - Message type definitions
src/gateway/router.ts - Method routing
src/gateway/auth.ts - Authentication
src/gateway/session-bridge.ts - Client-to-session mapping
Gateway handlers (agent, sessions, tools, config, system)
src/gateway/ui/ - Control UI + WebChat
Refactor daemon to use gateway as hub
Refactor Telegram as gateway client
Tests for gateway protocol and handlers

Phase 3: Channel Adapters

Goal: Multi-channel inbox. One assistant accessible from WhatsApp, Telegram, Discord, Slack, and WebChat.

3.1 Channel Adapter Interface

src/channels/
├── types.ts          # ChannelAdapter interface
├── registry.ts       # ChannelRegistry: load/unload at runtime
├── telegram/         # Refactored from src/frontends/telegram/
├── discord/          # discord.js
├── whatsapp/         # Baileys
├── slack/            # Bolt (Socket Mode)
└── webchat/          # Gateway WS built-in

ChannelAdapter interface:

interface ChannelAdapter {
  name: string;
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(peerId: string, message: OutboundMessage): Promise<void>;
  onMessage(handler: (msg: InboundMessage) => void): void;
}
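To illustrate how an adapter and the registry fit together, here is a minimal in-memory sketch. The design does not define `InboundMessage` or `OutboundMessage`, so the fields below are assumptions, as are `LoopbackAdapter`, its `simulateInbound` helper, and the `load`/`unload` method names on the registry.

```typescript
// Assumed message shapes; the design does not specify their fields.
interface InboundMessage { channel: string; peerId: string; text: string }
interface OutboundMessage { text: string }

interface ChannelAdapter {
  name: string;
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(peerId: string, message: OutboundMessage): Promise<void>;
  onMessage(handler: (msg: InboundMessage) => void): void;
}

// Hypothetical test adapter: records outbound messages instead of sending them.
class LoopbackAdapter implements ChannelAdapter {
  name = "loopback";
  sent: Array<{ peerId: string; message: OutboundMessage }> = [];
  private handler: ((msg: InboundMessage) => void) | undefined;
  private connected = false;

  async connect(): Promise<void> { this.connected = true; }
  async disconnect(): Promise<void> { this.connected = false; }

  async send(peerId: string, message: OutboundMessage): Promise<void> {
    if (!this.connected) throw new Error("not connected");
    this.sent.push({ peerId, message });
  }

  onMessage(handler: (msg: InboundMessage) => void): void {
    this.handler = handler;
  }

  simulateInbound(peerId: string, text: string): void {
    if (this.handler) this.handler({ channel: this.name, peerId, text });
  }
}

type Reply = (out: OutboundMessage) => Promise<void>;
type InboundHandler = (msg: InboundMessage, reply: Reply) => void;

class ChannelRegistry {
  private adapters = new Map<string, ChannelAdapter>();
  private onInbound: InboundHandler;

  constructor(onInbound: InboundHandler) {
    this.onInbound = onInbound;
  }

  // Wire the adapter's inbound messages to the shared handler, then connect.
  async load(adapter: ChannelAdapter): Promise<void> {
    adapter.onMessage((msg) => this.onInbound(msg, (out) => adapter.send(msg.peerId, out)));
    await adapter.connect();
    this.adapters.set(adapter.name, adapter);
  }

  async unload(name: string): Promise<void> {
    const adapter = this.adapters.get(name);
    if (!adapter) return;
    await adapter.disconnect();
    this.adapters.delete(name);
  }
}
```

The handler receives a `reply` closure bound to the originating adapter and peer, so agent code never needs to know which channel a message came from.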

3.2 Build Order

Telegram (refactor existing)
Discord (discord.js)
WhatsApp (Baileys)
Slack (Bolt Socket Mode)
WebChat (gateway built-in)

3.3 Security Per Channel

Every adapter implements:

DM pairing: unknown senders get pairing code, must be approved via CLI/UI
Allowlists: channels.<name>.allowFrom config array
Group mention gating: channels.<name>.groups.*.requireMention
Rate limiting: per-sender throttle (configurable)
Message size limits per channel
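The per-channel checks above could be combined into one gate that every adapter calls before forwarding a message. This is a sketch, not the planned implementation: `SenderGate`, the `Verdict` values, and the `maxPerMinute`/`maxMessageChars` fields are assumptions, mention gating is collapsed into a single flag rather than per-group settings, and the pairing-code exchange itself is reduced to a `"needs_pairing"` verdict.

```typescript
interface ChannelSecurityConfig {
  allowFrom: string[];      // from channels.<name>.allowFrom
  requireMention: boolean;  // simplified from channels.<name>.groups.*.requireMention
  maxPerMinute: number;     // assumed rate-limit setting
  maxMessageChars: number;  // assumed size-limit setting
}

interface IncomingInfo {
  senderId: string;
  text: string;
  isGroup: boolean;
  mentionsBot: boolean;
}

type Verdict = "accept" | "needs_pairing" | "ignored" | "too_large" | "rate_limited";

class SenderGate {
  private config: ChannelSecurityConfig;
  private recent = new Map<string, number[]>(); // senderId -> accepted timestamps (ms)

  constructor(config: ChannelSecurityConfig) {
    this.config = config;
  }

  check(msg: IncomingInfo, nowMs: number): Verdict {
    // Unknown senders must go through DM pairing first.
    if (this.config.allowFrom.indexOf(msg.senderId) === -1) return "needs_pairing";
    // In groups, stay silent unless the bot is mentioned.
    if (msg.isGroup && this.config.requireMention && !msg.mentionsBot) return "ignored";
    if (msg.text.length > this.config.maxMessageChars) return "too_large";

    // Sliding one-minute window per sender.
    const windowStart = nowMs - 60000;
    const stamps = (this.recent.get(msg.senderId) ?? []).filter((t) => t > windowStart);
    if (stamps.length >= this.config.maxPerMinute) {
      this.recent.set(msg.senderId, stamps);
      return "rate_limited";
    }
    stamps.push(nowMs);
    this.recent.set(msg.senderId, stamps);
    return "accept";
  }
}
```

Passing the clock in as `nowMs` keeps the gate deterministic under test.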

3.4 Deliverables

src/channels/types.ts - ChannelAdapter interface
src/channels/registry.ts - Channel registry
src/channels/telegram/ - Refactored Telegram adapter
src/channels/discord/ - Discord adapter
src/channels/whatsapp/ - WhatsApp adapter
src/channels/slack/ - Slack adapter
src/channels/webchat/ - WebChat adapter
DM pairing system
Per-channel security (allowlists, mention gating, rate limiting)
Tests per adapter

Phase 4: Skills + MCP

Goal: Extensible capability system with community skills and MCP tool servers.

4.1 Skills System

src/skills/
├── types.ts          # Skill, SkillManifest interfaces
├── loader.ts         # Load SKILL.md + scripts from skill directories
├── registry.ts       # Discovery, gating (OS/bin/env checks)
└── installer.ts      # Auto-install dependencies (with user confirmation)

Skill directory structure:

~/.flynn/workspace/skills/<name>/
├── SKILL.md          # Instructions injected into system prompt
├── manifest.json     # Requirements, permissions, dependencies
└── scripts/          # Executable helpers

Three tiers: bundled (shipped with Flynn), managed (installed via CLI), workspace (user-created).
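The gating step in registry.ts (OS/bin/env checks) might be sketched as a pure function over a manifest and a description of the host. The manifest layout is not specified in this design, so the `requires` field and its keys are assumptions, as are the `HostInfo` shape and the `gateSkill` name. Host facts are passed in rather than read from the environment so the check stays testable.

```typescript
// Hypothetical manifest.json shape; only the gating-related part is shown.
interface SkillManifest {
  name: string;
  requires?: { os?: string[]; bins?: string[]; env?: string[] };
}

// Facts about the machine, gathered elsewhere (platform, PATH scan, environment).
interface HostInfo {
  platform: string;
  availableBins: string[];
  env: Record<string, string | undefined>;
}

interface GateResult { eligible: boolean; missing: string[] }

function gateSkill(manifest: SkillManifest, host: HostInfo): GateResult {
  const missing: string[] = [];
  const req = manifest.requires ?? {};

  // OS check: an empty or absent list means "any platform".
  if (req.os && req.os.length > 0 && req.os.indexOf(host.platform) === -1) {
    missing.push(`os:${host.platform}`);
  }
  // Binary check: every listed executable must be available.
  for (const bin of req.bins ?? []) {
    if (host.availableBins.indexOf(bin) === -1) missing.push(`bin:${bin}`);
  }
  // Env check: every listed variable must be set and non-empty.
  for (const key of req.env ?? []) {
    if (!host.env[key]) missing.push(`env:${key}`);
  }
  return { eligible: missing.length === 0, missing };
}
```

Returning the list of unmet requirements, rather than a bare boolean, gives the installer something concrete to offer to fix.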

4.2 MCP Integration

src/mcp/
├── client.ts         # MCP client (stdio transport)
├── bridge.ts         # Convert MCP tools -> Flynn tool registry entries
└── manager.ts        # Lifecycle: start/stop/restart MCP servers per config

Wire the existing mcp.servers config to actually start MCP server processes, discover their tools, and register them in the tool registry. MCP tools appear alongside builtins -- the agent doesn't know the difference.
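The bridge step can be sketched as a function that wraps one MCP tool descriptor in Flynn's `Tool` interface. Assumptions are labeled in the code: `McpCallFn` stands in for whatever client.ts exposes, the descriptor and result types are trimmed to the fields used here, and the `mcp.<server>.<tool>` naming scheme is hypothetical.

```typescript
interface ToolResult { success: boolean; output: string; error?: string }

interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: unknown): Promise<ToolResult>;
}

// Trimmed view of what an MCP server reports when its tools are listed.
interface McpToolDescriptor {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
}

// Trimmed view of an MCP tool call result.
interface McpCallResult {
  content: Array<{ type: string; text?: string }>;
  isError?: boolean;
}

// Hypothetical wrapper around the MCP client's tool-call request.
type McpCallFn = (toolName: string, args: unknown) => Promise<McpCallResult>;

function bridgeMcpTool(server: string, desc: McpToolDescriptor, call: McpCallFn): Tool {
  return {
    name: `mcp.${server}.${desc.name}`, // naming scheme is an assumption
    description: desc.description ?? "",
    inputSchema: desc.inputSchema,
    async execute(args: unknown): Promise<ToolResult> {
      const res = await call(desc.name, args);
      // Keep only text content; other content types are dropped in this sketch.
      const text = res.content
        .filter((c) => c.type === "text" && typeof c.text === "string")
        .map((c) => c.text as string)
        .join("\n");
      return res.isError
        ? { success: false, output: "", error: text }
        : { success: true, output: text };
    },
  };
}
```

Because the result satisfies the same `Tool` interface as the builtins, it can be registered and executed through the same registry and executor paths.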

4.3 Deliverables

Skills types, loader, registry, installer
MCP client, bridge, manager
Wire MCP config to runtime
Bundled skills (at least: web-search, git, system-info)
Tests

Phase 5: Advanced Features

Build after core is solid. Each is independent.

| Feature | Module | Description |
| --- | --- | --- |
| Cron/scheduling | `src/automation/cron.ts` | Cron expressions trigger agent messages |
| Webhooks | `src/automation/webhooks.ts` | Inbound HTTP triggers |
| Browser tool | `src/tools/builtin/browser.ts` | Playwright headless browser |
| Agent-to-agent | `sessions_list/history/send` tools | Multi-agent coordination |
| Sub-agents | `src/backends/native/subagent.ts` | Spawn scoped child sessions |
| Sandboxing | `src/sandbox/` | Docker per-session isolation |
| Voice | `src/voice/` | STT (Whisper) + TTS (ElevenLabs) |
| Canvas/A2UI | `src/gateway/ui/canvas/` | Agent-driven web workspace |
| CLI surface | `src/cli/` | `flynn gateway`, `flynn agent`, `flynn send`, etc. |
| Mobile nodes | Separate repos | iOS (Swift) + Android (Kotlin) companion apps |
| Onboarding wizard | `src/cli/onboard.ts` | Guided setup flow |
| Doctor diagnostics | `src/cli/doctor.ts` | Config + health validation |

Implementation Notes

Model Selection for Subagents

Use cheaper/faster models via GitHub Copilot for implementation:

Sonnet: Complex implementation tasks (agent loop, gateway, protocol)
Haiku: Mechanical tasks (individual tool implementations, adapter boilerplate, tests)
Opus: Design review, architecture decisions only

Testing Strategy

Unit tests for all tools (mock filesystem/network)
Unit tests for tool registry serialization (Anthropic + OpenAI formats)
Integration tests for agent loop (mock model returns tool_use, verify execution)
Integration tests for gateway protocol (WS client/server)
E2E tests for tool execution (real shell commands in temp dirs)

Migration Path

Phase 1 is additive (no breaking changes to existing code)
Phase 2 refactors daemon (breaking, but internal only)
Phase 3 moves Telegram (file rename, adapter interface)
Each phase is a separate feature branch

Design Version: 1.0
Created: 2026-02-05
Approach: Bottom-up (tools -> gateway -> channels -> skills -> advanced)
