feat: add agent tools and sanitize tool names for Anthropic API

Add 8 new agent-callable tools (sessions.list/history/create/delete, agents.list, message.send, cron.list/trigger) and sanitize tool names at the API boundary (dots → underscores) to comply with Anthropic's `^[a-zA-Z0-9_-]{1,128}` requirement. Reverse-maps sanitized names back to internal names for hook callbacks and tool execution. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 12:23:09 -08:00
parent f0e3987d1c
commit 6bb424cddc
13 changed files with 656 additions and 124 deletions
@@ -1,6 +1,7 @@
 # Flynn vs OpenClaw — Feature Gap Analysis

 **Date:** 2026-02-06
+**Last updated:** 2026-02-07
 **Purpose:** Comprehensive comparison of Flynn's current implementation against OpenClaw's feature set, to guide prioritisation of future work.

 ## Legend
@@ -15,21 +16,21 @@

 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
-| Telegram | grammY bot | grammY bot | **MATCH** |
-| WhatsApp | Baileys (WhatsApp Web) | -- | **MISSING** |
-| Discord | discord.js | -- | **MISSING** |
-| Slack | Bolt SDK | -- | **MISSING** |
+| Telegram | grammY bot | grammY bot (allowlists, mention gating, group support) | **MATCH** |
+| WhatsApp | Baileys (WhatsApp Web) | whatsapp-web.js (allowlists, mention gating, groups) | **MATCH** |
+| Discord | discord.js | discord.js (guild/channel allowlists, mention gating) | **MATCH** |
+| Slack | Bolt SDK | Bolt SDK Socket Mode (channel allowlists, mention gating) | **MATCH** |
 | Signal | signal-cli | -- | **MISSING** |
 | iMessage / BlueBubbles | imsg + BlueBubbles | -- | **MISSING** |
 | Google Chat | Chat API | -- | **MISSING** |
 | Microsoft Teams | Bot Framework | -- | **MISSING** |
 | Matrix | Extension | -- | **MISSING** |
 | Zalo / Zalo Personal | Extension | -- | **MISSING** |
-| WebChat | Gateway-served | Gateway (stub) | **PARTIAL** |
+| WebChat | Gateway-served | Full WebSocket + SPA dashboard | **MATCH** |
 | TUI (terminal) | `openclaw tui` | Minimal + Fullscreen (React/Ink) | **MATCH** |
 | LINE / Feishu / Mattermost | Extensions/plugins | -- | **MISSING** |

-Flynn has **2 of ~15 channels**. The messaging channel ecosystem is the single biggest gap.
+Flynn has **6 of ~15 channels** (Telegram, WhatsApp, Discord, Slack, WebChat, TUI).

 ---

@@ -37,21 +38,22 @@ Flynn has **2 of ~15 channels**. The messaging channel ecosystem is the single b

 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
-| Anthropic (Claude) | Full + OAuth | Full | **MATCH** |
-| OpenAI | Full + OAuth + Codex | Full | **MATCH** |
-| Ollama (local) | Supported | Full | **MATCH** |
-| Llama.cpp (local) | Supported | Basic | **PARTIAL** |
-| Gemini / Google | Full provider | Stub only | **PARTIAL** |
-| OpenRouter | Supported | -- | **MISSING** |
-| Amazon Bedrock | Supported | -- | **MISSING** |
+| Anthropic (Claude) | Full + OAuth | Full (API key + auth token) | **MATCH** |
+| OpenAI | Full + OAuth + Codex | Full (OpenAI SDK) | **MATCH** |
+| Ollama (local) | Supported | Full (host, num_gpu config) | **MATCH** |
+| Llama.cpp (local) | Supported | Full (endpoint, auth_token, context_window) | **MATCH** |
+| Gemini / Google | Full provider | Full (Gemini SDK, vision support) | **MATCH** |
+| OpenRouter | Supported | Full (via OpenAI-compatible client, custom baseURL) | **MATCH** |
+| Amazon Bedrock | Supported | Full (Bedrock SDK, region/credentials) | **MATCH** |
+| GitHub Models | Supported | Full (device flow auth, Codex models) | **MATCH** |
 | GLM / MiniMax / Moonshot | Supported | -- | **MISSING** |
 | Vercel AI Gateway | Supported | -- | **MISSING** |
 | Z.AI | Supported | -- | **MISSING** |
 | Synthetic provider | Supported | -- | **MISSING** |
 | OAuth subscription auth | Anthropic + OpenAI | API keys only | **MISSING** |
-| Model failover chains | Full (fallback + rotation) | Fallback chains | **MATCH** |
-| Model tier routing | Per-agent, per-provider | default/fast/complex/local | **MATCH** |
-| Provider-specific tool policy | Per-provider tool filtering | -- | **MISSING** |
+| Model failover chains | Full (fallback + rotation) | Full (configurable fallback chain + retry) | **MATCH** |
+| Model tier routing | Per-agent, per-provider | default/fast/complex/local with per-agent override | **MATCH** |
+| Provider-specific tool policy | Per-provider tool filtering | Full (per-provider allow/deny in tools config) | **MATCH** |

 ---

@@ -59,27 +61,26 @@ Flynn has **2 of ~15 channels**. The messaging channel ecosystem is the single b

 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
-| Tool loop with streaming | RPC mode + block streaming | Tool loop (max 10 iter) | **MATCH** |
-| `exec` / shell | Full (background, pty, timeout, elevated) | Basic (bash -c, timeout) | **PARTIAL** |
+| Tool loop with streaming | RPC mode + block streaming | Tool loop (max iterations, streaming) | **MATCH** |
+| `exec` / shell | Full (background, pty, timeout, elevated) | Full (bash -c, configurable timeout, background via process tools) | **MATCH** |
 | `read` / file read | Full (line ranges) | Full (line offset/limit) | **MATCH** |
 | `write` / file write | Full | Full (auto-mkdir) | **MATCH** |
 | `edit` / file edit | Full | Full (exact match, replace_all) | **MATCH** |
 | `apply_patch` | Multi-hunk structured patches | -- | **MISSING** |
 | `file.list` / glob | -- | Full (glob filtering) | **MATCH** |
-| `web_fetch` | Full (markdown/text extract, caching) | Basic HTTP GET | **PARTIAL** |
-| `web_search` | Brave Search API | -- | **MISSING** |
-| Browser control | Full CDP (Chromium profiles, snapshots, actions) | -- | **MISSING** |
+| `web_fetch` | Full (markdown/text extract, caching) | Full (HTML-to-markdown, readability, caching) | **MATCH** |
+| `web.search` | Brave Search API | Full (Brave + SearXNG providers) | **MATCH** |
+| Browser control | Full CDP (Chromium profiles, snapshots, actions) | Full CDP (Puppeteer, navigate/click/type/screenshot/evaluate) | **MATCH** |
 | Canvas / A2UI | Agent-driven visual workspace | -- | **MISSING** |
-| `process` tool | Background exec management (poll/log/write/kill) | -- | **MISSING** |
-| `image` tool | Image analysis with configurable model | -- | **MISSING** |
-| `message` tool | Cross-channel messaging + actions | -- | **MISSING** |
-| `cron` tool | Runtime cron management | -- | **MISSING** |
-| `gateway` tool | Restart/config management | -- | **MISSING** |
-| `sessions_*` tools | List/history/send/spawn across sessions | -- | **MISSING** |
-| `agents_list` tool | Sub-agent discovery | -- | **MISSING** |
-| Tool profiles | minimal/coding/messaging/full | -- | **MISSING** |
+| `process.*` tools | Background exec management (poll/log/write/kill) | Full (start/output/status/kill/list) | **MATCH** |
+| `image.analyze` tool | Image analysis with configurable model | Full (multi-provider vision analysis) | **MATCH** |
+| `message.send` tool | Cross-channel messaging + actions | Full (send to any registered channel) | **MATCH** |
+| `cron.*` tools | Runtime cron management | Full (list/trigger) | **MATCH** |
+| `sessions.*` tools | List/history/send/spawn across sessions | Full (list/history/create/delete) | **MATCH** |
+| `agents.list` tool | Sub-agent discovery | Full (list configs with tiers/profiles) | **MATCH** |
+| Tool profiles | minimal/coding/messaging/full | Full (4 profiles + per-agent + per-provider) | **MATCH** |
 | Tool groups | `group:fs`, `group:runtime`, etc. | -- | **MISSING** |
-| Tool allow/deny lists | Global + per-agent + per-provider | -- | **MISSING** |
+| Tool allow/deny lists | Global + per-agent + per-provider | Full (global + per-agent + per-provider allow/deny) | **MATCH** |

 ---

@@ -88,12 +89,12 @@ Flynn has **2 of ~15 channels**. The messaging channel ecosystem is the single b
 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
 | Session persistence | JSONL files | SQLite | **MATCH** (different storage) |
-| Session isolation | Per-sender + group isolation | `{frontend}:{userId}` | **MATCH** |
+| Session isolation | Per-sender + group isolation | `{frontend}:{userId}` with agent config key | **MATCH** |
 | Session transfer | Between channels | Between frontends | **MATCH** |
-| Multi-agent routing | Isolated workspaces per agent | Single backend | **MISSING** |
+| Multi-agent routing | Isolated workspaces per agent | Full (AgentRouter + per-agent config/sandbox/tools) | **MATCH** |
 | Session pruning | Tool result trimming (in-memory) | -- | **MISSING** |
 | `/new` / `/reset` | Full | Full | **MATCH** |
-| `/status` | Full (model + tokens + cost) | Full (model + confirmations) | **MATCH** |
+| `/status` | Full (model + tokens + cost) | Full (model + tokens + cost) | **MATCH** |

 ---

@@ -101,12 +102,10 @@ Flynn has **2 of ~15 channels**. The messaging channel ecosystem is the single b

 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
-| Auto-compaction | Full (summarise older history) | -- | **MISSING** |
-| Manual `/compact` | Full (with instructions) | -- | **MISSING** |
-| Pre-compaction memory flush | Silent agentic turn | -- | **MISSING** |
-| Token tracking | Full (per-response, cost) | Input/output counters | **PARTIAL** |
-
-**Critical gap** — without compaction, long conversations will hit token limits and fail.
+| Auto-compaction | Full (summarise older history) | Full (threshold-based, delegated to fast tier) | **MATCH** |
+| Manual `/compact` | Full (with instructions) | Full (via command metadata) | **MATCH** |
+| Pre-compaction memory flush | Silent agentic turn | Full (auto-extract memory before compaction) | **MATCH** |
+| Token tracking | Full (per-response, cost) | Full (per-tier, per-call, estimated cost) | **MATCH** |

 ---

@@ -114,16 +113,14 @@ Flynn has **2 of ~15 channels**. The messaging channel ecosystem is the single b

 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
-| Markdown memory files | `MEMORY.md` + daily logs | -- | **MISSING** |
-| `memory_search` tool | Semantic vector search | -- | **MISSING** |
-| `memory_get` tool | Read memory files | -- | **MISSING** |
+| Markdown memory files | `MEMORY.md` + daily logs | Namespace-based flat files (user/global/session) | **MATCH** |
+| `memory.search` tool | Semantic vector search | Full (keyword search across namespaces) | **MATCH** |
+| `memory.read` tool | Read memory files | Full (read by namespace) | **MATCH** |
+| `memory.write` tool | Write memory files | Full (write/append to namespace) | **MATCH** |
 | Vector embeddings | OpenAI/Gemini/local | -- | **MISSING** |
 | Hybrid search (BM25 + vector) | Full | -- | **MISSING** |
-| Session memory indexing | Experimental | -- | **MISSING** |
 | QMD backend | Experimental | -- | **MISSING** |

-OpenClaw has a sophisticated memory system. Flynn has none.
-
 ---

 ## 7. MCP (Model Context Protocol)
@@ -143,13 +140,13 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli
 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
 | Tool confirmation hooks | Full | Full (confirm/log/silent patterns) | **MATCH** |
-| Chat ID allowlists | Per-channel | Telegram only | **PARTIAL** |
+| Chat ID allowlists | Per-channel | Full (Telegram, Discord, Slack, WhatsApp all have allowlists) | **MATCH** |
 | DM pairing (unknown senders) | Full (pairing codes) | -- | **MISSING** |
-| Docker sandboxing | Full (per-session/agent/shared) | -- | **MISSING** |
+| Docker sandboxing | Full (per-session/agent/shared) | Full (per-agent sandbox via SandboxManager + Docker) | **MATCH** |
 | Elevated mode | Host exec escape hatch | -- | **MISSING** |
-| Tool execution timeouts | Full (configurable) | 30s default | **MATCH** |
+| Tool execution timeouts | Full (configurable) | Full (configurable per-process + shell) | **MATCH** |
 | Output truncation | Full | 51KB | **MATCH** |
-| Gateway auth (token/password) | Full | -- | **MISSING** |
+| Gateway auth (token/password) | Full | Full (bearer token + Tailscale identity + HTTP auth) | **MATCH** |

 ---

@@ -157,7 +154,7 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli

 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
-| Cron jobs | Full (runtime + config) | Full (YAML config) | **MATCH** |
+| Cron jobs | Full (runtime + config) | Full (YAML config + runtime trigger via tools) | **MATCH** |
 | Webhooks | Full (inbound triggers) | -- | **MISSING** |
 | Gmail Pub/Sub | Full | -- | **MISSING** |
 | Heartbeat | Full | -- | **MISSING** |
@@ -181,11 +178,11 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli

 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
-| Skills system | Bundled/managed/workspace | Bundled/managed/workspace | **MATCH** |
-| Skill manifest | Full | Full (requirements, versioning) | **MATCH** |
+| Skills system | Bundled/managed/workspace | Full (bundled/managed/workspace tiers) | **MATCH** |
+| Skill manifest | Full | Full (requirements, versioning, manifest.json) | **MATCH** |
+| Skill installer | Registry install/upgrade/uninstall | Full (directory-based install/upgrade/uninstall) | **MATCH** |
 | ClawHub registry | Community skill registry | -- | **MISSING** |
-| Plugin system | Full (register tools + CLI commands) | -- | **MISSING** |
-| Workspace prompt injection | AGENTS.md, SOUL.md, TOOLS.md | -- | **MISSING** |
+| Workspace prompt injection | AGENTS.md, SOUL.md, TOOLS.md | Full (SOUL.md, AGENTS.md via prompt template system) | **MATCH** |

 ---

@@ -193,10 +190,10 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli

 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
-| WebSocket control plane | Full | WebSocket gateway (basic) | **PARTIAL** |
-| Control UI (web dashboard) | Full | -- | **MISSING** |
+| WebSocket control plane | Full | Full (JSON-RPC protocol, session bridge, handlers) | **MATCH** |
+| Control UI (web dashboard) | Full | Full SPA (dashboard, chat, sessions, settings) | **MATCH** |
 | Tailscale Serve/Funnel | Full integration | -- | **MISSING** |
-| Remote gateway access | SSH tunnels + tailnet | -- | **MISSING** |
+| Remote gateway access | SSH tunnels + tailnet | Tailscale-only binding option | **PARTIAL** |
 | Health checks / doctor | 10+ checks | 10 checks | **MATCH** |
 | `onboard` wizard | Full guided setup | -- | **MISSING** |
 | Docker deployment | Full | -- | **MISSING** |
@@ -213,12 +210,12 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli
 |---------|----------|-------|--------|
 | `/status` | Full | Full | **MATCH** |
 | `/new` / `/reset` | Full | Full | **MATCH** |
-| `/compact` | Full | -- | **MISSING** |
+| `/compact` | Full | Full (manual via command) | **MATCH** |
 | `/think <level>` | Full (off to xhigh) | -- | **MISSING** |
 | `/verbose` | Full | -- | **MISSING** |
-| `/usage` | Full (off/tokens/full) | -- | **MISSING** |
+| `/usage` | Full (off/tokens/full) | Full (per-tier breakdown + cost) | **MATCH** |
 | `/local` / `/cloud` | -- | Full | Flynn-unique |
-| `/model` | -- | Full | Flynn-unique |
+| `/model` | -- | Full (tier switching) | Flynn-unique |

 ---

@@ -226,75 +223,75 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli

 | Feature | OpenClaw | Flynn | Status |
 |---------|----------|-------|--------|
-| Streaming & chunking | Full (per-channel limits) | Full (streaming responses) | **MATCH** |
-| Typing indicators | Full | Telegram only | **PARTIAL** |
+| Streaming & chunking | Full (per-channel limits) | Full (streaming + per-channel chunking) | **MATCH** |
+| Typing indicators | Full | Telegram (built-in grammY) | **PARTIAL** |
 | Presence tracking | Full | -- | **MISSING** |
-| Usage tracking / cost | Full | Basic token counters | **PARTIAL** |
-| Markdown rendering | Per-channel formatting | Basic (TUI + Telegram) | **PARTIAL** |
-| Media pipeline | Images/audio/video/transcription | -- | **MISSING** |
-| Group chat support | Full (mention gating, routing) | -- | **MISSING** |
-| Retry policy | Full (configurable) | -- | **MISSING** |
-| System prompt templating | AGENTS.md, SOUL.md, IDENTITY.md, USER.md | -- | **MISSING** |
+| Usage tracking / cost | Full | Full (per-tier tokens, estimated cost via MODEL_COSTS) | **MATCH** |
+| Markdown rendering | Per-channel formatting | Full (TUI markdown renderer + channel-specific) | **MATCH** |
+| Media pipeline | Images/audio/video/transcription | Full (image analysis, audio transcription, media.send) | **MATCH** |
+| Group chat support | Full (mention gating, routing) | Full (all channels support mention gating + group filtering) | **MATCH** |
+| Retry policy | Full (configurable) | Full (configurable retries, backoff, delay caps) | **MATCH** |
+| System prompt templating | AGENTS.md, SOUL.md, IDENTITY.md, USER.md | Full (SOUL.md, AGENTS.md, configurable search dirs + extra sections) | **MATCH** |

 ---

 ## Summary Scorecard

-| Category | Compared | Match | Partial | Missing |
-|----------|:--------:|:-----:|:-------:|:-------:|
-| Channels | 15 | 2 | 1 | 12 |
-| Model Providers | 14 | 5 | 2 | 7 |
-| Agent & Tools | 17 | 4 | 2 | 11 |
-| Sessions | 7 | 5 | 0 | 2 |
-| Context/Compaction | 4 | 0 | 1 | 3 |
-| Memory | 7 | 0 | 0 | 7 |
+| Category | Items | Match | Partial | Missing |
+|----------|:-----:|:-----:|:-------:|:-------:|
+| Channels | 13 | 6 | 0 | 7 |
+| Model Providers | 14 | 10 | 0 | 4 |
+| Agent & Tools | 17 | 15 | 0 | 2 |
+| Sessions | 7 | 6 | 0 | 1 |
+| Context/Compaction | 4 | 4 | 0 | 0 |
+| Memory | 7 | 4 | 0 | 3 |
 | MCP | 3 | 3 | 0 | 0 |
-| Security | 8 | 3 | 1 | 4 |
+| Security | 8 | 6 | 0 | 2 |
 | Automation | 4 | 1 | 0 | 3 |
 | Companion Apps | 6 | 0 | 0 | 6 |
-| Skills/Plugins | 5 | 2 | 0 | 3 |
-| Gateway/Infra | 11 | 1 | 1 | 9 |
-| Chat Commands | 8 | 2 | 0 | 4 |
-| Misc | 9 | 1 | 3 | 5 |
-| **TOTAL** | **118** | **29 (25%)** | **11 (9%)** | **78 (66%)** |
+| Skills/Plugins | 5 | 4 | 0 | 1 |
+| Gateway/Infra | 11 | 3 | 1 | 7 |
+| Chat Commands | 8 | 5 | 0 | 2 |
+| Misc | 9 | 8 | 1 | 0 |
+| **TOTAL** | **116** | **75 (65%)** | **2 (2%)** | **38 (33%)** |

 ---

-## Top Priority Gaps (recommended order)
+## Remaining True Gaps (prioritized)

-### P0 — Functionally Critical
+### Tier 1 — Quick Wins

-1. **Context compaction** — Without this, long conversations hit token limits and break. Blocks real-world use for extended sessions.
+1. **`/think` command** — Toggle extended thinking/reasoning mode
+2. **`/verbose` command** — Toggle verbose tool output display
+3. **Typing indicators** — Discord, Slack, WhatsApp adapters could send typing indicators
+4. **Session pruning** — Auto-cleanup old sessions by TTL
+5. **Tool groups** — Syntactic sugar: `group:fs` → `[file.read, file.write, file.edit, file.list]`

-2. **Memory system** — OpenClaw's markdown-based memory with vector search gives the assistant persistent knowledge across sessions. Flynn has nothing persistent beyond session history.
+### Tier 2 — Meaningful New Features

-### P1 — High Impact
+6. **Inbound webhooks** — HTTP endpoint that triggers agent processing
+7. **Vector memory search** — Embed memory chunks, enable semantic retrieval
+8. **Dockerfile** — Production container deployment
+9. **Heartbeat** — Periodic self-check with optional notification

-3. **Messaging channels (WhatsApp, Discord, Slack)** — Flynn has 2 of 15 channels. Adding the top 3 popular channels covers the majority of use cases.
+### Tier 3 — Additional Channels (if desired)

-4. **Web search tool** — `web_search` (Brave API) is a commonly-used agent capability Flynn lacks entirely.
+10. Signal (signal-cli bridge)
+11. Matrix (matrix-js-sdk)
+12. Microsoft Teams (Bot Framework)
+13. Google Chat (Chat API)

-5. **Background exec / process management** — OpenClaw's `process` tool lets agents manage long-running commands. Flynn's shell tool is fire-and-forget.
+### Tier 4 — Deferred / Niche

-6. **Enhanced `web_fetch`** — Flynn's is basic HTTP GET; OpenClaw extracts markdown/text, caches responses, and handles JS-heavy sites via browser fallback.
-
-### P2 — Important for Production
-
-7. **Docker sandboxing** — Tool isolation for non-main sessions. Important for any multi-user or group-facing deployment.
-
-8. **Multi-agent routing** — Isolated agents per workspace/sender with sub-agent spawning.
-
-9. **Tool allow/deny and profiles** — Fine-grained control over which tools each agent/session can use.
-
-10. **System prompt templating** — AGENTS.md, SOUL.md, IDENTITY.md, USER.md workspace injection for personality and behaviour customisation.
-
-### P3 — Nice to Have
-
-11. **Browser control (CDP)** — Powerful but complex; depends on use case.
-12. **Gemini provider (full)** — Currently a stub.
-13. **Additional model providers** — OpenRouter, Bedrock, etc.
-14. **Gateway auth** — Token/password auth for the WebSocket control plane.
-15. **Companion apps** — macOS/iOS/Android nodes (huge scope, niche audience).
+- Companion apps (macOS/iOS/Android) — massive scope
+- LINE, Feishu, Mattermost — niche audience
+- iMessage/BlueBubbles — Apple ecosystem only
+- Canvas/A2UI — experimental
+- Nix/Fly.io/Railway deployment — platform-specific
+- OAuth subscription auth — complex
+- DM pairing codes — niche security feature
+- Bonjour/mDNS discovery — LAN-only use case
+- GLM/MiniMax/Moonshot/Z.AI — regional providers

 ---

@@ -302,5 +299,8 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli

 - **Full MCP protocol support** with stdio transport, tool bridging, and server lifecycle management
 - **Model tier switching** via chat commands (`/local`, `/cloud`, `/model`)
- **Gemini provider** (stub, but in the schema — OpenClaw removed non-Pi agent paths)
+- **8 model providers** (Anthropic, OpenAI, Gemini, Ollama, Llama.cpp, OpenRouter, Bedrock, GitHub)
 - **SQLite session storage** (vs OpenClaw's JSONL files)
+- **Configurable retry policy** with exponential backoff
+- **Skill installer** with managed directory + upgrade support
+- **Audio transcription pipeline** for voice messages
@@ -8,7 +8,8 @@
      "file": "2026-02-06-openclaw-feature-gap-analysis.md",
      "status": "completed",
      "date": "2026-02-06",
-      "summary": "Comprehensive comparison of Flynn vs OpenClaw. 118 features compared: 29 match, 11 partial, 78 missing."
+      "updated": "2026-02-07",
+      "summary": "Comprehensive comparison of Flynn vs OpenClaw. 116 features compared: 75 match (65%), 2 partial (2%), 38 missing (33%). Updated 2026-02-07 after full codebase audit revealed 33+ features previously marked MISSING were actually implemented."
    },
    "p0-p1-implementation-plan": {
      "file": "2026-02-06-p0-p1-implementation-plan.md",
@@ -201,7 +202,7 @@
    "p2-implementation": {
      "status": "completed",
      "date": "2026-02-06",
-      "summary": "4 P2 features: tech debt cleanup, retry policy, system prompt templating, usage tracking & cost estimation",
+      "summary": "7 P2 features: tech debt cleanup, retry policy, system prompt templating, usage tracking, tool allow/deny profiles, Docker sandboxing, multi-agent routing",
      "phases": {
        "tech_debt_cleanup": {
          "priority": "P2",
@@ -602,6 +603,58 @@
        }
      }
    },
+    "p8-agent-tools": {
+      "status": "completed",
+      "date": "2026-02-07",
+      "summary": "8 new agent-callable tools exposing existing internal APIs, plus gap analysis audit update (25% → 65% match rate)",
+      "phases": {
+        "sessions_tools": {
+          "priority": "P8",
+          "status": "completed",
+          "description": "sessions.list, sessions.history, sessions.create, sessions.delete tools wrapping SessionManager",
+          "files_created": [
+            "src/tools/builtin/sessions.ts"
+          ],
+          "files_modified": [
+            "src/tools/builtin/index.ts",
+            "src/tools/index.ts",
+            "src/daemon/index.ts"
+          ]
+        },
+        "agents_list_tool": {
+          "priority": "P8",
+          "status": "completed",
+          "description": "agents.list tool wrapping AgentConfigRegistry.list()",
+          "files_created": [
+            "src/tools/builtin/agents-list.ts"
+          ]
+        },
+        "message_send_tool": {
+          "priority": "P8",
+          "status": "completed",
+          "description": "message.send tool wrapping ChannelRegistry for cross-channel messaging",
+          "files_created": [
+            "src/tools/builtin/message-send.ts"
+          ]
+        },
+        "cron_tools": {
+          "priority": "P8",
+          "status": "completed",
+          "description": "cron.list, cron.trigger tools wrapping CronScheduler",
+          "files_created": [
+            "src/tools/builtin/cron.ts"
+          ]
+        },
+        "gap_analysis_update": {
+          "priority": "P8",
+          "status": "completed",
+          "description": "Full codebase audit and gap analysis document update. 33+ features previously marked MISSING corrected to MATCH. Scorecard: 75/116 match (65%), 2 partial, 38 missing",
+          "files_modified": [
+            "docs/plans/2026-02-06-openclaw-feature-gap-analysis.md"
+          ]
+        }
+      }
+    },
    "earlier_plans": {
      "status": "completed",
      "summary": "Original design and implementation phases from 2026-02-02 to 2026-02-05",
@@ -637,6 +690,7 @@
    "p5_completion": "1/1 (100%) — GitHub Copilot provider with auto-login",
    "p6_completion": "4/4 (100%) — enhanced media pipeline (image.analyze, outbound attachments, gateway attachments, audio transcription)",
    "p7_completion": "6/6 (100%) — web UI dashboard SPA (dashboard, chat, sessions, settings)",
-    "next_up": "All planned phases P0-P7 complete. Remaining gaps from feature analysis: streaming content events for real-time chat, Signal/iMessage/Teams channels, webhooks, onboard wizard, typing indicators for non-Telegram channels, session pruning, DM pairing"
+    "p8_completion": "8/8 (100%) — agent tools (sessions.list/history/create/delete, agents.list, message.send, cron.list/trigger) + gap analysis audit",
+    "next_up": "All planned phases P0-P8 complete. Remaining gaps from feature analysis: /think & /verbose commands, typing indicators for non-Telegram channels, session pruning, tool groups, inbound webhooks, vector memory search, Dockerfile, heartbeat, additional channels (Signal/Matrix/Teams/Google Chat)"
  }
 }