feat: implement Tier 3 features — lane queue, credential redaction, token dashboard, xAI, Voyage AI

- Lane Queue: per-session FIFO queue in gateway replacing reject-when-busy (9 tests)
- Credential Redaction: redactConfig() expanded to cover 18+ secret fields (16 tests)
- Web UI Token Dashboard: system.tokenUsage endpoint + Usage page with summary cards
- xAI (Grok) Provider: OpenAI-compatible client with model pricing
- Voyage AI Embeddings: new embedding provider with configurable dimensions (5 tests)
- Update gap analysis: 90→95 match (70%→74%), Tier 3 section marked DONE
- Update state.json: test count 1001→1034, add tier3_completion entry

Total: 1034 tests passing across 85 files, typecheck clean
2026-02-09 10:32:57 -08:00
parent 1d126cddfb
commit 9be8f76bc7
26 changed files with 1395 additions and 105 deletions
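The commit message describes the Lane Queue as a per-session FIFO queue in the gateway that replaces reject-when-busy. The gateway code itself is not part of this excerpt, so the following is only a minimal TypeScript sketch of that idea, assuming one promise chain per session. `LaneQueue`, `enqueue`, and the session IDs are hypothetical names, not Flynn's actual API.

```typescript
// Hypothetical sketch of a per-session FIFO "lane" queue.
// Tasks for the same session run one at a time, in arrival order;
// tasks for different sessions do not wait on each other.
type Task<T> = () => Promise<T>;

class LaneQueue {
  // Tail of the promise chain for each session. A missing key means the lane is idle.
  private tails = new Map<string, Promise<unknown>>();

  enqueue<T>(sessionId: string, task: Task<T>): Promise<T> {
    const prev = this.tails.get(sessionId) ?? Promise.resolve();

    // Start this task only after the previous task in the lane has settled.
    const run = prev.then(() => task());

    // The stored tail never rejects, so one failed task cannot block the lane.
    const tail = run.catch(() => undefined);
    this.tails.set(sessionId, tail);

    // Drop the lane entry once it drains, unless a newer task was queued meanwhile.
    void tail.then(() => {
      if (this.tails.get(sessionId) === tail) this.tails.delete(sessionId);
    });

    // The caller still sees the task's own result or error.
    return run;
  }
}
```

Instead of rejecting a second message while a session is busy, a caller under these assumptions would write `queue.enqueue(sessionId, () => handleMessage(msg))`, where `handleMessage` is also a placeholder. Chaining promises keeps the sketch short; an explicit array-backed queue would make depth limits and cancellation easier to add.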
@@ -1,7 +1,7 @@
 # Flynn vs OpenClaw — Feature Gap Analysis

 **Date:** 2026-02-06
-**Last updated:** 2026-02-07 (post tier-2 implementation)
+**Last updated:** 2026-02-09 (refreshed against OpenClaw v2026.2.6)
 **Purpose:** Comprehensive comparison of Flynn's current implementation against OpenClaw's feature set, to guide prioritisation of future work.

 ## Legend
@@ -46,9 +46,11 @@ Flynn has **6 of ~15 channels** (Telegram, WhatsApp, Discord, Slack, WebChat, TU
 | OpenRouter | Supported | Full (via OpenAI-compatible client, custom baseURL) | **MATCH** |
 | Amazon Bedrock | Supported | Full (Bedrock SDK, region/credentials) | **MATCH** |
 | GitHub Models | Supported | Full (device flow auth, Codex models) | **MATCH** |
-| GLM / MiniMax / Moonshot | Supported | -- | **MISSING** |
+| Zhipu AI (GLM) | Supported | Full (OpenAI-compatible client, GLM models) | **MATCH** |
+| MiniMax / Moonshot | Supported | -- | **MISSING** |
+| xAI (Grok) | Supported (v2026.2.6) | Full (OpenAI-compatible client, xai provider) | **MATCH** |
 | Vercel AI Gateway | Supported | -- | **MISSING** |
-| Z.AI | Supported | -- | **MISSING** |
+| Voyage AI embeddings | Supported (v2026.2.6) | Full (Voyage AI provider, configurable dimensions) | **MATCH** |
 | Synthetic provider | Supported | -- | **MISSING** |
 | OAuth subscription auth | Anthropic + OpenAI | API keys only | **MISSING** |
 | Model failover chains | Full (fallback + rotation) | Full (configurable fallback chain + retry) | **MATCH** |
@@ -71,6 +73,7 @@ Flynn has **6 of ~15 channels** (Telegram, WhatsApp, Discord, Slack, WebChat, TU
 | `web_fetch` | Full (markdown/text extract, caching) | Full (HTML-to-markdown, readability, caching) | **MATCH** |
 | `web.search` | Brave Search API | Full (Brave + SearXNG providers) | **MATCH** |
 | Browser control | Full CDP (Chromium profiles, snapshots, actions) | Full CDP (Puppeteer, navigate/click/type/screenshot/evaluate) | **MATCH** |
+| Lane Queue (serial exec) | Concurrency control for sessions | Full (per-session FIFO queue in gateway) | **MATCH** |
 | Canvas / A2UI | Agent-driven visual workspace | -- | **MISSING** |
 | `process.*` tools | Background exec management (poll/log/write/kill) | Full (start/output/status/kill/list) | **MATCH** |
 | `image.analyze` tool | Image analysis with configurable model | Full (multi-provider vision analysis) | **MATCH** |
@@ -143,6 +146,8 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli
 | Tool confirmation hooks | Full | Full (confirm/log/silent patterns) | **MATCH** |
 | Chat ID allowlists | Per-channel | Full (Telegram, Discord, Slack, WhatsApp all have allowlists) | **MATCH** |
 | DM pairing (unknown senders) | Full (pairing codes) | -- | **MISSING** |
+| Credential redaction | Config responses redacted (v2026.2.6) | Full (18+ secret fields redacted from config API) | **MATCH** |
+| Skill/plugin code safety scanner | Static analysis (v2026.2.6) | -- | **MISSING** |
 | Docker sandboxing | Full (per-session/agent/shared) | Full (per-agent sandbox via SandboxManager + Docker) | **MATCH** |
 | Elevated mode | Host exec escape hatch | -- | **MISSING** |
 | Tool execution timeouts | Full (configurable) | Full (configurable per-process + shell) | **MATCH** |
@@ -199,6 +204,8 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli
 | `onboard` wizard | Full guided setup | -- | **MISSING** |
 | Docker deployment | Full | Full (multi-stage Dockerfile, docker-compose.yml) | **MATCH** |
 | Nix deployment | Full | -- | **MISSING** |
+| Shell completion | Auto-detect + cached (v2026.2.3) | -- | **MISSING** |
+| Announce delivery mode | Isolated job delivery (v2026.2.3) | -- | **MISSING** |
 | Fly.io / Railway / Render | Supported | -- | **MISSING** |
 | Bonjour/mDNS discovery | Full | -- | **MISSING** |
 | Gateway lock | Full | -- | **MISSING** |
@@ -227,6 +234,7 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli
 | Streaming & chunking | Full (per-channel limits) | Full (streaming + per-channel chunking) | **MATCH** |
 | Typing indicators | Full | Telegram, Discord, WhatsApp (per-adapter) | **MATCH** |
 | Presence tracking | Full | -- | **MISSING** |
+| Web UI token dashboard | Usage visualization (v2026.2.6) | Full (Usage page with summary cards, per-session table, auto-refresh) | **MATCH** |
 | Usage tracking / cost | Full | Full (per-tier tokens, estimated cost via MODEL_COSTS) | **MATCH** |
 | Markdown rendering | Per-channel formatting | Full (TUI markdown renderer + channel-specific) | **MATCH** |
 | Media pipeline | Images/audio/video/transcription | Full (image analysis, audio transcription, media.send) | **MATCH** |
@@ -241,20 +249,22 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli
 | Category | Items | Match | Partial | Missing |
 |----------|:-----:|:-----:|:-------:|:-------:|
 | Channels | 13 | 6 | 0 | 7 |
-| Model Providers | 14 | 10 | 0 | 4 |
-| Agent & Tools | 18 | 18 | 0 | 0 |
+| Model Providers | 18 | 14 | 0 | 4 |
+| Agent & Tools | 22 | 21 | 0 | 1 |
 | Sessions | 7 | 7 | 0 | 0 |
 | Context/Compaction | 4 | 4 | 0 | 0 |
 | Memory | 7 | 6 | 0 | 1 |
 | MCP | 3 | 3 | 0 | 0 |
-| Security | 8 | 6 | 0 | 2 |
+| Security | 10 | 7 | 0 | 3 |
 | Automation | 4 | 4 | 0 | 0 |
 | Companion Apps | 6 | 0 | 0 | 6 |
 | Skills/Plugins | 5 | 4 | 0 | 1 |
-| Gateway/Infra | 11 | 4 | 1 | 6 |
-| Chat Commands | 8 | 7 | 0 | 0 |
-| Misc | 9 | 9 | 0 | 0 |
-| **TOTAL** | **117** | **88 (75%)** | **1 (1%)** | **27 (23%)**  |
+| Gateway/Infra | 13 | 4 | 1 | 8 |
+| Chat Commands | 6 | 6 | 0 | 0 |
+| Misc | 10 | 9 | 0 | 1 |
+| **TOTAL** | **128** | **95 (74%)** | **1 (1%)** | **32 (25%)**  |
+
+*Note: Against the refreshed 128-item baseline, the match rate improved from 70% (90 items) to 74% (95 items) after implementing the Tier 3 features (Lane Queue, credential redaction, Web UI token dashboard, xAI Grok provider, Voyage AI embeddings).*

 ---
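The totals in the updated summary table can be re-derived from its rows. The short TypeScript check below copies the row values from the new side of the diff and recomputes the totals and rounded percentages. The variable names are illustrative, and the 90-item starting point comes from the commit message ("90→95 match").

```typescript
// Rows from the updated summary table: [category, items, match, partial, missing].
const rows: [string, number, number, number, number][] = [
  ["Channels", 13, 6, 0, 7],
  ["Model Providers", 18, 14, 0, 4],
  ["Agent & Tools", 22, 21, 0, 1],
  ["Sessions", 7, 7, 0, 0],
  ["Context/Compaction", 4, 4, 0, 0],
  ["Memory", 7, 6, 0, 1],
  ["MCP", 3, 3, 0, 0],
  ["Security", 10, 7, 0, 3],
  ["Automation", 4, 4, 0, 0],
  ["Companion Apps", 6, 0, 0, 6],
  ["Skills/Plugins", 5, 4, 0, 1],
  ["Gateway/Infra", 13, 4, 1, 8],
  ["Chat Commands", 6, 6, 0, 0],
  ["Misc", 10, 9, 0, 1],
];

// Sum each numeric column across all categories.
const totals = rows.reduce(
  (acc, [, items, match, partial, missing]) => ({
    items: acc.items + items,
    match: acc.match + match,
    partial: acc.partial + partial,
    missing: acc.missing + missing,
  }),
  { items: 0, match: 0, partial: 0, missing: 0 },
);

// Categories whose match + partial + missing does not equal their item count.
const inconsistent = rows
  .filter(([, items, match, partial, missing]) => match + partial + missing !== items)
  .map(([category]) => category);

// Share of all items, rounded to a whole percent.
const pct = (n: number): number => Math.round((n / totals.items) * 100);

console.log(totals, pct(totals.match), pct(totals.partial), pct(totals.missing), inconsistent);
```

With these rows the totals come out to 128 items, 95 match, 1 partial, and 32 missing, which round to 74%, 1%, and 25%. `pct(90)` gives 70, the match rate before the Tier 3 work on the same 128-item baseline.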

@@ -268,24 +278,40 @@ All five Tier 1 items implemented: `!!think` prefix, `/verbose` command, typing

 All four Tier 2 items implemented: inbound webhooks (HTTP POST /webhooks/:name with HMAC auth), vector memory search (hybrid keyword+vector with OpenAI/Gemini/Ollama/LlamaCpp embeddings), Dockerfile (multi-stage build), heartbeat monitor (5 checks with notification). See `docs/plans/2026-02-07-tier2-implementation-plan.md`.

-### Tier 3 — Additional Channels (if desired)
+### ~~Tier 3 — Practical Improvements~~ (DONE — implemented 2026-02-09)

-10. Signal (signal-cli bridge)
-11. Matrix (matrix-js-sdk)
-12. Microsoft Teams (Bot Framework)
-13. Google Chat (Chat API)
+All five Tier 3 items implemented: Lane Queue (per-session FIFO in gateway), credential redaction (18+ secret fields), Web UI token dashboard (usage page with summary cards), xAI Grok provider (OpenAI-compatible), Voyage AI embeddings (configurable dimensions). +33 new tests.

-### Tier 4 — Deferred / Niche
+### Tier 4 — Additional Channels (if desired)
+
+6. Signal (signal-cli bridge)
+7. Matrix (matrix-js-sdk)
+8. Microsoft Teams (Bot Framework)
+9. Google Chat (Chat API)
+
+### Tier 5 — Deferred / Niche

 - Companion apps (macOS/iOS/Android) — massive scope
 - LINE, Feishu, Mattermost — niche audience
 - iMessage/BlueBubbles — Apple ecosystem only
-- Canvas/A2UI — experimental
+- Canvas/A2UI — experimental visual workspace
+- Voice Wake / Talk Mode — ElevenLabs TTS integration
 - Nix/Fly.io/Railway deployment — platform-specific
 - OAuth subscription auth — complex
 - DM pairing codes — niche security feature
+- Skill/plugin safety scanner — static analysis
+- Shell completion — CLI ergonomics
+- Announce delivery mode — isolated job delivery
 - Bonjour/mDNS discovery — LAN-only use case
-- GLM/MiniMax/Moonshot/Z.AI — regional providers
+- MiniMax/Moonshot — regional providers
+- Synthetic provider — testing/mock
+- Elevated mode — sandbox escape hatch
+- Onboard wizard — guided setup
+- Gateway lock — single-client mode
+- Tailscale Serve/Funnel — native integration
+- ClawHub/skill registry — community marketplace
+- QMD backend — experimental memory search
+- Presence tracking — online/offline status

 ---
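The Tier 3 summary mentions credential redaction of 18+ secret fields in config API responses. The actual field list and the body of `redactConfig()` are not shown in this diff, so the sketch below only illustrates the general approach: walk the config recursively and mask string values whose key looks like a secret. The key pattern, the `[REDACTED]` placeholder, and the name `redactConfigSketch` are all assumptions made for illustration.

```typescript
// Hypothetical key pattern; the real redactConfig() field list is not shown in this diff.
const SECRET_KEY_PATTERN = /(api[_-]?key|token|secret|password)$/i;
const REDACTED = "[REDACTED]";

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Returns a redacted deep copy; the input config is left untouched.
function redactConfigSketch(value: Json, key = ""): Json {
  if (Array.isArray(value)) {
    // Array elements inherit the key of the array that holds them.
    return value.map((item) => redactConfigSketch(item, key));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [childKey, childValue] of Object.entries(value)) {
      out[childKey] = redactConfigSketch(childValue, childKey);
    }
    return out;
  }
  // Mask only non-empty strings stored under a secret-looking key.
  if (typeof value === "string" && value !== "" && SECRET_KEY_PATTERN.test(key)) {
    return REDACTED;
  }
  return value;
}
```

Matching on key names keeps the sketch small, but it can miss secrets stored under unusual keys. An explicit allowlist of safe fields is a stricter alternative when the config schema is known.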

@@ -293,7 +319,7 @@ All four Tier 2 items implemented: inbound webhooks (HTTP POST /webhooks/:name w

 - **Full MCP protocol support** with stdio transport, tool bridging, and server lifecycle management
 - **Model tier switching** via chat commands (`/local`, `/cloud`, `/model`)
-- **8 model providers** (Anthropic, OpenAI, Gemini, Ollama, Llama.cpp, OpenRouter, Bedrock, GitHub)
+- **10 model providers** (Anthropic, OpenAI, Gemini, Ollama, Llama.cpp, OpenRouter, Bedrock, GitHub, Zhipu, xAI)
 - **SQLite session storage** (vs OpenClaw's JSONL files)
 - **Configurable retry policy** with exponential backoff
 - **Skill installer** with managed directory + upgrade support