feat: implement Tier 3 features — lane queue, credential redaction, token dashboard, xAI, Voyage AI

- Lane Queue: per-session FIFO queue in gateway replacing reject-when-busy (9 tests)
- Credential Redaction: redactConfig() expanded to cover 18+ secret fields (16 tests)
- Web UI Token Dashboard: system.tokenUsage endpoint + Usage page with summary cards
- xAI (Grok) Provider: OpenAI-compatible client with model pricing
- Voyage AI Embeddings: new embedding provider with configurable dimensions (5 tests)
- Update gap analysis: 90→95 match (70%→74%), Tier 3 section marked DONE
- Update state.json: test count 1001→1034, add tier3_completion entry

Total: 1034 tests passing across 85 files, typecheck clean
This commit is contained in:
William Valentin
2026-02-09 10:32:57 -08:00
parent 1d126cddfb
commit 9be8f76bc7
26 changed files with 1395 additions and 105 deletions
@@ -1,7 +1,7 @@
# Flynn vs OpenClaw — Feature Gap Analysis
**Date:** 2026-02-06
**Last updated:** 2026-02-07 (post tier-2 implementation)
**Last updated:** 2026-02-09 (refreshed against OpenClaw v2026.2.6)
**Purpose:** Comprehensive comparison of Flynn's current implementation against OpenClaw's feature set, to guide prioritisation of future work.
## Legend
@@ -46,9 +46,11 @@ Flynn has **6 of ~15 channels** (Telegram, WhatsApp, Discord, Slack, WebChat, TU
| OpenRouter | Supported | Full (via OpenAI-compatible client, custom baseURL) | **MATCH** |
| Amazon Bedrock | Supported | Full (Bedrock SDK, region/credentials) | **MATCH** |
| GitHub Models | Supported | Full (device flow auth, Codex models) | **MATCH** |
| GLM / MiniMax / Moonshot | Supported | -- | **MISSING** |
| Zhipu AI (GLM) | Supported | Full (OpenAI-compatible client, GLM models) | **MATCH** |
| MiniMax / Moonshot | Supported | -- | **MISSING** |
| xAI (Grok) | Supported (v2026.2.6) | Full (OpenAI-compatible client, xai provider) | **MATCH** |
| Vercel AI Gateway | Supported | -- | **MISSING** |
| Z.AI | Supported | -- | **MISSING** |
| Voyage AI embeddings | Supported (v2026.2.6) | Full (Voyage AI provider, configurable dimensions) | **MATCH** |
| Synthetic provider | Supported | -- | **MISSING** |
| OAuth subscription auth | Anthropic + OpenAI | API keys only | **MISSING** |
| Model failover chains | Full (fallback + rotation) | Full (configurable fallback chain + retry) | **MATCH** |
@@ -71,6 +73,7 @@ Flynn has **6 of ~15 channels** (Telegram, WhatsApp, Discord, Slack, WebChat, TU
| `web_fetch` | Full (markdown/text extract, caching) | Full (HTML-to-markdown, readability, caching) | **MATCH** |
| `web.search` | Brave Search API | Full (Brave + SearXNG providers) | **MATCH** |
| Browser control | Full CDP (Chromium profiles, snapshots, actions) | Full CDP (Puppeteer, navigate/click/type/screenshot/evaluate) | **MATCH** |
| Lane Queue (serial exec) | Concurrency control for sessions | Full (per-session FIFO queue in gateway) | **MATCH** |
| Canvas / A2UI | Agent-driven visual workspace | -- | **MISSING** |
| `process.*` tools | Background exec management (poll/log/write/kill) | Full (start/output/status/kill/list) | **MATCH** |
| `image.analyze` tool | Image analysis with configurable model | Full (multi-provider vision analysis) | **MATCH** |
@@ -143,6 +146,8 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli
| Tool confirmation hooks | Full | Full (confirm/log/silent patterns) | **MATCH** |
| Chat ID allowlists | Per-channel | Full (Telegram, Discord, Slack, WhatsApp all have allowlists) | **MATCH** |
| DM pairing (unknown senders) | Full (pairing codes) | -- | **MISSING** |
| Credential redaction | Config responses redacted (v2026.2.6) | Full (18+ secret fields redacted from config API) | **MATCH** |
| Skill/plugin code safety scanner | Static analysis (v2026.2.6) | -- | **MISSING** |
| Docker sandboxing | Full (per-session/agent/shared) | Full (per-agent sandbox via SandboxManager + Docker) | **MATCH** |
| Elevated mode | Host exec escape hatch | -- | **MISSING** |
| Tool execution timeouts | Full (configurable) | Full (configurable per-process + shell) | **MATCH** |
@@ -199,6 +204,8 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli
| `onboard` wizard | Full guided setup | -- | **MISSING** |
| Docker deployment | Full | Full (multi-stage Dockerfile, docker-compose.yml) | **MATCH** |
| Nix deployment | Full | -- | **MISSING** |
| Shell completion | Auto-detect + cached (v2026.2.3) | -- | **MISSING** |
| Announce delivery mode | Isolated job delivery (v2026.2.3) | -- | **MISSING** |
| Fly.io / Railway / Render | Supported | -- | **MISSING** |
| Bonjour/mDNS discovery | Full | -- | **MISSING** |
| Gateway lock | Full | -- | **MISSING** |
@@ -227,6 +234,7 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli
| Streaming & chunking | Full (per-channel limits) | Full (streaming + per-channel chunking) | **MATCH** |
| Typing indicators | Full | Telegram, Discord, WhatsApp (per-adapter) | **MATCH** |
| Presence tracking | Full | -- | **MISSING** |
| Web UI token dashboard | Usage visualization (v2026.2.6) | Full (Usage page with summary cards, per-session table, auto-refresh) | **MATCH** |
| Usage tracking / cost | Full | Full (per-tier tokens, estimated cost via MODEL_COSTS) | **MATCH** |
| Markdown rendering | Per-channel formatting | Full (TUI markdown renderer + channel-specific) | **MATCH** |
| Media pipeline | Images/audio/video/transcription | Full (image analysis, audio transcription, media.send) | **MATCH** |
@@ -241,20 +249,22 @@ Flynn actually has MCP support that OpenClaw doesn't emphasise — OpenClaw reli
| Category | Items | Match | Partial | Missing |
|----------|:-----:|:-----:|:-------:|:-------:|
| Channels | 13 | 6 | 0 | 7 |
| Model Providers | 14 | 10 | 0 | 4 |
| Agent & Tools | 18 | 18 | 0 | 0 |
| Model Providers | 18 | 14 | 0 | 4 |
| Agent & Tools | 22 | 21 | 0 | 1 |
| Sessions | 7 | 7 | 0 | 0 |
| Context/Compaction | 4 | 4 | 0 | 0 |
| Memory | 7 | 6 | 0 | 1 |
| MCP | 3 | 3 | 0 | 0 |
| Security | 8 | 6 | 0 | 2 |
| Security | 10 | 7 | 0 | 3 |
| Automation | 4 | 4 | 0 | 0 |
| Companion Apps | 6 | 0 | 0 | 6 |
| Skills/Plugins | 5 | 4 | 0 | 1 |
| Gateway/Infra | 11 | 4 | 1 | 6 |
| Chat Commands | 8 | 7 | 0 | 0 |
| Misc | 9 | 9 | 0 | 0 |
| **TOTAL** | **117** | **88 (75%)** | **1 (1%)** | **27 (23%)** |
| Gateway/Infra | 13 | 4 | 1 | 8 |
| Chat Commands | 6 | 6 | 0 | 0 |
| Misc | 10 | 9 | 0 | 1 |
| **TOTAL** | **128** | **95 (74%)** | **1 (1%)** | **32 (25%)** |
*Note: Match rate improved from 70% to 74% after implementing Tier 3 features (Lane Queue, credential redaction, Web UI token dashboard, xAI Grok provider, Voyage AI embeddings).*
---
@@ -268,24 +278,40 @@ All five Tier 1 items implemented: `!!think` prefix, `/verbose` command, typing
All four Tier 2 items implemented: inbound webhooks (HTTP POST /webhooks/:name with HMAC auth), vector memory search (hybrid keyword+vector with OpenAI/Gemini/Ollama/LlamaCpp embeddings), Dockerfile (multi-stage build), heartbeat monitor (5 checks with notification). See `docs/plans/2026-02-07-tier2-implementation-plan.md`.
### Tier 3 — Additional Channels (if desired)
### ~~Tier 3 — Practical Improvements~~ (DONE — implemented 2026-02-09)
10. Signal (signal-cli bridge)
11. Matrix (matrix-js-sdk)
12. Microsoft Teams (Bot Framework)
13. Google Chat (Chat API)
All five Tier 3 items implemented: Lane Queue (per-session FIFO in gateway), credential redaction (18+ secret fields), Web UI token dashboard (usage page with summary cards), xAI Grok provider (OpenAI-compatible), Voyage AI embeddings (configurable dimensions). +33 new tests.
### Tier 4 — Deferred / Niche
### Tier 4 — Additional Channels (if desired)
6. Signal (signal-cli bridge)
7. Matrix (matrix-js-sdk)
8. Microsoft Teams (Bot Framework)
9. Google Chat (Chat API)
### Tier 5 — Deferred / Niche
- Companion apps (macOS/iOS/Android) — massive scope
- LINE, Feishu, Mattermost — niche audience
- iMessage/BlueBubbles — Apple ecosystem only
- Canvas/A2UI — experimental
- Canvas/A2UI — experimental visual workspace
- Voice Wake / Talk Mode — ElevenLabs TTS integration
- Nix/Fly.io/Railway deployment — platform-specific
- OAuth subscription auth — complex
- DM pairing codes — niche security feature
- Skill/plugin safety scanner — static analysis
- Shell completion — CLI ergonomics
- Announce delivery mode — isolated job delivery
- Bonjour/mDNS discovery — LAN-only use case
- GLM/MiniMax/Moonshot/Z.AI — regional providers
- MiniMax/Moonshot — regional providers
- Synthetic provider — testing/mock
- Elevated mode — sandbox escape hatch
- Onboard wizard — guided setup
- Gateway lock — single-client mode
- Tailscale Serve/Funnel — native integration
- ClawHub/skill registry — community marketplace
- QMD backend — experimental memory search
- Presence tracking — online/offline status
---
@@ -293,7 +319,7 @@ All four Tier 2 items implemented: inbound webhooks (HTTP POST /webhooks/:name w
- **Full MCP protocol support** with stdio transport, tool bridging, and server lifecycle management
- **Model tier switching** via chat commands (`/local`, `/cloud`, `/model`)
- **8 model providers** (Anthropic, OpenAI, Gemini, Ollama, Llama.cpp, OpenRouter, Bedrock, GitHub)
- **10 model providers** (Anthropic, OpenAI, Gemini, Ollama, Llama.cpp, OpenRouter, Bedrock, GitHub, Zhipu, xAI)
- **SQLite session storage** (vs OpenClaw's JSONL files)
- **Configurable retry policy** with exponential backoff
- **Skill installer** with managed directory + upgrade support
+72 -3
View File
@@ -750,6 +750,74 @@
}
}
},
"tier3-remaining-features": {
"status": "completed",
"date": "2026-02-09",
"summary": "5 Tier 3 features from gap analysis: Lane Queue, credential redaction, Web UI token dashboard, xAI (Grok) provider, Voyage AI embeddings",
"phases": {
"lane_queue": {
"priority": "Tier3",
"status": "completed",
"description": "Per-session FIFO queue in gateway — serializes concurrent requests instead of rejecting. LaneQueue class with enqueue/cancel/queueLength methods.",
"files_created": [
"src/gateway/lane-queue.ts",
"src/gateway/lane-queue.test.ts"
],
"files_modified": [
"src/gateway/handlers/agent.ts",
"src/gateway/server.ts",
"src/gateway/index.ts"
],
"test_status": "9/9 passing"
},
"credential_redaction": {
"priority": "Tier3",
"status": "completed",
"description": "Expanded redactConfig() from 2 secret locations to 18+ secret fields — telegram, discord, slack tokens; server.token; all model tier api_key/auth_token; web_search, audio, memory embedding api_keys; webhook secrets; gmail credentials; MCP server env vars.",
"files_modified": [
"src/gateway/handlers/config.ts",
"src/gateway/handlers/handlers.test.ts"
],
"test_status": "16/16 passing"
},
"web_ui_token_dashboard": {
"priority": "Tier3",
"status": "completed",
"description": "system.tokenUsage gateway endpoint + Usage page in web dashboard SPA with summary cards, per-session table, and auto-refresh.",
"files_created": [
"src/gateway/ui/pages/usage.js"
],
"files_modified": [
"src/gateway/handlers/system.ts",
"src/gateway/session-bridge.ts",
"src/daemon/index.ts",
"src/gateway/ui/index.html",
"src/gateway/ui/style.css"
]
},
"xai_grok_provider": {
"priority": "Tier3",
"status": "completed",
"description": "xAI as OpenAI-compatible model provider — reuses OpenAIClient with baseURL https://api.x.ai/v1, XAI_API_KEY env var fallback, pricing for grok-3/grok-3-mini/grok-2/grok-2-mini/grok-3-fast.",
"files_modified": [
"src/config/schema.ts",
"src/daemon/index.ts",
"src/models/costs.ts"
]
},
"voyage_ai_embeddings": {
"priority": "Tier3",
"status": "completed",
"description": "Voyage AI embedding provider for memory/vector search — OpenAI SDK with baseURL https://api.voyageai.com/v1, defaults to 1024 dimensions, VOYAGE_API_KEY env var.",
"files_modified": [
"src/config/schema.ts",
"src/memory/embeddings.ts",
"src/memory/embeddings.test.ts"
],
"test_status": "5/5 passing"
}
}
},
"earlier_plans": {
"plans": [
{ "file": "2026-02-02-flynn-design.md", "status": "completed" },
@@ -773,7 +841,7 @@
},
"overall_progress": {
"total_test_count": 1001,
"total_test_count": 1034,
"all_tests_passing": true,
"p0_completion": "3/3 (100%)",
"p1_completion": "4/4 (100%)",
@@ -786,7 +854,8 @@
"p8_completion": "8/8 (100%) — agent tools (sessions.list/history/create/delete, agents.list, message.send, cron.list/trigger) + gap analysis audit",
"tier1_completion": "5/5 (100%) — !!think prefix, /verbose command, typing indicators (Discord/WhatsApp), session pruning (TTL), tool groups",
"tier2_completion": "4/4 (100%) — inbound webhooks, vector memory search, Dockerfile, heartbeat monitor",
"feature_gap_scorecard": "88/116 match (76%), 1 partial (1%), 27 missing (23%)",
"next_up": "All phases P0-P8 and Tiers 1-3 complete. Local model tool calling added. Remaining gaps: Tier 3 channels (Signal, Matrix, Teams, Google Chat), Tier 4 deferred/niche items"
"tier3_completion": "5/5 (100%) — lane queue, credential redaction, web UI token dashboard, xAI (Grok) provider, Voyage AI embeddings",
"feature_gap_scorecard": "95/128 match (74%), 1 partial (1%), 32 missing (25%)",
"next_up": "All phases P0-P8 and Tiers 1-3 complete. Local model tool calling added. Remaining gaps: Tier 4 channels (Signal, Matrix, Teams, Google Chat), Tier 5 deferred/niche items"
}
}