Five additive features with no breaking changes:
- Tool groups: group:fs, group:runtime, group:web, group:memory syntactic sugar for allow/deny lists in tool policy config
- Typing indicators: Discord sendTyping() and WhatsApp sendStateTyping() on message receipt for better UX feedback
- Session pruning: TTL-based auto-cleanup via sessions.ttl config with hourly daemon timer and SQLite GROUP BY pruning
- /verbose command: TUI command parser toggle for raw streaming display
- !!think prefix: per-message extended thinking mode wired through Anthropic (budget_tokens), OpenAI/GitHub (reasoning_effort), and Gemini (thinkingConfig) providers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tier 1 Quick Wins — Design
Date: 2026-02-07
Status: Draft
Scope: 5 additive features, no breaking changes
1. Per-message thinking mode (!!think prefix)
Trigger
User prefixes a message with !!think. The prefix is stripped before the message reaches the model.
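A minimal sketch of the prefix handling described above. The helper name `parseThinkPrefix` and its return shape are hypothetical, and two behaviors are assumptions not stated in this design: leading whitespace before the prefix is tolerated, and the prefix must be followed by whitespace or end of input so that a word such as `!!thinking` is left alone.

```typescript
// Hypothetical helper: name and return shape are illustrative, not from the codebase.
const THINK_PREFIX = '!!think';

interface ParsedMessage {
  text: string;      // message with the prefix removed (unchanged if no prefix)
  thinking: boolean; // true when the prefix was present
}

function parseThinkPrefix(raw: string): ParsedMessage {
  const trimmed = raw.trimStart();
  if (!trimmed.startsWith(THINK_PREFIX)) {
    return { text: raw, thinking: false };
  }
  const rest = trimmed.slice(THINK_PREFIX.length);
  // Assumption: require whitespace or end of input after the prefix,
  // so "!!thinking about it" is treated as an ordinary message.
  if (rest.length > 0 && !/^\s/.test(rest)) {
    return { text: raw, thinking: false };
  }
  return { text: rest.trimStart(), thinking: true };
}
```

Keeping detection in one shared function would let the TUI and every channel adapter strip the prefix identically before the message reaches the agent loop.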
Data flow
- Frontend/channel adapter detects the !!think prefix, strips it, and sets thinking: true on the message metadata
- Agent loop passes the thinking flag through to ChatRequest
- Each provider client checks the flag:
  - Anthropic: sets thinking.budget_tokens (default 4096)
  - OpenAI/GitHub Models: sets reasoning_effort (default 'medium')
  - Gemini: sets thinkingConfig.thinkBudgetTokens (default 4096)
  - Bedrock: sets via Anthropic thinking params
  - Ollama/llama.cpp: no-op (silently ignored)
- Response thinking/reasoning content is included in the reply (displayed as a collapsible block in TUI/WebChat, omitted in channel adapters)
Config additions
All optional — controls per-provider defaults when !!think is active:
models:
  thinking:
    anthropic:
      budgetTokens: 4096
    openai:
      reasoningEffort: medium # low | medium | high
    gemini:
      budgetTokens: 4096
Types changes
// src/models/types.ts: ChatRequest
export interface ChatRequest {
  messages: Message[];
  system?: string;
  maxTokens?: number;
  tools?: ToolDefinition[];
  thinking?: boolean; // NEW
}

// src/models/types.ts: ChatResponse
export interface ChatResponse {
  content: string;
  toolCalls?: ToolCall[];
  stopReason?: string;
  usage?: TokenUsage;
  thinkingContent?: string; // NEW: raw thinking/reasoning output
}
Provider implementation
Each client checks request.thinking and maps to native API:
- anthropic.ts: Add thinking: { type: 'enabled', budget_tokens } to messages.create() params. Parse thinking content blocks from response.
- openai.ts: Add reasoning_effort to chat.completions.create(). Parse reasoning from response.
- github.ts: Same as OpenAI (uses OpenAI SDK).
- gemini.ts: Add thinkingConfig to generationConfig. Parse thinking parts from response.
- bedrock.ts: Add thinking params via Anthropic Converse API format.
- ollama.ts / llamacpp.ts: Ignore the flag.
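The per-provider mapping above can be sketched as a single pure function that returns the extra request fields for each provider. Everything here is illustrative: `thinkingParams`, `Provider`, and `ThinkingConfig` are hypothetical names, the field names are copied from this document rather than checked against each SDK, and the Bedrock case simply reuses the Anthropic shape because the document does not show how it is wrapped in a Converse request.

```typescript
// Hypothetical types and helper; field names follow this document, not a verified SDK reference.
type Provider = 'anthropic' | 'openai' | 'github' | 'gemini' | 'bedrock' | 'ollama' | 'llamacpp';

interface ThinkingConfig {
  anthropic?: { budgetTokens?: number };
  openai?: { reasoningEffort?: 'low' | 'medium' | 'high' };
  gemini?: { budgetTokens?: number };
}

function thinkingParams(
  provider: Provider,
  thinking: boolean | undefined,
  cfg: ThinkingConfig = {},
): Record<string, unknown> {
  if (!thinking) return {}; // flag absent or false: leave the request untouched
  switch (provider) {
    case 'anthropic':
    case 'bedrock': // assumption: same shape as Anthropic; the Converse wrapping is not shown here
      return { thinking: { type: 'enabled', budget_tokens: cfg.anthropic?.budgetTokens ?? 4096 } };
    case 'openai':
    case 'github': // GitHub Models goes through the OpenAI client
      return { reasoning_effort: cfg.openai?.reasoningEffort ?? 'medium' };
    case 'gemini':
      return { thinkingConfig: { thinkBudgetTokens: cfg.gemini?.budgetTokens ?? 4096 } };
    default:
      return {}; // ollama / llamacpp: silently ignored
  }
}
```

A provider client could spread the result into its native request object, which keeps the defaults from models.thinking in one place and makes the no-op providers explicit.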
Files affected
- src/models/types.ts: Add thinking to ChatRequest, thinkingContent to ChatResponse
- src/models/anthropic.ts: Wire budget_tokens, parse thinking blocks
- src/models/openai.ts: Wire reasoning_effort, parse reasoning
- src/models/github.ts: Pass through to OpenAI client
- src/models/gemini.ts: Wire thinkingConfig
- src/models/bedrock.ts: Wire thinking params
- src/config/schema.ts: Add models.thinking config section
- src/backends/native/agent.ts: Pass thinking flag from message metadata to ChatRequest
- src/frontends/tui/commands.ts: Detect and strip !!think prefix
- Channel adapters: Detect and strip !!think prefix
- TUI/WebChat: Display thinkingContent as collapsible block
2. Verbose streaming mode (/verbose)
Trigger
/verbose command toggles a boolean in the frontend's local state. Not persisted to session or config.
Effect when on
- Raw streaming chunks displayed as they arrive, including tool call JSON being generated
- Tool arguments and raw results shown in full (no summarization)
Scope
TUI and WebChat only. Channel adapters (Telegram, Discord, Slack, WhatsApp) do not support this.
Implementation
- Add verbose: boolean to TUI and WebChat frontend state (default false)
- Add /verbose to command parser: toggles the flag, prints current status
- Streaming renderer checks the flag:
  - On: emit raw chunks as-is, display full tool call JSON and results
  - Off: current behavior (summarized tool output, clean text display)
- No backend changes: purely a display concern
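The toggle and the renderer check can be sketched as two small functions over local frontend state. All names here (`FrontendState`, `StreamChunk`, `handleCommand`, `renderChunk`) are hypothetical, and the chunk shape is an assumption made only so the example is self-contained; the real streaming types may differ.

```typescript
// Hypothetical frontend state and renderer; names and chunk shape are illustrative only.
interface FrontendState {
  verbose: boolean; // local to the frontend, never persisted
}

type StreamChunk =
  | { kind: 'text'; text: string }
  | { kind: 'tool_call'; name: string; args: unknown };

// Returns a status line when the input was /verbose, or null for anything else.
function handleCommand(input: string, state: FrontendState): string | null {
  if (input.trim() !== '/verbose') return null;
  state.verbose = !state.verbose;
  return `Verbose mode: ${state.verbose ? 'on' : 'off'}`;
}

function renderChunk(chunk: StreamChunk, state: FrontendState): string {
  if (chunk.kind === 'text') return chunk.text;
  return state.verbose
    ? `[tool] ${chunk.name} ${JSON.stringify(chunk.args)}` // verbose: full arguments
    : `[tool] ${chunk.name}`; // default: summarized
}
```

Because the flag is only read at render time, toggling it mid-stream affects the next chunk without any change to what the backend sends.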
Files affected
- src/frontends/tui/commands.ts: Add verbose command type and parsing
- src/frontends/tui/minimal.ts: Handle /verbose, toggle state, modify streaming display
- src/gateway/ui/pages/chat.js: WebChat verbose toggle and raw display mode
- WebSocket message handler: Pass raw chunks when verbose is active
3. Typing indicators
When
Immediately on receiving a user message. Sustained until the response is fully sent.
Per-adapter implementation
| Adapter | API | Notes |
|---|---|---|
| Discord | channel.sendTyping() | Auto-expires after 10s. Re-fire on a 9s interval while processing. |
| Slack | Bolt typing indicator API | Fire on receipt, cancel on response. |
| WhatsApp | sock.sendPresenceUpdate('composing', jid) | Fire on receipt, send 'paused' on response. |
| Telegram | grammY sendChatAction('typing') | Already implemented. No changes needed. |
Implementation pattern
Each adapter's message handler calls sendTyping() before dispatching to the agent loop. A cleanup/cancel mechanism (interval clear or presence update) stops the indicator once the response is sent.
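// Sketch for the Discord adapter (method on the adapter class)
async handleMessage(msg) {
  // sendTyping() returns a promise; swallow failures so a typing error never blocks the reply
  const sendTyping = () => { msg.channel.sendTyping().catch(() => {}); };
  sendTyping(); // immediate first call
  const typingInterval = setInterval(sendTyping, 9000); // re-fire before the 10s expiry
  try {
    await this.dispatch(msg);
  } finally {
    clearInterval(typingInterval); // stop re-firing once the response is sent or dispatch fails
  }
}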
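The same pattern can be generalized so that interval-based indicators (Discord) and start/stop indicators (WhatsApp presence) share one wrapper. This is only a sketch: `withTyping` and `TypingControl` are hypothetical names that do not exist in the codebase, and swallowing indicator errors is an assumption about the desired behavior.

```typescript
// Hypothetical helper; `TypingControl` is not an existing type in the codebase.
interface TypingControl {
  start: () => Promise<unknown>; // e.g. a typing call or a 'composing' presence update
  stop?: () => Promise<unknown>; // e.g. a 'paused' presence update; omit if the indicator auto-expires
  refreshMs?: number;            // re-fire interval for auto-expiring indicators
}

async function withTyping<T>(ctl: TypingControl, work: () => Promise<T>): Promise<T> {
  // Assumption: indicator failures are cosmetic and must never break the reply.
  const fire = () => { ctl.start().catch(() => {}); };
  fire(); // immediate first call
  const timer = ctl.refreshMs ? setInterval(fire, ctl.refreshMs) : undefined;
  try {
    return await work();
  } finally {
    if (timer !== undefined) clearInterval(timer);
    if (ctl.stop) await ctl.stop().catch(() => {});
  }
}
```

Under these assumptions, the Discord adapter would pass `start: () => msg.channel.sendTyping()` with `refreshMs: 9000`, while the WhatsApp adapter would pass a `start` that sends 'composing' and a `stop` that sends 'paused', with no refresh interval.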
Files affected
- src/channels/discord/adapter.ts: Add typing interval in message handler
- src/channels/slack/adapter.ts: Add typing indicator in message handler
- src/channels/whatsapp/adapter.ts: Add presence composing/paused in message handler
4. Session pruning (TTL-based)
Config addition
sessions:
  ttl: 30d # duration string. Default: 30d. Set to 0 or false to disable.
Supported formats: "30d", "7d", "12h", "0" (disabled).
Mechanism
- Daemon startup schedules a periodic timer (every 1 hour)
- Timer calls SessionStore.pruneStale(cutoffTimestamp)
- SQLite query finds all session_ids where MAX(created_at) < cutoff
- Deletes all messages for stale sessions
- Evicts pruned sessions from SessionManager's in-memory cache
- Logs: "Pruned 3 stale sessions (TTL: 30d)"
Duration parsing
Simple regex parser for duration strings — no external library:
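function parseDuration(s: string): number | null {
  // Accepts "<n>h" or "<n>d". Anything else, including the bare "0" that
  // disables pruning, returns null, so callers must check for "0" first.
  const match = s.match(/^(\d+)(h|d)$/);
  if (!match) return null;
  const [, n, unit] = match;
  const ms = unit === 'h' ? Number(n) * 3600000 : Number(n) * 86400000;
  return ms;
}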
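Since the regex parser returns null for both "0" and malformed input, the caller needs to tell "disabled" apart from "invalid". One possible way to do that is sketched below. The names `resolveTtl` and `cutoffFor` are hypothetical, throwing on malformed input is an assumption, and the cutoff arithmetic assumes created_at is stored as epoch milliseconds, which this document does not state.

```typescript
// Hypothetical wrapper; `resolveTtl` and `cutoffFor` are illustrative names.
type TtlSetting = string | 0 | false | undefined;

const UNIT_MS = { h: 3600000, d: 86400000 } as const;

// Returns the TTL in milliseconds, or null when pruning is disabled.
function resolveTtl(setting: TtlSetting, fallback: string = '30d'): number | null {
  if (setting === false || setting === 0 || setting === '0') return null; // explicitly disabled
  const raw = setting ?? fallback; // unset: use the documented default
  const match = raw.match(/^(\d+)(h|d)$/);
  // Assumption: a malformed value is a config error rather than "disabled".
  if (!match) throw new Error(`Invalid sessions.ttl: ${raw}`);
  const ms = Number(match[1]) * UNIT_MS[match[2] as 'h' | 'd'];
  return ms > 0 ? ms : null; // "0d" and "0h" are treated as disabled too
}

// Assumption: created_at is stored as epoch milliseconds.
function cutoffFor(ttlMs: number, now: number = Date.now()): number {
  return now - ttlMs; // sessions with no message newer than this are stale
}
```

Separating "disabled" (null) from "invalid" (throw) means a typo such as `30days` fails loudly at startup instead of silently turning pruning off.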
New SessionStore method
async pruneStale(beforeTimestamp: number): Promise<string[]> {
  // Sessions whose most recent message is older than the cutoff
  const stale = db.prepare(`
    SELECT session_id FROM messages
    GROUP BY session_id
    HAVING MAX(created_at) < ?
  `).all(beforeTimestamp);
  // Prepare the delete once and reuse it for every stale session
  const deleteMessages = db.prepare('DELETE FROM messages WHERE session_id = ?');
  for (const { session_id } of stale) {
    deleteMessages.run(session_id);
  }
  // Returns list of pruned session IDs
  return stale.map(r => r.session_id);
}
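How the timer, the store, and the in-memory cache might fit together is sketched below. The interfaces mirror only the two methods named in this section (pruneStale and evictSessions); `pruneOnce`, `schedulePruning`, and their parameters are hypothetical, and skipping the log line when nothing was pruned is an assumption.

```typescript
// Hypothetical interfaces mirroring the methods named in this section.
interface PrunableStore {
  pruneStale(beforeTimestamp: number): Promise<string[]>;
}
interface EvictableCache {
  evictSessions(ids: string[]): void;
}

// One pruning pass: delete stale sessions, evict them from the cache, log the count.
async function pruneOnce(
  store: PrunableStore,
  cache: EvictableCache,
  ttlMs: number,
  ttlLabel: string,
  log: (line: string) => void = console.log,
  now: number = Date.now(),
): Promise<number> {
  const pruned = await store.pruneStale(now - ttlMs);
  if (pruned.length > 0) {
    cache.evictSessions(pruned);
    log(`Pruned ${pruned.length} stale sessions (TTL: ${ttlLabel})`);
  }
  return pruned.length;
}

// Runs `run` every hour by default and returns a function that cancels the timer.
function schedulePruning(run: () => Promise<unknown>, everyMs: number = 3600000): () => void {
  const timer = setInterval(() => {
    run().catch((err) => console.error('Session pruning failed:', err));
  }, everyMs);
  return () => clearInterval(timer);
}
```

On daemon startup this could be wired as `schedulePruning(() => pruneOnce(store, manager, ttlMs, '30d'))`, with the returned cancel function called during shutdown.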
Files affected
- src/config/schema.ts: Add sessions.ttl field
- src/session/store.ts: Add pruneStale() method
- src/session/manager.ts: Add evictSessions(ids) to clear in-memory cache
- src/daemon/index.ts: Schedule pruning timer on startup
5. Tool groups
Group definitions
Static map in policy.ts:
export const TOOL_GROUPS: Record<string, string[]> = {
  'group:fs': ['file.read', 'file.write', 'file.edit', 'file.list'],
  'group:runtime': ['shell.exec', 'process.start', 'process.output', 'process.status', 'process.kill', 'process.list'],
  'group:web': ['web.fetch', 'web.search', 'browser.navigate', 'browser.click', 'browser.type', 'browser.screenshot', 'browser.evaluate'],
  'group:memory': ['memory.read', 'memory.write', 'memory.search'],
};
Resolution
ToolPolicy expands group:* entries in allow/deny lists before applying filters. Expansion happens early in the resolution pipeline, before any set operations.
function expandGroups(names: string[]): string[] {
  return names.flatMap(n => TOOL_GROUPS[n] ?? [n]);
}
Works in all scopes: global allow/deny, per-agent overrides, per-provider overrides.
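To illustrate why expansion has to happen before any set operations, here is a small self-contained sketch that expands both lists and then filters. The filter itself is hypothetical: this document does not specify how allow and deny interact, so "a tool passes if it is allowed and not denied" is an assumption, and `GROUPS`, `expand`, and `filterTools` are illustrative names with an abbreviated group map.

```typescript
// Abbreviated copy of the group map so the sketch runs on its own.
const GROUPS: Record<string, string[]> = {
  'group:fs': ['file.read', 'file.write', 'file.edit', 'file.list'],
  'group:memory': ['memory.read', 'memory.write', 'memory.search'],
};

// Same idea as expandGroups above: known groups expand, everything else passes through.
function expand(names: string[]): string[] {
  return names.flatMap((n) => GROUPS[n] ?? [n]);
}

// Hypothetical filter (assumption): a tool passes if it is allowed and not denied.
function filterTools(all: string[], allow: string[], deny: string[]): string[] {
  const allowed = new Set(expand(allow));
  const denied = new Set(expand(deny));
  return all.filter((t) => allowed.has(t) && !denied.has(t));
}
```

With this sketch, allowing `group:fs` while denying `file.write` leaves the other three file tools available, which only works because both lists are expanded to concrete tool names first. Note that an unknown name such as `group:typo` passes through unchanged and matches no tool, so a validation warning for unrecognized `group:` entries may be worth considering.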
Config usage example
tools:
  profile: minimal
  allow: ['group:web']
  agents:
    fast:
      allow: ['group:fs']
      deny: ['shell.exec']
  providers:
    ollama:
      deny: ['group:web']
Files affected
- src/tools/policy.ts: Add TOOL_GROUPS map, expandGroups() helper, integrate into resolution pipeline
- src/tools/policy.test.ts: Tests for group expansion in all scopes
Implementation order
Recommended order by independence and risk:
1. Tool groups: isolated to policy.ts, no cross-cutting concerns
2. Typing indicators: per-adapter, independent changes
3. Session pruning: self-contained, touches store/manager/daemon
4. /verbose: frontend-only, no backend changes
5. !!think: largest scope, touches all providers + agent loop + frontends

Items 1–3 in this list can be implemented in parallel. Item 4 (/verbose) is independent. Item 5 (!!think) depends on understanding the streaming path touched by item 4.