feat: implement tier 1 quick wins (tool groups, typing, pruning, verbose, think)

Five additive features with no breaking changes:

- Tool groups: group:fs, group:runtime, group:web, group:memory syntactic sugar for allow/deny lists in tool policy config
- Typing indicators: Discord sendTyping() and WhatsApp sendStateTyping() on message receipt for better UX feedback
- Session pruning: TTL-based auto-cleanup via sessions.ttl config with hourly daemon timer and SQLite GROUP BY pruning
- /verbose command: TUI command parser toggle for raw streaming display
- !!think prefix: per-message extended thinking mode wired through Anthropic (budget_tokens), OpenAI/GitHub (reasoning_effort), and Gemini (thinkingConfig) providers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -0,0 +1,284 @@
# Tier 1 Quick Wins — Design

**Date:** 2026-02-07
**Status:** Draft
**Scope:** 5 additive features, no breaking changes

---

## 1. Per-message thinking mode (`!!think` prefix)

### Trigger

User prefixes a message with `!!think`. The prefix is stripped before the message reaches the model.
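
The detection step could look roughly like the sketch below. The helper name `parseThinkPrefix` and its return shape are hypothetical, and two behaviors are assumptions rather than part of this design: leading whitespace is tolerated, and the prefix must be followed by whitespace or end of input so that a word like `!!thinking` is not treated as the trigger.

```typescript
// Hypothetical helper: name and return shape are illustrative only.
interface ParsedInput {
  text: string;      // message text with the prefix removed
  thinking: boolean; // true when the `!!think` prefix was present
}

const THINK_PREFIX = '!!think';

function parseThinkPrefix(raw: string): ParsedInput {
  const trimmed = raw.trimStart();
  if (!trimmed.startsWith(THINK_PREFIX)) {
    return { text: raw, thinking: false };
  }
  const rest = trimmed.slice(THINK_PREFIX.length);
  // Assumption: require whitespace (or nothing) after the prefix,
  // so "!!thinking hard" is passed through untouched.
  if (rest.length > 0 && !/^\s/.test(rest)) {
    return { text: raw, thinking: false };
  }
  return { text: rest.trimStart(), thinking: true };
}
```

A frontend or channel adapter would call this once per incoming message and copy `thinking` onto the message metadata.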

### Data flow

1. The frontend or channel adapter detects the `!!think` prefix, strips it, and sets `thinking: true` on the message metadata
2. The agent loop passes the `thinking` flag through to `ChatRequest`
3. Each provider client checks the flag:
   - **Anthropic:** sets `thinking.budget_tokens` (default 4096)
   - **OpenAI/GitHub Models:** sets `reasoning_effort` (default `'medium'`)
   - **Gemini:** sets `thinkingConfig.thinkingBudget` (default 4096)
   - **Bedrock:** passes the Anthropic thinking params through the Bedrock request
   - **Ollama/llama.cpp:** no-op (silently ignored)
4. Response thinking/reasoning content is included in the reply (displayed as a collapsible block in TUI/WebChat, omitted in channel adapters)
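
Step 3 of the flow above can be sketched as a single mapping function. This is an illustration under assumptions, not the project's code: `buildThinkingParams`, the `Provider` union, and the defaults object are hypothetical names. The Gemini field is written as `thinkingBudget` and the Bedrock wrapper as `additionalModelRequestFields`; both reflect the public API docs as understood here and should be verified against the SDK versions in use.

```typescript
// Sketch only: names and return shapes are assumptions for illustration.
type Provider =
  | 'anthropic' | 'openai' | 'github' | 'gemini'
  | 'bedrock' | 'ollama' | 'llamacpp';

interface ThinkingDefaults {
  anthropicBudgetTokens: number;
  openaiReasoningEffort: 'low' | 'medium' | 'high';
  geminiBudgetTokens: number;
}

const DEFAULTS: ThinkingDefaults = {
  anthropicBudgetTokens: 4096,
  openaiReasoningEffort: 'medium',
  geminiBudgetTokens: 4096,
};

// Returns the extra request fields to merge into the provider call.
function buildThinkingParams(
  provider: Provider,
  thinking: boolean,
  d: ThinkingDefaults = DEFAULTS,
): Record<string, unknown> {
  if (!thinking) return {};
  const anthropicThinking = { type: 'enabled', budget_tokens: d.anthropicBudgetTokens };
  switch (provider) {
    case 'anthropic':
      return { thinking: anthropicThinking };
    case 'bedrock':
      // Assumed Converse API pass-through field for model-specific params.
      return { additionalModelRequestFields: { thinking: anthropicThinking } };
    case 'openai':
    case 'github':
      return { reasoning_effort: d.openaiReasoningEffort };
    case 'gemini':
      return { thinkingConfig: { thinkingBudget: d.geminiBudgetTokens } };
    default:
      return {}; // ollama / llamacpp: flag is silently ignored
  }
}
```

Returning an empty object for unsupported providers keeps the call site uniform: the result can always be spread into the request.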

### Config additions

All fields are optional. They control per-provider defaults when `!!think` is active:

```yaml
models:
  thinking:
    anthropic:
      budgetTokens: 4096
    openai:
      reasoningEffort: medium   # low | medium | high
    gemini:
      budgetTokens: 4096
```
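
Because every field is optional, the loader has to merge whatever the user supplied with the defaults listed above. A minimal sketch, assuming the parsed YAML arrives as a plain object; `resolveThinkingConfig` and the type names are hypothetical and not taken from `src/config/schema.ts`:

```typescript
// Hypothetical types mirroring the YAML section above.
interface ThinkingConfig {
  anthropic: { budgetTokens: number };
  openai: { reasoningEffort: 'low' | 'medium' | 'high' };
  gemini: { budgetTokens: number };
}

type PartialThinkingConfig = {
  [K in keyof ThinkingConfig]?: Partial<ThinkingConfig[K]>;
};

const THINKING_DEFAULTS: ThinkingConfig = {
  anthropic: { budgetTokens: 4096 },
  openai: { reasoningEffort: 'medium' },
  gemini: { budgetTokens: 4096 },
};

// Merge user-supplied values over the defaults, provider by provider.
function resolveThinkingConfig(user: PartialThinkingConfig = {}): ThinkingConfig {
  return {
    anthropic: { ...THINKING_DEFAULTS.anthropic, ...user.anthropic },
    openai: { ...THINKING_DEFAULTS.openai, ...user.openai },
    gemini: { ...THINKING_DEFAULTS.gemini, ...user.gemini },
  };
}
```

Validation (for example rejecting an unknown `reasoningEffort`) would normally live in the schema layer and is left out of this sketch.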

### Types changes

```typescript
// src/models/types.ts — ChatRequest
export interface ChatRequest {
  messages: Message[];
  system?: string;
  maxTokens?: number;
  tools?: ToolDefinition[];
  thinking?: boolean; // NEW
}

// src/models/types.ts — ChatResponse
export interface ChatResponse {
  content: string;
  toolCalls?: ToolCall[];
  stopReason?: string;
  usage?: TokenUsage;
  thinkingContent?: string; // NEW — raw thinking/reasoning output
}
```

### Provider implementation

Each client checks `request.thinking` and maps it to the provider's native API:

- **`anthropic.ts`**: Add `thinking: { type: 'enabled', budget_tokens }` to `messages.create()` params. The API requires `max_tokens` to be larger than `budget_tokens`, so raise `maxTokens` when needed. Parse `thinking` content blocks from the response.
- **`openai.ts`**: Add `reasoning_effort` to `chat.completions.create()`. Parse reasoning output from the response where the API returns it.
- **`github.ts`**: Same as OpenAI (uses OpenAI SDK).
- **`gemini.ts`**: Add `thinkingConfig` to `generationConfig`. Parse thinking parts from the response.
- **`bedrock.ts`**: Pass the Anthropic thinking params through the Bedrock Converse API (`additionalModelRequestFields`).
- **`ollama.ts` / `llamacpp.ts`**: Ignore the flag.
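
For the Anthropic case, parsing means separating `thinking` blocks from `text` blocks in the response content array. The sketch below uses a local stand-in type with only the fields it needs; `splitAnthropicContent` is a hypothetical helper name, and joining multiple thinking blocks with a newline is an arbitrary choice made here.

```typescript
// Local stand-in for a response content block; only the fields used below.
interface ContentBlock {
  type: string;
  text?: string;
  thinking?: string;
}

interface SplitContent {
  content: string;
  thinkingContent?: string;
}

function splitAnthropicContent(blocks: ContentBlock[]): SplitContent {
  const text: string[] = [];
  const thinking: string[] = [];
  for (const block of blocks) {
    if (block.type === 'text' && typeof block.text === 'string') {
      text.push(block.text);
    } else if (block.type === 'thinking' && typeof block.thinking === 'string') {
      thinking.push(block.thinking);
    }
    // Other block types (tool calls, redacted thinking) are ignored in this sketch.
  }
  const result: SplitContent = { content: text.join('') };
  if (thinking.length > 0) {
    result.thinkingContent = thinking.join('\n');
  }
  return result;
}
```

Leaving `thinkingContent` undefined when no thinking blocks are present lets frontends skip the collapsible block with a simple truthiness check.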

### Files affected

- `src/models/types.ts` — Add `thinking` to ChatRequest, `thinkingContent` to ChatResponse
- `src/models/anthropic.ts` — Wire `budget_tokens`, parse thinking blocks
- `src/models/openai.ts` — Wire `reasoning_effort`, parse reasoning
- `src/models/github.ts` — Pass through to OpenAI client
- `src/models/gemini.ts` — Wire `thinkingConfig`
- `src/models/bedrock.ts` — Wire thinking params
- `src/config/schema.ts` — Add `models.thinking` config section
- `src/backends/native/agent.ts` — Pass `thinking` flag from message metadata to ChatRequest
- `src/frontends/tui/commands.ts` — Detect and strip `!!think` prefix
- Channel adapters — Detect and strip `!!think` prefix
- TUI/WebChat — Display `thinkingContent` as collapsible block

---

## 2. Verbose streaming mode (`/verbose`)

### Trigger

The `/verbose` command toggles a boolean in the frontend's local state. The flag is not persisted to the session or config.

### Effect when on

- Raw streaming chunks are displayed as they arrive, including tool call JSON as it is generated
- Tool arguments and raw results are shown in full (no summarization)

### Scope

TUI and WebChat only. Channel adapters (Telegram, Discord, Slack, WhatsApp) do not support this.

### Implementation

- Add `verbose: boolean` to TUI and WebChat frontend state (default `false`)
- Add `/verbose` to the command parser: it toggles the flag and prints the current status
- Streaming renderer checks the flag:
  - **On:** emit raw chunks as-is, display full tool call JSON and results
  - **Off:** current behavior (summarized tool output, clean text display)
- No agent or model changes: this is a display concern, and the only non-frontend touch point is the WebSocket handler forwarding raw chunks
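
The toggle and the renderer check can be sketched together. All names here (`FrontendState`, `StreamChunk`, `handleCommand`, `renderChunk`) are hypothetical, and the chunk shape is a simplified assumption; the real streaming types in the TUI and WebChat may differ.

```typescript
// Hypothetical, simplified frontend types for illustration.
interface FrontendState {
  verbose: boolean;
}

interface TextChunk { kind: 'text'; text: string }
interface ToolCallChunk { kind: 'tool_call'; name: string; args: unknown }
type StreamChunk = TextChunk | ToolCallChunk;

// Returns a status line when the input was the /verbose command, else null.
function handleCommand(input: string, state: FrontendState): string | null {
  if (input.trim() !== '/verbose') return null;
  state.verbose = !state.verbose;
  return `Verbose mode: ${state.verbose ? 'on' : 'off'}`;
}

// Picks the raw or the summarized rendering based on the flag.
function renderChunk(chunk: StreamChunk, state: FrontendState): string {
  if (chunk.kind === 'text') return chunk.text;
  return state.verbose
    ? JSON.stringify(chunk, null, 2) // full tool call JSON
    : `[tool: ${chunk.name}]`;       // summarized form
}
```

Keeping the flag in local state, as the design specifies, means the renderer is the only place that needs to branch.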

### Files affected

- `src/frontends/tui/commands.ts` — Add `verbose` command type and parsing
- `src/frontends/tui/minimal.ts` — Handle `/verbose`, toggle state, modify streaming display
- `src/gateway/ui/pages/chat.js` — WebChat verbose toggle and raw display mode
- WebSocket message handler — Pass raw chunks when verbose is active

---

## 3. Typing indicators

### When

Immediately on receiving a user message. Sustained until the response is fully sent.

### Per-adapter implementation

| Adapter | API | Notes |
|---------|-----|-------|
| **Discord** | `channel.sendTyping()` | Auto-expires after 10s. Re-fire on a 9s interval while processing. |
| **Slack** | Bolt typing indicator API | Fire on receipt, cancel on response. |
| **WhatsApp** | `sock.sendPresenceUpdate('composing', jid)` | Fire on receipt, send `'paused'` on response. |
| **Telegram** | grammY `sendChatAction('typing')` | Already implemented. No changes needed. |

### Implementation pattern

Each adapter's message handler starts its typing indicator (`sendTyping()` on Discord) before dispatching to the agent loop. A cleanup step (clearing the interval or sending a presence update) stops the indicator once the response is sent.

```typescript
// Pseudocode for Discord adapter
async handleMessage(msg) {
  // Typing failures are cosmetic; swallow them so they never reject the handler.
  const sendTyping = () => msg.channel.sendTyping().catch(() => {});
  sendTyping(); // immediate first call
  // Discord expires the indicator after ~10s, so refresh every 9s.
  const typingInterval = setInterval(sendTyping, 9000);
  try {
    await this.dispatch(msg);
  } finally {
    clearInterval(typingInterval);
  }
}
```
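
The same pattern can be factored into an adapter-agnostic helper so that Discord's interval refresh and WhatsApp's explicit `'paused'` update share one code path. This is a sketch under assumptions: `withTypingIndicator` and its option names are hypothetical and not part of the existing adapters.

```typescript
// Hypothetical shared helper; not part of the existing codebase.
interface TypingOptions {
  intervalMs?: number;          // re-fire period, for indicators that expire
  stop?: () => Promise<unknown>; // explicit cancel, for presence-style APIs
}

// Runs a typing call and swallows any failure (indicators are cosmetic).
async function safeCall(fn: () => Promise<unknown>): Promise<void> {
  try {
    await fn();
  } catch {
    // intentionally ignored
  }
}

async function withTypingIndicator<T>(
  fire: () => Promise<unknown>,
  task: () => Promise<T>,
  opts: TypingOptions = {},
): Promise<T> {
  let timer: ReturnType<typeof setInterval> | undefined;
  void safeCall(fire); // immediate first call, not awaited
  if (opts.intervalMs !== undefined) {
    timer = setInterval(() => { void safeCall(fire); }, opts.intervalMs);
  }
  try {
    return await task();
  } finally {
    if (timer !== undefined) clearInterval(timer);
    if (opts.stop) await safeCall(opts.stop);
  }
}

// Usage sketch (adapter objects are assumed to exist in the caller):
//   Discord:  withTypingIndicator(() => channel.sendTyping(), () => dispatch(msg), { intervalMs: 9000 })
//   WhatsApp: withTypingIndicator(() => sock.sendPresenceUpdate('composing', jid), () => dispatch(msg),
//               { stop: () => sock.sendPresenceUpdate('paused', jid) })
```

The `finally` block guarantees cleanup even when the agent loop throws, which is the main property the pseudocode above relies on.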

### Files affected

- `src/channels/discord/adapter.ts` — Add typing interval in message handler
- `src/channels/slack/adapter.ts` — Add typing indicator in message handler
- `src/channels/whatsapp/adapter.ts` — Add presence composing/paused in message handler

---

## 4. Session pruning (TTL-based)

### Config addition

```yaml
sessions:
  ttl: 30d  # duration string. Default: 30d. Set to 0 or false to disable.
```

Supported formats: `"30d"`, `"7d"`, `"12h"`, `"0"` (disabled).

### Mechanism

1. Daemon startup schedules a periodic timer (every 1 hour)
2. Timer calls `SessionStore.pruneStale(cutoffTimestamp)`, where `cutoffTimestamp` is the current time minus the TTL
3. SQLite query finds all `session_id`s whose newest message is older than the cutoff (`MAX(created_at) < cutoff`)
4. Deletes all messages for stale sessions
5. Evicts pruned sessions from `SessionManager`'s in-memory cache
6. Logs: `"Pruned 3 stale sessions (TTL: 30d)"`
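
The steps above can be sketched as one prune cycle plus a timer wrapper. The interfaces below are minimal stand-ins for `SessionStore` and `SessionManager`, and `runPruneCycle` and `startPruneTimer` are hypothetical names. Logging only when something was pruned is an assumption; the design does not say what happens on an empty cycle.

```typescript
// Minimal stand-ins for the store and manager; only the methods used here.
interface PruneStore {
  pruneStale(beforeTimestamp: number): string[] | Promise<string[]>;
}
interface PruneCache {
  evictSessions(ids: string[]): void;
}

const HOUR_MS = 60 * 60 * 1000;

async function runPruneCycle(
  store: PruneStore,
  cache: PruneCache,
  ttlMs: number,
  ttlLabel: string,               // original config string, used only for the log line
  log: (line: string) => void,
  now: number = Date.now(),
): Promise<string[]> {
  if (ttlMs <= 0) return [];      // a TTL of 0 disables pruning
  const cutoff = now - ttlMs;     // step 2: anything older than this is stale
  const pruned = await store.pruneStale(cutoff); // steps 3 and 4
  if (pruned.length > 0) {
    cache.evictSessions(pruned);  // step 5
    log(`Pruned ${pruned.length} stale sessions (TTL: ${ttlLabel})`); // step 6
  }
  return pruned;
}

// Step 1: schedule the hourly run. Returns a function that cancels the timer.
function startPruneTimer(run: () => Promise<unknown>): () => void {
  const timer = setInterval(() => {
    run().catch(() => undefined); // a failed cycle should not crash the daemon
  }, HOUR_MS);
  return () => clearInterval(timer);
}
```

Passing `now` as a parameter keeps the cycle deterministic in tests; the daemon would rely on the default.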

### Duration parsing

A simple regex parser handles duration strings without an external library:

```typescript
// Returns the duration in milliseconds, 0 for "0" (pruning disabled),
// or null when the string is not a supported format.
function parseDuration(s: string): number | null {
  if (s === '0') return 0;
  const match = s.match(/^(\d+)(h|d)$/);
  if (!match) return null;
  const [, n, unit] = match;
  const ms = unit === 'h' ? Number(n) * 3600000 : Number(n) * 86400000;
  return ms;
}
```
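
The config comment allows `0` or `false` as well as duration strings, and a YAML parser will hand those over as a number and a boolean rather than as strings. A possible normalization step is sketched below. `resolveTtlMs` is a hypothetical name, the block carries its own small parser so it stands alone, and throwing on an invalid value (rather than falling back to the default) is an assumption.

```typescript
const DEFAULT_TTL = '30d';

// Self-contained duration parser for this sketch: "0", "<n>h" or "<n>d".
function parseDurationMs(s: string): number | null {
  if (s === '0') return 0;
  const match = /^(\d+)(h|d)$/.exec(s);
  if (!match) return null;
  const unitMs = match[2] === 'h' ? 3600000 : 86400000;
  return Number(match[1]) * unitMs;
}

// Normalizes the raw `sessions.ttl` value to milliseconds; 0 means disabled.
function resolveTtlMs(raw: string | number | boolean | undefined): number {
  const value = raw === undefined ? DEFAULT_TTL : raw;
  if (value === false || value === 0) return 0;
  if (typeof value !== 'string') {
    throw new Error(`Invalid sessions.ttl: ${String(value)}`);
  }
  const ms = parseDurationMs(value);
  if (ms === null) {
    throw new Error(`Invalid sessions.ttl: "${value}"`);
  }
  return ms;
}
```

Failing loudly at startup makes a typo such as `30days` visible immediately instead of silently disabling or mis-scheduling pruning.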

### New SessionStore method

```typescript
// Returns the IDs of the pruned sessions. `beforeTimestamp` must use the
// same unit as the `created_at` column.
async pruneStale(beforeTimestamp: number): Promise<string[]> {
  const stale = db.prepare(`
    SELECT session_id FROM messages
    GROUP BY session_id
    HAVING MAX(created_at) < ?
  `).all(beforeTimestamp) as { session_id: string }[];

  // Prepare the statement once and reuse it for every stale session.
  const deleteMessages = db.prepare('DELETE FROM messages WHERE session_id = ?');
  for (const { session_id } of stale) {
    deleteMessages.run(session_id);
  }
  return stale.map(r => r.session_id);
}
```

### Files affected

- `src/config/schema.ts` — Add `sessions.ttl` field
- `src/session/store.ts` — Add `pruneStale()` method
- `src/session/manager.ts` — Add `evictSessions(ids)` to clear in-memory cache
- `src/daemon/index.ts` — Schedule pruning timer on startup

---

## 5. Tool groups

### Group definitions

Static map in `policy.ts`:

```typescript
export const TOOL_GROUPS: Record<string, string[]> = {
  'group:fs': ['file.read', 'file.write', 'file.edit', 'file.list'],
  'group:runtime': ['shell.exec', 'process.start', 'process.output', 'process.status', 'process.kill', 'process.list'],
  'group:web': ['web.fetch', 'web.search', 'browser.navigate', 'browser.click', 'browser.type', 'browser.screenshot', 'browser.evaluate'],
  'group:memory': ['memory.read', 'memory.write', 'memory.search'],
};
```

### Resolution

`ToolPolicy` expands `group:*` entries in allow/deny lists before applying filters. Expansion happens early in the resolution pipeline, before any set operations.

```typescript
function expandGroups(names: string[]): string[] {
  return names.flatMap(n => TOOL_GROUPS[n] ?? [n]);
}
```

Works in all scopes: global allow/deny, per-agent overrides, per-provider overrides.
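
To show why expansion has to run before the set operations, here is a small end-to-end sketch. It carries its own two-group map so it stands alone, and `applyPolicy` is a hypothetical function: the real `ToolPolicy` pipeline (profiles, scope merging) is not shown in this document, and "deny wins over allow" is an assumption made for the example.

```typescript
// Two of the groups from the design, in a Map to avoid prototype-key lookups.
const GROUPS = new Map<string, string[]>([
  ['group:fs', ['file.read', 'file.write', 'file.edit', 'file.list']],
  ['group:memory', ['memory.read', 'memory.write', 'memory.search']],
]);

// Expand group names to their members; plain tool names pass through.
function expand(names: string[]): string[] {
  return names.flatMap((n) => GROUPS.get(n) ?? [n]);
}

// Assumed semantics: a tool is usable if it is allowed and not denied.
function applyPolicy(available: string[], allow: string[], deny: string[]): string[] {
  const allowed = new Set(expand(allow));
  const denied = new Set(expand(deny));
  return available.filter((tool) => allowed.has(tool) && !denied.has(tool));
}
```

With this in place, a list such as `allow: ['group:fs'], deny: ['file.write']` keeps the read, edit, and list tools and drops only `file.write`, which would not work if the deny filter ran against the unexpanded `group:fs` entry.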

### Config usage example

```yaml
tools:
  profile: minimal
  allow: ['group:web']
  agents:
    fast:
      allow: ['group:fs']
      deny: ['shell.exec']
  providers:
    ollama:
      deny: ['group:web']
```
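
One consequence of the pass-through in `expandGroups` is that a misspelled group such as `group:webb` is treated as a literal tool name and silently matches nothing. A config loader could flag this early. The check below is a sketch, not part of the design: `findUnknownGroups` is a hypothetical helper, and the known-group set is copied from the four groups defined above.

```typescript
// Group names copied from the design's TOOL_GROUPS map.
const KNOWN_GROUPS = new Set<string>([
  'group:fs',
  'group:runtime',
  'group:web',
  'group:memory',
]);

// Collects every `group:` entry that does not name a known group.
function findUnknownGroups(lists: string[][]): string[] {
  const unknown = new Set<string>();
  for (const list of lists) {
    for (const name of list) {
      if (name.startsWith('group:') && !KNOWN_GROUPS.has(name)) {
        unknown.add(name);
      }
    }
  }
  return Array.from(unknown);
}
```

For the config above, passing every allow and deny list from the global, per-agent, and per-provider scopes would return an empty array; a non-empty result could be surfaced as a startup warning.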

### Files affected

- `src/tools/policy.ts` — Add `TOOL_GROUPS` map, `expandGroups()` helper, integrate into resolution pipeline
- `src/tools/policy.test.ts` — Tests for group expansion in all scopes

---

## Implementation order

Recommended order by independence and risk:

1. **Tool groups** — Isolated to `policy.ts`, no cross-cutting concerns
2. **Typing indicators** — Per-adapter, independent changes
3. **Session pruning** — Self-contained, touches store/manager/daemon
4. **`/verbose`** — Frontend-only, no backend changes
5. **`!!think`** — Largest scope, touches all providers + agent loop + frontends

Steps 1–3 in this list (tool groups, typing indicators, session pruning) can be implemented in parallel. Step 4 (`/verbose`) is independent. Step 5 (`!!think`) depends on understanding the streaming path touched by step 4.