Commit Graph

38 Commits

Author SHA1 Message Date
William Valentin 7e7685a194 fix(agent): recover when action intent has no tool call 2026-02-22 23:01:52 -08:00
William Valentin b477705806 Isolate turn audio hydration with async local context 2026-02-22 21:41:49 -08:00
William Valentin a761813375 Bind audio.transcribe hydration to current message turn 2026-02-22 21:27:09 -08:00
William Valentin db4e52dd7e Harden audio transcription arg hydration and add rewrite audit event 2026-02-22 18:56:22 -08:00
William Valentin dafe9b4d3d fix(core): harden env loading, OpenAI compatibility, and runtime recovery 2026-02-22 15:56:21 -08:00
William Valentin b09bfc8373 Unify TUI slash commands and harden tool inventory responses 2026-02-21 12:39:27 -08:00
William Valentin 2c3a00f6dd Improve in-flight cancel latency via run abort signal propagation 2026-02-19 12:24:39 -08:00
William Valentin baa53f91d9 refactor(security): unify elevated mode handling across surfaces 2026-02-19 11:41:53 -08:00
William Valentin b6c8d8ddf4 fix(native-agent): recover textual tool_use JSON calls 2026-02-19 08:55:41 -08:00
William Valentin 55cde541ea fix: abort model retries immediately on user cancellation 2026-02-18 11:21:57 -08:00
William Valentin 9345a864f4 fix(agent): add model request timeouts and empty-response fallback 2026-02-17 23:05:21 -08:00
William Valentin 5451f8a1de fix(tooling): surface non-executable tool-use warnings 2026-02-17 16:34:54 -08:00
William Valentin 9a2f1e2bb2 chore: checkpoint browser tooling and routing updates 2026-02-17 15:18:37 -08:00
William Valentin 49b752e8b0 chore(lint): reduce warning debt across core adapters and model clients 2026-02-15 23:03:42 -08:00
William Valentin 948d589ac3 fix(audit): resolve lint global, compaction metrics, and nudge id 2026-02-15 21:54:12 -08:00
William Valentin 7f563b4bb1 fix(agent): raise max iterations default from 10 to 200
The low default caused the agent to stop mid-task prematurely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 21:48:16 -08:00
William Valentin ab89378fce feat(security): enforce elevated mode and sandbox execution 2026-02-15 17:02:05 -08:00
William Valentin 67058c8719 feat(security): harden tool provenance and skill isolation 2026-02-15 10:16:55 -08:00
William Valentin 46099664f0 feat(gateway): wire safe-point runtime cancellation for agent.cancel 2026-02-13 08:51:14 -08:00
William Valentin 01c3175fdb fix: normalize OpenAI/GitHub finish_reason to Flynn stopReason conventions
OpenAI-compatible providers return 'stop' and 'tool_calls' as finish_reason
values, but Flynn's agent loop expects Anthropic-style 'end_turn' and
'tool_use'. This caused the agent to exit the tool loop prematurely when
falling back to GitHub Copilot (due to Anthropic API quota exhaustion).

- openai.ts: Map 'stop' → 'end_turn', 'length' → 'max_tokens', tool_calls
  with actual tools → 'tool_use', tool_calls without tools → 'end_turn'
- github.ts: Handle edge case where finish_reason is 'tool_calls' but no
  tools were parsed
- agent.ts: Accept both 'tool_use' and 'tool_calls' as valid stop reasons
  (belt-and-suspenders), extract toolCalls to local variable for TS narrowing
- openai.test.ts: Update expectations to match new normalized values
2026-02-11 09:49:36 -08:00
William Valentin 1aab006a7f feat: improve agent loop resilience — same-tool nudging and error handling
- agent.ts: track consecutive calls to the same tool (ignoring args) and
  inject a nudge after 4 repeats telling the model to summarize and respond,
  preventing local models from endlessly retrying searches with slight
  query variations
- agent.ts: wrap the entire tool loop iteration in try-catch so model/network
  errors don't crash the daemon — returns a descriptive error message instead
- Tests for both: nudge triggers after 4 same-tool calls, error recovery
  persists to history
2026-02-11 09:33:30 -08:00
William Valentin bf9ca690f3 fix(agent): detect repeated tool call loops and make max_iterations configurable
Local LLMs often get stuck calling the same tool repeatedly because they
lack the sophistication to synthesize results. The agent loop had no
safeguard — it re-executed whatever the model requested up to 10 times.

Add fingerprint-based loop detection: if the same tool+args combination
repeats 3 consecutive times, break the loop and return the last results.
Also add agents.max_iterations to the config schema so the iteration
limit is user-configurable (default: 10).
2026-02-10 19:35:09 -08:00
William Valentin 796e143d61 fix(agent): inject tool inventory note when tools change mid-session
Stale session history can cause the model to follow old "I can't do
that" patterns even when new tools are available. NativeAgent now tracks
a tool fingerprint and appends a system prompt note listing current
tools when the inventory changes, resetting on session reset.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 11:41:31 -08:00
William Valentin 9be8f76bc7 feat: implement Tier 3 features — lane queue, credential redaction, token dashboard, xAI, Voyage AI
- Lane Queue: per-session FIFO queue in gateway replacing reject-when-busy (9 tests)
- Credential Redaction: redactConfig() expanded to cover 18+ secret fields (16 tests)
- Web UI Token Dashboard: system.tokenUsage endpoint + Usage page with summary cards
- xAI (Grok) Provider: OpenAI-compatible client with model pricing
- Voyage AI Embeddings: new embedding provider with configurable dimensions (5 tests)
- Update gap analysis: 90→95 match (70%→74%), Tier 3 section marked DONE
- Update state.json: test count 1001→1034, add tier3_completion entry

Total: 1034 tests passing across 85 files, typecheck clean
2026-02-09 10:32:57 -08:00
William Valentin 1c2f54fae3 feat: implement tier 1 quick wins (tool groups, typing, pruning, verbose, think)
Five additive features with no breaking changes:

- Tool groups: group:fs, group:runtime, group:web, group:memory syntactic
  sugar for allow/deny lists in tool policy config
- Typing indicators: Discord sendTyping() and WhatsApp sendStateTyping()
  on message receipt for better UX feedback
- Session pruning: TTL-based auto-cleanup via sessions.ttl config with
  hourly daemon timer and SQLite GROUP BY pruning
- /verbose command: TUI command parser toggle for raw streaming display
- !!think prefix: per-message extended thinking mode wired through
  Anthropic (budget_tokens), OpenAI/GitHub (reasoning_effort), and
  Gemini (thinkingConfig) providers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 13:35:00 -08:00
William Valentin 6bb424cddc feat: add agent tools and sanitize tool names for Anthropic API
Add 8 new agent-callable tools (sessions.list/history/create/delete,
agents.list, message.send, cron.list/trigger) and sanitize tool names
at the API boundary (dots → underscores) to comply with Anthropic's
`^[a-zA-Z0-9_-]{1,128}` requirement. Reverse-maps sanitized names
back to internal names for hook callbacks and tool execution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 12:23:09 -08:00
William Valentin b9bfee9c5b feat: add outbound attachment support with media.send tool
Introduces OutboundAttachment type on OutboundMessage, an
OutboundAttachmentCollector (push/drain pattern), and a media.send
tool that queues files for outbound delivery. Each channel adapter
(Telegram, Discord, Slack, WhatsApp) sends attachments after the
text reply. Includes 15 tests for collector and tool.
2026-02-07 09:09:00 -08:00
William Valentin a515912537 feat: add multimodal media pipeline for image support across all providers and channels
Widen Message.content from string to string | MessageContentPart[] to support
multimodal content. Add Attachment type to channel layer, media conversion
utilities, and image extraction to all channel adapters (Telegram, Discord,
Slack, WhatsApp). Update all model clients (Anthropic, OpenAI, Gemini, Bedrock)
to convert structured content to provider-specific formats. Fix downstream
consumers (tokens, compaction, TUI, local models) to handle the widened type
via getMessageText() helper.
2026-02-06 17:17:21 -08:00
William Valentin ee0af0cc06 feat: add tool allow/deny profiles with per-agent and per-provider filtering
Implements configurable tool filtering with four built-in profiles
(minimal, messaging, coding, full), global and per-agent/per-provider
allow/deny lists with glob pattern support, and defense-in-depth
enforcement at both tool listing and execution time.

New: src/tools/policy.ts (ToolPolicy engine), src/tools/policy.test.ts (37 tests)
Modified: config schema, tool registry, tool executor, NativeAgent,
AgentOrchestrator, daemon wiring, gateway tool handler, test mocks
2026-02-06 15:30:34 -08:00
William Valentin 4316dbd3be feat: add P2 features — retry policy, prompt templating, usage tracking, tech debt cleanup
- Extract shared splitMessage() into channels/utils.ts (dedup 4 adapters)
- Add Slack user name resolution with caching (users.info API)
- Add withRetry() with exponential backoff + jitter, isRetryable() filter
- Wire retry config into ModelRouter.chat() (non-streaming only)
- Add assembleSystemPrompt() multi-file template system (SOUL/AGENTS/IDENTITY/USER/TOOLS.md)
- Add usage tracking accumulators in NativeAgent + AgentOrchestrator
- Add estimateCost() with per-model pricing table
- Add /usage TUI command with full usage report formatting
- Add retrySchema and promptSchema to config schema

Tests: 569 passing, typecheck clean
2026-02-06 15:12:35 -08:00
William Valentin 7a35b22458 feat: wire up all Phase 2-6 features into daemon and config
Integrate all new features into the shared infrastructure:
- Config schema: add memory, discord, slack, process, web_search schemas
- Daemon wiring: memory store init, tool registration, channel adapters
- Orchestrator: memory injection into system prompt, extraction on compaction
- Agent: add setSystemPrompt() for dynamic prompt updates
- Channel/tool index: export new adapters and tool factories
- Add @slack/bolt, discord.js, turndown, linkedom, @mozilla/readability deps
- Update state.json with Phase 3b completion (494 tests passing)
2026-02-06 14:24:39 -08:00
William Valentin e4b7f96d33 fix: provider-aware model routing with fallback visibility
- Extract createClientFromConfig() to dispatch on provider field instead
  of hardcoding all tiers as AnthropicClient
- Add fallback/fallbackReason metadata to ChatResponse and ChatStreamEvent
  so callers know when a fallback model was used
- Enhance doctor check to report full model stack and warn on missing
  API keys for cloud providers
- Log fallback warnings in NativeAgent and display them in TUI
- Support tier names and local_providers entries in fallback_chain
- Add 8 tests for createClientFromConfig covering all provider types
2026-02-06 09:58:56 -08:00
William Valentin ad7fc241f1 feat(telegram): display tool execution status messages
Telegram bot now shows tool status during execution:
- Sends status message when tool starts (tool name + args snippet)
- Edits status message with result on completion
- Keeps typing indicator active during tool execution
- Adds setOnToolUse() to NativeAgent for per-message callback control
2026-02-05 17:53:54 -08:00
William Valentin 4f87643341 feat(agent): add iterative tool use loop with max iterations
Rewrites NativeAgent.process() from single-turn to an iterative tool
loop. When toolRegistry and toolExecutor are provided, the agent calls
the model, executes any requested tool calls, feeds results back, and
loops until the model returns a text response or max iterations hit.

- Backward compatible: works exactly as before without tools
- Supports onToolUse callback for frontend status display
- Max iterations (default 10) prevents infinite loops
- Handles multiple tool calls per model response
- 5 new tests (8 total)
2026-02-05 17:48:38 -08:00
William Valentin f891c7aee8 fix: add API key/auth token support across all model clients 2026-02-05 10:56:40 -08:00
William Valentin fb7575f850 refactor: integrate SessionManager into daemon and agent
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 00:43:09 -08:00
William Valentin 6e6c263e14 feat: integrate model router, session persistence, and hook engine
- NativeAgent now loads/saves messages to SessionStore
- Daemon creates ModelRouter with fallback chain support
- Telegram bot handles confirmation callbacks from HookEngine
- Session data stored in ~/.local/share/flynn/sessions.db

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 00:05:42 -08:00
William Valentin 69309e58bc feat: add native agent with conversation history 2026-02-02 20:56:46 -08:00