will/flynn

Author	SHA1	Message	Date
William Valentin	90ce622080	feat(policy): enforce truthfulness and autonomy guardrails Add runtime truthfulness modes and autonomy-level tool gating with audit metadata for overrides/denials. Wire policy through prompt assembly, tool execution context, and daemon/gateway agent paths; update tests and planning state for Phase 3 PR #2 completion.	2026-02-12 16:06:45 -08:00
William Valentin	a8a2c59313	feat: implement model persistence with per-session overrides - Add session_config SQLite table for per-session settings - Update routing to support session override → agent config → global default resolution chain - Upgrade WebChat SessionBridge from NativeAgent to AgentOrchestrator - Add /model, /local, /cloud commands to Telegram adapter - Add /model command to WebChat gateway handlers - Clear session overrides on /reset command - Pass memoryStore and config through to SessionBridge - Add comprehensive tests for all new functionality Fixes model persistence bug where TUI model changes didn't affect WebChat/Telegram sessions. Now: - TUI /model sets global default (persists across restarts, affects all new sessions) - WebChat/Telegram /model sets session override (only that conversation, cleared on /reset) - WebChat sessions gain AgentOrchestrator features (delegation, compaction, memory)	2026-02-11 21:51:38 -08:00
William Valentin	d62e836b5d	feat(audit): Add core audit logging infrastructure - Add AuditLogger class with rotation support - Add audit configuration to config schema - Instrument tool execution with full audit logging - Instrument session lifecycle (create, message, delete, transfer, compact) - Add audit logger initialization in daemon - Add cron scheduler audit logging Audit events captured: - tool.start/success/error/denied - session.create/message/delete/transfer/compact - cron.trigger/add/remove All logs go to ~/.local/share/flynn/audit.log (JSON lines) with rotation (10MB files, 30-day retention)	2026-02-11 15:58:07 -08:00
William Valentin	6090508bad	style: auto-fix ESLint issues (curly braces and formatting) - Add curly braces to all if/else/for/while statements - Fix indentation and trailing spaces - Auto-fixed 372 linting errors using eslint --fix - Remaining issues are warnings only (non-null assertions, explicit any types)	2026-02-11 10:30:24 -08:00
William Valentin	01c3175fdb	fix: normalize OpenAI/GitHub finish_reason to Flynn stopReason conventions OpenAI-compatible providers return 'stop' and 'tool_calls' as finish_reason values, but Flynn's agent loop expects Anthropic-style 'end_turn' and 'tool_use'. This caused the agent to exit the tool loop prematurely when falling back to GitHub Copilot (due to Anthropic API quota exhaustion). - openai.ts: Map 'stop' → 'end_turn', 'length' → 'max_tokens', tool_calls with actual tools → 'tool_use', tool_calls without tools → 'end_turn' - github.ts: Handle edge case where finish_reason is 'tool_calls' but no tools were parsed - agent.ts: Accept both 'tool_use' and 'tool_calls' as valid stop reasons (belt-and-suspenders), extract toolCalls to local variable for TS narrowing - openai.test.ts: Update expectations to match new normalized values	2026-02-11 09:49:36 -08:00
William Valentin	1aab006a7f	feat: improve agent loop resilience — same-tool nudging and error handling - agent.ts: track consecutive calls to the same tool (ignoring args) and inject a nudge after 4 repeats telling the model to summarize and respond, preventing local models from endlessly retrying searches with slight query variations - agent.ts: wrap the entire tool loop iteration in try-catch so model/network errors don't crash the daemon — returns a descriptive error message instead - Tests for both: nudge triggers after 4 same-tool calls, error recovery persists to history	2026-02-11 09:33:30 -08:00
William Valentin	bf9ca690f3	fix(agent): detect repeated tool call loops and make max_iterations configurable Local LLMs often get stuck calling the same tool repeatedly because they lack the sophistication to synthesize results. The agent loop had no safeguard — it re-executed whatever the model requested up to 10 times. Add fingerprint-based loop detection: if the same tool+args combination repeats 3 consecutive times, break the loop and return the last results. Also add agents.max_iterations to the config schema so the iteration limit is user-configurable (default: 10).	2026-02-10 19:35:09 -08:00
William Valentin	796e143d61	fix(agent): inject tool inventory note when tools change mid-session Stale session history can cause the model to follow old "I can't do that" patterns even when new tools are available. NativeAgent now tracks a tool fingerprint and appends a system prompt note listing current tools when the inventory changes, resetting on session reset. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 11:41:31 -08:00
William Valentin	9be8f76bc7	feat: implement Tier 3 features — lane queue, credential redaction, token dashboard, xAI, Voyage AI - Lane Queue: per-session FIFO queue in gateway replacing reject-when-busy (9 tests) - Credential Redaction: redactConfig() expanded to cover 18+ secret fields (16 tests) - Web UI Token Dashboard: system.tokenUsage endpoint + Usage page with summary cards - xAI (Grok) Provider: OpenAI-compatible client with model pricing - Voyage AI Embeddings: new embedding provider with configurable dimensions (5 tests) - Update gap analysis: 90→95 match (70%→74%), Tier 3 section marked DONE - Update state.json: test count 1001→1034, add tier3_completion entry Total: 1034 tests passing across 85 files, typecheck clean	2026-02-09 10:32:57 -08:00
William Valentin	1c2f54fae3	feat: implement tier 1 quick wins (tool groups, typing, pruning, verbose, think) Five additive features with no breaking changes: - Tool groups: group:fs, group:runtime, group:web, group:memory syntactic sugar for allow/deny lists in tool policy config - Typing indicators: Discord sendTyping() and WhatsApp sendStateTyping() on message receipt for better UX feedback - Session pruning: TTL-based auto-cleanup via sessions.ttl config with hourly daemon timer and SQLite GROUP BY pruning - /verbose command: TUI command parser toggle for raw streaming display - !!think prefix: per-message extended thinking mode wired through Anthropic (budget_tokens), OpenAI/GitHub (reasoning_effort), and Gemini (thinkingConfig) providers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 13:35:00 -08:00
William Valentin	6bb424cddc	feat: add agent tools and sanitize tool names for Anthropic API Add 8 new agent-callable tools (sessions.list/history/create/delete, agents.list, message.send, cron.list/trigger) and sanitize tool names at the API boundary (dots → underscores) to comply with Anthropic's `^[a-zA-Z0-9_-]{1,128}` requirement. Reverse-maps sanitized names back to internal names for hook callbacks and tool execution. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 12:23:09 -08:00
William Valentin	b9bfee9c5b	feat: add outbound attachment support with media.send tool Introduces OutboundAttachment type on OutboundMessage, an OutboundAttachmentCollector (push/drain pattern), and a media.send tool that queues files for outbound delivery. Each channel adapter (Telegram, Discord, Slack, WhatsApp) sends attachments after the text reply. Includes 15 tests for collector and tool.	2026-02-07 09:09:00 -08:00
William Valentin	a515912537	feat: add multimodal media pipeline for image support across all providers and channels Widen Message.content from string to string | MessageContentPart[] to support multimodal content. Add Attachment type to channel layer, media conversion utilities, and image extraction to all channel adapters (Telegram, Discord, Slack, WhatsApp). Update all model clients (Anthropic, OpenAI, Gemini, Bedrock) to convert structured content to provider-specific formats. Fix downstream consumers (tokens, compaction, TUI, local models) to handle the widened type via getMessageText() helper.	2026-02-06 17:17:21 -08:00
William Valentin	ee0af0cc06	feat: add tool allow/deny profiles with per-agent and per-provider filtering Implements configurable tool filtering with four built-in profiles (minimal, messaging, coding, full), global and per-agent/per-provider allow/deny lists with glob pattern support, and defense-in-depth enforcement at both tool listing and execution time. New: src/tools/policy.ts (ToolPolicy engine), src/tools/policy.test.ts (37 tests) Modified: config schema, tool registry, tool executor, NativeAgent, AgentOrchestrator, daemon wiring, gateway tool handler, test mocks	2026-02-06 15:30:34 -08:00
William Valentin	4316dbd3be	feat: add P2 features — retry policy, prompt templating, usage tracking, tech debt cleanup - Extract shared splitMessage() into channels/utils.ts (dedup 4 adapters) - Add Slack user name resolution with caching (users.info API) - Add withRetry() with exponential backoff + jitter, isRetryable() filter - Wire retry config into ModelRouter.chat() (non-streaming only) - Add assembleSystemPrompt() multi-file template system (SOUL/AGENTS/IDENTITY/USER/TOOLS.md) - Add usage tracking accumulators in NativeAgent + AgentOrchestrator - Add estimateCost() with per-model pricing table - Add /usage TUI command with full usage report formatting - Add retrySchema and promptSchema to config schema Tests: 569 passing, typecheck clean	2026-02-06 15:12:35 -08:00
William Valentin	7a35b22458	feat: wire up all Phase 2-6 features into daemon and config Integrate all new features into the shared infrastructure: - Config schema: add memory, discord, slack, process, web_search schemas - Daemon wiring: memory store init, tool registration, channel adapters - Orchestrator: memory injection into system prompt, extraction on compaction - Agent: add setSystemPrompt() for dynamic prompt updates - Channel/tool index: export new adapters and tool factories - Add @slack/bolt, discord.js, turndown, linkedom, @mozilla/readability deps - Update state.json with Phase 3b completion (494 tests passing)	2026-02-06 14:24:39 -08:00
William Valentin	306e11bd2e	feat: add multi-model delegation (Phase 0) and context compaction (Phase 1) Phase 0 — Multi-Model Delegation: - AgentOrchestrator wraps NativeAgent with delegate() for stateless single-turn calls to any model tier (fast/default/complex/local) - DelegationConfig maps task types (compaction, classification, etc.) to model tiers - Delegation prompts for compaction, memory extraction, classification, and tool summarisation - Per-tier usage tracking for cost visibility - Config schema: agents.delegation and agents.primary_tier Phase 1 — Context Compaction: - Token estimation (char/4 heuristic) with context window lookup - shouldCompact() threshold check against context window percentage - compactHistory() splits old/recent messages, delegates summary to fast tier, returns CompactionResult - Automatic compaction in AgentOrchestrator.process() when configured - Force-compact via orchestrator.compact() with session persistence - Session.replaceHistory() with atomic SQLite transaction - /compact TUI command with feedback on compacted token counts - Config schema: compaction.enabled, threshold_pct, keep_turns, summary_max_tokens Tests: 385 passing across 50 files (22 new tests in 2 new test files)	2026-02-06 13:17:02 -08:00
William Valentin	e4b7f96d33	fix: provider-aware model routing with fallback visibility - Extract createClientFromConfig() to dispatch on provider field instead of hardcoding all tiers as AnthropicClient - Add fallback/fallbackReason metadata to ChatResponse and ChatStreamEvent so callers know when a fallback model was used - Enhance doctor check to report full model stack and warn on missing API keys for cloud providers - Log fallback warnings in NativeAgent and display them in TUI - Support tier names and local_providers entries in fallback_chain - Add 8 tests for createClientFromConfig covering all provider types	2026-02-06 09:58:56 -08:00
William Valentin	ad7fc241f1	feat(telegram): display tool execution status messages Telegram bot now shows tool status during execution: - Sends status message when tool starts (tool name + args snippet) - Edits status message with result on completion - Keeps typing indicator active during tool execution - Adds setOnToolUse() to NativeAgent for per-message callback control	2026-02-05 17:53:54 -08:00
William Valentin	4f87643341	feat(agent): add iterative tool use loop with max iterations Rewrites NativeAgent.process() from single-turn to an iterative tool loop. When toolRegistry and toolExecutor are provided, the agent calls the model, executes any requested tool calls, feeds results back, and loops until the model returns a text response or max iterations hit. - Backward compatible: works exactly as before without tools - Supports onToolUse callback for frontend status display - Max iterations (default 10) prevents infinite loops - Handles multiple tool calls per model response - 5 new tests (8 total)	2026-02-05 17:48:38 -08:00
William Valentin	f891c7aee8	fix: add API key/auth token support across all model clients	2026-02-05 10:56:40 -08:00
William Valentin	fb7575f850	refactor: integrate SessionManager into daemon and agent Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 00:43:09 -08:00
William Valentin	6e6c263e14	feat: integrate model router, session persistence, and hook engine - NativeAgent now loads/saves messages to SessionStore - Daemon creates ModelRouter with fallback chain support - Telegram bot handles confirmation callbacks from HookEngine - Session data stored in ~/.local/share/flynn/sessions.db Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 00:05:42 -08:00
William Valentin	69309e58bc	feat: add native agent with conversation history	2026-02-02 20:56:46 -08:00

24 Commits
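
The finish_reason normalization described in commit 01c3175fdb can be illustrated with a short TypeScript sketch. This is not the code in openai.ts. The function name `normalizeFinishReason`, the `StopReason` type, the `toolCallCount` parameter, and the fallback for unrecognized values are all assumptions made for illustration. Only the four mappings themselves come from the commit message.

```typescript
// Hypothetical sketch: names and signature are assumptions, not taken from the repository.
// Anthropic-style stop reasons that the commit says Flynn's agent loop expects.
type StopReason = "end_turn" | "tool_use" | "max_tokens";

// Map an OpenAI-compatible finish_reason onto an Anthropic-style stop reason.
// toolCallCount is the number of tool calls actually parsed from the response.
function normalizeFinishReason(
  finishReason: string | null | undefined,
  toolCallCount: number,
): StopReason {
  switch (finishReason) {
    case "length":
      // The response was cut off by the token limit.
      return "max_tokens";
    case "tool_calls":
      // Treat 'tool_calls' with no parsed tools as a normal end of turn,
      // so the agent loop does not wait for tool results that never arrive.
      return toolCallCount > 0 ? "tool_use" : "end_turn";
    case "stop":
      return "end_turn";
    default:
      // Assumption: unrecognized or missing values are treated as end of turn.
      return "end_turn";
  }
}
```

Under these assumptions, `normalizeFinishReason("tool_calls", 2)` yields `"tool_use"`, while `normalizeFinishReason("tool_calls", 0)` yields `"end_turn"`.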