Commit Graph

44 Commits

Author SHA1 Message Date
William Valentin 955b9e28e0 feat: add OpenAI OAuth, strict model overrides, and Gmail pull mode 2026-02-13 14:55:40 -08:00
William Valentin 9f81c01603 feat(session): persist model tier overrides per session
Store per-session config in SQLite and route /model and /reset through command fast-paths so channel sessions keep independent model selection across reconnects and restarts.
2026-02-13 01:04:26 -08:00
William Valentin 148219153e feat(audio): add tests, token estimation, and config override for native audio
- Add capabilities.test.ts (18 tests) for supportsAudioInput()
- Add 15 audio tests to media.test.ts (hasAudio, stripAudioParts, attachmentToAudioSource)
- Add estimateAudioTokens() to tokens.ts (base64→bytes→duration→tokens)
- Update estimateMessageTokens() to include audio content parts
- Add 5 audio token tests to tokens.test.ts
- Add supports_audio config override to model schema
- Wire supports_audio from tier config through routing to capability check

Total tests: 1369 (was 1331, +38 audio-related)
2026-02-11 18:27:19 -08:00
William Valentin 32ac4df20a feat(audio): add smart routing for native vs transcribed audio
- Create capabilities.ts with supportsAudioInput() detection
- Gemini, OpenAI, and GitHub Models get native audio passthrough
- Anthropic, Bedrock, Ollama, llama.cpp fall back to Whisper transcription
- routing.ts now checks model capability before deciding to transcribe
- Audio attachments are stripped for non-native models (only transcript text passed)
- Remove deprecated audioConfig from createMessageRouter deps (read from config.audio)
2026-02-11 18:20:10 -08:00
William Valentin 32e1a2724a feat(audio): add native audio support to type system and model clients
- Add AudioSource interface and 'audio' variant to MessageContentPart union
- Update buildUserMessage() to create audio content parts from attachments
- Add attachmentToAudioSource(), hasAudio(), stripAudioParts() helpers
- Gemini: native audio via inlineData (same format as images)
- OpenAI/GitHub: native audio via input_audio content parts
- Anthropic/Bedrock: graceful fallback to transcript text
- Update getMessageTextWithTools() to handle audio blocks for local models
2026-02-11 18:17:33 -08:00
William Valentin 6090508bad style: auto-fix ESLint issues (curly braces and formatting)
- Add curly braces to all if/else/for/while statements
- Fix indentation and trailing spaces
- Auto-fixed 372 linting errors using eslint --fix
- Remaining issues are warnings only (non-null assertions, explicit any types)
2026-02-11 10:30:24 -08:00
William Valentin 85d7a6bfec test: add stopReason edge case tests; update state.json with recent fixes
- Added tests for finish_reason 'tool_calls' with empty array → 'end_turn'
- Added test for finish_reason 'length' → 'max_tokens'
- Updated state.json with 4 new entries for today's fixes (SOUL.md, message
  normalization, agent loop resilience, stopReason normalization)
- Test count: 1329 → 1331
2026-02-11 09:51:19 -08:00
William Valentin 01c3175fdb fix: normalize OpenAI/GitHub finish_reason to Flynn stopReason conventions
OpenAI-compatible providers return 'stop' and 'tool_calls' as finish_reason
values, but Flynn's agent loop expects Anthropic-style 'end_turn' and
'tool_use'. This caused the agent to exit the tool loop prematurely when
falling back to GitHub Copilot (due to Anthropic API quota exhaustion).

- openai.ts: Map 'stop' → 'end_turn', 'length' → 'max_tokens', tool_calls
  with actual tools → 'tool_use', tool_calls without tools → 'end_turn'
- github.ts: Handle edge case where finish_reason is 'tool_calls' but no
  tools were parsed
- agent.ts: Accept both 'tool_use' and 'tool_calls' as valid stop reasons
  (belt-and-suspenders), extract toolCalls to local variable for TS narrowing
- openai.test.ts: Update expectations to match new normalized values
2026-02-11 09:49:36 -08:00
William Valentin c01de7d097 feat: native tool calling message normalization for Ollama and llama.cpp
- ollama.ts: add normalizeMessagesForOllama() converting Anthropic-style
  tool_use/tool_result blocks to Ollama's native tool_calls + role:tool format
- llamacpp.ts: add normalizeMessagesForLlamaCpp() with hybrid approach —
  assistant tool_calls in native format, but tool results as structured user
  messages (many GGUF templates silently drop role:tool messages)
- llamacpp.ts: add configurable requestTimeout with AbortController (default 3min)
- Both use fast-path when no tool blocks are present (zero overhead)
- Full test coverage for both normalizers: plain text passthrough, tool_use
  conversion, tool_result mapping, multi-tool round trips, error results
2026-02-11 09:33:21 -08:00
William Valentin 6761dca1c2 fix: normalize message roles for local model backends (llama.cpp, Ollama)
Local backends using strict chat templates (e.g. Mistral 3) rejected
Flynn's Anthropic-style tool_use/tool_result content blocks, causing
'roles must alternate' errors. Added getMessageTextWithTools() and
normalizeMessagesForLocal() to serialize structured blocks to plain
text, drop empty messages, and merge consecutive same-role messages.
Also fixed compaction to ensure kept messages start with user role.
2026-02-10 22:04:17 -08:00
William Valentin 7a69794418 fix: sync model tier between TUI and WebChat when switching models
ModelRouter now supports multiple tier-change listeners via addOnTierChange(),
SessionBridge subscribes to tier changes and propagates them to all WebChat
agents (both existing and newly created), and the fullscreen TUI now also
updates the agent's tier when switching models (matching minimal TUI behavior).
2026-02-10 20:22:40 -08:00
William Valentin 411c6d84a2 feat(tui): persist model tier selection and fix formatting
Persist /model tier choice to ~/.local/share/flynn/preferences.json so
it survives restarts. Decode HTML entities (e.g. ') in markdown
renderer output. Suppress noisy logger.info and punycode deprecation
warnings in TUI startup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 12:23:12 -08:00
William Valentin 35f4cab0dc feat: add log-level system to suppress noisy fallback debug output
Replace console.debug/log/warn calls in model router, retry, and daemon
startup with a structured logger that respects a configurable log_level.
Default level is 'info', suppressing verbose fallback debug messages in
the TUI while keeping them available via config when needed.

- Add src/logger.ts with debug/info/warn/error/silent levels
- Wire log_level into config schema (default: 'info')
- Initialize log level in both daemon and TUI startup paths
- Convert all console.debug in router.ts and retry.ts to logger.debug
- Convert console.log/warn in daemon/models.ts to logger.info/warn
2026-02-09 21:23:07 -08:00
William Valentin 9be8f76bc7 feat: implement Tier 3 features — lane queue, credential redaction, token dashboard, xAI, Voyage AI
- Lane Queue: per-session FIFO queue in gateway replacing reject-when-busy (9 tests)
- Credential Redaction: redactConfig() expanded to cover 18+ secret fields (16 tests)
- Web UI Token Dashboard: system.tokenUsage endpoint + Usage page with summary cards
- xAI (Grok) Provider: OpenAI-compatible client with model pricing
- Voyage AI Embeddings: new embedding provider with configurable dimensions (5 tests)
- Update gap analysis: 90→95 match (70%→74%), Tier 3 section marked DONE
- Update state.json: test count 1001→1034, add tier3_completion entry

Total: 1034 tests passing across 85 files, typecheck clean
2026-02-09 10:32:57 -08:00
William Valentin 6ed8a4a8bf fix: gracefully handle Ollama models without tool support
Check model capabilities via /api/show before sending tools.
Models without 'tools' capability get requests without tools
(they can still answer, just without tool use). Result is cached
per client instance. Defense-in-depth: 'does not support' added
to retry nonRetryablePatterns to avoid wasting retries on
permanent errors.
2026-02-07 17:44:47 -08:00
William Valentin fb20acfbcd feat: add tool calling support to Ollama and llama.cpp clients
- Ollama: pass tools to API, parse tool_calls responses, handle thinking field from reasoning models (deepseek-r1, glm-4.7-flash)
- llama.cpp: pass tools via OpenAI-compatible endpoint, parse tool_calls, accumulate streaming tool call deltas
- Both clients now set stopReason to 'tool_use' when tool calls are present
- Tests: 12 new tests (8 Ollama + 5 llama.cpp, total 983→995)
2026-02-07 17:20:27 -08:00
William Valentin b322e8f29c fix: GitHub Copilot fallback — remove stale API version header and fix model name mapping
Two issues prevented the GitHub Models fallback from working:

1. The X-GitHub-Api-Version: 2022-11-28 header caused '400 invalid
   apiVersion' errors. The Copilot chat completions endpoint does not
   use this header — removed from both constructor and rebuildClient.

2. The anthropicToGitHubModel mapping was incomplete: it only knew
   three models and the generic date-stripping fallback produced wrong
   names (e.g. 'claude-sonnet-4-5' instead of 'claude-sonnet-4.5').
   GitHub Copilot uses dots for sub-versions, not hyphens.

   Updated with explicit mappings for all current models (sonnet 4,
   4.5; opus 4, 4.5, 4.6; haiku 4.5) and a smarter generic fallback
   that converts digit-hyphen-digit to digit.digit at the end.

3. createClientFromConfig now auto-maps Anthropic-style model names
   when the provider is 'github', so users can copy model names from
   their Anthropic config into fallback blocks without manual renaming.
2026-02-07 14:04:54 -08:00
William Valentin 1c2f54fae3 feat: implement tier 1 quick wins (tool groups, typing, pruning, verbose, think)
Five additive features with no breaking changes:

- Tool groups: group:fs, group:runtime, group:web, group:memory syntactic
  sugar for allow/deny lists in tool policy config
- Typing indicators: Discord sendTyping() and WhatsApp sendStateTyping()
  on message receipt for better UX feedback
- Session pruning: TTL-based auto-cleanup via sessions.ttl config with
  hourly daemon timer and SQLite GROUP BY pruning
- /verbose command: TUI command parser toggle for raw streaming display
- !!think prefix: per-message extended thinking mode wired through
  Anthropic (budget_tokens), OpenAI/GitHub (reasoning_effort), and
  Gemini (thinkingConfig) providers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 13:35:00 -08:00
William Valentin b9b70ce2b1 feat: add per-tier fallback support to ModelRouter
The router now accepts a tierFallbacks map so each model tier can have
its own fallback providers. Tier fallbacks are tried before the global
fallback chain in both chat() and chatStream().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 12:09:38 -08:00
William Valentin 2a962abcd0 feat: add audio transcription pipeline for voice messages
Adds Whisper-compatible audio transcription via configurable endpoint.
New functions: isSupportedAudio(), mimeToExtension(), transcribeAudio(),
buildUserMessageWithAudio(). Config schema gains audio section with
transcription_endpoint, api_key, and model. Daemon wires transcription
into the message router. Channel adapters extract audio from voice/audio
messages (Telegram voice+audio, Discord audio/*, Slack audio/*, WhatsApp
ptt+audio). Includes 57 media tests (was 25, now covers all audio paths).
2026-02-07 09:09:13 -08:00
William Valentin d4530a7034 feat: add runtime provider/model switching via /model <tier> <provider/model>
- ModelRouter: add setClient(), labels map, getLabel(), getAllLabels()
- TUI commands: parse /model <tier> <provider/model> syntax with autocompletion
- TUI minimal: handle provider switching via createClientFromConfig factory
- Daemon: wire initial labels into router config
- Fix /model alias mappings (opus=complex, sonnet=default, haiku=fast)
- Add design doc and update state.json with feature status
2026-02-06 23:42:14 -08:00
William Valentin 73fc5d173d feat: add auto-login for GitHub Copilot when no token is available
GitHubModelsClient now lazily resolves tokens at first API call. If no
token exists (env var, stored OAuth, or config), it triggers the OAuth
device flow automatically via an onLoginRequired callback wired in both
the TUI and daemon entry points.
2026-02-06 22:33:48 -08:00
William Valentin f363717f5f feat: add GitHub Copilot model provider with OAuth device flow
Add a new 'github' model provider backed by the Copilot API
(api.githubcopilot.com), with OAuth device flow for authentication.

- New src/auth/github.ts: device flow login, token storage at
  ~/.config/flynn/auth.json with 0600 permissions
- New src/models/github.ts: OpenAI-compatible client with streaming,
  tool calling, and Copilot-specific headers
- Add 'github' to provider enum in config schema
- Register provider in daemon factory and TUI client factory
- Refactor TUI to use provider-agnostic client factory (was hardcoded
  to AnthropicClient for all tiers)
- Add /login command to TUI for interactive OAuth authorization
- Add Copilot model cost tracking entries
2026-02-06 22:26:52 -08:00
William Valentin a515912537 feat: add multimodal media pipeline for image support across all providers and channels
Widen Message.content from string to string | MessageContentPart[] to support
multimodal content. Add Attachment type to channel layer, media conversion
utilities, and image extraction to all channel adapters (Telegram, Discord,
Slack, WhatsApp). Update all model clients (Anthropic, OpenAI, Gemini, Bedrock)
to convert structured content to provider-specific formats. Fix downstream
consumers (tokens, compaction, TUI, local models) to handle the widened type
via getMessageText() helper.
2026-02-06 17:17:21 -08:00
William Valentin 0eb1f7a073 feat: add Gemini and Bedrock model providers
Add native GeminiClient using @google/generative-ai SDK and BedrockClient
using @aws-sdk/client-bedrock-runtime. Replace the previous Gemini fallback
(OpenAI-compatible shim) with the real implementation. Add OpenRouter as a
provider option (OpenAI-compatible with custom baseURL). Update model costs,
doctor CLI checks, and client factory tests.
2026-02-06 16:51:32 -08:00
William Valentin 4316dbd3be feat: add P2 features — retry policy, prompt templating, usage tracking, tech debt cleanup
- Extract shared splitMessage() into channels/utils.ts (dedup 4 adapters)
- Add Slack user name resolution with caching (users.info API)
- Add withRetry() with exponential backoff + jitter, isRetryable() filter
- Wire retry config into ModelRouter.chat() (non-streaming only)
- Add assembleSystemPrompt() multi-file template system (SOUL/AGENTS/IDENTITY/USER/TOOLS.md)
- Add usage tracking accumulators in NativeAgent + AgentOrchestrator
- Add estimateCost() with per-model pricing table
- Add /usage TUI command with full usage report formatting
- Add retrySchema and promptSchema to config schema

Tests: 569 passing, typecheck clean
2026-02-06 15:12:35 -08:00
William Valentin e4b7f96d33 fix: provider-aware model routing with fallback visibility
- Extract createClientFromConfig() to dispatch on provider field instead
  of hardcoding all tiers as AnthropicClient
- Add fallback/fallbackReason metadata to ChatResponse and ChatStreamEvent
  so callers know when a fallback model was used
- Enhance doctor check to report full model stack and warn on missing
  API keys for cloud providers
- Log fallback warnings in NativeAgent and display them in TUI
- Support tier names and local_providers entries in fallback_chain
- Add 8 tests for createClientFromConfig covering all provider types
2026-02-06 09:58:56 -08:00
William Valentin 96ade25e98 feat(models): add tool use support to OpenAIClient 2026-02-05 17:44:04 -08:00
William Valentin 36c1cfc768 feat(models): add tool use support to AnthropicClient 2026-02-05 17:44:00 -08:00
William Valentin b00706325b feat: add tool framework foundation (types, registry, executor, shell tool, model types, SOUL.md)
- Task 0: SOUL.md + loadSystemPrompt() in daemon
- Task 1: Tool type definitions (Tool, ToolCall, ToolResult, etc.)
- Task 2: ToolRegistry with Anthropic/OpenAI serialization
- Task 3: ToolExecutor with hooks, timeout, truncation
- Task 4: shell.exec builtin tool
- Task 8: Model types updated for tool use (ToolDefinition, ModelToolCall, etc.)
- Task 15: Model index exports for tool types
2026-02-05 17:39:40 -08:00
William Valentin d2a597d49d fix: add model parameter to LlamaCppClient requests 2026-02-05 15:51:33 -08:00
William Valentin dbf1acd822 feat: add streaming support and num_gpu option to Ollama client 2026-02-05 15:51:28 -08:00
William Valentin dbeaa78e2c feat: add setLocalClient and getLocalProviderName to ModelRouter 2026-02-05 13:34:25 -08:00
William Valentin d86710577d feat: wire up LlamaCppClient to model router
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 13:20:20 -08:00
William Valentin 8e7fa24fd6 feat: add clear error message when llama-server not running 2026-02-05 13:17:56 -08:00
William Valentin e8079347c7 feat: add streaming support to LlamaCppClient
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 13:15:18 -08:00
William Valentin a20156f8db feat: add LlamaCppClient with basic chat support
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 13:11:56 -08:00
William Valentin 9a48c39b07 feat(models): add streaming and tier switching to ModelRouter 2026-02-05 10:48:41 -08:00
William Valentin 896a0da10e feat(models): add streaming support to AnthropicClient 2026-02-05 10:47:42 -08:00
William Valentin 1f0cf28d1f feat(models): add streaming types for chat responses 2026-02-05 10:46:53 -08:00
William Valentin 26bd6ce65d feat: add model router with fallback chain support
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 00:29:52 -08:00
William Valentin bb16732562 feat: add Ollama client for local LLM support
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 00:27:09 -08:00
William Valentin 633cfcf713 feat: add OpenAI client for fallback support
Implements ModelClient interface with OpenAI SDK to support GPT models
as fallback when local inference is unavailable. Includes tests with
mocked OpenAI API responses.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 21:10:46 -08:00
William Valentin 75e64b534d feat: add Anthropic client wrapper
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 20:54:17 -08:00