will/flynn

Author	SHA1	Message	Date
William Valentin	3d59e5ea9d	Add localhost->127.0.0.1 fallback for transcription fetch	2026-02-22 20:26:38 -08:00
William Valentin	487f26e36d	Harden audio transcription fetch path with retries and timeout	2026-02-22 19:54:58 -08:00
William Valentin	948d4ac6d8	chore(lint): burn down remaining warnings to zero	2026-02-15 23:14:21 -08:00
William Valentin	148219153e	feat(audio): add tests, token estimation, and config override for native audio - Add capabilities.test.ts (18 tests) for supportsAudioInput() - Add 15 audio tests to media.test.ts (hasAudio, stripAudioParts, attachmentToAudioSource) - Add estimateAudioTokens() to tokens.ts (base64→bytes→duration→tokens) - Update estimateMessageTokens() to include audio content parts - Add 5 audio token tests to tokens.test.ts - Add supports_audio config override to model schema - Wire supports_audio from tier config through routing to capability check Total tests: 1369 (was 1331, +38 audio-related)	2026-02-11 18:27:19 -08:00
William Valentin	6761dca1c2	fix: normalize message roles for local model backends (llama.cpp, Ollama) Local backends using strict chat templates (e.g. Mistral 3) rejected Flynn's Anthropic-style tool_use/tool_result content blocks, causing 'roles must alternate' errors. Added getMessageTextWithTools() and normalizeMessagesForLocal() to serialize structured blocks to plain text, drop empty messages, and merge consecutive same-role messages. Also fixed compaction to ensure kept messages start with user role.	2026-02-10 22:04:17 -08:00
William Valentin	2a962abcd0	feat: add audio transcription pipeline for voice messages Adds Whisper-compatible audio transcription via configurable endpoint. New functions: isSupportedAudio(), mimeToExtension(), transcribeAudio(), buildUserMessageWithAudio(). Config schema gains audio section with transcription_endpoint, api_key, and model. Daemon wires transcription into the message router. Channel adapters extract audio from voice/audio messages (Telegram voice+audio, Discord audio/, Slack audio/, WhatsApp ptt+audio). Includes 57 media tests (was 25, now covers all audio paths).	2026-02-07 09:09:13 -08:00
William Valentin	a515912537	feat: add multimodal media pipeline for image support across all providers and channels Widen Message.content from string to string | MessageContentPart[] to support multimodal content. Add Attachment type to channel layer, media conversion utilities, and image extraction to all channel adapters (Telegram, Discord, Slack, WhatsApp). Update all model clients (Anthropic, OpenAI, Gemini, Bedrock) to convert structured content to provider-specific formats. Fix downstream consumers (tokens, compaction, TUI, local models) to handle the widened type via getMessageText() helper.	2026-02-06 17:17:21 -08:00
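
Commit 6761dca1c2 describes three normalization steps for local backends: serialize structured blocks to plain text, drop empty messages, and merge consecutive same-role messages. The following is a minimal sketch of that idea, not the code from the commit. The type shapes, the bracketed serialization format, and the names `flattenContent` and `normalizeRoles` are assumptions made for illustration.

```typescript
// Hypothetical, simplified message shapes; Flynn's real types are not shown in this log.
type Role = "system" | "user" | "assistant";

type ContentPart =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; input: unknown }
  | { type: "tool_result"; content: string };

interface Message {
  role: Role;
  content: string | ContentPart[];
}

// Flatten structured content into plain text.
// The bracketed format below is an assumption, not Flynn's actual serialization.
function flattenContent(msg: Message): string {
  if (typeof msg.content === "string") return msg.content;
  return msg.content
    .map((part): string => {
      switch (part.type) {
        case "text":
          return part.text;
        case "tool_use":
          return `[tool_use ${part.name}: ${JSON.stringify(part.input)}]`;
        case "tool_result":
          return `[tool_result: ${part.content}]`;
      }
    })
    .join("\n");
}

// Serialize every message to text, drop empty ones, and merge neighbours
// that share a role so that roles alternate for strict chat templates.
function normalizeRoles(messages: Message[]): Message[] {
  const out: { role: Role; content: string }[] = [];
  for (const msg of messages) {
    const text = flattenContent(msg).trim();
    if (text === "") continue; // drop empty messages
    const last = out[out.length - 1];
    if (last !== undefined && last.role === msg.role) {
      last.content += "\n\n" + text; // merge consecutive same-role messages
    } else {
      out.push({ role: msg.role, content: text });
    }
  }
  return out;
}
```

In this sketch empty messages are dropped before merging. That order matters: a blank assistant turn between two user turns would otherwise leave two adjacent user messages.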

7 Commits
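
Commits 487f26e36d and 3d59e5ea9d mention retries, a timeout, and a localhost->127.0.0.1 fallback for the transcription fetch. The sketch below shows one way those pieces could fit together. It is an illustration under stated assumptions, not the code from those commits: the names `candidateUrls` and `fetchWithFallback`, the retry count, the timeout value, and the order in which retries and the fallback are tried are all hypothetical.

```typescript
// Minimal response shape so the sketch works with any fetch-like function.
interface ResponseLike {
  ok: boolean;
  status: number;
}

type FetchLike<R extends ResponseLike> = (
  url: string,
  init: { signal: AbortSignal },
) => Promise<R>;

// Return the endpoint itself, plus a 127.0.0.1 variant when the host is "localhost".
function candidateUrls(endpoint: string): string[] {
  const url = new URL(endpoint);
  if (url.hostname !== "localhost") return [endpoint];
  url.hostname = "127.0.0.1";
  return [endpoint, url.toString()];
}

// Try each candidate URL up to `retries + 1` times, aborting any attempt
// that exceeds `timeoutMs`. The defaults here are assumed values.
async function fetchWithFallback<R extends ResponseLike>(
  endpoint: string,
  doFetch: FetchLike<R>,
  retries = 2,
  timeoutMs = 5000,
): Promise<R> {
  let lastError: unknown = new Error("no attempt made");
  for (const url of candidateUrls(endpoint)) {
    for (let attempt = 0; attempt <= retries; attempt++) {
      const controller = new AbortController();
      const timer = setTimeout(() => controller.abort(), timeoutMs);
      try {
        const res = await doFetch(url, { signal: controller.signal });
        if (res.ok) return res;
        lastError = new Error(`HTTP ${res.status} from ${url}`);
      } catch (err) {
        lastError = err; // network failure or timeout abort
      } finally {
        clearTimeout(timer);
      }
    }
  }
  throw lastError;
}
```

Passing the fetch function as a parameter keeps the retry logic testable without a network. In real use the caller would wrap the global `fetch` with the request method, headers, and audio body.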