Commit Graph

3 Commits

Author SHA1 Message Date
William Valentin 6761dca1c2 fix: normalize message roles for local model backends (llama.cpp, Ollama)
Local backends using strict chat templates (e.g. Mistral 3) rejected
Flynn's Anthropic-style tool_use/tool_result content blocks, causing
'roles must alternate' errors. Added getMessageTextWithTools() and
normalizeMessagesForLocal() to serialize structured blocks to plain
text, drop empty messages, and merge consecutive same-role messages.
Also fixed compaction to ensure kept messages start with user role.
2026-02-10 22:04:17 -08:00
William Valentin 2a962abcd0 feat: add audio transcription pipeline for voice messages
Adds Whisper-compatible audio transcription via configurable endpoint.
New functions: isSupportedAudio(), mimeToExtension(), transcribeAudio(),
buildUserMessageWithAudio(). Config schema gains audio section with
transcription_endpoint, api_key, and model. Daemon wires transcription
into the message router. Channel adapters extract audio from voice/audio
messages (Telegram voice+audio, Discord audio/*, Slack audio/*, WhatsApp
ptt+audio). Includes 57 media tests (was 25, now covers all audio paths).
2026-02-07 09:09:13 -08:00
William Valentin a515912537 feat: add multimodal media pipeline for image support across all providers and channels
Widen Message.content from string to string | MessageContentPart[] to support
multimodal content. Add Attachment type to channel layer, media conversion
utilities, and image extraction to all channel adapters (Telegram, Discord,
Slack, WhatsApp). Update all model clients (Anthropic, OpenAI, Gemini, Bedrock)
to convert structured content to provider-specific formats. Fix downstream
consumers (tokens, compaction, TUI, local models) to handle the widened type
via getMessageText() helper.
2026-02-06 17:17:21 -08:00