- Add AudioSource interface and 'audio' variant to MessageContentPart union
- Update buildUserMessage() to create audio content parts from attachments
- Add attachmentToAudioSource(), hasAudio(), stripAudioParts() helpers
- Gemini: native audio via inlineData (same format as images)
- OpenAI/GitHub: native audio via input_audio content parts
- Anthropic/Bedrock: graceful fallback to transcript text
- Update getMessageTextWithTools() to handle audio blocks for local models
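For illustration, a minimal sketch of how the helpers listed above might fit together. The field names on AudioSource (mediaType, data, transcript) and the transcript-substitution behavior of stripAudioParts() are assumptions made for this example, not the definitions in the codebase:

```typescript
// Hypothetical shapes: the real AudioSource and MessageContentPart may differ.
interface AudioSource {
  mediaType: string;   // assumed field, e.g. "audio/ogg"
  data: string;        // assumed field: base64-encoded audio bytes
  transcript?: string; // assumed field: text fallback for non-audio providers
}

type MessageContentPart =
  | { type: "text"; text: string }
  | { type: "audio"; source: AudioSource };

type MessageContent = string | MessageContentPart[];

// True when structured content contains at least one audio part.
function hasAudio(content: MessageContent): boolean {
  return Array.isArray(content) && content.some((part) => part.type === "audio");
}

// Replace each audio part with its transcript (assumed behavior) and drop
// audio parts that have no transcript. Plain strings pass through untouched.
function stripAudioParts(content: MessageContent): MessageContent {
  if (typeof content === "string") return content;
  const result: MessageContentPart[] = [];
  for (const part of content) {
    if (part.type !== "audio") {
      result.push(part);
    } else if (part.source.transcript) {
      result.push({ type: "text", text: part.source.transcript });
    }
  }
  return result;
}
```

A provider without native audio support could call stripAudioParts() before building its request, while Gemini and OpenAI-style clients would pass the audio part through in their own formats.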
OpenAI-compatible providers return 'stop' and 'tool_calls' as finish_reason
values, but Flynn's agent loop expects Anthropic-style 'end_turn' and
'tool_use'. As a result, the agent exited the tool loop prematurely after
falling back to GitHub Copilot when the Anthropic API quota was exhausted.
- openai.ts: Map 'stop' → 'end_turn' and 'length' → 'max_tokens'; map
'tool_calls' → 'tool_use' when tool calls were parsed, else → 'end_turn'
- github.ts: Handle edge case where finish_reason is 'tool_calls' but no
tools were parsed
- agent.ts: Accept both 'tool_use' and 'tool_calls' as valid stop reasons
(belt-and-suspenders), extract toolCalls to local variable for TS narrowing
- openai.test.ts: Update expectations to match new normalized values
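The mapping described above can be sketched roughly as follows. The function name normalizeFinishReason and its signature are hypothetical, and treating unrecognized finish_reason values as 'end_turn' is an assumption of this sketch rather than confirmed behavior:

```typescript
// Anthropic-style stop reasons that the agent loop expects.
type StopReason = "end_turn" | "tool_use" | "max_tokens";

// Hypothetical helper: convert an OpenAI-style finish_reason into an
// Anthropic-style stop reason, using the number of tool calls actually parsed.
function normalizeFinishReason(
  finishReason: string | null | undefined,
  parsedToolCallCount: number,
): StopReason {
  switch (finishReason) {
    case "length":
      return "max_tokens";
    case "tool_calls":
      // 'tool_calls' with nothing parsed must not keep the tool loop alive.
      return parsedToolCallCount > 0 ? "tool_use" : "end_turn";
    case "stop":
      return "end_turn";
    default:
      // Assumption: any other or missing value ends the turn.
      return "end_turn";
  }
}
```

Normalizing in the provider client keeps the agent loop provider-agnostic; the extra 'tool_calls' check in agent.ts then acts only as a safety net.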
Five additive features with no breaking changes:
- Tool groups: group:fs, group:runtime, group:web, group:memory syntactic
sugar for allow/deny lists in tool policy config
- Typing indicators: Discord sendTyping() and WhatsApp sendStateTyping()
on message receipt for better UX feedback
- Session pruning: TTL-based auto-cleanup via sessions.ttl config with
hourly daemon timer and SQLite GROUP BY pruning
- /verbose command: TUI command parser toggle for raw streaming display
- !!think prefix: per-message extended thinking mode wired through
Anthropic (budget_tokens), OpenAI/GitHub (reasoning_effort), and
Gemini (thinkingConfig) providers
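As a rough sketch of the tool-group sugar, group entries in an allow or deny list could be expanded into concrete tool names before the policy is evaluated. The group names come from this change; the member tool names and the expandToolList() helper are hypothetical placeholders:

```typescript
// Hypothetical group membership: the real tool names are not listed here.
const TOOL_GROUPS = new Map<string, readonly string[]>([
  ["group:fs", ["read_file", "write_file"]],
  ["group:runtime", ["run_command"]],
  ["group:web", ["web_search", "web_fetch"]],
  ["group:memory", ["memory_read", "memory_write"]],
]);

// Expand group entries into their member tools, keep plain tool names as-is,
// and drop duplicates while preserving first-seen order.
function expandToolList(entries: readonly string[]): string[] {
  const expanded = new Set<string>();
  for (const entry of entries) {
    const members = TOOL_GROUPS.get(entry);
    if (members) {
      for (const member of members) expanded.add(member);
    } else {
      expanded.add(entry);
    }
  }
  return Array.from(expanded);
}
```

Expanding once at config load time would keep the policy check itself a plain membership test on tool names.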
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Widen Message.content from string to string | MessageContentPart[] to support
multimodal content. Add Attachment type to channel layer, media conversion
utilities, and image extraction to all channel adapters (Telegram, Discord,
Slack, WhatsApp). Update all model clients (Anthropic, OpenAI, Gemini, Bedrock)
to convert structured content to provider-specific formats. Fix downstream
consumers (tokens, compaction, TUI, local models) to handle the widened type
via getMessageText() helper.
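A minimal sketch of what a getMessageText()-style helper could look like for the widened type. The part shapes shown here (text and image with mediaType and data fields) and the newline join are assumptions for illustration; the actual helper may differ:

```typescript
// Hypothetical part shapes for illustration only.
type MessageContentPart =
  | { type: "text"; text: string }
  | { type: "image"; mediaType: string; data: string };

interface Message {
  role: "user" | "assistant";
  content: string | MessageContentPart[];
}

// Return the textual portion of a message regardless of content shape:
// plain strings are returned unchanged, structured content is reduced to
// its text parts joined by newlines (the separator is an assumption).
function getMessageText(message: Message): string {
  if (typeof message.content === "string") return message.content;
  const texts: string[] = [];
  for (const part of message.content) {
    if (part.type === "text") texts.push(part.text);
  }
  return texts.join("\n");
}
```

Consumers that only need text, such as token counting or compaction, can call the helper instead of branching on the content type themselves.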
Implement the ModelClient interface using the OpenAI SDK so that GPT models
can serve as a fallback when local inference is unavailable. Include tests
with mocked OpenAI API responses.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>