Add runtime truthfulness modes and autonomy-level tool gating with audit metadata for overrides/denials.
Wire policy through prompt assembly, tool execution context, and daemon/gateway agent paths; update tests and planning state for Phase 3 PR #2 completion.
Replace manual process management with systemctl --user commands.
Uses ollama.service and llama-server.service units for proper lifecycle
management, VRAM cleanup, and integration with system services.
Track PIDs of backends started by /backend command and only kill those
specific PIDs. Previous implementation used pkill which would kill all
Ollama/llama-server processes including those started by the user or
systemd services. Now we only terminate processes we started.
- Add local_providers with ollama and llamacpp configurations
- /backend command now stops current daemon before starting new one
- Start backends as detached processes to avoid blocking TUI
- Wait 500ms for daemon to initialize before switching
- Add session_config SQLite table for per-session settings
- Update routing to support session override → agent config → global default resolution chain
- Upgrade WebChat SessionBridge from NativeAgent to AgentOrchestrator
- Add /model, /local, /cloud commands to Telegram adapter
- Add /model command to WebChat gateway handlers
- Clear session overrides on /reset command
- Pass memoryStore and config through to SessionBridge
- Add comprehensive tests for all new functionality
Fixes model persistence bug where TUI model changes didn't affect WebChat/Telegram sessions. Now:
- TUI /model sets global default (persists across restarts, affects all new sessions)
- WebChat/Telegram /model sets session override (only that conversation, cleared on /reset)
- WebChat sessions gain AgentOrchestrator features (delegation, compaction, memory)
- Send user feedback when voice/audio download fails instead of silent failure
- Send graceful message when audio transcription is not configured instead of empty text which crashes API
- Add capabilities.test.ts (18 tests) for supportsAudioInput()
- Add 15 audio tests to media.test.ts (hasAudio, stripAudioParts, attachmentToAudioSource)
- Add estimateAudioTokens() to tokens.ts (base64→bytes→duration→tokens)
- Update estimateMessageTokens() to include audio content parts
- Add 5 audio token tests to tokens.test.ts
- Add supports_audio config override to model schema
- Wire supports_audio from tier config through routing to capability check
Total tests: 1369 (was 1331, +38 audio-related)
- Create capabilities.ts with supportsAudioInput() detection
- Gemini, OpenAI, and GitHub Models get native audio passthrough
- Anthropic, Bedrock, Ollama, llama.cpp fall back to Whisper transcription
- routing.ts now checks model capability before deciding to transcribe
- Audio attachments are stripped for non-native models (only transcript text passed)
- Remove deprecated audioConfig from createMessageRouter deps (read from config.audio)
- Add AudioSource interface and 'audio' variant to MessageContentPart union
- Update buildUserMessage() to create audio content parts from attachments
- Add attachmentToAudioSource(), hasAudio(), stripAudioParts() helpers
- Gemini: native audio via inlineData (same format as images)
- OpenAI/GitHub: native audio via input_audio content parts
- Anthropic/Bedrock: graceful fallback to transcript text
- Update getMessageTextWithTools() to handle audio blocks for local models
- Add createAudioTranscribeTool with OpenAI/Groq/Ollama/llama.cpp provider support
- Refactor audio config schema to nested audio.enabled + audio.provider structure
- Move audio tool registration to initTools() for conditional enablement
- Fix duplication bug in audio-transcribe.ts URL download handler
- Support base64 data and URL-based audio input with format detection
- Add curly braces to all if/else/for/while statements
- Fix indentation and trailing spaces
- Auto-fixed 372 linting errors using eslint --fix
- Remaining issues are warnings only (non-null assertions, explicit any types)
OpenAI-compatible providers return 'stop' and 'tool_calls' as finish_reason
values, but Flynn's agent loop expects Anthropic-style 'end_turn' and
'tool_use'. This caused the agent to exit the tool loop prematurely when
falling back to GitHub Copilot (due to Anthropic API quota exhaustion).
- openai.ts: Map 'stop' → 'end_turn', 'length' → 'max_tokens', tool_calls
with actual tools → 'tool_use', tool_calls without tools → 'end_turn'
- github.ts: Handle edge case where finish_reason is 'tool_calls' but no
tools were parsed
- agent.ts: Accept both 'tool_use' and 'tool_calls' as valid stop reasons
(belt-and-suspenders), extract toolCalls to local variable for TS narrowing
- openai.test.ts: Update expectations to match new normalized values
- agent.ts: track consecutive calls to the same tool (ignoring args) and
inject a nudge after 4 repeats telling the model to summarize and respond,
preventing local models from endlessly retrying searches with slight
query variations
- agent.ts: wrap the entire tool loop iteration in try-catch so model/network
errors don't crash the daemon — returns a descriptive error message instead
- Tests for both: nudge triggers after 4 same-tool calls, error recovery
persists to history
- ollama.ts: add normalizeMessagesForOllama() converting Anthropic-style
tool_use/tool_result blocks to Ollama's native tool_calls + role:tool format
- llamacpp.ts: add normalizeMessagesForLlamaCpp() with hybrid approach —
assistant tool_calls in native format, but tool results as structured user
messages (many GGUF templates silently drop role:tool messages)
- llamacpp.ts: add configurable requestTimeout with AbortController (default 3min)
- Both use fast-path when no tool blocks are present (zero overhead)
- Full test coverage for both normalizers: plain text passthrough, tool_use
conversion, tool_result mapping, multi-tool round trips, error results
- SOUL.md: list all available tools (web.search, memory.*, cron.*, etc.)
and add Tool Usage Rules section enforcing 'act, don't narrate'
- cron.ts: add getJob(), addJob(), removeJob() to CronScheduler for
runtime (ephemeral) cron job management
- cron tools: add cron.create and cron.delete tools, enhance cron.list
to show schedule/output/message details
- policy.ts: add cron tools to messaging and coding profiles, add
group:cron to tool groups
Fixes issue where models would narrate tool intent ('let me search...')
then stop without actually calling tools.
Allow cron jobs to specify a `model_tier` field that controls which LLM
tier handles the job, without needing separate agent configs. Precedence:
cron job model_tier > agent config > global primary_tier > 'default'.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Local backends using strict chat templates (e.g. Mistral 3) rejected
Flynn's Anthropic-style tool_use/tool_result content blocks, causing
'roles must alternate' errors. Added getMessageTextWithTools() and
normalizeMessagesForLocal() to serialize structured blocks to plain
text, drop empty messages, and merge consecutive same-role messages.
Also fixed compaction to ensure kept messages start with user role.
Adds process.loadEnvFile() to CLI entry point so API keys (ZHIPUAI_API_KEY,
OPENROUTER_API_KEY, XAI_API_KEY, etc.) can be stored in a project .env file
instead of shell environment or systemd service config. Uses Node >= 20.12
built-in — no dotenv dependency needed. Silent no-op if .env doesn't exist.
Updates .env.example with placeholders for all provider API keys.
Previously, switching to zhipuai/openrouter/xai via /model would throw a
confusing 'OPENAI_API_KEY missing' error from the OpenAI SDK. Now
createClientFromConfig validates API keys before constructing the client,
throwing errors that name the correct env var (e.g. ZHIPUAI_API_KEY).
Also fixes the misleading 'as anthropic' type cast in the /model handler
to validate against MODEL_PROVIDERS and use the ModelProvider type.
Wrap each message in a .message-wrapper div and render copy/edit buttons
below the bubble instead of overlapping inside it. Improves readability
and prevents buttons from covering message content.
Extract MODEL_PROVIDERS const from config schema as single source of truth
for provider names. PROVIDER_NAMES in TUI commands now imports from schema
instead of maintaining a hardcoded list. Adds tests verifying sync.
Updates README TUI Commands section with /model hot-swap documentation,
supported providers, and runtime model switching examples.
Copy button on all messages (clipboard API with checkmark feedback).
Edit button on user messages populates the input textarea.
Buttons appear on hover (desktop) or always visible (mobile).
ModelRouter now supports multiple tier-change listeners via addOnTierChange(),
SessionBridge subscribes to tier changes and propagates them to all WebChat
agents (both existing and newly created), and the fullscreen TUI now also
updates the agent's tier when switching models (matching minimal TUI behavior).
Local LLMs often get stuck calling the same tool repeatedly because they
lack the sophistication to synthesize results. The agent loop had no
safeguard — it re-executed whatever the model requested up to 10 times.
Add fingerprint-based loop detection: if the same tool+args combination
repeats 3 consecutive times, break the loop and return the last results.
Also add agents.max_iterations to the config schema so the iteration
limit is user-configurable (default: 10).
Gmail API returns snippets with HTML entities (&, ', <br>, etc.)
that leaked into LLM responses as raw HTML. Added shared sanitizeHtml()
utility in src/utils/html.ts and applied it to gmail tool snippets,
HTML body fallback, and gmail watcher snippets.