Format tool names as human-readable labels (e.g. "Gmail: List") and
show args as compact key-value pairs instead of raw JSON. Also register
Gmail tools in the TUI when automation.gmail is enabled.
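
The label and argument formatting described above could look roughly like the sketch below. The helper names `formatToolLabel` and `formatToolArgs`, and the exact `key=value` output shape, are assumptions made for illustration and are not taken from the actual change.

```typescript
// Hypothetical helpers: names and output format are illustrative assumptions.
function capitalize(word: string): string {
  return word.length === 0 ? word : word.charAt(0).toUpperCase() + word.slice(1);
}

// Turn a dotted tool name such as "gmail.list" into "Gmail: List".
function formatToolLabel(toolName: string): string {
  const parts = toolName.split(".");
  const group = parts[0] ?? "";
  const rest = parts.slice(1);
  if (rest.length === 0) return capitalize(group);
  return `${capitalize(group)}: ${rest.map(capitalize).join(" ")}`;
}

// Render args as compact key=value pairs instead of a raw JSON blob.
// Strings are shown as-is; other values fall back to their JSON form.
function formatToolArgs(args: Record<string, unknown>): string {
  return Object.entries(args)
    .map(([key, value]) => `${key}=${typeof value === "string" ? value : JSON.stringify(value)}`)
    .join(" ");
}
```

For example, `formatToolLabel("calendar.today")` yields `Calendar: Today` under these assumptions.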
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Stale session history can cause the model to follow old "I can't do
that" patterns even when new tools are available. NativeAgent now tracks
a tool fingerprint and appends a system prompt note listing current
tools when the inventory changes, resetting on session reset.
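
A minimal sketch of the fingerprint idea, assuming the fingerprint is just the sorted tool names joined into a string. The class name `ToolInventoryTracker`, its methods, and the wording of the note are hypothetical; NativeAgent's real implementation may differ.

```typescript
// Hypothetical sketch: not NativeAgent's actual code.
class ToolInventoryTracker {
  private lastFingerprint: string | null = null;

  // Returns a system prompt note when the tool set differs from the one seen
  // on the previous call, otherwise null. Order of toolNames does not matter.
  noteIfChanged(toolNames: string[]): string | null {
    const sorted = [...toolNames].sort();
    const fingerprint = sorted.join(",");
    if (fingerprint === this.lastFingerprint) return null;
    this.lastFingerprint = fingerprint;
    return `Note: the tools currently available are: ${sorted.join(", ")}.`;
  }

  // Called on session reset so the next turn re-announces the inventory.
  reset(): void {
    this.lastFingerprint = null;
  }
}
```

In this sketch the first call after construction or reset always produces a note, which is an assumption about the desired behavior.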
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add calendar.today, calendar.list, calendar.search tools mirroring the
Gmail tool pattern. Includes gcal-auth CLI command, config schema, tool
policy entries (messaging/coding profiles + group:gcal), and 17 tests.
Also wires up gmail and gcal tool registration in the daemon and TUI.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Block-level renderer methods (paragraph, heading, blockquote, list) were
using raw token.text instead of this.parser.parseInline(tokens), causing
bold, italic, and inline code to never render. Add table renderer with
aligned columns and box-drawing separators. Remove unused marked-terminal
dependency (incompatible with marked v17).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements `flynn gmail-auth`, completing the OAuth2 flow that
GmailWatcher references but that was never built. Supports a local
callback server (default) and a --manual paste mode. Adds a Gmail health
check to `flynn doctor`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds comprehensive integration tests for the first-run wizard verifying config
generation for different provider/channel combinations. Updates shell completion
to include the 'setup' command with its options.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Register setup command in CLI index
- Offer setup wizard when running `flynn start` with no config
- Guard telegram log output since telegram is now optional
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements Task 6 of the setup wizard:
- orchestrator.ts: runMenu() for interactive configuration loop
- orchestrator.ts: runFirstRunWizard() for new user onboarding
- orchestrator.test.ts: test for menu exit behavior
- setup.ts: registerSetupCommand() and runSetup() handler
- Handles both first-run and existing config scenarios
- Saves YAML config to disk
- Optional daemon startup after first-run
All tests pass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement setupChannels function with support for Telegram, Discord, Slack, and WhatsApp.
Includes WebChat gateway configuration and channel choice loop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ConfigBuilder class to accumulate wizard answers into config objects with YAML
serialization, and renderSummary function to display configuration summary. Includes
9 test cases covering provider setup, channel configuration, and feature flags.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Created a Prompter interface and factory function for interactive CLI prompts:
- ask(): text input with optional default values
- confirm(): yes/no confirmation with default
- choose(): numbered menu selection with fallback
- password(): plain text input for now (no-echo input is planned for the TUI)
- println(): simple output helper
All 9 tests pass (ask, confirm, choose, password scenarios).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clean up the once('close') listener on the readline Interface when
rl.question() resolves normally. Previously, each prompt loop iteration
accumulated a close listener that was never removed, triggering
MaxListenersExceededWarning after 11 prompts.
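
The listener cleanup can be sketched as follows. `PromptSource` is a hypothetical minimal interface introduced here so the example stays self-contained; Node's `readline.Interface` exposes methods with these names (`question`, `once`, `removeListener`). The function name `questionAsync` is also an assumption.

```typescript
// Hypothetical minimal shape of the object we prompt through.
interface PromptSource {
  question(query: string, callback: (answer: string) => void): void;
  once(event: "close", listener: () => void): unknown;
  removeListener(event: "close", listener: () => void): unknown;
}

// Ask one question; reject if the input closes before an answer arrives.
function questionAsync(source: PromptSource, query: string): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const onClose = (): void => reject(new Error("input closed before an answer was given"));
    source.once("close", onClose);
    source.question(query, (answer) => {
      // The fix: drop the close listener once the prompt resolves normally,
      // so repeated prompts do not pile up listeners on the same interface.
      source.removeListener("close", onClose);
      resolve(answer);
    });
  });
}
```

Without the `removeListener` call, every prompt would leave one more `close` listener behind, which is what eventually trips the default EventEmitter limit of 10.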
- Core counters: messages processed, sessions, queue depth, uptime, active requests, errors
- Model performance table: recent calls with latency, tokens/sec, provider, status
- Event stream: scrollable log with color-coded levels (error/warn/info)
- Active requests: in-flight request table with session, channel, duration
- Channels grid: existing channel status cards preserved
- Dual timer refresh: 3s for metrics/events/requests, 10s for health/channels
- Targeted DOM updates via getElementById for flicker-free fast updates
- Track active requests with startRequest/endRequest around lane queue work
- Increment messagesProcessed on successful agent.process completion
- Record errors and error events on agent.send failures
- Record tool failure events with tool name and error details
- Add MetricsCollector class with counters, model call ring buffer, event ring buffer, and active request tracking
- Add system.metrics, system.events, system.activeRequests RPC handlers
- Add GET /health unauthenticated HTTP endpoint for Docker HEALTHCHECK
- Add totalPending() to LaneQueue for queue depth metrics
- Add 20 tests for MetricsCollector
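
As an illustration of the ring buffer and active request tracking, here is a reduced sketch. `RingBuffer` and `MiniMetrics` are hypothetical names, and the capacity of 3 is chosen only to keep the demo small; none of this is the real MetricsCollector.

```typescript
// Hypothetical sketch: a fixed-capacity buffer that keeps only the newest items.
class RingBuffer<T> {
  private readonly items: T[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  // Append an item, evicting the oldest one once the buffer is full.
  push(item: T): void {
    this.items.push(item);
    if (this.items.length > this.capacity) this.items.shift();
  }

  // Oldest-to-newest snapshot; returns a copy so callers cannot mutate state.
  toArray(): T[] {
    return [...this.items];
  }
}

interface MetricEvent {
  level: "error" | "warn" | "info";
  message: string;
}

class MiniMetrics {
  private readonly events = new RingBuffer<MetricEvent>(3); // tiny capacity for the demo
  private readonly active = new Map<string, number>(); // request id -> start time (ms)

  recordEvent(event: MetricEvent): void {
    this.events.push(event);
  }

  recentEvents(): MetricEvent[] {
    return this.events.toArray();
  }

  // Bracket a unit of work with startRequest/endRequest to track in-flight requests.
  startRequest(id: string, now: number = Date.now()): void {
    this.active.set(id, now);
  }

  endRequest(id: string): void {
    this.active.delete(id);
  }

  activeCount(): number {
    return this.active.size;
  }
}
```

A bounded buffer keeps memory use constant no matter how long the daemon runs, at the cost of dropping the oldest entries.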
Replace console.debug/log/warn calls in model router, retry, and daemon
startup with a structured logger that respects a configurable log_level.
Default level is 'info', suppressing verbose fallback debug messages in
the TUI while keeping them available via config when needed.
- Add src/logger.ts with debug/info/warn/error/silent levels
- Wire log_level into config schema (default: 'info')
- Initialize log level in both daemon and TUI startup paths
- Convert all console.debug in router.ts and retry.ts to logger.debug
- Convert console.log/warn in daemon/models.ts to logger.info/warn
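
A level-filtered logger of this kind can be sketched in a few lines. This is not the contents of src/logger.ts; the names `setLogLevel`, `shouldLog`, and `LEVEL_RANK` are assumptions for illustration.

```typescript
// Hypothetical sketch of a level-gated logger.
type LogLevel = "debug" | "info" | "warn" | "error" | "silent";
type EmittingLevel = Exclude<LogLevel, "silent">;

// Higher rank means more severe; "silent" outranks everything so nothing is emitted.
const LEVEL_RANK: Record<LogLevel, number> = {
  debug: 0,
  info: 1,
  warn: 2,
  error: 3,
  silent: 4,
};

let currentLevel: LogLevel = "info"; // default matches the config default

function setLogLevel(level: LogLevel): void {
  currentLevel = level;
}

// A message is emitted only if its level is at or above the configured level.
function shouldLog(level: EmittingLevel): boolean {
  return LEVEL_RANK[level] >= LEVEL_RANK[currentLevel];
}

const logger = {
  debug: (...args: unknown[]): void => { if (shouldLog("debug")) console.debug(...args); },
  info: (...args: unknown[]): void => { if (shouldLog("info")) console.info(...args); },
  warn: (...args: unknown[]): void => { if (shouldLog("warn")) console.warn(...args); },
  error: (...args: unknown[]): void => { if (shouldLog("error")) console.error(...args); },
};
```

With the default level, `logger.debug(...)` is a no-op, and calling `setLogLevel("debug")` makes the same messages visible again.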
- Import resolveOverlayPath from shared.ts
- Add checkOverlayExists check (skip when no FLYNN_ENV, pass/fail for overlay file)
- Insert after checkConfigExists in allChecks array
- All 1087 tests pass, typecheck clean
- Add resolveOverlayPath() that maps FLYNN_ENV to {configDir}/{env}.yaml
- Update loadConfigSafe to pass overlay path through to loadConfig
- All CLI commands using loadConfigSafe() automatically get overlay support
- No FLYNN_ENV = exact same behavior as before (backward compatible)
- Full test suite passes (1087 tests, zero regressions)
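
The FLYNN_ENV mapping can be sketched as below. The signature shown here (taking the config directory and an environment map as parameters) is an assumption that keeps the example self-contained; the real resolveOverlayPath() may read its inputs differently.

```typescript
// Hypothetical signature: the real function may take different parameters.
function resolveOverlayPath(
  configDir: string,
  env: Record<string, string | undefined>,
): string | undefined {
  const name = env["FLYNN_ENV"]?.trim();
  // No FLYNN_ENV means no overlay, which preserves the previous behavior.
  if (!name) return undefined;
  // Drop trailing slashes so the result has exactly one separator.
  const dir = configDir.replace(/\/+$/, "");
  return `${dir}/${name}.yaml`;
}
```

Under these assumptions, FLYNN_ENV=staging with a config directory of `/etc/flynn` resolves to `/etc/flynn/staging.yaml`.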
- Add deepMerge utility for recursive object merging (arrays replace, not concat)
- Extend loadConfig with optional overlayPath parameter
- Merge happens before env var expansion and Zod validation
- Add 6 deepMerge unit tests and 4 overlay integration tests
- Re-export deepMerge from config/index.ts
- All 1087 existing tests still pass
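
The merge semantics (objects merge recursively, arrays and scalars from the overlay replace the base value) can be sketched like this. It is a simplified stand-in written for illustration, not the deepMerge shipped in the config module.

```typescript
// Simplified sketch of a recursive merge where arrays replace instead of concatenating.
type PlainObject = Record<string, unknown>;

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function deepMerge(base: PlainObject, overlay: PlainObject): PlainObject {
  const result: PlainObject = { ...base };
  for (const [key, value] of Object.entries(overlay)) {
    const existing = result[key];
    // Recurse only when both sides are plain objects; otherwise the overlay wins.
    result[key] =
      isPlainObject(existing) && isPlainObject(value) ? deepMerge(existing, value) : value;
  }
  return result;
}
```

Replacing arrays wholesale keeps overlays predictable: an environment file that lists two channels ends up with exactly those two, rather than the base list plus two more.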
- Create initAgents() function encapsulating AgentConfigRegistry, AgentRouter, SandboxManager init
- Replace ~26 lines in startDaemon() with single initAgents() call
- Lifecycle shutdown handler for sandbox cleanup included in agents.ts
- Zero type errors, routing tests pass
- Move createMessageRouter function (~220 lines) to dedicated routing module
- Add import from ./routing.js in daemon/index.ts
- routing.test.ts passes without modification
- Zero type errors
Adds zhipuai as a new provider using the OpenAI-compatible API at
api.z.ai. Supports api_key config or ZHIPUAI_API_KEY env var, with
optional endpoint override.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Check model capabilities via /api/show before sending tools.
Models without 'tools' capability get requests without tools
(they can still answer, just without tool use). Result is cached
per client instance. Defense-in-depth: 'does not support' added
to retry nonRetryablePatterns to avoid wasting retries on
permanent errors.
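
A sketch of the cached capability check follows. The lookup is injected as a function so the example runs without a server; in practice it would wrap a POST to /api/show. The names `ToolSupportCache` and `ShowModel`, and the assumption that the response carries a `capabilities` string array, are illustrative.

```typescript
// Hypothetical lookup function; assumed to wrap a call to /api/show.
type ShowModel = (model: string) => Promise<{ capabilities?: string[] }>;

class ToolSupportCache {
  private readonly cached = new Map<string, Promise<boolean>>();
  private readonly showModel: ShowModel;

  constructor(showModel: ShowModel) {
    this.showModel = showModel;
  }

  // Resolves to true when the model advertises the "tools" capability.
  // The promise itself is cached, so concurrent callers share one lookup.
  supportsTools(model: string): Promise<boolean> {
    let pending = this.cached.get(model);
    if (!pending) {
      pending = this.showModel(model).then(
        (info) => (info.capabilities ?? []).includes("tools"),
        (error: unknown) => {
          // Do not cache failures; a later call may succeed.
          this.cached.delete(model);
          throw error;
        },
      );
      this.cached.set(model, pending);
    }
    return pending;
  }
}
```

A caller would then omit the tools field from the request when `supportsTools` resolves to false, rather than letting the server reject the call.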
- Ollama: pass tools to API, parse tool_calls responses, handle thinking field from reasoning models (deepseek-r1, glm-4.7-flash)
- llama.cpp: pass tools via OpenAI-compatible endpoint, parse tool_calls, accumulate streaming tool call deltas
- Both clients now set stopReason to 'tool_use' when tool calls are present
- Tests: 12 new tests (8 Ollama + 5 llama.cpp, total 983→995)
- assembleSystemPrompt() now injects '# Runtime Context' with current date/time
- New system.info tool: date, time, hostname, platform, arch, uptime, memory, Node.js version
- Tool available in all profiles (minimal/messaging/coding/full)
- 983 tests passing (+7 new)
New ChannelAdapter that monitors Gmail via Google Cloud Pub/Sub push
notifications with polling fallback. Supports OAuth2 auth, configurable
watch labels, template rendering with email metadata placeholders
(from, to, subject, snippet, date, id, labels).
Wired into daemon lifecycle and gateway (POST /gmail/push endpoint).
Includes 16 tests covering auth, templates, push notifications, and
channel routing.
Implements an apply_patch equivalent: a single tool call can make multiple
line-based edits (replacements, insertions, deletions) across one or more
files. Hunks are applied bottom-up to preserve line numbers.
Includes 10 tests covering replacement, multi-hunk, insertion, deletion,
multi-file, overlapping hunks error, OOB error, and edge cases.
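
The bottom-up ordering can be sketched for a single file as follows. The `Hunk` shape (1-indexed inclusive ranges, with `endLine = startLine - 1` meaning a pure insertion) and the function name `applyHunks` are assumptions for this example, not the tool's actual schema.

```typescript
// Hypothetical hunk shape: 1-indexed, inclusive line range.
// endLine === startLine - 1 denotes an insertion before startLine.
interface Hunk {
  startLine: number;
  endLine: number;
  replacement: string[]; // empty array deletes the range
}

function applyHunks(lines: string[], hunks: Hunk[]): string[] {
  // Sort top-to-bottom first so overlaps are easy to detect.
  const sorted = [...hunks].sort(
    (a, b) => a.startLine - b.startLine || a.endLine - b.endLine,
  );

  let previousEnd = 0;
  for (const hunk of sorted) {
    if (hunk.startLine < 1 || hunk.endLine < hunk.startLine - 1 || hunk.endLine > lines.length) {
      throw new RangeError(`hunk ${hunk.startLine}-${hunk.endLine} is out of bounds`);
    }
    if (hunk.startLine <= previousEnd) {
      throw new Error(`hunk ${hunk.startLine}-${hunk.endLine} overlaps an earlier hunk`);
    }
    previousEnd = Math.max(previousEnd, hunk.endLine);
  }

  // Apply from the bottom of the file upward: edits below a hunk never shift
  // the line numbers of hunks above it, so the original numbering stays valid.
  const result = [...lines];
  for (const hunk of [...sorted].reverse()) {
    result.splice(hunk.startLine - 1, hunk.endLine - hunk.startLine + 1, ...hunk.replacement);
  }
  return result;
}
```

Applying top-down would instead require tracking a running offset after every hunk that changes the line count.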
Two issues prevented the GitHub Models fallback from working:
1. The X-GitHub-Api-Version: 2022-11-28 header caused '400 invalid
apiVersion' errors. The Copilot chat completions endpoint does not
use this header — removed from both constructor and rebuildClient.
2. The anthropicToGitHubModel mapping was incomplete: it only knew
three models and the generic date-stripping fallback produced wrong
names (e.g. 'claude-sonnet-4-5' instead of 'claude-sonnet-4.5').
GitHub Copilot uses dots for sub-versions, not hyphens.
Updated with explicit mappings for all current models (sonnet 4,
4.5; opus 4, 4.5, 4.6; haiku 4.5) and a smarter generic fallback
that converts digit-hyphen-digit to digit.digit at the end.
Additionally, createClientFromConfig now auto-maps Anthropic-style model
names when the provider is 'github', so users can copy model names from
their Anthropic config into fallback blocks without manual renaming.
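
The generic fallback described above can be sketched as two regex replacements. The function name `toGitHubModelName` and the assumption that the date suffix is exactly eight digits are illustrative; the explicit per-model mappings in anthropicToGitHubModel are not reproduced here.

```typescript
// Hypothetical sketch of the generic fallback only (no explicit mapping table).
function toGitHubModelName(anthropicName: string): string {
  // Strip a trailing date stamp such as "-20250929" (assumed to be 8 digits).
  const withoutDate = anthropicName.replace(/-\d{8}$/, "");
  // Convert a trailing digit-hyphen-digit into digit.digit: "4-5" becomes "4.5".
  return withoutDate.replace(/(\d)-(\d)$/, "$1.$2");
}
```

Because the second pattern is anchored at the end of the string, names with no sub-version, such as `claude-sonnet-4`, pass through unchanged.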
The TUI was building its own ModelRouter with a duplicated client factory
that lacked auto same-model fallback, local_providers resolution, retry
config, and per-tier fallback logic. When Anthropic failed, it skipped
GitHub Models and fell straight to the local Ollama model.
Replace the duplicated ~50-line createClient + router setup in tui.ts
with a single call to the daemon's createModelRouter(), which already
handles all of these correctly. This ensures the TUI and daemon have
identical fallback behavior.