Commit Graph

566 Commits

Author SHA1 Message Date
William Valentin fae3565480 docs(skills): add skills infrastructure plan
- Three-phase plan for skills system improvements
- Phase 1: Command Dispatch (flynn skills CLI commands)
- Phase 2: Skills Watcher (auto-reload with chokidar)
- Phase 3: Installer Specs (auto-install brew/node/go/download)
- Model strategy: glm-4.7-flash for mechanical, glm-4.7 for complex
- Estimated 8-11 hours total
2026-02-11 14:48:21 -08:00
William Valentin 6090508bad style: auto-fix ESLint issues (curly braces and formatting)
- Add curly braces to all if/else/for/while statements
- Fix indentation and trailing spaces
- Auto-fixed 372 linting errors using eslint --fix
- Remaining issues are warnings only (non-null assertions, explicit any types)
2026-02-11 10:30:24 -08:00
William Valentin 0578a87d85 feat: add ESLint 9 configuration with TypeScript support
- Add eslint.config.js using new flat config format
- Configure @typescript-eslint/parser and plugin for TypeScript files
- Add separate config for vanilla JavaScript files (gateway/ui)
- Include Node.js and browser globals
- Enable strict rules: curly braces, no-eval, eqeqeq, etc.
- Configure TypeScript-specific rules (no-explicit-any, no-non-null-assertion)
- Add @typescript-eslint/parser and @typescript-eslint/eslint-plugin dependencies
2026-02-11 10:30:13 -08:00
William Valentin df4120f4a7 feat: add Makefile with pnpm integration and systemd daemon management
- Use pnpm for all build, dev, test, and quality check commands
- Replace manual PID file handling with systemd service control
- Add daemon-start, daemon-stop, daemon-restart, daemon-status, daemon-logs targets
- Add enable/disable targets for boot startup management
- Provide convenience aliases (stop, restart, status, logs) for common operations
- Integrate with existing flynn.service systemd user service
2026-02-11 10:22:43 -08:00
William Valentin 1a3ae3020f fix: copy webchat UI assets to dist/ during build
tsc only compiles .ts files — the webchat static files (HTML, CSS, JS)
in src/gateway/ui/ were never copied to dist/gateway/ui/, causing 404s
when running the production build via 'pnpm start'.
2026-02-11 09:58:15 -08:00
William Valentin 85d7a6bfec test: add stopReason edge case tests; update state.json with recent fixes
- Added tests for finish_reason 'tool_calls' with empty array → 'end_turn'
- Added test for finish_reason 'length' → 'max_tokens'
- Updated state.json with 4 new entries for today's fixes (SOUL.md, message
  normalization, agent loop resilience, stopReason normalization)
- Test count: 1329 → 1331
2026-02-11 09:51:19 -08:00
William Valentin 01c3175fdb fix: normalize OpenAI/GitHub finish_reason to Flynn stopReason conventions
OpenAI-compatible providers return 'stop' and 'tool_calls' as finish_reason
values, but Flynn's agent loop expects Anthropic-style 'end_turn' and
'tool_use'. This caused the agent to exit the tool loop prematurely when
falling back to GitHub Copilot (due to Anthropic API quota exhaustion).

- openai.ts: Map 'stop' → 'end_turn', 'length' → 'max_tokens', tool_calls
  with actual tools → 'tool_use', tool_calls without tools → 'end_turn'
- github.ts: Handle edge case where finish_reason is 'tool_calls' but no
  tools were parsed
- agent.ts: Accept both 'tool_use' and 'tool_calls' as valid stop reasons
  (belt-and-suspenders), extract toolCalls to local variable for TS narrowing
- openai.test.ts: Update expectations to match new normalized values
2026-02-11 09:49:36 -08:00
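The mapping described in 01c3175fdb can be sketched roughly as follows. This is an illustration only: the function name `normalizeFinishReason`, the `StopReason` type, and the fallback to `end_turn` for unrecognized values are assumptions, not code taken from openai.ts.

```typescript
// Hypothetical type; the real definition in Flynn is not shown in this log.
type StopReason = "end_turn" | "tool_use" | "max_tokens";

// Maps an OpenAI-style finish_reason to an Anthropic-style stop reason.
// parsedToolCallCount is the number of tool calls actually parsed from the response.
function normalizeFinishReason(
  finishReason: string | null | undefined,
  parsedToolCallCount: number,
): StopReason {
  switch (finishReason) {
    case "tool_calls":
      // tool_calls with no parsed tools is treated as a normal end of turn.
      return parsedToolCallCount > 0 ? "tool_use" : "end_turn";
    case "length":
      return "max_tokens";
    case "stop":
      return "end_turn";
    default:
      // Assumption: anything unrecognized ends the turn.
      return "end_turn";
  }
}
```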
William Valentin 1aab006a7f feat: improve agent loop resilience — same-tool nudging and error handling
- agent.ts: track consecutive calls to the same tool (ignoring args) and
  inject a nudge after 4 repeats telling the model to summarize and respond,
  preventing local models from endlessly retrying searches with slight
  query variations
- agent.ts: wrap the entire tool loop iteration in try-catch so model/network
  errors don't crash the daemon — returns a descriptive error message instead
- Tests for both: nudge triggers after 4 same-tool calls, error recovery
  persists to history
2026-02-11 09:33:30 -08:00
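A minimal sketch of the same-tool counting described in 1aab006a7f, assuming a small tracker object. The class name `SameToolTracker` and its method are hypothetical, and whether the real code nudges once or on every call past the threshold is not stated in the commit; this version reports true on every call at or past it.

```typescript
// Hypothetical helper: counts consecutive calls to the same tool name, ignoring arguments.
class SameToolTracker {
  private lastTool: string | null = null;
  private streak = 0;
  private readonly threshold: number;

  constructor(threshold = 4) {
    this.threshold = threshold;
  }

  // Records one tool call; returns true when a nudge should be injected.
  record(toolName: string): boolean {
    if (toolName === this.lastTool) {
      this.streak += 1;
    } else {
      this.lastTool = toolName;
      this.streak = 1;
    }
    return this.streak >= this.threshold;
  }
}
```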
William Valentin c01de7d097 feat: native tool calling message normalization for Ollama and llama.cpp
- ollama.ts: add normalizeMessagesForOllama() converting Anthropic-style
  tool_use/tool_result blocks to Ollama's native tool_calls + role:tool format
- llamacpp.ts: add normalizeMessagesForLlamaCpp() with hybrid approach —
  assistant tool_calls in native format, but tool results as structured user
  messages (many GGUF templates silently drop role:tool messages)
- llamacpp.ts: add configurable requestTimeout with AbortController (default 3min)
- Both use fast-path when no tool blocks are present (zero overhead)
- Full test coverage for both normalizers: plain text passthrough, tool_use
  conversion, tool_result mapping, multi-tool round trips, error results
2026-02-11 09:33:21 -08:00
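The Ollama conversion in c01de7d097 might look roughly like the sketch below. The block and message shapes are simplified assumptions modeled on the Anthropic content-block format and Ollama's chat message format; the function name `normalizeForOllamaSketch` is hypothetical and is not the real `normalizeMessagesForOllama()`.

```typescript
// Simplified, assumed shapes for illustration only.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface SourceMessage {
  role: "user" | "assistant";
  content: string | Block[];
}

interface OllamaToolCall {
  function: { name: string; arguments: Record<string, unknown> };
}

interface OllamaMessage {
  role: "user" | "assistant" | "tool";
  content: string;
  tool_calls?: OllamaToolCall[];
}

function normalizeForOllamaSketch(messages: SourceMessage[]): OllamaMessage[] {
  const out: OllamaMessage[] = [];
  for (const msg of messages) {
    const content = msg.content;
    if (typeof content === "string") {
      // Plain text passes through unchanged.
      out.push({ role: msg.role, content });
      continue;
    }
    let text = "";
    const toolCalls: OllamaToolCall[] = [];
    for (const block of content) {
      if (block.type === "text") {
        text += block.text;
      } else if (block.type === "tool_use") {
        toolCalls.push({ function: { name: block.name, arguments: block.input } });
      } else {
        // tool_result blocks become role:tool messages, emitted before any remaining text.
        out.push({ role: "tool", content: block.content });
      }
    }
    if (text !== "" || toolCalls.length > 0) {
      const converted: OllamaMessage = { role: msg.role, content: text };
      if (toolCalls.length > 0) {
        converted.tool_calls = toolCalls;
      }
      out.push(converted);
    }
  }
  return out;
}
```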
William Valentin 5270234bbb feat: improve tool usage guidance in SOUL.md and add cron.create/cron.delete tools
- SOUL.md: list all available tools (web.search, memory.*, cron.*, etc.)
  and add Tool Usage Rules section enforcing 'act, don't narrate'
- cron.ts: add getJob(), addJob(), removeJob() to CronScheduler for
  runtime (ephemeral) cron job management
- cron tools: add cron.create and cron.delete tools, enhance cron.list
  to show schedule/output/message details
- policy.ts: add cron tools to messaging and coding profiles, add
  group:cron to tool groups

Fixes issue where models would narrate tool intent ('let me search...')
then stop without actually calling tools.
2026-02-11 09:32:36 -08:00
William Valentin eea7ca62a8 chore: increase GmailWatcher default poll interval from 60s to 300s
2026-02-11 08:43:48 -08:00
William Valentin 60b214e7c4 feat: add per-cron-job model tier selection
Allow cron jobs to specify a `model_tier` field that controls which LLM
tier handles the job, without needing separate agent configs. Precedence:
cron job model_tier > agent config > global primary_tier > 'default'.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 22:31:18 -08:00
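The precedence chain in 60b214e7c4 can be expressed as a short fallback. The function name and parameter names below are hypothetical; only the ordering and the 'default' fallback come from the commit message.

```typescript
// Resolves the tier for a cron job run: cron job model_tier > agent config >
// global primary_tier > 'default'. Undefined means "not set at this level".
function resolveModelTier(
  cronJobTier?: string,
  agentTier?: string,
  globalPrimaryTier?: string,
): string {
  return cronJobTier ?? agentTier ?? globalPrimaryTier ?? "default";
}
```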
William Valentin 6761dca1c2 fix: normalize message roles for local model backends (llama.cpp, Ollama)
Local backends using strict chat templates (e.g. Mistral 3) rejected
Flynn's Anthropic-style tool_use/tool_result content blocks, causing
'roles must alternate' errors. Added getMessageTextWithTools() and
normalizeMessagesForLocal() to serialize structured blocks to plain
text, drop empty messages, and merge consecutive same-role messages.
Also fixed compaction to ensure kept messages start with user role.
2026-02-10 22:04:17 -08:00
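A sketch of the "drop empty messages, merge consecutive same-role messages" step from 6761dca1c2, assuming the structured blocks have already been serialized to plain text. The `TextMessage` type, the function name, and the blank-line separator used when merging are assumptions, not the actual `normalizeMessagesForLocal()`.

```typescript
// Assumed shape: messages already flattened to plain text.
interface TextMessage {
  role: "user" | "assistant";
  content: string;
}

// Drops whitespace-only messages and merges neighbours that share a role,
// so strict chat templates see strictly alternating roles.
function dropEmptyAndMerge(messages: TextMessage[]): TextMessage[] {
  const out: TextMessage[] = [];
  for (const msg of messages) {
    const text = msg.content.trim();
    if (text === "") {
      continue;
    }
    const prev = out.length > 0 ? out[out.length - 1] : undefined;
    if (prev !== undefined && prev.role === msg.role) {
      prev.content += "\n\n" + text;
    } else {
      out.push({ role: msg.role, content: text });
    }
  }
  return out;
}
```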
William Valentin 2f6d045e2a feat: load .env file at startup using Node built-in loadEnvFile
Adds process.loadEnvFile() to CLI entry point so API keys (ZHIPUAI_API_KEY,
OPENROUTER_API_KEY, XAI_API_KEY, etc.) can be stored in a project .env file
instead of shell environment or systemd service config. Uses Node >= 20.12
built-in — no dotenv dependency needed. Silent no-op if .env doesn't exist.

Updates .env.example with placeholders for all provider API keys.
2026-02-10 21:43:09 -08:00
William Valentin 5c90640e2a fix: clear error messages for missing API keys on provider switch
Previously, switching to zhipuai/openrouter/xai via /model would throw a
confusing 'OPENAI_API_KEY missing' error from the OpenAI SDK. Now
createClientFromConfig validates API keys before constructing the client,
throwing errors that name the correct env var (e.g. ZHIPUAI_API_KEY).

Also fixes the misleading 'as anthropic' type cast in the /model handler
to validate against MODEL_PROVIDERS and use the ModelProvider type.
2026-02-10 21:32:44 -08:00
William Valentin aaaf4a361a fix(webchat): move action buttons outside message bubble
Wrap each message in a .message-wrapper div and render copy/edit buttons
below the bubble instead of overlapping inside it. Improves readability
and prevents buttons from covering message content.
2026-02-10 21:26:22 -08:00
William Valentin 25482b8516 feat: sync PROVIDER_NAMES with config schema and update README docs
Extract MODEL_PROVIDERS const from config schema as single source of truth
for provider names. PROVIDER_NAMES in TUI commands now imports from schema
instead of maintaining a hardcoded list. Adds tests verifying sync.

Updates README TUI Commands section with /model hot-swap documentation,
supported providers, and runtime model switching examples.
2026-02-10 21:26:18 -08:00
William Valentin 27ee3b2c10 feat(webchat): add copy and edit buttons on chat messages
Copy button on all messages (clipboard API with checkmark feedback).
Edit button on user messages populates the input textarea.
Buttons appear on hover (desktop) or always visible (mobile).
2026-02-10 20:53:49 -08:00
William Valentin 4c8ba3f20c feat(webchat): add slash commands, autocomplete popup, and web search button
Add 6 slash commands (/help, /reset, /compact, /usage, /status, /model)
with autocomplete popup (arrow keys, Enter/Tab/Escape navigation).
Search button toggles web search mode by prepending instruction to message.
Backend agent.send extended with metadata for server-side command routing.
2026-02-10 20:45:14 -08:00
William Valentin 7a69794418 fix: sync model tier between TUI and WebChat when switching models
ModelRouter now supports multiple tier-change listeners via addOnTierChange(),
SessionBridge subscribes to tier changes and propagates them to all WebChat
agents (both existing and newly created), and the fullscreen TUI now also
updates the agent's tier when switching models (matching minimal TUI behavior).
2026-02-10 20:22:40 -08:00
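The multi-listener behaviour described in 7a69794418 follows a familiar observer shape. In the sketch below only the method name `addOnTierChange` comes from the commit; the class name `TierNotifier`, the `setTier` method, and the returned unsubscribe function are assumptions.

```typescript
type TierListener = (tier: string) => void;

// Hypothetical stand-in for the tier-tracking part of ModelRouter.
class TierNotifier {
  private readonly listeners = new Set<TierListener>();
  private tier: string;

  constructor(initialTier: string) {
    this.tier = initialTier;
  }

  // Registers a listener and returns a function that removes it again.
  addOnTierChange(listener: TierListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Updates the tier and notifies every listener, but only on an actual change.
  setTier(tier: string): void {
    if (tier === this.tier) {
      return;
    }
    this.tier = tier;
    this.listeners.forEach((listener) => listener(tier));
  }
}
```

With this shape, a TUI handler and a WebChat bridge can each subscribe independently and both see the same switch.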
William Valentin bf9ca690f3 fix(agent): detect repeated tool call loops and make max_iterations configurable
Local LLMs often get stuck calling the same tool repeatedly because they
lack the sophistication to synthesize results. The agent loop had no
safeguard — it re-executed whatever the model requested up to 10 times.

Add fingerprint-based loop detection: if the same tool+args combination
repeats 3 consecutive times, break the loop and return the last results.
Also add agents.max_iterations to the config schema so the iteration
limit is user-configurable (default: 10).
2026-02-10 19:35:09 -08:00
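Fingerprint-based loop detection as described in bf9ca690f3 might be sketched like this. The names `fingerprint` and `LoopDetector` are hypothetical, and using `JSON.stringify` for the argument part is an assumption; note that it is sensitive to key order, so the real code may canonicalize arguments differently.

```typescript
// Builds a fingerprint from the tool name plus its serialized arguments.
function fingerprint(toolName: string, args: unknown): string {
  return `${toolName}:${JSON.stringify(args)}`;
}

// Hypothetical detector: reports a loop once the same tool+args combination
// has been seen `limit` times in a row (3 in the commit message).
class LoopDetector {
  private last: string | null = null;
  private repeats = 0;
  private readonly limit: number;

  constructor(limit = 3) {
    this.limit = limit;
  }

  record(toolName: string, args: unknown): boolean {
    const fp = fingerprint(toolName, args);
    if (fp === this.last) {
      this.repeats += 1;
    } else {
      this.last = fp;
      this.repeats = 1;
    }
    return this.repeats >= this.limit;
  }
}
```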
William Valentin 4ce8e81c01 fix(gmail): sanitize HTML entities and tags in tool output
Gmail API returns snippets with HTML entities and tags (&amp;, &#39;, <br>, etc.)
that leaked into LLM responses as raw HTML. Added shared sanitizeHtml()
utility in src/utils/html.ts and applied it to gmail tool snippets,
HTML body fallback, and gmail watcher snippets.
2026-02-10 16:30:14 -08:00
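A minimal sketch of what a `sanitizeHtml()` utility like the one in 4ce8e81c01 could do. The entity table, the choice to turn `<br>` into a newline, and the regular expressions are all assumptions; the real src/utils/html.ts is not shown in this log.

```typescript
// Small, assumed subset of named entities; a real utility may cover more.
const NAMED_ENTITIES: Record<string, string> = {
  amp: "&",
  lt: "<",
  gt: ">",
  quot: '"',
  apos: "'",
  nbsp: " ",
};

// Strips tags, then decodes entities in a single pass (so "&amp;lt;" becomes "&lt;").
function sanitizeHtml(input: string): string {
  const withoutTags = input
    .replace(/<br\s*\/?>/gi, "\n")
    .replace(/<[^>]+>/g, "");
  return withoutTags.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (match: string, body: string) => {
    if (body.charAt(0) === "#") {
      const isHex = body.charAt(1).toLowerCase() === "x";
      const codePoint = isHex ? parseInt(body.slice(2), 16) : parseInt(body.slice(1), 10);
      // Leave out-of-range references untouched rather than throwing.
      return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
    }
    const decoded = NAMED_ENTITIES[body.toLowerCase()];
    return decoded !== undefined ? decoded : match;
  });
}
```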
William Valentin 4317492e4b docs: update state.json with TUI fullscreen improvements and test count (1268)
2026-02-10 13:29:14 -08:00
William Valentin 2644ed269e fix(tui): reset agent state on /reset command in fullscreen mode
Use agent.reset() instead of session.clear() when agent is available,
ensuring tool fingerprint, usage counters, and agent history are all
cleared properly alongside the session.
2026-02-10 13:27:58 -08:00
William Valentin 671ec035e9 fix(tui): show tool activity in fullscreen mode via Ink-compatible callback
Replace process.stdout.write-based onToolUse callback (which corrupts
Ink rendering) with a React state-driven approach that shows tool names,
args, and completion status in the streaming content area.
2026-02-10 13:26:47 -08:00
William Valentin e46e8740a1 fix(tui): enable tool access in fullscreen mode via NativeAgent
Fullscreen TUI was calling modelClient directly, bypassing the NativeAgent
tool loop entirely. Pass the agent through FullscreenTuiConfig → App and
use agent.process() for message handling, which enables the full tool
registry and executor.
2026-02-10 13:21:22 -08:00
William Valentin f892bbe6ca feat(tui): add ASCII art banner on startup
2026-02-10 13:11:32 -08:00
William Valentin f204ff1dd7 feat(tools): add Google Docs, Drive, and Tasks read-only tools
Add three new Google service integrations following the established
Gmail/GCal pattern:

- Google Docs (docs.list, docs.search, docs.read): list, search, and
  read document content as plain text via Docs + Drive APIs
- Google Drive (drive.list, drive.search, drive.read): list, search,
  and read files with export support for Workspace files (Docs→text,
  Sheets→CSV, Slides→text)
- Google Tasks (tasks.lists, tasks.list): list task lists and tasks
  with status, due dates, and notes

Each service has its own config section, OAuth auth command, tool
policy group, and test suite (53 new tests). The setup wizard now
offers to configure all Google services together and run OAuth auth
flows automatically after saving config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 12:59:15 -08:00
William Valentin 411c6d84a2 feat(tui): persist model tier selection and fix formatting
Persist /model tier choice to ~/.local/share/flynn/preferences.json so
it survives restarts. Decode HTML entities (e.g. &#39;) in markdown
renderer output. Suppress noisy logger.info and punycode deprecation
warnings in TUI startup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 12:23:12 -08:00
William Valentin 50471d63af feat(tools): add gmail.read tool for full email content
The existing gmail.list and gmail.search tools only return snippets.
gmail.read fetches the full message by ID using format: 'full', decodes
base64url body parts (preferring text/plain, falling back to stripped
HTML), and returns headers + body text. This enables workflows like
searching for invoices and extracting amounts from the full content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 12:01:49 -08:00
William Valentin d39d3ac367 docs: add Google Calendar section and new-tool checklist
Add GCal tools setup guide to README (prerequisites, config, fields).
Add gmail-auth, gcal-auth, setup to the CLI commands table. Add
"Adding a New Tool" checklist to CLAUDE.md covering the full wiring
chain including the TUI registration gotcha.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 11:51:56 -08:00
William Valentin f6dedf0fbe fix(tui): register Google Calendar tools when gcal is enabled
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 11:45:05 -08:00
William Valentin 55d35c80b4 feat(tui): improve tool use display and register Gmail tools
Format tool names as human-readable labels (e.g. "Gmail: List") and
show args as compact key-value pairs instead of raw JSON. Also register
Gmail tools in the TUI when automation.gmail is enabled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 11:41:35 -08:00
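The display formatting from 55d35c80b4 could be approximated as below. Only the "Gmail: List" example comes from the commit; the function names, the `key=value` separator, and the handling of names without a dot are assumptions.

```typescript
function capitalize(word: string): string {
  return word.charAt(0).toUpperCase() + word.slice(1);
}

// Turns a dotted tool name such as "gmail.list" into a label such as "Gmail: List".
function formatToolLabel(toolName: string): string {
  const dot = toolName.indexOf(".");
  if (dot === -1) {
    return capitalize(toolName);
  }
  const group = toolName.slice(0, dot);
  const action = toolName.slice(dot + 1).split(".").map(capitalize).join(" ");
  return `${capitalize(group)}: ${action}`;
}

// Renders arguments as compact key=value pairs instead of raw JSON.
function formatToolArgs(args: Record<string, unknown>): string {
  return Object.keys(args)
    .map((key) => {
      const value = args[key];
      return `${key}=${typeof value === "string" ? value : JSON.stringify(value)}`;
    })
    .join(" ");
}
```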
William Valentin 796e143d61 fix(agent): inject tool inventory note when tools change mid-session
Stale session history can cause the model to follow old "I can't do
that" patterns even when new tools are available. NativeAgent now tracks
a tool fingerprint and appends a system prompt note listing current
tools when the inventory changes, resetting on session reset.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 11:41:31 -08:00
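The tool-fingerprint tracking in 796e143d61 could work along these lines. The class name, the wording of the note, and the choice to stay silent on the very first call are assumptions; the commit only states that a note is appended when the inventory changes and that tracking resets with the session.

```typescript
// Order-independent fingerprint of the current tool inventory.
function toolFingerprint(toolNames: string[]): string {
  return toolNames.slice().sort().join(",");
}

// Hypothetical tracker: returns a system-prompt note when the tool set changes.
class ToolInventoryTracker {
  private lastFingerprint: string | null = null;

  noteFor(toolNames: string[]): string | null {
    const current = toolFingerprint(toolNames);
    const previous = this.lastFingerprint;
    this.lastFingerprint = current;
    // Assumption: no note on the first call or when nothing changed.
    if (previous === null || previous === current) {
      return null;
    }
    const listed = toolNames.slice().sort().join(", ");
    return `Note: your available tools have changed. Current tools: ${listed}`;
  }

  // Called on session reset so the next call starts from a clean slate.
  reset(): void {
    this.lastFingerprint = null;
  }
}
```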
William Valentin 94264e848c feat(tools): add Google Calendar tools and register Gmail/GCal in daemon
Add calendar.today, calendar.list, calendar.search tools mirroring the
Gmail tool pattern. Includes gcal-auth CLI command, config schema, tool
policy entries (messaging/coding profiles + group:gcal), and 17 tests.
Also wires up gmail and gcal tool registration in the daemon and TUI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 11:40:53 -08:00
William Valentin 4cc29f534a fix(tui): render inline markdown formatting with ANSI codes
Block-level renderer methods (paragraph, heading, blockquote, list) were
using raw token.text instead of this.parser.parseInline(tokens), causing
bold, italic, and inline code to never render. Add table renderer with
aligned columns and box-drawing separators. Remove unused marked-terminal
dependency (incompatible with marked v17).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 11:29:57 -08:00
William Valentin ff03f74404 feat(cli): add gmail-auth command for OAuth2 token setup
Implements `flynn gmail-auth` to complete the OAuth2 flow that
GmailWatcher references but was never built. Supports local callback
server (default) and --manual paste mode. Adds Gmail health check
to `flynn doctor`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 10:33:01 -08:00
William Valentin f4b9c850ab feat(setup): add contextual help text to all wizard flows
Each setup section now explains what's needed before prompting:
- Providers: links to API key consoles (Anthropic, OpenAI, Gemini, etc.)
- Channels: step-by-step bot creation (Telegram @BotFather, Discord dev
  portal, Slack app setup, WhatsApp QR)
- Gmail: Google Cloud Console OAuth setup walkthrough
- Memory: explains what vector search does and key reuse
- Security: describes each option (sandbox, pairing, tool profiles)
- Gateway: explains auth token, Tailscale Serve, lock mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 10:08:44 -08:00
William Valentin f9446a4d67 docs: update gap analysis and state.json for setup wizard
Mark onboard wizard as MATCH (100/128, 78%). Update test count to 1151.
Add setup-wizard plan entry to state.json.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:44:52 -08:00
William Valentin 7620616c7c test(setup): add integration tests and update shell completion
Adds comprehensive integration tests for the first-run wizard verifying config
generation for different provider/channel combinations. Updates shell completion
to include the 'setup' command with its options.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:38:53 -08:00
William Valentin f50d7d69fb feat(setup): wire setup command into CLI and start command
- Register setup command in CLI index
- Offer setup wizard when running `flynn start` with no config
- Guard telegram log output since telegram is now optional

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:36:58 -08:00
William Valentin d8b7b08270 feat(setup): add main orchestrator, menu, and CLI command
Implements Task 6 of the setup wizard:
- orchestrator.ts: runMenu() for interactive configuration loop
- orchestrator.ts: runFirstRunWizard() for new user onboarding
- orchestrator.test.ts: test for menu exit behavior
- setup.ts: registerSetupCommand() and runSetup() handler
  - Handles both first-run and existing config scenarios
  - Saves YAML config to disk
  - Optional daemon startup after first-run

All tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:35:32 -08:00
William Valentin 182d86957b feat(setup): add memory, automation, security, and gateway setup flows
2026-02-10 09:34:04 -08:00
William Valentin b673632b0f feat(setup): add channel setup flows
Implement setupChannels function with support for Telegram, Discord, Slack, and WhatsApp.
Includes WebChat gateway configuration and channel choice loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:32:52 -08:00
William Valentin 573cb43534 feat(setup): add model provider setup flows
2026-02-10 09:31:43 -08:00
William Valentin d35ce2beb5 feat(setup): add config builder and summary renderer
Add ConfigBuilder class to accumulate wizard answers into config objects with YAML
serialization, and renderSummary function to display configuration summary. Includes
9 test cases covering provider setup, channel configuration, and feature flags.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:29:56 -08:00
William Valentin 9cc03187b0 feat(setup): add prompt helpers for setup wizard
Created a Prompter interface and factory function for interactive CLI prompts:
- ask(): text input with optional default values
- confirm(): yes/no confirmation with default
- choose(): numbered menu selection with fallback
- password(): text input (no echo planned in TUI)
- println(): simple output helper

All 9 tests pass (ask, confirm, choose, password scenarios).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:28:19 -08:00
William Valentin 213dba855a refactor: make telegram config optional for non-telegram setups
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:27:18 -08:00
William Valentin 48fab11066 docs: add setup wizard implementation plan
9 tasks with TDD approach: prompt helpers, config builder, provider/channel
flows, menu sections, orchestrator, CLI wiring, integration tests. ~29 new
tests, 13 new files, 0 new dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:19:21 -08:00
William Valentin 6b426a1e52 docs: add setup wizard design
Interactive setup wizard with two entry points: auto-trigger on
first run (no config detected) and explicit `flynn setup` command.
Minimal-first flow for quick start, menu-driven for reconfiguration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:14:45 -08:00