Commit Graph

299 Commits

Author SHA1 Message Date
William Valentin 81d04357a1 feat(skills): validate manifest installer specs 2026-02-12 17:52:53 -08:00
William Valentin bd29afeaff chore(skills): improve watcher event observability 2026-02-12 17:40:41 -08:00
William Valentin 333e33f30f feat(skills): target watcher updates with safe fallback 2026-02-12 17:36:32 -08:00
William Valentin 2fb5c9adab feat(skills): reload registry on watcher change events 2026-02-12 17:30:23 -08:00
William Valentin b773e2bbf3 feat(skills): enable watcher wiring through daemon lifecycle 2026-02-12 17:18:22 -08:00
William Valentin 95091cc198 feat(skills): add debounced watcher foundation for phase 2 2026-02-12 17:15:46 -08:00
William Valentin 0a19f01639 feat(doctor): surface skill directory health in diagnostics 2026-02-12 17:05:04 -08:00
William Valentin fc3d2ab4d8 feat(skills): add refresh summary for discovery health 2026-02-12 17:02:23 -08:00
William Valentin 2d753321b3 feat(skills): guard uninstall with explicit confirmation 2026-02-12 16:59:50 -08:00
William Valentin d5b7d72e5d feat(skills): add install dispatch for local skill setup 2026-02-12 16:50:25 -08:00
William Valentin 0d84a6bccc feat(skills): add info command for skill inspection 2026-02-12 16:44:46 -08:00
William Valentin b3e5aee333 feat(skills): expose list command for skill visibility 2026-02-12 16:42:00 -08:00
William Valentin 90ce622080 feat(policy): enforce truthfulness and autonomy guardrails
Add runtime truthfulness modes and autonomy-level tool gating with audit metadata for overrides/denials.

Wire policy through prompt assembly, tool execution context, and daemon/gateway agent paths; update tests and planning state for Phase 3 PR #2 completion.
2026-02-12 16:06:45 -08:00
William Valentin 125af4e832 refactor(backend): use systemd for daemon management
Replace manual process management with systemctl --user commands.
Uses ollama.service and llama-server.service units for proper lifecycle
management, VRAM cleanup, and integration with system services.
2026-02-12 00:24:43 -08:00
William Valentin 1c8da30905 fix(backend): only kill processes started by TUI
Track PIDs of backends started by /backend command and only kill those
specific PIDs. Previous implementation used pkill which would kill all
Ollama/llama-server processes including those started by the user or
systemd services. Now we only terminate processes we started.
2026-02-12 00:19:26 -08:00
William Valentin 636f4b3311 feat(deploy): add whisper.cpp Kubernetes deployment
Add Dockerfile and K8s manifests for whisper.cpp transcription service.
Deploys to ai-stack namespace with worker node affinity for GPU access.
2026-02-12 00:14:41 -08:00
William Valentin e0ce07ac43 feat(makefile): add llama-server systemd management targets
Add llama-start, llama-stop, llama-restart, llama-status, llama-logs,
llama-enable, llama-disable targets for managing llama-server as a systemd service.
Matches existing daemon management pattern for consistency.
2026-02-12 00:14:28 -08:00
William Valentin 05037a917e feat(backend): auto-stop/start daemon when switching backends
- Add local_providers with ollama and llamacpp configurations
- /backend command now stops current daemon before starting new one
- Start backends as detached processes to avoid blocking TUI
- Wait 500ms for daemon to initialize before switching
2026-02-12 00:13:59 -08:00
William Valentin 0b44adbaea fix(audio): add SSRF protection, MIME type fix, and tests for audio-transcribe tool
- Add URL validation blocking localhost, private IPs, and non-http protocols
- Use response Content-Type header instead of hardcoded audio/wav for URL downloads
- Add 25 tests covering validation, SSRF, config errors, transcription paths, and error handling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 21:57:45 -08:00
William Valentin a8a2c59313 feat: implement model persistence with per-session overrides
- Add session_config SQLite table for per-session settings
- Update routing to support session override → agent config → global default resolution chain
- Upgrade WebChat SessionBridge from NativeAgent to AgentOrchestrator
- Add /model, /local, /cloud commands to Telegram adapter
- Add /model command to WebChat gateway handlers
- Clear session overrides on /reset command
- Pass memoryStore and config through to SessionBridge
- Add comprehensive tests for all new functionality

Fixes model persistence bug where TUI model changes didn't affect WebChat/Telegram sessions. Now:
- TUI /model sets global default (persists across restarts, affects all new sessions)
- WebChat/Telegram /model sets session override (only that conversation, cleared on /reset)
- WebChat sessions gain AgentOrchestrator features (delegation, compaction, memory)
2026-02-11 21:51:38 -08:00
William Valentin b0092c8284 docs: add whisper-server to docker-compose.yml
- Add commented-out whisper-server service to docker-compose.yml
- Update README to show both manual docker run and docker-compose options
2026-02-11 19:49:56 -08:00
William Valentin 28c78d469d docs: update audio config docs and add voice message failure fix to changelog
- README.md: Update audio config format to match schema (enabled + provider.* fields instead of old transcription_endpoint fields), add whisper.cpp server Docker example
- CHANGELOG.md: Add '### Fixed' section with voice message failure handling details
- config/default.yaml: Update audio section with new schema format and Docker setup example
2026-02-11 19:47:52 -08:00
William Valentin 2e235213d9 fix(audio): handle voice message failures gracefully
- Send user feedback when voice/audio download fails instead of silent failure
- Send graceful message when audio transcription is not configured instead of empty text which crashes API
2026-02-11 19:44:42 -08:00
William Valentin 5c531a760d docs: document native audio support across README, CHANGELOG, config, and planning docs
- README: add audio.transcribe to tool list, update media pipeline description,
  add Native Audio Support and Audio Transcription config sections, add
  supports_audio per-tier override example
- SOUL.md: add audio.transcribe to available tools list
- CHANGELOG: add native audio support and audio.transcribe tool entries
- config/default.yaml: add commented audio config section, supports_audio hint
- INTEGRATIONS.md: expand audio section with native passthrough, capabilities,
  smart routing, AudioSource type, token estimation, audio.transcribe tool
- STRUCTURE.md: add capabilities.ts and audio-transcribe.ts to key file listings
- ARCHITECTURE.md: update data flow step 5 to describe smart audio routing
2026-02-11 18:41:53 -08:00
William Valentin 819ac26b3b Merge branch 'feature/native-audio-support' 2026-02-11 18:28:12 -08:00
William Valentin c62dad2e2e docs: update state.json with native audio support feature and test count (1369) 2026-02-11 18:27:50 -08:00
William Valentin 148219153e feat(audio): add tests, token estimation, and config override for native audio
- Add capabilities.test.ts (18 tests) for supportsAudioInput()
- Add 15 audio tests to media.test.ts (hasAudio, stripAudioParts, attachmentToAudioSource)
- Add estimateAudioTokens() to tokens.ts (base64→bytes→duration→tokens)
- Update estimateMessageTokens() to include audio content parts
- Add 5 audio token tests to tokens.test.ts
- Add supports_audio config override to model schema
- Wire supports_audio from tier config through routing to capability check

Total tests: 1369 (was 1331, +38 audio-related)
2026-02-11 18:27:19 -08:00
William Valentin 32ac4df20a feat(audio): add smart routing for native vs transcribed audio
- Create capabilities.ts with supportsAudioInput() detection
- Gemini, OpenAI, and GitHub Models get native audio passthrough
- Anthropic, Bedrock, Ollama, llama.cpp fall back to Whisper transcription
- routing.ts now checks model capability before deciding to transcribe
- Audio attachments are stripped for non-native models (only transcript text passed)
- Remove deprecated audioConfig from createMessageRouter deps (read from config.audio)
2026-02-11 18:20:10 -08:00
William Valentin 32e1a2724a feat(audio): add native audio support to type system and model clients
- Add AudioSource interface and 'audio' variant to MessageContentPart union
- Update buildUserMessage() to create audio content parts from attachments
- Add attachmentToAudioSource(), hasAudio(), stripAudioParts() helpers
- Gemini: native audio via inlineData (same format as images)
- OpenAI/GitHub: native audio via input_audio content parts
- Anthropic/Bedrock: graceful fallback to transcript text
- Update getMessageTextWithTools() to handle audio blocks for local models
2026-02-11 18:17:33 -08:00
William Valentin a875bcc4ae feat(audio): add audio.transcribe tool with Whisper-compatible API support
- Add createAudioTranscribeTool with OpenAI/Groq/Ollama/llama.cpp provider support
- Refactor audio config schema to nested audio.enabled + audio.provider structure
- Move audio tool registration to initTools() for conditional enablement
- Fix duplication bug in audio-transcribe.ts URL download handler
- Support base64 data and URL-based audio input with format detection
2026-02-11 18:13:19 -08:00
William Valentin 5491d5a82a Merge branch 'feature/audit-logging' 2026-02-11 16:06:52 -08:00
William Valentin 2dddae8f9b feat(audit): Add automation component logging
Add audit logging to:
- WebhookHandler: connect/disconnect, receive, not_found, denied, HMAC verified
- HeartbeatMonitor: start/stop, cycle, check, fail, recover
- GmailWatcher: connect/disconnect lifecycle events

All automation events now captured in audit log with proper context
2026-02-11 16:04:33 -08:00
William Valentin d62e836b5d feat(audit): Add core audit logging infrastructure
- Add AuditLogger class with rotation support
- Add audit configuration to config schema
- Instrument tool execution with full audit logging
- Instrument session lifecycle (create, message, delete, transfer, compact)
- Add audit logger initialization in daemon
- Add cron scheduler audit logging

Audit events captured:
- tool.start/success/error/denied
- session.create/message/delete/transfer/compact
- cron.trigger/add/remove

All logs go to ~/.local/share/flynn/audit.log (JSON lines)
with rotation (10MB files, 30-day retention)
2026-02-11 15:58:07 -08:00
William Valentin fae3565480 docs(skills): add skills infrastructure plan
- Three-phase plan for skills system improvements
- Phase 1: Command Dispatch (flynn skills CLI commands)
- Phase 2: Skills Watcher (auto-reload with chokidar)
- Phase 3: Installer Specs (auto-install brew/node/go/download)
- Model strategy: glm-4.7-flash for mechanical, glm-4.7 for complex
- Estimated 8-11 hours total
2026-02-11 14:48:21 -08:00
William Valentin 6090508bad style: auto-fix ESLint issues (curly braces and formatting)
- Add curly braces to all if/else/for/while statements
- Fix indentation and trailing spaces
- Auto-fixed 372 linting errors using eslint --fix
- Remaining issues are warnings only (non-null assertions, explicit any types)
2026-02-11 10:30:24 -08:00
William Valentin 0578a87d85 feat: add ESLint 9 configuration with TypeScript support
- Add eslint.config.js using new flat config format
- Configure @typescript-eslint/parser and plugin for TypeScript files
- Add separate config for vanilla JavaScript files (gateway/ui)
- Include Node.js and browser globals
- Enable strict rules: curly braces, no-eval, eqeqeq, etc.
- Configure TypeScript-specific rules (no-explicit-any, no-non-null-assertion)
- Add @typescript-eslint/parser and @typescript-eslint/eslint-plugin dependencies
2026-02-11 10:30:13 -08:00
William Valentin df4120f4a7 feat: add Makefile with pnpm integration and systemd daemon management
- Use pnpm for all build, dev, test, and quality check commands
- Replace manual PID file handling with systemd service control
- Add daemon-start, daemon-stop, daemon-restart, daemon-status, daemon-logs targets
- Add enable/disable targets for boot startup management
- Provide convenience aliases (stop, restart, status, logs) for common operations
- Integrate with existing flynn.service systemd user service
2026-02-11 10:22:43 -08:00
William Valentin 1a3ae3020f fix: copy webchat UI assets to dist/ during build
tsc only compiles .ts files — the webchat static files (HTML, CSS, JS)
in src/gateway/ui/ were never copied to dist/gateway/ui/, causing 404s
when running the production build via 'pnpm start'.
2026-02-11 09:58:15 -08:00
William Valentin 85d7a6bfec test: add stopReason edge case tests; update state.json with recent fixes
- Added tests for finish_reason 'tool_calls' with empty array → 'end_turn'
- Added test for finish_reason 'length' → 'max_tokens'
- Updated state.json with 4 new entries for today's fixes (SOUL.md, message
  normalization, agent loop resilience, stopReason normalization)
- Test count: 1329 → 1331
2026-02-11 09:51:19 -08:00
William Valentin 01c3175fdb fix: normalize OpenAI/GitHub finish_reason to Flynn stopReason conventions
OpenAI-compatible providers return 'stop' and 'tool_calls' as finish_reason
values, but Flynn's agent loop expects Anthropic-style 'end_turn' and
'tool_use'. This caused the agent to exit the tool loop prematurely when
falling back to GitHub Copilot (due to Anthropic API quota exhaustion).

- openai.ts: Map 'stop' → 'end_turn', 'length' → 'max_tokens', tool_calls
  with actual tools → 'tool_use', tool_calls without tools → 'end_turn'
- github.ts: Handle edge case where finish_reason is 'tool_calls' but no
  tools were parsed
- agent.ts: Accept both 'tool_use' and 'tool_calls' as valid stop reasons
  (belt-and-suspenders), extract toolCalls to local variable for TS narrowing
- openai.test.ts: Update expectations to match new normalized values
2026-02-11 09:49:36 -08:00
William Valentin 1aab006a7f feat: improve agent loop resilience — same-tool nudging and error handling
- agent.ts: track consecutive calls to the same tool (ignoring args) and
  inject a nudge after 4 repeats telling the model to summarize and respond,
  preventing local models from endlessly retrying searches with slight
  query variations
- agent.ts: wrap the entire tool loop iteration in try-catch so model/network
  errors don't crash the daemon — returns a descriptive error message instead
- Tests for both: nudge triggers after 4 same-tool calls, error recovery
  persists to history
2026-02-11 09:33:30 -08:00
William Valentin c01de7d097 feat: native tool calling message normalization for Ollama and llama.cpp
- ollama.ts: add normalizeMessagesForOllama() converting Anthropic-style
  tool_use/tool_result blocks to Ollama's native tool_calls + role:tool format
- llamacpp.ts: add normalizeMessagesForLlamaCpp() with hybrid approach —
  assistant tool_calls in native format, but tool results as structured user
  messages (many GGUF templates silently drop role:tool messages)
- llamacpp.ts: add configurable requestTimeout with AbortController (default 3min)
- Both use fast-path when no tool blocks are present (zero overhead)
- Full test coverage for both normalizers: plain text passthrough, tool_use
  conversion, tool_result mapping, multi-tool round trips, error results
2026-02-11 09:33:21 -08:00
William Valentin 5270234bbb feat: improve tool usage guidance in SOUL.md and add cron.create/cron.delete tools
- SOUL.md: list all available tools (web.search, memory.*, cron.*, etc.)
  and add Tool Usage Rules section enforcing 'act, don't narrate'
- cron.ts: add getJob(), addJob(), removeJob() to CronScheduler for
  runtime (ephemeral) cron job management
- cron tools: add cron.create and cron.delete tools, enhance cron.list
  to show schedule/output/message details
- policy.ts: add cron tools to messaging and coding profiles, add
  group:cron to tool groups

Fixes issue where models would narrate tool intent ('let me search...')
then stop without actually calling tools.
2026-02-11 09:32:36 -08:00
William Valentin eea7ca62a8 chore: increase GmailWatcher default poll interval from 60s to 300s 2026-02-11 08:43:48 -08:00
William Valentin 60b214e7c4 feat: add per-cron-job model tier selection
Allow cron jobs to specify a `model_tier` field that controls which LLM
tier handles the job, without needing separate agent configs. Precedence:
cron job model_tier > agent config > global primary_tier > 'default'.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 22:31:18 -08:00
William Valentin 6761dca1c2 fix: normalize message roles for local model backends (llama.cpp, Ollama)
Local backends using strict chat templates (e.g. Mistral 3) rejected
Flynn's Anthropic-style tool_use/tool_result content blocks, causing
'roles must alternate' errors. Added getMessageTextWithTools() and
normalizeMessagesForLocal() to serialize structured blocks to plain
text, drop empty messages, and merge consecutive same-role messages.
Also fixed compaction to ensure kept messages start with user role.
2026-02-10 22:04:17 -08:00
William Valentin 2f6d045e2a feat: load .env file at startup using Node built-in loadEnvFile
Adds process.loadEnvFile() to CLI entry point so API keys (ZHIPUAI_API_KEY,
OPENROUTER_API_KEY, XAI_API_KEY, etc.) can be stored in a project .env file
instead of shell environment or systemd service config. Uses Node >= 20.12
built-in — no dotenv dependency needed. Silent no-op if .env doesn't exist.

Updates .env.example with placeholders for all provider API keys.
2026-02-10 21:43:09 -08:00
William Valentin 5c90640e2a fix: clear error messages for missing API keys on provider switch
Previously, switching to zhipuai/openrouter/xai via /model would throw a
confusing 'OPENAI_API_KEY missing' error from the OpenAI SDK. Now
createClientFromConfig validates API keys before constructing the client,
throwing errors that name the correct env var (e.g. ZHIPUAI_API_KEY).

Also fixes the misleading 'as anthropic' type cast in the /model handler
to validate against MODEL_PROVIDERS and use the ModelProvider type.
2026-02-10 21:32:44 -08:00
William Valentin aaaf4a361a fix(webchat): move action buttons outside message bubble
Wrap each message in a .message-wrapper div and render copy/edit buttons
below the bubble instead of overlapping inside it. Improves readability
and prevents buttons from covering message content.
2026-02-10 21:26:22 -08:00
William Valentin 25482b8516 feat: sync PROVIDER_NAMES with config schema and update README docs
Extract MODEL_PROVIDERS const from config schema as single source of truth
for provider names. PROVIDER_NAMES in TUI commands now imports from schema
instead of maintaining a hardcoded list. Adds tests verifying sync.

Updates README TUI Commands section with /model hot-swap documentation,
supported providers, and runtime model switching examples.
2026-02-10 21:26:18 -08:00