flynn/.planning/codebase/INTEGRATIONS.md

# External Integrations

**Analysis Date:** 2026-02-09

## AI Model Providers

Flynn supports 10 model providers via a unified `ModelClient` interface (`src/models/types.ts`). Each provider implements `chat()` and optionally `chatStream()`. The `ModelRouter` (`src/models/router.ts`) manages tier-based routing (fast/default/complex/local) with fallback chains.
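
The tier routing and fallback behaviour described above can be sketched as follows. This is a minimal illustration, not the code in `src/models/router.ts`: the `ChatRequest`/`ChatResponse` shapes, the `TierRouter` class, and the `chainFor()` helper are all hypothetical names introduced here.

```typescript
// Hypothetical shapes: the real definitions live in src/models/types.ts
// and src/models/router.ts and may differ.
type Tier = "fast" | "default" | "complex" | "local";

interface ChatRequest {
  messages: { role: "user" | "assistant" | "system"; content: string }[];
}

interface ChatResponse {
  content: string;
}

interface ModelClient {
  chat(request: ChatRequest): Promise<ChatResponse>;
  // Streaming is optional, mirroring the description above.
  chatStream?(request: ChatRequest): AsyncIterable<string>;
}

class TierRouter {
  private readonly tiers: Partial<Record<Tier, ModelClient>>;
  private readonly fallbackChain: Tier[];

  constructor(tiers: Partial<Record<Tier, ModelClient>>, fallbackChain: Tier[] = []) {
    this.tiers = tiers;
    this.fallbackChain = fallbackChain;
  }

  // Requested tier first, then each fallback tier once; unconfigured tiers are skipped.
  chainFor(tier: Tier): Tier[] {
    const ordered = Array.from(new Set<Tier>([tier, ...this.fallbackChain]));
    return ordered.filter((candidate) => this.tiers[candidate] !== undefined);
  }

  async chat(tier: Tier, request: ChatRequest): Promise<ChatResponse> {
    let lastError: unknown = new Error(`no client configured for tier "${tier}"`);
    for (const candidate of this.chainFor(tier)) {
      try {
        return await this.tiers[candidate]!.chat(request);
      } catch (error) {
        lastError = error; // remember the failure and try the next tier
      }
    }
    throw lastError;
  }
}
```

With a failing `fast` client and a working `default` client, `router.chat("fast", request)` in this sketch resolves with the `default` client's response.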

**Anthropic:**
- SDK: `@anthropic-ai/sdk` (`src/models/anthropic.ts`)
- Auth: `ANTHROPIC_API_KEY` env var or `api_key` in config
- Features: Streaming, tool use, extended thinking mode, multimodal (images)
- Extended thinking: `{ type: 'enabled', budget_tokens: 4096 }` on request
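
As a rough sketch of how the `thinking` block attaches to a request, the helper below adds it to a plain parameter object. `withExtendedThinking` and the `MessageParams` type are hypothetical and defined here only for illustration; the model name is a placeholder. The one constraint encoded is that the Messages API expects `max_tokens` to exceed the thinking budget.

```typescript
// Local stand-in for the SDK's request type, trimmed to the fields used here.
interface MessageParams {
  model: string;
  max_tokens: number;
  messages: { role: "user" | "assistant"; content: string }[];
  thinking?: { type: "enabled"; budget_tokens: number };
}

// Hypothetical helper: returns a copy of the params with extended thinking enabled.
function withExtendedThinking(params: MessageParams, budgetTokens: number = 4096): MessageParams {
  if (params.max_tokens <= budgetTokens) {
    throw new RangeError("max_tokens must be larger than the thinking budget");
  }
  return { ...params, thinking: { type: "enabled", budget_tokens: budgetTokens } };
}

const request = withExtendedThinking({
  model: "placeholder-model-id", // placeholder, not a real model name
  max_tokens: 8192,
  messages: [{ role: "user", content: "Summarise the integration audit." }],
});
```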

**OpenAI:**
- SDK: `openai` (`src/models/openai.ts`)
- Auth: `OPENAI_API_KEY` env var or `api_key` in config
- Features: Tool use, multimodal (images via data URIs or URLs)
- Also powers: OpenRouter, ZhipuAI, xAI via `baseURL` override

**Google Gemini:**
- SDK: `@google/generative-ai` (`src/models/gemini.ts`)
- Auth: `GOOGLE_API_KEY` env var or `api_key` in config
- Features: Streaming, tool use, extended thinking, multimodal

**AWS Bedrock:**
- SDK: `@aws-sdk/client-bedrock-runtime` (`src/models/bedrock.ts`)
- Auth: `AWS_REGION` env var + IAM credentials or explicit `accessKeyId`/`secretAccessKey` in config
- Features: Streaming (ConverseStream), tool use, multimodal
- Models: Meta Llama, Amazon Titan (cost-tracked in `src/models/costs.ts`)

**GitHub Models (Copilot):**
- SDK: `openai` (OpenAI-compatible API) (`src/models/github.ts`)
- Auth: `GITHUB_TOKEN` env var or OAuth device flow (`src/auth/github.ts`)
- Endpoint: `https://api.githubcopilot.com`
- Auto-fallback: When an Anthropic tier fails, Flynn automatically tries the same model via GitHub Models before the global fallback chain (`src/daemon/index.ts` `createAutoFallbackClient()`)
- OAuth device flow: Uses client ID `Ov23li8tweQw6odWQebz`, stores token at `~/.config/flynn/auth.json`

**OpenRouter:**
- SDK: `openai` with `baseURL: https://openrouter.ai/api/v1` (`src/daemon/index.ts`)
- Auth: `OPENROUTER_API_KEY` env var or `api_key` in config

**ZhipuAI:**
- SDK: `openai` with `baseURL: https://api.z.ai/api/paas/v4` (`src/daemon/index.ts`)
- Auth: `ZHIPUAI_API_KEY` env var or `api_key` in config

**xAI (Grok):**
- SDK: `openai` with `baseURL: https://api.x.ai/v1` (`src/daemon/index.ts`)
- Auth: `XAI_API_KEY` env var or `api_key` in config
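
The three OpenAI-compatible providers above differ only in base URL and key variable, so their setup can be summarised as a lookup table. The sketch below is illustrative: `OPENAI_COMPATIBLE` and `clientOptionsFor` are hypothetical names, and the precedence (config `api_key` before the environment variable) is an assumption, since the source only says "env var or `api_key` in config".

```typescript
interface CompatibleProvider {
  baseURL: string;
  apiKeyEnv: string;
}

// Base URLs and env var names as listed in this document.
const OPENAI_COMPATIBLE: Record<string, CompatibleProvider> = {
  openrouter: { baseURL: "https://openrouter.ai/api/v1", apiKeyEnv: "OPENROUTER_API_KEY" },
  zhipuai: { baseURL: "https://api.z.ai/api/paas/v4", apiKeyEnv: "ZHIPUAI_API_KEY" },
  xai: { baseURL: "https://api.x.ai/v1", apiKeyEnv: "XAI_API_KEY" },
};

// Assumption: an explicit config key wins over the environment variable.
function clientOptionsFor(
  provider: string,
  configApiKey: string | undefined,
  env: Record<string, string | undefined> = process.env,
): { baseURL: string; apiKey: string } {
  const entry = OPENAI_COMPATIBLE[provider];
  if (entry === undefined) throw new Error(`unknown OpenAI-compatible provider: ${provider}`);
  const apiKey = configApiKey ?? env[entry.apiKeyEnv];
  if (!apiKey) throw new Error(`set ${entry.apiKeyEnv} or api_key in config`);
  return { baseURL: entry.baseURL, apiKey };
}
```

The returned object has the shape the `openai` SDK constructor accepts, so a caller could pass it as `new OpenAI(clientOptionsFor("xai", undefined))`.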

**Ollama (Local):**
- SDK: `ollama` (`src/models/local/ollama.ts`)
- Auth: None (local server)
- Endpoint: Configurable `host` (default: `http://localhost:11434`)
- Config: `num_gpu` option for GPU layer control

**llama.cpp (Local):**
- SDK: Raw `fetch` HTTP calls (`src/models/local/llamacpp.ts`)
- Auth: Optional `auth_token` header
- Endpoint: Configurable (default: `http://localhost:8080`)

## Embedding Providers

Embedding providers (`src/memory/embeddings.ts`) power the hybrid vector + keyword search system. Factory function: `createEmbeddingProvider()`.

**OpenAI Embeddings:**
- SDK: `openai` (lazy import)
- Auth: `OPENAI_API_KEY` or config `api_key`
- Default model: `text-embedding-3-small`, default dims: 1536

**Gemini Embeddings:**
- SDK: `@google/generative-ai` (lazy import)
- Auth: `GOOGLE_API_KEY` or config `api_key`
- Uses `batchEmbedContents` for efficiency, default dims: 768

**Ollama Embeddings:**
- SDK: `ollama` (lazy import)
- Auth: None (local)
- Configurable host endpoint, default dims: 768

**LlamaCpp Embeddings:**
- SDK: Raw `fetch` to `/embedding` endpoint
- Auth: None
- Default endpoint: `http://localhost:8080`, default dims: 768

**Voyage AI Embeddings:**
- SDK: `openai` (OpenAI-compatible API, lazy import)
- Auth: `VOYAGE_API_KEY` env var or config `api_key`
- Endpoint: `https://api.voyageai.com/v1`, default dims: 1024

## Data Storage

**Session Database (SQLite):**
- Library: `better-sqlite3` (`src/session/store.ts`)
- Location: `{dataDir}/sessions.db`
- Schema: `messages` table with `id`, `session_id`, `role`, `content`, `created_at`
- TTL-based pruning: Configurable via `sessions.ttl` (default: 30 days), hourly cleanup

**Vector Database (SQLite):**
- Library: `better-sqlite3` (`src/memory/vector-store.ts`)
- Location: `{dataDir}/vectors.db`
- Stores embedding chunks as `Float32Array` BLOBs
- Content hashing for deduplication
- Background indexer runs every 30 seconds
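
To make the BLOB storage and deduplication points concrete, here is a small sketch of how a `Float32Array` can round-trip through a `Buffer` (the form `better-sqlite3` accepts for BLOB columns) and how a content hash can act as a dedup key. The function names are hypothetical, and SHA-256 is an assumption; the source does not say which hash is used.

```typescript
import { createHash } from "node:crypto";

// Wrap the embedding's bytes in a Buffer suitable for binding to a BLOB column.
function embeddingToBlob(embedding: Float32Array): Buffer {
  return Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength);
}

// Copy the bytes into a fresh ArrayBuffer first: a Buffer read from the database
// is not guaranteed to start on a 4-byte boundary, which Float32Array requires.
function blobToEmbedding(blob: Buffer): Float32Array {
  const copy = new Uint8Array(blob);
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}

// Assumption: SHA-256 over the chunk text serves as the deduplication key.
function contentHash(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}
```

An indexer built this way could skip re-embedding any chunk whose `contentHash()` already exists in the table.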

**Memory Store (Filesystem):**
- Location: `{dataDir}/memory/` (`src/memory/store.ts`)
- Format: Markdown files organized by namespace
- Layout: `global.md`, `user.md`, `sessions/{id}.md`
- Hybrid search: Keyword + vector (configurable weight via `hybrid_weight`, default 0.7)
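
One plausible reading of `hybrid_weight` is a linear blend of the two scores. The sketch below assumes the weight applies to the vector score and that both scores are already normalised to the range 0 to 1; neither assumption is confirmed by the source, and the names are hypothetical.

```typescript
interface Candidate {
  id: string;
  vectorScore: number; // assumed normalised to [0, 1]
  keywordScore: number; // assumed normalised to [0, 1]
}

// Assumption: hybridWeight is the share given to the vector score.
function hybridScore(candidate: Candidate, hybridWeight: number = 0.7): number {
  return hybridWeight * candidate.vectorScore + (1 - hybridWeight) * candidate.keywordScore;
}

// Returns a new array sorted by blended score, best match first.
function rank(candidates: Candidate[], hybridWeight: number = 0.7): Candidate[] {
  return [...candidates].sort(
    (a, b) => hybridScore(b, hybridWeight) - hybridScore(a, hybridWeight),
  );
}
```

Under these assumptions, a weight near 1 favours semantic matches and a weight near 0 favours exact keyword hits.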

**File Storage:**
- Local filesystem only — no cloud object storage

**Caching:**
- In-memory response cache for web fetch tool (5-minute TTL) (`src/tools/builtin/web-fetch.ts`)
- No external cache service (Redis, etc.)
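
An in-memory cache with a fixed TTL needs little more than a `Map` and an expiry timestamp per entry. The class below is a generic sketch of that idea, not the code in `src/tools/builtin/web-fetch.ts`; `TtlCache` and its injectable clock are hypothetical.

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // Default TTL matches the 5-minute figure above; the clock is injectable for testing.
  constructor(ttlMs: number = 5 * 60 * 1000, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it so the caller refetches
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Keying the cache by URL would let repeated fetches of the same page within five minutes skip the network round trip.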

## Channel Adapters (Messaging Platforms)

All adapters implement `ChannelAdapter` interface (`src/channels/types.ts`): `connect()`, `disconnect()`, `send()`, `onMessage()`.
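
The four methods named above suggest an interface along the following lines. The method names come from this document, but the parameter and return types are assumptions, and `LoopbackAdapter` is a hypothetical in-memory adapter included only to show the contract in use.

```typescript
interface InboundMessage {
  peer: string;
  text: string;
}

type MessageHandler = (message: InboundMessage) => void | Promise<void>;

// Signatures are assumptions; see src/channels/types.ts for the real interface.
interface ChannelAdapter {
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(peer: string, text: string): Promise<void>;
  onMessage(handler: MessageHandler): void;
}

// Hypothetical adapter that records outgoing messages instead of calling a platform SDK.
class LoopbackAdapter implements ChannelAdapter {
  readonly sent: { peer: string; text: string }[] = [];
  private readonly handlers: MessageHandler[] = [];
  private connected = false;

  async connect(): Promise<void> {
    this.connected = true;
  }

  async disconnect(): Promise<void> {
    this.connected = false;
  }

  async send(peer: string, text: string): Promise<void> {
    if (!this.connected) throw new Error("adapter is not connected");
    this.sent.push({ peer, text });
  }

  onMessage(handler: MessageHandler): void {
    this.handlers.push(handler);
  }

  // Simulates an inbound platform event by invoking every registered handler.
  receive(message: InboundMessage): void {
    for (const handler of this.handlers) void handler(message);
  }
}
```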

**Telegram:**
- SDK: `grammy` (`src/channels/telegram/`)
- Auth: Bot token via `telegram.bot_token` config
- Features: Long polling, chat ID allowlist, mention requirement, pairing codes, image/audio attachments

**Discord:**
- SDK: `discord.js` (`src/channels/discord/`)
- Auth: Bot token via `discord.bot_token` config
- Features: Guild/channel allowlists, mention requirement, pairing codes

**Slack:**
- SDK: `@slack/bolt` (`src/channels/slack/`)
- Auth: `bot_token`, `app_token`, `signing_secret` in config
- Features: Socket mode, channel allowlists, mention requirement, pairing codes

**WhatsApp:**
- SDK: `whatsapp-web.js` (`src/channels/whatsapp/`)
- Auth: QR code scanning (web client emulation)
- Features: Number/group allowlists, mention requirement, custom data directory, pairing codes

**WebChat:**
- Implementation: Gateway WebSocket bridge (`src/channels/webchat/`)
- Auth: Gateway token or Tailscale identity
- UI: Vanilla JS dashboard at `src/gateway/ui/` (HTML + CSS + JS, no framework)

## Authentication & Identity

**GitHub OAuth (Device Flow):**
- Implementation: `src/auth/github.ts`
- Client ID: `Ov23li8tweQw6odWQebz` (GitHub Copilot)
- Flow: Device code → User authorization → Token polling
- Storage: `~/.config/flynn/auth.json` (file mode `0600`, owner read/write only)
- Priority: `GITHUB_TOKEN` env → stored OAuth token → `null`
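
The token priority above (environment variable, then stored OAuth token, then `null`) can be sketched as below. The on-disk JSON shape (`{ "token": "..." }`) and the function names are assumptions made for illustration; the real format of `auth.json` is not described in the source. The demo writes to a throwaway temp directory rather than `~/.config/flynn`.

```typescript
import { chmodSync, existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Assumption: the token file is JSON of the form { "token": "..." }.
function storeToken(path: string, token: string): void {
  writeFileSync(path, JSON.stringify({ token }), { mode: 0o600 });
  chmodSync(path, 0o600); // also tighten a file that already existed
}

function loadStoredToken(path: string): string | null {
  if (!existsSync(path)) return null;
  const parsed: unknown = JSON.parse(readFileSync(path, "utf8"));
  if (parsed !== null && typeof parsed === "object") {
    const token = (parsed as { token?: unknown }).token;
    if (typeof token === "string" && token.length > 0) return token;
  }
  return null;
}

// Priority: GITHUB_TOKEN env var, then the stored OAuth token, then null.
function resolveGithubToken(
  env: Record<string, string | undefined>,
  authPath: string,
): string | null {
  return env.GITHUB_TOKEN || loadStoredToken(authPath);
}

// Demo against a temporary directory.
const demoPath = join(mkdtempSync(join(tmpdir(), "flynn-auth-")), "auth.json");
storeToken(demoPath, "stored-token");
```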

**Gateway Auth:**
- Static bearer token (`server.token` in config)
- Tailscale identity header trust (`server.tailscale_identity`)
- HTTP auth optional (`server.auth_http`)
- Gateway lock: Single-client WebSocket mode (`server.lock`)

**DM Pairing Codes:**
- Implementation: `src/channels/pairing.ts`, `src/session/store.ts` (SQLite persistence)
- Purpose: Authenticate unknown senders via one-time codes
- Config: `pairing.enabled`, `pairing.code_ttl` (default 5m), `pairing.code_length` (default 6)
- Gateway handlers for code generation/verification
- TUI `/pair` command execution (generate/list/revoke) in `src/frontends/tui/minimal.ts`
- Persistence: `PairingStore` interface backed by the SQLite `pairing_approved` table, so approved senders survive daemon restarts
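
A one-time pairing code with an expiry can be modelled with a map from code to deadline plus a set of approved senders. The sketch below is an in-memory illustration only: the `PairingCodes` class is hypothetical, the numeric alphabet is an assumption, and the real implementation persists approvals in SQLite.

```typescript
import { randomInt } from "node:crypto";

class PairingCodes {
  private readonly pending = new Map<string, number>(); // code -> expiry timestamp (ms)
  private readonly approved = new Set<string>();
  private readonly codeLength: number;
  private readonly ttlMs: number;

  // Defaults mirror pairing.code_length (6) and pairing.code_ttl (5m).
  constructor(codeLength: number = 6, ttlMs: number = 5 * 60 * 1000) {
    this.codeLength = codeLength;
    this.ttlMs = ttlMs;
  }

  // Assumption: codes are decimal digits drawn from a cryptographic RNG.
  generate(now: number = Date.now()): string {
    let code = "";
    for (let i = 0; i < this.codeLength; i++) code += randomInt(0, 10).toString();
    this.pending.set(code, now + this.ttlMs);
    return code;
  }

  // A code works once: it is removed on success and on expiry.
  redeem(code: string, sender: string, now: number = Date.now()): boolean {
    const expiresAt = this.pending.get(code);
    if (expiresAt === undefined) return false;
    this.pending.delete(code);
    if (now >= expiresAt) return false;
    this.approved.add(sender);
    return true;
  }

  isApproved(sender: string): boolean {
    return this.approved.has(sender);
  }
}
```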

**Gmail OAuth2:**
- SDK: `googleapis` (`src/automation/gmail.ts`)
- Credentials: `~/.config/flynn/gmail-credentials.json`
- Token: `~/.config/flynn/gmail-token.json`
- Setup: `flynn gmail-auth` CLI command

## Automation

**Cron Scheduler:**
- Library: `croner` (`src/automation/cron.ts`)
- Config: `automation.cron[]` — each job has `name`, `schedule`, `message`, `output.channel`, `output.peer`
- Implements `ChannelAdapter` to inject cron-triggered messages into the channel registry
- Features: Enable/disable per job, timezone support, runtime management tools

**Webhooks:**
- Implementation: `src/automation/webhooks.ts`
- Auth: HMAC-SHA256 signature verification (`X-Webhook-Signature` header)
- Templates: `{{body}}` and `{{json.field}}` placeholders
- Route: `POST /webhooks/{name}` on the gateway HTTP server
- Config: `automation.webhooks[]` with `name`, `secret`, `message`, `output`
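
The signature check and placeholder substitution above can be sketched with Node's `crypto` module. Several details here are assumptions: that the `X-Webhook-Signature` header carries a hex digest (optionally prefixed with `sha256=`), that `{{json.field}}` supports dotted paths, and that missing fields render as an empty string. Both function names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: the header value is a hex HMAC-SHA256 digest of the raw request body.
function verifySignature(secret: string, rawBody: string, signatureHeader: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const provided = Buffer.from(signatureHeader.replace(/^sha256=/, ""), "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return provided.length === expected.length && timingSafeEqual(provided, expected);
}

// Replaces {{body}} with the raw body and {{json.a.b}} with a value from the parsed body.
function renderTemplate(template: string, rawBody: string): string {
  let json: unknown;
  try {
    json = JSON.parse(rawBody);
  } catch {
    json = undefined; // non-JSON bodies still work with {{body}}
  }
  return template.replace(/\{\{\s*(body|json\.[\w.]+)\s*\}\}/g, (_match, key: string) => {
    if (key === "body") return rawBody;
    let value: unknown = json;
    for (const part of key.slice("json.".length).split(".")) {
      value =
        value !== null && typeof value === "object"
          ? (value as Record<string, unknown>)[part]
          : undefined;
    }
    if (value === undefined) return "";
    return typeof value === "string" ? value : JSON.stringify(value);
  });
}
```

Verifying against the raw body, before any JSON parsing, matters because re-serialising the payload can change its bytes and invalidate the digest.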

**Gmail Watcher:**
- SDK: `googleapis` (`src/automation/gmail.ts`)
- Modes: Pub/Sub push notifications or polling fallback
- Pub/Sub topic: `projects/flynn-agent/topics/gmail-push`
- Watch renewal: Every 6 days (a Gmail watch expires after about 7 days)
- Config: `automation.gmail` with `watch_labels`, `poll_interval`, `history_start`
- Route: `POST /gmail/push` on gateway for Pub/Sub push

**Heartbeat Monitor:**
- Implementation: `src/automation/heartbeat.ts`
- Checks: gateway, model, channels, memory, disk
- Config: `automation.heartbeat` with `interval`, `checks`, `failure_threshold`, `disk_threshold_mb`
- Notification: Sends to configured channel/peer on failures

## Web & Content Tools

**Web Search (Brave / SearXNG):**
- Implementation: `src/tools/builtin/web-search.ts`
- Brave Search API: `https://api.search.brave.com/res/v1/web/search`
  - Auth: `X-Subscription-Token` header via `web_search.api_key`
- SearXNG: Self-hosted instance via `web_search.endpoint`
  - Auth: None (private instance)
- Config: `web_search.provider` (`brave` or `searxng`), `web_search.max_results`
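
For the Brave provider, the request reduces to a URL with query parameters plus the subscription header. The helper below is a hypothetical sketch that only builds the request; it assumes `web_search.max_results` maps onto Brave's `count` query parameter, which the source does not state.

```typescript
interface SearchRequest {
  url: string;
  headers: Record<string, string>;
}

// Assumption: max_results is forwarded as Brave's `count` parameter.
function braveSearchRequest(query: string, apiKey: string, maxResults: number): SearchRequest {
  const url = new URL("https://api.search.brave.com/res/v1/web/search");
  url.searchParams.set("q", query);
  url.searchParams.set("count", String(maxResults));
  return {
    url: url.toString(),
    headers: {
      Accept: "application/json",
      "X-Subscription-Token": apiKey, // from web_search.api_key
    },
  };
}
```

The result can be handed to `fetch(request.url, { headers: request.headers })`.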

**Web Fetch (Readability):**
- Libraries: `linkedom`, `@mozilla/readability`, `turndown` (`src/tools/builtin/web-fetch.ts`)
- Features: HTML → Markdown conversion, article extraction, response caching (5min TTL)
- Truncation: 50,000 character max

**Browser Automation:**
- Library: `puppeteer-core` (`src/tools/builtin/browser/`)
- Config: `browser.executable_path` or `browser.ws_endpoint`
- Features: Headless browsing, page management, screenshots
- Limits: `browser.max_pages` (default 5), `browser.default_timeout` (default 30s)

## Audio Transcription

**Whisper-Compatible API:**
- Implementation: `src/models/media.ts`
- Endpoint: Configurable via `audio.transcription_endpoint`
- Auth: `audio.transcription_api_key` (Bearer token)
- Model: `audio.transcription_model` (default: `whisper-1`)
- Supported formats: OGG, MP3, WAV, WebM, MP4, M4A
- Integration: Auto-transcribes audio attachments from channels before model processing

**Native Audio Passthrough:**
- Implementation: `src/models/capabilities.ts`, `src/daemon/routing.ts`
- Capability check: `supportsAudioInput(provider, model, override?)` determines if a model can process raw audio
- Audio-capable providers: Gemini (`inlineData`), OpenAI (`input_audio`), GitHub (`input_audio`)
- Non-audio providers: Anthropic, Bedrock, Ollama, llama.cpp (fall back to Whisper transcription)
- Config override: `supports_audio: true/false` per model tier overrides auto-detection
- Smart routing: `createMessageRouter()` checks capability, passes raw `AudioSource` for capable models or transcribes via Whisper for others
- Audio content types: `AudioSource` (`{ type: 'audio', data: string, mimeType: string }`) in `src/models/types.ts`
- Token estimation: `estimateAudioTokens()` in `src/context/tokens.ts` (base64 length → decoded bytes → duration at 16 kbps → tokens at 32 per second)
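
The token estimation chain above works out as simple arithmetic. The sketch below reimplements it under stated assumptions: 16 kbps is taken as 16,000 bits per second, base64 padding is subtracted from the byte count, and the result is rounded up. The real `estimateAudioTokens()` may round or handle padding differently.

```typescript
const BITRATE_BITS_PER_SECOND = 16_000; // assumption: "16kbps" means 16,000 bits/s
const TOKENS_PER_SECOND = 32;

function estimateAudioTokens(base64Data: string): number {
  // Every 4 base64 characters encode 3 bytes; each '=' of padding removes one byte.
  const padding = base64Data.endsWith("==") ? 2 : base64Data.endsWith("=") ? 1 : 0;
  const bytes = Math.floor((base64Data.length * 3) / 4) - padding;
  const seconds = (bytes * 8) / BITRATE_BITS_PER_SECOND;
  return Math.ceil(seconds * TOKENS_PER_SECOND);
}
```

As a worked case, 60,000 bytes of audio is 480,000 bits, which is 30 seconds at 16,000 bits per second, giving 30 × 32 = 960 tokens.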

**Agent Tool: audio.transcribe:**
- Implementation: `src/tools/builtin/audio-transcribe.ts`
- Transcribes audio files on-demand via the configured Whisper-compatible endpoint
- Input: file path or base64 data with MIME type
- Output: transcribed text

## MCP (Model Context Protocol)

**MCP Client:**
- SDK: `@modelcontextprotocol/sdk` (`src/mcp/client.ts`)
- Transport: stdio (spawns external processes)
- Config: `mcp.servers[]` with `name`, `command`, `args`, `env`, `cwd`
- Bridge: MCP tools auto-registered in Flynn's tool registry (`src/mcp/bridge.ts`)
- Management: `McpManager` starts/stops all configured servers (`src/mcp/manager.ts`)

## Docker Sandbox

**Per-Session Containers:**
- Implementation: `src/sandbox/manager.ts`, `src/sandbox/docker.ts`
- Config: `sandbox.image` (default: `node:22-slim`), `sandbox.network` (default: `none`), `sandbox.memory_limit`, `sandbox.cpu_limit`
- Features: Lazily created per session, replaces `shell.exec` and `process.start` tools with sandboxed versions
- Prerequisite: Docker daemon available
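
The sandbox settings above map naturally onto `docker run` flags. The function below is a sketch of that mapping only: whether Flynn shells out to the Docker CLI or uses an API client is not stated in the source, and the container name and `sleep infinity` keep-alive command are hypothetical choices.

```typescript
interface SandboxConfig {
  image?: string;
  network?: string;
  memory_limit?: string; // e.g. "512m"
  cpu_limit?: string; // e.g. "1.5"
}

// Builds the argument list for `docker run`; defaults follow the config defaults above.
function dockerRunArgs(sessionId: string, config: SandboxConfig = {}): string[] {
  const args = [
    "run",
    "--detach",
    "--rm",
    "--name",
    `flynn-sandbox-${sessionId}`, // hypothetical naming scheme
    "--network",
    config.network ?? "none",
  ];
  if (config.memory_limit) args.push("--memory", config.memory_limit);
  if (config.cpu_limit) args.push("--cpus", config.cpu_limit);
  // Keep the container alive so later commands can be run inside it.
  args.push(config.image ?? "node:22-slim", "sleep", "infinity");
  return args;
}
```

Passing an argument array to something like `child_process.spawn("docker", args)` avoids shell interpolation of config values.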

## Networking & Exposure

**Gateway Server:**
- Protocol: WebSocket (JSON-RPC) + HTTP (`src/gateway/server.ts`)
- Default port: 18800
- Binding: `127.0.0.1` (localhost only) or `0.0.0.0` (all interfaces)
- Features: LaneQueue for request ordering, session bridge, static file serving for dashboard

**Tailscale Serve:**
- Implementation: `src/gateway/tailscale.ts`
- Purpose: Expose gateway HTTPS endpoint on tailnet
- Config: `server.tailscale.serve`, `server.tailscale.hostname`, `server.tailscale.port`
- Prerequisite: Tailscale CLI installed and daemon running

## Monitoring & Observability

**Error Tracking:**
- None (console.error only)

**Logging:**
- `console.log` / `console.error` / `console.debug` throughout
- No structured logging framework

**Cost Tracking:**
- Built-in: `src/models/costs.ts` with per-million-token pricing for known models
- Tracks: Anthropic, OpenAI, Gemini, xAI, Bedrock models
- GitHub Copilot models are tracked at $0 because usage is included in the subscription
- Usage exposed via `/usage` command and gateway `system.usage` RPC

## CI/CD & Deployment

**Hosting:**
- Self-hosted (designed for personal deployment)
- A process supervisor is expected to handle restarts (exit code 75 signals a restart request)

**CI Pipeline:**
- Not detected in repository

## Environment Configuration

**Required env vars (minimum viable):**
- `ANTHROPIC_API_KEY` (or another model provider's key)
- `FLYNN_TELEGRAM_TOKEN` (if using default Telegram channel)

**Optional env vars (by feature):**
- `OPENAI_API_KEY` - OpenAI models and embeddings
- `GOOGLE_API_KEY` - Gemini models and embeddings
- `GITHUB_TOKEN` - GitHub Models / Copilot access
- `AWS_REGION` - Bedrock region
- `OPENROUTER_API_KEY` - OpenRouter access
- `ZHIPUAI_API_KEY` - ZhipuAI access
- `XAI_API_KEY` - xAI (Grok) access
- `VOYAGE_API_KEY` - Voyage AI embeddings
- `FLYNN_DATA_DIR` - Custom data directory

**Secrets location:**
- API keys: YAML config (with `${ENV_VAR}` expansion) or environment variables
- OAuth tokens: `~/.config/flynn/auth.json` (GitHub), `~/.config/flynn/gmail-token.json` (Gmail)
- `.env.example` present at project root
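
The `${ENV_VAR}` expansion mentioned above amounts to a pattern substitution over config strings. The sketch below shows one way to do it; `expandEnv` is a hypothetical name, and throwing on an unset variable is an assumption (the real loader might substitute an empty string instead).

```typescript
// Replaces each ${NAME} in a config value with the matching environment variable.
function expandEnv(
  value: string,
  env: Record<string, string | undefined> = process.env,
): string {
  return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name: string) => {
    const resolved = env[name];
    // Assumption: an unset variable is a configuration error rather than an empty string.
    if (resolved === undefined) {
      throw new Error(`environment variable ${name} is not set`);
    }
    return resolved;
  });
}
```

Failing fast on a missing variable surfaces a misconfigured key at startup rather than as an authentication error on the first request.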

## Webhooks & Callbacks

**Incoming:**
- `POST /webhooks/{name}` - Named webhooks with HMAC-SHA256 verification (`src/automation/webhooks.ts`)
- `POST /gmail/push` - Google Pub/Sub push notifications for Gmail (`src/automation/gmail.ts`)

**Outgoing:**
- None (no outbound webhooks; all communication goes through channel adapters)

---

*Integration audit: 2026-02-09*