Codebase Structure
Analysis Date: 2025-02-09
Directory Layout
flynn/
├── src/ # All TypeScript source code
│   ├── agents/ # Named agent configs + routing
│   ├── auth/ # GitHub device flow auth
│   ├── automation/ # Cron, webhooks, heartbeat, Gmail
│   ├── backends/ # AI agent implementations
│   │   └── native/ # NativeAgent + AgentOrchestrator
│   ├── channels/ # Multi-platform messaging adapters
│   │   ├── discord/ # Discord adapter
│   │   ├── slack/ # Slack adapter
│   │   ├── telegram/ # Telegram adapter
│   │   ├── webchat/ # WebChat adapter (wraps gateway)
│   │   └── whatsapp/ # WhatsApp adapter
│   ├── cli/ # CLI commands (commander.js)
│   ├── config/ # YAML config loading + Zod schema
│   ├── context/ # Token estimation + compaction
│   ├── daemon/ # Daemon bootstrap + lifecycle
│   ├── frontends/ # UI frontends
│   │   ├── telegram/ # Telegram bot (legacy direct integration)
│   │   └── tui/ # Terminal UI (readline + React/Ink)
│   │       └── components/ # Ink React components
│   ├── gateway/ # WebSocket JSON-RPC server
│   │   ├── handlers/ # JSON-RPC method handlers
│   │   └── ui/ # Vanilla JS dashboard
│   │       ├── lib/ # Shared JS (ws-client)
│   │       └── pages/ # Page JS (chat, dashboard, sessions, settings, usage)
│   ├── hooks/ # Tool confirmation engine
│   ├── mcp/ # Model Context Protocol bridge
│   ├── memory/ # Persistent memory + vector search
│   ├── models/ # LLM provider clients + router
│   │   └── local/ # Ollama + llama.cpp clients
│   ├── prompt/ # System prompt template assembly
│   ├── sandbox/ # Docker sandbox for tool execution
│   ├── session/ # SQLite session persistence
│   ├── skills/ # Pluggable skill system
│   └── tools/ # Tool registry, executor, policy
│       └── builtin/ # Built-in tool implementations
│           ├── browser/ # Puppeteer browser tools
│           └── process/ # Background process tools
├── config/ # Example/default config files
├── docs/ # Documentation
│   └── plans/ # Planning docs
├── .planning/ # GSD planning documents
│   └── codebase/ # Codebase analysis (this file)
├── AGENTS.md # Agent instructions for Claude Code
├── CHANGELOG.md # Version changelog
├── CLAUDE.md # Claude Code shared memory
├── SOUL.md # Flynn's AI personality/identity
├── Dockerfile # Docker build config
├── docker-compose.yml # Docker Compose config
├── package.json # Node.js package manifest
├── tsconfig.json # TypeScript configuration
└── pnpm-lock.yaml # Lockfile
Directory Purposes
src/agents/:
- Purpose: Named agent configurations and routing logic
- Contains: AgentConfigRegistry (stores named configs), AgentRouter (resolves channel+sender → agent config)
- Key files: src/agents/registry.ts, src/agents/router.ts
src/auth/:
- Purpose: GitHub device flow authentication for GitHub Models provider
- Contains: OAuth device flow implementation
- Key files: src/auth/github.ts
src/automation/:
- Purpose: Scheduled jobs, incoming webhooks, health monitoring, Gmail watching
- Contains: CronScheduler (croner-based), WebhookHandler (HMAC auth), HeartbeatMonitor, GmailWatcher
- Key files: src/automation/cron.ts, src/automation/webhooks.ts, src/automation/heartbeat.ts, src/automation/gmail.ts
src/backends/native/:
- Purpose: Core AI agent implementation — message processing and tool execution loop
- Contains: NativeAgent (tool loop), AgentOrchestrator (delegation/compaction/memory wrapper), prompt templates
- Key files: src/backends/native/agent.ts, src/backends/native/orchestrator.ts, src/backends/native/prompts.ts, src/backends/native/attachments.ts
src/channels/:
- Purpose: Platform-agnostic messaging layer with uniform adapter interface
- Contains: ChannelAdapter interface, ChannelRegistry, per-platform adapter directories, PairingManager
- Key files: src/channels/types.ts, src/channels/registry.ts, src/channels/pairing.ts, src/channels/utils.ts
src/cli/:
- Purpose: CLI command definitions and entry point
- Contains: Commander.js command registrations, config loading helpers
- Key files: src/cli/index.ts (entry point), src/cli/start.ts, src/cli/tui.ts, src/cli/send.ts, src/cli/sessions.ts, src/cli/doctor.ts, src/cli/config-cmd.ts, src/cli/completion.ts, src/cli/shared.ts
src/config/:
- Purpose: Configuration loading and validation
- Contains: YAML loader with ${ENV_VAR} expansion, comprehensive Zod schema
- Key files: src/config/schema.ts (all config types), src/config/loader.ts (YAML parse + validate)
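The ${ENV_VAR} expansion mentioned above can be illustrated with a minimal sketch. The helper name `expandEnv`, the sample key `api_key`, and the choice to throw on an unset variable are all assumptions made for illustration; the actual behavior of src/config/loader.ts is not shown in this document and may differ.

```typescript
// Hypothetical helper; not taken from src/config/loader.ts.
type Env = Record<string, string | undefined>;

// Replace every ${NAME} placeholder in the raw YAML text with env[NAME].
// Assumption: an unset variable is an error rather than an empty string.
function expandEnv(raw: string, env: Env): string {
  return raw.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name: string) => {
    const value = env[name];
    if (value === undefined) {
      throw new Error(`Missing environment variable: ${name}`);
    }
    return value;
  });
}

// Illustrative config fragment (the key name is made up).
const yamlText = "api_key: ${EXAMPLE_API_KEY}\n";
const expanded = expandEnv(yamlText, { EXAMPLE_API_KEY: "abc123" });
console.log(expanded);
```

Expanding on the raw text before YAML parsing keeps the step independent of the schema, so Zod validation only ever sees resolved values.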
src/context/:
- Purpose: Conversation context management — token counting and history compaction
- Contains: Token estimator (rule-based, no tokenizer), compaction logic using delegation
- Key files: src/context/tokens.ts, src/context/compaction.ts
src/daemon/:
- Purpose: Daemon bootstrap — wires all subsystems together and manages lifecycle
- Contains: startDaemon() function (1088 lines), Lifecycle manager, model client factory, message router
- Key files: src/daemon/index.ts (main orchestration), src/daemon/lifecycle.ts
src/frontends/telegram/:
- Purpose: Legacy direct Telegram bot integration with confirmation UI
- Contains: Bot handlers, confirmation keyboard management
- Key files: src/frontends/telegram/bot.ts, src/frontends/telegram/handlers.ts, src/frontends/telegram/confirmations.ts
src/frontends/tui/:
- Purpose: Terminal user interface with two modes
- Contains: Minimal readline TUI, fullscreen React/Ink TUI, markdown rendering, slash commands
- Key files: src/frontends/tui/minimal.ts, src/frontends/tui/fullscreen.ts, src/frontends/tui/commands.ts, src/frontends/tui/markdown.ts
src/gateway/:
- Purpose: WebSocket JSON-RPC server + HTTP server + dashboard
- Contains: GatewayServer, Router (method dispatch), SessionBridge (WS → NativeAgent), LaneQueue (request serialization), auth, protocol, static file serving, Tailscale Serve integration
- Key files: src/gateway/server.ts, src/gateway/router.ts, src/gateway/session-bridge.ts, src/gateway/lane-queue.ts, src/gateway/protocol.ts, src/gateway/auth.ts, src/gateway/static.ts, src/gateway/tailscale.ts
src/gateway/handlers/:
- Purpose: JSON-RPC method handler implementations
- Contains: Handler factories for system, session, tool, agent, config, and pairing methods
- Key files: src/gateway/handlers/system.ts, src/gateway/handlers/sessions.ts, src/gateway/handlers/agent.ts, src/gateway/handlers/tools.ts, src/gateway/handlers/config.ts, src/gateway/handlers/pairing.ts
src/gateway/ui/:
- Purpose: Vanilla JS web dashboard served by gateway HTTP server
- Contains: HTML pages, CSS, client-side JavaScript
- Key files: src/gateway/ui/index.html, src/gateway/ui/chat.html, src/gateway/ui/style.css, src/gateway/ui/app.js, src/gateway/ui/lib/ws-client.js
src/hooks/:
- Purpose: Tool execution confirmation engine with glob-pattern matching
- Contains: HookEngine with pending confirmation queue
- Key files: src/hooks/engine.ts
src/mcp/:
- Purpose: Model Context Protocol integration — start external MCP servers and bridge their tools
- Contains: McpClient, McpManager, tool bridging utilities
- Key files: src/mcp/client.ts, src/mcp/manager.ts, src/mcp/bridge.ts
src/memory/:
- Purpose: Persistent memory system with keyword + vector hybrid search
- Contains: MemoryStore (namespace-based markdown files), VectorStore (SQLite), HybridSearch, embedding providers, text chunker
- Key files: src/memory/store.ts, src/memory/vector-store.ts, src/memory/hybrid-search.ts, src/memory/embeddings.ts, src/memory/chunker.ts
src/models/:
- Purpose: LLM provider client implementations and tier-based routing
- Contains: Provider clients, ModelRouter, retry logic, cost estimation, media helpers
- Key files: src/models/types.ts (core interfaces), src/models/router.ts, src/models/anthropic.ts, src/models/openai.ts, src/models/gemini.ts, src/models/bedrock.ts, src/models/github.ts, src/models/retry.ts, src/models/costs.ts, src/models/media.ts, src/models/capabilities.ts
src/models/local/:
- Purpose: Local model provider clients
- Contains: Ollama and llama.cpp client implementations
- Key files: src/models/local/ollama.ts, src/models/local/llamacpp.ts
src/prompt/:
- Purpose: System prompt assembly from template files
- Contains: Template search across directories (SOUL.md, AGENTS.md, IDENTITY.md, USER.md, TOOLS.md)
- Key files: src/prompt/template.ts
src/sandbox/:
- Purpose: Docker container isolation for shell/process tool execution
- Contains: DockerSandbox (container lifecycle), SandboxManager (per-session containers), sandboxed tool wrappers
- Key files: src/sandbox/docker.ts, src/sandbox/manager.ts, src/sandbox/tools.ts
src/session/:
- Purpose: Conversation history persistence
- Contains: SessionStore (SQLite), SessionManager (in-memory cache), ManagedSession
- Key files: src/session/store.ts, src/session/manager.ts
src/skills/:
- Purpose: Pluggable skill system — load skills from bundled, managed, and workspace directories
- Contains: SkillRegistry, SkillInstaller, skill loader, skill type definitions
- Key files: src/skills/registry.ts, src/skills/installer.ts, src/skills/loader.ts, src/skills/types.ts
src/tools/:
- Purpose: Tool abstraction layer — registry, execution, policy enforcement
- Contains: ToolRegistry, ToolExecutor, ToolPolicy, type definitions
- Key files: src/tools/types.ts, src/tools/registry.ts, src/tools/executor.ts, src/tools/policy.ts
src/tools/builtin/:
- Purpose: Built-in tool implementations shipped with Flynn
- Contains: Shell exec, file operations, web fetch, memory ops, web search, media send, image analysis, audio transcription, system info, session management, agent listing, cross-channel messaging, cron management
- Key files: src/tools/builtin/shell.ts, src/tools/builtin/file-read.ts, src/tools/builtin/file-write.ts, src/tools/builtin/file-edit.ts, src/tools/builtin/file-patch.ts, src/tools/builtin/file-list.ts, src/tools/builtin/web-fetch.ts, src/tools/builtin/web-search.ts, src/tools/builtin/memory-read.ts, src/tools/builtin/memory-write.ts, src/tools/builtin/memory-search.ts, src/tools/builtin/media-send.ts, src/tools/builtin/image-analyze.ts, src/tools/builtin/audio-transcribe.ts, src/tools/builtin/system-info.ts, src/tools/builtin/sessions.ts, src/tools/builtin/agents-list.ts, src/tools/builtin/message-send.ts, src/tools/builtin/cron.ts
src/tools/builtin/browser/:
- Purpose: Puppeteer-based browser automation tools
- Contains: BrowserManager (page lifecycle), browser tool implementations (navigate, screenshot, click, type, content, eval)
- Key files: src/tools/builtin/browser/manager.ts, src/tools/builtin/browser/tools.ts
src/tools/builtin/process/:
- Purpose: Background process management tools
- Contains: ProcessManager, tools for start/status/output/kill/list
- Key files: src/tools/builtin/process/manager.ts, src/tools/builtin/process/start.ts, src/tools/builtin/process/status.ts, src/tools/builtin/process/output.ts, src/tools/builtin/process/kill.ts, src/tools/builtin/process/list.ts
Key File Locations
Entry Points:
- src/cli/index.ts: CLI entry point (binary: flynn)
- src/daemon/index.ts: Daemon bootstrap (startDaemon()), the central wiring point
- src/gateway/server.ts: Gateway WebSocket + HTTP server
- src/frontends/tui/minimal.ts: TUI readline mode
- src/frontends/tui/fullscreen.ts: TUI fullscreen Ink mode
Configuration:
- src/config/schema.ts: Complete Zod config schema; all types are defined here
- src/config/loader.ts: YAML parse + env expansion + Zod validation
- tsconfig.json: TypeScript compiler config (strict, ES2022, NodeNext)
- package.json: Dependencies and scripts
- config/: Example/default config files directory
Core Logic:
- src/backends/native/agent.ts: NativeAgent, the AI tool loop
- src/backends/native/orchestrator.ts: AgentOrchestrator (delegation, compaction, memory)
- src/models/router.ts: ModelRouter (tier-based model selection with fallback)
- src/tools/executor.ts: ToolExecutor (policy check → hook check → execute)
- src/channels/registry.ts: ChannelRegistry (adapter lifecycle + message routing)
- src/daemon/index.ts: startDaemon() (wires everything together)
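The "policy check → hook check → execute" order attributed to ToolExecutor can be sketched as follows. Every type and function name here (`ToolCall`, `ExecutorDeps`, `executeTool`) and the sample tool names are hypothetical, and the sketch is synchronous for brevity; the real executor in src/tools/executor.ts very likely awaits confirmations and tool results.

```typescript
// Hypothetical shapes for illustration only; not the real ToolExecutor API.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ExecutorDeps {
  isAllowedByPolicy(call: ToolCall): boolean; // stands in for ToolPolicy
  needsConfirmation(call: ToolCall): boolean; // stands in for a hook pattern match
  confirm(call: ToolCall): boolean;           // stands in for the user's answer
  run(call: ToolCall): string;                // the tool body itself
}

interface ToolResult {
  ok: boolean;
  output: string;
}

// Runs the three stages in order and stops at the first refusal.
function executeTool(call: ToolCall, deps: ExecutorDeps): ToolResult {
  if (!deps.isAllowedByPolicy(call)) {
    return { ok: false, output: "denied by policy" };
  }
  if (deps.needsConfirmation(call) && !deps.confirm(call)) {
    return { ok: false, output: "rejected by user" };
  }
  return { ok: true, output: deps.run(call) };
}

// Example wiring with made-up rules and tool names.
const executorDeps: ExecutorDeps = {
  isAllowedByPolicy: (call) => call.name !== "blocked.tool",
  needsConfirmation: (call) => call.name.startsWith("file."),
  confirm: () => true,
  run: (call) => `ran ${call.name}`,
};

console.log(executeTool({ name: "web.fetch", args: {} }, executorDeps));
```

Checking policy before hooks means a tool that is disallowed outright never prompts the user for confirmation.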
Testing:
- Test files are co-located with source: src/path/to/file.test.ts alongside src/path/to/file.ts
- No separate test directory
Naming Conventions
Files:
- Source files: kebab-case.ts (e.g., session-bridge.ts, lane-queue.ts)
- React components: camelCase.ts in src/frontends/tui/components/
- Test files: *.test.ts suffix (e.g., agent.test.ts, registry.test.ts)
- Index files: index.ts barrel exports in every directory
- Type-only files: types.ts for pure type definitions
Directories:
- Feature-based: kebab-case/ (e.g., web-search, file-read)
- Platform subdirs: lowercase/ (e.g., telegram/, discord/, slack/)
- Nested features: parent/child/ (e.g., tools/builtin/browser/)
Exports:
- Every directory has an index.ts barrel file that re-exports public APIs
- Types use export type for type-only exports
- Registration chain flows: implementation → builtin/index.ts → tools/index.ts → daemon/index.ts
Where to Add New Code
New Channel Adapter:
- Create directory: src/channels/<platform>/
- Create: adapter.ts (implements ChannelAdapter), index.ts (re-exports)
- Add test: adapter.test.ts
- Register in: src/channels/index.ts (export), src/daemon/index.ts (instantiate + register)
- Add config: src/config/schema.ts (new optional schema block)
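As a rough illustration of the adapter steps above, the sketch below defines a stand-in `ChannelAdapter` interface locally, because the real one in src/channels/types.ts is not reproduced in this document. The member names, the `IncomingMessage` shape, and the in-memory "echo" platform are all assumptions; a real adapter would almost certainly be asynchronous.

```typescript
// Hypothetical stand-ins; the real interface in src/channels/types.ts may differ.
interface IncomingMessage {
  channel: string;
  sender: string;
  text: string;
}

interface ChannelAdapter {
  readonly name: string;
  start(onMessage: (msg: IncomingMessage) => void): void;
  stop(): void;
  send(recipient: string, text: string): void;
}

// What src/channels/<platform>/adapter.ts might contain for a toy platform.
class EchoAdapter implements ChannelAdapter {
  readonly name = "echo";
  readonly sent: Array<{ recipient: string; text: string }> = [];
  private onMessage: ((msg: IncomingMessage) => void) | null = null;

  start(onMessage: (msg: IncomingMessage) => void): void {
    this.onMessage = onMessage;
  }

  stop(): void {
    this.onMessage = null;
  }

  send(recipient: string, text: string): void {
    this.sent.push({ recipient, text });
  }

  // Test hook: simulates the platform delivering a message to the adapter.
  simulateIncoming(sender: string, text: string): void {
    this.onMessage?.({ channel: this.name, sender, text });
  }
}

// Roughly the "instantiate + register" step, with a Map standing in for ChannelRegistry.
const adapters = new Map<string, ChannelAdapter>();
const echo = new EchoAdapter();
adapters.set(echo.name, echo);
echo.start((msg) => echo.send(msg.sender, `you said: ${msg.text}`));
echo.simulateIncoming("alice", "hello");
console.log(echo.sent);
```

Keeping a `simulateIncoming`-style hook on the adapter makes adapter.test.ts straightforward to write without a live platform connection.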
New Model Provider:
- Create: src/models/<provider>.ts (implements ModelClient)
- Add export: src/models/index.ts
- Add case: src/daemon/index.ts → createClientFromConfig() switch statement
- Add config: src/config/schema.ts → modelConfigBaseSchema.provider enum
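The provider steps above could look something like the following. `ModelClient`, `ModelConfig`, and both client classes are simplified, hypothetical shapes defined only for this sketch; the real interface lives in src/models/types.ts, and the real `createClientFromConfig()` handles the actual provider list and is presumably asynchronous.

```typescript
// Hypothetical, simplified shapes; not the real src/models/types.ts definitions.
interface ModelClient {
  readonly provider: string;
  complete(prompt: string): string;
}

// Stands in for the provider enum in the config schema.
type ProviderName = "existing" | "example";

interface ModelConfig {
  provider: ProviderName;
  model: string;
}

// Placeholder for a provider that is already wired in.
class ExistingClient implements ModelClient {
  readonly provider = "existing";
  private readonly model: string;
  constructor(model: string) {
    this.model = model;
  }
  complete(prompt: string): string {
    return `existing:${this.model}:${prompt}`;
  }
}

// What a new src/models/<provider>.ts might export.
class ExampleClient implements ModelClient {
  readonly provider = "example";
  private readonly model: string;
  constructor(model: string) {
    this.model = model;
  }
  complete(prompt: string): string {
    return `[${this.model}] ${prompt}`;
  }
}

// The switch gains one case per provider; the never-typed default makes the
// compiler flag any provider name that was added to the type but not handled.
function createClientFromConfig(config: ModelConfig): ModelClient {
  switch (config.provider) {
    case "existing":
      return new ExistingClient(config.model);
    case "example":
      return new ExampleClient(config.model);
    default: {
      const unhandled: never = config.provider;
      throw new Error(`Unknown provider: ${String(unhandled)}`);
    }
  }
}

const client = createClientFromConfig({ provider: "example", model: "m1" });
console.log(client.complete("hi"));
```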
New Tool:
- Static tool (no deps): Create src/tools/builtin/<name>.ts, export const
- Factory tool (needs deps): Create src/tools/builtin/<name>.ts, export function
- Add to: src/tools/builtin/index.ts (export + add to allBuiltinTools if static)
- Add to: src/tools/index.ts (re-export)
- Register in: src/daemon/index.ts (call factory + register with toolRegistry)
- Add to profiles: src/tools/policy.ts → PROFILE_TOOLS if needed
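The difference between a static tool and a factory tool can be shown with a small sketch. The `Tool` shape, the tool names `example.echo` and `example.time`, and the `Clock` dependency are invented for illustration, and a plain Map stands in for the real ToolRegistry; the actual definitions are in src/tools/types.ts and src/tools/registry.ts.

```typescript
// Hypothetical tool shape; the real one is defined in src/tools/types.ts.
interface Tool {
  name: string;
  description: string;
  execute(args: Record<string, unknown>): string;
}

// Static tool: no dependencies, so it can be exported as a const.
const echoTool: Tool = {
  name: "example.echo",
  description: "Returns its input text unchanged",
  execute: (args) => String(args["text"] ?? ""),
};

// Factory tool: needs a dependency, so it is exported as a function.
interface Clock {
  now(): Date;
}

function createTimeTool(clock: Clock): Tool {
  return {
    name: "example.time",
    description: "Returns the current time as an ISO 8601 string",
    execute: () => clock.now().toISOString(),
  };
}

// Static tools are collected in a list such as allBuiltinTools ...
const allBuiltinTools: Tool[] = [echoTool];

// ... while factory tools are built with their dependencies at daemon startup.
const toolRegistry = new Map<string, Tool>();
for (const tool of allBuiltinTools) {
  toolRegistry.set(tool.name, tool);
}
const timeTool = createTimeTool({ now: () => new Date(0) });
toolRegistry.set(timeTool.name, timeTool);

console.log(Array.from(toolRegistry.keys()));
```

Injecting the clock rather than calling `new Date()` inside the tool is what makes the factory form worthwhile: the test file can pass a fixed clock and assert on an exact output.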
New Tool Group (multi-tool):
- Create directory: src/tools/builtin/<group>/
- Create: manager.ts (shared state), individual tool files, index.ts
- Follow pattern of src/tools/builtin/process/ or src/tools/builtin/browser/
New Gateway Handler:
- Create: src/gateway/handlers/<domain>.ts (export createXxxHandlers())
- Add to: src/gateway/handlers/index.ts
- Register in: src/gateway/server.ts → registerHandlers()
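A handler factory in the createXxxHandlers() style might be shaped like the sketch below. The `RpcHandler` signature, the `notes.*` method names, and the `dispatch` helper are assumptions made for this example; the gateway's real handler and router types are not shown in this document.

```typescript
// Hypothetical handler signature; the gateway's real types may differ.
type RpcHandler = (params: Record<string, unknown>) => unknown;

interface NotesDeps {
  notes: string[];
}

// What src/gateway/handlers/<domain>.ts might export for a made-up "notes" domain.
function createNotesHandlers(deps: NotesDeps): Record<string, RpcHandler> {
  return {
    "notes.list": () => ({ notes: deps.notes.slice() }),
    "notes.add": (params) => {
      const text = params["text"];
      if (typeof text !== "string") {
        throw new Error("text must be a string");
      }
      deps.notes.push(text);
      return { count: deps.notes.length };
    },
  };
}

// Stand-in for the server's method table and its registerHandlers() step.
const methods = new Map<string, RpcHandler>();

function registerHandlers(handlers: Record<string, RpcHandler>): void {
  for (const [name, handler] of Object.entries(handlers)) {
    methods.set(name, handler);
  }
}

// Stand-in for the router's method dispatch.
function dispatch(method: string, params: Record<string, unknown>): unknown {
  const handler = methods.get(method);
  if (!handler) {
    throw new Error(`Method not found: ${method}`);
  }
  return handler(params);
}

registerHandlers(createNotesHandlers({ notes: [] }));
console.log(dispatch("notes.add", { text: "first" }));
console.log(dispatch("notes.list", {}));
```

Passing dependencies into the factory keeps each handler file free of imports from the daemon, which matches the "create in the handler file, register in the server" split described above.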
New Automation Type:
- Create: src/automation/<type>.ts
- If it produces messages: implement ChannelAdapter interface
- Add to: src/automation/index.ts
- Register in: src/daemon/index.ts
- Add config: src/config/schema.ts → automationSchema
New CLI Command:
- Create: src/cli/<command>.ts → export registerXxxCommand(program)
- Register in: src/cli/index.ts
Utilities:
- Shared helpers: Place in the most specific layer that uses them
- Cross-cutting: src/channels/utils.ts for channel utils, src/models/media.ts for media utils
- No global utils/ directory; utilities are co-located with their domain
Special Directories
dist/:
- Purpose: Compiled JavaScript output
- Generated: Yes (by pnpm build / tsc)
- Committed: No (in .gitignore)
node_modules/:
- Purpose: Installed dependencies
- Generated: Yes (by pnpm install)
- Committed: No (in .gitignore)
config/:
- Purpose: Example/default configuration files
- Generated: No
- Committed: Yes
src/gateway/ui/:
- Purpose: Static web dashboard (vanilla HTML/CSS/JS, not compiled)
- Generated: No — hand-written vanilla JS
- Committed: Yes
- Note: Served by the gateway HTTP server at runtime from dist/gateway/ui/
.planning/:
- Purpose: GSD planning and analysis documents
- Generated: By analysis tools
- Committed: Yes
docs/plans/:
- Purpose: Feature planning documents and state tracking
- Generated: No
- Committed: Yes
Structure analysis: 2025-02-09