Flynn AGENTS.md

General Rules

Parallelise with subagents: For every task, use multiple subagents with the appropriate model to work more efficiently. Dispatch independent subtasks in parallel rather than executing them sequentially.
Subagent model selection (MANDATORY): You MUST use the right model tier for each subagent — claude-haiku-4.5 for fast/simple/mechanical tasks, claude-sonnet-4.6 for default/standard implementation work, and claude-opus-4.6 for complex reasoning or architecture decisions. Never use the same model for all subagents.
Minimize main agent context: Always delegate tasks to subagents with the right model so that each task runs more efficiently and the main agent's context window usage stays minimal. The main agent should coordinate and synthesize, not perform detailed implementation work.
Project maturity + risk posture: Flynn is a local, work-in-progress project and is not in production. Breaking code is acceptable when necessary, but every change must include the right tests to validate expected behavior.
Commit often: git commit frequently — after each meaningful unit of work, not just at the end of a task.
Commit structure: Prefer small, atomic commits that each represent one meaningful unit of work. Keep unrelated changes out of the same commit. When a change affects docs/diagrams/state, update and commit those alongside the code change (not later).
Branch + merge workflow: Do work on a feature branch (git switch -c feature/<descriptive-slug>). Keep branches rebased onto main (avoid merge commits). When finished: git rebase main, then fast-forward merge back to main: git switch main && git merge --ff-only feature/<slug>, then delete the merged feature branch: git branch -d feature/<slug>.
Update state.json: After every feature implementation, modification, or significant change, update docs/plans/state.json accordingly — add new phases/entries, update test counts, adjust the overall_progress section, and update the feature_gap_scorecard if the gap analysis is affected. Commit state.json alongside the feature change, not as a separate afterthought.
Keep docs + diagrams current: When behavior, config keys, APIs, or architecture changes, update the relevant docs in the same change (README + docs/). If the change affects a documented flow, also update the corresponding Mermaid diagrams (e.g. docs/architecture/AGENT_DIAGRAM.md, docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md, docs/api/PROTOCOL.md) so they stay source-of-truth.
AI-optimized diagram updates are mandatory on any code change: Any code added or modified MUST include diagram review and updates in the same PR for docs/architecture/AGENT_DIAGRAM.md, docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md, and docs/api/PROTOCOL.md (or an explicit note in the PR/commit message explaining why no diagram change was needed).
Track OpenClaw evolution: Review OpenClaw updates (repo + docs + ecosystem) on a regular cadence (at least monthly, and before major Flynn planning cycles) to identify high-value assistant patterns and candidate features for Flynn. Capture actionable findings in docs/plans/ and reflect prioritized gaps in docs/plans/state.json.

Build, Lint, and Test Commands

# Build
pnpm build                    # Compile TypeScript to dist/

# Development
pnpm dev                      # Run daemon with watch mode
pnpm tui                      # Run TUI in minimal mode (readline)
pnpm tui:fs                   # Run TUI in fullscreen mode (React/Ink)
pnpm tui:dev                  # Run TUI with watch mode

# Run
pnpm start                    # Start production build

# Testing
pnpm test                     # Run tests in watch mode
pnpm test:run                 # Run tests once (no watch)
pnpm test:run src/path/to/file.test.ts  # Run a single test file

# Linting and Type Checking
pnpm lint                     # Run ESLint
pnpm typecheck                # Run TypeScript compiler (no emit)

Architecture

Flynn is a multi-channel AI assistant daemon. Messages flow: Channel Adapter → AgentOrchestrator → NativeAgent → ModelClient, with tools executed in the agent loop.

Core Abstractions

ModelClient (src/models/types.ts): chat(request): Promise<ChatResponse>. Providers: Anthropic, OpenAI, Gemini, Bedrock, Ollama, llama.cpp, GitHub Models, OpenRouter, Zhipu, xAI. Factory in src/daemon/index.ts (createClientFromConfig()). ModelRouter (src/models/router.ts) manages tiers (default/fast/complex/local) with fallback chains.
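
A minimal sketch of how a tiered router with fallback chains could sit on top of the chat(request) contract. The ChatRequest/ChatResponse shapes, the TieredRouter class, and its method names are assumptions for illustration only; the real definitions live in src/models/types.ts and src/models/router.ts and may differ.

```typescript
// Hypothetical request/response shapes (not copied from src/models/types.ts).
interface ChatRequest {
  messages: { role: 'user' | 'assistant' | 'system'; content: string }[];
}

interface ChatResponse {
  content: string;
}

interface ModelClient {
  chat(request: ChatRequest): Promise<ChatResponse>;
}

type Tier = 'default' | 'fast' | 'complex' | 'local';

// Illustrative router: each tier maps to an ordered fallback chain of clients.
class TieredRouter {
  private readonly _chains = new Map<Tier, ModelClient[]>();

  register(tier: Tier, chain: ModelClient[]): void {
    this._chains.set(tier, chain);
  }

  // Try each client in order; return the first successful response.
  async chat(tier: Tier, request: ChatRequest): Promise<ChatResponse> {
    const chain = this._chains.get(tier) ?? [];
    const failures: string[] = [];
    for (const client of chain) {
      try {
        return await client.chat(request);
      } catch (error) {
        failures.push(error instanceof Error ? error.message : 'Unknown error');
      }
    }
    const detail = failures.join('; ') || 'no clients registered';
    throw new Error(`All clients failed for tier "${tier}": ${detail}`);
  }
}
```

With a chain of [primary, backup] registered for a tier, a failure in primary falls through to backup; the error is only raised once every client in the chain has failed.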

ChannelAdapter (src/channels/types.ts): connect(), disconnect(), send(), onMessage(). Adapters: Telegram, Discord, Slack, WhatsApp, Matrix, WebChat. Adapters are registered in ChannelRegistry; each channel+sender pair gets its own session.
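
A rough sketch of the adapter contract and of deriving one session per channel+sender pair. Only the four method names come from this document; the parameter lists, the IncomingMessage shape, the sessionKey helper, and LoopbackAdapter are hypothetical and exist purely to make the example runnable.

```typescript
// Hypothetical message shape (not copied from src/channels/types.ts).
interface IncomingMessage {
  channel: string;
  senderId: string;
  text: string;
}

// Method names from this document; signatures are assumed.
interface ChannelAdapter {
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  send(recipientId: string, text: string): Promise<void>;
  onMessage(handler: (message: IncomingMessage) => void): void;
}

// One session per channel+sender pair: derive a stable key from both parts.
function sessionKey(message: IncomingMessage): string {
  return `${message.channel}:${message.senderId}`;
}

// In-memory adapter used only to exercise the interface in this sketch.
class LoopbackAdapter implements ChannelAdapter {
  readonly sent: string[] = [];
  private _handlers: ((message: IncomingMessage) => void)[] = [];
  private _connected = false;

  async connect(): Promise<void> {
    this._connected = true;
  }

  async disconnect(): Promise<void> {
    this._connected = false;
  }

  async send(recipientId: string, text: string): Promise<void> {
    if (!this._connected) throw new Error('LoopbackAdapter is not connected');
    this.sent.push(`${recipientId}: ${text}`);
  }

  onMessage(handler: (message: IncomingMessage) => void): void {
    this._handlers.push(handler);
  }

  // Helper for the sketch: simulate an inbound message from the channel.
  receive(message: IncomingMessage): void {
    for (const handler of this._handlers) handler(message);
  }
}
```

Two messages from the same sender on the same channel map to the same key, while the same sender ID on a different channel maps to a different one.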

Tool (src/tools/types.ts): { name, description, inputSchema, execute(args): Promise<ToolResult> }. Three patterns:

Static: export const fooTool: Tool = { ... } (no deps)
Factory: export function createFooTool(dep): Tool (single tool needing deps)
Multi-factory: export function createFooTools(dep): Tool[] (related tool set)
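
The three patterns above can be sketched as follows. The ToolResult shape, the loosely typed inputSchema, and every tool shown here (echoTool, createClockTool, createKvTools) are hypothetical examples, not tools that exist in src/tools/.

```typescript
// Hypothetical result shape (not copied from src/tools/types.ts).
interface ToolResult {
  success: boolean;
  output: string;
}

interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute(args: Record<string, unknown>): Promise<ToolResult>;
}

// Static: no dependencies, exported as a constant.
export const echoTool: Tool = {
  name: 'echo',
  description: 'Return the given text unchanged.',
  inputSchema: { type: 'object', properties: { text: { type: 'string' } } },
  async execute(args) {
    return { success: true, output: String(args['text'] ?? '') };
  },
};

// Factory: a single tool that closes over one injected dependency.
export function createClockTool(now: () => Date): Tool {
  return {
    name: 'clock',
    description: 'Return the current time as an ISO 8601 string.',
    inputSchema: { type: 'object', properties: {} },
    async execute() {
      return { success: true, output: now().toISOString() };
    },
  };
}

// Multi-factory: a related set of tools sharing one dependency.
export function createKvTools(store: Map<string, string>): Tool[] {
  const schema = { type: 'object', properties: { key: { type: 'string' } } };
  return [
    {
      name: 'kv_set',
      description: 'Store a value under a key.',
      inputSchema: schema,
      async execute(args) {
        store.set(String(args['key']), String(args['value']));
        return { success: true, output: 'ok' };
      },
    },
    {
      name: 'kv_get',
      description: 'Read the value stored under a key.',
      inputSchema: schema,
      async execute(args) {
        const key = String(args['key']);
        const value = store.get(key);
        return value === undefined
          ? { success: false, output: `No value for key "${key}"` }
          : { success: true, output: value };
      },
    },
  ];
}
```

Injecting dependencies through the factory argument (a clock, a store) keeps each tool easy to test with a fake.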

Registration chain: tool file → src/tools/builtin/index.ts → src/tools/index.ts → registered in src/daemon/index.ts.

Tool Policy (src/tools/policy.ts): Profiles (minimal/messaging/coding/full), groups (group:fs/runtime/web/memory), allow/deny with glob patterns.
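
A minimal sketch of allow/deny evaluation with group expansion and glob matching. The group contents, the tool names, the ToolPolicy shape, and the rule that deny takes precedence over allow are all assumptions for illustration; check src/tools/policy.ts for the actual semantics.

```typescript
// Hypothetical group membership (not copied from src/tools/policy.ts).
const GROUPS: Record<string, string[]> = {
  'group:fs': ['file_read', 'file_write'],
  'group:web': ['web_fetch', 'web_search'],
};

interface ToolPolicy {
  allow: string[];
  deny: string[];
}

// Convert a simple glob ("*" matches any run of characters) into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${escaped.replace(/\*/g, '.*')}$`);
}

// Replace "group:x" entries with their member tool names; keep other patterns as-is.
function expandGroups(patterns: string[]): string[] {
  return patterns.flatMap((pattern) => GROUPS[pattern] ?? [pattern]);
}

function isToolAllowed(toolName: string, policy: ToolPolicy): boolean {
  const matches = (patterns: string[]): boolean =>
    expandGroups(patterns).some((pattern) => globToRegExp(pattern).test(toolName));
  // Assumption: a deny match always overrides an allow match.
  return matches(policy.allow) && !matches(policy.deny);
}
```

For example, with allow set to ['group:fs', 'web_*'] and deny set to ['file_write'], this sketch permits file_read and web_fetch but rejects file_write and any tool that matches no allow pattern.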

NativeAgent (src/backends/native/agent.ts): Core agent loop with tool execution. AgentOrchestrator (src/backends/native/orchestrator.ts) wraps it with session management, compaction, memory extraction, and delegation to different model tiers.
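
For orientation, an agent loop with tool execution generally has the shape below: call the model, run any requested tools, feed the results back, and stop when the model answers without tool calls. This is a generic sketch, not NativeAgent's implementation; every type and function name here (Message, ModelTurn, runAgentLoop, the maxIterations guard) is hypothetical.

```typescript
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ModelTurn {
  text: string;
  toolCalls: ToolCall[];
}

type Message = { role: 'user' | 'assistant' | 'tool'; content: string };
type Model = (history: Message[]) => Promise<ModelTurn>;
type ToolFn = (args: Record<string, unknown>) => Promise<string>;

async function runAgentLoop(
  model: Model,
  tools: Map<string, ToolFn>,
  userInput: string,
  maxIterations = 8,
): Promise<string> {
  const history: Message[] = [{ role: 'user', content: userInput }];
  for (let i = 0; i < maxIterations; i++) {
    const turn = await model(history);
    history.push({ role: 'assistant', content: turn.text });
    // No tool calls means the model has produced its final answer.
    if (turn.toolCalls.length === 0) return turn.text;
    for (const call of turn.toolCalls) {
      const tool = tools.get(call.name);
      // Tool failures are reported back to the model rather than aborting the loop.
      const result = tool
        ? await tool(call.args).catch((error: unknown) =>
            `Error: ${error instanceof Error ? error.message : 'Unknown error'}`)
        : `Error: unknown tool "${call.name}"`;
      history.push({ role: 'tool', content: result });
    }
  }
  throw new Error(`Agent loop exceeded ${maxIterations} iterations`);
}
```

The iteration cap is a design choice in this sketch to stop a model that keeps requesting tools indefinitely.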

Other Key Systems

Config: YAML + Zod validation (src/config/schema.ts). Supports ${ENV_VAR} expansion.
Sessions: SQLite via SessionStore (src/session/store.ts). TTL-based pruning.
Memory: Namespace-based files + hybrid search (keyword + vector). Embedding providers configurable.
Hooks: Pattern-based confirmation engine (src/hooks/). Actions: confirm/log/silent.
Sandbox: Docker per-session containers (src/sandbox/manager.ts).
Automation: Cron scheduler, webhooks (HMAC), heartbeat monitor, Gmail watcher (src/automation/).
Gateway: WebSocket JSON-RPC + HTTP server + vanilla JS dashboard (src/gateway/). Lane queue, gateway lock, Tailscale Serve.
Pairing: DM pairing codes for unknown sender auth (src/channels/pairing.ts). Gateway handlers + TUI command.
System Prompt: Template search for SOUL.md/AGENTS.md/IDENTITY.md/USER.md/TOOLS.md (src/prompt/template.ts).
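
As an illustration of the ${ENV_VAR} expansion mentioned under Config, the sketch below expands placeholders in every string of an already-parsed config value. The function bodies, the expandConfig helper, and the choice to pass the environment in as a parameter are assumptions; the real logic in src/config/ may differ, including whether expansion happens before or after Zod validation.

```typescript
type Env = Record<string, string | undefined>;

// Expand ${ENV_VAR} placeholders in one string, failing loudly on unset variables.
function expandEnvVars(input: string, env: Env): string {
  return input.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match: string, envVar: string) => {
      const value = env[envVar];
      if (value === undefined) {
        throw new Error(`Environment variable ${envVar} is not set`);
      }
      return value;
    },
  );
}

// Recursively expand placeholders in every string of a parsed config value.
function expandConfig(value: unknown, env: Env): unknown {
  if (typeof value === 'string') return expandEnvVars(value, env);
  if (Array.isArray(value)) return value.map((item) => expandConfig(item, env));
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, expandConfig(item, env)]),
    );
  }
  // Numbers, booleans, and null pass through unchanged.
  return value;
}
```

In a daemon, env would typically be process.env; taking it as a parameter keeps the function testable without mutating the real environment.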

Code Style Guidelines

Imports

Use .js extensions for relative imports (e.g., import { foo } from './foo.js';); Node built-ins and packages take no extension (e.g., import { readFileSync } from 'fs';)
Use type keyword for type-only imports: import type { Config } from './schema.js';
Group imports: stdlib → third-party → local
Export all public APIs from index.ts

Naming Conventions

Classes/Interfaces/Types: PascalCase (AnthropicClient, NativeAgentConfig)
Functions: camelCase (loadConfig, expandEnvVars)
Private fields: camelCase with a leading underscore (_client, _model)
Files: camelCase for .ts files, PascalCase for .tsx files
Test files: *.test.ts suffix

TypeScript Configuration

tsconfig.json uses strict mode
Target: ES2022
Module resolution: NodeNext
Module format: NodeNext
Enables declaration, declarationMap, and sourceMap for all builds
Requires Node.js >=22

Error Handling

Throw errors with descriptive messages
Check for undefined/null with context, using a template literal (backticks) so the variable is interpolated: throw new Error(`Environment variable ${envVar} is not set`)
In stream handlers, wrap errors in try-catch and yield error events
Use instanceof Error for error type checking: error instanceof Error ? error.message : 'Unknown error'
Catch errors and convert to appropriate error types with context
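
The stream-handler rule above can be sketched as an async generator that converts upstream failures into error events instead of letting them propagate. The StreamEvent union and the safeStream name are hypothetical; Flynn's actual event types may differ.

```typescript
// Hypothetical event union for illustration.
type StreamEvent =
  | { type: 'text'; text: string }
  | { type: 'error'; message: string }
  | { type: 'done' };

// Wrap an upstream async iterable so failures surface as error events.
async function* safeStream(source: AsyncIterable<string>): AsyncGenerator<StreamEvent> {
  try {
    for await (const chunk of source) {
      yield { type: 'text', text: chunk };
    }
    yield { type: 'done' };
  } catch (error) {
    // Non-Error throwables (strings, objects) get a generic message.
    yield {
      type: 'error',
      message: error instanceof Error ? error.message : 'Unknown error',
    };
  }
}
```

Consumers then handle failures by switching on event.type rather than wrapping every for await loop in their own try-catch.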

React/Ink Patterns

Use useCallback for event handlers to prevent unnecessary re-renders
Use useRef for mutable values that don't trigger re-renders
Define interface props explicitly
Event handlers receive (inputChar: string, key: KeyInfo) as arguments

Project Structure

src/
├── config/          # YAML config + validation
├── models/          # Model providers + router
├── session/         # SQLite persistence
├── hooks/           # Confirmation engine
├── daemon/          # Lifecycle management
├── frontends/       # Telegram bot, TUI
└── backends/native/ # Agent implementation

Testing

Use Vitest for testing
Follow describe/it pattern
Use expect() for assertions
Test both success and failure cases
Clean up test resources (files, dirs) in afterEach hooks or at the end of the it block that created them
Mock file system and environment variables when needed
