# Agent-Oriented Project Diagram

This is a high-signal, agent-oriented view of Flynn's structure and execution flow. If you're new to the codebase, start here, then jump to the referenced files.

## System Map (Boundaries + Trust)

This is the fastest way to understand what runs where, and where the security boundaries sit.

```mermaid
flowchart LR
  subgraph EXT[External Systems]
    MP[Model Providers\nAnthropic/OpenAI/Gemini/...\nvia ModelClient]
    CH[Chat Networks\nTelegram/Discord/Slack/WhatsApp/...]
    WEB[Web\nsearch/fetch targets]
    GOOG[Google APIs\nGmail/Calendar/Docs/Drive/Tasks]
  end

  subgraph CFG[Config Sources]
    CD[config/default.yaml]
    CO[config/profiles/*.overlay.yaml]
    CG[Generated config profile\nconfig/paas.yaml]
    CE[ENV vars + expansion]
  end

  subgraph HOST["Host (Flynn Daemon)"]
    CA[ChannelAdapters]
    GW[Gateway\nHTTP + WS JSON-RPC + Web UI]
    MC[MetricsCollector\nrun state + cancel latency + reactions]
    RT["Routing\ncreateMessageRouter()"]
    PF[Preferences\n~/.local/share/flynn/preferences.json\nmodelTier + backendMode]
    SM[SessionManager\nSQLite]
    OR[AgentOrchestrator]
    NA["NativeAgent\n(tool loop)"]
    EB[Optional External Backends\nclaude_code/opencode/codex/gemini/pi_embedded]
    MR[ModelRouter]
    TP[ToolPolicy + ToolRegistry]
    TE[ToolExecutor\nhooks + enforcement + audit]
    MEM[Memory Store\nfiles + vector/keyword\nuser/profile + user/working]
    AU[Audit Logger\nredacted]
    HS[Hooks/Autonomy\nconfirm/log/silent]
    GA[Google OAuth Runtime\nsrc/google/oauth.ts]
    AS[Auth Store\n~/.config/flynn/auth.json]
    TF[Legacy Token Files\n~/.config/flynn/*-token.json]
  end

  subgraph SBX["Sandbox (per-session Docker)"]
    ST[Sandboxed Tools\nshell/process/fs writes]
    FS["Sandbox FS\nworkspace mount (scoped)"]
    NET["Sandbox Network\n(egress policy)"]
  end

  CD --> CG
  CO --> CG
  CE --> CG
  CG --> RT
  CE --> RT
  PF --> RT

  CH --> CA
  GW --> RT
  GW --> MC
  CA --> RT
  RT --> SM
  RT --> OR
  RT --> EB
  OR --> NA
  OR -->|session-start memory| MEM
  EB --> MP
  NA --> MR
  MR --> MP
  NA --> TP
  TP --> TE
  TE --> HS
  TE --> AU
  TE --> MEM

  TE -->|high-risk tools| ST
  ST --> FS
  ST --> NET

  TE -->|web tools| WEB
  TE -->|google tools| GA
  GA --> GOOG
  GA <--> AS
  GA --> TF
```

## Big Picture (Runtime Data Flow)

```text
Inbound Message (Telegram/Discord/Slack/WhatsApp/WebChat)
  |
  v
ChannelAdapter -> ChannelRegistry
  |                   |
  |                   v
  |              createMessageRouter()
  |                   |
  |                   +----> Runtime backend mode overrides
  |                          (/runtime status|activate pi|deactivate pi|use config)
  |                          (TUI `/runtime` is forwarded via gateway/daemon command path)
  |                   |
  |                   v
  |              SessionManager
  |                   |
  |                   v
  |              AgentOrchestrator
  |                   |
  |                   v
  |              NativeAgent
  |                   |
  |              ModelRouter.chat()
  |                   |
  |                   v
  |              ModelClient
  |
  +----> (optional, non-tool turns) ExternalBackend (claude_code/opencode/codex/gemini/pi_embedded)
  |
  +----> (optional) PairingManager gate for unknown senders

Tool Calls (inside NativeAgent loop)
  NativeAgent -> ToolRegistry (policy-filtered) -> ToolExecutor
      |                                               |
      |                                               v
      |                                     HookEngine + autonomy
      |                                               |
      |                                               v
      |                                         Tool.execute()
      |                                               |
      |                                               v
      +---------------------------> AuditLogger (redacted)

Subagent sessions (multi-turn child agents)
  parent AgentOrchestrator -> subagent.* tools -> SubagentManager (TTL cleanup + queue/budget controls)
  SubagentManager -> child AgentOrchestrator (session namespace: subagent::, trace_id)
  SubagentManager -> AuditLogger (subagent.lifecycle + subagent.turn events)
  child AgentOrchestrator -> NativeAgent/tool loop (same policy engine, recursion tools removed)

Session start (when `memory.user_namespace` is set)
  AgentOrchestrator -> MemoryStore (user/profile + user/working)
  AgentOrchestrator -> System prompt (session context injection)

Outbound Reply -> ChannelAdapter.send() (text + optional attachments)
```

Gateway streaming UX signals:

- WebSocket `agent.send` emits `run_state` lifecycle events (`start`, `cancel_requested`, `cancelled`, `complete`, `error`) for UI/state rendering.
- Routing applies reaction rules with deterministic priority/cooldown (and recursion guard) before intent routing.
- Companion nodes re-register `node.*` capabilities after reconnect; runtime clients can auto-reconnect, optionally replay cached node state (`register/status/location/push`), and surface connection events.
- `flynn companion --export-bootstrap ` can emit a resolved companion bootstrap manifest (gateway/node/runtime contract) for desktop/mobile packaging flows without opening a runtime connection.
- `flynn companion --export-release-bundle ` can emit a distributable shell bundle (bootstrap JSON + launcher + README + SHA-256 checksums) for desktop/mobile packaging pipelines.
- `flynn companion --export-release-bundle ... --signing-key ` can additionally emit `CHECKSUMS.sha256.sig` for signed artifact verification pipelines.
- `flynn companion --verify-release-bundle ` can validate bundle checksums and optional signatures before installation or rollout.
- `flynn companion --export-shell-template ` can emit platform starter shell templates (macOS/iOS/Android native scaffold files + bootstrap JSON), including iOS/Android runtime skeletons for registration, heartbeat/status, location/push updates, and handoff.
- `pnpm companion:bundle -- --output ...` runs a build-and-verify release pipeline for repeatable companion artifact generation.
- `pnpm companion:reference-apps -- --output apps/companion` regenerates in-repo macOS/iOS/Android reference app starter surfaces plus a runnable `macos-app` Swift Package menu-bar scaffold.
- `pnpm companion:reference-apps:check` verifies committed `apps/companion` artifacts match generator output.
- `.github/workflows/companion-release-bundle.yml` provides CI artifact generation for companion release bundles using the same build-and-verify pipeline.
- `.github/workflows/companion-reference-apps-check.yml` enforces reference-app generator sync in CI.
- `flynn companion` can bootstrap status/location/push metadata on connect (`node.status.set` + optional `node.location.set`/`node.push_token.set`) so thin companion shells can register operational context in one launch.
- `pnpm audit:phase0-baseline:live` captures anonymized channel-origin live run/reaction baseline artifacts from real audit logs.
- `pnpm audit:phase0-baseline:live:pi` and `pnpm audit:phase0-baseline:live:native` capture backend-scoped channel windows using `backend.route` timelines.
- `pnpm audit:phase0-baseline:live:gateway` captures gateway-origin baseline windows by auto-selecting the latest cancel/cancelled session window (or use `scripts/capture-phase0-live-baseline.ts --source gateway --since ... --until ...` for explicit windows).
- `pnpm audit:phase0-baseline:live:refresh` runs channel + gateway + backend-scoped (`pi_embedded` and `native`) capture commands in one cadence step.
- `pnpm audit:phase0-baseline:live:drift` evaluates backend-scoped artifact freshness/drift gates and writes `docs/plans/artifacts/phase0_baseline_live_backend_drift_.md/.json`; `pnpm audit:phase0-baseline:live:refresh:drift` runs capture + drift checks in one cadence step.
- `pnpm audit:phase0-baseline:live:refresh:drift:rolling` runs the same full refresh+drift flow with a shared UTC timestamp tag (`YYYY-MM-DD-HHMMSS`) so each cadence run keeps distinct backend/drift artifacts for immediate baseline-vs-prior comparisons.
- `pnpm audit:phase0-baseline:live:prune` provides dry-run retention planning for rolling-tag artifacts; `pnpm audit:phase0-baseline:live:prune:apply` deletes older rolling snapshots while keeping the newest tags per artifact family.
- `pnpm audit:phase0-baseline:live:refresh:drift:rolling:prune` chains rolling refresh+drift with retention apply for one-command scheduled cadence runs (`KEEP_PER_FAMILY` controls retention depth).
- `audit:phase0-baseline:live*` scripts are cadence-safe by default (UTC-date tags auto-generated unless explicitly overridden).
- Canvas artifacts are persisted by the gateway so session UI surfaces can recover after daemon restarts.
- TTS synthesis uses an ordered provider chain with health cooldown tracking; if all providers fail, replies degrade to text-only without dropping the response.
- Talk mode accepts spoken/text `stop`/`cancel` while active and maps it onto the same `/stop` run-control cancellation path used for text sessions.
- Setup/onboarding now applies a first-run Personal Assistant Mode preset, prints model/channel/memory/automation live readiness checks, and emits a guided first-success task path after config save.

Key files:

- Routing + per-session agent creation: `src/daemon/routing.ts`
- Subagent session manager (child orchestrators): `src/backends/native/subagents.ts`
- Runtime preference persistence (`modelTier`, `backendMode`): `src/preferences.ts`
- Orchestration: `src/backends/native/orchestrator.ts`
- Tool loop: `src/backends/native/agent.ts`
- External backend adapters: `src/backends/external.ts`, `src/backends/piEmbedded.ts`
- Model routing: `src/models/router.ts`
- Tool policy + execution: `src/tools/policy.ts`, `src/tools/executor.ts`
- Canary backend telemetry summarization (offline evaluation): `src/audit/backendCanarySummary.ts`, `scripts/summarize-backend-canary.ts`
- Phase 0 baseline telemetry summarization: `src/audit/phase0BaselineSummary.ts`, `scripts/summarize-phase0-baseline.ts`

## Component Graph (Agent-Safety Boundary)

```text
                      +---------------------------+
                      |          Config           |
                      |    (Zod schema + YAML)    |
                      |   src/config/schema.ts    |
                      +-------------+-------------+
                                    |
                                    v
+------------------------+   +-------------+   +------------------------+
| SkillRegistry          |   | ToolPolicy  |   | HookEngine             |
| src/skills/*           |   | src/tools/* |   | src/hooks/*            |
+-----------+------------+   +------+------+   +-----------+------------+
            |                       |                      |
            | (system prompt)       | (allow/deny)         | (confirm/log/silent)
            v                       v                      v
+------------------------+   +-------------+   +------------------------+
| System Prompt          |   | ToolRegistry|   | ToolExecutor           |
| src/daemon/services.ts |   | src/tools/* |   | src/tools/executor.ts  |
+-----------+------------+   +------+------+   +-----------+------------+
            |                       |                      |
            v                       |                      |
+------------------------+          |                      v
| AgentOrchestrator      |          |               +-------------+
| src/backends/*         |          +-------------> | AuditLogger |
+-----------+------------+                          | src/audit/* |
            |                                       +-------------+
            v
+------------------------+
| NativeAgent            |
| src/backends/*         |
+-----------+------------+
            |
            v
+------------------------+
| ModelRouter            |
| src/models/*           |
+------------------------+
```

## Skills + Capabilities (What Gets Enforced)

Skills are local directories with:

- `SKILL.md` (instructions injected into the system prompt)
- `manifest.json` (metadata + optional `permissions`)

### Skill permissions enforcement points

- Tool availability: `ToolPolicy.resolveAllowedNames()` intersects allowed tools with `manifest.json.permissions`.
- Tool execution (defense in depth): `ToolExecutor.execute()` enforces:
  - fs allowlists (`permissions.fs.read` / `permissions.fs.write`)
  - net allowlists (best-effort for `web.fetch`)
  - secret scopes (tools declare `requiredSecretScopes`, skills allow `permissions.secrets`)
  - injection guard when untrusted content is present

Important default:

- If a request is routed into a skill context but the skill has no `permissions` manifest, **tool access is denied**.

Key files:

- Skill manifest types: `src/skills/types.ts`
- Loader validation: `src/skills/loader.ts`
- Policy intersection: `src/tools/policy.ts`
- Executor enforcement: `src/tools/executor.ts`

## Sandbox Execution (High-Risk Tools)

Flynn supports per-session Docker sandboxes.

Where sandboxing is applied today:

- `shell.exec` and `process.start` can be replaced with sandboxed implementations.
- Replacement is wired in `src/daemon/routing.ts` by cloning the ToolRegistry and swapping the tool implementations.
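
The clone-and-swap wiring described above can be pictured with a small TypeScript sketch. Everything here except the tool names `shell.exec` and `process.start` is an assumption made for illustration: the `Tool` shape, the `clone()`/`register()`/`get()` methods, `makeSandboxedTool`, and `withSandboxedTools` are hypothetical and may not match the real code in `src/daemon/routing.ts` or `src/sandbox/tools.ts`.

```typescript
// Hypothetical minimal types; Flynn's real ToolRegistry API may differ.
interface Tool {
  name: string;
  execute(args: Record<string, unknown>): Promise<string>;
}

class ToolRegistry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  get(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  // Shallow copy: the clone shares tool objects but owns its name-to-tool map,
  // so swapping an entry in the clone leaves the base registry untouched.
  clone(): ToolRegistry {
    const copy = new ToolRegistry();
    this.tools.forEach((tool) => copy.register(tool));
    return copy;
  }
}

// The high-risk tools this section says can be replaced.
const SANDBOXED_TOOL_NAMES = ["shell.exec", "process.start"] as const;

// Assumed factory: a stand-in for a wrapper that would run the tool in the
// session's Docker sandbox. Here it only echoes what it would run.
function makeSandboxedTool(name: string, sessionId: string): Tool {
  return {
    name,
    execute: async (args) =>
      `[sandbox:${sessionId}] ${name} ${JSON.stringify(args)}`,
  };
}

// Returns a per-session registry in which high-risk tools are sandboxed.
function withSandboxedTools(base: ToolRegistry, sessionId: string): ToolRegistry {
  const scoped = base.clone();
  for (const name of SANDBOXED_TOOL_NAMES) {
    // Only swap tools the base registry actually exposes.
    if (scoped.get(name)) scoped.register(makeSandboxedTool(name, sessionId));
  }
  return scoped;
}
```

Cloning before swapping keeps the shared registry immutable from the point of view of each session, so one session's sandbox binding cannot leak into another session's tool set.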

Skill context default:

- High-risk tool execution defaults to `sandbox` in skill context (when available).
- A skill can opt into host execution only by setting `permissions.execution_environment: "host"`.

Key files:

- Sandbox lifecycle: `src/sandbox/manager.ts`, `src/sandbox/docker.ts`
- Sandboxed tool wrappers: `src/sandbox/tools.ts`
- Wiring: `src/daemon/routing.ts`

## Prompt Injection Hardening (Practical)

Flynn treats content provenance as part of the control boundary:

- `web.fetch`, `web.search`, and `browser.content` outputs are treated as untrusted "fetched_content".
- Tool results are wrapped in provenance markers inside the tool loop.
- Once untrusted content is seen, ToolExecutor applies stricter gating (blocks obvious injection patterns for high-risk tools).
- Browser workflow tools add execution guardrails in the tool layer: `allowed_domains`, explicit high-risk confirmations, bounded retry policies, and step-budget enforcement.

Key files:

- Provenance wrapping: `src/backends/native/agent.ts`
- Tool-call guard: `src/tools/executor.ts`
- System prompt safety guidance: `src/daemon/services.ts`

## Mermaid (For Fast Visual Scanning)

If your renderer supports Mermaid, this is the same information as a sequence diagram.

```mermaid
sequenceDiagram
  autonumber
  participant U as User
  participant CA as ChannelAdapter
  participant CR as ChannelRegistry
  participant SM as SessionManager
  participant AR as AgentOrchestrator
  participant NA as NativeAgent
  participant MR as ModelRouter
  participant MC as ModelClient
  participant FC as Fallback Client
  participant TP as ToolPolicy/Registry
  participant TE as ToolExecutor
  participant HE as HookEngine
  participant AL as AuditLogger
  participant GA as Google OAuth Runtime
  participant AS as Auth Store
  participant TF as Token Files
  participant GP as Google APIs

  U->>CA: message
  CA->>CR: onMessage(InboundMessage)
  CR->>SM: getSession(channel, sender)
  SM-->>CR: Session
  CR->>AR: getOrCreateAgent(session + routing)
  AR->>NA: process(userMessage)
  NA->>MR: chat(messages + tools)
  MR->>MC: provider request

  alt primary model success
    MC-->>MR: response (content or tool_calls)
  else primary model error
    Note over MR: retry + tier/global fallback\n(skip duplicate clients)
    MR->>FC: fallback provider request
    FC-->>MR: fallback response
  end

  MR-->>NA: ChatResponse

  alt model requests tool use
    NA->>TP: filtered tool list (skill + policy)
    NA->>TE: execute(tool, args, context)
    TE->>HE: confirm/log/silent (autonomy)
    HE-->>TE: approved/denied
    alt google.* tool execution
      TE->>GA: createGoogleOAuth2Client(service)
      GA->>AS: load stored token
      alt auth store token missing
        GA->>TF: read legacy token file
        TF-->>GA: token
        GA->>AS: migrate token record
      end
      GA->>GP: API request with refreshed OAuth creds
    end
    TE->>AL: audit (redacted)
    TE-->>NA: ToolResult
    NA->>MR: chat(tool_result blocks)
  end

  NA-->>AR: assistant response
  AR-->>CR: OutboundMessage
  CR-->>CA: send()
  CA-->>U: reply
```
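
The `alt auth store token missing` branch in the sequence diagram can be read as a load-then-migrate lookup. The TypeScript sketch below shows that control flow only, under stated assumptions: `TokenRecord`, `TokenReader`, `TokenStore`, `InMemoryTokenStore`, and `loadGoogleToken` are hypothetical names, and the stores are in-memory stand-ins for `~/.config/flynn/auth.json` and the legacy `*-token.json` files. The real logic in `src/google/oauth.ts` may differ.

```typescript
// Hypothetical token shape; the real record fields are not documented here.
interface TokenRecord {
  accessToken: string;
  refreshToken: string;
}

// Read-only token source, keyed by Google service name.
interface TokenReader {
  read(service: string): TokenRecord | undefined;
}

// A store that can also persist records (stands in for the auth store).
interface TokenStore extends TokenReader {
  write(service: string, record: TokenRecord): void;
}

// In-memory stand-in so the sketch runs without touching the filesystem.
class InMemoryTokenStore implements TokenStore {
  private readonly records = new Map<string, TokenRecord>();

  read(service: string): TokenRecord | undefined {
    return this.records.get(service);
  }

  write(service: string, record: TokenRecord): void {
    this.records.set(service, record);
  }
}

// Prefer the auth store; fall back to the legacy source and migrate on a hit.
function loadGoogleToken(
  service: string,
  authStore: TokenStore,
  legacyFiles: TokenReader,
): TokenRecord | undefined {
  const stored = authStore.read(service);
  if (stored) return stored;

  const legacy = legacyFiles.read(service);
  if (!legacy) return undefined; // no token anywhere: caller must re-authorize

  authStore.write(service, legacy); // migrate so later lookups hit the auth store
  return legacy;
}
```

After the first successful lookup for a service, later calls are served from the auth store, so the legacy source is only consulted once per service in this sketch.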