docs(audit): add report, google auth runbook, and agent repo map

William Valentin
2026-02-23 17:12:41 -08:00
parent 092a9baeae
commit d8188b5425
7 changed files with 344 additions and 1 deletions
+47
@@ -51,6 +51,10 @@ make install
cp config/default.yaml ~/.config/flynn/config.yaml
# Edit config with your API keys and Telegram bot token
# Optional: regenerate/check derived config profiles
pnpm config:profiles:generate
pnpm config:profiles:check
# Run
flynn start
@@ -76,6 +80,10 @@ Flynn provides a full CLI via the `flynn` binary (or `npx tsx src/cli/index.ts`
| `flynn onboard` | Guided onboarding alias for setup wizard |
| `flynn gmail-auth` | Authenticate with Gmail via OAuth2 |
| `flynn gcal-auth` | Authenticate with Google Calendar via OAuth2 |
| `flynn gdocs-auth` | Authenticate with Google Docs via OAuth2 |
| `flynn gdrive-auth` | Authenticate with Google Drive via OAuth2 |
| `flynn gtasks-auth` | Authenticate with Google Tasks via OAuth2 |
| `flynn google-auth --service <name>` | Unified Google OAuth entrypoint (`gmail`, `gcal`, `gdocs`, `gdrive`, `gtasks`) |
| `flynn gemini-auth` | Store a Gemini API key in `~/.config/flynn/auth.json` |
| `flynn skills` | List/install/manage skills |
| `flynn companion` | Run a minimal companion node client against the gateway |
@@ -1155,6 +1163,7 @@ Supported delivery modes:
2. Create OAuth2 credentials (Desktop application type) and download the JSON file
3. Run `flynn gmail-auth` to complete the OAuth2 flow and store the refresh token
- Requests Gmail scopes for settings + read access (`gmail.settings.basic` + `gmail.readonly`)
- Flynn stores service tokens in `~/.config/flynn/auth.json` and keeps per-service token files for compatibility
For Pub/Sub delivery (push/pull), also enable the Pub/Sub API and create:
- A topic (e.g. `projects/your-project/topics/gmail-push`)
@@ -1241,6 +1250,7 @@ Query Google Calendar events from within conversations. Provides three tools: `c
1. A Google Cloud project with the **Calendar API** enabled
2. OAuth2 credentials (Desktop application type) — the same credentials file used for Gmail works
3. Run `flynn gcal-auth` to complete the OAuth2 flow and store the refresh token
- The token is also persisted in `~/.config/flynn/auth.json` so that runtime refresh handling is shared across services
### Configuration
@@ -1262,6 +1272,43 @@ automation:
| `token_file` | no | Path to stored OAuth2 refresh token (default: `~/.config/flynn/gcal-token.json`) |
| `calendar_ids` | no | Calendar IDs available for queries (default: `[primary]`) |
For full local operation guidance (token acquisition, storage, migration, refresh/renewal, and service scopes), see [Google OAuth Runbook](docs/operations/GOOGLE_AUTH.md).
## Google Docs, Drive, and Tasks Tools
Flynn also supports Google Docs, Drive, and Tasks tools:
- Docs: `docs.list`, `docs.search`, `docs.read`
- Drive: `drive.list`, `drive.search`, `drive.read`
- Tasks: `tasks.lists`, `tasks.list`
Enable in config:
```yaml
automation:
gdocs:
enabled: true
credentials_file: ~/.config/flynn/gmail-credentials.json
token_file: ~/.config/flynn/gdocs-token.json
gdrive:
enabled: true
credentials_file: ~/.config/flynn/gmail-credentials.json
token_file: ~/.config/flynn/gdrive-token.json
gtasks:
enabled: true
credentials_file: ~/.config/flynn/gmail-credentials.json
token_file: ~/.config/flynn/gtasks-token.json
```
Authenticate with:
```bash
flynn gdocs-auth
flynn gdrive-auth
flynn gtasks-auth
# or: flynn google-auth --service gdocs|gdrive|gtasks
```
## Vector Memory Search
The memory system supports hybrid search combining keyword matching with semantic vector similarity. When embeddings are enabled, `memory.search` uses both approaches and merges results with configurable weighting.
+103
@@ -0,0 +1,103 @@
# Flynn Codebase Audit + Improvement Report
Date: 2026-02-24
Branch: `feature/full-audit-hardening-and-config-consolidation`
## Executive Summary
I audited core Flynn wiring across config -> daemon -> router/providers -> automation/tools -> CLI/docs, then implemented high-safety fixes for auth hardening, router correctness, provider alignment, and config consolidation.
High-impact outcomes:
- Google OAuth runtime is now centralized with store-first token loading, legacy token migration, and refresh persistence.
- Model router fallback behavior now matches retry/fallback intent (no duplicate fallback attempts, retry policy applied on fallback paths).
- OpenAI OAuth mode now fails fast when tools are requested (prevents silent non-executable tool output).
- PaaS config is now generated from canonical `default.yaml` + overlay to prevent template drift.
- `flynn doctor` now validates all Google automation services, not just Gmail.
Breaking behavior changes introduced:
- `models.fallback_chain` schema default changed from `['anthropic']` to `[]` (avoids invalid fallback entries by default).
- OpenAI OAuth requests with tools now throw explicit errors instead of returning warning text.
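The fail-fast behavior can be illustrated with a minimal sketch. The names `ChatRequest`, `AuthMode`, `UnsupportedToolsError`, and `assertToolsSupported` are hypothetical and chosen for illustration only; the actual code in `src/models/openai.ts` may be structured differently.

```typescript
// Hypothetical request shape; Flynn's real types may differ.
interface ChatRequest {
  messages: { role: string; content: string }[];
  tools?: { name: string }[];
}

type AuthMode = "api_key" | "oauth";

class UnsupportedToolsError extends Error {
  constructor(mode: AuthMode) {
    super(`OpenAI ${mode} mode cannot execute tools; use a fallback provider or change the auth mode`);
    this.name = "UnsupportedToolsError";
  }
}

// Throw before any network call so a router can fall back to another client
// instead of returning warning text that looks like a normal response.
function assertToolsSupported(mode: AuthMode, request: ChatRequest): void {
  const hasTools = (request.tools?.length ?? 0) > 0;
  if (mode === "oauth" && hasTools) {
    throw new UnsupportedToolsError(mode);
  }
}
```

Throwing a named error, rather than returning text, lets callers distinguish a configuration problem from a model response.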
## Findings (With File Pointers)
1. `HIGH` Google token persistence path caused runtime failures in restricted environments.
- Cause: migration/store writes to `~/.config/flynn/auth.json` could fail and abort tool execution.
- Evidence: [src/google/oauth.ts](src/google/oauth.ts:111), [src/auth/google.ts](src/auth/google.ts:108)
- Fix: auth store writes now tolerate known filesystem permission errors and preserve token-file compatibility.
2. `HIGH` OpenAI OAuth mode accepted tool-bearing requests without executable tool support.
- Cause: OAuth Codex backend path did not support Flynn tool execution semantics.
- Evidence: [src/models/openai.ts](src/models/openai.ts:236)
- Fix: explicit throw when tools are present, enabling router fallback or config correction.
3. `HIGH` Router fallback execution did not fully match retry policy and could repeat the same failing client.
- Cause: retry policy only wrapped primary path; fallback clients could be retried inconsistently; duplicate fallbacks possible.
- Evidence: [src/models/router.ts](src/models/router.ts:90)
- Fix: attempted-client tracking and retry wrapping now apply to tier/global fallback chat paths and streaming fallback paths.
4. `MEDIUM` `flynn doctor` had incomplete feature wiring checks for Google services.
- Cause: only Gmail automation health was validated.
- Evidence: [src/cli/doctor.ts](src/cli/doctor.ts:663), [src/cli/doctor.ts](src/cli/doctor.ts:723)
- Fix: added service checks for Calendar/Docs/Drive/Tasks with auth-store and token-file detection.
5. `MEDIUM` Config profile overlap risk (manual `config/paas.yaml` drift from canonical defaults).
- Cause: duplicated full config template with independent edits.
- Evidence: [config/profiles/paas.overlay.yaml](config/profiles/paas.overlay.yaml:1), [scripts/generate-config-profiles.mjs](scripts/generate-config-profiles.mjs:10)
- Fix: canonical+overlay generation model, profile drift check, and sync test.
6. `MEDIUM` Default fallback chain schema value conflicted with router semantics.
- Cause: default `['anthropic']` is not a tier/local-provider key in current router semantics.
- Evidence: [src/config/schema.ts](src/config/schema.ts:181)
- Fix: schema default set to `[]` to avoid spurious invalid fallback entries.
7. `LOW` Provider capability type list lagged configured providers.
- Cause: `ModelProvider` union omitted `vercel`, `minimax`, `moonshot`, `synthetic`.
- Evidence: [src/models/capabilities.ts](src/models/capabilities.ts:8)
- Fix: union updated and test coverage expanded.
8. `LOW` Audit logger path expansion bug for `~`.
- Cause: logger configured rotator with expanded path but write stream used raw path.
- Evidence: [src/audit/logger.ts](src/audit/logger.ts:44), [src/audit/logger.ts](src/audit/logger.ts:57)
- Fix: normalized path now used consistently by logger and rotator.
9. `INFO` Log-pattern analysis could not be completed from repository artifacts.
- Cause: no runtime `.log` / audit JSONL artifacts present in workspace snapshot.
- Evidence: repository scan returned no log files under repo root.
- Mitigation: recommendations added below for repeatable log collection/analysis workflow.
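The fix described in finding 3 (attempted-client tracking plus retry wrapping on fallback paths) can be sketched roughly as follows. `ModelClient`, `withRetry`, and `chatWithFallback` are hypothetical names, and the fixed attempt count stands in for whatever retry policy the router actually uses.

```typescript
// Hypothetical client shape; the real router in src/models/router.ts may differ.
interface ModelClient {
  id: string;
  chat(prompt: string): Promise<string>;
}

// Retry one client up to `maxAttempts` times, then rethrow its last error.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Try the primary client, then each fallback, applying the same retry wrapper
// to every path and skipping any client that was already attempted.
async function chatWithFallback(
  primary: ModelClient,
  fallbacks: ModelClient[],
  prompt: string,
  maxAttempts = 2,
): Promise<string> {
  const attempted = new Set<string>();
  let lastError: unknown = new Error("no model clients configured");
  for (const client of [primary, ...fallbacks]) {
    if (attempted.has(client.id)) continue; // never repeat a failing client
    attempted.add(client.id);
    try {
      return await withRetry(() => client.chat(prompt), maxAttempts);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

In this sketch a client that appears both as primary and in the fallback list is tried only once (with its retries), which is the duplicate-fallback case the finding describes.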
## Recommended Changes (Prioritized)
1. `P0` Keep OpenAI OAuth tool rejection behavior and enforce documented fallback guidance.
2. `P0` Keep Google auth centralized; avoid introducing new per-tool OAuth duplication.
3. `P1` Add a shared Google auth CLI factory to remove duplicated `*-auth` command flow code.
4. `P1` Add optional `XDG_CONFIG_HOME`/override support for auth store paths for containerized/sandboxed environments.
5. `P1` Add periodic log export + analyzer command (error-rate, latency, provider fallback frequency) so reliability trends are measurable from CI/dev snapshots.
6. `P2` Introduce a provider capability matrix module consumed by router/doctor/docs from one source of truth.
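Recommendation 4 is not implemented in this branch. One possible shape for it, as a sketch only: resolve the auth store path from an explicit override first, then `XDG_CONFIG_HOME`, then the current default. The `FLYNN_AUTH_STORE` variable and the `resolveAuthStorePath` function are invented for this example.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical resolution order: explicit override, then XDG_CONFIG_HOME, then ~/.config.
// FLYNN_AUTH_STORE is an invented variable name, not an existing Flynn setting.
function resolveAuthStorePath(
  env: Record<string, string | undefined> = process.env,
): string {
  if (env.FLYNN_AUTH_STORE) return env.FLYNN_AUTH_STORE;
  const configHome = env.XDG_CONFIG_HOME || path.join(os.homedir(), ".config");
  return path.join(configHome, "flynn", "auth.json");
}
```

Passing the environment as a parameter keeps the function easy to test in sandboxed or containerized setups where the home directory is not writable.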
## Implemented Changes (Diff Summary)
Commits in this branch:
- `5b95eb1` `fix(audit): expand tilde paths for audit log output`
- `076379b` `refactor(config): generate paas profile from default overlay`
- `00b2d64` `feat(google-auth): centralize oauth token store and service checks`
- `092a9ba` `fix(router): align fallback semantics and oauth provider behavior`
Notable file groups:
- Audit hardening: `src/audit/logger.ts`, `src/audit/logger.test.ts`
- Config consolidation: `config/profiles/paas.overlay.yaml`, `config/paas.yaml`, `scripts/generate-config-profiles.mjs`, `src/config/profileTemplates.test.ts`, `docs/deployment/PAAS.md`, `package.json`
- Google auth hardening: `src/auth/google.ts`, `src/google/oauth.ts`, Google tool modules, Gmail watcher, Google auth CLI commands, `src/cli/doctor.ts`
- Router/provider correctness: `src/models/router.ts`, `src/models/openai.ts`, `src/config/schema.ts`, `src/models/capabilities.ts`
- Documentation additions: Google OAuth runbook and agent-facing repo map docs
Validation executed:
- Focused suites (420 tests) across changed modules passed.
- `pnpm lint` passed (warnings only, 0 errors).
- `pnpm typecheck` passed.
- `pnpm config:profiles:check` passed.
## Remaining TODOs / Risks
- No runtime log corpus was available for empirical recurring-error/perf bottleneck analysis.
- Google auth CLI commands still contain duplicated flow logic across service-specific command files.
- Auth store remains plaintext on disk (permissions are set, but no at-rest encryption).
- Provider capability behavior is still partially split across provider clients + capability utility; further normalization is recommended.
+2
@@ -6,6 +6,7 @@ This documentation is written to be useful to both humans and AI agents. If you
1. Architecture overview (agent-oriented)
- `docs/architecture/AGENT_DIAGRAM.md`
- `docs/architecture/AGENT_REPO_MAP.md`
- `docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md`
- `docs/architecture/TYPESCRIPT_MAP.md`
- `docs/architecture/SYMBOL_INDEX.md`
@@ -21,6 +22,7 @@ This documentation is written to be useful to both humans and AI agents. If you
- `docs/performance/TUNING.md`
6. Operations runbooks
- `docs/operations/OPERATOR_PACK.md`
- `docs/operations/GOOGLE_AUTH.md`
## Quick Map (One Diagram)
+85
@@ -0,0 +1,85 @@
# Agent Repo Map
This file is an agent-facing operational map of the Flynn repo: entrypoints, key modules, conventions, config schema anchors, and debug workflows.
## Entry Points
- Daemon start: `src/cli/index.ts` -> `src/daemon/index.ts#startDaemon`
- One-shot send: `src/cli/send.ts`
- TUI:
- Minimal: `src/frontends/tui/minimal.ts`
- Fullscreen Ink app: `src/frontends/tui/components/App.tsx`
- Gateway server: `src/gateway/server.ts`
## Core Message Flow
1. Channel ingress/gateway request -> `src/daemon/routing.ts`
2. Session/orchestration -> `src/backends/native/orchestrator.ts`
3. Agent loop/tool execution -> `src/backends/native/agent.ts`
4. Model dispatch -> `src/models/router.ts` + provider clients in `src/models/*`
## High-Value Modules
- Config schema + defaults:
- `src/config/schema.ts`
- `config/default.yaml`
- `config/profiles/*.overlay.yaml` (profile overlays)
- Model wiring:
- `src/daemon/models.ts`
- `src/models/router.ts`
- `src/models/openai.ts`, `src/models/anthropic.ts`, `src/models/gemini.ts`, `src/models/bedrock.ts`, `src/models/github.ts`, `src/models/local/*`
- Tool registration chain:
- Tool impl: `src/tools/builtin/*`
- Export: `src/tools/builtin/index.ts`
- Registry/bootstrap: `src/daemon/index.ts`, `src/daemon/tools.ts`
- Channel adapters:
- `src/channels/*/adapter.ts`
- Registration: `src/daemon/channels.ts`
- Auth stores:
- OpenAI/Anthropic/etc: `src/auth/*`
- Google services: `src/auth/google.ts` + `src/google/oauth.ts`
- Observability:
- Audit logger: `src/audit/*`
- Doctor checks: `src/cli/doctor.ts`
## Conventions
- TypeScript strict mode, NodeNext modules.
- Local imports use `.js` extension.
- Keep provider-specific branches localized in provider clients; prefer shared helpers for cross-service auth logic.
- Tests use Vitest (`*.test.ts`).
## Config Schema Anchors
- Model tiers and fallback chain: `models.*` in `src/config/schema.ts`
- Tool policy and profiles: `tools.*`
- Automation integrations:
- Gmail watcher: `automation.gmail`
- Google tools: `automation.gcal`, `automation.gdocs`, `automation.gdrive`, `automation.gtasks`
- Server/gateway behavior: `server.*`
## Run and Debug
```bash
pnpm dev # daemon watch mode
pnpm tui # minimal terminal UI
pnpm tui:fs # fullscreen UI
pnpm test:run # full test run once
pnpm lint
pnpm typecheck
```
Focused debug commands:
```bash
pnpm test:run src/models/router.test.ts src/models/openai.oauth.test.ts
pnpm test:run src/tools/builtin/gmail.test.ts src/automation/gmail.test.ts
pnpm test:run src/cli/doctor.test.ts
```
Config profile sync:
```bash
pnpm config:profiles:generate
pnpm config:profiles:check
```
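For orientation, the canonical-plus-overlay model behind these two commands can be sketched as a recursive merge followed by a comparison. This is an assumption about the general approach, not a copy of `scripts/generate-config-profiles.mjs`; `applyOverlay` and `hasDrift` are illustrative names, and the sketch compares parsed objects rather than YAML text.

```typescript
type ConfigValue =
  | string
  | number
  | boolean
  | null
  | ConfigValue[]
  | { [key: string]: ConfigValue };
type ConfigObject = { [key: string]: ConfigValue };

function isPlainObject(value: ConfigValue | undefined): value is ConfigObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Merge `overlay` onto `base` without mutating either input.
// Nested objects merge key by key; scalars and arrays are replaced wholesale.
function applyOverlay(base: ConfigObject, overlay: ConfigObject): ConfigObject {
  const result: ConfigObject = { ...base };
  for (const key of Object.keys(overlay)) {
    const existing = result[key];
    const incoming = overlay[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(incoming)
        ? applyOverlay(existing, incoming)
        : incoming;
  }
  return result;
}

// Drift check: the committed profile must equal a freshly generated one.
// JSON.stringify is key-order sensitive, so this assumes generated key order is stable.
function hasDrift(committed: ConfigObject, base: ConfigObject, overlay: ConfigObject): boolean {
  return JSON.stringify(committed) !== JSON.stringify(applyOverlay(base, overlay));
}
```

Keeping the overlay small means a profile only records what differs from `config/default.yaml`, which is what makes drift detectable.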
+68
@@ -0,0 +1,68 @@
# Google OAuth Runbook
This runbook describes how Flynn acquires, stores, refreshes, and rotates Google OAuth tokens for Gmail, Calendar, Docs, Drive, and Tasks.
## Scope Policy (Least Privilege)
| Service | Command | Scope(s) |
|---|---|---|
| Gmail | `flynn gmail-auth` | `https://www.googleapis.com/auth/gmail.settings.basic`, `https://www.googleapis.com/auth/gmail.readonly` |
| Calendar | `flynn gcal-auth` | `https://www.googleapis.com/auth/calendar.readonly` |
| Docs | `flynn gdocs-auth` | `https://www.googleapis.com/auth/documents.readonly`, `https://www.googleapis.com/auth/drive.metadata.readonly` |
| Drive | `flynn gdrive-auth` | `https://www.googleapis.com/auth/drive.readonly` |
| Tasks | `flynn gtasks-auth` | `https://www.googleapis.com/auth/tasks.readonly` |
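As a rough illustration of how the table above could drive a scope check, the sketch below encodes the same scopes in a typed map and reports which required scopes a token lacks. The constant and function names are hypothetical, and the exact-match comparison is a simplification: it does not model cases where a broader granted scope would also satisfy the requirement.

```typescript
type GoogleService = "gmail" | "gcal" | "gdocs" | "gdrive" | "gtasks";

const SCOPE_PREFIX = "https://www.googleapis.com/auth/";

// Scope names copied from the Scope Policy table; the map itself is illustrative.
const REQUIRED_SCOPES: Record<GoogleService, string[]> = {
  gmail: ["gmail.settings.basic", "gmail.readonly"],
  gcal: ["calendar.readonly"],
  gdocs: ["documents.readonly", "drive.metadata.readonly"],
  gdrive: ["drive.readonly"],
  gtasks: ["tasks.readonly"],
};

function requiredScopes(service: GoogleService): string[] {
  return REQUIRED_SCOPES[service].map((name) => SCOPE_PREFIX + name);
}

// Return the required scopes that are absent from the granted list (exact match only).
function missingScopes(service: GoogleService, granted: string[]): string[] {
  const have = new Set(granted);
  return requiredScopes(service).filter((scope) => !have.has(scope));
}

// Build the unified re-auth command for a service.
function reauthCommand(service: GoogleService): string {
  return `flynn google-auth --service ${service}`;
}
```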
Unified entrypoint:
```bash
flynn google-auth --service gmail
flynn google-auth --service gcal
flynn google-auth --service gdocs
flynn google-auth --service gdrive
flynn google-auth --service gtasks
```
Add `--manual` for copy/paste flow, or `--config <path>` for non-default config.
## Storage Model
- Canonical token store: `~/.config/flynn/auth.json`
- Key path within the file: `google.services.<service>.token`
- Metadata: scopes, token file path, credentials file path, `updated_at`
- Compatibility token files remain supported:
- `~/.config/flynn/gmail-token.json`
- `~/.config/flynn/gcal-token.json`
- `~/.config/flynn/gdocs-token.json`
- `~/.config/flynn/gdrive-token.json`
- `~/.config/flynn/gtasks-token.json`
Runtime behavior:
1. Flynn checks the auth store first.
2. If the token is missing there, Flynn loads the legacy token file and migrates it into the auth store.
3. If both are missing, tools and watchers fail with a service-specific re-auth command hint.
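The three steps above can be sketched as a single lookup function. This is a simplified model with hypothetical names (`TokenSources`, `resolveToken`); storage access is injected so the sketch does not assume anything about the real layout of `auth.json` beyond what this runbook states.

```typescript
// Hypothetical storage interface; the real helpers live in src/auth/google.ts
// and src/google/oauth.ts and may look different.
interface TokenSources {
  readStore(service: string): string | undefined; // auth store lookup
  readLegacyFile(service: string): string | undefined; // e.g. gcal-token.json
  writeStore(service: string, token: string): void; // migration target
}

function resolveToken(service: string, sources: TokenSources): string {
  // Step 1: the auth store wins when it has a token.
  const stored = sources.readStore(service);
  if (stored) return stored;

  // Step 2: fall back to the legacy token file and migrate it into the store.
  const legacy = sources.readLegacyFile(service);
  if (legacy) {
    sources.writeStore(service, legacy);
    return legacy;
  }

  // Step 3: nothing found, so fail with a re-auth hint for this service.
  throw new Error(
    `No Google token found for ${service}; run: flynn google-auth --service ${service}`,
  );
}
```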
## Refresh and Rotation Behavior
- OAuth client refresh events write back to:
- auth store (`auth.json`)
- service token file (compatibility)
- If the auth store write is blocked (for example, by `EACCES`), Flynn keeps operating from token files.
- Re-auth is required when Google returns insufficient-scope errors for a service.
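A minimal sketch of the tolerant write-back described above, with the two writes injected as callbacks. The names are hypothetical, and the set of tolerated error codes is an assumption: this runbook only names `EACCES` as an example.

```typescript
// Assumed set of "write blocked" codes; only EACCES is named in this runbook.
const TOLERATED_CODES = new Set(["EACCES", "EPERM", "EROFS"]);

function isBlockedWrite(err: unknown): boolean {
  const code = (err as { code?: unknown } | null)?.code;
  return typeof code === "string" && TOLERATED_CODES.has(code);
}

interface PersistResult {
  store: boolean; // true when the auth store write succeeded
  tokenFile: boolean; // true when the compatibility token file write succeeded
}

// Attempt both writes independently. A blocked write is reported, not thrown,
// so a read-only auth store does not abort the refresh; other errors propagate.
function persistRefreshedToken(
  writeStore: () => void,
  writeTokenFile: () => void,
): PersistResult {
  const attempt = (write: () => void): boolean => {
    try {
      write();
      return true;
    } catch (err) {
      if (isBlockedWrite(err)) return false;
      throw err;
    }
  };
  return { store: attempt(writeStore), tokenFile: attempt(writeTokenFile) };
}
```

Returning a per-target result lets the caller log which location is stale instead of silently diverging.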
## Local Operation Checklist
1. Configure `automation.<service>.credentials_file` in Flynn config.
2. Run service auth command (or `flynn google-auth --service <service>`).
3. Run `flynn doctor` and verify service checks:
- `Gmail configured`
- `Google Calendar configured`
- `Google Docs configured`
- `Google Drive configured`
- `Google Tasks configured`
## Failure and Renewal Signals
- Missing credentials file: hard failure with path.
- Missing tokens (store + file): doctor warning + runtime auth error with command hint.
- Scope mismatch: runtime error includes explicit re-auth command for the affected service.
+5
@@ -72,6 +72,11 @@ If the daily briefing reports Calendar or Tasks as blocked with insufficient per
1. `flynn gcal-auth`
2. `flynn gtasks-auth`
Equivalent unified form:
1. `flynn google-auth --service gcal`
2. `flynn google-auth --service gtasks`
Expected scopes after re-auth:
- Calendar: `https://www.googleapis.com/auth/calendar.readonly`
+34 -1
@@ -3,6 +3,36 @@
"updated_at": "2026-02-24",
"description": "Tracks the status of all Flynn plans and implementation phases",
"plans": {
"full-audit-hardening-and-config-consolidation": {
"status": "completed",
"date": "2026-02-24",
"updated": "2026-02-24",
"summary": "Completed a repo-wide audit and implemented hardening/refactors for Google OAuth token handling, router fallback correctness, config-profile consolidation, and audit logging path behavior. Added Google service coverage to `flynn doctor`, introduced a unified `flynn google-auth` command, aligned fallback-chain defaults with runtime semantics, expanded provider capability type coverage, and produced operator/agent-facing documentation plus REPORT.md.",
"files_modified": [
"src/auth/google.ts",
"src/google/oauth.ts",
"src/tools/builtin/gmail.ts",
"src/tools/builtin/gcal.ts",
"src/tools/builtin/gdocs.ts",
"src/tools/builtin/gdrive.ts",
"src/tools/builtin/gtasks.ts",
"src/automation/gmail.ts",
"src/cli/google-auth.ts",
"src/cli/doctor.ts",
"src/models/router.ts",
"src/models/openai.ts",
"src/config/schema.ts",
"config/profiles/paas.overlay.yaml",
"scripts/generate-config-profiles.mjs",
"src/audit/logger.ts",
"README.md",
"docs/operations/GOOGLE_AUTH.md",
"docs/architecture/AGENT_REPO_MAP.md",
"REPORT.md",
"docs/plans/state.json"
],
"test_status": "pnpm test:run (18 focused suites, 420 tests) + pnpm lint (0 errors) + pnpm typecheck + pnpm config:profiles:check passing"
},
"daily-briefing-google-scope-remediation": {
"status": "completed",
"date": "2026-02-24",
@@ -6370,7 +6400,7 @@
}
},
"overall_progress": {
-"total_test_count": 1971,
+"total_test_count": 1982,
"all_tests_passing": true,
"p0_completion": "3/3 (100%)",
"p1_completion": "4/4 (100%)",
@@ -6395,6 +6425,9 @@
"daily_briefing_google_scope_remediation": "completed — calendar.* and tasks.* now append explicit re-auth guidance (`flynn gcal-auth` / `flynn gtasks-auth`) for insufficient-scope errors, and operator runbook includes remediation steps",
"council_tool_timeout_override": "completed — ToolExecutor supports per-tool timeout overrides and council.run now uses a 180s timeout to avoid false 30s council timeouts in the tool loop",
"minimal_tui_multiline_paste_mode": "completed — minimal TUI now supports `/paste`/`/multiline` multiline compose mode ending with single '.' line, preventing newline truncation for pasted prompts",
"config_profile_consolidation": "completed — config/paas.yaml is now generated from canonical config/default.yaml + config/profiles/paas.overlay.yaml with CI-checkable drift detection",
"google_auth_hardening": "completed — shared Google OAuth runtime helper + auth store (auth.json), legacy token-file migration, refresh persistence, service-wide doctor checks, and unified `flynn google-auth` command",
"model_router_correctness": "completed — fallback paths now avoid duplicate clients, apply retry policy consistently, and reject unsupported OpenAI OAuth tool requests early",
"native_audio_support": "completed — smart routing for native audio (Gemini/OpenAI/GitHub) vs Whisper transcription fallback, plus 2026-02-23 arg hydration hardening, tool.args_rewritten audit metric, transient fetch retry/timeout hardening, localhost->127.0.0.1 fallback for transcription endpoint connectivity, and whisper docker-compose entrypoint arg fix for port 18801",
"remaining_phases_completion": "Phase 1: 3/3 (100%) — context levels, command registry, memory structure. Phase 2: 3/3 (100%) — component registry, confidence routing, history index. Phase 3: 2/2 (100%) — adaptive memory/compaction, truthfulness/autonomy hardening",
"next_up": "Track OpenClaw evolution regularly for inspiration and feature ideas"