From d8188b5425fc58549a6999bf026880546443b9c3 Mon Sep 17 00:00:00 2001
From: William Valentin
Date: Mon, 23 Feb 2026 17:12:41 -0800
Subject: [PATCH] docs(audit): add report, google auth runbook, and agent repo map

---
 README.md                           |  47 +++++++++++++
 REPORT.md                           | 103 ++++++++++++++++++++++++++++
 docs/README.md                      |   2 +
 docs/architecture/AGENT_REPO_MAP.md |  85 +++++++++++++++++++++++
 docs/operations/GOOGLE_AUTH.md      |  68 ++++++++++++++++++
 docs/operations/OPERATOR_PACK.md    |   5 ++
 docs/plans/state.json               |  35 +++++++++-
 7 files changed, 344 insertions(+), 1 deletion(-)
 create mode 100644 REPORT.md
 create mode 100644 docs/architecture/AGENT_REPO_MAP.md
 create mode 100644 docs/operations/GOOGLE_AUTH.md

diff --git a/README.md b/README.md
index 780bc87..2fdb6e9 100644
--- a/README.md
+++ b/README.md
@@ -51,6 +51,10 @@ make install
 cp config/default.yaml ~/.config/flynn/config.yaml
 # Edit config with your API keys and Telegram bot token
 
+# Optional: regenerate/check derived config profiles
+pnpm config:profiles:generate
+pnpm config:profiles:check
+
 # Run
 flynn start
 
@@ -76,6 +80,10 @@ Flynn provides a full CLI via the `flynn` binary (or `npx tsx src/cli/index.ts`
 | `flynn onboard` | Guided onboarding alias for setup wizard |
 | `flynn gmail-auth` | Authenticate with Gmail via OAuth2 |
 | `flynn gcal-auth` | Authenticate with Google Calendar via OAuth2 |
+| `flynn gdocs-auth` | Authenticate with Google Docs via OAuth2 |
+| `flynn gdrive-auth` | Authenticate with Google Drive via OAuth2 |
+| `flynn gtasks-auth` | Authenticate with Google Tasks via OAuth2 |
+| `flynn google-auth --service ` | Unified Google OAuth entrypoint (`gmail`, `gcal`, `gdocs`, `gdrive`, `gtasks`) |
 | `flynn gemini-auth` | Store a Gemini API key in `~/.config/flynn/auth.json` |
 | `flynn skills` | List/install/manage skills |
 | `flynn companion` | Run a minimal companion node client against the gateway |
@@ -1155,6 +1163,7 @@ Supported delivery modes:
 2. Create OAuth2 credentials (Desktop application type) and download the JSON file
 3. Run `flynn gmail-auth` to complete the OAuth2 flow and store the refresh token
    - Requests Gmail scopes for settings + read access (`gmail.settings.basic` + `gmail.readonly`)
+   - Flynn stores service tokens in `~/.config/flynn/auth.json` and keeps per-service token files for compatibility
 
 For Pub/Sub delivery (push/pull), also enable the Pub/Sub API and create:
 - A topic (e.g. `projects/your-project/topics/gmail-push`)
@@ -1241,6 +1250,7 @@ Query Google Calendar events from within conversations. Provides three tools: `c
 1. A Google Cloud project with the **Calendar API** enabled
 2. OAuth2 credentials (Desktop application type) — the same credentials file used for Gmail works
 3. Run `flynn gcal-auth` to complete the OAuth2 flow and store the refresh token
+   - Also persisted in `~/.config/flynn/auth.json` for shared runtime refresh handling
 
 ### Configuration
 
@@ -1262,6 +1272,43 @@ automation:
 | `token_file` | no | Path to stored OAuth2 refresh token (default: `~/.config/flynn/gcal-token.json`) |
 | `calendar_ids` | no | Calendar IDs available for queries (default: `[primary]`) |
 
+For full local operation guidance (token acquisition, storage, migration, refresh/renewal, and service scopes), see [Google OAuth Runbook](docs/operations/GOOGLE_AUTH.md).
+
+## Google Docs, Drive, and Tasks Tools
+
+Flynn also supports Google Docs, Drive, and Tasks tools:
+
+- Docs: `docs.list`, `docs.search`, `docs.read`
+- Drive: `drive.list`, `drive.search`, `drive.read`
+- Tasks: `tasks.lists`, `tasks.list`
+
+Enable in config:
+
+```yaml
+automation:
+  gdocs:
+    enabled: true
+    credentials_file: ~/.config/flynn/gmail-credentials.json
+    token_file: ~/.config/flynn/gdocs-token.json
+  gdrive:
+    enabled: true
+    credentials_file: ~/.config/flynn/gmail-credentials.json
+    token_file: ~/.config/flynn/gdrive-token.json
+  gtasks:
+    enabled: true
+    credentials_file: ~/.config/flynn/gmail-credentials.json
+    token_file: ~/.config/flynn/gtasks-token.json
+```
+
+Authenticate with:
+
+```bash
+flynn gdocs-auth
+flynn gdrive-auth
+flynn gtasks-auth
+# or: flynn google-auth --service gdocs|gdrive|gtasks
+```
+
 ## Vector Memory Search
 
 The memory system supports hybrid search combining keyword matching with semantic vector similarity. When embeddings are enabled, `memory.search` uses both approaches and merges results with configurable weighting.
diff --git a/REPORT.md b/REPORT.md
new file mode 100644
index 0000000..8ea14ed
--- /dev/null
+++ b/REPORT.md
@@ -0,0 +1,103 @@
+# Flynn Codebase Audit + Improvement Report
+
+Date: 2026-02-24
+Branch: `feature/full-audit-hardening-and-config-consolidation`
+
+## Executive Summary
+
+I audited core Flynn wiring across config -> daemon -> router/providers -> automation/tools -> CLI/docs, then implemented high-safety fixes for auth hardening, router correctness, provider alignment, and config consolidation.
+
+High-impact outcomes:
+- Google OAuth runtime is now centralized with store-first token loading, legacy token migration, and refresh persistence.
+- Model router fallback behavior now matches retry/fallback intent (no duplicate fallback attempts, retry policy applied on fallback paths).
+- OpenAI OAuth mode now fails fast when tools are requested (prevents silent non-executable tool output).
+- PaaS config is now generated from canonical `default.yaml` + overlay to prevent template drift.
+- `flynn doctor` now validates all Google automation services, not just Gmail.
+
+Breaking behavior changes introduced:
+- `models.fallback_chain` schema default changed from `['anthropic']` to `[]` (avoids invalid fallback entries by default).
+- OpenAI OAuth requests with tools now throw explicit errors instead of returning warning text.
+
+## Findings (With File Pointers)
+
+1. `HIGH` Google token persistence path caused runtime failures in restricted environments.
+- Cause: migration/store writes to `~/.config/flynn/auth.json` could fail and abort tool execution.
+- Evidence: [src/google/oauth.ts](src/google/oauth.ts:111), [src/auth/google.ts](src/auth/google.ts:108)
+- Fix: auth store writes now tolerate known filesystem permission errors and preserve token-file compatibility.
+
+2. `HIGH` OpenAI OAuth mode accepted tool-bearing requests without executable tool support.
+- Cause: OAuth Codex backend path did not support Flynn tool execution semantics.
+- Evidence: [src/models/openai.ts](src/models/openai.ts:236)
+- Fix: explicit throw when tools are present, enabling router fallback or config correction.
+
+3. `HIGH` Router fallback execution did not fully match retry policy and could repeat the same failing client.
+- Cause: retry policy only wrapped primary path; fallback clients could be retried inconsistently; duplicate fallbacks possible.
+- Evidence: [src/models/router.ts](src/models/router.ts:90)
+- Fix: attempted-client tracking and retry wrapping now apply to tier/global fallback chat paths and streaming fallback paths.
+
+4. `MEDIUM` `flynn doctor` had incomplete feature wiring checks for Google services.
+- Cause: only Gmail automation health was validated.
+- Evidence: [src/cli/doctor.ts](src/cli/doctor.ts:663), [src/cli/doctor.ts](src/cli/doctor.ts:723)
+- Fix: added service checks for Calendar/Docs/Drive/Tasks with auth-store and token-file detection.
+
+5. `MEDIUM` Config profile overlap risk (manual `config/paas.yaml` drift from canonical defaults).
+- Cause: duplicated full config template with independent edits.
+- Evidence: [config/profiles/paas.overlay.yaml](config/profiles/paas.overlay.yaml:1), [scripts/generate-config-profiles.mjs](scripts/generate-config-profiles.mjs:10)
+- Fix: canonical+overlay generation model, profile drift check, and sync test.
+
+6. `MEDIUM` Default fallback chain schema value conflicted with router semantics.
+- Cause: default `['anthropic']` is not a tier/local-provider key in current router semantics.
+- Evidence: [src/config/schema.ts](src/config/schema.ts:181)
+- Fix: schema default set to `[]` to avoid spurious invalid fallback entries.
+
+7. `LOW` Provider capability type list lagged configured providers.
+- Cause: `ModelProvider` union omitted `vercel`, `minimax`, `moonshot`, `synthetic`.
+- Evidence: [src/models/capabilities.ts](src/models/capabilities.ts:8)
+- Fix: union updated and test coverage expanded.
+
+8. `LOW` Audit logger path expansion bug for `~`.
+- Cause: logger configured rotator with expanded path but write stream used raw path.
+- Evidence: [src/audit/logger.ts](src/audit/logger.ts:44), [src/audit/logger.ts](src/audit/logger.ts:57)
+- Fix: normalized path now used consistently by logger and rotator.
+
+9. `INFO` Log-pattern analysis could not be completed from repository artifacts.
+- Cause: no runtime `.log` / audit JSONL artifacts present in workspace snapshot.
+- Evidence: repository scan returned no log files under repo root.
+- Mitigation: recommendations added below for repeatable log collection/analysis workflow.
+
+## Recommended Changes (Prioritized)
+
+1. `P0` Keep OpenAI OAuth tool rejection behavior and enforce documented fallback guidance.
+2. `P0` Keep Google auth centralized; avoid introducing new per-tool OAuth duplication.
+3. `P1` Add a shared Google auth CLI factory to remove duplicated `*-auth` command flow code.
+4. `P1` Add optional `XDG_CONFIG_HOME`/override support for auth store paths for containerized/sandboxed environments.
+5. `P1` Add periodic log export + analyzer command (error-rate, latency, provider fallback frequency) so reliability trends are measurable from CI/dev snapshots.
+6. `P2` Introduce a provider capability matrix module consumed by router/doctor/docs from one source of truth.
+
+## Implemented Changes (Diff Summary)
+
+Commits in this branch:
+- `5b95eb1` `fix(audit): expand tilde paths for audit log output`
+- `076379b` `refactor(config): generate paas profile from default overlay`
+- `00b2d64` `feat(google-auth): centralize oauth token store and service checks`
+- `092a9ba` `fix(router): align fallback semantics and oauth provider behavior`
+
+Notable file groups:
+- Audit hardening: `src/audit/logger.ts`, `src/audit/logger.test.ts`
+- Config consolidation: `config/profiles/paas.overlay.yaml`, `config/paas.yaml`, `scripts/generate-config-profiles.mjs`, `src/config/profileTemplates.test.ts`, `docs/deployment/PAAS.md`, `package.json`
+- Google auth hardening: `src/auth/google.ts`, `src/google/oauth.ts`, Google tool modules, Gmail watcher, Google auth CLI commands, `src/cli/doctor.ts`
+- Router/provider correctness: `src/models/router.ts`, `src/models/openai.ts`, `src/config/schema.ts`, `src/models/capabilities.ts`
+- Documentation additions: Google OAuth runbook and agent-facing repo map docs
+
+Validation executed:
+- Focused suites (420 tests) across changed modules passed.
+- `pnpm lint` passed (warnings only, 0 errors).
+- `pnpm typecheck` passed.
+- `pnpm config:profiles:check` passed.
+
+## Remaining TODOs / Risks
+
+- No runtime log corpus was available for empirical recurring-error/perf bottleneck analysis.
+- Google auth CLI commands still contain duplicated flow logic across service-specific command files.
+- Auth store remains plaintext on disk (permissions are set, but no at-rest encryption).
+- Provider capability behavior is still partially split across provider clients + capability utility; further normalization is recommended.
diff --git a/docs/README.md b/docs/README.md
index 3bdeca8..879cf9e 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -6,6 +6,7 @@ This documentation is written to be useful to both humans and AI agents. If you
 
 1. Architecture overview (agent-oriented)
    - `docs/architecture/AGENT_DIAGRAM.md`
+   - `docs/architecture/AGENT_REPO_MAP.md`
    - `docs/architecture/GATEWAY_SESSIONS_AND_QUEUE.md`
    - `docs/architecture/TYPESCRIPT_MAP.md`
    - `docs/architecture/SYMBOL_INDEX.md`
@@ -21,6 +22,7 @@ This documentation is written to be useful to both humans and AI agents. If you
    - `docs/performance/TUNING.md`
 6. Operations runbooks
    - `docs/operations/OPERATOR_PACK.md`
+   - `docs/operations/GOOGLE_AUTH.md`
 
 ## Quick Map (One Diagram)
 
diff --git a/docs/architecture/AGENT_REPO_MAP.md b/docs/architecture/AGENT_REPO_MAP.md
new file mode 100644
index 0000000..7f918ff
--- /dev/null
+++ b/docs/architecture/AGENT_REPO_MAP.md
@@ -0,0 +1,85 @@
+# Agent Repo Map
+
+This file is an agent-facing operational map of the Flynn repo: entrypoints, key modules, conventions, config schema anchors, and debug workflows.
+
+## Entry Points
+
+- Daemon start: `src/cli/index.ts` -> `src/daemon/index.ts#startDaemon`
+- One-shot send: `src/cli/send.ts`
+- TUI:
+  - Minimal: `src/frontends/tui/minimal.ts`
+  - Fullscreen Ink app: `src/frontends/tui/components/App.tsx`
+- Gateway server: `src/gateway/server.ts`
+
+## Core Message Flow
+
+1. Channel ingress/gateway request -> `src/daemon/routing.ts`
+2. Session/orchestration -> `src/backends/native/orchestrator.ts`
+3. Agent loop/tool execution -> `src/backends/native/agent.ts`
+4. Model dispatch -> `src/models/router.ts` + provider clients in `src/models/*`
+
+## High-Value Modules
+
+- Config schema + defaults:
+  - `src/config/schema.ts`
+  - `config/default.yaml`
+  - `config/profiles/*.overlay.yaml` (profile overlays)
+- Model wiring:
+  - `src/daemon/models.ts`
+  - `src/models/router.ts`
+  - `src/models/openai.ts`, `src/models/anthropic.ts`, `src/models/gemini.ts`, `src/models/bedrock.ts`, `src/models/github.ts`, `src/models/local/*`
+- Tool registration chain:
+  - Tool impl: `src/tools/builtin/*`
+  - Export: `src/tools/builtin/index.ts`
+  - Registry/bootstrap: `src/daemon/index.ts`, `src/daemon/tools.ts`
+- Channel adapters:
+  - `src/channels/*/adapter.ts`
+  - Registration: `src/daemon/channels.ts`
+- Auth stores:
+  - OpenAI/Anthropic/etc: `src/auth/*`
+  - Google services: `src/auth/google.ts` + `src/google/oauth.ts`
+- Observability:
+  - Audit logger: `src/audit/*`
+  - Doctor checks: `src/cli/doctor.ts`
+
+## Conventions
+
+- TypeScript strict mode, NodeNext modules.
+- Local imports use `.js` extension.
+- Keep provider-specific branches localized in provider clients; prefer shared helpers for cross-service auth logic.
+- Tests use Vitest (`*.test.ts`).
+
+## Config Schema Anchors
+
+- Model tiers and fallback chain: `models.*` in `src/config/schema.ts`
+- Tool policy and profiles: `tools.*`
+- Automation integrations:
+  - Gmail watcher: `automation.gmail`
+  - Google tools: `automation.gcal`, `automation.gdocs`, `automation.gdrive`, `automation.gtasks`
+- Server/gateway behavior: `server.*`
+
+## Run and Debug
+
+```bash
+pnpm dev        # daemon watch mode
+pnpm tui        # minimal terminal UI
+pnpm tui:fs     # fullscreen UI
+pnpm test:run   # full test run once
+pnpm lint
+pnpm typecheck
+```
+
+Focused debug commands:
+
+```bash
+pnpm test:run src/models/router.test.ts src/models/openai.oauth.test.ts
+pnpm test:run src/tools/builtin/gmail.test.ts src/automation/gmail.test.ts
+pnpm test:run src/cli/doctor.test.ts
+```
+
+Config profile sync:
+
+```bash
+pnpm config:profiles:generate
+pnpm config:profiles:check
+```
diff --git a/docs/operations/GOOGLE_AUTH.md b/docs/operations/GOOGLE_AUTH.md
new file mode 100644
index 0000000..b7bdace
--- /dev/null
+++ b/docs/operations/GOOGLE_AUTH.md
@@ -0,0 +1,68 @@
+# Google OAuth Runbook
+
+This runbook describes how Flynn acquires, stores, refreshes, and rotates Google OAuth tokens for Gmail, Calendar, Docs, Drive, and Tasks.
+
+## Scope Policy (Least Privilege)
+
+| Service | Command | Scope(s) |
+|---|---|---|
+| Gmail | `flynn gmail-auth` | `https://www.googleapis.com/auth/gmail.settings.basic`, `https://www.googleapis.com/auth/gmail.readonly` |
+| Calendar | `flynn gcal-auth` | `https://www.googleapis.com/auth/calendar.readonly` |
+| Docs | `flynn gdocs-auth` | `https://www.googleapis.com/auth/documents.readonly`, `https://www.googleapis.com/auth/drive.metadata.readonly` |
+| Drive | `flynn gdrive-auth` | `https://www.googleapis.com/auth/drive.readonly` |
+| Tasks | `flynn gtasks-auth` | `https://www.googleapis.com/auth/tasks.readonly` |
+
+Unified entrypoint:
+
+```bash
+flynn google-auth --service gmail
+flynn google-auth --service gcal
+flynn google-auth --service gdocs
+flynn google-auth --service gdrive
+flynn google-auth --service gtasks
+```
+
+Add `--manual` for copy/paste flow, or `--config ` for non-default config.
+
+## Storage Model
+
+- Canonical token store: `~/.config/flynn/auth.json`
+  - Path: `google.services..token`
+  - Metadata: scopes, token file path, credentials file path, `updated_at`
+- Compatibility token files remain supported:
+  - `~/.config/flynn/gmail-token.json`
+  - `~/.config/flynn/gcal-token.json`
+  - `~/.config/flynn/gdocs-token.json`
+  - `~/.config/flynn/gdrive-token.json`
+  - `~/.config/flynn/gtasks-token.json`
+
+Runtime behavior:
+
+1. Flynn checks auth store first.
+2. If missing, Flynn loads legacy token file and migrates it into auth store.
+3. If both are missing, tools/watchers fail with a service-specific re-auth command hint.
+
+## Refresh and Rotation Behavior
+
+- OAuth client refresh events write back to:
+  - auth store (`auth.json`)
+  - service token file (compatibility)
+- If auth store write is blocked (for example `EACCES`), Flynn keeps operating from token files.
+- Re-auth is required when Google returns insufficient-scope errors for a service.
+
+## Local Operation Checklist
+
+1. Configure `automation..credentials_file` in Flynn config.
+2. Run service auth command (or `flynn google-auth --service `).
+3. Run `flynn doctor` and verify service checks:
+   - `Gmail configured`
+   - `Google Calendar configured`
+   - `Google Docs configured`
+   - `Google Drive configured`
+   - `Google Tasks configured`
+
+## Failure and Renewal Signals
+
+- Missing credentials file: hard failure with path.
+- Missing tokens (store + file): doctor warning + runtime auth error with command hint.
+- Scope mismatch: runtime error includes explicit re-auth command for the affected service.
diff --git a/docs/operations/OPERATOR_PACK.md b/docs/operations/OPERATOR_PACK.md
index be5f20c..4b5494c 100644
--- a/docs/operations/OPERATOR_PACK.md
+++ b/docs/operations/OPERATOR_PACK.md
@@ -72,6 +72,11 @@ If the daily briefing reports Calendar or Tasks as blocked with insufficient per
 1. `flynn gcal-auth`
 2. `flynn gtasks-auth`
 
+Equivalent unified form:
+
+1. `flynn google-auth --service gcal`
+2. `flynn google-auth --service gtasks`
+
 Expected scopes after re-auth:
 
 - Calendar: `https://www.googleapis.com/auth/calendar.readonly`
diff --git a/docs/plans/state.json b/docs/plans/state.json
index 1e85ade..424c1f6 100644
--- a/docs/plans/state.json
+++ b/docs/plans/state.json
@@ -3,6 +3,36 @@
   "updated_at": "2026-02-24",
   "description": "Tracks the status of all Flynn plans and implementation phases",
   "plans": {
+    "full-audit-hardening-and-config-consolidation": {
+      "status": "completed",
+      "date": "2026-02-24",
+      "updated": "2026-02-24",
+      "summary": "Completed a repo-wide audit and implemented hardening/refactors for Google OAuth token handling, router fallback correctness, config-profile consolidation, and audit logging path behavior. Added Google service coverage to `flynn doctor`, introduced a unified `flynn google-auth` command, aligned fallback-chain defaults with runtime semantics, expanded provider capability type coverage, and produced operator/agent-facing documentation plus REPORT.md.",
+      "files_modified": [
+        "src/auth/google.ts",
+        "src/google/oauth.ts",
+        "src/tools/builtin/gmail.ts",
+        "src/tools/builtin/gcal.ts",
+        "src/tools/builtin/gdocs.ts",
+        "src/tools/builtin/gdrive.ts",
+        "src/tools/builtin/gtasks.ts",
+        "src/automation/gmail.ts",
+        "src/cli/google-auth.ts",
+        "src/cli/doctor.ts",
+        "src/models/router.ts",
+        "src/models/openai.ts",
+        "src/config/schema.ts",
+        "config/profiles/paas.overlay.yaml",
+        "scripts/generate-config-profiles.mjs",
+        "src/audit/logger.ts",
+        "README.md",
+        "docs/operations/GOOGLE_AUTH.md",
+        "docs/architecture/AGENT_REPO_MAP.md",
+        "REPORT.md",
+        "docs/plans/state.json"
+      ],
+      "test_status": "pnpm test:run (18 focused suites, 420 tests) + pnpm lint (0 errors) + pnpm typecheck + pnpm config:profiles:check passing"
+    },
     "daily-briefing-google-scope-remediation": {
       "status": "completed",
       "date": "2026-02-24",
@@ -6370,7 +6400,7 @@
     }
   },
   "overall_progress": {
-    "total_test_count": 1971,
+    "total_test_count": 1982,
     "all_tests_passing": true,
     "p0_completion": "3/3 (100%)",
     "p1_completion": "4/4 (100%)",
@@ -6395,6 +6425,9 @@
     "daily_briefing_google_scope_remediation": "completed — calendar.* and tasks.* now append explicit re-auth guidance (`flynn gcal-auth` / `flynn gtasks-auth`) for insufficient-scope errors, and operator runbook includes remediation steps",
     "council_tool_timeout_override": "completed — ToolExecutor supports per-tool timeout overrides and council.run now uses a 180s timeout to avoid false 30s council timeouts in the tool loop",
     "minimal_tui_multiline_paste_mode": "completed — minimal TUI now supports `/paste`/`/multiline` multiline compose mode ending with single '.' line, preventing newline truncation for pasted prompts",
+    "config_profile_consolidation": "completed — config/paas.yaml is now generated from canonical config/default.yaml + config/profiles/paas.overlay.yaml with CI-checkable drift detection",
+    "google_auth_hardening": "completed — shared Google OAuth runtime helper + auth store (auth.json), legacy token-file migration, refresh persistence, service-wide doctor checks, and unified `flynn google-auth` command",
+    "model_router_correctness": "completed — fallback paths now avoid duplicate clients, apply retry policy consistently, and reject unsupported OpenAI OAuth tool requests early",
     "native_audio_support": "completed — smart routing for native audio (Gemini/OpenAI/GitHub) vs Whisper transcription fallback, plus 2026-02-23 arg hydration hardening, tool.args_rewritten audit metric, transient fetch retry/timeout hardening, localhost->127.0.0.1 fallback for transcription endpoint connectivity, and whisper docker-compose entrypoint arg fix for port 18801",
     "remaining_phases_completion": "Phase 1: 3/3 (100%) — context levels, command registry, memory structure. Phase 2: 3/3 (100%) — component registry, confidence routing, history index. Phase 3: 2/2 (100%) — adaptive memory/compaction, truthfulness/autonomy hardening",
     "next_up": "Track OpenClaw evolution regularly for inspiration and feature ideas"