# Flynn Codebase Audit + Improvement Report

Date: 2026-02-24

Branch: `feature/full-audit-hardening-and-config-consolidation`

## Executive Summary

I audited core Flynn wiring across config -> daemon -> router/providers -> automation/tools -> CLI/docs, then implemented high-safety fixes for auth hardening, router correctness, provider alignment, and config consolidation.

High-impact outcomes:

- Google OAuth runtime is now centralized with store-first token loading, legacy token migration, and refresh persistence.
- Model router fallback behavior now matches retry/fallback intent (no duplicate fallback attempts, retry policy applied on fallback paths).
- OpenAI OAuth mode now fails fast when tools are requested (prevents silent non-executable tool output).
- PaaS config is now generated from canonical `default.yaml` + overlay to prevent template drift.
- `flynn doctor` now validates all Google automation services, not just Gmail.

Breaking behavior changes introduced:

- `models.fallback_chain` schema default changed from `['anthropic']` to `[]` (avoids invalid fallback entries by default).
- OpenAI OAuth requests with tools now throw explicit errors instead of returning warning text.

## Findings (With File Pointers)

1. `HIGH` Google token persistence path caused runtime failures in restricted environments.
   - Cause: migration/store writes to `~/.config/flynn/auth.json` could fail and abort tool execution.
   - Evidence: [src/google/oauth.ts](src/google/oauth.ts:111), [src/auth/google.ts](src/auth/google.ts:108)
   - Fix: auth store writes now tolerate known filesystem permission errors and preserve token-file compatibility.
2. `HIGH` OpenAI OAuth mode accepted tool-bearing requests without executable tool support.
   - Cause: OAuth Codex backend path did not support Flynn tool execution semantics.
   - Evidence: [src/models/openai.ts](src/models/openai.ts:236)
   - Fix: explicit throw when tools are present, enabling router fallback or config correction.
3. `HIGH` Router fallback execution did not fully match retry policy and could repeat the same failing client.
   - Cause: retry policy only wrapped primary path; fallback clients could be retried inconsistently; duplicate fallbacks possible.
   - Evidence: [src/models/router.ts](src/models/router.ts:90)
   - Fix: attempted-client tracking and retry wrapping now apply to tier/global fallback chat paths and streaming fallback paths.
4. `MEDIUM` `flynn doctor` had incomplete feature wiring checks for Google services.
   - Cause: only Gmail automation health was validated.
   - Evidence: [src/cli/doctor.ts](src/cli/doctor.ts:663), [src/cli/doctor.ts](src/cli/doctor.ts:723)
   - Fix: added service checks for Calendar/Docs/Drive/Tasks with auth-store and token-file detection.
5. `MEDIUM` Config profile overlap risk (manual `config/paas.yaml` drift from canonical defaults).
   - Cause: duplicated full config template with independent edits.
   - Evidence: [config/profiles/paas.overlay.yaml](config/profiles/paas.overlay.yaml:1), [scripts/generate-config-profiles.mjs](scripts/generate-config-profiles.mjs:10)
   - Fix: canonical+overlay generation model, profile drift check, and sync test.
6. `MEDIUM` Default fallback chain schema value conflicted with router semantics.
   - Cause: default `['anthropic']` is not a tier/local-provider key in current router semantics.
   - Evidence: [src/config/schema.ts](src/config/schema.ts:181)
   - Fix: schema default set to `[]` to avoid spurious invalid fallback entries.
7. `LOW` Provider capability type list lagged configured providers.
   - Cause: `ModelProvider` union omitted `vercel`, `minimax`, `moonshot`, `synthetic`.
   - Evidence: [src/models/capabilities.ts](src/models/capabilities.ts:8)
   - Fix: union updated and test coverage expanded.
8. `LOW` Audit logger path expansion bug for `~`.
   - Cause: logger configured rotator with expanded path but write stream used raw path.
   - Evidence: [src/audit/logger.ts](src/audit/logger.ts:44), [src/audit/logger.ts](src/audit/logger.ts:57)
   - Fix: normalized path now used consistently by logger and rotator.
9. `INFO` Log-pattern analysis could not be completed from repository artifacts.
   - Cause: no runtime `.log` / audit JSONL artifacts present in workspace snapshot.
   - Evidence: repository scan returned no log files under repo root.
   - Mitigation: recommendations added below for repeatable log collection/analysis workflow.

## Recommended Changes (Prioritized)

1. `P0` Keep OpenAI OAuth tool rejection behavior and enforce documented fallback guidance.
2. `P0` Keep Google auth centralized; avoid introducing new per-tool OAuth duplication.
3. `P1` Add a shared Google auth CLI factory to remove duplicated `*-auth` command flow code.
4. `P1` Add optional `XDG_CONFIG_HOME`/override support for auth store paths for containerized/sandboxed environments.
5. `P1` Add periodic log export + analyzer command (error-rate, latency, provider fallback frequency) so reliability trends are measurable from CI/dev snapshots.
6. `P2` Introduce a provider capability matrix module consumed by router/doctor/docs from one source of truth.
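
The fallback semantics that Finding 3 and the first `P0` item rely on (attempted-client tracking plus the same retry policy on every path) can be illustrated with a minimal sketch. It is not the code in `src/models/router.ts`: `ChatClient`, `withRetry`, and `chatWithFallback` are hypothetical names, and the retry policy is reduced to a fixed attempt count.

```typescript
// Sketch only: ChatClient, withRetry, and chatWithFallback are hypothetical
// names for illustration, not Flynn's actual router API.
interface ChatClient {
  id: string;
  chat(prompt: string): Promise<string>;
}

// Retry one client up to maxAttempts times, then rethrow its last error.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Try the primary client, then each fallback in order. Every client gets the
// same retry wrapping, and a client that was already attempted is skipped.
async function chatWithFallback(
  primary: ChatClient,
  fallbacks: ChatClient[],
  prompt: string,
  maxAttempts = 2,
): Promise<{ clientId: string; reply: string }> {
  const attempted = new Set<string>();
  let lastError: unknown = new Error("no clients configured");
  for (const client of [primary, ...fallbacks]) {
    if (attempted.has(client.id)) continue; // no duplicate fallback attempts
    attempted.add(client.id);
    try {
      const reply = await withRetry(() => client.chat(prompt), maxAttempts);
      return { clientId: client.id, reply };
    } catch (err) {
      lastError = err; // move on to the next untried client
    }
  }
  throw lastError;
}
```

In this sketch, a provider that throws immediately (as OpenAI OAuth mode now does when tools are present) simply exhausts its attempts and hands over to the next untried client. A real retry policy would likely distinguish retryable from non-retryable errors, so that an explicit configuration error is not retried at all.
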
## Implemented Changes (Diff Summary)

Commits in this branch:

- `5b95eb1` `fix(audit): expand tilde paths for audit log output`
- `076379b` `refactor(config): generate paas profile from default overlay`
- `00b2d64` `feat(google-auth): centralize oauth token store and service checks`
- `092a9ba` `fix(router): align fallback semantics and oauth provider behavior`

Notable file groups:

- Audit hardening: `src/audit/logger.ts`, `src/audit/logger.test.ts`
- Config consolidation: `config/profiles/paas.overlay.yaml`, `config/paas.yaml`, `scripts/generate-config-profiles.mjs`, `src/config/profileTemplates.test.ts`, `docs/deployment/PAAS.md`, `package.json`
- Google auth hardening: `src/auth/google.ts`, `src/google/oauth.ts`, Google tool modules, Gmail watcher, Google auth CLI commands, `src/cli/doctor.ts`
- Router/provider correctness: `src/models/router.ts`, `src/models/openai.ts`, `src/config/schema.ts`, `src/models/capabilities.ts`
- Documentation additions: Google OAuth runbook and agent-facing repo map docs

Validation executed:

- Focused suites (420 tests) across changed modules passed.
- `pnpm lint` passed (warnings only, 0 errors).
- `pnpm typecheck` passed.
- `pnpm config:profiles:check` passed.

## Remaining TODOs / Risks

- No runtime log corpus was available for empirical recurring-error/perf bottleneck analysis.
- Google auth CLI commands still contain duplicated flow logic across service-specific command files.
- Auth store remains plaintext on disk (permissions are set, but no at-rest encryption).
- Provider capability behavior is still partially split across provider clients + capability utility; further normalization is recommended.
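
As a rough illustration of the normalization suggested in the last item, a single capability table could answer the "can this provider execute tools?" question for router, doctor, and docs alike. Everything in the sketch below is hypothetical (`AuthMode`, `CAPABILITY_MATRIX`, `supportsTools`, `assertToolSupport`); only the `openai`/`oauth` entry reflects a finding in this report, and the `api_key` entry is an illustrative placeholder.

```typescript
// Sketch only: every name here is hypothetical, not an existing Flynn module.
type AuthMode = "api_key" | "oauth";

interface Capabilities {
  // Whether tool calls can actually be executed in this provider/auth mode.
  tools: boolean;
}

// One table that router, doctor, and docs generation could all read.
// Only the openai/oauth entry reflects this report (Finding 2); the api_key
// entry is an illustrative placeholder.
const CAPABILITY_MATRIX: Record<string, Partial<Record<AuthMode, Capabilities>>> = {
  openai: {
    api_key: { tools: true },
    oauth: { tools: false },
  },
};

function supportsTools(provider: string, mode: AuthMode): boolean {
  // Unknown providers or modes are treated as not supporting tools.
  return CAPABILITY_MATRIX[provider]?.[mode]?.tools ?? false;
}

// Fail fast when tools are requested but cannot be executed.
function assertToolSupport(provider: string, mode: AuthMode, toolCount: number): void {
  if (toolCount > 0 && !supportsTools(provider, mode)) {
    throw new Error(`${provider} (${mode}) cannot execute tools; use a fallback or change config`);
  }
}
```

Defaulting unknown providers to "no tools" is a conservative choice for this sketch; a real module might instead treat a missing entry as a configuration error that `flynn doctor` reports.
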