Flynn Codebase Audit + Improvement Report

Date: 2026-02-24
Branch: feature/full-audit-hardening-and-config-consolidation

Executive Summary

I audited core Flynn wiring across config -> daemon -> router/providers -> automation/tools -> CLI/docs, then implemented high-safety fixes for auth hardening, router correctness, provider alignment, and config consolidation.

High-impact outcomes:

  • Google OAuth runtime is now centralized with store-first token loading, legacy token migration, and refresh persistence.
  • Model router fallback behavior now matches retry/fallback intent (no duplicate fallback attempts, retry policy applied on fallback paths).
  • OpenAI OAuth mode now fails fast when tools are requested (prevents silent non-executable tool output).
  • PaaS config is now generated from canonical default.yaml + overlay to prevent template drift.
  • flynn doctor now validates all Google automation services, not just Gmail.
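
A minimal sketch of the store-first loading order named in the first outcome, assuming hypothetical names (GoogleTokens, TokenStore, TokenSources, loadGoogleTokens) that are not taken from src/auth/google.ts. It is synchronous for brevity; a real refresh call would be asynchronous.

```typescript
// Hypothetical shapes for illustration; Flynn's real types may differ.
interface GoogleTokens {
  accessToken: string;
  refreshToken: string;
  expiresAt: number; // epoch milliseconds
}

interface TokenStore {
  read(): GoogleTokens | undefined;
  write(tokens: GoogleTokens): void;
}

interface TokenSources {
  store: TokenStore; // central auth store
  readLegacy: () => GoogleTokens | undefined; // legacy per-service token file
  refresh: (refreshToken: string) => GoogleTokens; // assumed refresh hook
  now: () => number;
}

function loadGoogleTokens(src: TokenSources): GoogleTokens | undefined {
  // 1. Store first: the central store wins when it has tokens.
  let tokens = src.store.read();

  // 2. Legacy migration: copy legacy tokens into the store on first use.
  if (!tokens) {
    const legacy = src.readLegacy();
    if (!legacy) return undefined;
    src.store.write(legacy);
    tokens = legacy;
  }

  // 3. Refresh persistence: write refreshed tokens back to the store.
  if (tokens.expiresAt <= src.now()) {
    tokens = src.refresh(tokens.refreshToken);
    src.store.write(tokens);
  }
  return tokens;
}
```

Passing the store, legacy reader, and clock in as parameters keeps the ordering testable without touching the filesystem.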

Breaking behavior changes introduced:

  • models.fallback_chain schema default changed from ['anthropic'] to [] (avoids invalid fallback entries by default).
  • OpenAI OAuth requests with tools now throw explicit errors instead of returning warning text.
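
The second breaking change can be illustrated with a small guard. This is a sketch under assumptions: AuthMode, ChatRequest, UnsupportedToolsError, and assertToolsSupported are hypothetical names, and the error message is illustrative rather than the text used in src/models/openai.ts.

```typescript
// Hypothetical request and auth-mode shapes for illustration only.
type AuthMode = "api_key" | "oauth";

interface ToolDefinition {
  name: string;
  description: string;
}

interface ChatRequest {
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  tools?: ToolDefinition[];
}

class UnsupportedToolsError extends Error {
  constructor(provider: string) {
    super(provider + ": tools are not supported in OAuth mode");
    this.name = "UnsupportedToolsError";
  }
}

// Fail fast before any network call, so a router can fall back
// instead of receiving warning text that looks like a normal reply.
function assertToolsSupported(mode: AuthMode, request: ChatRequest): void {
  const hasTools = request.tools !== undefined && request.tools.length > 0;
  if (mode === "oauth" && hasTools) {
    throw new UnsupportedToolsError("openai");
  }
}
```

Throwing a named error type gives callers something concrete to match on when deciding whether to fall back or surface a configuration problem.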

Findings (With File Pointers)

  1. HIGH Google token persistence path caused runtime failures in restricted environments.
     • Cause: migration/store writes to ~/.config/flynn/auth.json could fail and abort tool execution.
     • Evidence: src/google/oauth.ts, src/auth/google.ts
     • Fix: auth store writes now tolerate known filesystem permission errors and preserve token-file compatibility.
  2. HIGH OpenAI OAuth mode accepted tool-bearing requests without executable tool support.
     • Cause: OAuth Codex backend path did not support Flynn tool execution semantics.
     • Evidence: src/models/openai.ts
     • Fix: explicit throw when tools are present, enabling router fallback or config correction.
  3. HIGH Router fallback execution did not fully match retry policy and could repeat the same failing client.
     • Cause: retry policy only wrapped the primary path; fallback clients could be retried inconsistently; duplicate fallbacks were possible.
     • Evidence: src/models/router.ts
     • Fix: attempted-client tracking and retry wrapping now apply to tier/global fallback chat paths and streaming fallback paths.
  4. MEDIUM flynn doctor had incomplete feature wiring checks for Google services.
     • Cause: only Gmail automation health was validated.
     • Evidence: src/cli/doctor.ts
     • Fix: added service checks for Calendar/Docs/Drive/Tasks with auth-store and token-file detection.
  5. MEDIUM Config profile overlap risk (manual config/paas.yaml drift from canonical defaults).
     • Fix: config/paas.yaml is now generated from canonical default.yaml plus the overlay config/profiles/paas.overlay.yaml.
  6. MEDIUM Default fallback chain schema value conflicted with router semantics.
     • Cause: default ['anthropic'] is not a tier/local-provider key in current router semantics.
     • Evidence: src/config/schema.ts
     • Fix: schema default set to [] to avoid spurious invalid fallback entries.
  7. LOW Provider capability type list lagged configured providers.
     • Cause: ModelProvider union omitted vercel, minimax, moonshot, synthetic.
     • Evidence: src/models/capabilities.ts
     • Fix: union updated and test coverage expanded.
  8. LOW Audit logger path expansion bug for ~.
     • Cause: logger configured the rotator with the expanded path, but the write stream used the raw path.
     • Evidence: src/audit/logger.ts
     • Fix: normalized path now used consistently by logger and rotator.
  9. INFO Log-pattern analysis could not be completed from repository artifacts.
     • Cause: no runtime .log / audit JSONL artifacts present in workspace snapshot.
     • Evidence: repository scan returned no log files under repo root.
     • Mitigation: recommendations added below for repeatable log collection/analysis workflow.
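
The router finding combines two mechanisms: tracking which clients have already been attempted, and applying the same retry wrapper to every client in the chain. A minimal sketch, assuming hypothetical names (ModelClient, withRetry, chatWithFallback) and synchronous clients for brevity; the real router in src/models/router.ts is not reproduced here and would work with promises and streaming.

```typescript
// Hypothetical client shape; chat() throws on failure.
interface ModelClient {
  id: string;
  chat(prompt: string): string;
}

// Retry wrapper shared by the primary and every fallback client.
function withRetry<T>(fn: () => T, maxAttempts: number): T {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

function chatWithFallback(
  primary: ModelClient,
  fallbacks: ModelClient[],
  prompt: string,
  maxAttempts = 2,
): string {
  const attempted: string[] = []; // ids of clients already tried
  let lastError: unknown = new Error("no clients available");

  for (const client of [primary].concat(fallbacks)) {
    // Skip a client that already failed, even if it is listed again.
    if (attempted.indexOf(client.id) !== -1) continue;
    attempted.push(client.id);
    try {
      return withRetry(() => client.chat(prompt), maxAttempts);
    } catch (err) {
      lastError = err; // move on to the next distinct client
    }
  }
  throw lastError;
}
```

With maxAttempts = 2 and a chain that lists the failing primary again as a fallback, the primary is called twice in total and then skipped, while a flaky fallback still gets its own two attempts.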

Recommendations

  1. P0 Keep OpenAI OAuth tool rejection behavior and enforce documented fallback guidance.
  2. P0 Keep Google auth centralized; avoid introducing new per-tool OAuth duplication.
  3. P1 Add a shared Google auth CLI factory to remove duplicated *-auth command flow code.
  4. P1 Add optional XDG_CONFIG_HOME/override support for auth store paths for containerized/sandboxed environments.
  5. P1 Add periodic log export + analyzer command (error-rate, latency, provider fallback frequency) so reliability trends are measurable from CI/dev snapshots.
  6. P2 Introduce a provider capability matrix module consumed by router/doctor/docs from one source of truth.
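
Recommendation 4 could be approached with a pure path resolver. The sketch below is an assumption-heavy illustration: FLYNN_AUTH_STORE is a hypothetical override variable, and PathEnv and resolveAuthStorePath are invented names. The only path taken from this report is the default ~/.config/flynn/auth.json. Relative XDG_CONFIG_HOME values are ignored, as the XDG Base Directory Specification requires.

```typescript
// Subset of the environment this resolver looks at.
interface PathEnv {
  FLYNN_AUTH_STORE?: string; // hypothetical explicit override
  XDG_CONFIG_HOME?: string;
  HOME?: string;
}

function joinPath(base: string, rest: string): string {
  // Drop trailing slashes on the base so the result has a single separator.
  return base.replace(/\/+$/, "") + "/" + rest;
}

function resolveAuthStorePath(env: PathEnv): string {
  // 1. Explicit override wins (useful in containers and sandboxes).
  if (env.FLYNN_AUTH_STORE) return env.FLYNN_AUTH_STORE;

  // 2. XDG_CONFIG_HOME, only when it is an absolute path.
  if (env.XDG_CONFIG_HOME && env.XDG_CONFIG_HOME.charAt(0) === "/") {
    return joinPath(env.XDG_CONFIG_HOME, "flynn/auth.json");
  }

  // 3. Default location under the home directory.
  if (env.HOME) return joinPath(env.HOME, ".config/flynn/auth.json");

  throw new Error("cannot resolve auth store path: no HOME or override set");
}
```

Taking the environment as a parameter rather than reading it globally keeps the precedence order easy to test.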

Implemented Changes (Diff Summary)

Commits in this branch:

  • 5b95eb1 fix(audit): expand tilde paths for audit log output
  • 076379b refactor(config): generate paas profile from default overlay
  • 00b2d64 feat(google-auth): centralize oauth token store and service checks
  • 092a9ba fix(router): align fallback semantics and oauth provider behavior
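
Commit 5b95eb1 concerns the audit logger's handling of ~. The idea can be sketched as normalizing the path once and handing the same value to both consumers. The function names (expandTilde, resolveAuditLogTargets) and the AuditLogTargets shape are hypothetical, and the home directory is passed in rather than read from the OS.

```typescript
// Expand a leading "~" or "~/" against the given home directory.
// "~user/..." and all other paths are returned unchanged.
function expandTilde(path: string, homeDir: string): string {
  if (path === "~") return homeDir;
  if (path.slice(0, 2) === "~/") {
    return homeDir.replace(/\/+$/, "") + path.slice(1);
  }
  return path;
}

interface AuditLogTargets {
  streamPath: string; // path opened by the write stream
  rotatorPath: string; // path watched by the rotator
}

// Normalize once so the stream and the rotator cannot disagree.
function resolveAuditLogTargets(configuredPath: string, homeDir: string): AuditLogTargets {
  const normalized = expandTilde(configuredPath, homeDir);
  return { streamPath: normalized, rotatorPath: normalized };
}
```

Deriving both paths from one normalized value removes the class of bug where one component sees the raw string and the other sees the expanded one.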

Notable file groups:

  • Audit hardening: src/audit/logger.ts, src/audit/logger.test.ts
  • Config consolidation: config/profiles/paas.overlay.yaml, config/paas.yaml, scripts/generate-config-profiles.mjs, src/config/profileTemplates.test.ts, docs/deployment/PAAS.md, package.json
  • Google auth hardening: src/auth/google.ts, src/google/oauth.ts, Google tool modules, Gmail watcher, Google auth CLI commands, src/cli/doctor.ts
  • Router/provider correctness: src/models/router.ts, src/models/openai.ts, src/config/schema.ts, src/models/capabilities.ts
  • Documentation additions: Google OAuth runbook and agent-facing repo map docs
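
Part of the Google auth hardening group is the tolerant auth store write described in the findings. A rough sketch of that pattern follows. The tolerated codes (EACCES, EPERM, EROFS) are my assumption about what "known filesystem permission errors" covers, and isToleratedFsError and persistTokensBestEffort are hypothetical names.

```typescript
// Assumed set of permission-style error codes; the real list may differ.
const TOLERATED_CODES = ["EACCES", "EPERM", "EROFS"];

function isToleratedFsError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && TOLERATED_CODES.indexOf(code) !== -1;
}

// Runs the write; returns true on success, false when the failure is a
// tolerated permission error. Any other error is rethrown unchanged.
function persistTokensBestEffort(
  write: () => void,
  warn: (message: string) => void,
): boolean {
  try {
    write();
    return true;
  } catch (err) {
    if (!isToleratedFsError(err)) throw err;
    warn("auth store is not writable; continuing without persisting tokens");
    return false;
  }
}
```

Only permission-style failures are swallowed here, so unexpected errors such as a serialization bug still surface to the caller.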

Validation executed:

  • Focused suites (420 tests) across changed modules passed.
  • pnpm lint passed (warnings only, 0 errors).
  • pnpm typecheck passed.
  • pnpm config:profiles:check passed.
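
The profile check presumably regenerates the profile and compares it with the committed file. A hedged sketch of that idea on already-parsed config objects is shown below. applyOverlay and profileIsCurrent are hypothetical names, the server keys in the usage note are invented, and the real scripts/generate-config-profiles.mjs (which works with YAML files) may behave differently.

```typescript
type ConfigValue =
  | string
  | number
  | boolean
  | null
  | ConfigValue[]
  | { [key: string]: ConfigValue };
type ConfigObject = { [key: string]: ConfigValue };

function isObject(value: ConfigValue | undefined): value is ConfigObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Deep-merge overlay onto base without mutating either input.
// Nested objects merge key by key; arrays and scalars are replaced.
function applyOverlay(base: ConfigObject, overlay: ConfigObject): ConfigObject {
  const result: ConfigObject = { ...base };
  for (const key of Object.keys(overlay)) {
    const current = result[key];
    const next = overlay[key];
    result[key] = isObject(current) && isObject(next) ? applyOverlay(current, next) : next;
  }
  return result;
}

// Drift check: the committed profile must equal base + overlay.
// JSON comparison is key-order sensitive, which is acceptable when the
// committed profile is itself produced by the same generator.
function profileIsCurrent(
  base: ConfigObject,
  overlay: ConfigObject,
  committed: ConfigObject,
): boolean {
  return JSON.stringify(applyOverlay(base, overlay)) === JSON.stringify(committed);
}
```

For example, overlaying { server: { host: "0.0.0.0" } } on a base with server.host and server.port changes only the host and leaves the port and every other section as defined in the base.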

Remaining TODOs / Risks

  • No runtime log corpus was available for empirical recurring-error/perf bottleneck analysis.
  • Google auth CLI commands still contain duplicated flow logic across service-specific command files.
  • Auth store remains plaintext on disk (permissions are set, but no at-rest encryption).
  • Provider capability behavior is still partially split across provider clients + capability utility; further normalization is recommended.