Flynn Codebase Audit + Improvement Report
Date: 2026-02-24
Branch: feature/full-audit-hardening-and-config-consolidation
Executive Summary
I audited core Flynn wiring across config -> daemon -> router/providers -> automation/tools -> CLI/docs, then implemented high-safety fixes for auth hardening, router correctness, provider alignment, and config consolidation.
High-impact outcomes:
- Google OAuth runtime is now centralized with store-first token loading, legacy token migration, and refresh persistence.
- Model router fallback behavior now matches retry/fallback intent (no duplicate fallback attempts, retry policy applied on fallback paths).
- OpenAI OAuth mode now fails fast when tools are requested (prevents silent non-executable tool output).
- PaaS config is now generated from canonical default.yaml + overlay to prevent template drift.
- flynn doctor now validates all Google automation services, not just Gmail.
Breaking behavior changes introduced:
- models.fallback_chain schema default changed from ['anthropic'] to [] (avoids invalid fallback entries by default).
- OpenAI OAuth requests with tools now throw explicit errors instead of returning warning text.
Findings (With File Pointers)
HIGH: Google token persistence path caused runtime failures in restricted environments.
- Cause: migration/store writes to ~/.config/flynn/auth.json could fail and abort tool execution.
- Evidence: src/google/oauth.ts, src/auth/google.ts
- Fix: auth store writes now tolerate known filesystem permission errors and preserve token-file compatibility.
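
The tolerant-write behavior described in this fix could look roughly like the sketch below. It is illustrative only: the function names, the injected `write` callback, and the exact set of tolerated error codes are assumptions, not taken from src/auth/google.ts.

```typescript
// Hypothetical sketch: none of these names come from the Flynn source.
type WriteFn = (path: string, data: string) => void;

// Error codes treated as expected in restricted environments (an assumption).
const TOLERATED_CODES = new Set<string>(["EACCES", "EPERM", "EROFS"]);

function isToleratedFsError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && TOLERATED_CODES.has(code);
}

// Returns true if the store was written, false if the write was skipped
// because of a tolerated permission error. Any other error still propagates.
function persistTokenBestEffort(write: WriteFn, path: string, token: object): boolean {
  try {
    write(path, JSON.stringify(token));
    return true;
  } catch (err) {
    if (isToleratedFsError(err)) return false;
    throw err;
  }
}
```

Injecting the writer keeps the sketch testable without touching the filesystem; a caller that gets `false` back can continue with the in-memory or token-file credentials instead of aborting tool execution.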
HIGH: OpenAI OAuth mode accepted tool-bearing requests without executable tool support.
- Cause: OAuth Codex backend path did not support Flynn tool execution semantics.
- Evidence: src/models/openai.ts
- Fix: explicit throw when tools are present, enabling router fallback or config correction.
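
A fail-fast guard of this kind might be sketched as follows. The `AuthMode` and `ChatRequest` shapes and the error class are hypothetical stand-ins; the report does not show the real types in src/models/openai.ts.

```typescript
// Hypothetical types; Flynn's real request and client shapes are not shown in this report.
type AuthMode = "api_key" | "oauth";

interface ChatRequest {
  prompt: string;
  tools?: { name: string }[];
}

class UnsupportedToolsError extends Error {
  constructor() {
    super("OpenAI OAuth mode cannot execute tools; use API-key auth or a fallback provider.");
    this.name = "UnsupportedToolsError";
  }
}

// Throws before any network call so the router can fall back or the user can fix config.
function assertToolsSupported(mode: AuthMode, request: ChatRequest): void {
  const hasTools = (request.tools?.length ?? 0) > 0;
  if (mode === "oauth" && hasTools) {
    throw new UnsupportedToolsError();
  }
}
```

Throwing a typed error, rather than returning warning text, gives the router something it can catch and act on.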
HIGH: Router fallback execution did not fully match retry policy and could repeat the same failing client.
- Cause: retry policy only wrapped primary path; fallback clients could be retried inconsistently; duplicate fallbacks possible.
- Evidence: src/models/router.ts
- Fix: attempted-client tracking and retry wrapping now apply to tier/global fallback chat paths and streaming fallback paths.
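
As a rough illustration of attempted-client tracking combined with retry wrapping on every path, consider the sketch below. `ModelClient`, `withRetry`, and `chatWithFallback` are hypothetical names, and the real router also handles tiers and streaming, which this sketch omits.

```typescript
// Hypothetical shapes; not taken from src/models/router.ts.
interface ModelClient {
  id: string;
  chat(prompt: string): Promise<string>;
}

// Retry a single client up to `maxAttempts` times before giving up on it.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts: number): Promise<T> {
  const attempts = Math.max(1, maxAttempts);
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Try the primary, then each fallback, skipping any client id already attempted.
// The same retry policy wraps the primary and every fallback.
async function chatWithFallback(
  primary: ModelClient,
  fallbacks: ModelClient[],
  prompt: string,
  maxAttempts = 2,
): Promise<string> {
  const attempted = new Set<string>();
  let lastError: unknown = new Error("no clients configured");
  for (const client of [primary, ...fallbacks]) {
    if (attempted.has(client.id)) continue; // avoid duplicate fallback attempts
    attempted.add(client.id);
    try {
      return await withRetry(() => client.chat(prompt), maxAttempts);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

With a failing primary that also appears in the fallback list, the primary is retried according to policy once, then skipped when it reappears, and the next distinct client is tried.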
MEDIUM: flynn doctor had incomplete feature wiring checks for Google services.
- Cause: only Gmail automation health was validated.
- Evidence: src/cli/doctor.ts
- Fix: added service checks for Calendar/Docs/Drive/Tasks with auth-store and token-file detection.
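
One way to picture the per-service check is the sketch below. The service names follow the report; the `CredentialSources` interface and the result shape are assumptions made for illustration, not the structure of src/cli/doctor.ts.

```typescript
// Hypothetical sketch; service names follow the report, everything else is assumed.
const GOOGLE_SERVICES = ["gmail", "calendar", "docs", "drive", "tasks"] as const;
type GoogleService = (typeof GOOGLE_SERVICES)[number];

interface CredentialSources {
  storeHasToken(service: GoogleService): boolean; // central auth store lookup
  tokenFileExists(service: GoogleService): boolean; // legacy per-service token file
}

interface ServiceCheck {
  service: GoogleService;
  ok: boolean;
  source: "store" | "file" | "none";
}

// Prefer the auth store, fall back to a token file, otherwise report the service as unconfigured.
function checkGoogleServices(sources: CredentialSources): ServiceCheck[] {
  return GOOGLE_SERVICES.map((service): ServiceCheck => {
    if (sources.storeHasToken(service)) return { service, ok: true, source: "store" };
    if (sources.tokenFileExists(service)) return { service, ok: true, source: "file" };
    return { service, ok: false, source: "none" };
  });
}
```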
MEDIUM: Config profile overlap risk (manual config/paas.yaml drift from canonical defaults).
- Cause: duplicated full config template with independent edits.
- Evidence: config/profiles/paas.overlay.yaml, scripts/generate-config-profiles.mjs
- Fix: canonical+overlay generation model, profile drift check, and sync test.
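
The canonical+overlay model amounts to a deep merge plus a comparison against the committed file. The sketch below shows the idea on already-parsed config objects; the function names and merge rules (arrays replaced whole) are assumptions, and scripts/generate-config-profiles.mjs may differ.

```typescript
// Hypothetical sketch operating on parsed config objects (YAML parsing is out of scope).
type Plain = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Overlay values win; nested objects merge recursively; arrays and scalars are replaced whole.
function applyOverlay(base: Plain, overlay: Plain): Plain {
  const result: Plain = { ...base };
  for (const key of Object.keys(overlay)) {
    const existing = result[key];
    const value = overlay[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(value) ? applyOverlay(existing, value) : value;
  }
  return result;
}

// Drift check: the committed profile must equal what generation would produce.
// JSON.stringify is key-order sensitive, which is acceptable when both sides
// come from the same generator.
function hasDrifted(committed: Plain, base: Plain, overlay: Plain): boolean {
  return JSON.stringify(committed) !== JSON.stringify(applyOverlay(base, overlay));
}
```

Because the profile is always regenerated from the same two inputs, a hand edit to the generated file shows up as drift instead of silently diverging.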
MEDIUM: Default fallback chain schema value conflicted with router semantics.
- Cause: default ['anthropic'] is not a tier/local-provider key in current router semantics.
- Evidence: src/config/schema.ts
- Fix: schema default set to [] to avoid spurious invalid fallback entries.
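
To see why the old default was a problem, here is a minimal sketch of validating a fallback chain against the keys the router can resolve. The validator name and the example key names are hypothetical; the real schema is not shown in this report.

```typescript
// Hypothetical validator; the real schema lives in src/config/schema.ts and is not shown here.
function findInvalidFallbacks(chain: string[], routableKeys: string[]): string[] {
  const known = new Set<string>(routableKeys);
  return chain.filter((entry) => !known.has(entry));
}

// Example key names are assumptions standing in for tier/local-provider keys.
const exampleRoutableKeys = ["fast", "default", "local"];

// The old default names a provider rather than a routable key, so it is flagged.
const oldDefaultProblems = findInvalidFallbacks(["anthropic"], exampleRoutableKeys);

// The new empty default can never contain an invalid entry.
const newDefaultProblems = findInvalidFallbacks([], exampleRoutableKeys);
```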
LOW: Provider capability type list lagged configured providers.
- Cause: ModelProvider union omitted vercel, minimax, moonshot, synthetic.
- Evidence: src/models/capabilities.ts
- Fix: union updated and test coverage expanded.
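
One common way to keep a union from lagging behind a provider list is to derive the type from a single constant array. The sketch below is an assumption about how that could be done, not a description of src/models/capabilities.ts; only the four provider names mentioned above are known from the report, so the array is deliberately incomplete.

```typescript
// Only the four names below appear in the report; the rest of the real union is not shown.
const MODEL_PROVIDERS = ["vercel", "minimax", "moonshot", "synthetic"] as const;

// The union is derived from the array, so adding a provider updates both at once.
type ModelProvider = (typeof MODEL_PROVIDERS)[number];

// Runtime guard for values that arrive as plain strings, e.g. from config files.
function isModelProvider(value: string): value is ModelProvider {
  return (MODEL_PROVIDERS as readonly string[]).indexOf(value) >= 0;
}
```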
LOW: Audit logger path expansion bug for ~.
- Cause: logger configured rotator with expanded path but write stream used raw path.
- Evidence: src/audit/logger.ts
- Fix: normalized path now used consistently by logger and rotator.
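
The fix boils down to expanding the path once and handing the same string to both consumers. A minimal sketch follows; the helper names are hypothetical, and the home directory is passed in as a parameter (rather than read from the OS) purely to keep the example dependency-free.

```typescript
// Hypothetical helper; `home` is passed in so the sketch has no runtime dependencies.
function expandTilde(path: string, home: string): string {
  if (path === "~") return home;
  if (path.startsWith("~/")) return home.replace(/\/+$/, "") + path.slice(1);
  return path; // "~user/..." and non-tilde paths are left untouched
}

// Resolve once, then give the identical string to the write stream and the rotator.
function resolveAuditLogPaths(
  rawPath: string,
  home: string,
): { streamPath: string; rotatorPath: string } {
  const resolved = expandTilde(rawPath, home);
  return { streamPath: resolved, rotatorPath: resolved };
}
```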
INFO: Log-pattern analysis could not be completed from repository artifacts.
- Cause: no runtime .log / audit JSONL artifacts present in workspace snapshot.
- Evidence: repository scan returned no log files under repo root.
- Mitigation: recommendations added below for repeatable log collection/analysis workflow.
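
Once a log corpus exists, a small analyzer over JSONL records would make error rate, latency, and fallback frequency measurable. The sketch below assumes a record shape (`level`, `latencyMs`, `fallback`) that is invented for illustration; Flynn's actual audit fields are not described in this report.

```typescript
// Hypothetical record shape; the report does not describe Flynn's audit JSONL fields.
interface LogRecord {
  level?: string;
  latencyMs?: number;
  fallback?: boolean;
}

interface LogSummary {
  total: number;
  errorRate: number;
  fallbackRate: number;
  meanLatencyMs: number | null;
}

function summarizeJsonl(jsonl: string): LogSummary {
  const records: LogRecord[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (typeof parsed === "object" && parsed !== null) records.push(parsed as LogRecord);
    } catch (err) {
      // Malformed lines are skipped rather than failing the whole analysis.
    }
  }
  const total = records.length;
  const errors = records.filter((r) => r.level === "error").length;
  const fallbacks = records.filter((r) => r.fallback === true).length;
  const latencies = records
    .map((r) => r.latencyMs)
    .filter((ms): ms is number => typeof ms === "number");
  const meanLatencyMs =
    latencies.length === 0 ? null : latencies.reduce((a, b) => a + b, 0) / latencies.length;
  return {
    total,
    errorRate: total === 0 ? 0 : errors / total,
    fallbackRate: total === 0 ? 0 : fallbacks / total,
    meanLatencyMs,
  };
}
```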
Recommended Changes (Prioritized)
- P0: Keep OpenAI OAuth tool rejection behavior and enforce documented fallback guidance.
- P0: Keep Google auth centralized; avoid introducing new per-tool OAuth duplication.
- P1: Add a shared Google auth CLI factory to remove duplicated *-auth command flow code.
- P1: Add optional XDG_CONFIG_HOME / override support for auth store paths for containerized/sandboxed environments.
- P1: Add periodic log export + analyzer command (error-rate, latency, provider fallback frequency) so reliability trends are measurable from CI/dev snapshots.
- P2: Introduce a provider capability matrix module consumed by router/doctor/docs from one source of truth.
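
For the XDG_CONFIG_HOME recommendation, path resolution might follow the precedence sketched below. This is a proposal sketch, not existing behavior: `FLYNN_AUTH_STORE` is an invented override name, and the environment and home directory are passed in as parameters to keep the example self-contained.

```typescript
// Hypothetical resolver; FLYNN_AUTH_STORE is an invented override name, not an existing Flynn setting.
type Env = { [name: string]: string | undefined };

function resolveAuthStorePath(env: Env, home: string): string {
  // 1. An explicit override wins outright.
  const explicit = env["FLYNN_AUTH_STORE"];
  if (explicit) return explicit;

  // 2. XDG_CONFIG_HOME is honored only when absolute; the XDG Base Directory
  //    specification says relative values should be ignored.
  const xdg = env["XDG_CONFIG_HOME"];
  const configHome = xdg && xdg.startsWith("/") ? xdg : home + "/.config";

  // 3. Otherwise fall back to the current default location under ~/.config.
  return configHome + "/flynn/auth.json";
}
```

In a container, setting XDG_CONFIG_HOME to a writable volume would then relocate the store without code changes.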
Implemented Changes (Diff Summary)
Commits in this branch:
- 5b95eb1 fix(audit): expand tilde paths for audit log output
- 076379b refactor(config): generate paas profile from default overlay
- 00b2d64 feat(google-auth): centralize oauth token store and service checks
- 092a9ba fix(router): align fallback semantics and oauth provider behavior
Notable file groups:
- Audit hardening: src/audit/logger.ts, src/audit/logger.test.ts
- Config consolidation: config/profiles/paas.overlay.yaml, config/paas.yaml, scripts/generate-config-profiles.mjs, src/config/profileTemplates.test.ts, docs/deployment/PAAS.md, package.json
- Google auth hardening: src/auth/google.ts, src/google/oauth.ts, Google tool modules, Gmail watcher, Google auth CLI commands, src/cli/doctor.ts
- Router/provider correctness: src/models/router.ts, src/models/openai.ts, src/config/schema.ts, src/models/capabilities.ts
- Documentation additions: Google OAuth runbook and agent-facing repo map docs
Validation executed:
- Focused suites (420 tests) across changed modules passed.
- pnpm lint passed (warnings only, 0 errors).
- pnpm typecheck passed.
- pnpm config:profiles:check passed.
Remaining TODOs / Risks
- No runtime log corpus was available for empirical recurring-error/perf bottleneck analysis.
- Google auth CLI commands still contain duplicated flow logic across service-specific command files.
- Auth store remains plaintext on disk (permissions are set, but no at-rest encryption).
- Provider capability behavior is still partially split across provider clients + capability utility; further normalization is recommended.