
William Valentin c48f6f5fd3 docs: add Gemini CLI subagent runbook

2026-02-22 16:53:56 -08:00


Gemini CLI Subagent Runbook (Flynn)

This runbook defines how Flynn should use the local gemini CLI as a subagent (external model call) and how to digest/merge its output safely.

Goals

Use Gemini as a delegated helper for specific tasks (retrieval, parsing, drafting), while Flynn remains responsible for:
- choosing the right model/output mode
- validating results against local evidence when possible
- producing the final answer/patch/plan
Keep Gemini outputs auditable without spamming the operator.

Safety + Trust Model

Treat Gemini output as untrusted.
Prefer local verification when feasible (grep, tests, PDF tooling output, etc.).
Never claim system state changes or file edits based solely on Gemini output.
If Gemini output contradicts local evidence, local evidence wins.
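
One way to apply "local evidence wins" is to check any file paths Gemini mentions against the working tree before repeating them. The sketch below is illustrative only: `check_claimed_paths` is a hypothetical helper name, and it assumes the claimed paths are meant to be relative to a known root directory.

```python
from pathlib import Path


def check_claimed_paths(claimed, root):
    """Split paths mentioned by Gemini into (found, missing) relative to root.

    Hypothetical helper: paths that resolve outside root are reported as
    missing, so a claim like "../../etc/passwd" is never treated as verified.
    """
    root = Path(root).resolve()
    found, missing = [], []
    for rel in claimed:
        candidate = (root / rel).resolve()
        inside = candidate == root or root in candidate.parents
        if inside and candidate.exists():
            found.append(rel)
        else:
            missing.append(rel)
    return found, missing
```

Anything in `missing` should be treated as unverified and either dropped or flagged in the final answer.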

Default Model Selection

Use the smallest/cheapest model that reliably accomplishes the task.

Document search & retrieval (query expansion, relevance judging)

Default: models/gemini-2.5-flash
Upgrade to: models/gemini-2.5-pro for subtle/high-stakes domains

Document parsing (structure → JSON, tables, policies)

Default: models/gemini-2.5-pro
Downgrade to: models/gemini-2.5-flash for simple extraction

Embeddings (vector index)

models/gemini-embedding-001

Image understanding

Default: models/gemini-2.5-flash
If an explicit image variant is required by the CLI/workflow: models/gemini-2.5-flash-image

Image generation (lightweight)

Default: models/imagen-4.0-fast-generate-001
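
The defaults above can be kept in one lookup table so the choice is made in a single place. This is a sketch, not part of Flynn itself: the task keys (`retrieval`, `parsing`, and so on) and the `pick_model` helper are hypothetical names; only the model identifiers come from this runbook.

```python
# Model identifiers are copied from the runbook; the task keys are
# hypothetical labels chosen for this example.
MODEL_DEFAULTS = {
    "retrieval": {
        "default": "models/gemini-2.5-flash",
        "alternate": "models/gemini-2.5-pro",  # subtle/high-stakes domains
    },
    "parsing": {
        "default": "models/gemini-2.5-pro",
        "alternate": "models/gemini-2.5-flash",  # simple extraction
    },
    "embeddings": {"default": "models/gemini-embedding-001"},
    "image_understanding": {
        "default": "models/gemini-2.5-flash",
        "alternate": "models/gemini-2.5-flash-image",  # explicit image variant
    },
    "image_generation": {"default": "models/imagen-4.0-fast-generate-001"},
}


def pick_model(task, use_alternate=False):
    """Return the default model for a task, or its alternate if one exists."""
    entry = MODEL_DEFAULTS[task]
    if use_alternate:
        return entry.get("alternate", entry["default"])
    return entry["default"]
```

For example, `pick_model("retrieval", use_alternate=True)` selects the pro model for a high-stakes search task.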

Output Mode (`-o`)

Default: `-o json`

Use for:

any workflow that will be parsed (jq, Python)
extraction tasks (schemas, tables, lists)
runs where we want a single stable artifact

Alternative: `-o stream-json`

Use only when:

generation is long and we want incremental progress
we have a streaming consumer (don’t assume jq can parse the whole stream)
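
A streaming consumer might look like the sketch below. It assumes the stream arrives as one JSON object per line (newline-delimited JSON); that framing is an assumption to confirm against the installed CLI version, and `iter_stream_events` is a hypothetical helper. No event field names are assumed.

```python
import json


def iter_stream_events(lines):
    """Yield (event, error) pairs, one per non-empty input line.

    Assumes newline-delimited JSON. A line that fails to parse is reported
    as an error instead of aborting the whole stream.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line), None
        except json.JSONDecodeError as exc:
            yield None, f"unparseable line: {exc.msg}"
```

Passing the process's stdout file object as `lines` lets events be handled as they arrive rather than after the run finishes.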

Prompt Construction

Put the task first.
Specify required output format explicitly.
Include constraints (e.g., “Return valid JSON only, no prose”).
Include context verbatim, clearly delimited.
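
The four rules above can be folded into a small prompt builder. This is a minimal sketch: `build_prompt` and the `<<<CONTEXT` / `CONTEXT>>>` delimiters are hypothetical choices, not a format the CLI requires.

```python
def build_prompt(task, output_format, constraints, context):
    """Assemble a prompt: task first, then format, constraints, and context.

    The context is inserted verbatim between explicit delimiters so the
    model can tell instructions apart from quoted material.
    """
    parts = [f"TASK: {task}", f"OUTPUT FORMAT: {output_format}"]
    if constraints:
        parts.append("CONSTRAINTS:")
        parts.extend(f"- {item}" for item in constraints)
    parts.append("<<<CONTEXT")
    parts.append(context)
    parts.append("CONTEXT>>>")
    return "\n".join(parts)
```

If the context could itself contain the closing delimiter, pick a delimiter that cannot appear in it.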

Shell escaping

For multi-line prompts or untrusted content, prefer a heredoc wrapper to avoid shell escaping issues:

gemini -m models/gemini-2.5-pro -o json -p "$(cat <<'PROMPT'
...prompt...
PROMPT
)"
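
When the caller is a Python process rather than a shell script, the escaping problem can be sidestepped by passing the prompt as a single argument with no shell in between. The sketch below uses only the flags shown above (`-m`, `-o`, `-p`); the helper names and the 300-second timeout are hypothetical.

```python
import subprocess


def gemini_argv(model, prompt, output_mode="json"):
    """Build the argument list; the prompt stays one argument, unescaped."""
    return ["gemini", "-m", model, "-o", output_mode, "-p", prompt]


def run_gemini(model, prompt, output_mode="json", timeout=300):
    """Run the CLI without a shell and capture stdout/stderr as text.

    The timeout value is an arbitrary placeholder for this sketch.
    """
    proc = subprocess.run(
        gemini_argv(model, prompt, output_mode),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Because no shell parses the argument list, quotes, `$`, and backticks in untrusted content reach the CLI unchanged.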

Execution Pattern (Flynn)

1. Choose model + `-o` mode.
2. Run `gemini ...` via shell.
3. Capture stdout/stderr.
4. Digest output:
   - extract key claims
   - check for missing fields / invalid JSON
   - look for hallucination risks (citations? file paths? commands?)
5. Verify locally when possible.
6. Produce final response / patch.
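
The digest step can be partly mechanized. The sketch below checks captured stdout for invalid JSON and missing fields, and flags path-like strings as items to verify locally. `digest_output`, the required field names passed in, and the crude path regex are all hypothetical; extracting key claims still needs judgment.

```python
import json
import re

# Crude pattern for path-like strings (e.g. "src/app/main.py"). It will also
# match other slash-separated text, so hits are prompts to verify, not facts.
PATH_LIKE = re.compile(r"(?:[\w.-]+/)+[\w.-]+")


def digest_output(stdout, required_fields):
    """Return a small report on one captured Gemini response."""
    report = {
        "valid_json": False,
        "missing_fields": list(required_fields),
        "risks": {},
    }
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError:
        return report
    report["valid_json"] = True
    # Only a JSON object can carry named fields; anything else keeps all
    # required fields marked as missing.
    if isinstance(data, dict):
        report["missing_fields"] = [f for f in required_fields if f not in data]
    hits = sorted(set(PATH_LIKE.findall(json.dumps(data))))
    if hits:
        report["risks"]["file_path"] = hits
    return report
```

A report with `valid_json` false or a non-empty `missing_fields` is a reason to include the raw output when reporting back.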

How Flynn Reports Gemini Usage

Flynn should incorporate Gemini results selectively:

Default: provide a brief digest of what Gemini contributed.
Include raw Gemini output when:
- debugging is needed (JSON parse errors, contradictions)
- the operator asks for it
- provenance/audit is important

Suggested response block when Gemini was used:

Gemini subagent: model + output mode
Digest: 3–6 bullet summary of what mattered
Raw: omitted unless requested
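
A small formatter keeps this block consistent across responses. This is a sketch of the suggested layout only; `format_gemini_report` is a hypothetical helper.

```python
def format_gemini_report(model, output_mode, digest, raw=None):
    """Render the suggested response block; raw output is opt-in."""
    lines = [f"Gemini subagent: {model}, -o {output_mode}", "Digest:"]
    lines.extend(f"- {item}" for item in digest)
    if raw is None:
        lines.append("Raw: omitted unless requested")
    else:
        lines.append("Raw:")
        lines.append(raw)
    return "\n".join(lines)
```

Pass `raw=` only in the cases listed above: debugging, an operator request, or an audit need.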

Common Recipes

Query expansion (retrieval)

Model: models/gemini-2.5-flash

Ask for:

5–15 search queries
key entities/synonyms
include/exclude terms
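
As an illustration of this recipe, the sketch below pairs a prompt template with a validator for the reply. The JSON key names (`queries`, `entities`, `include_terms`, `exclude_terms`) and both identifiers are hypothetical; only the 5–15 query range comes from the list above.

```python
import json

# Hypothetical template; the key names are assumptions made for this sketch.
QUERY_EXPANSION_PROMPT = """Expand the question below into search queries.
Return valid JSON only, no prose, with exactly these keys:
  "queries": 5-15 search query strings
  "entities": key entities and synonyms
  "include_terms": terms results should contain
  "exclude_terms": terms results should not contain

QUESTION:
{question}
"""

LIST_KEYS = ("queries", "entities", "include_terms", "exclude_terms")


def validate_expansion(raw_json):
    """Return a list of problems; an empty list means the reply looks usable."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError:
        return ["invalid JSON"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = [
        f"missing or non-list field: {key}"
        for key in LIST_KEYS
        if not isinstance(data.get(key), list)
    ]
    queries = data.get("queries")
    if isinstance(queries, list) and not 5 <= len(queries) <= 15:
        problems.append(f"expected 5-15 queries, got {len(queries)}")
    return problems
```

The prompt is filled with `QUERY_EXPANSION_PROMPT.format(question=...)`; a non-empty problem list is a reason to retry or to fall back to the original query.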

Parsing a document into JSON

Model: models/gemini-2.5-pro, -o json