Gemini CLI Subagent Runbook (Flynn)
This runbook defines how Flynn should use the local gemini CLI as a subagent (external model call) and how to digest/merge its output safely.
Goals
- Use Gemini as a delegated helper for specific tasks (retrieval, parsing, drafting), while Flynn remains responsible for:
  - choosing the right model/output mode
  - validating results against local evidence when possible
  - producing the final answer/patch/plan
- Keep Gemini outputs auditable without spamming the operator.
Safety + Trust Model
- Treat Gemini output as untrusted.
- Prefer local verification when feasible (grep, tests, PDF tooling output, etc.).
- Never claim system state changes or file edits based solely on Gemini output.
- If Gemini output contradicts local evidence, local evidence wins.
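The "local evidence wins" rule can be sketched as a small check. This is a minimal illustration, assuming Gemini has returned a list of file paths it claims exist; the function name `split_claimed_paths` is hypothetical and not part of Flynn or the gemini CLI.

```python
from pathlib import Path


def split_claimed_paths(claimed, root="."):
    """Split Gemini-claimed paths into locally confirmed and unconfirmed.

    Hypothetical helper: only the local filesystem decides whether a path is real.
    """
    base = Path(root)
    confirmed, unconfirmed = [], []
    for rel in claimed:
        if (base / rel).exists():
            confirmed.append(rel)
        else:
            unconfirmed.append(rel)
    return confirmed, unconfirmed
```

Anything in the unconfirmed list should be reported as a Gemini claim, not as a fact about the system.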
Default Model Selection
Use the smallest/cheapest model that reliably accomplishes the task.
Document search & retrieval (query expansion, relevance judging)
- Default: models/gemini-2.5-flash
- Upgrade to: models/gemini-2.5-pro for subtle/high-stakes domains
Document parsing (structure → JSON, tables, policies)
- Default: models/gemini-2.5-pro
- Downgrade to: models/gemini-2.5-flash for simple extraction
Embeddings (vector index)
models/gemini-embedding-001
Image understanding
- Default: models/gemini-2.5-flash
- If explicit image-variant required by the CLI/workflow: models/gemini-2.5-flash-image
Image generation (lightweight)
- Default: models/imagen-4.0-fast-generate-001
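The selection rules above could be kept as a small lookup table so the choice is explicit and easy to update. This is only a sketch: the task labels and the `pick_model` helper are hypothetical, while the model IDs are copied from this section.

```python
# Task labels are hypothetical; model IDs come from the selection list above.
# Each entry is (default, alternate); alternate is None when the runbook names none.
DEFAULT_MODELS = {
    "retrieval": ("models/gemini-2.5-flash", "models/gemini-2.5-pro"),
    "parsing": ("models/gemini-2.5-pro", "models/gemini-2.5-flash"),
    "embeddings": ("models/gemini-embedding-001", None),
    "image_understanding": ("models/gemini-2.5-flash", "models/gemini-2.5-flash-image"),
    "image_generation": ("models/imagen-4.0-fast-generate-001", None),
}


def pick_model(task, use_alternate=False):
    """Return the default model for a task, or its alternate when requested."""
    default, alternate = DEFAULT_MODELS[task]
    if use_alternate and alternate is not None:
        return alternate
    return default
```

For example, `pick_model("retrieval", use_alternate=True)` returns the pro model for a subtle or high-stakes search.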
Output Mode (-o)
Default: -o json
Use for:
- any workflow that will be parsed (jq, Python)
- extraction tasks (schemas, tables, lists)
- runs where we want a single stable artifact
Use: -o stream-json
Use only when:
- generation is long and we want incremental progress
- we have a streaming consumer (don’t assume jq can parse the whole stream)
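A streaming consumer might look like the sketch below. It assumes that stream-json emits one JSON object per line (newline-delimited JSON), which this runbook does not confirm; check real CLI output before relying on it. The name `iter_stream_events` is hypothetical.

```python
import json


def iter_stream_events(lines):
    """Yield (line_number, event, error) for each non-empty line.

    Assumption: one JSON object per line. Bad lines are reported
    instead of aborting the whole stream.
    """
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue
        try:
            yield number, json.loads(text), None
        except json.JSONDecodeError as exc:
            yield number, None, str(exc)
```

Any iterable of text lines works as input, for example the `stdout` of a `subprocess.Popen(..., text=True)` process.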
Prompt Construction
- Put the task first.
- Specify required output format explicitly.
- Include constraints (e.g., “Return valid JSON only, no prose”).
- Include context verbatim, clearly delimited.
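The four rules above can be sketched as a prompt builder. The section labels and the `<<<CONTEXT` / `CONTEXT>>>` delimiters are illustrative assumptions, not a format required by Gemini.

```python
def build_prompt(task, output_format, constraints, context):
    """Assemble a prompt: task first, then format, constraints, delimited context.

    Labels and delimiters are hypothetical choices for this sketch.
    """
    parts = [f"Task: {task}", f"Output format: {output_format}"]
    if constraints:
        parts.append("Constraints:")
        parts.extend(f"- {item}" for item in constraints)
    # Context is passed through verbatim between explicit delimiters.
    parts.append("<<<CONTEXT")
    parts.append(context)
    parts.append("CONTEXT>>>")
    return "\n".join(parts)
```

Keeping the context last and delimited makes it harder for untrusted text to be read as part of the instructions, though it does not guarantee it.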
Shell escaping
For multi-line prompts or untrusted content, prefer a heredoc wrapper to avoid shell escaping issues:
gemini -m models/gemini-2.5-pro -o json -p "$(cat <<'PROMPT'
...prompt...
PROMPT
)"
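When the call is made from Python instead of a shell, quoting can be sidestepped by passing an argument list. The sketch below reuses only the flags shown in the command above (-m, -o, -p); the helper names and the 120-second timeout are assumptions.

```python
import subprocess


def build_gemini_argv(model, prompt, output_mode="json"):
    """Build the argument list; flags mirror the shell example (-m, -o, -p)."""
    return ["gemini", "-m", model, "-o", output_mode, "-p", prompt]


def run_gemini(model, prompt, output_mode="json", timeout=120):
    """Run the CLI without a shell, capturing stdout and stderr as text.

    No shell=True: each list element reaches the CLI as exactly one
    argument, so quotes, `$`, and newlines in the prompt are not interpreted.
    The timeout value is an arbitrary assumption.
    """
    return subprocess.run(
        build_gemini_argv(model, prompt, output_mode),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

The returned `CompletedProcess` exposes `returncode`, `stdout`, and `stderr` for the digest step.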
Execution Pattern (Flynn)
- Choose model + -o mode.
- Run gemini ... via shell.
- Capture stdout/stderr.
- Digest output:
  - extract key claims
  - check for missing fields / invalid JSON
  - look for hallucination risks (citations? file paths? commands?)
- Verify locally when possible.
- Produce final response / patch.
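The "check for missing fields / invalid JSON" part of the digest step can be sketched as follows. The function works on whatever JSON text the caller passes in; if the CLI wraps the model's answer in an envelope, unwrapping it first is left to the caller. `digest_output` and its return shape are hypothetical.

```python
import json


def digest_output(text, required_fields):
    """Return (data, problems); data is None when text is not valid JSON."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return data, ["top-level value is not a JSON object"]
    missing = [name for name in required_fields if name not in data]
    return data, [f"missing field: {name}" for name in missing]
```

An empty problem list only means the output is well-formed; the claims inside it still need local verification.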
How Flynn Reports Gemini Usage
Flynn should incorporate Gemini results selectively:
- Default: provide a brief digest of what Gemini contributed.
- Include raw Gemini output when:
  - debugging is needed (JSON parse errors, contradictions)
  - the operator asks for it
  - provenance/audit is important
Suggested response block when Gemini was used:
- Gemini subagent: model + output mode
- Digest: 3–6 bullet summary of what mattered
- Raw: omitted unless requested
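One way to render that block consistently is a small formatter. This is a sketch only; the function name and the exact line layout are assumptions.

```python
def format_gemini_report(model, output_mode, digest, raw=None):
    """Render the suggested response block as plain text.

    `digest` is a list of short strings; `raw` is included only when given.
    """
    lines = [f"Gemini subagent: {model}, -o {output_mode}", "Digest:"]
    lines.extend(f"- {item}" for item in digest)
    if raw is None:
        lines.append("Raw: omitted unless requested")
    else:
        lines.append(f"Raw:\n{raw}")
    return "\n".join(lines)
```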
Common Recipes
Query expansion (retrieval)
Model: models/gemini-2.5-flash
Ask for:
- 5–15 search queries
- key entities/synonyms
- include/exclude terms
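A response to this recipe can be checked before it is used for search. The key names (`queries`, `entities`, `include`, `exclude`) are a hypothetical response shape that the prompt would have to request explicitly.

```python
def check_query_expansion(result):
    """Return a list of problems for a hypothetical response shaped like
    {"queries": [...], "entities": [...], "include": [...], "exclude": [...]}.
    """
    problems = []
    queries = result.get("queries", [])
    if not 5 <= len(queries) <= 15:
        problems.append(f"expected 5-15 queries, got {len(queries)}")
    for key in ("entities", "include", "exclude"):
        if not isinstance(result.get(key), list):
            problems.append(f"{key}: expected a list")
    return problems
```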
Parsing a document into JSON
Model: models/gemini-2.5-pro, -o json
Ask for:
- strict JSON schema
- explicit field types
- “unknown”/null handling
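Explicit field types and null handling can be enforced locally after parsing. The invoice schema below is purely illustrative, and treating JSON null as the only accepted "unknown" value is one possible convention, not something this runbook mandates.

```python
# Hypothetical schema: field name -> allowed Python type(s) after json.loads.
INVOICE_SCHEMA = {"vendor": str, "total": (int, float), "due_date": str}


def check_types(record, schema):
    """Return a list of problems; None is accepted as the explicit 'unknown' value."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"{field}: missing")
            continue
        value = record[field]
        if value is None:
            continue  # unknown values should be null, not omitted or guessed
        # bool is rejected because it would otherwise pass as int;
        # this sketch has no boolean fields.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{field}: wrong type {type(value).__name__}")
    return problems
```

Requiring every field to be present, with null for unknowns, makes it easier to tell "Gemini did not find it" apart from "Gemini dropped the field".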
PDF workflows
Gemini is for interpretation/planning; execution happens locally with tools like:
qpdf, pdftk, pdfcpu, ocrmypdf, pikepdf, mutool, pdftotext (poppler)
Troubleshooting
- If the CLI errors:
  - capture stderr
  - retry with smaller prompt / less context
  - switch model (flash ↔ pro)
- If JSON is invalid:
  - rerun asking for valid JSON only
  - or request a JSON schema + separate data
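The retry steps above can be combined into one loop. In this sketch, `run` is a hypothetical callable returning `(returncode, stdout, stderr)`, a zero return code is assumed to mean success, and halving the context by truncation is a deliberately crude stand-in for "less context". On a CLI error it both shrinks the context and switches model; a real implementation might try these one at a time.

```python
import json

FLASH = "models/gemini-2.5-flash"
PRO = "models/gemini-2.5-pro"


def run_with_fallback(run, model, prompt, context, max_attempts=3):
    """Return (data, errors); data is None if every attempt failed.

    `run(model, prompt_text)` is a hypothetical callable that returns
    (returncode, stdout, stderr).
    """
    errors = []
    for _ in range(max_attempts):
        code, out, err = run(model, f"{prompt}\n\n{context}")
        if code == 0:
            try:
                return json.loads(out), errors
            except json.JSONDecodeError:
                errors.append(f"{model}: invalid JSON")
                # Rerun asking for valid JSON only.
                prompt = prompt + "\nReturn valid JSON only, no prose."
                continue
        errors.append(f"{model}: {err.strip()}")  # keep captured stderr
        context = context[: len(context) // 2]    # retry with less context
        model = PRO if model == FLASH else FLASH  # switch model
    return None, errors
```

The collected `errors` list doubles as the audit trail to show the operator when debugging is needed.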
Update Policy
This runbook should evolve.
When new model variants appear in GET /v1beta/models, update the model selection section.
When Flynn gains a first-class Gemini provider/router integration, align this runbook with the native provider behavior.