From c48f6f5fd358dc2591a032256878ee5063224491 Mon Sep 17 00:00:00 2001
From: William Valentin
Date: Sun, 22 Feb 2026 16:53:56 -0800
Subject: [PATCH] docs: add Gemini CLI subagent runbook

---
 docs/runbooks/GEMINI_CLI_SUBAGENT.md | 137 +++++++++++++++++++++++++++
 1 file changed, 137 insertions(+)
 create mode 100644 docs/runbooks/GEMINI_CLI_SUBAGENT.md

diff --git a/docs/runbooks/GEMINI_CLI_SUBAGENT.md b/docs/runbooks/GEMINI_CLI_SUBAGENT.md
new file mode 100644
index 0000000..83b518f
--- /dev/null
+++ b/docs/runbooks/GEMINI_CLI_SUBAGENT.md
@@ -0,0 +1,137 @@
+# Gemini CLI Subagent Runbook (Flynn)
+
+This runbook defines how Flynn should use the local `gemini` CLI as a *subagent* (external model call) and how to digest/merge its output safely.
+
+## Goals
+
+- Use Gemini as a delegated helper for specific tasks (retrieval, parsing, drafting), while Flynn remains responsible for:
+  - choosing the right model/output mode
+  - validating results against local evidence when possible
+  - producing the final answer/patch/plan
+- Keep Gemini outputs auditable without spamming the operator.
+
+## Safety + Trust Model
+
+- Treat Gemini output as **untrusted**.
+- Prefer **local verification** when feasible (grep, tests, PDF tooling output, etc.).
+- Never claim system state changes or file edits based solely on Gemini output.
+- If Gemini output contradicts local evidence, **local evidence wins**.
+
+## Default Model Selection
+
+Use the smallest/cheapest model that reliably accomplishes the task.
+
+### Document search & retrieval (query expansion, relevance judging)
+- Default: `models/gemini-2.5-flash`
+- Upgrade to: `models/gemini-2.5-pro` for subtle/high-stakes domains
+
+### Document parsing (structure → JSON, tables, policies)
+- Default: `models/gemini-2.5-pro`
+- Downgrade to: `models/gemini-2.5-flash` for simple extraction
+
+### Embeddings (vector index)
+- `models/gemini-embedding-001`
+
+### Image understanding
+- Default: `models/gemini-2.5-flash`
+- If explicit image-variant required by the CLI/workflow: `models/gemini-2.5-flash-image`
+
+### Image generation (lightweight)
+- Default: `models/imagen-4.0-fast-generate-001`
+
+## Output Mode (`-o`)
+
+### Default: `-o json`
+Use for:
+- any workflow that will be parsed (`jq`, Python)
+- extraction tasks (schemas, tables, lists)
+- runs where we want a single stable artifact
+
+### Use: `-o stream-json`
+Use only when:
+- generation is long and we want incremental progress
+- we have a streaming consumer (don’t assume `jq` can parse the whole stream)
+
+## Prompt Construction
+
+- Put the task first.
+- Specify required output format *explicitly*.
+- Include constraints (e.g., “Return valid JSON only, no prose”).
+- Include context verbatim, clearly delimited.
+
+### Shell escaping
+For multi-line prompts or untrusted content, prefer a heredoc wrapper to avoid shell escaping issues:
+
+```bash
+gemini -m models/gemini-2.5-pro -o json -p "$(cat <<'PROMPT'
+...prompt...
+PROMPT
+)"
+```
+
+## Execution Pattern (Flynn)
+
+1. Choose model + `-o` mode.
+2. Run `gemini ...` via shell.
+3. Capture stdout/stderr.
+4. Digest output:
+   - extract key claims
+   - check for missing fields / invalid JSON
+   - look for hallucination risks (citations? file paths? commands?)
+5. Verify locally when possible.
+6. Produce final response / patch.
+
+## How Flynn Reports Gemini Usage
+
+Flynn should incorporate Gemini results *selectively*:
+
+- Default: provide a **brief digest** of what Gemini contributed.
+- Include **raw Gemini output** when:
+  - debugging is needed (JSON parse errors, contradictions)
+  - the operator asks for it
+  - provenance/audit is important
+
+Suggested response block when Gemini was used:
+
+- `Gemini subagent:` model + output mode
+- `Digest:` 3–6 bullet summary of what mattered
+- `Raw:` omitted unless requested
+
+## Common Recipes
+
+### Query expansion (retrieval)
+Model: `models/gemini-2.5-flash`
+
+Ask for:
+- 5–15 search queries
+- key entities/synonyms
+- include/exclude terms
+
+### Parsing a document into JSON
+Model: `models/gemini-2.5-pro`, `-o json`
+
+Ask for:
+- strict JSON schema
+- explicit field types
+- “unknown”/null handling
+
+### PDF workflows
+Gemini is for interpretation/planning; execution happens locally with tools like:
+- `qpdf`, `pdftk`, `pdfcpu`, `ocrmypdf`, `pikepdf`, `mutool`, `pdftotext` (poppler)
+
+## Troubleshooting
+
+- If the CLI errors:
+  - capture stderr
+  - retry with smaller prompt / less context
+  - switch model (flash ↔ pro)
+- If JSON is invalid:
+  - rerun asking for **valid JSON only**
+  - or request a JSON schema + separate data
+
+## Update Policy
+
+This runbook should evolve.
+
+When new model variants appear in `GET /v1beta/models`, update the model selection section.
+When Flynn gains a first-class Gemini provider/router integration, align this runbook with the native provider behavior.
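
Reviewer note (not part of the patch): since the update policy asks for the model selection section to be revised whenever new variants appear, it may help to keep the defaults in one small shell helper. The block below is a minimal sketch under that assumption. The function name `pick_model` and the task category names are hypothetical and are not defined by the runbook or by the `gemini` CLI; only the model identifiers come from the runbook itself.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps a task category to the default model listed in
# the runbook's "Default Model Selection" section. The category names are
# illustrative assumptions; the model IDs are copied from the runbook.
pick_model() {
  case "$1" in
    retrieval)           echo "models/gemini-2.5-flash" ;;
    parsing)             echo "models/gemini-2.5-pro" ;;
    embedding)           echo "models/gemini-embedding-001" ;;
    image-understanding) echo "models/gemini-2.5-flash" ;;
    image-generation)    echo "models/imagen-4.0-fast-generate-001" ;;
    *)
      # Unknown categories fail loudly instead of silently picking a model.
      echo "pick_model: unknown task category: $1" >&2
      return 1
      ;;
  esac
}

# Usage sketch, commented out so the block runs without the CLI installed.
# The flags mirror the heredoc example in the "Shell escaping" section:
#   gemini -m "$(pick_model parsing)" -o json -p "Return valid JSON only, no prose. ..."
```

With this shape, a model rename would touch one `case` arm rather than every call site. The upgrade and downgrade paths (flash ↔ pro) are deliberately left to the caller, since the runbook treats them as judgment calls.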