docs: add Gemini CLI subagent runbook

William Valentin
2026-02-22 16:53:56 -08:00
parent b9cbc646fc
commit c48f6f5fd3
+137
@@ -0,0 +1,137 @@
# Gemini CLI Subagent Runbook (Flynn)
This runbook defines how Flynn should use the local `gemini` CLI as a *subagent* (external model call) and how to digest/merge its output safely.
## Goals
- Use Gemini as a delegated helper for specific tasks (retrieval, parsing, drafting), while Flynn remains responsible for:
  - choosing the right model/output mode
  - validating results against local evidence when possible
  - producing the final answer/patch/plan
- Keep Gemini outputs auditable without spamming the operator.
## Safety + Trust Model
- Treat Gemini output as **untrusted**.
- Prefer **local verification** when feasible (grep, tests, PDF tooling output, etc.).
- Never claim system state changes or file edits based solely on Gemini output.
- If Gemini output contradicts local evidence, **local evidence wins**.
## Default Model Selection
Use the smallest/cheapest model that reliably accomplishes the task.
### Document search & retrieval (query expansion, relevance judging)
- Default: `models/gemini-2.5-flash`
- Upgrade to: `models/gemini-2.5-pro` for subtle/high-stakes domains
### Document parsing (structure → JSON, tables, policies)
- Default: `models/gemini-2.5-pro`
- Downgrade to: `models/gemini-2.5-flash` for simple extraction
### Embeddings (vector index)
- `models/gemini-embedding-001`
### Image understanding
- Default: `models/gemini-2.5-flash`
- If the CLI/workflow requires an explicit image variant: `models/gemini-2.5-flash-image`
### Image generation (lightweight)
- Default: `models/imagen-4.0-fast-generate-001`
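The defaults above can be kept in one lookup table so the choice is explicit and easy to update. This is a minimal sketch: the model names are copied from this section, while the task keys and the `choose_model` helper are hypothetical names introduced only for illustration.

```python
# Default and alternate models per task, copied from the sections above.
# The task keys ("retrieval", "parsing", ...) are hypothetical labels.
MODEL_TABLE = {
    "retrieval": {
        "default": "models/gemini-2.5-flash",
        "alternate": "models/gemini-2.5-pro",  # subtle/high-stakes domains
    },
    "parsing": {
        "default": "models/gemini-2.5-pro",
        "alternate": "models/gemini-2.5-flash",  # simple extraction
    },
    "embeddings": {"default": "models/gemini-embedding-001"},
    "image_understanding": {
        "default": "models/gemini-2.5-flash",
        "alternate": "models/gemini-2.5-flash-image",  # explicit image variant
    },
    "image_generation": {"default": "models/imagen-4.0-fast-generate-001"},
}


def choose_model(task: str, use_alternate: bool = False) -> str:
    """Return the default model for a task, or its alternate when asked."""
    entry = MODEL_TABLE[task]  # raises KeyError for unknown tasks
    if use_alternate:
        # Tasks without an alternate fall back to their default.
        return entry.get("alternate", entry["default"])
    return entry["default"]
```

Raising `KeyError` on an unknown task is deliberate here: a silent fallback would hide a missing entry when the table goes stale.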
## Output Mode (`-o`)
### Default: `-o json`
Use for:
- any workflow that will be parsed (`jq`, Python)
- extraction tasks (schemas, tables, lists)
- runs where we want a single stable artifact
### Use: `-o stream-json`
Use only when:
- generation is long and we want incremental progress
- we have a streaming consumer (don't assume `jq` can parse the whole stream)
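The two modes need different consumers. The sketch below assumes that `-o json` yields a single JSON document and that `-o stream-json` yields one JSON object per line; the second point is an assumption about the CLI and should be checked against real output. `parse_output` is a hypothetical helper name.

```python
import json


def parse_output(stdout: str, mode: str):
    """Parse captured stdout according to the -o mode that was used."""
    if mode == "json":
        # One stable artifact: the whole of stdout is a single JSON document.
        return json.loads(stdout)
    if mode == "stream-json":
        # Assumption: one JSON object per non-empty line (newline-delimited).
        return [json.loads(line) for line in stdout.splitlines() if line.strip()]
    raise ValueError(f"unsupported output mode: {mode}")
```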
## Prompt Construction
- Put the task first.
- Specify required output format *explicitly*.
- Include constraints (e.g., “Return valid JSON only, no prose”).
- Include context verbatim, clearly delimited.
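One way to apply these four rules consistently is a small prompt builder. This is only a sketch: `build_prompt` and the `<context>` delimiters are hypothetical choices, not something the CLI requires.

```python
def build_prompt(task: str, output_format: str, constraints: list[str], context: str) -> str:
    """Assemble a prompt: task first, then format, constraints, delimited context."""
    parts = [task.strip(), "", f"Output format: {output_format.strip()}"]
    if constraints:
        parts += ["", "Constraints:"] + [f"- {c}" for c in constraints]
    # The context is inserted verbatim between hypothetical delimiters.
    parts += ["", "<context>", context, "</context>"]
    return "\n".join(parts)
```

For example, `build_prompt("Extract the invoice total.", "JSON object with key total", ["Return valid JSON only, no prose"], text)` puts the task on the first line and leaves `text` unmodified inside the delimiters.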
### Shell escaping
For multi-line prompts or untrusted content, prefer a heredoc wrapper to avoid shell escaping issues:
```bash
gemini -m models/gemini-2.5-pro -o json -p "$(cat <<'PROMPT'
...prompt...
PROMPT
)"
```
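When the call is made from Python rather than a shell, the quoting problem can be sidestepped by passing the prompt as a single argument in a list, so no shell ever parses it. The sketch below uses only the `-m`, `-o`, and `-p` flags shown in the heredoc above; `build_command`, `run_gemini`, and the timeout value are hypothetical.

```python
import subprocess


def build_command(model: str, output_mode: str, prompt: str) -> list[str]:
    """Build the argv list; the prompt stays one argument, whatever it contains."""
    return ["gemini", "-m", model, "-o", output_mode, "-p", prompt]


def run_gemini(model: str, output_mode: str, prompt: str, timeout: int = 300):
    """Run the CLI without a shell and capture stdout/stderr as text."""
    return subprocess.run(
        build_command(model, output_mode, prompt),
        capture_output=True,  # keep stdout and stderr for the digest step
        text=True,
        timeout=timeout,      # hypothetical limit; tune per workflow
        check=False,          # inspect returncode instead of raising
    )
```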
## Execution Pattern (Flynn)
1. Choose model + `-o` mode.
2. Run `gemini ...` via shell.
3. Capture stdout/stderr.
4. Digest output:
   - extract key claims
   - check for missing fields / invalid JSON
   - look for hallucination risks (citations? file paths? commands?)
5. Verify locally when possible.
6. Produce final response / patch.
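The digest step (step 4) can be sketched as a function that reports what still needs local verification instead of trusting the output. The `digest` helper, the report keys, and the path regex are all hypothetical; the regex is a rough heuristic that will over-match (for example on URLs), which is acceptable because its hits are only candidates to check.

```python
import json
import re

# Rough heuristic for path-like strings such as /etc/app/config.yaml.
PATH_PATTERN = re.compile(r"(?:~|\.{1,2})?/[\w.\-]+(?:/[\w.\-]+)+")


def digest(stdout: str, required_fields: list[str]) -> dict:
    """Summarize what must be checked before Gemini output is used."""
    report = {"valid_json": False, "missing_fields": [], "paths_to_verify": [], "data": None}
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError:
        return report  # invalid JSON: nothing else can be trusted
    report["valid_json"] = True
    report["data"] = data
    if isinstance(data, dict):
        report["missing_fields"] = [f for f in required_fields if f not in data]
    else:
        report["missing_fields"] = list(required_fields)
    # File paths are a hallucination risk: list them for local verification.
    report["paths_to_verify"] = sorted(set(PATH_PATTERN.findall(stdout)))
    return report
```

Every entry in `paths_to_verify` would then be checked on disk (step 5) before it appears in a final answer.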
## How Flynn Reports Gemini Usage
Flynn should incorporate Gemini results *selectively*:
- Default: provide a **brief digest** of what Gemini contributed.
- Include **raw Gemini output** when:
  - debugging is needed (JSON parse errors, contradictions)
  - the operator asks for it
  - provenance/audit is important
Suggested response block when Gemini was used:
- `Gemini subagent:` model + output mode
- `Digest:` 3–6 bullet summary of what mattered
- `Raw:` omitted unless requested
## Common Recipes
### Query expansion (retrieval)
Model: `models/gemini-2.5-flash`
Ask for:
- 5–15 search queries
- key entities/synonyms
- include/exclude terms
### Parsing a document into JSON
Model: `models/gemini-2.5-pro`, `-o json`
Ask for:
- strict JSON schema
- explicit field types
- “unknown”/null handling
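Because Gemini output is untrusted, the parsed result is worth checking locally against the schema that was requested. A minimal sketch, assuming the schema is expressed as a mapping from field name to Python type and that `null` is the agreed marker for unknown values; `check_fields` is a hypothetical helper.

```python
def check_fields(record: dict, schema: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the record matches."""
    problems = []
    for name, expected in schema.items():
        if name not in record:
            problems.append(f"missing: {name}")
        elif record[name] is not None and not isinstance(record[name], expected):
            # None is accepted as the explicit "unknown" marker.
            problems.append(f"wrong type: {name}")
    return problems
```

A non-empty result is a reason to rerun or to fall back to local extraction, not something to patch by hand.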
### PDF workflows
Gemini is for interpretation/planning; execution happens locally with tools like:
- `qpdf`, `pdftk`, `pdfcpu`, `ocrmypdf`, `pikepdf`, `mutool`, `pdftotext` (poppler)
## Troubleshooting
- If the CLI errors:
  - capture stderr
  - retry with smaller prompt / less context
  - switch model (flash ↔ pro)
- If JSON is invalid:
  - rerun asking for **valid JSON only**
  - or request a JSON schema + separate data
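These retry rules can be combined into one bounded loop. The sketch below is one possible arrangement, not a prescribed policy: `run` is a hypothetical callable returning `(returncode, stdout, stderr)` so the loop can be exercised without the CLI, and trimming the prompt is left to the caller.

```python
import json

JSON_ONLY = "Return valid JSON only, no prose."


def run_with_fallback(run, prompt: str, models: list[str], max_attempts: int = 3):
    """Retry on CLI errors (switching model) and on invalid JSON (stricter prompt)."""
    errors = []
    model_index = 0
    for _ in range(max_attempts):
        model = models[model_index % len(models)]
        returncode, stdout, stderr = run(model, prompt)
        if returncode != 0:
            errors.append(f"{model}: {stderr.strip()}")  # capture stderr
            model_index += 1                             # switch model
            continue
        try:
            return json.loads(stdout), errors
        except json.JSONDecodeError as exc:
            errors.append(f"{model}: invalid JSON ({exc.msg})")
            if JSON_ONLY not in prompt:
                prompt = f"{prompt}\n\n{JSON_ONLY}"      # rerun with the constraint
    raise RuntimeError("; ".join(errors))
```

Keeping the collected errors alongside the result gives the operator an audit trail without including raw output by default.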
## Update Policy
This runbook should evolve.
When new model variants appear in `GET /v1beta/models`, update the model selection section.
When Flynn gains a first-class Gemini provider/router integration, align this runbook with the native provider behavior.