docs: add Gemini CLI subagent runbook
@@ -0,0 +1,137 @@
# Gemini CLI Subagent Runbook (Flynn)

This runbook defines how Flynn should use the local `gemini` CLI as a *subagent* (external model call) and how to digest/merge its output safely.

## Goals

- Use Gemini as a delegated helper for specific tasks (retrieval, parsing, drafting), while Flynn remains responsible for:
  - choosing the right model/output mode
  - validating results against local evidence when possible
  - producing the final answer/patch/plan
- Keep Gemini outputs auditable without spamming the operator.

## Safety + Trust Model

- Treat Gemini output as **untrusted**.
- Prefer **local verification** when feasible (grep, tests, PDF tooling output, etc.).
- Never claim system state changes or file edits based solely on Gemini output.
- If Gemini output contradicts local evidence, **local evidence wins**.

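One concrete form of local verification is checking that file paths mentioned by Gemini exist before repeating them. The sketch below is illustrative only; `unverified_paths` is a hypothetical helper, and how paths are extracted from the model's answer is left to the caller.

```python
from pathlib import Path


def unverified_paths(claimed_paths, root="."):
    """Return the claimed paths that do not exist under root.

    Gemini output is untrusted, so a path it mentions counts as real only
    after a local check. `claimed_paths` is whatever the caller extracted
    from the model's answer (hypothetical input for this sketch).
    """
    base = Path(root)
    return [p for p in claimed_paths if not (base / p).exists()]
```

Any path returned here should be dropped from the final answer or flagged as unconfirmed.
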
## Default Model Selection

Use the smallest/cheapest model that reliably accomplishes the task.

### Document search & retrieval (query expansion, relevance judging)
- Default: `models/gemini-2.5-flash`
- Upgrade to: `models/gemini-2.5-pro` for subtle/high-stakes domains

### Document parsing (structure → JSON, tables, policies)
- Default: `models/gemini-2.5-pro`
- Downgrade to: `models/gemini-2.5-flash` for simple extraction

### Embeddings (vector index)
- `models/gemini-embedding-001`

### Image understanding
- Default: `models/gemini-2.5-flash`
- If explicit image-variant required by the CLI/workflow: `models/gemini-2.5-flash-image`

### Image generation (lightweight)
- Default: `models/imagen-4.0-fast-generate-001`

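The defaults above can be kept in one lookup table so the choice is explicit and easy to update. This is a minimal sketch: the model names come from this section, while the task keys and the `pick_model` helper are hypothetical.

```python
# Defaults and alternates copied from the list above; the task keys
# ("retrieval", "parsing", ...) are hypothetical labels for this sketch.
MODEL_TABLE = {
    "retrieval": ("models/gemini-2.5-flash", "models/gemini-2.5-pro"),
    "parsing": ("models/gemini-2.5-pro", "models/gemini-2.5-flash"),
    "embedding": ("models/gemini-embedding-001", None),
    "image_understanding": ("models/gemini-2.5-flash", "models/gemini-2.5-flash-image"),
    "image_generation": ("models/imagen-4.0-fast-generate-001", None),
}


def pick_model(task, use_alternate=False):
    """Return the default model for a task, or its alternate when asked."""
    default, alternate = MODEL_TABLE[task]
    if use_alternate and alternate is not None:
        return alternate
    return default
```

For example, `pick_model("retrieval", use_alternate=True)` selects the pro model for a high-stakes retrieval task.
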
## Output Mode (`-o`)

### Default: `-o json`
Use for:
- any workflow that will be parsed (`jq`, Python)
- extraction tasks (schemas, tables, lists)
- runs where we want a single stable artifact

### Use: `-o stream-json`
Use only when:
- generation is long and we want incremental progress
- we have a streaming consumer (don’t assume `jq` can parse the whole stream)

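The two modes call for different consumers. The sketch below contrasts them in Python. It assumes that `-o stream-json` emits one complete JSON event per line, which should be confirmed against the installed CLI version before relying on it; both function names are hypothetical.

```python
import json


def parse_json_output(stdout):
    """Parse `-o json` output: the whole of stdout is one JSON document."""
    return json.loads(stdout)


def parse_stream_output(lines):
    """Parse `-o stream-json` output incrementally.

    Assumption: each non-empty line is one complete JSON event. Lines that
    fail to parse are collected instead of aborting the run.
    """
    events, bad_lines = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            bad_lines.append(line)
    return events, bad_lines
```

Collecting unparseable lines rather than raising keeps a long run usable even if one event is malformed.
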
## Prompt Construction

- Put the task first.
- Specify required output format *explicitly*.
- Include constraints (e.g., “Return valid JSON only, no prose”).
- Include context verbatim, clearly delimited.

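The four rules above can be folded into a small prompt builder. This is a sketch under stated assumptions: the section labels and the `<context>` delimiters are illustrative choices, and `build_prompt` is a hypothetical helper.

```python
def build_prompt(task, output_format, constraints, context):
    """Assemble a prompt: task first, then format, constraints, context.

    The section labels and the <context> delimiters are illustrative
    choices, not something the gemini CLI requires.
    """
    parts = [
        f"Task: {task}",
        f"Output format: {output_format}",
        "Constraints:",
        *[f"- {c}" for c in constraints],
        "<context>",
        context,  # included verbatim, no trimming or summarizing
        "</context>",
    ]
    return "\n".join(parts)
```

Keeping the context last and delimited makes it harder for text inside the document to be mistaken for instructions.
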
### Shell escaping
For multi-line prompts or untrusted content, prefer a heredoc wrapper to avoid shell escaping issues:

```bash
gemini -m models/gemini-2.5-pro -o json -p "$(cat <<'PROMPT'
...prompt...
PROMPT
)"
```

## Execution Pattern (Flynn)

1. Choose model + `-o` mode.
2. Run `gemini ...` via shell.
3. Capture stdout/stderr.
4. Digest output:
   - extract key claims
   - check for missing fields / invalid JSON
   - look for hallucination risks (citations? file paths? commands?)
5. Verify locally when possible.
6. Produce final response / patch.

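Steps 2 to 4 can be sketched as one function that runs the CLI, captures both streams, and attempts a JSON parse. The `-m`, `-o`, and `-p` flags are the ones used elsewhere in this runbook; `run_subagent`, `base_cmd`, and `timeout` are hypothetical names introduced for illustration.

```python
import json
import subprocess


def run_subagent(prompt, model, output_mode="json", base_cmd=("gemini",), timeout=300):
    """Run the CLI once and return (parsed_or_None, stdout, stderr).

    The -m, -o and -p flags are the ones used in this runbook. `base_cmd`
    and `timeout` are parameters of this sketch; passing an argv list
    (no shell) also avoids quoting problems with untrusted prompt text.
    """
    argv = [*base_cmd, "-m", model, "-o", output_mode, "-p", prompt]
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    parsed = None
    if proc.returncode == 0 and output_mode == "json":
        try:
            parsed = json.loads(proc.stdout)
        except json.JSONDecodeError:
            parsed = None  # invalid JSON: keep raw stdout for debugging
    return parsed, proc.stdout, proc.stderr
```

A `None` result with non-empty stdout signals invalid JSON, which is the case where the raw output is worth showing to the operator.
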
## How Flynn Reports Gemini Usage

Flynn should incorporate Gemini results *selectively*:

- Default: provide a **brief digest** of what Gemini contributed.
- Include **raw Gemini output** when:
  - debugging is needed (JSON parse errors, contradictions)
  - the operator asks for it
  - provenance/audit is important

Suggested response block when Gemini was used:

- `Gemini subagent:` model + output mode
- `Digest:` 3–6 bullet summary of what mattered
- `Raw:` omitted unless requested

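The suggested response block can be rendered by a small formatter. This is only a sketch of one possible layout; `format_report` is a hypothetical helper and the exact wording of each line is an assumption.

```python
def format_report(model, output_mode, digest, raw=None):
    """Render the response block; raw output is omitted unless supplied."""
    lines = [f"Gemini subagent: {model}, -o {output_mode}", "Digest:"]
    lines += [f"- {item}" for item in digest]
    lines.append("Raw: omitted unless requested" if raw is None else f"Raw:\n{raw}")
    return "\n".join(lines)
```

Passing `raw=` only in the debugging, on-request, and audit cases keeps the default report short.
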
## Common Recipes

### Query expansion (retrieval)
Model: `models/gemini-2.5-flash`

Ask for:
- 5–15 search queries
- key entities/synonyms
- include/exclude terms

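A query-expansion result can be checked before it is used for retrieval. The sketch below assumes a hypothetical response shape with `queries`, `entities`, `include`, and `exclude` keys; the runbook does not fix a schema, so adjust the keys to whatever the prompt actually requests.

```python
def check_query_expansion(result):
    """Return a list of problems found in a query-expansion result.

    Assumed (hypothetical) shape: {"queries": [...], "entities": [...],
    "include": [...], "exclude": [...]}, each a list of strings.
    """
    problems = []
    for key in ("queries", "entities", "include", "exclude"):
        value = result.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key}: expected a list of strings")
    queries = result.get("queries")
    if isinstance(queries, list) and not 5 <= len(queries) <= 15:
        problems.append("queries: expected 5 to 15 entries")
    return problems
```

An empty list means the result matches the requested shape; anything else is a reason to rerun or fall back.
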
### Parsing a document into JSON
Model: `models/gemini-2.5-pro`, `-o json`

Ask for:
- strict JSON schema
- explicit field types
- “unknown”/null handling

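Requesting a strict schema is only useful if the result is checked against it. The following sketch validates field presence and types while accepting `null` for unknown values. The `SCHEMA` fields are invented for illustration and `schema_errors` is a hypothetical helper.

```python
# Hypothetical schema for illustration: field name -> allowed Python type.
# None is allowed everywhere so the model can mark a value as unknown.
SCHEMA = {"title": str, "effective_date": str, "page_count": int}


def schema_errors(record, schema=SCHEMA):
    """List missing fields, unexpected fields, and type mismatches."""
    errors = [f"missing field: {name}" for name in schema if name not in record]
    errors += [f"unexpected field: {name}" for name in record if name not in schema]
    for name, expected in schema.items():
        value = record.get(name)
        if name in record and value is not None and not isinstance(value, expected):
            errors.append(f"{name}: expected {expected.__name__} or null")
    return errors
```

Treating `null` as valid but a missing key as an error distinguishes "the document does not say" from "the model skipped the field".
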
### PDF workflows
Gemini is for interpretation/planning; execution happens locally with tools like:
- `qpdf`, `pdftk`, `pdfcpu`, `ocrmypdf`, `pikepdf`, `mutool`, `pdftotext` (poppler)

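As an example of local execution, text can be extracted with `pdftotext` and only then handed to Gemini for interpretation. This sketch assumes poppler's `pdftotext` is installed; the helper names are hypothetical.

```python
import shutil
import subprocess


def pdftotext_argv(pdf_path, first_page=None, last_page=None):
    """Build a pdftotext command that writes extracted text to stdout."""
    argv = ["pdftotext", "-layout"]
    if first_page is not None:
        argv += ["-f", str(first_page)]
    if last_page is not None:
        argv += ["-l", str(last_page)]
    return argv + [pdf_path, "-"]  # "-" sends the text to stdout


def extract_text(pdf_path, **pages):
    """Run pdftotext locally; raises if poppler is not installed."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(pdftotext_argv(pdf_path, **pages),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

The extracted text is local evidence, so it can also serve to verify any quotation Gemini attributes to the document.
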
## Troubleshooting

- If the CLI errors:
  - capture stderr
  - retry with smaller prompt / less context
  - switch model (flash ↔ pro)
- If JSON is invalid:
  - rerun asking for **valid JSON only**
  - or request a JSON schema + separate data

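The recovery steps above can be expressed as an ordered list of attempts for a caller to work through until one succeeds. This is a sketch, not a prescribed order: `retry_plan` is hypothetical, `max_chars` is an arbitrary cutoff, and simple truncation stands in for real context trimming.

```python
FLASH = "models/gemini-2.5-flash"
PRO = "models/gemini-2.5-pro"
JSON_ONLY = "\n\nReturn valid JSON only, no prose."


def retry_plan(prompt, model, max_chars=4000):
    """Yield (model, prompt) attempts in the order the list above suggests.

    `max_chars` is an arbitrary cutoff for this sketch; truncating the end
    of the prompt keeps the task (which comes first) and drops context.
    """
    other = PRO if model == FLASH else FLASH
    yield model, prompt                            # original attempt
    yield model, prompt + JSON_ONLY                # restate the JSON-only constraint
    yield model, prompt[:max_chars] + JSON_ONLY    # smaller prompt / less context
    yield other, prompt + JSON_ONLY                # switch model (flash <-> pro)
```

The caller stops at the first attempt that returns valid output and records stderr from each failed one.
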
## Update Policy

This runbook should evolve.

When new model variants appear in `GET /v1beta/models`, update the model selection section.
When Flynn gains a first-class Gemini provider/router integration, align this runbook with the native provider behavior.

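Spotting new variants can be reduced to a set difference between the models listing and the names this runbook already covers. The sketch assumes the parsed response of `GET /v1beta/models` has the shape `{"models": [{"name": "models/..."}]}`; fetching and authentication are out of scope, and `unknown_models` is a hypothetical helper.

```python
def unknown_models(listing, known):
    """Return model names present in a models listing but not in `known`.

    Assumption: `listing` is the parsed JSON body of GET /v1beta/models,
    shaped like {"models": [{"name": "models/..."}, ...]}.
    """
    names = {m["name"] for m in listing.get("models", []) if "name" in m}
    return sorted(names - set(known))
```

A non-empty result is a prompt to review the model selection section, not an instruction to switch models automatically.
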