# Gemini CLI Subagent Runbook (Flynn)

This runbook defines how Flynn should use the local `gemini` CLI as a *subagent* (external model call) and how to digest/merge its output safely.

## Goals

- Use Gemini as a delegated helper for specific tasks (retrieval, parsing, drafting), while Flynn remains responsible for:
  - choosing the right model/output mode
  - validating results against local evidence when possible
  - producing the final answer/patch/plan
- Keep Gemini outputs auditable without spamming the operator.

## Safety + Trust Model

- Treat Gemini output as **untrusted**.
- Prefer **local verification** when feasible (grep, tests, PDF tooling output, etc.).
- Never claim system state changes or file edits based solely on Gemini output.
- If Gemini output contradicts local evidence, **local evidence wins**.

## Default Model Selection

Use the smallest/cheapest model that reliably accomplishes the task.

### Document search & retrieval (query expansion, relevance judging)
- Default: `models/gemini-2.5-flash`
- Upgrade to: `models/gemini-2.5-pro` for subtle/high-stakes domains

### Document parsing (structure → JSON, tables, policies)
- Default: `models/gemini-2.5-pro`
- Downgrade to: `models/gemini-2.5-flash` for simple extraction

### Embeddings (vector index)
- `models/gemini-embedding-001`

### Image understanding
- Default: `models/gemini-2.5-flash`
- If the CLI/workflow requires an explicit image variant: `models/gemini-2.5-flash-image`

### Image generation (lightweight)
- Default: `models/imagen-4.0-fast-generate-001`

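The defaults above can be captured in a small lookup so the choice is made in one place. This is a minimal sketch: the function name `pick_model`, the task labels, and the `high-stakes`/`simple` tier names are hypothetical, invented here for illustration; only the model IDs come from this runbook.

```bash
# Hypothetical helper: map a task category to this runbook's default model.
# Task and tier labels are illustrative names, not options of the gemini CLI.
pick_model() {
  local task="$1" tier="${2:-default}"
  case "$task" in
    retrieval)
      if [ "$tier" = "high-stakes" ]; then echo "models/gemini-2.5-pro"
      else echo "models/gemini-2.5-flash"; fi ;;
    parsing)
      if [ "$tier" = "simple" ]; then echo "models/gemini-2.5-flash"
      else echo "models/gemini-2.5-pro"; fi ;;
    embeddings)          echo "models/gemini-embedding-001" ;;
    image-understanding) echo "models/gemini-2.5-flash" ;;
    image-generation)    echo "models/imagen-4.0-fast-generate-001" ;;
    *) echo "unknown task: $task" >&2; return 1 ;;
  esac
}
```

For example, `pick_model parsing simple` prints the flash model, while `pick_model parsing` prints the pro default.
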
## Output Mode (`-o`)

### Default: `-o json`
Use for:
- any workflow that will be parsed (`jq`, Python)
- extraction tasks (schemas, tables, lists)
- runs where we want a single stable artifact

### Use: `-o stream-json`
Use only when:
- generation is long and we want incremental progress
- we have a streaming consumer (don’t assume `jq` can parse the whole stream)

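A streaming consumer can be as simple as a loop that handles one line at a time. The sketch below assumes that `-o stream-json` emits one complete JSON event per line; this runbook does not confirm that framing, so check it against the CLI's actual output first. The function name `consume_stream` is hypothetical.

```bash
# Hypothetical consumer: process a stream line by line instead of waiting
# for one complete document.
# ASSUMPTION: each non-empty line is one self-contained JSON event.
consume_stream() {
  local count=0 line
  # The "|| [ -n ... ]" clause keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue
    count=$((count + 1))
    printf 'event %d: %s\n' "$count" "$line"
  done
}
```

It would sit at the end of a pipeline, for example `gemini -m models/gemini-2.5-flash -o stream-json -p "..." | consume_stream`.
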
## Prompt Construction

- Put the task first.
- Specify required output format *explicitly*.
- Include constraints (e.g., “Return valid JSON only, no prose”).
- Include context verbatim, clearly delimited.

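The four points above can be folded into a small prompt builder. This is only a sketch: the function name `build_prompt`, the `TASK:`/`OUTPUT FORMAT:`/`CONSTRAINTS:` labels, and the `<<<CONTEXT ... CONTEXT>>>` delimiters are hypothetical choices, not something the `gemini` CLI requires.

```bash
# Hypothetical prompt builder: task first, explicit format, constraints,
# then the context verbatim between clearly visible delimiters.
build_prompt() {
  local task="$1" format="$2" context="$3"
  printf 'TASK: %s\n\n' "$task"
  printf 'OUTPUT FORMAT: %s\n' "$format"
  printf 'CONSTRAINTS: Return valid JSON only, no prose.\n\n'
  # %s inserts the context unchanged; printf does not re-interpret it.
  printf '<<<CONTEXT\n%s\nCONTEXT>>>\n' "$context"
}
```

The result can be passed straight to the CLI, for example `gemini -m models/gemini-2.5-pro -o json -p "$(build_prompt "$task" "$format" "$context")"`.
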
### Shell escaping
For multi-line prompts or untrusted content, prefer a heredoc wrapper to avoid shell escaping issues:

```bash
# The quoted delimiter ('PROMPT') keeps the shell from expanding $variables or backticks in the prompt.
gemini -m models/gemini-2.5-pro -o json -p "$(cat <<'PROMPT'
...prompt...
PROMPT
)"
```

## Execution Pattern (Flynn)

1. Choose model + `-o` mode.
2. Run `gemini ...` via shell.
3. Capture stdout/stderr.
4. Digest output:
   - extract key claims
   - check for missing fields / invalid JSON
   - look for hallucination risks (citations? file paths? commands?)
5. Verify locally when possible.
6. Produce final response / patch.

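Steps 2 and 3, plus the first sanity check of step 4, might look like the wrapper below. It is a sketch under assumptions: the function name `run_gemini`, the artifact file names, and the exit code `3` for empty output are hypothetical; the `-m`, `-o`, and `-p` flags are used exactly as elsewhere in this runbook.

```bash
# Hypothetical wrapper: run the CLI, keep stdout and stderr in separate files,
# and report where the artifacts were written.
run_gemini() {
  local model="$1" mode="$2" prompt="$3" outdir rc
  outdir="$(mktemp -d)" || return 1
  gemini -m "$model" -o "$mode" -p "$prompt" \
    >"$outdir/stdout.txt" 2>"$outdir/stderr.txt"
  rc=$?
  printf '%s\n' "$outdir"                  # caller reads the artifacts from here
  [ "$rc" -eq 0 ] || return "$rc"          # pass the CLI's failure status through
  [ -s "$outdir/stdout.txt" ] || return 3  # hypothetical code: ran but printed nothing
}
```

A caller could use it as `dir="$(run_gemini models/gemini-2.5-flash json "$prompt")" || echo "failed; see $dir/stderr.txt"`, then digest and verify `$dir/stdout.txt` before relying on any claim in it.
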
## How Flynn Reports Gemini Usage

Flynn should incorporate Gemini results *selectively*:

- Default: provide a **brief digest** of what Gemini contributed.
- Include **raw Gemini output** when:
  - debugging is needed (JSON parse errors, contradictions)
  - the operator asks for it
  - provenance/audit is important

Suggested response block when Gemini was used:

- `Gemini subagent:` model + output mode
- `Digest:` 3–6 bullet summary of what mattered
- `Raw:` omitted unless requested

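The suggested response block could be rendered by a helper along these lines. The function name `report_gemini_usage` and the exact text layout are hypothetical; only the three labels come from the list above.

```bash
# Hypothetical formatter for the response block: model and output mode,
# a short digest, and a note that raw output is omitted by default.
report_gemini_usage() {
  local model="$1" mode="$2" item
  shift 2                                   # remaining arguments are digest bullets
  printf 'Gemini subagent: %s, -o %s\n' "$model" "$mode"
  printf 'Digest:\n'
  for item in "$@"; do
    printf '  - %s\n' "$item"
  done
  printf 'Raw: omitted unless requested\n'
}
```

For example, `report_gemini_usage models/gemini-2.5-flash json "Proposed 8 search queries" "No file paths to verify"` prints a four-part block with two digest bullets.
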
## Common Recipes

### Query expansion (retrieval)
Model: `models/gemini-2.5-flash`

Ask for:
- 5–15 search queries
- key entities/synonyms
- include/exclude terms

### Parsing a document into JSON
Model: `models/gemini-2.5-pro`, `-o json`

Ask for:
- strict JSON schema
- explicit field types
- “unknown”/null handling

### PDF workflows
Gemini is for interpretation/planning; execution happens locally with tools like:
- `qpdf`, `pdftk`, `pdfcpu`, `ocrmypdf`, `pikepdf`, `mutool`, `pdftotext` (poppler)

## Troubleshooting

- If the CLI errors:
  - capture stderr
  - retry with smaller prompt / less context
  - switch model (flash ↔ pro)
- If JSON is invalid:
  - rerun asking for **valid JSON only**
  - or request a JSON schema + separate data

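The "capture stderr" and "switch model" steps can be combined into a small fallback loop. This is a sketch only: the function name `try_models` and the `GEMINI_ERR_LOG` variable are hypothetical, and the loop covers CLI failures (non-zero exit), not invalid JSON in an otherwise successful run.

```bash
# Hypothetical fallback: try each model in order until one run succeeds,
# appending stderr from every attempt to a log file.
try_models() {
  local prompt="$1" model
  local errlog="${GEMINI_ERR_LOG:-gemini-stderr.log}"  # hypothetical variable name
  shift                                                # remaining arguments are model IDs
  for model in "$@"; do
    if gemini -m "$model" -o json -p "$prompt" 2>>"$errlog"; then
      return 0
    fi
    echo "model $model failed; trying next" >&2
  done
  return 1
}
```

A call such as `try_models "$prompt" models/gemini-2.5-flash models/gemini-2.5-pro` starts with the cheaper model and only escalates when the first run fails.
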
## Update Policy

This runbook should evolve.

When new model variants appear in `GET /v1beta/models`, update the model selection section.
When Flynn gains a first-class Gemini provider/router integration, align this runbook with the native provider behavior.