docs: add Codex CLI subagent runbook
@@ -0,0 +1,102 @@
# Codex CLI as a Flynn subagent (runbook)

This runbook documents how Flynn uses the **Codex CLI** as an external “subagent” for certain tasks.

It is intentionally pragmatic and should be updated as we learn.

## Goals

- Use Codex CLI reliably in **non-interactive** contexts (Flynn backends, scripts).
- Prefer predictable, easy-to-ingest output.
- Treat subagent output as *advice* that Flynn digests and verifies when possible.

## Key constraints

### Interactive mode requires a TTY
Running `codex` with no subcommand starts the interactive TUI, which fails in non-interactive usage (e.g., from a backend runner) with errors like:

- `Error: stdin is not a terminal`

**Therefore, Flynn must use `codex exec`**.

Reference: https://developers.openai.com/codex/noninteractive/

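
A backend can confirm that it is in the non-interactive case before deciding how to call Codex. The sketch below relies only on the POSIX `[ -t 0 ]` check; the function name `flynn_codex_mode` is hypothetical and is not part of Flynn or Codex.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which Codex entry point is usable right now.
# `[ -t 0 ]` succeeds only when stdin (fd 0) is attached to a terminal.
flynn_codex_mode() {
  if [ -t 0 ]; then
    echo "interactive"   # a TTY is present, so the bare `codex` TUI could start
  else
    echo "exec"          # no TTY: only `codex exec` is usable
  fi
}
```

Called from a backend runner, or as `flynn_codex_mode </dev/null`, this prints `exec`.
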
## Default invocation

### Plain-text (default)
Flynn’s default is **plain-text stdout** (no JSON streaming) because it is the most compatible with external-backend execution and easiest to digest.

Pattern:

```bash
codex exec --ephemeral -m gpt-5.3-codex "<PROMPT>"
```

Behavioral expectations:
- **stdout**: final assistant message (what Flynn consumes)
- **stderr**: logs/progress (ignored unless debugging)

### Why `--ephemeral`
Use `--ephemeral` for backend/subagent usage to avoid persisting sessions to disk.

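
The stdout/stderr split described above could be wrapped roughly as follows. This is a minimal sketch, not Flynn's actual backend code: the function name `flynn_codex_run`, the `FLYNN_CODEX_LOG` variable, and the default log path are all hypothetical. Redirecting stdin from `/dev/null` is an added precaution so the child process never waits on inherited input.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run one non-interactive Codex task.
# stdout (the final assistant message) passes through to the caller;
# stderr (logs/progress) is appended to a log file for later debugging.
flynn_codex_run() {
  local prompt="$1"
  local model="${2:-gpt-5.3-codex}"                     # runbook default model
  local log="${FLYNN_CODEX_LOG:-/tmp/flynn-codex.log}"  # assumed log location
  codex exec --ephemeral -m "$model" "$prompt" 2>>"$log" </dev/null
}
```

A caller would then capture only the final message with `answer="$(flynn_codex_run "<PROMPT>")"` and open the log file only when debugging.
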
## When to use `--json`

Codex supports `--json` to emit an event stream (JSONL). Flynn opts into this only when it is clearly beneficial, e.g.:

- debugging / auditing where event-level structure matters
- programmatic extraction where the plain-text output is ambiguous

Pattern:

```bash
codex exec --ephemeral --json -m gpt-5.3-codex "<PROMPT>"
```

Notes:
- `--json` changes the stdout format to **JSONL events**, so it is not a drop-in replacement for plain text.
- If/when Flynn grows a dedicated parser for Codex JSONL, we can consider making `--json` the default.

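
For the auditing case, one low-risk option is to store the raw event stream without interpreting it. The sketch below assumes nothing about the event schema, only that the stream has one JSON object per line; the function name `flynn_codex_audit` and its arguments are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: keep the raw JSONL event stream for auditing.
# Nothing is parsed here, so no event schema is assumed.
flynn_codex_audit() {
  local prompt="$1"
  local events="$2"   # path where the JSONL stream is stored
  codex exec --ephemeral --json -m gpt-5.3-codex "$prompt" >"$events" </dev/null || return 1
  # With one JSON object per line, the line count is the event count.
  echo "events: $(wc -l <"$events" | tr -d ' ')"
}
```

Any field-level extraction from the stored file should wait until the event schema has been checked against the Codex documentation.
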
## Model selection

Default model:
- `gpt-5.3-codex`

Policy:
- Keep a single default unless a task clearly benefits from a different one.
- If a run fails with model availability errors, verify with a small `codex exec -m <MODEL> "test"` smoke test, then update this runbook and the config.

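
The smoke test from the policy above might be scripted like this. It is a sketch under one explicit assumption: that `codex exec` exits non-zero when the requested model cannot be used. The function name `flynn_codex_smoke` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical smoke test for model availability.
# Assumption: `codex exec` exits non-zero when the model cannot be used.
flynn_codex_smoke() {
  local model="${1:-gpt-5.3-codex}"
  if codex exec --ephemeral -m "$model" "test" >/dev/null 2>&1 </dev/null; then
    echo "ok: $model"
  else
    echo "unavailable: $model"
    return 1
  fi
}
```

Because the output is discarded, a failure here only says that something went wrong; rerun the plain command by hand to read the actual error before changing the default.
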
## Prompting guidance (subagent hygiene)

- Put the **task** first.
- Specify the expected **output format** when necessary.
- Provide only relevant context; avoid leaking secrets.
- If Codex is being used to draft code changes, ask it for:
  - exact file paths
  - minimal diffs/patches
  - assumptions and risks

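
The ordering in the list above (task first, then output format, then context) can be enforced by a small prompt builder. This is only an illustration; the function name `flynn_codex_prompt` and the `Task:` / `Output format:` / `Context:` labels are hypothetical choices, not a format Codex requires.

```bash
#!/usr/bin/env bash
# Hypothetical prompt builder: task first, then output format, then context.
flynn_codex_prompt() {
  local task="$1" format="$2" context="$3"
  printf 'Task: %s\n' "$task"
  # Optional sections are emitted only when a value was provided.
  [ -n "$format" ] && printf '\nOutput format: %s\n' "$format"
  [ -n "$context" ] && printf '\nContext:\n%s\n' "$context"
  return 0
}
```

The result can be passed straight to the default invocation, for example `codex exec --ephemeral -m gpt-5.3-codex "$(flynn_codex_prompt "<TASK>" "<FORMAT>" "<CONTEXT>")"`. The builder does not scrub secrets, so the caller still has to filter the context.
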
## How Flynn digests Codex output

When Flynn uses Codex:

1. **Digest**: summarize the useful pieces and discard irrelevant content.
2. **Verify** where possible (local grep/tests/lint) before claiming correctness.
3. **Integrate** into the final response as:
   - a patch/commit
   - a concise plan
   - extracted structured data

Raw Codex output is shown only when:
- Will asks for it
- there’s ambiguity or a failure that requires inspection

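
The **Verify** step could be expressed as a gate that runs caller-supplied checks and fails if any of them fails. This is a sketch rather than Flynn's real mechanism; the function name `flynn_verify` is hypothetical, and the checks are whatever grep/test/lint commands fit the task.

```bash
#!/usr/bin/env bash
# Hypothetical verification gate: run each check command passed as an argument
# and return non-zero if any of them fails.
flynn_verify() {
  local failed=0 check
  for check in "$@"; do
    if sh -c "$check" >/dev/null 2>&1; then
      echo "PASS: $check"
    else
      echo "FAIL: $check"
      failed=1
    fi
  done
  return "$failed"
}
```

For a drafted patch, a call might look like `flynn_verify "git apply --check codex.patch" "make test"`, where `codex.patch` and the `make test` target are placeholder names for illustration.
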
## Troubleshooting

- `stdin is not a terminal`
  - You invoked interactive `codex` instead of `codex exec`.

- Output is noisy / includes progress
  - Ensure you’re reading stdout only; progress typically goes to stderr.

- Need deterministic structured output
  - Use `--json` and add a parser (or request an explicit format in the prompt).