# Codex CLI as a Flynn subagent (runbook)

This runbook documents how Flynn uses the **Codex CLI** as an external “subagent” for certain tasks. It is intentionally pragmatic and should be updated as we learn.

## Goals

- Use Codex CLI reliably in **non-interactive** contexts (Flynn backends, scripts).
- Prefer predictable, easy-to-ingest output.
- Treat subagent output as *advice* that Flynn digests and verifies when possible.

## Key constraints

### Interactive mode requires a TTY

Running `codex` (no subcommand) is the interactive TUI and fails in non-interactive usage (e.g., from a backend runner) with errors like:

- `Error: stdin is not a terminal`

**Therefore, Flynn must use `codex exec`**.

Reference: https://developers.openai.com/codex/noninteractive/

## Default invocation

### Plain-text (default)

Flynn’s default is **plain-text stdout** (no JSON streaming) because it is the most compatible with external-backend execution and easiest to digest.

Pattern:

```bash
codex exec --ephemeral -m gpt-5.3-codex ""
```

Behavioral expectations:

- **stdout**: final assistant message (what Flynn consumes)
- **stderr**: logs/progress (ignored unless debugging)

### Why `--ephemeral`

Use `--ephemeral` for backend/subagent usage to avoid persisting sessions to disk.

## When to use `--json`

Codex supports `--json` to emit an event stream (JSONL). Flynn will only opt into this when it is clearly beneficial, e.g.:

- debugging / auditing where event-level structure matters
- programmatic extraction where the plain-text output is ambiguous

Pattern:

```bash
codex exec --ephemeral --json -m gpt-5.3-codex ""
```

Notes:

- `--json` changes stdout format to **JSONL events**, which is not as drop-in as plain text.
- If/when Flynn grows a dedicated parser for Codex JSONL, we can consider making `--json` the default.

## Model selection

Default model:

- `gpt-5.3-codex`

Policy:

- Keep a single default unless a task clearly benefits from a different one.
- If a run fails with model availability errors, verify via a small `codex exec -m "test"` smoke test and update this runbook + config.

## Prompting guidance (subagent hygiene)

- Put the **task** first.
- Specify the expected **output format** when necessary.
- Provide only relevant context; avoid leaking secrets.
- If Codex is being used to draft code changes, ask it for:
  - exact file paths
  - minimal diffs/patches
  - assumptions and risks

## How Flynn digests Codex output

When Flynn uses Codex:

1. **Digest**: summarize the useful pieces and discard irrelevant content.
2. **Verify** where possible (local grep/tests/lint) before claiming correctness.
3. **Integrate** into the final response as:
   - a patch/commit
   - a concise plan
   - extracted structured data

Raw Codex output is shown only when:

- Will asks for it
- there’s ambiguity or a failure that requires inspection

## Troubleshooting

- `stdin is not a terminal`
  - You invoked interactive `codex` instead of `codex exec`.
- Output is noisy / includes progress
  - Ensure you’re reading stdout only; progress typically goes to stderr.
- Need deterministic structured output
  - Use `--json` and add a parser (or request an explicit format in the prompt).
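
The stdout/stderr split that this runbook relies on can be sketched as a small wrapper. This is a minimal sketch, not Flynn's actual runner: the function name `run_codex` and the `CODEX_BIN` and `CODEX_MODEL` environment overrides are hypothetical names introduced for illustration. The `exec`, `--ephemeral`, and `-m` arguments and the default model are the ones documented in this runbook.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run_codex, CODEX_BIN and CODEX_MODEL are illustrative
# names, not part of Flynn or the Codex CLI.

run_codex() {
  local prompt="$1"
  # Fall back to the runbook's default model unless CODEX_MODEL is set.
  local model="${CODEX_MODEL:-gpt-5.3-codex}"
  local errlog
  errlog="$(mktemp)" || return 1

  # stdout passes through untouched (the final assistant message).
  # stderr (logs/progress) is captured and only surfaced on failure.
  local status=0
  "${CODEX_BIN:-codex}" exec --ephemeral -m "$model" "$prompt" \
    2>"$errlog" || status=$?

  if [ "$status" -ne 0 ]; then
    echo "codex exec failed (exit $status):" >&2
    cat "$errlog" >&2
  fi
  rm -f "$errlog"
  return "$status"
}
```

A caller would then capture only the message, for example `answer="$(run_codex "Summarize the failing test")"`, and check the exit status before digesting the result. Setting `CODEX_MODEL` for a single call is one way to run the model smoke test described under Model selection without editing the default.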