# HANDOFF.md ## Purpose Immediate baton-pass for the next fresh implementation session. ## Current objective Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The current target is to verify the newly landed upstream fix for subagent error/outcome handling and then continue on any remaining real runtime failures. ## Use these state files first 1. `WIP.subagent-reliability.md` — canonical state for this pass 2. `memory/tasks.json` — task tracking for reliability items 3. `memory/2026-03-04-subagent-delegation.md` — earlier delegation context 4. `memory/2026-03-13.md` if present, otherwise append today’s evidence there 5. `external/openclaw-upstream/` — for any core-runtime fix work ## Related tasks - `task-20260304-2215-subagent-reliability` — in progress - `task-20260304-211216-acp-claude-codex` — open ## Known truths - TUI noise suppression was already patched locally and upstreamed earlier. - User still wants actual subagent reliability improved, not just UI noise hidden. - Prior ACP failures included Claude/Codex runtime exits. - Fresh-session implementation discipline is now the expected approach for non-trivial work. - One explicit failure mode is already understood: requesting `glm-5` can route into an unavailable GLM-5 provider/entitlement path in this setup. - A deeper bug was also identified: a subagent run could finish with terminal assistant errors yet still be recorded as successful with no frozen result. - An upstream patch for that error/outcome handling now exists in `external/openclaw-upstream` on branch `fix/subagent-wait-error-outcome` with targeted tests passing. ## Highest-priority next actions 1. The success side is now verified on a real fresh `gpt-5.4` subagent run. 2. Find and execute the smallest safe controlled-failure repro on a valid model/runtime (`gpt-5.4` preferred) so we can confirm: - a failing child run is stored as `error` rather than `ok` - a successful child run stores a useful frozen result / announcement payload 3. Re-check whether ACP-specific Claude/Codex runtime failures are still reproducible after separating them from the generic subagent reporting bug. 4. If another core bug appears, continue in `external/openclaw-upstream/` on a focused branch with targeted validation. 5. Update WIP + memory + tasks before ending. ## Success criteria - Real-run verification of the new error/outcome fix. - Clear separation between resolved reporting bug(s) and any still-open ACP/runtime failures. - State files updated with paths, commands, and outcomes.