HANDOFF.md
Purpose
Immediate baton-pass for the next fresh implementation session.
Current objective
Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The current target is to verify the newly landed upstream fix for subagent error/outcome handling and then continue on any remaining real runtime failures.
Use these state files first
- `WIP.subagent-reliability.md`: canonical state for this pass
- `memory/tasks.json`: task tracking for reliability items
- `memory/2026-03-04-subagent-delegation.md`: earlier delegation context
- `memory/2026-03-13.md` if present; otherwise append today’s evidence there
- `external/openclaw-upstream/`: for any core-runtime fix work
Related tasks
- `task-20260304-2215-subagent-reliability`: in progress
- `task-20260304-211216-acp-claude-codex`: open
Known truths
- TUI noise suppression was already patched locally and upstreamed earlier.
- User still wants actual subagent reliability improved, not just UI noise hidden.
- Prior ACP failures included Claude/Codex runtime exits.
- Fresh-session implementation discipline is now the expected approach for non-trivial work.
- One explicit failure mode is already understood: requesting `glm-5` can route into an unavailable GLM-5 provider/entitlement path in this setup.
- A deeper bug was also identified: a subagent run could finish with terminal assistant errors yet still be recorded as successful with no frozen result.
- An upstream patch for that error/outcome handling now exists in `external/openclaw-upstream` on branch `fix/subagent-wait-error-outcome` with targeted tests passing.
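
To make the reporting bug concrete, here is a minimal Python sketch of the intended behaviour: a run whose terminal assistant message is an error must not be recorded as `ok`, and an `ok` outcome should always carry a frozen result. The `Message` and `Outcome` types, their field names, and `classify_run` are hypothetical illustrations, not the upstream runtime's actual API. Treating an empty final message as an error is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Message:
    """Hypothetical transcript entry; field names are assumptions."""
    role: str             # e.g. "assistant" or "tool"
    text: str = ""
    is_error: bool = False


@dataclass
class Outcome:
    status: str                    # "ok" or "error"
    frozen_result: Optional[str]   # final text kept for the parent run


def classify_run(messages: Sequence[Message]) -> Outcome:
    """Derive a child run's outcome from its last assistant message."""
    last = next((m for m in reversed(messages) if m.role == "assistant"), None)
    # No assistant output, or a terminal assistant error: never report success.
    if last is None or last.is_error:
        return Outcome("error", None)
    # Assumption: success without any final text is also treated as a failure,
    # so an "ok" outcome always comes with a frozen result.
    if not last.text.strip():
        return Outcome("error", None)
    return Outcome("ok", last.text)
```

The buggy behaviour described above corresponds to returning `ok` with `frozen_result=None` for a transcript that ends in an assistant error.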
Highest-priority next actions
- The success side is now verified on a real fresh `gpt-5.4` subagent run.
- Find and execute the smallest safe controlled-failure repro on a valid model/runtime (`gpt-5.4` preferred) so we can confirm:
  - a failing child run is stored as `error` rather than `ok`
  - a successful child run stores a useful frozen result / announcement payload
- Re-check whether ACP-specific Claude/Codex runtime failures are still reproducible after separating them from the generic subagent reporting bug.
- If another core bug appears, continue in `external/openclaw-upstream/` on a focused branch with targeted validation.
- Update WIP + memory + tasks before ending.
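
The two confirmation points for the controlled-failure repro could be checked against a stored run record with something like the sketch below. It assumes the record is available as a plain dictionary; the keys `status` and `frozenResult` and the function `check_child_run` are hypothetical and need to be mapped onto whatever the runtime actually persists.

```python
from typing import Any, Dict, List


def check_child_run(record: Dict[str, Any], expect_failure: bool) -> List[str]:
    """Return a list of problems; an empty list means the record looks right."""
    problems: List[str] = []
    status = record.get("status")  # hypothetical key
    if expect_failure:
        # A failing child run must be stored as "error", not "ok".
        if status != "error":
            problems.append(f"expected status 'error', got {status!r}")
    else:
        if status != "ok":
            problems.append(f"expected status 'ok', got {status!r}")
        # A successful child run should carry a non-empty frozen result.
        frozen = record.get("frozenResult") or ""  # hypothetical key
        if not str(frozen).strip():
            problems.append("successful run has no frozen result")
    return problems
```

Running it once on the record from the failing repro and once on the record from the already-verified success run gives a short list of discrepancies to paste into the WIP file as evidence.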
Success criteria
- Real-run verification of the new error/outcome fix.
- Clear separation between resolved reporting bug(s) and any still-open ACP/runtime failures.
- State files updated with paths, commands, and outcomes.
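
For the last criterion, a small helper can keep evidence entries uniform. This is only a sketch: the dated file name follows the `memory/2026-03-13.md` pattern listed above, but the entry layout and the `append_evidence` function are assumptions, not an existing project convention.

```python
from datetime import date
from pathlib import Path
from typing import Optional, Sequence


def append_evidence(memory_dir: Path, command: str, outcome: str,
                    paths: Sequence[str] = (),
                    day: Optional[date] = None) -> Path:
    """Append one evidence entry to memory/<YYYY-MM-DD>.md and return its path."""
    day = day or date.today()
    memory_dir.mkdir(parents=True, exist_ok=True)
    target = memory_dir / f"{day.isoformat()}.md"
    # Assumed entry layout: one bullet per command with indented details.
    lines = [f"- command: {command}", f"  outcome: {outcome}"]
    if paths:
        lines.append("  paths: " + ", ".join(paths))
    # Append mode creates the file if it is missing and never overwrites notes.
    with target.open("a", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")
    return target
```

Opening the file in append mode covers both cases from the state-file list: the day's file is extended if present and created otherwise.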