3.9 KiB
3.9 KiB
HANDOFF.md
Purpose
Immediate baton-pass for the next fresh implementation session.
Current objective
Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The failure-path proof for the new subagent outcome handling is captured, and a focused upstream agent.wait semantics fix is now implemented/tested on branch fix/subagent-wait-error-outcome; the remaining follow-up is deployment/live verification, not root-cause discovery.
Use these state files first
WIP.subagent-reliability.md— canonical state for this passmemory/tasks.json— task tracking for reliability itemsmemory/2026-03-04-subagent-delegation.md— earlier delegation contextmemory/2026-03-13.mdif present, otherwise append today’s evidence thereexternal/openclaw-upstream/— for any core-runtime fix work
Related tasks
task-20260304-2215-subagent-reliability— in progresstask-20260304-211216-acp-claude-codex— open
Known truths
- TUI noise suppression was already patched locally and upstreamed earlier.
- User still wants actual subagent reliability improved, not just UI noise hidden.
- Prior ACP failures included Claude/Codex runtime exits.
- Fresh-session implementation discipline is now the expected approach for non-trivial work.
- One explicit failure mode is already understood: requesting
glm-5can route into an unavailable GLM-5 provider/entitlement path in this setup. - A deeper bug was also identified: a subagent run could finish with terminal assistant errors yet still be recorded as successful with no frozen result.
- An upstream patch for that error/outcome handling now exists in
external/openclaw-upstreamon branchfix/subagent-wait-error-outcomewith targeted tests passing.
Highest-priority next actions
- Treat the live
gpt-5.4failure repro as proven for subagent persistence/announcement handling:- run id
b50cb91f-6219-44f7-9d2f-a1264ac7ceaf - child transcript
~/.openclaw/agents/main/sessions/f114b831-000b-4070-a539-85c68d2b7057.jsonl runs.jsonnow storesoutcome.status: "error",endedReason: "subagent-error", and a non-nullfrozenResultText
- run id
- For raw gateway
agent.wait, use the new upstream diagnosis/fix rather than re-arguing semantics:- decision: the previous
status:"ok"was a bug, not intended layering - cause:
src/commands/agent.tsfallback lifecycle emission usedphase:"end"even when resolved runmeta.stopReason:"error" - fix:
src/commands/agent.tsnow emits lifecyclephase:"error"with extracted terminal error text in that case;src/gateway/server-methods/agent-wait-dedupe.tsalso maps resolved agent payloads with terminalstopReason:"error"/aborted:truetoerror/timeout - targeted validation passed:
pnpm -C external/openclaw-upstream test -- --run src/commands/agent.test.ts src/gateway/server-methods/agent-wait-dedupe.test.ts src/gateway/server-methods/server-methods.test.ts
- decision: the previous
- If continuing, do a low-noise live verification on the patched gateway/runtime for the same failure class, then report whether raw
agent.waitnow returnsstatus:"error"as expected. - Re-check whether ACP-specific Claude/Codex runtime failures are still reproducible after separating them from the generic subagent outcome bug.
- Leave the dirty
/subagents logUX diff out of this branch unless you intentionally spin a separate focused pass; it regression-passedsrc/auto-reply/reply/commands.test.tsbut still lacks dedicated feature coverage.
Success criteria
- Real-run verification of the new error/outcome fix. ✅ done for subagent persistence/announcement handling.
- Clear separation between resolved reporting bug(s) and any still-open ACP/runtime failures.
- Explicit decision on whether raw
agent.waitbehavior is acceptable or requires a follow-up fix. - State files updated with paths, commands, and outcomes.