HANDOFF.md

Purpose

Immediate baton-pass for the next fresh implementation session.

Current objective

Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The failure-path proof for the new subagent outcome handling is captured, and a focused upstream agent.wait semantics fix is now implemented/tested on branch fix/subagent-wait-error-outcome; the remaining follow-up is deployment/live verification, not root-cause discovery.

Use these state files first

WIP.subagent-reliability.md — canonical state for this pass
memory/tasks.json — task tracking for reliability items
memory/2026-03-04-subagent-delegation.md — earlier delegation context
memory/2026-03-13.md if present, otherwise append today’s evidence there
external/openclaw-upstream/ — for any core-runtime fix work

task-20260304-2215-subagent-reliability — in progress
task-20260304-211216-acp-claude-codex — open

Known truths

TUI noise suppression was already patched locally and upstreamed earlier.
User still wants actual subagent reliability improved, not just UI noise hidden.
Prior ACP failures included Claude/Codex runtime exits.
Fresh-session implementation discipline is now the expected approach for non-trivial work.
One explicit failure mode is already understood: requesting glm-5 can route into an unavailable GLM-5 provider/entitlement path in this setup.
A deeper bug was also identified: a subagent run could finish with terminal assistant errors yet still be recorded as successful with no frozen result.
An upstream patch for that error/outcome handling now exists in external/openclaw-upstream on branch fix/subagent-wait-error-outcome with targeted tests passing.

Highest-priority next actions

Treat the live gpt-5.4 failure repro as proven for subagent persistence/announcement handling:
- run id b50cb91f-6219-44f7-9d2f-a1264ac7ceaf
- child transcript ~/.openclaw/agents/main/sessions/f114b831-000b-4070-a539-85c68d2b7057.jsonl
- runs.json now stores outcome.status: "error", endedReason: "subagent-error", and a non-null frozenResultText
For raw gateway agent.wait, use the new upstream diagnosis/fix rather than re-arguing semantics:
- decision: the previous status:"ok" was a bug, not intended layering
- cause: src/commands/agent.ts fallback lifecycle emission used phase:"end" even when resolved run meta.stopReason:"error"
- fix: src/commands/agent.ts now emits lifecycle phase:"error" with extracted terminal error text in that case; src/gateway/server-methods/agent-wait-dedupe.ts also maps resolved agent payloads with terminal stopReason:"error" / aborted:true to error / timeout
- targeted validation passed: pnpm -C external/openclaw-upstream test -- --run src/commands/agent.test.ts src/gateway/server-methods/agent-wait-dedupe.test.ts src/gateway/server-methods/server-methods.test.ts
If continuing, do a low-noise live verification on the patched gateway/runtime for the same failure class, then report whether raw agent.wait now returns status:"error" as expected.
Re-check whether ACP-specific Claude/Codex runtime failures are still reproducible after separating them from the generic subagent outcome bug.
Leave the dirty /subagents log UX diff out of this branch unless you intentionally spin a separate focused pass; it regression-passed src/auto-reply/reply/commands.test.ts but still lacks dedicated feature coverage.

Success criteria

Real-run verification of the new error/outcome fix. ✅ done for subagent persistence/announcement handling.
Clear separation between resolved reporting bug(s) and any still-open ACP/runtime failures.
Explicit decision on whether raw agent.wait behavior is acceptable or requires a follow-up fix.
State files updated with paths, commands, and outcomes.

3.9 KiB Raw Blame History Unescape Escape