docs(wip): tighten subagent reliability baton

2026-03-13 19:34:26 +00:00
parent 8998e7535e
commit 3cfa7a158c
1 changed files with 22 additions and 16 deletions
@@ -1,15 +1,30 @@
 # WIP.subagent-reliability.md
 ## Status
-Status: `open`
+Status: `follow-up`
 Owner: `zap`
 Opened: `2026-03-13`
 Last updated: `2026-03-13`
 ## Purpose
 Investigate and improve subagent / ACP delegation reliability, including timeout behavior, runtime failures, and delayed/duplicate completion-event noise.
-## Why now
+## Current state
-This is the highest-leverage remaining open reliability item because it affects trust in delegation and the usability of fresh implementation runs.
+- The core reliability thread tracked in this WIP is now **fixed and live-verified** on `external/openclaw-upstream` branch `fix/subagent-wait-error-outcome`.
 - Verified fixed:
  - subagent persistence / announcement handling for terminal assistant-provider failures
  - raw `agent.wait` semantics for the live direct gateway path
 - Key upstream commits on this branch:
  - `2a2ed0d6f` — `fix(subagents): derive outcome from terminal assistant errors`
  - `5a328d22b` — `fix(agent): surface terminal run errors in wait semantics`
  - `f9a78e8f7` — `fix(gateway): honor terminal assistant errors in live wait path`
 ## Why this file is still open
 - The broader delegation reliability task is not fully done yet.
 - Remaining follow-up work is now narrower:
  1. ACP-specific Claude/Codex runtime failures
  2. optional separate `/subagents log` UX cleanup
  3. push/PR the focused upstream reliability branch when desired
 ## Related tasks
 - `task-20260304-2215-subagent-reliability` — in progress
@@ -21,19 +36,10 @@ This is the highest-leverage remaining open reliability item because it affects
 - User explicitly wants subagent tooling reliability fixed and completion-event spam prevented.
 - Fresh-session implementation discipline and monitoring thresholds were already documented locally.
-## Goals for this pass
+## Immediate baton
-1. Establish the current failure modes with concrete evidence.
+- Do **not** reopen the solved `agent.wait` investigation unless a fresh repro appears.
-2. Separate ACP-specific failures from generic subagent/session issues.
+- If this project is resumed next, start with ACP-specific Claude/Codex runtime failures.
-3. Determine what is already fixed versus still broken.
+- Treat `/subagents log` UX edits as a separate branch/pass so they do not muddy the reliability fix branch.
 4. Produce a concrete recommendation and, if feasible in one pass, implement the highest-confidence fix.
 5. Update task/memory state with evidence before ending.
 ## Suggested investigation plan
 1. Review current OpenClaw docs and local memory around subagent/ACP failures.
 2. Reproduce or inspect recent failures using session/task evidence instead of guessing.
 3. Check current runtime status / relevant logs / known local patches.
 4. If the issue is in OpenClaw core, work in `external/openclaw-upstream/` on a focused branch.
 5. Validate with the smallest reliable reproduction possible.
 ## Evidence gathered so far
 - Fresh subagent run failed immediately when an explicit `glm-5` choice resolved into the Z.AI provider path before any useful task execution.