docs(wip): tighten subagent reliability baton

2026-03-13 19:34:26 +00:00
parent 8998e7535e
commit 3cfa7a158c
1 changed file with 22 additions and 16 deletions
--- a/WIP.subagent-reliability.md
+++ b/WIP.subagent-reliability.md
@@ -1,15 +1,30 @@
 # WIP.subagent-reliability.md

 ## Status
-Status: `open`
+Status: `follow-up`
 Owner: `zap`
 Opened: `2026-03-13`
+Last updated: `2026-03-13`

 ## Purpose
 Investigate and improve subagent / ACP delegation reliability, including timeout behavior, runtime failures, and delayed/duplicate completion-event noise.

-## Why now
-This is the highest-leverage remaining open reliability item because it affects trust in delegation and the usability of fresh implementation runs.
+## Current state
+- The core reliability thread tracked in this WIP is now **fixed and live-verified** on `external/openclaw-upstream` branch `fix/subagent-wait-error-outcome`.
+- Verified fixed:
+  - subagent persistence / announcement handling for terminal assistant-provider failures
+  - raw `agent.wait` semantics for the live direct gateway path
+- Key upstream commits on this branch:
+  - `2a2ed0d6f` — `fix(subagents): derive outcome from terminal assistant errors`
+  - `5a328d22b` — `fix(agent): surface terminal run errors in wait semantics`
+  - `f9a78e8f7` — `fix(gateway): honor terminal assistant errors in live wait path`
+
+## Why this file is still open
+- The broader delegation reliability task is not fully done yet.
+- Remaining follow-up work is now narrower:
+  1. ACP-specific Claude/Codex runtime failures
+  2. optional separate `/subagents log` UX cleanup
+  3. push/PR the focused upstream reliability branch when desired

 ## Related tasks
 - `task-20260304-2215-subagent-reliability` — in progress
@@ -21,19 +36,10 @@ This is the highest-leverage remaining open reliability item because it affects
 - User explicitly wants subagent tooling reliability fixed and completion-event spam prevented.
 - Fresh-session implementation discipline and monitoring thresholds were already documented locally.

-## Goals for this pass
-1. Establish the current failure modes with concrete evidence.
-2. Separate ACP-specific failures from generic subagent/session issues.
-3. Determine what is already fixed versus still broken.
-4. Produce a concrete recommendation and, if feasible in one pass, implement the highest-confidence fix.
-5. Update task/memory state with evidence before ending.
-
-## Suggested investigation plan
-1. Review current OpenClaw docs and local memory around subagent/ACP failures.
-2. Reproduce or inspect recent failures using session/task evidence instead of guessing.
-3. Check current runtime status / relevant logs / known local patches.
-4. If the issue is in OpenClaw core, work in `external/openclaw-upstream/` on a focused branch.
-5. Validate with the smallest reliable reproduction possible.
+## Immediate baton
+- Do **not** reopen the solved `agent.wait` investigation unless a fresh repro appears.
+- If this project is resumed next, start with ACP-specific Claude/Codex runtime failures.
+- Treat `/subagents log` UX edits as a separate branch/pass so they do not muddy the reliability fix branch.

 ## Evidence gathered so far
 - Fresh subagent run failed immediately when an explicit `glm-5` choice resolved into the Z.AI provider path before any useful task execution.