docs(wip): tighten subagent reliability baton

This commit is contained in:
zap
2026-03-13 19:34:26 +00:00
parent 8998e7535e
commit 3cfa7a158c

View File

@@ -1,15 +1,30 @@
# WIP.subagent-reliability.md
## Status
Status: `open`
Status: `follow-up`
Owner: `zap`
Opened: `2026-03-13`
Last updated: `2026-03-13`
## Purpose
Investigate and improve subagent / ACP delegation reliability, including timeout behavior, runtime failures, and delayed/duplicate completion-event noise.
## Why now
This is the highest-leverage remaining open reliability item because it affects trust in delegation and the usability of fresh implementation runs.
## Current state
- The core reliability thread tracked in this WIP is now **fixed and live-verified** on `external/openclaw-upstream` branch `fix/subagent-wait-error-outcome`.
- Verified fixed:
- subagent persistence / announcement handling for terminal assistant-provider failures
- raw `agent.wait` semantics for the live direct gateway path
- Key upstream commits on this branch:
- `2a2ed0d6f``fix(subagents): derive outcome from terminal assistant errors`
- `5a328d22b``fix(agent): surface terminal run errors in wait semantics`
- `f9a78e8f7``fix(gateway): honor terminal assistant errors in live wait path`
## Why this file is still open
- The broader delegation reliability task is not fully done yet.
- Remaining follow-up work is now narrower:
1. ACP-specific Claude/Codex runtime failures
2. optional separate `/subagents log` UX cleanup
3. push/PR the focused upstream reliability branch when desired
## Related tasks
- `task-20260304-2215-subagent-reliability` — in progress
@@ -21,19 +36,10 @@ This is the highest-leverage remaining open reliability item because it affects
- User explicitly wants subagent tooling reliability fixed and completion-event spam prevented.
- Fresh-session implementation discipline and monitoring thresholds were already documented locally.
## Goals for this pass
1. Establish the current failure modes with concrete evidence.
2. Separate ACP-specific failures from generic subagent/session issues.
3. Determine what is already fixed versus still broken.
4. Produce a concrete recommendation and, if feasible in one pass, implement the highest-confidence fix.
5. Update task/memory state with evidence before ending.
## Suggested investigation plan
1. Review current OpenClaw docs and local memory around subagent/ACP failures.
2. Reproduce or inspect recent failures using session/task evidence instead of guessing.
3. Check current runtime status / relevant logs / known local patches.
4. If the issue is in OpenClaw core, work in `external/openclaw-upstream/` on a focused branch.
5. Validate with the smallest reliable reproduction possible.
## Immediate baton
- Do **not** reopen the solved `agent.wait` investigation unless a fresh repro appears.
- If this project is resumed next, start with ACP-specific Claude/Codex runtime failures.
- Treat `/subagents log` UX edits as a separate branch/pass so they do not muddy the reliability fix branch.
## Evidence gathered so far
- Fresh subagent run failed immediately when an explicit `glm-5` choice resolved into the Z.AI provider path before any useful task execution.