docs(reliability): record live agent.wait blocker evidence

This commit is contained in:
zap
2026-03-13 18:56:52 +00:00
parent 0c25426974
commit f2b99841af
4 changed files with 68 additions and 10 deletions

View File

@@ -4,7 +4,7 @@
Immediate baton-pass for the next fresh implementation session.
## Current objective
Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The failure-path proof for the new subagent outcome handling is captured, and a focused upstream `agent.wait` semantics fix is now implemented/tested on branch `fix/subagent-wait-error-outcome`; the remaining follow-up is deployment/live verification, not root-cause discovery.
Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The failure-path proof for the new subagent outcome handling is captured, but the focused upstream `agent.wait` semantics fix on branch `fix/subagent-wait-error-outcome` did **not** hold in a fresh live source-gateway repro, so the remaining work is a narrower root-cause follow-up on the still-open live `agent.wait => ok` path.
## Use these state files first
1. `WIP.subagent-reliability.md` — canonical state for this pass
@@ -31,12 +31,19 @@ Investigate and improve subagent / ACP delegation reliability with evidence-firs
- run id `b50cb91f-6219-44f7-9d2f-a1264ac7ceaf`
- child transcript `~/.openclaw/agents/main/sessions/f114b831-000b-4070-a539-85c68d2b7057.jsonl`
- `runs.json` now stores `outcome.status: "error"`, `endedReason: "subagent-error"`, and a non-null `frozenResultText`
2. For raw gateway `agent.wait`, use the new upstream diagnosis/fix rather than re-arguing semantics:
- decision: the previous `status:"ok"` was a bug, not intended layering
- cause: `src/commands/agent.ts` fallback lifecycle emission used `phase:"end"` even when resolved run `meta.stopReason:"error"`
- fix: `src/commands/agent.ts` now emits lifecycle `phase:"error"` with extracted terminal error text in that case; `src/gateway/server-methods/agent-wait-dedupe.ts` also maps resolved agent payloads with terminal `stopReason:"error"` / `aborted:true` to `error` / `timeout`
- targeted validation passed: `pnpm -C external/openclaw-upstream test -- --run src/commands/agent.test.ts src/gateway/server-methods/agent-wait-dedupe.test.ts src/gateway/server-methods/server-methods.test.ts`
3. If continuing, do a low-noise live verification on the patched gateway/runtime for the same failure class, then report whether raw `agent.wait` now returns `status:"error"` as expected.
2. Treat raw gateway `agent.wait` as still **open** despite the current follow-up fix branch.
- decisive live source-gateway repro:
- gateway launch: `OPENCLAW_SKIP_CHANNELS=1 CLAWDBOT_SKIP_CHANNELS=1 pnpm exec tsx src/index.ts gateway run --port 18902 --bind loopback --auth none --allow-unconfigured`
- session key: `agent:main:subagent:agent-wait-gpt53-live-source-1773427981586`
- run id: `gwc-live-agent-wait-gpt53-source-1773427981614`
- `agent.wait`: `{"runId":"gwc-live-agent-wait-gpt53-source-1773427981614","status":"ok","endedAt":1773427984243}`
- last assistant: `provider:"openai-codex" model:"gpt-5.3-codex" stopReason:"error" errorMessage contains context_length_exceeded`
- this is the current canonical blocker evidence for the still-open live path
3. Most likely remaining gap to investigate next:
- `src/commands/agent.ts` only applies the new fallback correction when `!lifecycleEnded`
- `lifecycleEnded` flips true on any inner lifecycle `phase:"end"` or `phase:"error"`
- `src/gateway/server-methods/agent-job.ts` resolves/caches `phase:"end"` as terminal `status:"ok"`
- so an inner lifecycle emitter is still the likeliest place where terminal assistant/provider failures are being marked `end` too early on the live direct gateway path
4. Re-check whether ACP-specific Claude/Codex runtime failures are still reproducible after separating them from the generic subagent outcome bug.
5. Leave the dirty `/subagents log` UX diff out of this branch unless you intentionally spin a separate focused pass; it regression-passed `src/auto-reply/reply/commands.test.ts` but still lacks dedicated feature coverage.