docs(reliability): record acpx follow-up evidence
33
HANDOFF.md
@@ -4,7 +4,7 @@
Immediate baton-pass for the next fresh implementation session.
## Current objective
Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The subagent persistence/announcement fix and the raw `agent.wait` semantics fix are now both live-verified on branch `fix/subagent-wait-error-outcome`; the next work should shift to follow-up cleanup (commit/push/PR when desired, ACP-specific runtime failures, and any unrelated `/subagents log` UX work only in a separate focused pass).
Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The subagent persistence/announcement fix and the raw `agent.wait` semantics fix are now both live-verified on branch `fix/subagent-wait-error-outcome`; the next work should stay tightly scoped to ACP-specific Claude/Codex follow-up. This pass already narrowed that thread to a real bundled-acpx parser bug for Claude-style JSON-RPC auth failures and landed a focused fix/tests. The remaining work is end-to-end OpenClaw ACP-path validation (or a fresh repro of the older exit-code crash notes) plus normal commit/push/PR cleanup when desired.
## Use these state files first
1. `WIP.subagent-reliability.md` — canonical state for this pass
@@ -20,14 +20,21 @@ Investigate and improve subagent / ACP delegation reliability with evidence-firs
## Known truths
- TUI noise suppression was already patched locally and upstreamed earlier.
- User still wants actual subagent reliability improved, not just UI noise hidden.
- Prior ACP failures included Claude/Codex runtime exits.
- Historical ACP notes included `Claude: acpx exited with code 1` and `Codex: acpx exited with code 5`, but those exact crashes were **not** reproduced in the latest pass.
- Fresh-session implementation discipline is now the expected approach for non-trivial work.
- One explicit failure mode is already understood: requesting `glm-5` can route into an unavailable GLM-5 provider/entitlement path in this setup.
- A deeper bug was also identified: a subagent run could finish with terminal assistant errors yet still be recorded as successful with no frozen result.
- An upstream patch for that error/outcome handling now exists in `external/openclaw-upstream` on branch `fix/subagent-wait-error-outcome` with targeted tests passing.
- A deeper bug was also identified and fixed earlier: a subagent run could finish with terminal assistant errors yet still be recorded as successful with no frozen result.
- Current host state for ACP follow-up:
  - bundled plugin-local `acpx` exists and runs
  - `~/.openclaw/openclaw.json` currently has no explicit `acp` block / enabled `acpx` plugin entry, so this pass used the smallest direct acpx repro path instead of a full OpenClaw ACP session
- New confirmed acpx/runtime bug from this pass:
  - Codex direct acpx path works
  - Claude direct acpx path returns top-level JSON-RPC auth errors (`Authentication required`) and exits `0`
  - `extensions/acpx/src/runtime-internals/events.ts` previously dropped that JSON-RPC error shape during prompt streaming, so OpenClaw could falsely treat the turn as successful
- A focused upstream fix for that runtime bug now exists on `fix/subagent-wait-error-outcome` with targeted tests passing.
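
The dropped-error failure mode described above can be illustrated with a minimal sketch. It is not the code in `events.ts`: the `RuntimeEvent` type, the `parseJsonRpcError` helper, the `-32000` code, and both sample payloads are assumptions made for illustration. Only the `Authentication required` message and the `type:"error"` event shape come from these notes.

```typescript
// Hypothetical event shape for illustration; not the actual types in events.ts.
type RuntimeEvent =
  | { type: "text"; text: string }
  | { type: "error"; message: string; code?: number };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns an error event when the line is a top-level JSON-RPC error
// response, otherwise null so the caller can keep its normal handling.
function parseJsonRpcError(line: string): RuntimeEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return null; // not JSON at all
  }
  if (!isRecord(parsed) || parsed["jsonrpc"] !== "2.0") return null;

  const err = parsed["error"];
  if (!isRecord(err)) return null; // a normal result or notification

  const message = err["message"];
  if (typeof message !== "string") return null;

  const mapped: { type: "error"; message: string; code?: number } = {
    type: "error",
    message,
  };
  const code = err["code"];
  if (typeof code === "number") mapped.code = code;
  return mapped;
}

// Sample payloads are invented; only the message text is taken from the notes.
const authFailure =
  '{"jsonrpc":"2.0","id":1,"error":{"code":-32000,"message":"Authentication required"}}';
const okResult = '{"jsonrpc":"2.0","id":1,"result":{"stopReason":"end_turn"}}';

const errorEvent = parseJsonRpcError(authFailure);
const ignored = parseJsonRpcError(okResult);
console.log(errorEvent, ignored);
```

Because the process exits `0` in this case, the exit code cannot signal the failure; the error has to be raised from the stream itself, which is why the sketch returns an explicit event instead of discarding the line.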
## Highest-priority next actions
1. Treat the reliability fixes as live-verified on this branch:
1. Treat the generic reliability fixes as live-verified on this branch:
   - subagent persistence/announcement proof:
     - run id `b50cb91f-6219-44f7-9d2f-a1264ac7ceaf`
     - child transcript `~/.openclaw/agents/main/sessions/f114b831-000b-4070-a539-85c68d2b7057.jsonl`
@@ -37,9 +44,19 @@ Investigate and improve subagent / ACP delegation reliability with evidence-firs
     - run id: `gwc-live-agent-wait-gpt53-source-fixed2-1773429512008`
     - final `agent` response: `finalStatus:"error"`
     - `agent.wait`: `{"runId":"gwc-live-agent-wait-gpt53-source-fixed2-1773429512008","status":"error","endedAt":1773429514106,"error":"LLM request rejected: Your input exceeds the context window of this model. Please adjust your input and try again."}`
2. Commit/push/PR the focused upstream reliability branch when ready.
3. Re-check whether ACP-specific Claude/Codex runtime failures are still reproducible now that the generic subagent outcome/agent.wait bugs are separated and fixed.
4. Leave the dirty `/subagents log` UX diff out of this branch unless you intentionally spin a separate focused pass; it regression-passed `src/auto-reply/reply/commands.test.ts` but still lacks dedicated feature coverage.
2. Treat the ACP follow-up as partially closed, not fully done:
   - live direct bundled-acpx Codex repro now works and returns `OK`
   - live direct bundled-acpx Claude repro returns JSON-RPC auth errors with process exit `0`
   - focused upstream fix now maps top-level JSON-RPC prompt errors into ACP runtime `type:"error"` events instead of silently dropping them
   - targeted validation passed:
     - `cd external/openclaw-upstream && pnpm exec vitest run extensions/acpx/src/runtime-internals/events.test.ts extensions/acpx/src/runtime.test.ts extensions/acpx/src/runtime-internals/control-errors.test.ts`
     - result: `22` tests passed across `3` files
3. Next, do end-to-end OpenClaw ACP validation if/when ACP is explicitly enabled here:
   - confirm or add the needed `acp` / `acpx` config in `~/.openclaw/openclaw.json` (or equivalent current config path)
   - run the smallest real OpenClaw ACP turn/session and confirm Claude auth failures now surface as terminal errors instead of false success
   - only reopen the old `acpx exited with code 1/5` thread if a fresh repro appears
4. Commit/push/PR the focused upstream reliability branch when ready.
5. Leave the dirty `/subagents log` UX diff out of this branch unless you intentionally spin a separate focused pass; it regression-passed `src/auto-reply/reply/commands.test.ts` but still lacks dedicated feature coverage.
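
For the end-to-end check in item 3, the pass condition is that a failed turn is reported as a terminal error rather than as success. A rough sketch of that check against an `agent.wait` result follows. The field names mirror the payload recorded in item 1; the `classifyWait` helper, the `Outcome` type, and the rule that any other status counts as success are assumptions, not OpenClaw APIs.

```typescript
// Field names mirror the recorded `agent.wait` payload.
// The helper name and the Outcome type are hypothetical.
interface AgentWaitResult {
  runId: string;
  status: string;
  endedAt?: number;
  error?: string;
}

type Outcome = { ok: true } | { ok: false; reason: string };

// Assumption: an "error" status or a non-empty `error` string marks a failed
// run; every other status is treated as success in this sketch.
function classifyWait(result: AgentWaitResult): Outcome {
  const reason = result.error;
  if (result.status === "error" || (reason !== undefined && reason.length > 0)) {
    return { ok: false, reason: reason || `run ended with status ${result.status}` };
  }
  return { ok: true };
}

// The payload below is the one recorded in these notes, copied verbatim.
const recorded: AgentWaitResult = JSON.parse(
  '{"runId":"gwc-live-agent-wait-gpt53-source-fixed2-1773429512008","status":"error","endedAt":1773429514106,"error":"LLM request rejected: Your input exceeds the context window of this model. Please adjust your input and try again."}'
);

const outcome = classifyWait(recorded);
console.log(outcome);
```

The same check would apply to an ACP turn once ACP is enabled: a Claude auth failure should produce a failed outcome here, not `{ ok: true }`.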
## Success criteria
- Real-run verification of the new error/outcome fix. ✅ done for subagent persistence/announcement handling.