docs(reliability): record acpx follow-up evidence
This commit is contained in:
@@ -88,6 +88,43 @@
|
||||
- subagent persistence/announcement fix: live-verified ✅
|
||||
- raw `agent.wait` semantics fix: live-verified ✅
|
||||
- Side note: unrelated dirty `/subagents log` UX changes in `external/openclaw-upstream` regression-passed `src/auto-reply/reply/commands.test.ts` (44 tests) but were intentionally left out-of-scope for this focused reliability pass.
|
||||
|
||||
## ACP Claude/Codex follow-up (post-`agent.wait` fix)
|
||||
- Historical deferred task `task-20260304-211216-acp-claude-codex` still referenced old failures `Claude: acpx exited with code 1` and `Codex: acpx exited with code 5`, but those exact crashes were **not** reproduced in the latest focused pass.
|
||||
- Current host state check:
|
||||
- `claude` installed: `/home/linuxbrew/.linuxbrew/bin/claude` (`2.1.63`)
|
||||
- `codex` installed: `/home/linuxbrew/.linuxbrew/bin/codex` (`0.107.0`)
|
||||
- no global `acpx` on PATH, but bundled plugin-local runtime exists at `~/.local/share/pnpm/.../openclaw/extensions/acpx/node_modules/.bin/acpx`
|
||||
- current `~/.openclaw/openclaw.json` only showed `plugins.entries.telegram.enabled=true`; no explicit `acp` block / `acpx` plugin entry was present, so the smallest reliable repro path used the bundled `acpx` directly rather than a full OpenClaw ACP session
|
||||
- Live direct bundled-acpx repro results:
|
||||
- Codex command:
|
||||
- `.../acpx --format json --json-strict --timeout 15 codex exec 'reply with OK only'`
|
||||
- result: clean JSON-RPC/session stream ended with `agent_message_chunk: "OK"`, `id:2 result:{stopReason:"end_turn"}`, process `exit=0`
|
||||
- Claude command:
|
||||
- `.../acpx --format json --json-strict --timeout 20 claude exec 'reply with OK only'`
|
||||
- stdout included top-level JSON-RPC errors:
|
||||
- `{"jsonrpc":"2.0","id":2,"error":{"code":-32000,"message":"Authentication required"}}`
|
||||
- `{"jsonrpc":"2.0","id":null,"error":{"code":-32000,"message":"Authentication required"}}`
|
||||
- process still exited `0`
|
||||
- Source-level finding in `external/openclaw-upstream/extensions/acpx/src/runtime-internals/events.ts`:
|
||||
- prompt parsing handled typed `{type:"error"}` lines but dropped top-level JSON-RPC `error` responses
|
||||
- that meant `runtime.runTurn()` could treat a Claude auth failure as success (`done`) when the agent emitted JSON-RPC errors yet exited cleanly
|
||||
- Implemented focused upstream fix on branch `fix/subagent-wait-error-outcome`:
|
||||
- `extensions/acpx/src/runtime-internals/events.ts`
|
||||
- `toAcpxErrorEvent()` now also recognizes top-level JSON-RPC `error` responses via `parseControlJsonError()`
|
||||
- `parsePromptEventLine()` now emits ACP runtime `type:"error"` events for that shape instead of dropping it
|
||||
- added regression coverage:
|
||||
- `extensions/acpx/src/runtime-internals/events.test.ts`
|
||||
- `extensions/acpx/src/runtime-internals/test-fixtures.ts`
|
||||
- `extensions/acpx/src/runtime.test.ts`
|
||||
- Targeted validation passed:
|
||||
- `cd /home/openclaw/.openclaw/workspace/external/openclaw-upstream && pnpm exec vitest run extensions/acpx/src/runtime-internals/events.test.ts extensions/acpx/src/runtime.test.ts extensions/acpx/src/runtime-internals/control-errors.test.ts`
|
||||
- result: `22` tests passed across `3` files
|
||||
- Net status after this pass:
|
||||
- old `acpx exited with code 1/5` reports remain historical evidence only
|
||||
- Codex ACP direct runtime path works today
|
||||
- Claude ACP direct runtime path currently fails for auth, and OpenClaw had a real bug in how the bundled acpx runtime parsed that failure shape
|
||||
- remaining follow-up is end-to-end OpenClaw ACP-path validation once ACP is explicitly configured here (or if a fresh exit-code repro appears)
|
||||
- Will also explicitly requested that zap keep a light eye on active subagents and check whether they look stuck instead of assuming they are fine until completion.
|
||||
- Will explicitly reinforced on 2026-03-13 that once planning is done, zap should use subagents ASAP and start implementation in a fresh session rather than continuing to implement inside the long-lived main chat.
|
||||
- Will explicitly asked on 2026-03-13 for more frequent checks on active subagent runs; zap should inspect/steer sooner instead of waiting for long silent stretches.
|
||||
|
||||
@@ -5,11 +5,14 @@
|
||||
"title": "Fix ACP runtime failures for Claude Code and Codex agents",
|
||||
"owner": "zap",
|
||||
"priority": "high",
|
||||
"status": "open",
|
||||
"details": "Both ACP runs failed during this session (Claude: acpx exited with code 1, Codex: acpx exited with code 5). Investigate acpx/ACP runtime failure path and restore reliable delegation for claude/codex agents.",
|
||||
"status": "in-progress",
|
||||
"details": "Historical evidence said Claude/Codex ACP runs failed with `acpx exited with code 1/5`. Latest focused pass narrowed the live issue: direct bundled `acpx` now shows Codex working, while Claude returns top-level JSON-RPC `Authentication required` errors and exits 0. A focused upstream fix now makes the bundled acpx runtime surface those JSON-RPC prompt errors instead of silently treating them as success. Remaining work: validate through the real OpenClaw ACP session path once ACP is explicitly configured here, or capture a fresh repro of the older exit-code crashes.",
|
||||
"notes": [
|
||||
"Reported by Will on 2026-03-04.",
|
||||
"Added as deferred follow-up while immediate LiteLLM route fix was applied directly."
|
||||
"Added as deferred follow-up while immediate LiteLLM route fix was applied directly.",
|
||||
"2026-03-13 follow-up: exact historical `acpx exited with code 1/5` crashes were not reproduced. Live direct bundled-acpx repros showed Codex success and Claude top-level JSON-RPC auth errors with clean exit 0.",
|
||||
"2026-03-13 follow-up: fixed bundled acpx prompt parsing in external/openclaw-upstream so top-level JSON-RPC error responses now emit ACP runtime error events instead of being dropped. Targeted validation passed: 22 tests across events/control-errors/runtime test files.",
|
||||
"2026-03-13 remaining step: validate the fix through a real OpenClaw ACP session once `acp`/`acpx` is explicitly enabled in local config, or wait for a fresh end-to-end repro of the older exit-code failures."
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
Reference in New Issue
Block a user