4.5 KiB
WIP.subagent-reliability.md
Status
Status: open
Owner: zap
Opened: 2026-03-13
Purpose
Investigate and improve subagent / ACP delegation reliability, including timeout behavior, runtime failures, and delayed/duplicate completion-event noise.
Why now
This is the highest-leverage remaining open reliability item because it affects trust in delegation and the usability of fresh implementation runs.
Related tasks
- `task-20260304-2215-subagent-reliability`: in progress
- `task-20260304-211216-acp-claude-codex`: open
Known context
- Prior work already patched TUI formatting to suppress internal runtime completion context blocks.
- Upstream patch exists in `external/openclaw-upstream` on branch `fix/tui-hide-internal-runtime-context`, commit `0f66a4547`.
- User explicitly wants subagent tooling reliability fixed and completion-event spam prevented.
- Fresh-session implementation discipline and monitoring thresholds were already documented locally.
Goals for this pass
- Establish the current failure modes with concrete evidence.
- Separate ACP-specific failures from generic subagent/session issues.
- Determine what is already fixed versus still broken.
- Produce a concrete recommendation and, if feasible in one pass, implement the highest-confidence fix.
- Update task/memory state with evidence before ending.
Suggested investigation plan
- Review current OpenClaw docs and local memory around subagent/ACP failures.
- Reproduce or inspect recent failures using session/task evidence instead of guessing.
- Check current runtime status / relevant logs / known local patches.
- If the issue is in OpenClaw core, work in `external/openclaw-upstream/` on a focused branch.
- Validate with the smallest reliable reproduction possible.
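One way to inspect recent runs from evidence rather than guessing is to flag run records that claim success but carry no result. The sketch below is a minimal illustration, not OpenClaw code: the `RunRecord` shape, the `runId` key, and the placement of `frozenResultText` at the top level of a record are assumptions; only `outcome.status` and `frozenResultText` are names taken from these notes.

```typescript
// Hypothetical record shape. Only `outcome.status` and `frozenResultText`
// appear in the notes; `runId` and the nesting are assumptions.
interface RunRecord {
  runId: string;
  outcome?: { status?: string };
  frozenResultText?: string | null;
}

// True when a result is absent, null, or only whitespace.
function hasNoResult(text: string | null | undefined): boolean {
  return text === null || text === undefined || text.trim() === "";
}

// Return the ids of runs recorded as "ok" that have no frozen result.
// These are candidates for a manual look at the child transcript.
function findSuspectRuns(runs: RunRecord[]): string[] {
  return runs
    .filter((r) => r.outcome?.status === "ok" && hasNoResult(r.frozenResultText))
    .map((r) => r.runId);
}
```

A flagged run is only a lead: the child transcript still has to be checked to confirm whether the final assistant turn actually failed.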
Evidence gathered so far
- Fresh subagent run failed immediately when an explicit `glm-5` choice resolved into the Z.AI provider path before any useful task execution.
- Current installed agent auth profile keys inspected in agent stores include `openai-codex:default`, `litellm:default`, and `github-copilot:github`.
- Will clarified that Z.AI auth does exist, but this account is not entitled for `glm-5`.
- Root cause for this immediate repro is therefore best described as a provider/model entitlement mismatch caused by the explicit spawn model choice, not missing auth propagation between agents.
- A later "corrected" run using `litellm/glm-5` also did not succeed: child transcript `~/.openclaw/agents/main/sessions/1615a980-cf92-4d5e-845a-a2abe77c0418.jsonl` contains repeated assistant `stopReason:"error"` entries with `429 ... subscription plan does not yet include access to GLM-5`, while `~/.openclaw/subagents/runs.json` recorded that run (`776a8b51-6fdc-448e-83bc-55418814a05b`) as `outcome.status: "ok"` with `frozenResultText: null`.
- This separates the problems:
  - ACP/operator/model-selection issue: explicit `glm-5` → `zai/glm-5` without entitlement (already understood).
  - Generic subagent completion/reporting issue: terminal assistant errors can still be stored/announced as successful completion with no frozen result.
- Implemented upstream patch on branch `fix/subagent-wait-error-outcome` in `external/openclaw-upstream` so subagent completion paths inspect the latest assistant terminal message and treat terminal assistant errors as `outcome.status: "error"` rather than `ok`.
- Validation completed for targeted non-E2E coverage:
  - `pnpm -C external/openclaw-upstream test -- --run src/agents/tools/sessions-helpers.terminal-text.test.ts src/agents/subagent-registry.persistence.test.ts src/gateway/server-methods/server-methods.test.ts`
  - result: passed (50 tests across 3 files).
- E2E-style `subagent-announce.format.e2e.test.ts` coverage was updated, but the normal Vitest include rules exclude `*.e2e.test.ts`; a direct `pnpm test -- --run ...e2e...` confirms exclusion rather than executing that file.
- Next step after this patch: rerun a real subagent with a known-working model (`gpt-5.4` or another actually entitled model) and confirm `runs.json` stores `error` on terminal assistant failure and a useful frozen result on success.
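The rule the patch is meant to enforce (look at the latest assistant terminal message, and report `error` when it ended in an error) can be sketched as below. This is an illustration under stated assumptions, not the code on `fix/subagent-wait-error-outcome`: the transcript entry fields `role`, `text`, and `errorMessage` are hypothetical, and treating a transcript with no assistant message as an error is a choice made here, not something the notes confirm. Only `stopReason`, `outcome.status`, and `frozenResultText` come from the evidence above.

```typescript
// Hypothetical transcript entry shape; only `stopReason` is named in the notes.
interface TranscriptEntry {
  role?: string;         // assumed: "assistant", "user", ...
  stopReason?: string;   // "error" marks a failed assistant turn
  text?: string;         // assumed: final assistant text
  errorMessage?: string; // assumed: provider error detail
}

interface RunOutcome {
  status: "ok" | "error";
  frozenResultText: string | null;
  error?: string;
}

// Parse JSONL text, skipping blank lines and lines that are not JSON objects.
function parseTranscript(jsonl: string): TranscriptEntry[] {
  const entries: TranscriptEntry[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (typeof parsed === "object" && parsed !== null) {
        entries.push(parsed as TranscriptEntry);
      }
    } catch {
      // Malformed line: ignore it rather than fail the whole transcript.
    }
  }
  return entries;
}

// Derive the outcome from the last assistant entry in the transcript.
function deriveOutcome(entries: TranscriptEntry[]): RunOutcome {
  let last: TranscriptEntry | undefined;
  for (const e of entries) {
    if (e.role === "assistant") last = e;
  }
  if (!last) {
    // Assumption: a run with no assistant output is not a success.
    return { status: "error", frozenResultText: null, error: "no assistant message" };
  }
  if (last.stopReason === "error") {
    return {
      status: "error",
      frozenResultText: null,
      error: last.errorMessage ?? "assistant terminal error",
    };
  }
  return { status: "ok", frozenResultText: last.text ?? null };
}
```

Only the last assistant entry decides the outcome, so a run that hits an error mid-way and then recovers is still reported as `ok`, while a run whose final turn is an error is never announced as a success.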
Constraints
- Prefer evidence over theory.
- Do not claim a fix without concrete validation.
- Keep the main session clean; use this file as the canonical baton.
Success criteria
- Clear diagnosis of the current reliability problem(s).
- At least one of:
  - implemented fix with validation, or
  - sharply scoped next fix plan with exact evidence and files.
- `memory/2026-03-13.md` (or current daily note), `memory/tasks.json`, and this WIP updated.