3.1 KiB
3.1 KiB
2026-03-13
Subagent reliability investigation
- Fresh implementation subagent launch for subagent/ACP reliability failed immediately before doing any task work.
- Failure mode: delegated run was spawned with model
glm-5, which resolved to provider modelzai/glm-5. - Current installed agent auth profile keys inspected in agent stores include
openai-codex:default,litellm:default, andgithub-copilot:github. - Will clarified on 2026-03-13 that Z.AI auth does exist in the environment, but the account is not entitled for
glm-5. - Verified by inspecting agent auth profile keys under:
/home/openclaw/.openclaw/agents/*/agent/auth-profiles.json
- Relevant OpenClaw docs confirm:
- subagent spawns inherit caller model when
sessions_spawn.modelis omitted - provider/model auth errors like
No API key found for provider "zai"occur when a provider model is selected without matching auth - multi-agent auth is per-agent via
~/.openclaw/agents/<agentId>/agent/auth-profiles.json
- subagent spawns inherit caller model when
- Conclusion: the immediate failure was caused by an incorrect explicit model selection in the spawn request, not by missing auth propagation between agents.
- Corrective action: retry fresh delegation with
litellm/glm-5(the intended medium-tier routed model for delegated implementation work in this setup). - Will explicitly requested on 2026-03-13 to use
gpt-5.4for subagents for now while debugging delegation reliability. - New evidence from the corrected run:
~/.openclaw/agents/main/sessions/1615a980-cf92-4d5e-845a-a2abe77c0418.jsonlshows repeated assistantstopReason:"error"entries with429 ... GLM-5 not included in current subscription plan, but~/.openclaw/subagents/runs.jsonrecorded run776a8b51-6fdc-448e-83bc-55418814a05basoutcome.status: "ok"andfrozenResultText: null. - That separates ACP/runtime choice problems from a generic subagent completion/reporting bug: a terminal assistant error can still be persisted/announced as success with no useful result.
- Implemented upstream fix on branch
external/openclaw-upstream@fix/subagent-wait-error-outcome:- added assistant terminal-outcome helper so empty-content assistant errors still yield usable terminal text
- subagent registry now downgrades
agent.wait => oktoerrorwhen the child session's terminal assistant message is actually an error - subagent announce flow now reports terminal assistant errors as failed outcomes instead of successful
(no output)completions
- Targeted validation passed:
pnpm -C /home/openclaw/.openclaw/workspace/external/openclaw-upstream test -- --run src/agents/tools/sessions-helpers.terminal-text.test.ts src/agents/subagent-registry.persistence.test.ts src/gateway/server-methods/server-methods.test.ts- result:
50 testspassed across3files
- Follow-up still needed: rerun a real delegated subagent using a known-working model entitlement (
gpt-5.4preferred for now) to verify successful runs leave a useful frozen result and failed runs now persist aserror. - Will also explicitly requested that zap keep a light eye on active subagents and check whether they look stuck instead of assuming they are fine until completion.