docs(subagents): record live gpt-5.4 failure verification
This commit is contained in:
@@ -23,6 +23,18 @@
|
||||
- Targeted validation passed:
|
||||
- `pnpm -C /home/openclaw/.openclaw/workspace/external/openclaw-upstream test -- --run src/agents/tools/sessions-helpers.terminal-text.test.ts src/agents/subagent-registry.persistence.test.ts src/gateway/server-methods/server-methods.test.ts`
|
||||
- result: `50 tests` passed across `3` files
|
||||
- Follow-up still needed: rerun a real delegated subagent using a known-working model entitlement (`gpt-5.4` preferred for now) to verify successful runs leave a useful frozen result and failed runs now persist as `error`.
|
||||
- Real success-path verification later passed on `gpt-5.4` with run `23750d80-b481-4f50-b219-cc9245be405f` and final child result `SUCCESS-PROBE-OK`.
|
||||
- Real failure-path verification later also passed on valid `gpt-5.4` by intentionally triggering a `context_length_exceeded` provider error with a token-dense oversized task payload.
|
||||
- child run: `b50cb91f-6219-44f7-9d2f-a1264ac7ceaf`
|
||||
- child session: `agent:main:subagent:4c0dd686-cd2e-4cba-b80b-2fbf309a4594`
|
||||
- child transcript: `~/.openclaw/agents/main/sessions/f114b831-000b-4070-a539-85c68d2b7057.jsonl`
|
||||
- transcript terminal assistant entry recorded `provider:"openai-codex"`, `model:"gpt-5.4"`, `stopReason:"error"`, `errorMessage:"Codex error: {...context_length_exceeded...}"`
|
||||
- matching `~/.openclaw/subagents/runs.json` now correctly stored:
|
||||
- `outcome.status: "error"`
|
||||
- `outcome.error: "Codex error: {...context_length_exceeded...}"`
|
||||
- `endedReason: "subagent-error"`
|
||||
- `frozenResultText: "Codex error: {...context_length_exceeded...}"`
|
||||
- Important remaining nuance: raw gateway `agent.wait` for that same failed child still returned `status:"ok"` with only `endedAt`, so the current fix is verified for subagent outcome persistence/announcements but not for lower-level `agent.wait` semantics.
|
||||
- Side note: unrelated dirty `/subagents log` UX changes in `external/openclaw-upstream` regression-passed `src/auto-reply/reply/commands.test.ts` (44 tests) but were intentionally left out-of-scope for this focused reliability pass.
|
||||
- Will also explicitly requested that zap keep a light eye on active subagents and check whether they look stuck instead of assuming they are fine until completion.
|
||||
- Will explicitly reinforced on 2026-03-13 that once planning is done, zap should use subagents ASAP and start implementation in a fresh session rather than continuing to implement inside the long-lived main chat.
|
||||
|
||||
@@ -27,7 +27,8 @@
|
||||
"Patch timestamp: 2026-03-04T22:31:50Z",
|
||||
"Upstream patch committed in external/openclaw-upstream on branch fix/tui-hide-internal-runtime-context commit 0f66a4547 (suppress internal runtime completion context blocks in TUI formatter).",
|
||||
"Validation: pnpm test:fast completed successfully (812 files / 6599 tests passing) at 2026-03-04T22:53:29Z",
|
||||
"2026-03-13: confirmed corrected LiteLLM run was still failing (child transcript showed assistant 429/plan error for GLM-5) while runs.json incorrectly stored outcome.status=ok and frozenResultText=null; implemented upstream branch fix/subagent-wait-error-outcome to derive terminal subagent outcome from latest assistant error state, with targeted validation (50 tests passed across 3 files). Still needs live rerun with a known-working model such as gpt-5.4."
|
||||
"2026-03-13: confirmed corrected LiteLLM run was still failing (child transcript showed assistant 429/plan error for GLM-5) while runs.json incorrectly stored outcome.status=ok and frozenResultText=null; implemented upstream branch fix/subagent-wait-error-outcome to derive terminal subagent outcome from latest assistant error state, with targeted validation (50 tests passed across 3 files).",
|
||||
"2026-03-13 later: live gpt-5.4 success repro passed (run 23750d80-b481-4f50-b219-cc9245be405f). Live gpt-5.4 failure repro also passed for subagent persistence/announcement handling: child run b50cb91f-6219-44f7-9d2f-a1264ac7ceaf ended with transcript stopReason=error + context_length_exceeded, and runs.json now stored outcome.status=error / endedReason=subagent-error / frozenResultText non-null. Remaining open nuance: raw agent.wait for that same failed child still returned status=ok."
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
Reference in New Issue
Block a user