HANDOFF.md

Purpose

Immediate baton-pass for the next fresh implementation session.

Current objective

Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The failure-path proof for the new subagent outcome handling is captured. However, the focused upstream agent.wait semantics fix on branch fix/subagent-wait-error-outcome did not hold in a fresh live source-gateway repro. The remaining work is a narrower root-cause follow-up on the still-open live path, where agent.wait returns status "ok" even though the last assistant message ended in an error.

Use these state files first

  1. WIP.subagent-reliability.md — canonical state for this pass
  2. memory/tasks.json — task tracking for reliability items
  3. memory/2026-03-04-subagent-delegation.md — earlier delegation context
  4. memory/2026-03-13.md if present; otherwise create it and append today's evidence there
  5. external/openclaw-upstream/ — for any core-runtime fix work

Related tasks

  • task-20260304-2215-subagent-reliability: in progress
  • task-20260304-211216-acp-claude-codex: open

Known truths

  • TUI noise suppression was already patched locally and upstreamed earlier.
  • User still wants actual subagent reliability improved, not just UI noise hidden.
  • Prior ACP failures included Claude/Codex runtime exits.
  • Fresh-session implementation discipline is now the expected approach for non-trivial work.
  • One explicit failure mode is already understood: requesting glm-5 can route into an unavailable GLM-5 provider/entitlement path in this setup.
  • A deeper bug was also identified: a subagent run could finish with terminal assistant errors yet still be recorded as successful with no frozen result.
  • An upstream patch for that error/outcome handling now exists in external/openclaw-upstream on branch fix/subagent-wait-error-outcome with targeted tests passing.
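
For orientation, the outcome bug described above can be sketched as a small pure function. This is a minimal illustration, not the code on fix/subagent-wait-error-outcome: the field names (stopReason, errorMessage, outcome.status, endedReason, frozenResultText) are taken from the evidence quoted in this handoff, while the type names, the deriveOutcome helper, the "subagent-complete" reason, and the fallback error text are hypothetical placeholders.

```typescript
// Hypothetical shapes. Field names mirror the evidence in this handoff;
// the real types in external/openclaw-upstream may differ.
interface AssistantMessage {
  role: "assistant";
  stopReason?: string;
  errorMessage?: string;
  text?: string;
}

interface RunRecord {
  outcome: { status: "ok" | "error" };
  endedReason: string;
  frozenResultText: string | null;
}

// Derive the persisted record from the last assistant message of a run.
// The pre-fix behavior was effectively "always ok, frozenResultText null"
// even when the last message ended with stopReason "error".
function deriveOutcome(last: AssistantMessage | undefined): RunRecord {
  if (last?.stopReason === "error") {
    return {
      outcome: { status: "error" },
      endedReason: "subagent-error",
      // Freeze something non-null so the announcement has content to report.
      frozenResultText: last.errorMessage ?? "subagent ended with an assistant error",
    };
  }
  return {
    outcome: { status: "ok" },
    endedReason: "subagent-complete", // hypothetical placeholder value
    frozenResultText: last?.text ?? null,
  };
}
```

A run whose last assistant message carries stopReason "error" then yields an error record with a non-null frozenResultText, which is the behavior the targeted tests are meant to pin down.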

Highest-priority next actions

  1. Treat the live gpt-5.4 failure repro as proven for subagent persistence/announcement handling:
    • run id b50cb91f-6219-44f7-9d2f-a1264ac7ceaf
    • child transcript ~/.openclaw/agents/main/sessions/f114b831-000b-4070-a539-85c68d2b7057.jsonl
    • runs.json now stores outcome.status: "error", endedReason: "subagent-error", and a non-null frozenResultText
  2. Treat raw gateway agent.wait as still open despite the current follow-up fix branch.
    • decisive live source-gateway repro:
      • gateway launch: OPENCLAW_SKIP_CHANNELS=1 CLAWDBOT_SKIP_CHANNELS=1 pnpm exec tsx src/index.ts gateway run --port 18902 --bind loopback --auth none --allow-unconfigured
      • session key: agent:main:subagent:agent-wait-gpt53-live-source-1773427981586
      • run id: gwc-live-agent-wait-gpt53-source-1773427981614
      • agent.wait: {"runId":"gwc-live-agent-wait-gpt53-source-1773427981614","status":"ok","endedAt":1773427984243}
      • last assistant: provider:"openai-codex" model:"gpt-5.3-codex" stopReason:"error" errorMessage contains context_length_exceeded
    • this is the current canonical blocker evidence for the still-open live path
  3. Most likely remaining gap to investigate next:
    • src/commands/agent.ts only applies the new fallback correction when !lifecycleEnded
    • lifecycleEnded flips true on any inner lifecycle phase:"end" or phase:"error"
    • src/gateway/server-methods/agent-job.ts resolves/caches phase:"end" as terminal status:"ok"
    • so the likeliest culprit is an inner lifecycle emitter that emits phase:"end" too early for terminal assistant/provider failures on the live direct gateway path, which sets lifecycleEnded and bypasses the fallback correction
  4. Re-check whether ACP-specific Claude/Codex runtime failures are still reproducible after separating them from the generic subagent outcome bug.
  5. Leave the dirty /subagents log UX diff out of this branch unless you intentionally spin up a separate focused pass; it passes the existing regression tests in src/auto-reply/reply/commands.test.ts but still lacks dedicated feature coverage.
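
The hypothesis in item 3 can be modeled in a few lines to make the suspected interaction concrete. This is a toy model under stated assumptions, not the actual code in src/commands/agent.ts or src/gateway/server-methods/agent-job.ts: the function and type names are hypothetical, phase:"error" is assumed to resolve as status "error", and the "strict" variant is only one possible direction for a fix, not a decision.

```typescript
type LifecyclePhase = "start" | "end" | "error";
interface LifecycleEvent { phase: LifecyclePhase }
interface LastAssistant { stopReason?: string; errorMessage?: string }
type WaitStatus = "ok" | "error";

// Model of the behavior described in item 3: any inner "end" or "error"
// phase flips lifecycleEnded, "end" resolves as "ok", and the fallback
// correction only runs when no lifecycle phase ended the run.
function resolveWaitStatus(events: LifecycleEvent[], last?: LastAssistant): WaitStatus {
  let lifecycleEnded = false;
  let status: WaitStatus = "ok";
  for (const ev of events) {
    if (ev.phase === "end") {
      lifecycleEnded = true;
      status = "ok";
    } else if (ev.phase === "error") {
      lifecycleEnded = true;
      status = "error"; // assumption: "error" phase maps to status "error"
    }
  }
  if (!lifecycleEnded && last?.stopReason === "error") {
    status = "error"; // fallback correction, skipped once lifecycleEnded is true
  }
  return status;
}

// Hypothetical alternative: let a terminal assistant error win over an
// early "end" phase. Shown only to contrast with the model above.
function resolveWaitStatusStrict(events: LifecycleEvent[], last?: LastAssistant): WaitStatus {
  if (last?.stopReason === "error") return "error";
  return resolveWaitStatus(events, last);
}

// Shape of the live repro: an inner "end" plus a terminal assistant error.
const liveEvents: LifecycleEvent[] = [{ phase: "start" }, { phase: "end" }];
const liveLast: LastAssistant = { stopReason: "error", errorMessage: "context_length_exceeded" };

console.log(resolveWaitStatus(liveEvents, liveLast));       // prints "ok"
console.log(resolveWaitStatusStrict(liveEvents, liveLast)); // prints "error"
```

If the real emitter order matches this model, the live agent.wait result of status "ok" follows directly from the early phase:"end", so the next step is to confirm which emitter produces that event before changing the fallback condition.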

Success criteria

  • Real-run verification of the new error/outcome fix (done for subagent persistence/announcement handling; still open for raw agent.wait).
  • Clear separation between resolved reporting bug(s) and any still-open ACP/runtime failures.
  • Explicit decision on whether raw agent.wait behavior is acceptable or requires a follow-up fix.
  • State files updated with paths, commands, and outcomes.