Files
swarm-zap/HANDOFF.md

42 lines
2.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# HANDOFF.md
## Purpose
Immediate baton-pass for the next fresh implementation session.
## Current objective
Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The current target is to verify the newly landed upstream fix for subagent error/outcome handling and then continue on any remaining real runtime failures.
## Use these state files first
1. `WIP.subagent-reliability.md` — canonical state for this pass
2. `memory/tasks.json` — task tracking for reliability items
3. `memory/2026-03-04-subagent-delegation.md` — earlier delegation context
4. `memory/2026-03-13.md` if present, otherwise append todays evidence there
5. `external/openclaw-upstream/` — for any core-runtime fix work
## Related tasks
- `task-20260304-2215-subagent-reliability` — in progress
- `task-20260304-211216-acp-claude-codex` — open
## Known truths
- TUI noise suppression was already patched locally and upstreamed earlier.
- User still wants actual subagent reliability improved, not just UI noise hidden.
- Prior ACP failures included Claude/Codex runtime exits.
- Fresh-session implementation discipline is now the expected approach for non-trivial work.
- One explicit failure mode is already understood: requesting `glm-5` can route into an unavailable GLM-5 provider/entitlement path in this setup.
- A deeper bug was also identified: a subagent run could finish with terminal assistant errors yet still be recorded as successful with no frozen result.
- An upstream patch for that error/outcome handling now exists in `external/openclaw-upstream` on branch `fix/subagent-wait-error-outcome` with targeted tests passing.
## Highest-priority next actions
1. The success side is now verified on a real fresh `gpt-5.4` subagent run.
2. Find and execute the smallest safe controlled-failure repro on a valid model/runtime (`gpt-5.4` preferred) so we can confirm:
- a failing child run is stored as `error` rather than `ok`
- a successful child run stores a useful frozen result / announcement payload
3. Re-check whether ACP-specific Claude/Codex runtime failures are still reproducible after separating them from the generic subagent reporting bug.
4. If another core bug appears, continue in `external/openclaw-upstream/` on a focused branch with targeted validation.
5. Update WIP + memory + tasks before ending.
## Success criteria
- Real-run verification of the new error/outcome fix.
- Clear separation between resolved reporting bug(s) and any still-open ACP/runtime failures.
- State files updated with paths, commands, and outcomes.