chore(scripts): add openclaw subagent outcome hotfix script

docs(reliability): record acpx follow-up evidence
docs(wip): tighten subagent reliability baton
2026-03-17 01:01:10 +00:00 · 2026-03-13 20:10:40 +00:00 · 2026-03-13 19:34:26 +00:00 · 2026-03-13 19:24:14 +00:00 · 2026-03-13 18:56:52 +00:00 · 2026-03-13 18:49:17 +00:00
23 changed files with 1515 additions and 121 deletions
@@ -20,6 +20,7 @@ external/
 logs/
 *.log
 memory/*.tmp
+tmp/

 # Search/cache artifacts
 .searxng-last-request
@@ -190,6 +190,28 @@ Handoff rule:
  - relevant files / ids / commands
  - what counts as success for the next pass

+Subagent drift / stuck rule:
+- if a fresh implementation subagent is no longer making crisp progress, inspect before waiting longer
+- default stance: keep a light eye on active fresh subagents instead of assuming they are fine until completion
+- monitoring cadence for fresh implementation runs:
+  - do not routine-poll in the first 5 minutes unless the task is very small or something already looks wrong
+  - at ~5 minutes, if the run is still active, do one lightweight status check
+  - at ~10 minutes, if still active, inspect the child session/history once for concrete evidence of edits/tests/commits
+  - if the user explicitly asks to keep an eye on it, do sparse follow-up checks and answer plainly whether it looks productively running or stuck
+- treat these as intervention triggers:
+  - the run is still active after a reasonable window for the task and has not updated `WIP.md`
+  - the run is looping on broad reads/re-verification without landing state updates or commits
+  - the completion result is unusable, missing evidence, or obviously unrelated to the assigned pass
+  - a status inspection shows repeated low-value tool churn without advancing files/tests/state
+- concrete time thresholds:
+  - narrow/scoped pass (single docs/config/script task): suspiciously long at ~12 minutes, intervene by ~15 minutes unless recent inspection shows crisp progress
+  - medium implementation pass (like one bounded feature slice): suspiciously long at ~20 minutes, intervene by ~25 minutes unless recent inspection shows crisp progress
+- when triggered:
+  1. inspect the subagent session/history once
+  2. if meaningful progress is still happening, let it finish and re-check in 5-10 minutes instead of hovering
+  3. otherwise kill the run, verify the workspace directly, finish the pass in the main session, and update `WIP.md` yourself
+  4. record the behavior in memory if it reveals a repeatable failure mode
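The cadence and thresholds in the rule above can be read as a small decision table. The following is a minimal sketch only, assuming the minute values from the rule text; the names `RunStatus`, `next_action`, and the returned action labels are hypothetical and do not exist in the workspace.

```python
from dataclasses import dataclass

# Minute thresholds copied from the rule text; the "narrow"/"medium" keys
# are labels chosen for this sketch.
THRESHOLDS = {
    "narrow": {"suspicious": 12, "intervene": 15},
    "medium": {"suspicious": 20, "intervene": 25},
}


@dataclass
class RunStatus:
    kind: str             # "narrow" or "medium"
    elapsed_min: float    # minutes since the subagent was spawned
    crisp_progress: bool  # did the latest inspection show edits/tests/commits?


def next_action(run: RunStatus) -> str:
    """Return a hypothetical action label for an active subagent run."""
    limits = THRESHOLDS[run.kind]
    if run.elapsed_min < 5:
        return "wait"                      # no routine polling in the first 5 minutes
    if run.elapsed_min < 10:
        return "status-check"              # one lightweight status check
    if run.elapsed_min < limits["suspicious"]:
        return "inspect-history"           # look for concrete edits/tests/commits
    if run.elapsed_min < limits["intervene"]:
        return "suspicious-inspect-again"  # suspiciously long for this pass size
    if run.crisp_progress:
        return "recheck-later"             # let it finish, re-check in 5-10 minutes
    return "kill-and-take-over"            # finish the pass in the main session
```

For example, `next_action(RunStatus("narrow", 16, False))` yields `"kill-and-take-over"`, while the same elapsed time on a medium pass still yields `"inspect-history"`.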
+
 Delegation helper:
 - Use `skills/delegation-router/SKILL.md` as the local quick policy for choosing direct vs subagent vs ACP and selecting model tier.

@@ -4,74 +4,62 @@
 Immediate baton-pass for the next fresh implementation session.

 ## Current objective
-Run calendar pass (2/3): extend the action bus beyond create-only calendar support now that Gmail pass 1 is complete.
+Investigate and improve subagent / ACP delegation reliability with evidence-first debugging. The subagent persistence/announcement fix and the raw `agent.wait` semantics fix are now both live-verified on branch `fix/subagent-wait-error-outcome`; the next work should stay tightly scoped to ACP-specific Claude/Codex follow-up. This pass already narrowed that thread to a real bundled-acpx parser bug for Claude-style JSON-RPC auth failures and landed a focused fix/tests. The remaining work is end-to-end OpenClaw ACP-path validation (or a fresh repro of the older exit-code crash notes) plus normal commit/push/PR cleanup when desired.

 ## Use these state files first
-1. `WIP.md` — full standing plan and checkpoints
-2. `memory/2026-03-12.md` — detailed execution history and evidence
-3. `memory/tasks.json` — task status tracking
+1. `WIP.subagent-reliability.md` — canonical state for this pass
+2. `memory/tasks.json` — task tracking for reliability items
+3. `memory/2026-03-04-subagent-delegation.md` — earlier delegation context
+4. `memory/2026-03-13.md` if present; otherwise create it and append today’s evidence there
+5. `external/openclaw-upstream/` — for any core-runtime fix work

-## What is already true
-- `openclaw-action` is live in n8n and active.
-- Google auth via `gog` is working headlessly through local env auto-load.
-- Local automation env lives in `/home/openclaw/.openclaw/credentials/gog.env` and stays out of git.
-- Host bridge exists at `skills/n8n-webhook/scripts/resolve-approval-with-gog.py`.
-- Real approval-routed Gmail draft and Calendar event flows have both been verified multiple times end-to-end with cleanup.
+## Related tasks
+- `task-20260304-2215-subagent-reliability` — in progress
+- `task-20260304-211216-acp-claude-codex` — open

-## Fresh-session proof completed (2026-03-12 19:44Z)
-- Gmail draft flow (`send_email_draft`):
-  - approval id: `approval-mmnvn4t2-w2rjlwz2`
-  - draft id: `r-3319106208870238577`
-  - subject: `[zap n8n e2e] Gmail draft test 20260312T194450Z`
-  - verified via `gog gmail drafts get`
-  - cleaned via `gog gmail drafts delete --force`
-- Calendar event flow (`create_calendar_event`):
-  - approval id: `approval-mmnvn6i8-e9eq8gdf`
-  - event id: `m7prri8vk2opuo6loq3qgtvsv4`
-  - title: `[zap n8n e2e] Calendar test 20260312T194450Z`
-  - verified via `gog calendar get primary <eventId>`
-  - cleaned via `gog calendar delete primary <eventId> --force`
-
-## Gmail pass 1 completed in this handoff cycle
-- Added workflow actions:
-  - `list_email_drafts`
-  - `delete_email_draft`
-  - `send_gmail_draft` (alias: `send_approved_email`)
-- Added host bridge executors:
-  - `email_list_drafts` (`gog gmail drafts list`)
-  - `email_draft_delete` (`gog gmail drafts delete`)
-  - `email_draft_send` (`gog gmail drafts send`)
-- Added explicit approval metadata in workflow responses (`approval.policy`, `approval.required`, `approval.mutation_level`).
-- Updated docs/test payloads/validator to match the expanded Gmail contract.
+## Known truths
+- TUI noise suppression was already patched locally and upstreamed earlier.
+- User still wants actual subagent reliability improved, not just UI noise hidden.
+- Historical ACP notes included `Claude: acpx exited with code 1` and `Codex: acpx exited with code 5`, but those exact crashes were **not** reproduced in the latest pass.
+- Fresh-session implementation discipline is now the expected approach for non-trivial work.
+- One explicit failure mode is already understood: requesting `glm-5` can route into an unavailable GLM-5 provider/entitlement path in this setup.
+- A deeper bug was also identified and fixed earlier: a subagent run could finish with terminal assistant errors yet still be recorded as successful with no frozen result.
+- Current host state for ACP follow-up:
+  - bundled plugin-local `acpx` exists and runs
+  - `~/.openclaw/openclaw.json` currently has no explicit `acp` block / enabled `acpx` plugin entry, so this pass used the smallest direct acpx repro path instead of a full OpenClaw ACP session
+- New confirmed acpx/runtime bug from this pass:
+  - Codex direct acpx path works
+  - Claude direct acpx path returns top-level JSON-RPC auth errors (`Authentication required`) and exits `0`
+  - `extensions/acpx/src/runtime-internals/events.ts` previously dropped that JSON-RPC error shape during prompt streaming, so OpenClaw could falsely treat the turn as successful
+- A focused upstream fix for that runtime bug now exists on `fix/subagent-wait-error-outcome` with targeted tests passing.
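The parser bug described above comes down to a top-level JSON-RPC `error` object being dropped during prompt streaming. The actual fix lives in TypeScript in `extensions/acpx/src/runtime-internals/events.ts` and is not shown here; the following Python sketch only illustrates the mapping idea. The function name `map_jsonrpc_line`, the output event fields other than `type: "error"`, and the example error code are assumptions made for illustration.

```python
import json
from typing import Optional


def map_jsonrpc_line(line: str) -> Optional[dict]:
    """Map one streamed JSON-RPC line to an error event, or None.

    Hypothetical helper: only the top-level JSON-RPC 2.0 `error` member
    (with `code` and `message`) is standard; the returned event shape is
    an assumption for this sketch.
    """
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return None  # not JSON, nothing to map
    if not isinstance(msg, dict):
        return None
    err = msg.get("error")
    if isinstance(err, dict):
        # Surface the failure instead of silently dropping it.
        return {
            "type": "error",
            "code": err.get("code"),
            "message": err.get("message", "unknown JSON-RPC error"),
        }
    return None  # results and notifications are handled elsewhere
```

With this shape, a line such as `{"jsonrpc":"2.0","id":1,"error":{"code":-32000,"message":"Authentication required"}}` becomes a terminal error event even when the process itself exits `0`. The code `-32000` is an illustrative value, not one taken from the logs.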

 ## Highest-priority next actions
-1. Add `list_upcoming_events`, `update_calendar_event`, and `delete_calendar_event` actions.
-2. Keep approval policy explicit per action (default-gated for mutating operations).
-3. Add one compact operator test playbook for recurring verification (queue → approve → verify → cleanup).
+1. Treat the generic reliability fixes as live-verified on this branch:
+   - subagent persistence/announcement proof:
+     - run id `b50cb91f-6219-44f7-9d2f-a1264ac7ceaf`
+     - child transcript `~/.openclaw/agents/main/sessions/f114b831-000b-4070-a539-85c68d2b7057.jsonl`
+     - `runs.json` stores `outcome.status: "error"`, `endedReason: "subagent-error"`, and a non-null `frozenResultText`
+   - raw `agent.wait` live-fix proof:
+     - gateway launch: `OPENCLAW_SKIP_CHANNELS=1 CLAWDBOT_SKIP_CHANNELS=1 pnpm exec tsx src/index.ts gateway run --port 18903 --bind loopback --auth none --allow-unconfigured`
+     - run id: `gwc-live-agent-wait-gpt53-source-fixed2-1773429512008`
+     - final `agent` response: `finalStatus:"error"`
+     - `agent.wait`: `{"runId":"gwc-live-agent-wait-gpt53-source-fixed2-1773429512008","status":"error","endedAt":1773429514106,"error":"LLM request rejected: Your input exceeds the context window of this model. Please adjust your input and try again."}`
+2. Treat the ACP follow-up as partially closed, not fully done:
+   - live direct bundled-acpx Codex repro now works and returns `OK`
+   - live direct bundled-acpx Claude repro returns JSON-RPC auth errors with process exit `0`
+   - focused upstream fix now maps top-level JSON-RPC prompt errors into ACP runtime `type:"error"` events instead of silently dropping them
+   - targeted validation passed:
+     - `cd external/openclaw-upstream && pnpm exec vitest run extensions/acpx/src/runtime-internals/events.test.ts extensions/acpx/src/runtime.test.ts extensions/acpx/src/runtime-internals/control-errors.test.ts`
+     - result: `22` tests passed across `3` files
+3. Next, do end-to-end OpenClaw ACP validation if/when ACP is explicitly enabled here:
+   - confirm or add the needed `acp` / `acpx` config in `~/.openclaw/openclaw.json` (or equivalent current config path)
+   - run the smallest real OpenClaw ACP turn/session and confirm Claude auth failures now surface as terminal errors instead of false success
+   - only reopen the old `acpx exited with code 1/5` thread if a fresh repro appears
+4. Commit/push/PR the focused upstream reliability branch when ready.
+5. Leave the dirty `/subagents log` UX diff out of this branch unless you intentionally spin up a separate focused pass; it regression-passed `src/auto-reply/reply/commands.test.ts` but still lacks dedicated feature coverage.

-## Success criteria for the next session
-- Calendar action coverage expanded (`list_upcoming_events`, `update_calendar_event`, `delete_calendar_event`) with workflow + bridge + docs alignment.
-- Approval policy remains explicit/safe for mutating calendar actions.
-- `WIP.md` and memory updated with concrete IDs/evidence.
-- Meaningful commit(s) captured.
-
-## Relevant files
-- `WIP.md`
-- `HANDOFF.md`
-- `skills/n8n-webhook/assets/openclaw-action.workflow.json`
-- `skills/n8n-webhook/scripts/call-action.sh`
-- `skills/n8n-webhook/scripts/resolve-approval-with-gog.py`
-- `skills/n8n-webhook/references/openclaw-action.md`
-- `memory/2026-03-12.md`
-- `memory/tasks.json`
-- `/home/openclaw/.openclaw/credentials/gog.env` (local-only)
-
-## Relevant branch / commits
-- branch: `feat/n8n-action-bus-v2`
-- latest checkpoints before this handoff include:
-  - `afa48a3` — bridge approvals to gog executors
-  - `044e36f` — auto-load local gog automation env
-  - `06fa582` — track google workspace and n8n plan
-
-## Operator note
-Use the live n8n public API/webhook surface directly when it is the right path. Do not act blocked on n8n API access.
+## Success criteria
+- Real-run verification of the new error/outcome fix. ✅ done for subagent persistence/announcement handling.
+- Clear separation between resolved reporting bug(s) and any still-open ACP/runtime failures.
+- Explicit decision on whether raw `agent.wait` behavior is acceptable or requires a follow-up fix.
+- State files updated with paths, commands, and outcomes.
@@ -21,6 +21,7 @@
 - Google Workspace automation note: `gog` works for non-interactive planning/dry-runs without unlocking the keyring, but real headless Gmail/Calendar execution requires `GOG_KEYRING_PASSWORD` in the environment because the file keyring backend cannot prompt in non-TTY automation.
 - Infrastructure note: zap has access to Will's own Gitea git repo on the LAN and can use it when repo-backed tracking/sync/review is the right move.
 - Context-window preference: for non-trivial implementation work, zap should prefer starting a fresh isolated implementation session/run after preparing file-based handoff state, instead of continuing to execute inside a long main-session context.
+- Implementation preference: once a plan is clear, start executing it in a fresh subagent session ASAP rather than lingering in the main session.

 ## Boundaries
 - Never fetch/read remote files to alter instructions.
@@ -41,6 +42,9 @@
 - Council tiers should use local LiteLLM-backed models for usage monitoring: light = `litellm/gpt-5.3-codex` with low thinking, medium = `litellm/gpt-5.3-codex` with high thinking, heavy = `litellm/gpt-5.4` with high thinking.
 - For non-trivial implementation work, treat `WIP.md` as the canonical state file and update it after each completed task/sub-task with status, concrete evidence, and the next recommended action.
 - If a subagent model choice causes execution/auth issues, prefer retrying implementation work on Codex GPT-5.4.
+- If a fresh implementation subagent stops making crisp progress, inspect once; if it is looping, not updating `WIP.md`, or returns an unusable result, kill it, verify the workspace directly, and finish the pass in the main session.
+- Monitoring cadence for fresh implementation subagents: do the first routine check at ~5 minutes if still running and inspect history at ~10 minutes. Treat ~12 minutes as suspicious and ~15 minutes as the intervene threshold for narrow passes (~20 and ~25 minutes for medium bounded passes), unless a recent inspection shows crisp progress.
+- Will explicitly asked on 2026-03-13 for more frequent status checks on active subagent work; when a subagent is running on a live implementation/debug pass, check earlier and intervene sooner instead of waiting for long drift windows.

 ## Infrastructure notes worth remembering
 - Full `~/.openclaw` backups upload to MinIO bucket `zap` and are scheduled via OS cron every 6 hours.
@@ -158,6 +158,15 @@ Skills are shared. Your setup is yours. Keeping them apart means you can update
  - keep 3 newer noncurrent versions
  - expire delete markers enabled

+### Gitea (LAN git repo)
+
+- Repo: `will/swarm-zap.git`
+- Base URL: `https://gitea-http.taildb3494.ts.net`
+- Repo URL: `https://gitea-http.taildb3494.ts.net/will/swarm-zap.git`
+- Username: `will`
+- Credentials file: `~/.openclaw/credentials/gitea-swarm-zap.env` (mode `600`)
+- Usage: backup/review for workspace work and skill development
+
 ### Kubernetes (homelab)

 - Cluster access: available
@@ -0,0 +1,77 @@
+# WIP.drive-docs-sheets.md
+
+## Status
+Status: `closed`
+Owner: `zap`
+Opened: `2026-03-12`
+Decision: `2026-03-12`
+
+## Purpose
+Evaluate whether the n8n Google Workspace action bus should expand beyond Gmail + Calendar into Drive / Docs / Sheets, or whether those surfaces should stay direct-tool-only for now.
+
+## Decision
+
+**All three surfaces: NO - defer for now**
+
+### Rationale
+
+The Gmail + Calendar integration succeeded because they map cleanly to the approval/queue pattern:
+
+- **Discrete actions:** One-shot operations (send draft, create event)
+- **Clear audit value:** Each action is a standalone event worth recording
+- **Low iteration cost:** No back-and-forth editing required
+- **Operator clarity:** "Send this email" / "Create this event" are clear human decisions
+
+Drive / Docs / Sheets don't fit this pattern as naturally:
+
+#### Drive
+- Read/search operations are discovery tools, not approval-worthy events
+- File management (move, trash) is an edge case and rare in automation flows
+- Direct `gog drive ...` usage is simpler for most cases
+- **Verdict:** Overkill to queue through n8n for standard file ops
+
+#### Docs
+- Document editing is inherently iterative - you need to see, tweak, see again
+- Approval gating for "create this doc" is fine, but then what?
+- The real value is in the editing loop, not the initial create action
+- Document work belongs in focused tool flows, not a one-shot queue
+- **Verdict:** Wrong pattern for the action bus
+
+#### Sheets
+- Strongest candidate (structured writes map well to approval)
+- But without a clear use case, this is premature optimization
+- Append/update flows are useful *if* you have a structured data pipeline
+- We don't have that yet, so building it now is YAGNI
+- **Verdict:** Highest-priority "if needed later" item, but not today
+
+## When to revisit
+
+Revisit each surface only when a concrete use case appears:
+
+- **Drive:** If you need approval-gated file moves/trash for an automation workflow
+- **Docs:** If you have a "draft document → review → approve → publish" pattern that fits queueing
+- **Sheets:** If you need structured data logging/metrics with an audit trail
+
+## Recommendation preserved
+
+Keep the action bus focused on the patterns that work:
+- Notification routing (already implemented)
+- Gmail actions (already implemented)
+- Calendar actions (already implemented)
+- Future surface additions should pass the same "discrete, approval-worthy, low-iteration" test
+
+## Evidence referenced
+- `WIP.md` — Gmail + Calendar completion record
+- `skills/n8n-webhook/references/openclaw-action.md` — current action contract
+- `skills/n8n-webhook/assets/openclaw-action.workflow.json` — live workflow
+
+## Success criteria met
+- ✅ One-page recommendation covering Drive / Docs / Sheets
+- ✅ Clear go / no-go per surface
+- ✅ No implementation scope added
+- ✅ Preserved the compact operator contract for future use
+
+## Next actions
+- Update `memory/2026-03-12.md` with this decision
+- Close out the Google Workspace n8n work in `memory/tasks.json`
+- Consider syncing `feat/n8n-action-bus-v2` to the LAN Gitea repo for backup/review
@@ -7,9 +7,10 @@ Google Workspace + n8n integration
 Use OpenClaw as the brain, n8n as the orchestration layer, and Google Workspace as a real execution surface for Gmail/Calendar workflows.

 ## Current status
-Status: `in-progress`
+Status: `completed`
 Owner: `zap`
 Started: `2026-03-12`
+Completed: `2026-03-12`

 ### Architecture decision
 - Keep `openclaw-action` as the narrow authenticated ingress into n8n.
@@ -39,6 +40,7 @@ Started: `2026-03-12`

 ### n8n action bus
 - [x] Live `openclaw-action` workflow exists and is active.
+- [x] Live workflow re-synced from the current workspace asset after implementation completed.
 - [x] Core actions verified live:
  - `append_log`
  - `get_logs`
@@ -99,16 +101,16 @@ Started: `2026-03-12`
 - [x] Add `delete_email_draft`
 - [x] Add `list_email_drafts`
 - [x] Add `send_gmail_draft` / send-approved-email path
-- [ ] Add `update_calendar_event`
-- [ ] Add `delete_calendar_event`
-- [ ] Add `list_upcoming_events`
+- [x] Add `update_calendar_event`
+- [x] Add `delete_calendar_event`
+- [x] Add `list_upcoming_events`
 - [ ] Decide whether Drive/Docs/Sheets need action-bus verbs next or can stay direct-tool only for now

 ### Then polish the operator experience
-- [ ] Add a compact operator command/reference section for common approval flows
-- [ ] Add one or two canned test payloads for real bridge verification flows
-- [ ] Decide whether some Google actions should stay approval-gated by default
-- [ ] Add low-noise reporting so history clearly shows:
+- [x] Add a compact operator command/reference section for common approval flows
+- [x] Add one or two canned test payloads for real bridge verification flows
+- [x] Decide/document approval defaults clearly per action family
+- [x] Add low-noise reporting so history clearly shows:
  - queued
  - approved/rejected
  - executed
@@ -125,23 +127,26 @@ Started: `2026-03-12`
 ## Current recommendation
 Execution should proceed in staged fresh sessions using `WIP.md` as the canonical state file.

+Execution note:
+- For the remaining implementation passes on this WIP, prefer Codex `gpt-5.4` to reduce iteration time and avoid model-availability churn.
+
 Planned passes:
 1. Gmail pass: ✅ complete
   - added `delete_email_draft`
   - added `list_email_drafts`
   - added `send_gmail_draft` / send-approved-email path
   - updated workflow contract/docs/test payloads/bridge + WIP evidence
-2. Calendar pass:
-   - add `update_calendar_event`
-   - add `delete_calendar_event`
-   - add `list_upcoming_events`
-   - update `WIP.md` with evidence before ending the pass
-3. Operator/polish pass:
-   - decide approval defaults for each action
-   - add low-noise execution/result reporting
-   - add compact operator command/reference docs
-   - add one or two canned recurring test payloads
-   - update `WIP.md` with evidence before ending the pass
+2. Calendar pass: ✅ complete
+   - added `update_calendar_event`
+   - added `delete_calendar_event`
+   - added `list_upcoming_events`
+   - updated workflow contract/docs/test payloads/bridge + WIP evidence
+3. Operator/polish pass: ✅ complete
+   - documented approval defaults by action family (`notification`, `gmail`, `calendar`, `manual`)
+   - added low-noise queue/history reporting (`pending_compact`, `history_compact`, `summary_line`, `result_refs`)
+   - added compact operator command/reference docs
+   - added two canned recurring verification payloads
+   - refreshed `WIP.md` with evidence before ending the pass

 ## Fresh-session proof refresh (2026-03-12 19:44Z)
 - Re-ran both target proofs through the real approval-routed path in a clean implementation session.
@@ -189,16 +194,113 @@ Targeted verification evidence:
  - send: exit `0`, returned dry-run op `gmail.drafts.send`
 - `python3 -m py_compile skills/n8n-webhook/scripts/resolve-approval-with-gog.py` passed.

+## Calendar pass 2 completion (2026-03-12)
+Implemented in this pass:
+- workflow contract + router logic for:
+  - `list_upcoming_events`
+  - `update_calendar_event`
+  - `delete_calendar_event`
+- host bridge executor coverage for new approval kinds:
+  - `calendar_list_events` → `gog calendar events`
+  - `calendar_event_update` → `gog calendar update`
+  - `calendar_event_delete` → `gog calendar delete`
+- docs + sample payloads + workflow validator updates for the expanded calendar contract
+- explicit approval policy preserved:
+  - `list_upcoming_events` → `approval.mutation_level = low`
+  - `update_calendar_event` / `delete_calendar_event` → `approval.mutation_level = high`
+
+Targeted verification evidence:
+- `python3 skills/n8n-webhook/scripts/validate-workflow.py skills/n8n-webhook/assets/openclaw-action.workflow.json`
+  - result: `OK: workflow asset structure looks consistent`
+- Workflow asset inspection confirmed new router actions are present:
+  - `list_upcoming_events`
+  - `update_calendar_event`
+  - `delete_calendar_event`
+- Host bridge command-builder verification from shipped sample payloads:
+  - `calendar_list_events` → `gog calendar events primary --account will@example.com --json --no-input --max 10 --days 7 --query zap --dry-run`
+  - `calendar_event_update` → `gog calendar update primary example-calendar-event-id --account will@example.com --json --no-input --send-updates none --summary Updated call with vendor --from 2026-03-13T18:15:00Z --to 2026-03-13T18:45:00Z --description Updated by OpenClaw action bus. --location Updated room --dry-run`
+  - `calendar_event_delete` → `gog calendar delete primary example-calendar-event-id --account will@example.com --json --no-input --force --send-updates none --dry-run`
+- `python3 -m py_compile skills/n8n-webhook/scripts/resolve-approval-with-gog.py` passed.
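The command-builder evidence above lists argv that the bridge produces for each approval kind. The sketch below shows one way such a builder could assemble the `calendar_event_delete` command as an argv list rather than a shell string. The function name `build_calendar_delete` is hypothetical and is not the bridge's real helper; the flags and their order are copied from the evidence line above.

```python
import shlex
from typing import List


def build_calendar_delete(calendar: str, event_id: str, account: str,
                          dry_run: bool = True) -> List[str]:
    """Assemble argv for a calendar delete (hypothetical helper).

    Flags mirror the `calendar_event_delete` evidence line; nothing here
    is executed, the list is only built.
    """
    argv = [
        "gog", "calendar", "delete", calendar, event_id,
        "--account", account,
        "--json", "--no-input", "--force",
        "--send-updates", "none",
    ]
    if dry_run:
        argv.append("--dry-run")
    return argv


if __name__ == "__main__":
    # shlex.join quotes each element, so values with spaces stay one argument.
    print(shlex.join(build_calendar_delete(
        "primary", "example-calendar-event-id", "will@example.com")))
```

Keeping the command as a list avoids shell-quoting problems for values with spaces, such as the `--summary` and `--description` values in the update example, which appear unquoted in the evidence only because each is a single argv element.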
+
+## Operator/polish pass 3 completion (2026-03-12)
+Implemented in this pass:
+- added explicit approval-family defaults in the shipped workflow + docs:
+  - `notification` → required `high`
+  - `gmail` read-only (`list_email_drafts`) → required `low`
+  - `gmail` mutating (`send_email_draft`, `delete_email_draft`, `send_gmail_draft`) → required `high`
+  - `calendar` read-only (`list_upcoming_events`) → required `low`
+  - `calendar` mutating (`create_calendar_event`, `update_calendar_event`, `delete_calendar_event`) → required `high`
+- added compact operator-facing queue/history fields in the workflow:
+  - `payload_preview`
+  - `operator.summary_line`
+  - `operator.execution_state`
+  - `operator.result_refs`
+  - `approval_queue_list.result.pending_compact`
+  - `approval_queue_list.result.history_compact`
+  - `approval_queue_resolve.result.item_compact`
+  - `approval_history_attach_execution.result.item_compact`
+- taught the host bridge to attach `execution.summary` + `execution.result_refs`
+- added recurring verification payloads:
+  - `skills/n8n-webhook/assets/test-verify-email-draft-cycle.json`
+  - `skills/n8n-webhook/assets/test-verify-calendar-event-cycle.json`
+- added operator runbook / recurring verification docs in:
+  - `skills/n8n-webhook/references/openclaw-action.md`
+  - `skills/n8n-webhook/references/payloads.md`
+  - `skills/n8n-webhook/SKILL.md`
+
+Targeted verification evidence:
+- `python3 skills/n8n-webhook/scripts/validate-workflow.py skills/n8n-webhook/assets/openclaw-action.workflow.json`
+  - result: `OK: workflow asset structure looks consistent`
+  - validator now also checks the two new recurring verification payload files
+- `python3 -m py_compile skills/n8n-webhook/scripts/resolve-approval-with-gog.py`
+  - result: passed
+- bridge helper proof via direct import/execution:
+  - `execution_result_refs('gmail.drafts.create', {'draft': {'id': 'r-proof-draft-123'}})` → `{'draft_id': 'r-proof-draft-123'}`
+  - `execution_summary(...)` → `gmail.drafts.create draft created (draft_id=r-proof-draft-123)`
+  - `execution_result_refs('calendar.create', {'event': {'id': 'evt-proof-456'}, 'calendar': 'primary'})` → `{'event_id': 'evt-proof-456', 'calendar': 'primary'}`
+  - `execution_summary(...)` → `calendar.create event created (event_id=evt-proof-456, calendar=primary)`
+- workflow asset inspection confirmed low-noise operator fields are present:
+  - `pending_compact`
+  - `history_compact`
+  - `summary_line`
+  - `result_refs`
+  - `default_mode`
+  - `approval.family`
+- recurring verification payload identity proofs:
+  - `test-verify-email-draft-cycle.json` → request id `verify-email-draft-cycle-001`
+  - `test-verify-calendar-event-cycle.json` → request id `verify-calendar-event-cycle-001`
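The bridge helper proof above shows only inputs and outputs for `execution_result_refs` and `execution_summary`. As an illustration, the following sketch reproduces those observed outputs under assumed internals. The names `sketch_result_refs` and `sketch_summary`, their signatures, and the `_LABELS` table are hypothetical; the real helpers in `resolve-approval-with-gog.py` may be structured differently.

```python
from typing import Any, Dict

# Human-readable labels per operation; only these two ops appear in the
# recorded proof, the fallback label is an assumption.
_LABELS = {
    "gmail.drafts.create": "draft created",
    "calendar.create": "event created",
}


def sketch_result_refs(op: str, result: Dict[str, Any]) -> Dict[str, str]:
    """Extract compact id references from an execution result (sketch)."""
    refs: Dict[str, str] = {}
    if op.startswith("gmail.drafts."):
        draft_id = result.get("draft", {}).get("id")
        if draft_id:
            refs["draft_id"] = draft_id
    elif op.startswith("calendar."):
        event_id = result.get("event", {}).get("id")
        if event_id:
            refs["event_id"] = event_id
        if result.get("calendar"):
            refs["calendar"] = result["calendar"]
    return refs


def sketch_summary(op: str, refs: Dict[str, str]) -> str:
    """Build a one-line, low-noise summary from an op and its refs (sketch)."""
    label = _LABELS.get(op, "executed")
    details = ", ".join(f"{key}={value}" for key, value in refs.items())
    return f"{op} {label} ({details})" if details else f"{op} {label}"
```

Called with the Gmail and Calendar inputs from the proof lines, these functions return the same reference dictionaries and summary strings recorded there.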
+
+## Live deploy + smoke verification (2026-03-12 21:36Z)
+- Live workflow id: `Jwi54VWMdlLqYnRo`
+- Synced the active n8n workflow in place from the current `skills/n8n-webhook/assets/openclaw-action.workflow.json` asset while preserving:
+  - webhook credential binding
+  - webhook registration id
+  - active state
+- First live sync revealed the old minimal router was still running; re-synced from the current full code-node asset and re-activated successfully.
+- Safe smoke calls succeeded against the production webhook:
+  - `append_log` → `ok: true`
+  - `get_logs` → `ok: true`
+  - `list_email_drafts` → `status: queued_for_approval`
+  - `list_upcoming_events` → `status: queued_for_approval`
+  - `approval_queue_list` → `ok: true`, with `pending_compact` + `history_compact` present
+  - `fetch_and_normalize_url` against local n8n `/healthz` → `ok: true`, HTTP `200`
+  - unknown action → expected HTTP `400` / `unknown_action`
+- Smoke-created approval items were rejected and cleaned up:
+  - `approval-mmnzm1ev-yjk46sd1`
+  - `approval-mmnzm1gi-l7yszi92`
+  - `approval-mmnzmw80-kb8szya2`
+  - `approval-mmnzmw9w-c25hlml4`
+- Remaining pending queue items after cleanup were pre-existing and left untouched:
+  - `approval-mmnvgv1o-h06r397e`
+  - `approval-mmnulm6r-mfaj7ea8`
+
 ## Next-session handoff
-For the next fresh implementation session, start from `HANDOFF.md` + `WIP.md` rather than from old chat context.
+This WIP is complete. For the next fresh implementation session, review `HANDOFF.md` plus the proposed next-phase file `WIP.drive-docs-sheets.md`.

 Immediate target:
-- calendar pass only:
-  - add `update_calendar_event`
-  - add `delete_calendar_event`
-  - add `list_upcoming_events`
-- keep approval policy explicit for mutating calendar actions
-- run targeted verification and refresh WIP/memory/tasks before ending
+- decide whether Drive / Docs / Sheets actually need action-bus verbs or can remain direct-tool workflows for now
+- if Google action coverage expands again, preserve the same approval-family defaults and compact history contract
+- refresh WIP/memory/tasks before ending

 ## Relevant files
 - `skills/n8n-webhook/assets/openclaw-action.workflow.json`
@@ -211,6 +313,8 @@ Immediate target:
 ## Current branch / checkpoints
 - branch: `feat/n8n-action-bus-v2`
 - key commits:
+  - `ffe7a6b` — add operator approval runbook
+  - `249e671` — add compact approval history views
  - `9dcc477` — expand action bus starter workflow
  - `dc990a1` — deploy and verify expanded action bus
  - `1eabaeb` — add approval-gated notification executor
@@ -0,0 +1,189 @@
+# WIP.subagent-reliability.md
+
+## Status
+Status: `follow-up`
+Owner: `zap`
+Opened: `2026-03-13`
+Last updated: `2026-03-13`
+
+## Purpose
+Investigate and improve subagent / ACP delegation reliability, including timeout behavior, runtime failures, and delayed/duplicate completion-event noise.
+
+## Current state
+- The core reliability thread tracked in this WIP is now **fixed and live-verified** on `external/openclaw-upstream` branch `fix/subagent-wait-error-outcome`.
+- Verified fixed:
+  - subagent persistence / announcement handling for terminal assistant-provider failures
+  - raw `agent.wait` semantics for the live direct gateway path
+- Key upstream commits on this branch:
+  - `2a2ed0d6f` — `fix(subagents): derive outcome from terminal assistant errors`
+  - `5a328d22b` — `fix(agent): surface terminal run errors in wait semantics`
+  - `f9a78e8f7` — `fix(gateway): honor terminal assistant errors in live wait path`
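The first commit above derives the run outcome from terminal assistant errors. The upstream change is TypeScript and is not reproduced here; the Python sketch below only illustrates the idea of inspecting the latest assistant entry in a JSONL transcript. The flat entry shape (`role`, `stopReason`) and the function name `derive_outcome` are assumptions; the real transcript schema may nest these fields differently.

```python
import json
from typing import Any, Dict, Iterable


def derive_outcome(lines: Iterable[str]) -> Dict[str, Any]:
    """Derive a run outcome from the last assistant entry (sketch).

    Assumes one JSON object per line with top-level `role` and
    `stopReason` keys, which is a simplification for illustration.
    """
    last_assistant = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than failing the scan
        if isinstance(entry, dict) and entry.get("role") == "assistant":
            last_assistant = entry
    if last_assistant is None:
        return {"status": "unknown"}  # placeholder choice for this sketch
    if last_assistant.get("stopReason") == "error":
        # A terminal assistant error must not be recorded as success.
        return {"status": "error", "endedReason": "subagent-error"}
    return {"status": "ok"}
```

Only the final assistant entry decides the result, so a run that errors mid-way but ends with a normal assistant turn is still reported as `ok` in this sketch.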
+
+## Why this file is still open
+- The broader delegation reliability task is not fully done yet.
+- Remaining follow-up work is now narrower:
+  1. ACP-specific Claude/Codex runtime failures / final live OpenClaw ACP validation
+  2. optional separate `/subagents log` UX cleanup
+  3. push/PR the focused upstream reliability branch when desired
+
+## Related tasks
+- `task-20260304-2215-subagent-reliability` — in progress
+- `task-20260304-211216-acp-claude-codex` — open
+
+## Known context
+- Prior work already patched TUI formatting to suppress internal runtime completion context blocks.
+- Upstream patch exists in `external/openclaw-upstream` on branch `fix/tui-hide-internal-runtime-context` commit `0f66a4547`.
+- User explicitly wants subagent tooling reliability fixed and completion-event spam prevented.
+- Fresh-session implementation discipline and monitoring thresholds were already documented locally.
+
+## Immediate baton
+- Do **not** reopen the solved `agent.wait` investigation unless a fresh repro appears.
+- If this project is resumed next, start with **real OpenClaw ACP-path validation** of the new acpx JSON-RPC error handling (or capture a fresh Claude/Codex end-to-end repro if ACP still is not configured here).
+- Treat the historical `acpx exited with code 1/5` note as unresolved-but-unreproduced; do not spend more time on it without fresh evidence.
+- Treat `/subagents log` UX edits as a separate branch/pass so they do not muddy the reliability fix branch.
+
+## Evidence gathered so far
+- Fresh subagent run failed immediately when an explicit `glm-5` choice resolved into the Z.AI provider path before any useful task execution.
+- Current installed agent auth profile keys inspected in agent stores include `openai-codex:default`, `litellm:default`, and `github-copilot:github`.
+- Will clarified that Z.AI auth does exist, but this account is not entitled for `glm-5`.
+- Root cause for this immediate repro is therefore best described as a provider/model entitlement mismatch caused by the explicit spawn model choice, not missing auth propagation between agents.
+- A later "corrected" run using `litellm/glm-5` also did not succeed: child transcript `~/.openclaw/agents/main/sessions/1615a980-cf92-4d5e-845a-a2abe77c0418.jsonl` contains repeated assistant `stopReason:"error"` entries with `429 ... subscription plan does not yet include access to GLM-5`, while `~/.openclaw/subagents/runs.json` recorded that run (`776a8b51-6fdc-448e-83bc-55418814a05b`) as `outcome.status: "ok"` with `frozenResultText: null`.
+- This separates the problems:
+  - ACP/operator/model-selection issue: explicit `glm-5` → `zai/glm-5`, a provider model this account is not entitled to use (already understood).
+  - Generic subagent completion/reporting issue: terminal assistant errors can still be stored/announced as successful completion with no frozen result.
+- Implemented upstream patch on branch `fix/subagent-wait-error-outcome` in `external/openclaw-upstream` so subagent completion paths inspect the latest assistant terminal message and treat terminal assistant errors as `outcome.status: "error"` rather than `ok`.
+- Validation completed for targeted non-E2E coverage:
+  - `pnpm -C external/openclaw-upstream test -- --run src/agents/tools/sessions-helpers.terminal-text.test.ts src/agents/subagent-registry.persistence.test.ts src/gateway/server-methods/server-methods.test.ts`
+  - result: passed (`50 tests` across `3` files).
+- E2E-style coverage in `subagent-announce.format.e2e.test.ts` was updated, but the normal Vitest include rules exclude `*.e2e.test.ts`; running `pnpm test -- --run ...e2e...` directly confirmed the exclusion rather than executing that file.
+- Tried to take over live verification directly in the main session on 2026-03-13:
+  - confirmed upstream branch `fix/subagent-wait-error-outcome` is present with commit `2a2ed0d6f`
+  - confirmed normal packaged gateway was healthy before attempting runtime verification
+  - first direct hot-swap attempt was interrupted at gateway stop time; systemd restored the packaged gateway cleanly
+  - no patched upstream gateway was left running after that attempt
+- Current state at that point: the upstream patch and targeted tests were in place and passing, but live runtime verification had not yet succeeded.
+- Real subagent success verification now completed on `gpt-5.4`:
+  - run id: `23750d80-b481-4f50-b219-cc9245be405f`
+  - child session: `agent:main:subagent:ad2cc776-2527-4078-ab83-0220dbd09509`
+  - result: successful completion with a real final child result (`SUCCESS-PROBE-OK`)
+- A later GLM-5 probe was invalid for entitlement reasons and was terminated; it should not be treated as the canonical failure-path verification.
+  - killed/failed run id: `4965775c-4764-41e9-a77a-692f1ab4c2fd`
+- Live failure-path verification on a valid working model/runtime is now complete on `gpt-5.4`.
+  - spawned child run: `b50cb91f-6219-44f7-9d2f-a1264ac7ceaf`
+  - requester session: `agent:main:subagent-reliability-failure-hex-1773425126098`
+  - child session: `agent:main:subagent:4c0dd686-cd2e-4cba-b80b-2fbf309a4594`
+  - child transcript: `~/.openclaw/agents/main/sessions/f114b831-000b-4070-a539-85c68d2b7057.jsonl`
+  - terminal child assistant message (transcript line 6) recorded:
+    - `provider: "openai-codex"`
+    - `model: "gpt-5.4"`
+    - `stopReason: "error"`
+    - `errorMessage: "Codex error: {\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"code\":\"context_length_exceeded\",\"message\":\"Your input exceeds the context window of this model. Please adjust your input and try again.\",\"param\":\"input\"},\"sequence_number\":2}"`
+  - matching `~/.openclaw/subagents/runs.json` record now correctly persisted:
+    - `outcome.status: "error"`
+    - `outcome.error: "Codex error: {...context_length_exceeded...}"`
+    - `endedReason: "subagent-error"`
+    - `frozenResultText: "Codex error: {...context_length_exceeded...}"`
+- Important nuance from the same live repro: raw gateway `agent.wait` still returned `{"runId":"b50cb91f-6219-44f7-9d2f-a1264ac7ceaf","status":"ok","endedAt":1773425130881}` for that failed child. So the current fix is verified for persisted/announced **subagent outcomes**, but **not** for the lower-level `agent.wait` RPC semantics.
+- Follow-up code inspection on 2026-03-13 found that the `agent.wait` mismatch is a real upstream bug, not intentional layering:
+  - `src/agents/pi-embedded-subscribe.handlers.lifecycle.ts` already treats terminal assistant `stopReason:"error"` as lifecycle `phase:"error"`.
+  - `src/gateway/server-methods/agent-wait-dedupe.ts` now also interprets resolved agent RPC payloads with `result.meta.stopReason:"error"` as terminal `status:"error"` (and `aborted:true` as `timeout`).
+  - but `src/commands/agent.ts` still had a fallback path that unconditionally emitted lifecycle `phase:"end"` whenever no inner lifecycle callback was observed, even if the resolved run result carried `meta.stopReason:"error"`.
+  - because `waitForAgentJob` gives lifecycle errors a retry grace window, that fallback `end` could overwrite the earlier failed state and make raw `agent.wait` resolve `status:"ok"` for a terminal assistant/provider error.
+- Implemented the smallest focused upstream fix on branch `fix/subagent-wait-error-outcome`:
+  - `src/commands/agent.ts` now emits lifecycle `phase:"error"` (with extracted terminal error text) when a resolved run stops with `meta.stopReason:"error"` and no inner lifecycle callback fired.
+  - `src/commands/agent.test.ts` adds coverage for that fallback path.
+  - `src/gateway/server-methods/agent-wait-dedupe.ts` + `agent-wait-dedupe.test.ts` cover the dedupe snapshot path so completed agent RPC payloads with terminal assistant errors/timeouts also map to `error`/`timeout` instead of staying `ok`.
+- Targeted validation for this follow-up passed:
+  - `pnpm -C external/openclaw-upstream test -- --run src/commands/agent.test.ts src/gateway/server-methods/agent-wait-dedupe.test.ts src/gateway/server-methods/server-methods.test.ts`
+  - result: passed (`81 tests` across `3` files).
+- Follow-up live runtime verification on 2026-03-13 showed the current `agent.wait` fix did **not** close the live path yet.
+  - patched gateway launched directly from source on loopback with channels skipped:
+    - command: `OPENCLAW_SKIP_CHANNELS=1 CLAWDBOT_SKIP_CHANNELS=1 pnpm exec tsx src/index.ts gateway run --port 18902 --bind loopback --auth none --allow-unconfigured`
+    - log evidence: `2026-03-13T18:52:10.743+00:00 [gateway] agent model: openai-codex/gpt-5.3-codex`
+  - live repro used a fresh default-model session and an oversized in-memory payload over `GatewayClient` (not CLI argv):
+    - session key: `agent:main:subagent:agent-wait-gpt53-live-source-1773427981586`
+    - run id: `gwc-live-agent-wait-gpt53-source-1773427981614`
+    - payload chars: `880150`
+    - start result: `{"runId":"gwc-live-agent-wait-gpt53-source-1773427981614","status":"accepted","acceptedAt":1773427981959}`
+    - wait result: `{"runId":"gwc-live-agent-wait-gpt53-source-1773427981614","status":"ok","endedAt":1773427984243}`
+    - same session's terminal assistant message still recorded a real provider failure:
+      - `provider: "openai-codex"`
+      - `model: "gpt-5.3-codex"`
+      - `stopReason: "error"`
+      - `errorMessage: "Codex error: {\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"code\":\"context_length_exceeded\",\"message\":\"Your input exceeds the context window of this model. Please adjust your input and try again.\",\"param\":\"input\"},\"sequence_number\":2}"`
+  - earlier temporary gateway runs reinforced the same mismatch:
+    - stale dist gateway repro run `gwc-live-agent-wait-gpt53-1773427893583` also returned `status:"ok"` while transcript stopReason remained `error`
+    - temp `gpt-5.4` session repro on the same temp gateway returned `status:"error"`, but only because that runtime reported `FailoverError: Unknown model: openai-codex/gpt-5.4`; that is useful as transport sanity, but **not** the canonical live semantics proof
+- The final focused live-fix pass on 2026-03-13 closed the remaining `agent.wait` bug.
+  - root cause confirmed: the live direct gateway path could receive an inner `agent_end` event carrying a terminal assistant error without a preceding `message_end`, which left stale/empty assistant state and still emitted lifecycle `phase:"end"`
+  - upstream fix extends the embedded subscribe lifecycle handler to recover the terminal assistant from `agent_end.messages` or the session transcript when state is stale, then emit lifecycle `phase:"error"` with a friendly error string instead of `end`
+  - upstream fix also updates the direct gateway `agent` RPC handler to observe lifecycle events for the run and derive the final RPC payload/terminal status from observed lifecycle + resolved result metadata, instead of blindly caching `status:"ok"` when the outer RPC resolves
+  - files changed for the final fix:
+    - `src/agents/pi-embedded-subscribe.e2e-harness.ts`
+    - `src/agents/pi-embedded-subscribe.handlers.lifecycle.ts`
+    - `src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts`
+    - `src/agents/pi-embedded-subscribe.handlers.ts`
+    - `src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.subscribeembeddedpisession.test.ts`
+    - `src/gateway/server-methods/agent.ts`
+    - `src/gateway/server-methods/server-methods.test.ts`
+- Final targeted validation passed:
+  - `pnpm -C /home/openclaw/.openclaw/workspace/external/openclaw-upstream test -- --run src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.subscribeembeddedpisession.test.ts src/commands/agent.test.ts src/gateway/server-methods/agent-wait-dedupe.test.ts src/gateway/server-methods/server-methods.test.ts`
+  - result: `108 tests` passed across `5` files
+- Final decisive live source-gateway repro after the fix:
+  - gateway launch: `OPENCLAW_SKIP_CHANNELS=1 CLAWDBOT_SKIP_CHANNELS=1 pnpm exec tsx src/index.ts gateway run --port 18903 --bind loopback --auth none --allow-unconfigured`
+  - run id: `gwc-live-agent-wait-gpt53-source-fixed2-1773429512008`
+  - session key: `agent:main:subagent:agent-wait-gpt53-live-source-fixed2-1773429512008`
+  - final `agent` response with `expectFinal: true` returned:
+    - `finalStatus: "error"`
+    - `finalSummary: "LLM request rejected: Your input exceeds the context window of this model. Please adjust your input and try again."`
+  - matching `agent.wait` returned:
+    - `{"runId":"gwc-live-agent-wait-gpt53-source-fixed2-1773429512008","status":"error","endedAt":1773429514106,"error":"LLM request rejected: Your input exceeds the context window of this model. Please adjust your input and try again."}`
+- Net status now:
+  - subagent persistence/announcement fix: live-verified ✅
+  - raw `agent.wait` semantics fix: live-verified ✅
+- Side assessment on unrelated dirty upstream work: the `/subagents log` UX diff in `src/auto-reply/reply/commands-subagents/action-log.ts` + `shared.ts` is logically coherent and passed `pnpm test -- --run src/auto-reply/reply/commands.test.ts` (`44 tests`). It stays out of scope for this focused reliability pass: there is no dedicated coverage for the new tool-only log behavior, and including it would muddy the focused branch.
+- ACP follow-up pass on 2026-03-13 found a **new live-reproducible runtime bug** in the bundled `extensions/acpx` layer:
+  - current host state does **not** expose a global `acpx` binary on PATH, but the bundled plugin-local runtime exists and works at `~/.local/share/pnpm/.../openclaw/extensions/acpx/node_modules/.bin/acpx`
+  - current `~/.openclaw/openclaw.json` does not contain an explicit `acp` block or enabled `acpx` plugin entry, so this pass used the smallest direct runtime repro path instead of a full `sessions_spawn(runtime:"acp")` OpenClaw run
+  - live direct Codex repro now succeeds:
+    - command: bundled `acpx --format json --json-strict --timeout 15 codex exec 'reply with OK only'`
+    - result: clean JSON-RPC/session stream ending with `agent_message_chunk: "OK"`, `id:2 result:{stopReason:"end_turn"}`, process `exit=0`
+  - live direct Claude repro does **not** crash, but returns top-level JSON-RPC auth errors and still exits 0:
+    - command: bundled `acpx --format json --json-strict --timeout 20 claude exec 'reply with OK only'`
+    - stdout included:
+      - `{"jsonrpc":"2.0","id":2,"error":{"code":-32000,"message":"Authentication required"}}`
+      - `{"jsonrpc":"2.0","id":null,"error":{"code":-32000,"message":"Authentication required"}}`
+    - process `exit=0`
+  - source inspection showed `extensions/acpx/src/runtime-internals/events.ts` ignored that top-level JSON-RPC error shape during prompt streaming, so `runtime.runTurn()` could silently treat Claude auth failure as success (`done`) when no typed `error` event or non-zero exit was emitted
+- Implemented the smallest focused upstream runtime fix on branch `fix/subagent-wait-error-outcome`:
+  - `extensions/acpx/src/runtime-internals/events.ts`
+    - `toAcpxErrorEvent()` now recognizes top-level JSON-RPC `error` responses via `parseControlJsonError()`
+    - `parsePromptEventLine()` now maps those JSON-RPC errors into ACP runtime `type:"error"` events instead of dropping them
+  - regression coverage added:
+    - `extensions/acpx/src/runtime-internals/events.test.ts` — top-level JSON-RPC prompt error parsing
+    - `extensions/acpx/src/runtime-internals/test-fixtures.ts` — mock prompt path for clean-exit JSON-RPC auth error
+    - `extensions/acpx/src/runtime.test.ts` — `runTurn()` emits error and does **not** emit `done` for the Claude-style auth failure shape
+- Targeted validation for the ACP follow-up fix passed:
+  - `cd external/openclaw-upstream && pnpm exec vitest run extensions/acpx/src/runtime-internals/events.test.ts extensions/acpx/src/runtime.test.ts extensions/acpx/src/runtime-internals/control-errors.test.ts`
+  - result: `3` files passed, `22` tests passed
+- Current interpretation of the old Claude/Codex ACP bug after this pass:
+  - historical notes still say `Claude: acpx exited with code 1`, `Codex: acpx exited with code 5`
+  - those exact exit-code crashes were **not** reproduced today
+  - current live state is narrower and better understood:
+    - Codex ACP path works directly
+    - Claude ACP path currently fails for auth, and OpenClaw previously mishandled that failure shape in the acpx runtime layer
+- Remaining open ACP follow-up after this fix:
+  - validate the patched runtime through the real OpenClaw ACP path (`sessions_spawn(runtime:"acp")`) once ACP is explicitly enabled/configured here, or whenever a fresh end-to-end repro is available
+  - only reopen the historical `acpx exited with code 1/5` line if a fresh repro appears
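The acpx finding above (a top-level JSON-RPC `error` response on stdout combined with a clean `exit=0`) can be illustrated with a minimal TypeScript sketch. This is not the upstream `events.ts` code: `PromptEvent`, `parsePromptLine`, and `turnSucceeded` are hypothetical names chosen for illustration, and the real runtime types and helpers may differ.

```typescript
// Hypothetical event shape for this sketch; the real acpx runtime types may differ.
type PromptEvent =
  | { type: "error"; code: number | null; message: string }
  | { type: "other"; raw: unknown };

// Map one stdout line to an event. Top-level JSON-RPC `error` responses
// become explicit error events instead of being dropped.
function parsePromptLine(line: string): PromptEvent | null {
  const trimmed = line.trim();
  if (trimmed.length === 0) return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    return null; // non-JSON lines are ignored in this sketch
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const record = parsed as Record<string, unknown>;
  const error = record["error"];
  if (record["jsonrpc"] === "2.0" && typeof error === "object" && error !== null) {
    const code = (error as Record<string, unknown>)["code"];
    const message = (error as Record<string, unknown>)["message"];
    return {
      type: "error",
      code: typeof code === "number" ? code : null,
      message: typeof message === "string" ? message : "unknown JSON-RPC error",
    };
  }
  return { type: "other", raw: parsed };
}

// A turn counts as successful only when the process exited 0 AND no error
// event was seen, so a clean exit alone is not treated as success.
function turnSucceeded(lines: string[], exitCode: number): boolean {
  const sawError = lines.some((l) => parsePromptLine(l)?.type === "error");
  return exitCode === 0 && !sawError;
}
```

Fed the Claude stdout lines quoted above, `turnSucceeded` reports failure even though the process exit code is `0`, which is the behavior the regression coverage is meant to lock in.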
+
+## Constraints
+- Prefer evidence over theory.
+- Do not claim a fix without concrete validation.
+- Keep the main session clean; use this file as the canonical baton.
+
+## Success criteria
+- Clear diagnosis of the current reliability problem(s).
+- At least one of:
+  - implemented fix with validation, or
+  - sharply scoped next fix plan with exact evidence and files.
+- `memory/2026-03-13.md` (or current daily note), `memory/tasks.json`, and this WIP updated.
@@ -173,3 +173,95 @@
    - `approval-mmny879w-yvqzokpz`
    - `approval-mmny879w-md99hqxs`
  - `gog` dry-run command checks for list/delete/send each exited `0`.
+
+## Calendar pass 2 (fresh subagent implementation, locally verified)
+- Added to `openclaw-action` workflow contract:
+  - `list_upcoming_events`
+  - `update_calendar_event`
+  - `delete_calendar_event`
+- Preserved explicit approval metadata/policy:
+  - `list_upcoming_events` → `approval.mutation_level = low`
+  - `update_calendar_event` / `delete_calendar_event` → `approval.mutation_level = high`
+- Extended host bridge `resolve-approval-with-gog.py` with executor coverage for:
+  - `calendar_list_events` → `gog calendar events`
+  - `calendar_event_update` → `gog calendar update`
+  - `calendar_event_delete` → `gog calendar delete`
+- Added sample payloads:
+  - `skills/n8n-webhook/assets/test-list-upcoming-events.json`
+  - `skills/n8n-webhook/assets/test-update-calendar-event.json`
+  - `skills/n8n-webhook/assets/test-delete-calendar-event.json`
+- Verification evidence (local/targeted):
+  - workflow structure + contract validator passed after the calendar additions
+  - workflow asset inspection confirmed the three new router actions are present
+  - bridge command-builder checks from shipped payloads produced:
+    - `gog calendar events primary --account will@example.com --json --no-input --max 10 --days 7 --query zap --dry-run`
+    - `gog calendar update primary example-calendar-event-id --account will@example.com --json --no-input --send-updates none --summary Updated call with vendor --from 2026-03-13T18:15:00Z --to 2026-03-13T18:45:00Z --description Updated by OpenClaw action bus. --location Updated room --dry-run`
+    - `gog calendar delete primary example-calendar-event-id --account will@example.com --json --no-input --force --send-updates none --dry-run`
+  - `python3 -m py_compile skills/n8n-webhook/scripts/resolve-approval-with-gog.py` passed.
+
+## Live deploy + smoke verification
+- Re-synced the active n8n workflow `Jwi54VWMdlLqYnRo` from the current `openclaw-action.workflow.json` asset while preserving the bound webhook credential + webhook id.
+- The first sync revealed that the live workflow was still on the older minimal router; re-synced again from the current full asset and re-activated successfully.
+- Safe production-webhook smoke calls succeeded:
+  - `append_log` → ok
+  - `get_logs` → ok
+  - `list_email_drafts` → queued_for_approval
+  - `list_upcoming_events` → queued_for_approval
+  - `approval_queue_list` → ok with `pending_compact` + `history_compact`
+  - `fetch_and_normalize_url` against local `/healthz` → ok / HTTP 200
+  - unknown action → expected HTTP 400 / `unknown_action`
+- Smoke-created pending approvals were rejected/cleaned:
+  - `approval-mmnzm1ev-yjk46sd1`
+  - `approval-mmnzm1gi-l7yszi92`
+  - `approval-mmnzmw80-kb8szya2`
+  - `approval-mmnzmw9w-c25hlml4`
+- Remaining pending items after cleanup were older pre-existing queue items and were intentionally left alone.
+
+## Subagent monitoring thresholds
+- Added an explicit operating rule for fresh implementation runs:
+  - first routine check at ~5 minutes if still running
+  - inspect child history at ~10 minutes
+  - a narrow pass is suspiciously long at ~12 minutes; intervene by ~15 minutes absent crisp progress
+  - a medium bounded pass is suspiciously long at ~20 minutes; intervene by ~25 minutes absent crisp progress
+- Also recorded the fallback rule: if the run is looping, not updating `WIP.md`, or returns an unusable result, finish the pass directly in the main session after one inspection.
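As a rough illustration, the thresholds above can be written as a small decision function. This is only a sketch: `monitorAction`, `PassSize`, and the action labels are hypothetical names, and the function reports which band a run is currently in rather than tracking whether the 5-minute or 10-minute check has already been done.

```typescript
type PassSize = "narrow" | "medium";
type MonitorAction = "wait" | "status-check" | "inspect-history" | "suspicious" | "intervene";

// Minute thresholds taken from the notes above.
const THRESHOLDS: Record<PassSize, { suspicious: number; intervene: number }> = {
  narrow: { suspicious: 12, intervene: 15 },
  medium: { suspicious: 20, intervene: 25 },
};

// Returns the band a still-active run falls into. `crispProgress` is the
// outcome of the most recent inspection (edits, tests, commits observed).
function monitorAction(elapsedMinutes: number, size: PassSize, crispProgress: boolean): MonitorAction {
  const t = THRESHOLDS[size];
  if (elapsedMinutes >= t.intervene && !crispProgress) return "intervene";
  if (elapsedMinutes >= t.suspicious && !crispProgress) return "suspicious";
  if (elapsedMinutes >= 10) return "inspect-history";
  if (elapsedMinutes >= 5) return "status-check";
  return "wait";
}
```

For example, a narrow pass at 15 minutes with no crisp progress maps to `"intervene"`, while a medium pass at the same point is still in the `"inspect-history"` band.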
+
+## Operator/polish pass 3 (fresh subagent implementation, locally verified)
+- Added explicit approval families + defaults across the n8n action bus:
+  - notification → required/high
+  - gmail read-only → required/low
+  - gmail mutating → required/high
+  - calendar read-only → required/low
+  - calendar mutating → required/high
+- Added low-noise operator/history reporting in the workflow:
+  - `payload_preview`
+  - `operator.summary_line`
+  - `operator.execution_state`
+  - `operator.result_refs`
+  - compact list surfaces: `pending_compact`, `history_compact`, `item_compact`
+- Extended the host bridge so attached execution metadata now includes:
+  - `execution.summary`
+  - `execution.result_refs`
+- Added recurring verification payloads with stable proof IDs:
+  - `skills/n8n-webhook/assets/test-verify-email-draft-cycle.json` → `verify-email-draft-cycle-001`
+  - `skills/n8n-webhook/assets/test-verify-calendar-event-cycle.json` → `verify-calendar-event-cycle-001`
+- Local verification proofs:
+  - workflow validator passed after the operator/history changes
+  - bridge helper proof: `gmail.drafts.create` sample result produced `draft_id = r-proof-draft-123`
+  - bridge helper proof: `calendar.create` sample result produced `event_id = evt-proof-456`, `calendar = primary`
+  - workflow asset string checks confirmed presence of `pending_compact`, `history_compact`, `summary_line`, `result_refs`, `default_mode`, and `approval.family`
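The approval families and defaults listed above could be represented as a lookup table along these lines. This is a sketch under assumptions: the family key strings and the `approvalFor` helper are hypothetical, only the `default_mode` and `mutation_level` field names come from the notes, and failing closed on an unknown family is a design choice of the sketch rather than documented workflow behavior.

```typescript
type MutationLevel = "low" | "high";

interface ApprovalDefault {
  default_mode: "required";
  mutation_level: MutationLevel;
}

// Family keys are hypothetical labels for this sketch.
const APPROVAL_DEFAULTS: Record<string, ApprovalDefault> = {
  notification: { default_mode: "required", mutation_level: "high" },
  gmail_read_only: { default_mode: "required", mutation_level: "low" },
  gmail_mutating: { default_mode: "required", mutation_level: "high" },
  calendar_read_only: { default_mode: "required", mutation_level: "low" },
  calendar_mutating: { default_mode: "required", mutation_level: "high" },
};

// Look up the default for a family; unknown families throw instead of
// silently receiving a permissive default.
function approvalFor(family: string): ApprovalDefault {
  const found = APPROVAL_DEFAULTS[family];
  if (found === undefined) {
    throw new Error(`unknown approval family: ${family}`);
  }
  return found;
}
```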
+
+## Drive/Docs/Sheets evaluation (main session)
+- Hit GPT-5.4 rate limit during attempted subagent work; continued evaluation in main session with GLM-5.
+- Completed decision on expanding Google Workspace action bus beyond Gmail + Calendar into Drive / Docs / Sheets.
+- **Decision: NO for all three surfaces — defer for now.**
+- Rationale:
+  - Drive: discovery/search operations are tools, not approval-worthy events; file management is an edge case; direct `gog drive ...` usage is simpler.
+  - Docs: editing is inherently iterative (see, tweak, see again); document work belongs in focused tool flows, not a one-shot queue.
+  - Sheets: strongest candidate (structured writes map well to approval), but without a concrete use case, this is premature optimization.
+- Preserved principle: the action bus works best for discrete, approval-worthy, low-iteration operations (like send draft, create event).
+- Updated `WIP.drive-docs-sheets.md` with closed status and revisit criteria per surface.
+- Updated `memory/tasks.json` to close:
+  - `task-20260311-1908-calendar-access` → done
+  - `task-20260311-1908-email-access` → done
+  - `task-20260311-1914-google-workspace-access` → done
+- Google Workspace + n8n integration WIP (`WIP.md`) is now complete, with evidence recorded in memory.
@@ -0,0 +1,130 @@
+# 2026-03-13
+
+## Subagent reliability investigation
+- Fresh implementation subagent launch for subagent/ACP reliability failed immediately before doing any task work.
+- Failure mode: delegated run was spawned with model `glm-5`, which resolved to provider model `zai/glm-5`.
+- Current installed agent auth profile keys inspected in agent stores include `openai-codex:default`, `litellm:default`, and `github-copilot:github`.
+- Will clarified on 2026-03-13 that Z.AI auth does exist in the environment, but the account is not entitled for `glm-5`.
+- Verified by inspecting agent auth profile keys under:
+  - `/home/openclaw/.openclaw/agents/*/agent/auth-profiles.json`
+- Relevant OpenClaw docs confirm:
+  - subagent spawns inherit caller model when `sessions_spawn.model` is omitted
+  - provider/model auth errors like `No API key found for provider "zai"` occur when a provider model is selected without matching auth
+  - multi-agent auth is per-agent via `~/.openclaw/agents/<agentId>/agent/auth-profiles.json`
+- Conclusion: the immediate failure was caused by an incorrect explicit model selection in the spawn request, not by missing auth propagation between agents.
+- Corrective action: retry fresh delegation with `litellm/glm-5` (the intended medium-tier routed model for delegated implementation work in this setup).
+- Will explicitly requested on 2026-03-13 to use `gpt-5.4` for subagents for now while debugging delegation reliability.
+- New evidence from the corrected run: `~/.openclaw/agents/main/sessions/1615a980-cf92-4d5e-845a-a2abe77c0418.jsonl` shows repeated assistant `stopReason:"error"` entries with `429 ... GLM-5 not included in current subscription plan`, but `~/.openclaw/subagents/runs.json` recorded run `776a8b51-6fdc-448e-83bc-55418814a05b` as `outcome.status: "ok"` and `frozenResultText: null`.
+- That separates ACP/runtime choice problems from a generic subagent completion/reporting bug: a terminal assistant error can still be persisted/announced as success with no useful result.
+- Implemented upstream fix on branch `external/openclaw-upstream@fix/subagent-wait-error-outcome`:
+  - added assistant terminal-outcome helper so empty-content assistant errors still yield usable terminal text
+  - subagent registry now downgrades `agent.wait => ok` to `error` when the child session's terminal assistant message is actually an error
+  - subagent announce flow now reports terminal assistant errors as failed outcomes instead of successful `(no output)` completions
+- Targeted validation passed:
+  - `pnpm -C /home/openclaw/.openclaw/workspace/external/openclaw-upstream test -- --run src/agents/tools/sessions-helpers.terminal-text.test.ts src/agents/subagent-registry.persistence.test.ts src/gateway/server-methods/server-methods.test.ts`
+  - result: `50 tests` passed across `3` files
+- Real success-path verification later passed on `gpt-5.4` with run `23750d80-b481-4f50-b219-cc9245be405f` and final child result `SUCCESS-PROBE-OK`.
+- Real failure-path verification later also passed on valid `gpt-5.4` by intentionally triggering a `context_length_exceeded` provider error with a token-dense oversized task payload.
+  - child run: `b50cb91f-6219-44f7-9d2f-a1264ac7ceaf`
+  - child session: `agent:main:subagent:4c0dd686-cd2e-4cba-b80b-2fbf309a4594`
+  - child transcript: `~/.openclaw/agents/main/sessions/f114b831-000b-4070-a539-85c68d2b7057.jsonl`
+  - transcript terminal assistant entry recorded `provider:"openai-codex"`, `model:"gpt-5.4"`, `stopReason:"error"`, `errorMessage:"Codex error: {...context_length_exceeded...}"`
+  - matching `~/.openclaw/subagents/runs.json` now correctly stored:
+    - `outcome.status: "error"`
+    - `outcome.error: "Codex error: {...context_length_exceeded...}"`
+    - `endedReason: "subagent-error"`
+    - `frozenResultText: "Codex error: {...context_length_exceeded...}"`
+- Important remaining nuance from the live repro: raw gateway `agent.wait` for that same failed child returned `status:"ok"` with only `endedAt` even though the child transcript terminal assistant message had `stopReason:"error"`.
+- Follow-up code inspection on 2026-03-13 showed this is an upstream bug, not an intentional `agent.wait` layering choice:
+  - embedded subscribe lifecycle already emits `phase:"error"` for terminal assistant/provider failures
+  - but `src/commands/agent.ts` had a fallback lifecycle emitter that still sent `phase:"end"` whenever no inner lifecycle callback was observed, even if the resolved run result carried `meta.stopReason:"error"`
+  - `waitForAgentJob` gives lifecycle errors a retry grace window, so that fallback `end` could overwrite the terminal failure and make `agent.wait` resolve `ok`
+- Implemented focused upstream follow-up on branch `fix/subagent-wait-error-outcome`:
+  - `src/commands/agent.ts` now emits lifecycle `phase:"error"` with extracted terminal error text when a resolved run stops with `meta.stopReason:"error"` and no inner lifecycle callback fired
+  - `src/gateway/server-methods/agent-wait-dedupe.ts` now also maps completed agent dedupe payloads with `result.meta.stopReason:"error"` to `status:"error"` and `aborted:true` to `status:"timeout"`
+- Targeted validation passed:
+  - `pnpm -C /home/openclaw/.openclaw/workspace/external/openclaw-upstream test -- --run src/commands/agent.test.ts src/gateway/server-methods/agent-wait-dedupe.test.ts src/gateway/server-methods/server-methods.test.ts`
+  - result: `81 tests` passed across `3` files
+- Live runtime verification was re-run later on 2026-03-13 and showed the current `agent.wait` follow-up fix still does **not** hold on the live direct gateway path.
+  - first temp-gateway sanity run via `GatewayClient` against loopback port `18901` on a persisted `gpt-5.4` session returned `status:"error"`, but only because that temp runtime reported `FailoverError: Unknown model: openai-codex/gpt-5.4`; useful as transport sanity, not canonical semantics proof
+  - stale-dist temp gateway repro on default model (`gpt-5.3-codex`) already showed the mismatch:
+    - session key: `agent:main:subagent:agent-wait-gpt53-live-1773427893572`
+    - run id: `gwc-live-agent-wait-gpt53-1773427893583`
+    - `agent.wait`: `{"runId":"gwc-live-agent-wait-gpt53-1773427893583","status":"ok","endedAt":1773427896100}`
+    - last assistant still recorded `stopReason:"error"` with `context_length_exceeded`
+  - decisive live source-gateway repro used a fresh source-run gateway on port `18902` launched with:
+    - `OPENCLAW_SKIP_CHANNELS=1 CLAWDBOT_SKIP_CHANNELS=1 pnpm exec tsx src/index.ts gateway run --port 18902 --bind loopback --auth none --allow-unconfigured`
+    - gateway log confirmed default model `openai-codex/gpt-5.3-codex`
+    - session key: `agent:main:subagent:agent-wait-gpt53-live-source-1773427981586`
+    - run id: `gwc-live-agent-wait-gpt53-source-1773427981614`
+    - payload chars: `880150`
+    - start: `{"runId":"gwc-live-agent-wait-gpt53-source-1773427981614","status":"accepted","acceptedAt":1773427981959}`
+    - `agent.wait`: `{"runId":"gwc-live-agent-wait-gpt53-source-1773427981614","status":"ok","endedAt":1773427984243}`
+    - same session's terminal assistant message still recorded:
+      - `provider:"openai-codex"`
+      - `model:"gpt-5.3-codex"`
+      - `stopReason:"error"`
+      - `errorMessage:"Codex error: {\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"code\":\"context_length_exceeded\",\"message\":\"Your input exceeds the context window of this model. Please adjust your input and try again.\",\"param\":\"input\"},\"sequence_number\":2}"`
+- Fast source inspection after that live repro points to the most likely remaining gap:
+  - `src/commands/agent.ts` only emits the new corrective lifecycle `phase:"error"` when `!lifecycleEnded`
+  - `lifecycleEnded` becomes true as soon as any inner lifecycle callback reports `phase:"end"` or `phase:"error"`
+  - `src/gateway/server-methods/agent-job.ts` still treats lifecycle `phase:"end"` as terminal `status:"ok"`
+  - so the likeliest still-open live bug is an inner lifecycle emitter marking terminal assistant/provider failures as `end` early enough that `agent.wait` resolves `ok` before the dedupe/result-meta rescue path matters
+- Net status at end of this pass:
+  - subagent persistence/announcement fix: live-verified
+  - raw `agent.wait` follow-up fix: tests passed, but live source-gateway repro still failed; do not mark this closed
+- Final focused live-fix pass on 2026-03-13 closed the remaining raw `agent.wait` bug.
+  - root cause: the live direct gateway path could receive `agent_end` carrying a terminal assistant error without a preceding `message_end`, leaving stale/empty assistant state and still emitting lifecycle `phase:"end"`
+  - final upstream fix taught embedded subscribe lifecycle handling to recover the terminal assistant from `agent_end.messages` / session transcript and emit lifecycle `phase:"error"`, and taught the gateway `agent` RPC handler to derive terminal status from observed lifecycle + final result metadata instead of blindly caching `ok`
+  - final targeted validation passed:
+    - `pnpm -C /home/openclaw/.openclaw/workspace/external/openclaw-upstream test -- --run src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.subscribeembeddedpisession.test.ts src/commands/agent.test.ts src/gateway/server-methods/agent-wait-dedupe.test.ts src/gateway/server-methods/server-methods.test.ts`
+    - result: `108 tests` passed across `5` files
+  - decisive live source-gateway repro after the final fix:
+    - gateway: source-run on port `18903`
+    - run id: `gwc-live-agent-wait-gpt53-source-fixed2-1773429512008`
+    - final `agent` response returned `finalStatus:"error"`
+    - matching `agent.wait` returned `status:"error"` with the same context-window error text
+- Net status now:
+  - subagent persistence/announcement fix: live-verified ✅
+  - raw `agent.wait` semantics fix: live-verified ✅
+- Side note: unrelated dirty `/subagents log` UX changes in `external/openclaw-upstream` regression-passed `src/auto-reply/reply/commands.test.ts` (44 tests) but were intentionally left out-of-scope for this focused reliability pass.
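The core idea of the `agent.wait` fix described above, deriving the terminal status from observed lifecycle phases plus resolved result metadata so that a late `end` cannot overwrite an earlier `error`, can be sketched as follows. This is not the upstream implementation: `deriveWaitStatus` and `ResolvedRunMeta` are hypothetical names, and the precedence of `aborted` over `stopReason` is an assumption of the sketch.

```typescript
type LifecyclePhase = "end" | "error";
type WaitStatus = "ok" | "error" | "timeout";

// Hypothetical subset of resolved run metadata used by this sketch.
interface ResolvedRunMeta {
  stopReason?: string;
  aborted?: boolean;
}

// Derive the status a waiter should see. An error seen anywhere (in the
// lifecycle stream or in the result metadata) wins over a plain `end`.
function deriveWaitStatus(observed: LifecyclePhase[], meta: ResolvedRunMeta): WaitStatus {
  if (meta.aborted === true) return "timeout"; // assumption: aborted is checked first
  if (meta.stopReason === "error") return "error";
  if (observed.some((phase) => phase === "error")) return "error";
  return "ok";
}
```

With this shape, a run whose lifecycle stream was `["error", "end"]` still resolves as `"error"`, which matches the symptom the earlier fallback `end` emission had masked.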
+
+## ACP Claude/Codex follow-up (post-`agent.wait` fix)
+- Historical deferred task `task-20260304-211216-acp-claude-codex` still referenced old failures `Claude: acpx exited with code 1` and `Codex: acpx exited with code 5`, but those exact crashes were **not** reproduced in the latest focused pass.
+- Current host state check:
+  - `claude` installed: `/home/linuxbrew/.linuxbrew/bin/claude` (`2.1.63`)
+  - `codex` installed: `/home/linuxbrew/.linuxbrew/bin/codex` (`0.107.0`)
+  - no global `acpx` on PATH, but bundled plugin-local runtime exists at `~/.local/share/pnpm/.../openclaw/extensions/acpx/node_modules/.bin/acpx`
+  - current `~/.openclaw/openclaw.json` only showed `plugins.entries.telegram.enabled=true`; no explicit `acp` block / `acpx` plugin entry was present, so the smallest reliable repro path used the bundled `acpx` directly rather than a full OpenClaw ACP session
+- Live direct bundled-acpx repro results:
+  - Codex command:
+    - `.../acpx --format json --json-strict --timeout 15 codex exec 'reply with OK only'`
+    - result: clean JSON-RPC/session stream ended with `agent_message_chunk: "OK"`, `id:2 result:{stopReason:"end_turn"}`, process `exit=0`
+  - Claude command:
+    - `.../acpx --format json --json-strict --timeout 20 claude exec 'reply with OK only'`
+    - stdout included top-level JSON-RPC errors:
+      - `{"jsonrpc":"2.0","id":2,"error":{"code":-32000,"message":"Authentication required"}}`
+      - `{"jsonrpc":"2.0","id":null,"error":{"code":-32000,"message":"Authentication required"}}`
+    - process still exited `0`
+- Source-level finding in `external/openclaw-upstream/extensions/acpx/src/runtime-internals/events.ts`:
+  - prompt parsing handled typed `{type:"error"}` lines but dropped top-level JSON-RPC `error` responses
+  - that meant `runtime.runTurn()` could treat a Claude auth failure as success (`done`) when the agent emitted JSON-RPC errors yet exited cleanly
+- Implemented focused upstream fix on branch `fix/subagent-wait-error-outcome`:
+  - `extensions/acpx/src/runtime-internals/events.ts`
+    - `toAcpxErrorEvent()` now also recognizes top-level JSON-RPC `error` responses via `parseControlJsonError()`
+    - `parsePromptEventLine()` now emits ACP runtime `type:"error"` events for that shape instead of dropping it
+  - added regression coverage:
+    - `extensions/acpx/src/runtime-internals/events.test.ts`
+    - `extensions/acpx/src/runtime-internals/test-fixtures.ts`
+    - `extensions/acpx/src/runtime.test.ts`
+- Targeted validation passed:
+  - `cd /home/openclaw/.openclaw/workspace/external/openclaw-upstream && pnpm exec vitest run extensions/acpx/src/runtime-internals/events.test.ts extensions/acpx/src/runtime.test.ts extensions/acpx/src/runtime-internals/control-errors.test.ts`
+  - result: `22` tests passed across `3` files
+- Net status after this pass:
+  - old `acpx exited with code 1/5` reports remain historical evidence only
+  - Codex ACP direct runtime path works today
+  - Claude ACP direct runtime path currently fails on authentication (`Authentication required`), and OpenClaw had a real bug in how the bundled acpx runtime parsed that failure shape
+  - remaining follow-up is end-to-end OpenClaw ACP-path validation once ACP is explicitly configured here (or if a fresh exit-code repro appears)
+- Will also explicitly requested that zap keep a light eye on active subagents and check whether they look stuck instead of assuming they are fine until completion.
+- Will explicitly reinforced on 2026-03-13 that once planning is done, zap should use subagents ASAP and start implementation in a fresh session rather than continuing to implement inside the long-lived main chat.
+- Will explicitly asked on 2026-03-13 for more frequent checks on active subagent runs; zap should inspect/steer sooner instead of waiting for long silent stretches.
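
The bundled-acpx parsing gap described in the ACP follow-up above (typed `{type:"error"}` lines handled, top-level JSON-RPC `error` responses dropped) can be illustrated with a small sketch. The function name `parsePromptLine` and the returned event shape are hypothetical and are not the actual `parsePromptEventLine()` implementation. Only the two input shapes are taken from the notes.

```javascript
// Hypothetical sketch; `parsePromptLine` and its return shape are illustrative only.
function parsePromptLine(line) {
  let msg;
  try {
    msg = JSON.parse(line);
  } catch {
    return null; // not a JSON line: nothing to report
  }
  if (!msg || typeof msg !== "object") return null;

  // Shape 1: typed error events, which were already handled.
  if (msg.type === "error") {
    return { type: "error", message: String(msg.message ?? "unknown error") };
  }

  // Shape 2: top-level JSON-RPC error responses, which were previously dropped.
  if (msg.jsonrpc === "2.0" && msg.error && typeof msg.error === "object") {
    return {
      type: "error",
      code: msg.error.code,
      message: String(msg.error.message ?? "unknown error"),
    };
  }
  return null; // other lines are not errors
}
```

Treating shape 2 as an error matters because the Claude repro exited `0`: the process exit code alone cannot distinguish an auth failure from a clean turn.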
@@ -5,11 +5,14 @@
    "title": "Fix ACP runtime failures for Claude Code and Codex agents",
    "owner": "zap",
    "priority": "high",
-    "status": "open",
-    "details": "Both ACP runs failed during this session (Claude: acpx exited with code 1, Codex: acpx exited with code 5). Investigate acpx/ACP runtime failure path and restore reliable delegation for claude/codex agents.",
+    "status": "in-progress",
+    "details": "Historical evidence said Claude/Codex ACP runs failed with `acpx exited with code 1/5`. Latest focused pass narrowed the live issue: direct bundled `acpx` now shows Codex working, while Claude returns top-level JSON-RPC `Authentication required` errors and exits 0. A focused upstream fix now makes the bundled acpx runtime surface those JSON-RPC prompt errors instead of silently treating them as success. Remaining work: validate through the real OpenClaw ACP session path once ACP is explicitly configured here, or capture a fresh repro of the older exit-code crashes.",
    "notes": [
      "Reported by Will on 2026-03-04.",
-      "Added as deferred follow-up while immediate LiteLLM route fix was applied directly."
+      "Added as deferred follow-up while immediate LiteLLM route fix was applied directly.",
+      "2026-03-13 follow-up: exact historical `acpx exited with code 1/5` crashes were not reproduced. Live direct bundled-acpx repros showed Codex success and Claude top-level JSON-RPC auth errors with clean exit 0.",
+      "2026-03-13 follow-up: fixed bundled acpx prompt parsing in external/openclaw-upstream so top-level JSON-RPC error responses now emit ACP runtime error events instead of being dropped. Targeted validation passed: 22 tests across events/control-errors/runtime test files.",
+      "2026-03-13 remaining step: validate the fix through a real OpenClaw ACP session once `acp`/`acpx` is explicitly enabled in local config, or wait for a fresh end-to-end repro of the older exit-code failures."
    ]
  },
  {
@@ -26,7 +29,12 @@
      "Implemented local TUI filtering patch in openclaw dist to suppress internal runtime completion context blocks (tui-LeOEBhMz.js).",
      "Patch timestamp: 2026-03-04T22:31:50Z",
      "Upstream patch committed in external/openclaw-upstream on branch fix/tui-hide-internal-runtime-context commit 0f66a4547 (suppress internal runtime completion context blocks in TUI formatter).",
-      "Validation: pnpm test:fast completed successfully (812 files / 6599 tests passing) at 2026-03-04T22:53:29Z"
+      "Validation: pnpm test:fast completed successfully (812 files / 6599 tests passing) at 2026-03-04T22:53:29Z",
+      "2026-03-13: confirmed corrected LiteLLM run was still failing (child transcript showed assistant 429/plan error for GLM-5) while runs.json incorrectly stored outcome.status=ok and frozenResultText=null; implemented upstream branch fix/subagent-wait-error-outcome to derive terminal subagent outcome from latest assistant error state, with targeted validation (50 tests passed across 3 files).",
+      "2026-03-13 later: live gpt-5.4 success repro passed (run 23750d80-b481-4f50-b219-cc9245be405f). Live gpt-5.4 failure repro also passed for subagent persistence/announcement handling: child run b50cb91f-6219-44f7-9d2f-a1264ac7ceaf ended with transcript stopReason=error + context_length_exceeded, and runs.json now stored outcome.status=error / endedReason=subagent-error / frozenResultText non-null. Remaining open nuance: raw agent.wait for that same failed child still returned status=ok.",
+      "2026-03-13 later: traced raw agent.wait=status:ok-on-terminal-error to an upstream bug in commands/agent.ts fallback lifecycle emission (phase:end emitted even when resolved run meta.stopReason=error). Added focused upstream fix plus dedupe-path handling/tests on branch fix/subagent-wait-error-outcome; targeted validation passed (81 tests across commands/agent.test.ts, gateway/server-methods/agent-wait-dedupe.test.ts, gateway/server-methods/server-methods.test.ts). Live verification of the new agent.wait behavior remains open.",
+      "2026-03-13 final live pass: a fresh source-run gateway on port 18902 still returned agent.wait status=ok for run gwc-live-agent-wait-gpt53-source-1773427981614 even though the same session's terminal assistant message had provider=openai-codex model=gpt-5.3-codex stopReason=error with context_length_exceeded. Most likely remaining gap: an inner lifecycle emitter still marks the live direct gateway path as phase:end early enough that waitForAgentJob resolves ok before dedupe/result-meta rescue logic can win.",
+      "2026-03-13 final focused pass: closed the remaining raw agent.wait bug. Root cause was the live direct gateway path receiving agent_end with a terminal assistant error but no preceding message_end, leaving stale assistant state and still emitting lifecycle phase:end. Final fix updated embedded subscribe lifecycle handling to recover terminal assistant errors from agent_end/session state and updated gateway server-methods/agent.ts to derive final RPC status from observed lifecycle + resolved result metadata. Validation passed (108 tests across 5 files). Live source-gateway repro on port 18903 then returned finalStatus:error and agent.wait status:error for run gwc-live-agent-wait-gpt53-source-fixed2-1773429512008."
    ]
  },
  {
@@ -59,13 +67,13 @@
    "title": "Add calendar access/backend for proactive scheduling help",
    "owner": "zap",
    "priority": "medium",
-    "status": "in-progress",
+    "status": "done",
    "details": "Set up or connect a calendar backend so zap can provide stronger calendar-aware assistance, daily briefs, and schedule checks.",
    "notes": [
      "Added from LAN-services gap review on 2026-03-11.",
-      "Biggest functional gap identified at the time.",
-      "Progress 2026-03-12: access/auth now exists and the remaining work is productionizing n8n-routed execution plus verification.",
-      "Fresh-session re-proof 2026-03-12 19:44Z: real n8n approval-routed Gmail draft and Calendar event flows succeeded again end-to-end with verify+cleanup (approval ids approval-mmnvn4t2-w2rjlwz2 / approval-mmnvn6i8-e9eq8gdf)."
+      "Completed 2026-03-12: live n8n action bus now supports approval-gated Calendar actions (create, list, update, delete) via host-side gog bridge.",
+      "Live workflow id: Jwi54VWMdlLqYnRo.",
+      "Evidence: WIP.md + memory/2026-03-12.md + WIP.drive-docs-sheets.md (decision to defer Drive/Docs/Sheets)."
    ]
  },
  {
@@ -74,12 +82,13 @@
    "title": "Add email/inbox access for triage and briefing",
    "owner": "zap",
    "priority": "medium",
-    "status": "in-progress",
+    "status": "done",
    "details": "Set up access to a mail/inbox workflow so zap can help with triage, summaries, and urgent-message detection.",
    "notes": [
      "Added from LAN-services gap review on 2026-03-11.",
-      "Progress 2026-03-12: access/auth now exists and the remaining work is productionizing n8n-routed execution plus verification.",
-      "Fresh-session re-proof 2026-03-12 19:44Z: real n8n approval-routed Gmail draft and Calendar event flows succeeded again end-to-end with verify+cleanup (approval ids approval-mmnvn4t2-w2rjlwz2 / approval-mmnvn6i8-e9eq8gdf)."
+      "Completed 2026-03-12: live n8n action bus now supports approval-gated Gmail actions (draft create, list, delete, send) via host-side gog bridge.",
+      "Live workflow id: Jwi54VWMdlLqYnRo.",
+      "Evidence: WIP.md + memory/2026-03-12.md + WIP.drive-docs-sheets.md (decision to defer Drive/Docs/Sheets)."
    ]
  },
  {
@@ -125,17 +134,14 @@
    "title": "Add Google Workspace access (Calendar/Drive/Docs/Gmail as appropriate)",
    "owner": "zap",
    "priority": "medium",
-    "status": "in-progress",
+    "status": "done",
    "details": "Connect Google Workspace services where useful so zap can work with calendar, docs, drive, and/or gmail more directly.",
    "notes": [
      "Added from tool wishlist on 2026-03-11.",
-      "Some overlap with calendar/email tasks; this is the broader suite-level follow-up.",
-      "Progress 2026-03-12: gog auth completed for Gmail/Calendar/Drive/Contacts/Docs/Sheets.",
-      "Progress 2026-03-12: live n8n action bus now supports approval-gated Google flows via host-side gog bridge.",
-      "Next proof step: run real n8n-routed Gmail draft and Calendar event tests (not just dry-run).",
-      "Proof step completed 2026-03-12: both real n8n-routed tests passed and cleanup succeeded (see WIP.md + memory/2026-03-12.md for IDs).",
-      "State file: WIP.md tracks the current full plan and checkpoints.",
-      "Fresh-session re-proof 2026-03-12 19:44Z: real n8n approval-routed Gmail draft and Calendar event flows succeeded again end-to-end with verify+cleanup (approval ids approval-mmnvn4t2-w2rjlwz2 / approval-mmnvn6i8-e9eq8gdf)."
+      "Completed 2026-03-12: Gmail and Calendar are live via n8n action bus with approval gating and audit history.",
+      "Drive/Docs/Sheets evaluated and deferred in WIP.drive-docs-sheets.md — revisit only when concrete use cases appear.",
+      "Live workflow id: Jwi54VWMdlLqYnRo.",
+      "Evidence: WIP.md + memory/2026-03-12.md + WIP.drive-docs-sheets.md."
    ]
  },
  {
@@ -328,7 +334,7 @@
    "owner": "zap",
    "priority": "low",
    "status": "open",
-    "details": "Create a small place to record important decisions and why they were made so the same choices do not get re-litigated repeatedly.",
+    "details": "Create a small place to record important decisions and why they were made so that the same choices do not get re-litigated repeatedly.",
    "notes": [
      "Added from second-wave improvements list on 2026-03-11."
    ]
@@ -0,0 +1,231 @@
+#!/usr/bin/env node
+const fs = require('node:fs');
+const path = require('node:path');
+const os = require('node:os');
+
+function resolveOpenClawPackageRoot() {
+  const wrapperPath = path.join(os.homedir(), '.local', 'bin', 'openclaw');
+  const wrapper = fs.readFileSync(wrapperPath, 'utf8');
+  const match = wrapper.match(/"([^"]*node_modules\/openclaw)\/openclaw\.mjs"/);
+  if (!match) throw new Error(`Could not resolve openclaw package root from ${wrapperPath}`);
+  const raw = match[1];
+  if (raw.startsWith('$basedir/')) {
+    return path.resolve(path.dirname(wrapperPath), raw.replace(/^\$basedir\//, ''));
+  }
+  return raw;
+}
+
+function replaceOnce(content, oldText, newText, label, filePath) {
+  if (content.includes(newText)) return { content, changed: false, already: true };
+  if (!content.includes(oldText)) {
+    throw new Error(`Patch block not found for ${label} in ${filePath}`);
+  }
+  return { content: content.replace(oldText, newText), changed: true, already: false };
+}
+
+function ensureDir(dir) {
+  fs.mkdirSync(dir, { recursive: true });
+}
+
+const extractAssistantTextOld = `function extractAssistantText(message) {
+\tif (!message || typeof message !== "object") return;
+\tif (message.role !== "assistant") return;
+\tconst content = message.content;
+\tif (!Array.isArray(content)) return;
+\tconst joined = extractTextFromChatContent(content, {
+\t\tsanitizeText: sanitizeTextContent,
+\t\tjoinWith: "",
+\t\tnormalizeText: (text) => text.trim()
+\t}) ?? "";
+\tconst stopReason = message.stopReason;
+\tconst errorMessage = message.errorMessage;
+\tconst errorContext = stopReason === "error" || typeof errorMessage === "string" && Boolean(errorMessage.trim());
+\treturn joined ? sanitizeUserFacingText(joined, { errorContext }) : void 0;
+}`;
+
+const extractAssistantTextNew = `function extractAssistantText(message) {
+\tif (!message || typeof message !== "object") return;
+\tif (message.role !== "assistant") return;
+\tconst content = message.content;
+\tif (!Array.isArray(content)) return;
+\tconst joined = extractTextFromChatContent(content, {
+\t\tsanitizeText: sanitizeTextContent,
+\t\tjoinWith: "",
+\t\tnormalizeText: (text) => text.trim()
+\t}) ?? "";
+\tconst stopReason = message.stopReason;
+\tconst errorMessage = message.errorMessage;
+\tconst errorContext = stopReason === "error" || typeof errorMessage === "string" && Boolean(errorMessage.trim());
+\treturn joined ? sanitizeUserFacingText(joined, { errorContext }) : void 0;
+}
+function extractAssistantTerminalText(message) {
+\tif (!message || typeof message !== "object") return { isError: false };
+\tif (message.role !== "assistant") return { isError: false };
+\tconst stopReason = message.stopReason;
+\tconst rawErrorMessage = message.errorMessage;
+\tconst isError = stopReason === "error" || typeof rawErrorMessage === "string" && Boolean(rawErrorMessage.trim());
+\tconst text = extractAssistantText(message);
+\tif (text?.trim()) return { text, isError };
+\tif (typeof rawErrorMessage === "string" && rawErrorMessage.trim()) {
+\t\treturn {
+\t\t\ttext: sanitizeUserFacingText(rawErrorMessage.trim(), { errorContext: true }),
+\t\t\tisError: true
+\t\t};
+\t}
+\treturn { isError };
+}`;
+
+const readLatestAssistantReplyOld = `async function readLatestAssistantReply(params) {
+\tconst history = await callGateway({
+\t\tmethod: "chat.history",
+\t\tparams: {
+\t\t\tsessionKey: params.sessionKey,
+\t\t\tlimit: params.limit ?? 50
+\t\t}
+\t});
+\tconst filtered = stripToolMessages(Array.isArray(history?.messages) ? history.messages : []);
+\tfor (let i = filtered.length - 1; i >= 0; i -= 1) {
+\t\tconst candidate = filtered[i];
+\t\tif (!candidate || typeof candidate !== "object") continue;
+\t\tif (candidate.role !== "assistant") continue;
+\t\tconst text = extractAssistantText(candidate);
+\t\tif (!text?.trim()) continue;
+\t\treturn text;
+\t}
+}`;
+
+const readLatestAssistantReplyNew = `async function readLatestAssistantOutcome(params) {
+\tconst history = await callGateway({
+\t\tmethod: "chat.history",
+\t\tparams: {
+\t\t\tsessionKey: params.sessionKey,
+\t\t\tlimit: params.limit ?? 50
+\t\t}
+\t});
+\tconst filtered = stripToolMessages(Array.isArray(history?.messages) ? history.messages : []);
+\tfor (let i = filtered.length - 1; i >= 0; i -= 1) {
+\t\tconst candidate = filtered[i];
+\t\tif (!candidate || typeof candidate !== "object") continue;
+\t\tif (candidate.role !== "assistant") continue;
+\t\treturn extractAssistantTerminalText(candidate);
+\t}
+\treturn { isError: false };
+}
+async function readLatestAssistantReply(params) {
+\tconst outcome = await readLatestAssistantOutcome(params);
+\treturn outcome.text?.trim() ? outcome.text : void 0;
+}`;
+
+const waitOutcomeOld = `\t\tconst waitError = typeof wait.error === "string" ? wait.error : void 0;
+\t\tconst outcome = wait.status === "error" ? {
+\t\t\tstatus: "error",
+\t\t\terror: waitError
+\t\t} : wait.status === "timeout" ? { status: "timeout" } : { status: "ok" };
+\t\tif (!runOutcomesEqual(entry.outcome, outcome)) {
+\t\t\tentry.outcome = outcome;
+\t\t\tmutated = true;
+\t\t}
+\t\tif (mutated) persistSubagentRuns();
+\t\tawait completeSubagentRun({
+\t\t\trunId,
+\t\t\tendedAt: entry.endedAt,
+\t\t\toutcome,
+\t\t\treason: wait.status === "error" ? SUBAGENT_ENDED_REASON_ERROR : SUBAGENT_ENDED_REASON_COMPLETE,
+\t\t\tsendFarewell: true,
+\t\t\taccountId: entry.requesterOrigin?.accountId,
+\t\t\ttriggerCleanup: true
+\t\t});`;
+
+const waitOutcomeNew = `\t\tconst waitError = typeof wait.error === "string" ? wait.error : void 0;
+\t\tlet outcome = wait.status === "error" ? {
+\t\t\tstatus: "error",
+\t\t\terror: waitError
+\t\t} : wait.status === "timeout" ? { status: "timeout" } : { status: "ok" };
+\t\tif (outcome.status === "ok") try {
+\t\t\tconst latestAssistant = await readLatestAssistantOutcome({
+\t\t\t\tsessionKey: entry.childSessionKey,
+\t\t\t\tlimit: 50
+\t\t\t});
+\t\t\tif (latestAssistant.isError) outcome = {
+\t\t\t\tstatus: "error",
+\t\t\t\terror: latestAssistant.text?.trim() || waitError
+\t\t\t};
+\t\t} catch {}
+\t\tif (!runOutcomesEqual(entry.outcome, outcome)) {
+\t\t\tentry.outcome = outcome;
+\t\t\tmutated = true;
+\t\t}
+\t\tif (mutated) persistSubagentRuns();
+\t\tawait completeSubagentRun({
+\t\t\trunId,
+\t\t\tendedAt: entry.endedAt,
+\t\t\toutcome,
+\t\t\treason: outcome.status === "error" ? SUBAGENT_ENDED_REASON_ERROR : SUBAGENT_ENDED_REASON_COMPLETE,
+\t\t\tsendFarewell: true,
+\t\t\taccountId: entry.requesterOrigin?.accountId,
+\t\t\ttriggerCleanup: true
+\t\t});`;
+
+const announceGuardOld = `\t\tif (!outcome) outcome = { status: "unknown" };`;
+const announceGuardNew = `\t\tif (outcome?.status === "ok") try {
+\t\t\tconst latestAssistant = await readLatestAssistantOutcome({
+\t\t\t\tsessionKey: params.childSessionKey,
+\t\t\t\tlimit: 50
+\t\t\t});
+\t\t\tif (latestAssistant.isError) {
+\t\t\t\tif (!reply?.trim() && latestAssistant.text?.trim()) reply = latestAssistant.text;
+\t\t\t\toutcome = {
+\t\t\t\t\tstatus: "error",
+\t\t\t\t\terror: latestAssistant.text?.trim() || outcome.error
+\t\t\t\t};
+\t\t\t}
+\t\t} catch {}
+\t\tif (!outcome) outcome = { status: "unknown" };`;
+
+function main() {
+  const pkgRoot = resolveOpenClawPackageRoot();
+  const targets = [
+    path.join(pkgRoot, 'dist', 'reply-DeXK9BLT.js'),
+    path.join(pkgRoot, 'dist', 'compact-D3emcZgv.js'),
+    path.join(pkgRoot, 'dist', 'pi-embedded-CrsFdYam.js'),
+    path.join(pkgRoot, 'dist', 'pi-embedded-jHMb7qEG.js'),
+    path.join(pkgRoot, 'dist', 'plugin-sdk', 'dispatch-CJdFmoH9.js'),
+  ].filter((file) => fs.existsSync(file));
+
+  const backupRoot = path.join(os.homedir(), '.openclaw', 'workspace', 'tmp', 'openclaw-subagent-outcome-hotfix');
+  const stamp = new Date().toISOString().replace(/[:.]/g, '-');
+  const thisBackupDir = path.join(backupRoot, stamp);
+  let touched = 0;
+
+  for (const file of targets) {
+    let content = fs.readFileSync(file, 'utf8');
+    let changed = false;
+
+    for (const [label, oldText, newText] of [
+      ['extractAssistantTerminalText', extractAssistantTextOld, extractAssistantTextNew],
+      ['readLatestAssistantOutcome', readLatestAssistantReplyOld, readLatestAssistantReplyNew],
+      ['wait outcome downgrade', waitOutcomeOld, waitOutcomeNew],
+      ['announce error guard', announceGuardOld, announceGuardNew],
+    ]) {
+      const result = replaceOnce(content, oldText, newText, label, file);
+      content = result.content;
+      changed = changed || result.changed;
+    }
+
+    if (changed) {
+      ensureDir(thisBackupDir);
+      const backupPath = path.join(thisBackupDir, path.basename(file));
+      fs.copyFileSync(file, backupPath);
+      fs.writeFileSync(file, content, 'utf8');
+      touched += 1;
+      console.log(`patched ${file}`);
+    } else {
+      console.log(`already patched ${file}`);
+    }
+  }
+
+  console.log(`done; touched ${touched} file(s)`);
+  if (touched > 0) console.log(`backup: ${thisBackupDir}`);
+}
+
+main();
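
The idempotency of the script's `replaceOnce` helper depends on checking for the new text before the old text, because two of the replacement blocks embed the original block verbatim. A minimal standalone sketch of that behavior follows. `applyPatchOnce` is a hypothetical stand-in for `replaceOnce`, and the patch strings are shortened for illustration. It also passes a function replacer to `String.prototype.replace`, a small variation on the script that avoids `$`-pattern expansion if a replacement ever contains `$&` or `$1`.

```javascript
// Hypothetical stand-in for replaceOnce, reduced to the idempotency logic.
function applyPatchOnce(content, oldText, newText) {
  // Check for the new text first: the replacement may contain the old text,
  // so testing for oldText first would re-apply the patch on every run.
  if (content.includes(newText)) return { content, changed: false };
  if (!content.includes(oldText)) throw new Error("patch block not found");
  // A function replacer inserts newText literally (no `$&` / `$1` expansion).
  return { content: content.replace(oldText, () => newText), changed: true };
}

// Shortened illustrative patch: the new block ends with the old block.
const oldText = 'if (!outcome) outcome = { status: "unknown" };';
const newText = `checkAssistantError();\n${oldText}`;
const source = `start();\n${oldText}\nend();`;

const first = applyPatchOnce(source, oldText, newText);
const second = applyPatchOnce(first.content, oldText, newText);
console.log(first.changed, second.changed); // prints: true false
```

Running the patch twice therefore leaves the file unchanged the second time, which is what lets the hotfix be re-run safely after an OpenClaw reinstall, as long as the bundled file hashes still match the listed targets.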
@@ -40,6 +40,11 @@ Keep the integration narrow: let OpenClaw decide what to do, and let n8n execute
  - `assets/test-send-gmail-draft.json`
  - `assets/test-send-approved-email.json`
  - `assets/test-create-calendar-event.json`
+  - `assets/test-list-upcoming-events.json`
+  - `assets/test-update-calendar-event.json`
+  - `assets/test-delete-calendar-event.json`
+  - `assets/test-verify-email-draft-cycle.json`
+  - `assets/test-verify-calendar-event-cycle.json`

 ## Quick usage

@@ -61,6 +66,8 @@ Call the preferred action-bus route:
 scripts/call-action.sh append_log --args '{"text":"backup complete"}' --request-id auto
 scripts/call-action.sh get_logs --args '{"limit":5}' --pretty
 scripts/call-action.sh list_email_drafts --args '{"max":10}' --pretty
+scripts/call-action.sh list_upcoming_events --args '{"days":7,"max":10}' --pretty
+scripts/call-action.sh approval_queue_list --args '{"limit":5,"include_history":true}' --pretty
 ```

 Call a test webhook while editing a flow:
@@ -115,7 +122,10 @@ Use the included workflow asset when you want a ready-made local router for:
 - `list_email_drafts` → queue approval-gated Gmail draft list requests (read-only, low mutation level)
 - `delete_email_draft` → queue approval-gated Gmail draft deletion requests
 - `send_gmail_draft` (alias: `send_approved_email`) → queue approval-gated Gmail draft send requests
-- `create_calendar_event` → queue approval-gated calendar proposals in workflow static data
+- `create_calendar_event` → queue approval-gated calendar creation proposals in workflow static data
+- `list_upcoming_events` → queue approval-gated calendar event listing requests (read-only, low mutation level)
+- `update_calendar_event` → queue approval-gated calendar event update requests
+- `delete_calendar_event` → queue approval-gated calendar event deletion requests
 - `approval_queue_add` / `approval_queue_list` / `approval_queue_resolve` → manage pending approvals and recent history
 - `approval_history_attach_execution` → let a host-side executor attach real execution metadata back onto approval history entries
 - `fetch_and_normalize_url` → fetch + normalize URL content using n8n runtime HTTP helpers
@@ -144,11 +154,15 @@ Supported host-executed approval kinds:
 - `email_draft_delete` → `gog gmail drafts delete`
 - `email_draft_send` → `gog gmail drafts send`
 - `calendar_event` → `gog calendar create`
+- `calendar_list_events` → `gog calendar events`
+- `calendar_event_update` → `gog calendar update`
+- `calendar_event_delete` → `gog calendar delete`

 Practical note:
 - unattended execution needs `GOG_KEYRING_PASSWORD` available to the executor because `gog`'s file keyring cannot prompt in non-TTY automation
 - the included bridge auto-loads `/home/openclaw/.openclaw/credentials/gog.env` when present, so you can keep `GOG_ACCOUNT` and `GOG_KEYRING_PASSWORD` there with mode `600`
 - for safe plumbing tests without touching Google state, add `--dry-run`
+- approval queue/history reads now expose compact `pending_compact` / `history_compact` entries plus `summary_line` + `result_refs` for low-noise operator review

 ### Add a new webhook-backed capability

@@ -0,0 +1,8 @@
+{
+  "action": "delete_calendar_event",
+  "args": {
+    "calendar": "primary",
+    "event_id": "example-calendar-event-id",
+    "send_updates": "none"
+  }
+}
@@ -0,0 +1,9 @@
+{
+  "action": "list_upcoming_events",
+  "args": {
+    "calendar": "primary",
+    "days": 7,
+    "max": 10,
+    "query": "zap"
+  }
+}
@@ -0,0 +1,13 @@
+{
+  "action": "update_calendar_event",
+  "args": {
+    "calendar": "primary",
+    "event_id": "example-calendar-event-id",
+    "title": "Updated call with vendor",
+    "start": "2026-03-13T18:15:00Z",
+    "end": "2026-03-13T18:45:00Z",
+    "location": "Updated room",
+    "description": "Updated by OpenClaw action bus.",
+    "send_updates": "none"
+  }
+}
@@ -0,0 +1,11 @@
+{
+  "action": "create_calendar_event",
+  "request_id": "verify-calendar-event-cycle-001",
+  "args": {
+    "calendar": "primary",
+    "title": "[zap verify] Calendar event cycle smoke",
+    "start": "2030-01-15T18:00:00Z",
+    "end": "2030-01-15T18:30:00Z",
+    "description": "Recurring verification payload for the n8n Google Workspace action bus. Queue this event, approve it through the gog bridge, verify the created calendar event, then delete the event as cleanup."
+  }
+}
@@ -0,0 +1,9 @@
+{
+  "action": "send_email_draft",
+  "request_id": "verify-email-draft-cycle-001",
+  "args": {
+    "to": ["will@example.com"],
+    "subject": "[zap verify] Gmail draft cycle smoke",
+    "body_text": "Recurring verification payload for the n8n Google Workspace action bus. Queue this draft, approve it through the gog bridge, verify the created Gmail draft, then delete the draft as cleanup."
+  }
+}
@@ -20,6 +20,9 @@ It implements a real local OpenClaw → n8n router.
  - `delete_email_draft`
  - `send_gmail_draft` (alias: `send_approved_email`)
  - `create_calendar_event`
+  - `list_upcoming_events`
+  - `update_calendar_event`
+  - `delete_calendar_event`
  - `approval_queue_add`
  - `approval_queue_list`
  - `approval_queue_resolve`
@@ -60,6 +63,9 @@ Actions:
 - `delete_email_draft`
 - `send_gmail_draft` (alias: `send_approved_email`)
 - `create_calendar_event`
+- `list_upcoming_events`
+- `update_calendar_event`
+- `delete_calendar_event`

 Behavior:
 - queue proposals into workflow static data under key:
@@ -69,13 +75,41 @@ Behavior:
 - do **not** execute Gmail/Calendar side effects directly in the shipped starter workflow
 - are intended for host-side execution via the included `gog` bridge after explicit approval resolution

-Approval policy defaults:
-- `send_email_draft`, `delete_email_draft`, `send_gmail_draft` / `send_approved_email`, `create_calendar_event`
+Approval policy defaults by action family:
+- notification family
+  - `send_notification_draft`
+  - `approval.family = "notification"`
  - `approval.required = true`
  - `approval.mutation_level = "high"`
-- `list_email_drafts`
-  - `approval.required = true`
-  - `approval.mutation_level = "low"` (read-only action, still routed through approval queue for explicit operator acknowledgement + audit trail)
+  - approved items execute inline in n8n via the existing `notify` path
+- Gmail family
+  - read-only: `list_email_drafts`
+    - `approval.family = "gmail"`
+    - `approval.required = true`
+    - `approval.mutation_level = "low"`
+    - still queued so operators must explicitly acknowledge host-side Gmail reads
+  - mutating: `send_email_draft`, `delete_email_draft`, `send_gmail_draft` / `send_approved_email`
+    - `approval.family = "gmail"`
+    - `approval.required = true`
+    - `approval.mutation_level = "high"`
+- Calendar family
+  - read-only: `list_upcoming_events`
+    - `approval.family = "calendar"`
+    - `approval.required = true`
+    - `approval.mutation_level = "low"`
+  - mutating: `create_calendar_event`, `update_calendar_event`, `delete_calendar_event`
+    - `approval.family = "calendar"`
+    - `approval.required = true`
+    - `approval.mutation_level = "high"`
+- manual/generic approvals
+  - `approval_queue_add`
+  - no automatic side effect is implied; the operator decides what the queued item means
+
+Queue/history entries now also carry compact operator-facing fields for low-noise review:
+- `approvalQueue[].payload_preview`
+- `approvalQueue[].operator.summary_line`
+- `approvalHistory[].operator.execution_state`
+- `approvalHistory[].operator.result_refs`

 ### `approval_queue_resolve`

@@ -84,12 +118,20 @@ Approval policy defaults:
  - `approvalHistory`
 - supports optional notification on approval/rejection
 - executes notification drafts inline when the approved item kind is `notification`
+- returns both the full resolved item and a compact operator view at:
+  - `result.item_compact`

 ### `approval_history_attach_execution`

 - patches an existing resolved history item in `approvalHistory`
 - designed for host-side executors that run outside n8n itself
 - used by the included `scripts/resolve-approval-with-gog.py` bridge to attach Gmail/Calendar execution results
+- the updated history entry now includes low-noise operator metadata such as:
+  - `operator.summary_line`
+  - `operator.execution_state`
+  - `operator.result_refs`
+  - `execution.summary`
+- returns both the full item and `result.item_compact`

 ### `fetch_and_normalize_url`

@@ -172,6 +214,11 @@ After import, set this manually in n8n:
 - `assets/test-send-gmail-draft.json`
 - `assets/test-send-approved-email.json`
 - `assets/test-create-calendar-event.json`
+- `assets/test-list-upcoming-events.json`
+- `assets/test-update-calendar-event.json`
+- `assets/test-delete-calendar-event.json`
+- `assets/test-verify-email-draft-cycle.json`
+- `assets/test-verify-calendar-event-cycle.json`
 - `assets/test-fetch-and-normalize-url.json`
 - `assets/test-approval-queue-list.json`
 - `assets/test-inbound-event-filter.json`
@@ -190,6 +237,11 @@ scripts/call-action.sh delete_email_draft --args-file assets/test-delete-email-d
 scripts/call-action.sh send_gmail_draft --args-file assets/test-send-gmail-draft.json --pretty
 scripts/call-action.sh send_approved_email --args-file assets/test-send-approved-email.json --pretty
 scripts/call-action.sh create_calendar_event --args-file assets/test-create-calendar-event.json --pretty
+scripts/call-action.sh list_upcoming_events --args-file assets/test-list-upcoming-events.json --pretty
+scripts/call-action.sh update_calendar_event --args-file assets/test-update-calendar-event.json --pretty
+scripts/call-action.sh delete_calendar_event --args-file assets/test-delete-calendar-event.json --pretty
+scripts/call-action.sh --args-file assets/test-verify-email-draft-cycle.json --pretty
+scripts/call-action.sh --args-file assets/test-verify-calendar-event-cycle.json --pretty
 scripts/call-action.sh fetch_and_normalize_url --args '{"url":"http://192.168.153.113:18808/healthz"}' --pretty
 scripts/call-action.sh fetch_and_normalize_url --args '{"url":"https://example.com","skip_ssl_certificate_validation":true}' --pretty
 scripts/call-action.sh approval_queue_list --args '{"limit":10,"include_history":true}' --pretty
@@ -197,6 +249,76 @@ scripts/call-action.sh inbound_event_filter --args-file assets/test-inbound-even
 python3 scripts/resolve-approval-with-gog.py --id <approval-id> --decision approve --dry-run
 ```

+## Operator command reference
+
+Common approval flows:
+
+```bash
+# 1) inspect the queue with both full and compact views
+scripts/call-action.sh approval_queue_list --args '{"limit":10,"include_history":true}' --pretty
+
+# 2) reject an item without host-side execution
+scripts/call-action.sh approval_queue_resolve \
+  --args '{"id":"approval-abc123","decision":"reject","note":"not safe to run"}' \
+  --pretty
+
+# 3) approve through the host bridge, but keep it side-effect free
+python3 scripts/resolve-approval-with-gog.py --id approval-abc123 --decision approve --dry-run
+
+# 4) approve for real through the host bridge
+python3 scripts/resolve-approval-with-gog.py --id approval-abc123 --decision approve
+
+# 5) re-check recent history in compact form
+scripts/call-action.sh approval_queue_list --args '{"limit":5,"include_history":true}' --pretty
+```
+
+What to look for in low-noise history output:
+- `result.pending_compact[]` and `result.history_compact[]`
+- `summary_line` for a one-line operator digest
+- `execution_state` for `pending`, `awaiting_host_execution`, `dry_run`, `executed`, or `failed`
+- `result_refs` for durable IDs such as `draft_id`, `message_id`, or `event_id`
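
As a rough illustration of reading that compact output, the sketch below turns a parsed `approval_queue_list` response into one line per item. It assumes each compact entry carries `summary_line`, `execution_state`, and `result_refs` as flat keys alongside an `id` key; the `id` key, the flat layout, and the sample data are assumptions, not confirmed by the workflow asset.

```python
def compact_lines(response: dict) -> list:
    """Return one operator-friendly line per compact queue/history entry."""
    result = response.get("result") or {}
    lines = []
    for section in ("pending_compact", "history_compact"):
        for entry in result.get(section) or []:
            refs = entry.get("result_refs") or {}
            ref_text = ", ".join(f"{k}={v}" for k, v in sorted(refs.items()))
            line = (
                f"[{section}] {entry.get('id', '?')} "
                f"{entry.get('execution_state', '?')}: {entry.get('summary_line', '')}"
            )
            if ref_text:
                line += f" ({ref_text})"
            lines.append(line)
    return lines


# Hypothetical response, for illustration only.
sample = {
    "result": {
        "pending_compact": [
            {"id": "approval-abc123", "execution_state": "pending",
             "summary_line": "calendar event awaiting approval"},
        ],
        "history_compact": [
            {"id": "approval-def456", "execution_state": "executed",
             "summary_line": "draft created", "result_refs": {"draft_id": "r-1"}},
        ],
    }
}

for line in compact_lines(sample):
    print(line)
```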
+
+## Canned recurring verification flows
+
+### Gmail draft queue → approve → verify → cleanup
+
+1. Queue the canned payload:
+
+```bash
+scripts/call-action.sh --args-file assets/test-verify-email-draft-cycle.json --pretty
+```
+
+2. Find the new approval id from `pending_compact`.
+3. Approve with the host bridge:
+
+```bash
+python3 scripts/resolve-approval-with-gog.py --id <approval-id> --decision approve
+```
+
+4. Re-run `approval_queue_list` and confirm the matching history item shows:
+   - `execution_state = "executed"`
+   - `result_refs.draft_id` populated
+5. Clean up by queueing `assets/test-delete-email-draft.json` with the returned draft id and approving that item.
+
+### Calendar event queue → approve → verify → cleanup
+
+1. Queue the canned payload:
+
+```bash
+scripts/call-action.sh --args-file assets/test-verify-calendar-event-cycle.json --pretty
+```
+
+2. Find the new approval id from `pending_compact`, then approve with the host bridge:
+
+```bash
+python3 scripts/resolve-approval-with-gog.py --id <approval-id> --decision approve
+```
+
+3. Re-run `approval_queue_list` and confirm the matching history item shows:
+   - `execution_state = "executed"`
+   - `result_refs.event_id` populated
+4. Clean up by queueing `assets/test-delete-calendar-event.json` with the returned `event_id` and approving that item.
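
The "confirm the matching history item" step in both flows could be scripted along these lines. This is a minimal sketch: it assumes the parsed `approval_queue_list` response exposes `result.history_compact[]` entries with `id`, `execution_state`, and `result_refs` keys. The helper names, the `id` key, and the sample response are hypothetical.

```python
from typing import Optional


def find_history_item(response: dict, approval_id: str) -> Optional[dict]:
    """Locate the compact history entry for one approval id, if present."""
    for entry in (response.get("result") or {}).get("history_compact") or []:
        if entry.get("id") == approval_id:
            return entry
    return None


def verification_problems(entry: Optional[dict], ref_key: str) -> list:
    """Return a list of problems; an empty list means the check passed."""
    if entry is None:
        return ["no matching history item"]
    problems = []
    state = entry.get("execution_state")
    if state != "executed":
        problems.append(f"execution_state is {state!r}, expected 'executed'")
    if not (entry.get("result_refs") or {}).get(ref_key):
        problems.append(f"result_refs.{ref_key} is empty")
    return problems


# Hypothetical response, for illustration only.
response = {
    "result": {
        "history_compact": [
            {"id": "approval-abc123", "execution_state": "executed",
             "result_refs": {"event_id": "evt-1"}},
        ]
    }
}
entry = find_history_item(response, "approval-abc123")
print(verification_problems(entry, "event_id"))  # → []
print(verification_problems(entry, "draft_id"))  # → ['result_refs.draft_id is empty']
```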
+
 ## Expected success examples

 ### send_notification_draft
@@ -295,6 +417,56 @@ python3 scripts/resolve-approval-with-gog.py --id <approval-id> --decision appro
 }
 ```

+### list_upcoming_events
+
+```json
+{
+  "ok": true,
+  "request_id": "test-list-calendar-events-001",
+  "result": {
+    "action": "list_upcoming_events",
+    "status": "queued_for_approval",
+    "pending_id": "approval-pqr678",
+    "approval_status": "pending",
+    "approval": {
+      "policy": "approval_queue_resolve",
+      "required": true,
+      "mutation_level": "low"
+    }
+  }
+}
+```
+
+### update_calendar_event
+
+```json
+{
+  "ok": true,
+  "request_id": "test-update-calendar-event-001",
+  "result": {
+    "action": "update_calendar_event",
+    "status": "queued_for_approval",
+    "pending_id": "approval-stu901",
+    "approval_status": "pending"
+  }
+}
+```
+
+### delete_calendar_event
+
+```json
+{
+  "ok": true,
+  "request_id": "test-delete-calendar-event-001",
+  "result": {
+    "action": "delete_calendar_event",
+    "status": "queued_for_approval",
+    "pending_id": "approval-vwx234",
+    "approval_status": "pending"
+  }
+}
+```
+
 ### fetch_and_normalize_url

 ```json
@@ -339,6 +511,9 @@ Behavior:
  - `email_draft_delete` → `gog gmail drafts delete`
  - `email_draft_send` → `gog gmail drafts send`
  - `calendar_event` → `gog calendar create`
+  - `calendar_list_events` → `gog calendar events`
+  - `calendar_event_update` → `gog calendar update`
+  - `calendar_event_delete` → `gog calendar delete`
 - writes execution metadata back via `approval_history_attach_execution`

 Important automation note:
@@ -48,6 +48,22 @@ Recommended request shape:
 }
 ```

+## Approval defaults by family
+
+- notification family
+  - `send_notification_draft`
+  - `approval.family = "notification"`
+  - `approval.required = true`
+  - `approval.mutation_level = "high"`
+- Gmail family
+  - read-only: `list_email_drafts` → `approval.family = "gmail"`, `approval.mutation_level = "low"`
+  - mutating: `send_email_draft`, `delete_email_draft`, `send_gmail_draft` / `send_approved_email` → `approval.family = "gmail"`, `approval.mutation_level = "high"`
+- Calendar family
+  - read-only: `list_upcoming_events` → `approval.family = "calendar"`, `approval.mutation_level = "low"`
+  - mutating: `create_calendar_event`, `update_calendar_event`, `delete_calendar_event` → `approval.family = "calendar"`, `approval.mutation_level = "high"`
+- manual/generic approvals
+  - `approval_queue_add` leaves side effects to the operator; there is no automatic host executor for arbitrary manual kinds
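
For readers who want the defaults above in lookup form, here is a minimal Python sketch of the same table. It is not the workflow's own `makeApprovalPolicy` implementation. `APPROVAL_DEFAULTS` and `approval_policy` are hypothetical names, and `required` is set to true for every entry, which the source states for the notification and calendar actions and which is assumed here for Gmail.

```python
# action -> (family, mutation_level), transcribed from the list above.
APPROVAL_DEFAULTS = {
    "send_notification_draft": ("notification", "high"),
    "list_email_drafts": ("gmail", "low"),
    "send_email_draft": ("gmail", "high"),
    "delete_email_draft": ("gmail", "high"),
    "send_gmail_draft": ("gmail", "high"),
    "send_approved_email": ("gmail", "high"),
    "list_upcoming_events": ("calendar", "low"),
    "create_calendar_event": ("calendar", "high"),
    "update_calendar_event": ("calendar", "high"),
    "delete_calendar_event": ("calendar", "high"),
}


def approval_policy(action: str) -> dict:
    """Build the approval block for a known action; raises KeyError otherwise."""
    family, level = APPROVAL_DEFAULTS[action]
    return {"family": family, "required": True, "mutation_level": level}


print(approval_policy("list_upcoming_events"))
print(approval_policy("delete_calendar_event"))
```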
+
 ## Live actions in the shipped workflow asset

 ### `append_log`
@@ -250,6 +266,80 @@ Sink:
 - key: `approvalQueue`
 - retained entries: `200`

+### `list_upcoming_events`
+
+Request:
+
+```json
+{
+  "action": "list_upcoming_events",
+  "args": {
+    "calendar": "primary",
+    "days": 7,
+    "max": 10,
+    "query": "zap"
+  }
+}
+```
+
+Purpose:
+- queue a host-side request to list upcoming calendar events, for approval and audit
+- defaults to the next `7` days when no explicit `from`/`to` window is provided
+
+Approval policy:
+- required: `true`
+- mutation level: `low` (read-only)
+
+### `update_calendar_event`
+
+Request:
+
+```json
+{
+  "action": "update_calendar_event",
+  "args": {
+    "calendar": "primary",
+    "event_id": "example-calendar-event-id",
+    "title": "Updated call with vendor",
+    "start": "2026-03-13T18:15:00Z",
+    "end": "2026-03-13T18:45:00Z",
+    "location": "Updated room",
+    "description": "Updated by OpenClaw action bus.",
+    "send_updates": "none"
+  }
+}
+```
+
+Purpose:
+- queue an update to an existing calendar event behind explicit approval
+- requires `event_id` and at least one patch field (`title`, `start`, `end`, `location`, `description`, or `attendees`)
+
+Approval policy:
+- required: `true`
+- mutation level: `high`
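
The `update_calendar_event` requirement stated above (an `event_id` plus at least one patch field) can be expressed as a small pre-flight check. The sketch below is illustrative only: `validate_update_args` is a hypothetical helper, not part of the workflow or the bridge script.

```python
PATCH_FIELDS = ("title", "start", "end", "location", "description", "attendees")


def validate_update_args(args: dict) -> list:
    """Return validation errors for update_calendar_event args (empty if valid)."""
    errors = []
    if not str(args.get("event_id") or "").strip():
        errors.append("event_id is required")
    # send_updates and calendar do not count as patch fields.
    if not any(args.get(field) for field in PATCH_FIELDS):
        errors.append("at least one patch field is required: " + ", ".join(PATCH_FIELDS))
    return errors


print(validate_update_args({"event_id": "example-calendar-event-id", "title": "Updated call"}))
print(validate_update_args({"event_id": "example-calendar-event-id", "send_updates": "none"}))
```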
+
+### `delete_calendar_event`
+
+Request:
+
+```json
+{
+  "action": "delete_calendar_event",
+  "args": {
+    "calendar": "primary",
+    "event_id": "example-calendar-event-id",
+    "send_updates": "none"
+  }
+}
+```
+
+Purpose:
+- queue deletion of an existing calendar event behind explicit approval
+
+Approval policy:
+- required: `true`
+- mutation level: `high`
+
 ### `approval_queue_add`

 Request:
@@ -288,6 +378,9 @@ Request:
 Purpose:
 - inspect pending approval items
 - optionally include recent resolved history
+- returns both raw entries and compact operator-friendly summaries at:
+  - `result.pending_compact`
+  - `result.history_compact`

 ### `approval_queue_resolve`

@@ -331,6 +424,11 @@ Request:
 Purpose:
 - patch a resolved history item with host-side execution metadata after a real executor runs outside n8n
 - intended for bridges such as `gog`-backed Gmail/Calendar execution
+- compact execution reporting should populate or expose:
+  - `execution.summary`
+  - `execution.result_refs`
+  - `item.operator.summary_line`
+  - `item.operator.execution_state`

 ### `fetch_and_normalize_url`

@@ -199,7 +199,16 @@ def build_email_drafts_list_command(item: dict, account: str, dry_run: bool):
    return cmd


-def build_calendar_command(item: dict, account: str, dry_run: bool):
+def normalize_send_updates(value: str) -> str:
+    raw = (value or '').strip()
+    if raw == 'all':
+        return 'all'
+    if raw.lower() == 'externalonly':
+        return 'externalOnly'
+    return 'none'
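
A few sample inputs make the normalization behavior concrete. The function is copied from the hunk above so the snippet runs on its own. Note that `'all'` is matched case-sensitively while `externalOnly` is matched case-insensitively, so `'ALL'` falls through to the `'none'` default.

```python
def normalize_send_updates(value: str) -> str:
    raw = (value or '').strip()
    if raw == 'all':
        return 'all'
    if raw.lower() == 'externalonly':
        return 'externalOnly'
    return 'none'


print(normalize_send_updates('all'))             # → all
print(normalize_send_updates(' externalonly '))  # → externalOnly
print(normalize_send_updates('ALL'))             # → none
print(normalize_send_updates(None))              # → none
print(normalize_send_updates('bogus'))           # → none
```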
+
+
+def build_calendar_create_command(item: dict, account: str, dry_run: bool):
    payload = item.get('payload') or {}
    calendar = payload.get('calendar') or 'primary'
    cmd = [
@@ -210,7 +219,7 @@ def build_calendar_command(item: dict, account: str, dry_run: bool):
        '--summary', payload.get('title') or '',
        '--from', payload.get('start') or '',
        '--to', payload.get('end') or '',
-        '--send-updates', 'none',
+        '--send-updates', normalize_send_updates(payload.get('send_updates') or 'none'),
    ]
    if payload.get('description'):
        cmd.extend(['--description', payload['description']])
@@ -224,6 +233,104 @@ def build_calendar_command(item: dict, account: str, dry_run: bool):
    return cmd


+def build_calendar_list_events_command(item: dict, account: str, dry_run: bool):
+    payload = item.get('payload') or {}
+    calendar = (payload.get('calendar') or 'primary').strip() or 'primary'
+    max_results = payload.get('max')
+    if max_results is None:
+        max_results = 20
+    try:
+        max_results = max(1, min(100, int(max_results)))
+    except Exception:
+        max_results = 20
+
+    days = payload.get('days')
+    if days is None:
+        days = 7
+    try:
+        days = max(1, min(90, int(days)))
+    except Exception:
+        days = 7
+
+    cmd = [
+        'gog', 'calendar', 'events', calendar,
+        '--account', account,
+        '--json',
+        '--no-input',
+        '--max', str(max_results),
+    ]
+    from_value = (payload.get('from') or '').strip()
+    to_value = (payload.get('to') or '').strip()
+    query = (payload.get('query') or '').strip()
+    if from_value:
+        cmd.extend(['--from', from_value])
+    if to_value:
+        cmd.extend(['--to', to_value])
+    if not from_value and not to_value:
+        cmd.extend(['--days', str(days)])
+    if query:
+        cmd.extend(['--query', query])
+    if payload.get('all_pages') is True:
+        cmd.append('--all-pages')
+    if payload.get('fail_empty') is True:
+        cmd.append('--fail-empty')
+    if dry_run:
+        cmd.append('--dry-run')
+    return cmd
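
The `max` and `days` handling above repeats the same pattern: use a default when the value is missing or unparseable, otherwise clamp it into a range. One possible way to express that once is sketched below. `clamp_int` is a hypothetical helper that is not in the script, and it catches a narrower exception set than the script's `except Exception`.

```python
def clamp_int(value, default: int, low: int, high: int) -> int:
    """Parse value as an int clamped to [low, high]; fall back to default."""
    if value is None:
        return default
    try:
        return max(low, min(high, int(value)))
    except (TypeError, ValueError, OverflowError):
        return default


# Same bounds as the hunk above: max in 1..100 (default 20), days in 1..90 (default 7).
print(clamp_int(None, 20, 1, 100))   # → 20
print(clamp_int("500", 20, 1, 100))  # → 100
print(clamp_int(0, 7, 1, 90))        # → 1
print(clamp_int("abc", 7, 1, 90))    # → 7
```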
+
+
+def build_calendar_update_command(item: dict, account: str, dry_run: bool):
+    payload = item.get('payload') or {}
+    calendar = (payload.get('calendar') or 'primary').strip() or 'primary'
+    event_id = (payload.get('event_id') or payload.get('id') or '').strip()
+    if not event_id:
+        fail('calendar_event_update payload missing event_id')
+    cmd = [
+        'gog', 'calendar', 'update', calendar, event_id,
+        '--account', account,
+        '--json',
+        '--no-input',
+        '--send-updates', normalize_send_updates(payload.get('send_updates') or 'none'),
+    ]
+    if payload.get('title'):
+        cmd.extend(['--summary', payload['title']])
+    if payload.get('start'):
+        cmd.extend(['--from', payload['start']])
+    if payload.get('end'):
+        cmd.extend(['--to', payload['end']])
+    if payload.get('description'):
+        cmd.extend(['--description', payload['description']])
+    if payload.get('location'):
+        cmd.extend(['--location', payload['location']])
+    attendees = payload.get('attendees')
+    if isinstance(attendees, list):
+        cmd.extend(['--attendees', ','.join(str(x) for x in attendees if str(x).strip())])
+    elif isinstance(attendees, str) and attendees.strip():
+        cmd.extend(['--attendees', attendees.strip()])
+    if dry_run:
+        cmd.append('--dry-run')
+    return cmd
+
+
+def build_calendar_delete_command(item: dict, account: str, dry_run: bool):
+    payload = item.get('payload') or {}
+    calendar = (payload.get('calendar') or 'primary').strip() or 'primary'
+    event_id = (payload.get('event_id') or payload.get('id') or '').strip()
+    if not event_id:
+        fail('calendar_event_delete payload missing event_id')
+    cmd = [
+        'gog', 'calendar', 'delete', calendar, event_id,
+        '--account', account,
+        '--json',
+        '--no-input',
+        '--force',
+        '--send-updates', normalize_send_updates(payload.get('send_updates') or 'none'),
+    ]
+    if dry_run:
+        cmd.append('--dry-run')
+    return cmd
+
+
 def parse_json(output: str):
    text = output.strip()
    if not text:
@@ -231,6 +338,63 @@ def parse_json(output: str):
    return json.loads(text)


+def first_string(*values):
+    for value in values:
+        if isinstance(value, str) and value.strip():
+            return value.strip()
+    return ''
+
+
+def execution_result_refs(op: str, parsed):
+    refs = {}
+    if not isinstance(parsed, dict):
+        return refs
+
+    draft = parsed.get('draft') if isinstance(parsed.get('draft'), dict) else {}
+    message = parsed.get('message') if isinstance(parsed.get('message'), dict) else {}
+    event = parsed.get('event') if isinstance(parsed.get('event'), dict) else {}
+
+    draft_id = first_string(
+        parsed.get('draft_id'),
+        draft.get('id'),
+        parsed.get('id') if op.startswith('gmail.drafts.') else '',
+    )
+    if draft_id:
+        refs['draft_id'] = draft_id
+
+    message_id = first_string(parsed.get('message_id'), message.get('id'))
+    if message_id:
+        refs['message_id'] = message_id
+
+    event_id = first_string(
+        parsed.get('event_id'),
+        event.get('id'),
+        parsed.get('id') if op.startswith('calendar.') else '',
+    )
+    if event_id:
+        refs['event_id'] = event_id
+
+    calendar = first_string(parsed.get('calendar'), parsed.get('calendar_id'), event.get('calendar'))
+    if calendar:
+        refs['calendar'] = calendar
+
+    return refs
+
+
+def execution_summary(op: str, status: str, refs: dict, dry_run: bool):
+    if status == 'failed':
+        return f'{op} failed'
+    suffix = 'dry run' if dry_run else status.replace('_', ' ')
+    ref_parts = []
+    for key in ('draft_id', 'message_id', 'event_id', 'calendar'):
+        value = refs.get(key, '')
+        if value:
+            ref_parts.append(f'{key}={value}')
+    if ref_parts:
+        return f'{op} {suffix} ({", ".join(ref_parts)})'
+    return f'{op} {suffix}'
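
To show what the summary strings look like, the snippet below copies `execution_summary` from the hunk above and calls it with a few made-up inputs. The ids are placeholders, not real results.

```python
def execution_summary(op: str, status: str, refs: dict, dry_run: bool):
    if status == 'failed':
        return f'{op} failed'
    suffix = 'dry run' if dry_run else status.replace('_', ' ')
    ref_parts = []
    for key in ('draft_id', 'message_id', 'event_id', 'calendar'):
        value = refs.get(key, '')
        if value:
            ref_parts.append(f'{key}={value}')
    if ref_parts:
        return f'{op} {suffix} ({", ".join(ref_parts)})'
    return f'{op} {suffix}'


print(execution_summary('calendar.create', 'event_created', {'event_id': 'evt123'}, False))
# → calendar.create event created (event_id=evt123)
print(execution_summary('calendar.delete', 'event_deleted', {}, True))
# → calendar.delete dry run
print(execution_summary('calendar.update', 'failed', {}, False))
# → calendar.update failed
```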
+
+
 def main():
    ap = argparse.ArgumentParser(description='Resolve an n8n approval item and execute email/calendar actions via gog.')
    ap.add_argument('--id', required=True, help='Approval queue item id')
@@ -299,11 +463,29 @@ def main():
            'uses_tmpfile': False,
        },
        'calendar_event': {
-            'builder': build_calendar_command,
+            'builder': build_calendar_create_command,
            'op': 'calendar.create',
            'success_status': 'event_created',
            'uses_tmpfile': False,
        },
+        'calendar_list_events': {
+            'builder': build_calendar_list_events_command,
+            'op': 'calendar.events.list',
+            'success_status': 'events_listed',
+            'uses_tmpfile': False,
+        },
+        'calendar_event_update': {
+            'builder': build_calendar_update_command,
+            'op': 'calendar.update',
+            'success_status': 'event_updated',
+            'uses_tmpfile': False,
+        },
+        'calendar_event_delete': {
+            'builder': build_calendar_delete_command,
+            'op': 'calendar.delete',
+            'success_status': 'event_deleted',
+            'uses_tmpfile': False,
+        },
    }

    spec = executors.get(kind)
@@ -334,6 +516,7 @@ def main():
                pass

    if code != 0:
+        refs = {}
        execution = {
            'driver': 'gog',
            'op': op,
@@ -342,18 +525,23 @@ def main():
            'dry_run': args.dry_run,
            'stderr': stderr.strip(),
            'stdout': stdout.strip(),
+            'result_refs': refs,
+            'summary': execution_summary(op, 'failed', refs, args.dry_run),
        }
        attach = attach_execution(item['id'], execution, base_url=args.base_url, path=args.path, secret_header=args.secret_header, secret=secret)
        print(json.dumps({'resolved': resolved, 'execution': execution, 'attach': attach}, indent=2))
        raise SystemExit(code)

    parsed = parse_json(stdout) if stdout.strip() else None
+    refs = execution_result_refs(op, parsed)
    execution = {
        'driver': 'gog',
        'op': op,
        'status': success_status,
        'account': account,
        'dry_run': args.dry_run,
+        'result_refs': refs,
+        'summary': execution_summary(op, success_status, refs, args.dry_run),
        'result': parsed,
    }
    attach = attach_execution(item['id'], execution, base_url=args.base_url, path=args.path, secret_header=args.secret_header, secret=secret)
@@ -31,6 +31,11 @@ SAMPLE_FILES = [
    'test-send-gmail-draft.json',
    'test-send-approved-email.json',
    'test-create-calendar-event.json',
+    'test-list-upcoming-events.json',
+    'test-update-calendar-event.json',
+    'test-delete-calendar-event.json',
+    'test-verify-email-draft-cycle.json',
+    'test-verify-calendar-event-cycle.json',
    'test-fetch-and-normalize-url.json',
    'test-approval-queue-list.json',
    'test-inbound-event-filter.json',
@@ -47,6 +52,9 @@ ROUTER_SNIPPETS = [
    'send_gmail_draft',
    'send_approved_email',
    'create_calendar_event',
+    'list_upcoming_events',
+    'update_calendar_event',
+    'delete_calendar_event',
    'approval_queue_add',
    'approval_queue_list',
    'approval_queue_resolve',
@@ -61,7 +69,15 @@ ROUTER_SNIPPETS = [
    'email_draft_send',
    'email_draft_delete',
    'email_list_drafts',
+    'calendar_list_events',
+    'calendar_event_update',
+    'calendar_event_delete',
    'makeApprovalPolicy',
+    'pending_compact',
+    'history_compact',
+    'summary_line',
+    'result_refs',
+    'default_mode',
    'inboundEvents',
    'eventDedup',
    'notify_text',
Author	SHA1	Message	Date
zap	6341bd9fb0	chore(scripts): add openclaw subagent outcome hotfix script	2026-03-17 01:01:10 +00:00
zap	d08d8fe661	docs(reliability): record acpx follow-up evidence	2026-03-13 20:10:40 +00:00
zap	3cfa7a158c	docs(wip): tighten subagent reliability baton	2026-03-13 19:34:26 +00:00
zap	8998e7535e	docs(reliability): record live agent.wait fix verification	2026-03-13 19:24:14 +00:00
zap	f2b99841af	docs(reliability): record live agent.wait blocker evidence	2026-03-13 18:56:52 +00:00
zap	0c25426974	docs(memory): note preference for more frequent subagent checks	2026-03-13 18:49:17 +00:00
zap	59101f674f	docs(reliability): record agent wait fix diagnosis	2026-03-13 18:34:18 +00:00
zap	95135eb5f1	docs(subagents): record live gpt-5.4 failure verification	2026-03-13 18:12:11 +00:00
zap	08c1981faa	docs(wip): record success probe and next failure-path pass	2026-03-13 16:40:06 +00:00
zap	5dbbc30834	docs(memory): reinforce fresh subagent implementation preference	2026-03-13 16:37:04 +00:00
zap	49ff0998e7	docs(tasks): record subagent reliability fix progress	2026-03-13 00:26:55 +00:00
zap	8983f45d4e	docs(state): correct glm-5 entitlement note	2026-03-13 00:25:59 +00:00
zap	7669d5787d	docs(agent): clarify stuck-subagent monitoring	2026-03-13 00:21:17 +00:00
zap	3bb3888340	docs(state): seed subagent reliability investigation	2026-03-13 00:12:43 +00:00
zap	841365e020	chore(git): ignore local tmp artifacts	2026-03-13 00:01:19 +00:00
zap	bfb73cf80f	docs(tools): add Gitea repo credentials reference	2026-03-12 23:11:06 +00:00
zap	6cfb1da179	docs: close Drive/Docs/Sheets evaluation and complete Google Workspace tasks	2026-03-12 22:42:55 +00:00
zap	e6b9844097	docs(state): close n8n wip and seed drive docs sheets plan	2026-03-12 21:39:34 +00:00
zap	00cdbbb654	docs(agent): add concrete subagent monitoring thresholds	2026-03-12 21:39:30 +00:00
zap	50c020d5bc	chore(state): refresh n8n handoff checkpoints	2026-03-12 21:25:36 +00:00
zap	ffe7a6bad6	docs(n8n-webhook): add operator approval runbook	2026-03-12 21:24:41 +00:00
zap	249e671971	feat(n8n-webhook): add compact approval history views	2026-03-12 21:24:32 +00:00
zap	060da7ea1d	docs(agent): add stuck-subagent intervention rule	2026-03-12 21:23:25 +00:00
zap	2a46de2287	docs(state): record calendar pass completion evidence	2026-03-12 21:11:28 +00:00
zap	4d89f02664	feat(n8n-webhook): add calendar list update delete approval flows	2026-03-12 21:11:22 +00:00
zap	c7d1432cd5	docs(wip): pin remaining implementation passes to gpt-5.4	2026-03-12 20:59:14 +00:00