swarm-zap/memory/incidents/2026-03-04-request-timed-out-before-a-res.md

# Session: 2026-03-04 20:49:36 UTC

- **Session Key**: agent:main:main
- **Session ID**: 16b9385c-9aa8-4f4e-9c0a-0653f9e3b004
- **Source**: gateway:sessions.reset

## Conversation Summary

user: Sender (untrusted metadata):
```json
{
  "label": "openclaw-tui (gateway-client)",
  "id": "gateway-client",
  "name": "openclaw-tui",
  "username": "openclaw-tui"
}
```

[Wed 2026-03-04 20:48 UTC] Can we investigate why litellm (LAN - docker) doesn't work via the regular auth and agent models, when it works via `curl`?
user: You are running a boot check. Follow BOOT.md instructions exactly.

BOOT.md:
# BOOT.md

On gateway startup, run a silent security + readiness check for this workspace.

## Startup checklist

1. Confirm core workspace files exist:
   - `AGENTS.md`, `SOUL.md`, `USER.md`, `TOOLS.md`, `HEARTBEAT.md`
2. Confirm required local skill folders exist:
   - `skills/searxng-local-search`
   - `skills/telegram-ops`
   - `skills/homelab-service-health`
   - `skills/task-capture-and-followup`
   - `skills/daily-brief`
   - `skills/calendar-sentinel`
   - `skills/inbox-triage`
3. Ensure task state exists:
   - `memory/tasks.json` (create `[]` if missing)
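
Step 3 above can be sketched in Python as follows. This is a minimal illustration, not OpenClaw's own implementation: the helper name `ensure_task_state` and the `workspace` argument are hypothetical, and only the `memory/tasks.json` path and the `[]` default come from the checklist.

```python
import json
from pathlib import Path


def ensure_task_state(workspace: Path) -> bool:
    """Create memory/tasks.json holding an empty JSON array if it is missing.

    Returns True when the file was created, False when it already existed.
    (Hypothetical helper; only the path and the `[]` default are from BOOT.md.)
    """
    tasks_file = workspace / "memory" / "tasks.json"
    if tasks_file.exists():
        return False  # never overwrite existing task state
    tasks_file.parent.mkdir(parents=True, exist_ok=True)
    tasks_file.write_text(json.dumps([]) + "\n", encoding="utf-8")
    return True
```

Calling it a second time is a no-op, so it is safe to run on every gateway startup.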

## Security checks

4. Check internal hooks enabled state:
   - `session-memory`
   - `command-logger`
   - `bootstrap-extra-files`
   - `boot-md`
   - if any are disabled/missing, record warning with exact hook name
5. Check permissions:
   - `~/.openclaw` should be `700`
   - credential files under `~/.openclaw/credentials/` should be `600`
   - if drift is found, log it as warning; do not auto-fix unless explicitly requested later
6. Check network exposure:
   - verify gateway is not unexpectedly exposed on `0.0.0.0`
   - capture a compact listener snapshot for OpenClaw-related ports
7. Check expected-service posture:
   - expected local services should be present/reachable (OpenClaw runtime + documented core services)
   - flag unexpected listeners only when confidence is high
8. Check backup signal freshness:
   - verify `memory/minio-backup.log` contains `Backup complete:` within last 8 hours
9. Check security-audit freshness:
   - verify a recent `openclaw security audit --deep` result exists (target: within 24 hours)
   - if stale/missing, record reminder (warning level)
10. Check update status:
    - run/read `openclaw update status`
    - record whether update is available
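
The permission check in step 5 could look roughly like the sketch below. It assumes a POSIX filesystem and reads "credential files under `~/.openclaw/credentials/`" as every regular file beneath that directory, recursively. The function name `find_permission_drift` is hypothetical. In line with the checklist, it only reports drift and never changes modes.

```python
import stat
from pathlib import Path


def find_permission_drift(root: Path) -> list[str]:
    """Return one warning string per path whose mode differs from the expected one.

    Expected modes follow BOOT.md: 700 for the root directory, 600 for
    credential files. Report-only: nothing is auto-fixed.
    """
    expected = [(root, 0o700)]
    cred_dir = root / "credentials"
    if cred_dir.is_dir():
        # Assumption: "credential files" means all regular files, recursively.
        expected += [(p, 0o600) for p in sorted(cred_dir.rglob("*")) if p.is_file()]

    warnings = []
    for path, want in expected:
        if not path.exists():
            warnings.append(f"{path}: missing")
            continue
        mode = stat.S_IMODE(path.stat().st_mode)  # keep permission bits only
        if mode != want:
            warnings.append(f"{path}: mode {mode:o}, expected {want:o}")
    return warnings
```

A typical call would be `find_permission_drift(Path.home() / ".openclaw")`, with an empty list meaning no drift was found.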

## State recording

11. Write/update machine-readable status file: `memory/startup-health.json` with:
    - `last_run_utc`
    - `status` (`ok|warn|critical`)
    - `checks_passed` (array)
    - `checks_failed` (array)
    - `warnings` (array)
    - `gateway_exposure` (e.g., `local-only|public|unknown`)
    - `last_backup_age_hours`
    - `last_security_audit_age_hours`
    - `update_status` (short text)
12. Write/update `memory/boot-last-run.json` with UTC timestamp + overall status.
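
Steps 11 and 12 can be sketched as a single writer function. The field names for `startup-health.json` are taken from the list above. Everything else is an assumption made for illustration: the function name `write_startup_health`, the rule for deriving `status`, the timestamp format, and the two key names used in `boot-last-run.json`.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_startup_health(memory_dir: Path, checks_passed, checks_failed, warnings,
                         critical=False, gateway_exposure="unknown",
                         last_backup_age_hours=None,
                         last_security_audit_age_hours=None,
                         update_status="unknown") -> dict:
    """Write startup-health.json and boot-last-run.json; return the health record."""
    # Assumed rule: the caller decides what is critical; any failure or
    # warning otherwise downgrades the status to "warn".
    if critical:
        status = "critical"
    elif checks_failed or warnings:
        status = "warn"
    else:
        status = "ok"

    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    health = {
        "last_run_utc": now,
        "status": status,
        "checks_passed": list(checks_passed),
        "checks_failed": list(checks_failed),
        "warnings": list(warnings),
        "gateway_exposure": gateway_exposure,
        "last_backup_age_hours": last_backup_age_hours,
        "last_security_audit_age_hours": last_security_audit_age_hours,
        "update_status": update_status,
    }
    memory_dir.mkdir(parents=True, exist_ok=True)
    (memory_dir / "startup-health.json").write_text(
        json.dumps(health, indent=2) + "\n", encoding="utf-8")
    # Key names here are assumed; BOOT.md only asks for a timestamp and a status.
    (memory_dir / "boot-last-run.json").write_text(
        json.dumps({"last_run_utc": now, "status": status}, indent=2) + "\n",
        encoding="utf-8")
    return health
```

Unknown ages are written as JSON `null` rather than a guessed number, so a consumer can tell "not measured" apart from "zero hours".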

## Notification policy

13. Stay silent when status is `ok` or non-actionable `warn`.
14. Send one short proactive alert only for **critical** conditions:
    - credential permission drift on sensitive files,
    - unexpected public exposure of gateway,
    - backup signal stale/missing beyond threshold,
    - missing critical workspace files preventing normal operation.

## Critical issue logging

If any warning/critical issue is found, append a concise line to `memory/startup-health.md` with UTC timestamp, failing check, and suggested fix.
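
One possible shape for that log line is sketched below. BOOT.md only requires a UTC timestamp, the failing check, and a suggested fix, so the exact line layout, the `level` tag, and the helper name `log_issue` are assumptions.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_issue(memory_dir: Path, level: str, check: str, fix: str) -> str:
    """Append one concise line to startup-health.md and return it."""
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    line = f"- {ts} [{level}] {check}: {fix}"  # assumed layout
    memory_dir.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier entries intact.
    with (memory_dir / "startup-health.md").open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line
```

For example, `log_issue(Path("memory"), "warn", "security-audit freshness", "run openclaw security audit --deep")` would add a single bullet for a stale audit.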

If BOOT.md asks you to send a message, use the message tool (action=send with channel + target).
Use the `target` field (not `to`) for message tool destinations.
After sending with the message tool, reply with ONLY: NO_REPLY.
If nothing needs attention, reply with ONLY: NO_REPLY.