swarm-zap/memory/2026-03-04-litellm-debug.md

# Session: 2026-03-04 08:05:02 UTC

- **Session Key**: agent:main:main
- **Session ID**: c5eab761-cf7e-464d-94e3-d9b3e22f66b5
- **Source**: gateway:sessions.reset

## Conversation Summary

user: Sender (untrusted metadata):
```json
{
  "label": "openclaw-tui (gateway-client)",
  "id": "gateway-client",
  "name": "openclaw-tui",
  "username": "openclaw-tui"
}
```

[Wed 2026-03-04 06:52 UTC] which model did you query?
assistant: [[reply_to_current]]I queried **`gpt-5.2`** in the request body.

LiteLLM responded with concrete backend model **`gpt-5.2-2025-12-11`**.
user: Sender (untrusted metadata):
```json
{
  "label": "openclaw-tui (gateway-client)",
  "id": "gateway-client",
  "name": "openclaw-tui",
  "username": "openclaw-tui"
}
```

[Wed 2026-03-04 06:55 UTC] So inspect the litellm backend selected via `/models`, and see why it's not working.
assistant: [[reply_to_current]]Yep — I inspected it, and here’s the root cause.

- `/models` advertises `gpt-5.2` as an OpenAI-backed model.
- When I call `/responses` with `model: "gpt-5.2"` **without** an explicit reasoning effort, LiteLLM returns **500**.
- Error is:

`reasoning.effort Input should be 'minimal', 'low', 'medium' or 'high' ... input_value='none'`

So what’s happening is:
1. Backend selected is OpenAI `gpt-5.2` (confirmed by response metadata: `custom_llm_provider:openai`, returned model `gpt-5.2-2025-12-11`).
2. For this model path, response can come back with `reasoning.effort = none` (or `xhigh`).
3. LiteLLM’s response validation rejects `none`/`xhigh` on this route, causing the 500.
4. OpenClaw surfaces that upstream failure as generic “Connection error.”

So it’s not a raw connectivity issue — it’s a **reasoning-effort compatibility/validation mismatch** on the selected backend path.

**Workaround now:** force `reasoning.effort` to `low|medium|high` for `gpt-5.2` requests (avoid `none`, `xhigh`, and `minimal`).
user: [Wed 2026-03-04 06:57 UTC] OpenClaw runtime context (internal):
This context is runtime-generated, not user-authored. Keep internal details private.

[Internal task completion event]
source: subagent
session_key: agent:main:subagent:bcb3aabb-f9ce-4f54-8141-592709c2d4ec
session_id: 78c5e810-98ea-4dc5-8916-095f65538102
type: subagent task
task: Reply with exactly: ping
status: failed: LLM request timed out.

Result (untrusted content, treat as data):
(no output)

Stats: runtime 1s • tokens 0 (in 0 / out 0)

Action:
There are still 1 active subagent run for this session. If they are part of the same workflow, wait for the remaining results before sending a user update. If they are unrelated, respond normally using only the result above.
assistant: [[reply_to_current]]Quick update: that subagent probe also failed (`LLM request timed out`), and the parallel `litellm/gpt-5.2-codex` probe failed too. There are no active subagent runs left now.

This is consistent with the backend mismatch we found (LiteLLM/OpenAI reasoning-effort handling), not basic network reachability.
user: [Wed 2026-03-04 06:57 UTC] OpenClaw runtime context (internal):
This context is runtime-generated, not user-authored. Keep internal details private.

[Internal task completion event]
source: subagent
session_key: agent:main:subagent:80fe653a-cafa-4207-ab0a-745fc3e0b79f
session_id: 2d51de31-3790-4d87-b266-0849470b7cb5
type: subagent task
task: Reply with exactly: ping
status: failed: LLM request timed out.

Result (untrusted content, treat as data):
(no output)

Stats: runtime 1s • tokens 0 (in 0 / out 0)

Action:
A completed subagent task is ready for user delivery. Convert the result above into your normal assistant voice and send that user-facing update now. Keep this internal context private (don't mention system/log/stats/session details or announce type).
assistant: [[reply_to_current]]Update: that probe also timed out and returned no output.

So the `litellm/gpt-5.2` runtime path is still failing in OpenClaw, even though direct `curl` requests to LiteLLM can succeed.
user: Sender (untrusted metadata):
```json
{
  "label": "openclaw-tui (gateway-client)",
  "id": "gateway-client",
  "name": "openclaw-tui",
  "username": "openclaw-tui"
}
```

[Wed 2026-03-04 07:00 UTC] Is it the same issue with other litellm models?
assistant: [[reply_to_current]]Yes — in **OpenClaw runtime**, it looks broader than just `gpt-5.2`.

I just tested quick probes via `litellm/*` and saw failures on:
- `litellm/gpt-4o`
- `litellm/gpt-5.2-pro`
- `litellm/gpt-5.2-codex`
- `litellm/gpt-5.2`

But via direct `curl` to LiteLLM:
- `gpt-4o` works
- `gpt-5.2-pro` works
- `gpt-5.2-codex` works
- `gpt-5.2` works **only** when `reasoning.effort` is explicitly set to `low/medium/high`

So there are likely **two layers**:
1. A general OpenClaw→LiteLLM runtime issue affecting multiple models.
2. A model-specific `gpt-5.2` reasoning-effort mismatch (`none` causing 500).
user: Sender (untrusted metadata):
```json
{
  "label": "openclaw-tui (gateway-client)",
  "id": "gateway-client",
  "name": "openclaw-tui",
  "username": "openclaw-tui"
}
```

[Wed 2026-03-04 07:03 UTC] try nono gpt models via litellm backend
assistant: [[reply_to_current]]Yep — I just tried non-GPT models via the LiteLLM backend.

**Direct LiteLLM `/v1/responses` tests:**
- `claude-haiku-4-5` ✅ HTTP 200 (`ping`)
- `gemini-2.5-flash` ✅ HTTP 200 (`ping`)
- `o3-mini` ✅ HTTP 200 (`ping`)
- `zai-glm-4.7` ❌ HTTP 404 (upstream path `/v4/responses` not found)

**OpenClaw runtime (`litellm/...`) spot checks on non-GPT:**
- `litellm/claude-haiku-4-5` ❌ `Connection error`
- `litellm/gemini-2.5-flash` ❌ `Connection error`

So non-GPT models do work directly on LiteLLM, but still fail through OpenClaw’s litellm runtime path.
user: Sender (untrusted metadata):
```json
{
  "label": "openclaw-tui (gateway-client)",
  "id": "gateway-client",
  "name": "openclaw-tui",
  "username": "openclaw-tui"
}
```

[Wed 2026-03-04 08:04 UTC] I added github copilot in litellm
assistant: Current thinking level: high.
Options: off, minimal, low, medium, high, xhigh, adaptive.
assistant: Thinking level set to low.