# Session: 2026-03-03 18:00:45 UTC
- **Session Key**: agent:main:main
- **Session ID**: fe1b7054-41fb-462f-8177-f4e8998f988b
- **Source**: gateway:sessions.reset

## Conversation Summary

assistant: [[reply_to_current]] Done — updated.
### What’s now in `litellm-config.yaml`

- **Gemini** was already present:
  - `gemini-1.5-pro`
  - `gemini-1.5-flash`
- **z.ai** added:
  - `zai-glm-4.7`
  - `zai-glm-4.7-flash`
  - both using:
    - `api_base: https://api.z.ai/api/coding/paas/v4`
    - `api_key: os.environ/ZAI_API_KEY`

If you want, I can also add newer Gemini aliases (e.g. 2.x) and/or switch the Gemini key variable from `GOOGLE_API_KEY` to `GEMINI_API_KEY` if that’s what your environment uses.
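As a sketch only, the z.ai entries described above would look roughly like this, shown as the equivalent Python structure of LiteLLM's `model_list`. The `openai/` upstream prefix is an assumption based on z.ai's OpenAI-compatible coding endpoint; the actual YAML on the host may differ.

```python
# Hypothetical reconstruction of the z.ai entries in litellm-config.yaml,
# expressed as the Python structure LiteLLM loads from `model_list`.
ZAI_BASE = "https://api.z.ai/api/coding/paas/v4"

model_list = [
    {
        "model_name": alias,  # the alias clients request through the proxy
        "litellm_params": {
            # Assumption: z.ai's coding endpoint is OpenAI-compatible,
            # so the upstream model is addressed via the openai/ prefix.
            "model": f"openai/{alias.removeprefix('zai-')}",
            "api_base": ZAI_BASE,
            # LiteLLM resolves os.environ/ references from the container env.
            "api_key": "os.environ/ZAI_API_KEY",
        },
    }
    for alias in ("zai-glm-4.7", "zai-glm-4.7-flash")
]
print([m["model_name"] for m in model_list])  # ['zai-glm-4.7', 'zai-glm-4.7-flash']
```

The same two-entry shape, serialized to YAML, is what the proxy reads at startup.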

user: [Tue 2026-03-03 09:37 UTC] Update the file accordingly based on the currently configured providers

assistant: [[reply_to_current]] Done — I updated `litellm-config.yaml` to match the providers currently configured on this host.

### Kept (configured)

- OpenAI-compatible local endpoint (`OPENAI_BASE_URL` + `OPENAI_API_KEY`)
- z.ai (`ZAI_API_KEY`, base `https://api.z.ai/api/coding/paas/v4`):
  - `glm-5`
  - `glm-4.7`
  - `glm-4.7-flash`
  - `glm-4.7-flashx`
### Removed (not currently configured)

- Anthropic
- OpenRouter
- Opencode
- Gemini

If you want Gemini included too, I can add it back with `GEMINI_API_KEY` entries as soon as you confirm that key/env is set.
user: [Tue 2026-03-03 09:42 UTC] We have litellm fixed. Can you test it?

assistant: [[reply_to_current]] I tested from here — current status:

- `http://192.168.153.113:18805/v1/models` responds, but returns TTS models (`kokoro`, `tts-1`, etc.), not your LiteLLM model list.
- `http://192.168.153.113:18805/v1/chat/completions` returns `404 Not Found`.
- No local LiteLLM listener appears to be running on this host (`127.0.0.1:4000/8000` not reachable).

So LiteLLM itself is likely on a different port/host (or not started yet).

If you share the exact LiteLLM base URL (e.g. `http://<host>:<port>/v1`), I’ll run a real end-to-end test immediately.
user: [Tue 2026-03-03 09:43 UTC]

| NAME | IMAGE | COMMAND | SERVICE | CREATED | STATUS | PORTS |
| --- | --- | --- | --- | --- | --- | --- |
| brave-search | mcp/brave-search:latest | "node dist/index.js" | brave-search | 2 hours ago | Up 2 hours | 0.0.0.0:18802->8000/tcp, [::]:18802->8000/tcp |
| kokoro-tts | ghcr.io/remsky/kokoro-fastapi-cpu:latest | "./entrypoint.sh" | kokoro-tts | 2 hours ago | Up 2 hours | 0.0.0.0:18805->8880/tcp, [::]:18805->8880/tcp |
| litellm | litellm/litellm:latest | "docker/prod_entrypo…" | litellm | 2 hours ago | Up 3 minutes (unhealthy) | 0.0.0.0:18804->4000/tcp, [::]:18804->4000/tcp |
| searxng | searxng/searxng:latest | "/usr/local/searxng/…" | searxng | 2 hours ago | Up 2 hours | 0.0.0.0:18803->8080/tcp, [::]:18803->8080/tcp |
| whisper-server | ghcr.io/ggml-org/whisper.cpp:main | "whisper-server --mo…" | whisper-server | 2 hours ago | Up 2 hours (healthy) | 0.0.0.0:18801->8080/tcp, [::]:18801->8080/tcp |
assistant: [[reply_to_current]] Yep — tested. Good news/bad news:

✅ **LiteLLM is reachable** at `:18804` and serving model aliases:

- `gpt-4o`
- `gpt-4-turbo`
- `zai-glm-5`
- `zai-glm-4.7`
- `zai-glm-4.7-flash`
- `zai-glm-4.7-flashx`

❌ **Actual completions fail** (upstream auth):

- `zai-glm-4.7-flash` returns `Authentication parameter not received in Header`
- The health endpoint shows all configured backends unhealthy due to missing upstream API keys.

So your LiteLLM config is loaded, but the container environment is missing provider keys (`ZAI_API_KEY`, plus `OPENAI_API_KEY` if the OpenAI route goes upstream directly).

If you want, I can give you a minimal `docker-compose` env block to make it green in one restart.
user: [Tue 2026-03-03 10:09 UTC] Can you try to hit the litellm API? Here is a virtual key: sk-IlDTV8FaW0vTUL_a4VDhMw

assistant: [[reply_to_current]] Tested with your virtual key.

- `GET /v1/models` ✅ works (returns your configured aliases)
- `POST /v1/chat/completions` ❌ still fails:
  - `zai-glm-4.7-flash`: upstream auth missing
  - `gpt-4o`: upstream OpenAI API key missing

So your **virtual key is valid for LiteLLM**, but LiteLLM itself still lacks provider credentials in its container environment.

You need to set (at least):

- `ZAI_API_KEY` (for z.ai models)
- `OPENAI_API_KEY` (for gpt-4o / gpt-4-turbo)

Then restart the litellm container and re-test.
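For reference, the probes above use the OpenAI-style API that the LiteLLM proxy exposes: the virtual key goes in a standard `Authorization: Bearer` header. A sketch of the request shape (key below is a placeholder, not the real virtual key):

```python
# Build a LiteLLM /v1/chat/completions request with a virtual key.
import json
import urllib.request

def chat_request(base_url: str, virtual_key: str, model: str, prompt: str) -> urllib.request.Request:
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {virtual_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = chat_request("http://192.168.153.113:18804/v1", "sk-example", "zai-glm-4.7-flash", "Say OK")
print(req.full_url)  # http://192.168.153.113:18804/v1/chat/completions
```

Sending it with `urllib.request.urlopen(req)` would reproduce the probe end to end once the proxy is reachable.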
user: [Tue 2026-03-03 10:41 UTC] ok test now

assistant: [[reply_to_current]] Yep — much better now. I re-tested and it’s working.

### Results

- `/v1/models` ✅ returns models (now includes Claude/Gemini/z.ai aliases too).
- `gpt-4o` chat completion ✅ returned `OK`.
- `zai-glm-4.7-flash` chat completion ✅ works too (returned `OK` when given enough tokens).
- `/health` now shows partial recovery: **7 healthy / 4 unhealthy**.

So LiteLLM is up and functional with your virtual key.

If you want, I can run one quick probe to list exactly which 4 models are still unhealthy.
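That follow-up probe could summarize the `/health` response like this. The response shape (lists of `healthy_endpoints`/`unhealthy_endpoints` with per-model entries) is assumed from LiteLLM's health endpoint conventions; the sample data below is illustrative, not the actual output.

```python
# Given a parsed /health response, list the models that are still unhealthy.
def unhealthy_models(health: dict) -> list[str]:
    return sorted({e.get("model", "?") for e in health.get("unhealthy_endpoints", [])})

sample = {  # illustrative shape only
    "healthy_endpoints": [{"model": "openai/gpt-4o"}],
    "unhealthy_endpoints": [
        {"model": "gemini/gemini-1.5-pro", "error": "missing key"},
        {"model": "openai/gpt-4-turbo", "error": "timeout"},
    ],
    "healthy_count": 1,
    "unhealthy_count": 2,
}
print(unhealthy_models(sample))  # ['gemini/gemini-1.5-pro', 'openai/gpt-4-turbo']
```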
user: [Tue 2026-03-03 11:07 UTC] Test litellm with gpt-5-mini

assistant: [[reply_to_current]] Tested.

`gpt-5-mini` currently fails on your LiteLLM with:

> no healthy deployments for this model (`model=gpt-5-mini`)

So it isn’t mapped/configured (or healthy) in your current LiteLLM setup yet.
user: [Tue 2026-03-03 11:08 UTC] Test litellm with openai/gpt-5-mini

assistant: [[reply_to_current]] Tested `openai/gpt-5-mini`.

Result: ❌ fails with:

`OpenAIException - Unknown parameter: 'credential_name'`

So this route is being attempted, but there’s a provider-side/request-shape mismatch in the current LiteLLM/OpenAI config for that model.