- Installed hooks: boot-md, session-memory, command-logger, bootstrap-extra-files
- Updated default model to litellm/copilot-claude-sonnet-4.6
- Added memory/startup-health.json + boot-last-run.json from boot check
- Added memory/startup-health.md warning log
- Captured new session memory files (2026-03-04/05)
- Minor MEMORY.md + USER.md: added git preference line
Session: 2026-03-03 18:00:45 UTC
- Session Key: agent:main:main
- Session ID: fe1b7054-41fb-462f-8177-f4e8998f988b
- Source: gateway:sessions.reset
Conversation Summary
assistant: reply_to_current Done — updated.
What’s now in litellm-config.yaml
- Gemini was already present:
  - `gemini-1.5-pro`
  - `gemini-1.5-flash`
- z.ai added:
  - `zai-glm-4.7`
  - `zai-glm-4.7-flash`
- both using:
  - `api_base: https://api.z.ai/api/coding/paas/v4`
  - `api_key: os.environ/ZAI_API_KEY`
If you want, I can also add newer Gemini aliases (e.g. 2.x) and/or switch Gemini key var from GOOGLE_API_KEY to GEMINI_API_KEY if that’s what your env uses.
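For reference, the z.ai entries described above would look roughly like this in litellm-config.yaml (a sketch, not your exact file; the `openai/`-prefixed upstream model names are assumptions, since z.ai exposes an OpenAI-compatible endpoint):

```yaml
model_list:
  - model_name: zai-glm-4.7
    litellm_params:
      model: openai/glm-4.7          # routed via the OpenAI-compatible client
      api_base: https://api.z.ai/api/coding/paas/v4
      api_key: os.environ/ZAI_API_KEY
  - model_name: zai-glm-4.7-flash
    litellm_params:
      model: openai/glm-4.7-flash
      api_base: https://api.z.ai/api/coding/paas/v4
      api_key: os.environ/ZAI_API_KEY
```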
user: [Tue 2026-03-03 09:37 UTC] Update the file accordingly based on the currently configured providers
assistant: reply_to_current Done — I updated litellm-config.yaml to match the providers currently configured on this host.
Kept (configured)
- OpenAI-compatible local endpoint (`OPENAI_BASE_URL` + `OPENAI_API_KEY`)
- z.ai (`ZAI_API_KEY`, base `https://api.z.ai/api/coding/paas/v4`): `glm-5`, `glm-4.7`, `glm-4.7-flash`, `glm-4.7-flashx`
Removed (not currently configured)
- Anthropic
- OpenRouter
- Opencode
- Gemini
If you want Gemini included too, I can add it back with GEMINI_API_KEY entries as soon as you confirm that key/env is set.
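A sketch of what the re-added Gemini block might look like once `GEMINI_API_KEY` is confirmed (the two aliases are examples taken from the earlier config, not a definitive list):

```yaml
model_list:
  - model_name: gemini-1.5-pro
    litellm_params:
      model: gemini/gemini-1.5-pro
      api_key: os.environ/GEMINI_API_KEY
  - model_name: gemini-1.5-flash
    litellm_params:
      model: gemini/gemini-1.5-flash
      api_key: os.environ/GEMINI_API_KEY
```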
user: [Tue 2026-03-03 09:42 UTC] We have litellm fixed. Can you test it?
assistant: reply_to_current I tested from here — current status:
- `http://192.168.153.113:18805/v1/models` responds, but returns TTS models (`kokoro`, `tts-1`, etc.), not your LiteLLM model list.
- `http://192.168.153.113:18805/v1/chat/completions` returns `404 Not Found`.
- No local LiteLLM listener appears to be running on this host (`127.0.0.1:4000`/`8000` not reachable).
So LiteLLM itself is likely on a different port/host (or not started yet).
If you share the exact LiteLLM base URL (e.g. http://<host>:<port>/v1), I’ll run a real end-to-end test immediately.
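The kind of probe described above can be sketched with the standard library alone; the base URL below is a placeholder to be replaced with the real LiteLLM address, and the function simply returns `None` when nothing answers:

```python
import json
import urllib.error
import urllib.request

def probe_models(base_url, api_key=None, timeout=5.0):
    """Return the model IDs served by an OpenAI-compatible /v1/models
    endpoint, or None if the endpoint is unreachable or returns bad JSON."""
    req = urllib.request.Request(f"{base_url.rstrip('/')}/v1/models")
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None  # connection refused, timeout, or non-JSON body
    return [m["id"] for m in payload.get("data", [])]

# Placeholder base URL; swap in the actual LiteLLM host:port once known.
print(probe_models("http://127.0.0.1:18804", timeout=1.0))
```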
user: [Tue 2026-03-03 09:43 UTC]
```
NAME             IMAGE                                      COMMAND                   SERVICE          CREATED       STATUS                     PORTS
brave-search     mcp/brave-search:latest                    "node dist/index.js"      brave-search     2 hours ago   Up 2 hours                 0.0.0.0:18802->8000/tcp, [::]:18802->8000/tcp
kokoro-tts       ghcr.io/remsky/kokoro-fastapi-cpu:latest   "./entrypoint.sh"         kokoro-tts       2 hours ago   Up 2 hours                 0.0.0.0:18805->8880/tcp, [::]:18805->8880/tcp
litellm          litellm/litellm:latest                     "docker/prod_entrypo…"    litellm          2 hours ago   Up 3 minutes (unhealthy)   0.0.0.0:18804->4000/tcp, [::]:18804->4000/tcp
searxng          searxng/searxng:latest                     "/usr/local/searxng/…"    searxng          2 hours ago   Up 2 hours                 0.0.0.0:18803->8080/tcp, [::]:18803->8080/tcp
whisper-server   ghcr.io/ggml-org/whisper.cpp:main          "whisper-server --mo…"    whisper-server   2 hours ago   Up 2 hours (healthy)       0.0.0.0:18801->8080/tcp, [::]:18801->8080/tcp
```
assistant: reply_to_current Yep — tested. Good news/bad news:
✅ LiteLLM is reachable at :18804 and serving model aliases:
- `gpt-4o`
- `gpt-4-turbo`
- `zai-glm-5`
- `zai-glm-4.7`
- `zai-glm-4.7-flash`
- `zai-glm-4.7-flashx`
❌ Actual completions fail (auth upstream):
- `zai-glm-4.7-flash` returns `Authentication parameter not received in Header`
- Health endpoint shows all configured backends unhealthy for missing upstream API keys.
So your LiteLLM config is loaded, but the container env is missing provider keys (`ZAI_API_KEY`, and for the OpenAI route, `OPENAI_API_KEY` if used upstream directly).
If you want, I can give you a minimal docker-compose env block to make it green in one restart.
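A minimal sketch of that compose env block (service name and port mapping taken from the `docker ps` output above; the key values are assumed to live in the host env or a `.env` file):

```yaml
services:
  litellm:
    image: litellm/litellm:latest
    ports:
      - "18804:4000"
    environment:
      - ZAI_API_KEY=${ZAI_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
```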
user: [Tue 2026-03-03 10:09 UTC] Can you try to hit the litellm API? Here is a virtual key: sk-IlDTV8FaW0vTUL_a4VDhMw
assistant: reply_to_current Tested with your virtual key.
- `GET /v1/models` ✅ works (returns your configured aliases)
- `POST /v1/chat/completions` ❌ still fails:
  - `zai-glm-4.7-flash`: upstream auth missing
  - `gpt-4o`: upstream OpenAI API key missing
So your virtual key is valid for LiteLLM, but LiteLLM itself still lacks provider creds in its container env.
You need to set (at least):
- `ZAI_API_KEY` (for z.ai models)
- `OPENAI_API_KEY` (for `gpt-4o` / `gpt-4-turbo`)
Then restart the litellm container and re-test.
user: [Tue 2026-03-03 10:41 UTC] ok test now
assistant: reply_to_current Yep — much better now. I re-tested and it’s working.
Results
- `/v1/models` ✅ returns models (now includes Claude/Gemini/z.ai aliases too).
- `gpt-4o` chat completion ✅ returned `OK`.
- `zai-glm-4.7-flash` chat completion ✅ works too (returned `OK` when given enough tokens).
- `/health` now shows partial recovery: 7 healthy / 4 unhealthy.
So LiteLLM is up and functional with your virtual key.
If you want, I can run one quick probe to list exactly which 4 models are still unhealthy.
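That probe can be sketched as a small parser over the `/health` response. The payload shape assumed here (`unhealthy_endpoints` entries carrying a `model` and `error` field, plus healthy/unhealthy counts) matches what LiteLLM's health endpoint typically returns, but the sample data below is made up to mirror the 7-healthy / 4-unhealthy state, not real output:

```python
def unhealthy_models(health_payload):
    """List the model names a LiteLLM /health response reports as unhealthy."""
    return sorted({ep.get("model", "<unknown>")
                   for ep in health_payload.get("unhealthy_endpoints", [])})

# Hypothetical payload for illustration only:
sample = {
    "healthy_count": 7,
    "unhealthy_count": 4,
    "unhealthy_endpoints": [
        {"model": "openai/gpt-4o", "error": "missing OPENAI_API_KEY"},
        {"model": "gemini/gemini-1.5-pro", "error": "missing GEMINI_API_KEY"},
    ],
}
print(unhealthy_models(sample))  # ['gemini/gemini-1.5-pro', 'openai/gpt-4o']
```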
user: [Tue 2026-03-03 11:07 UTC] Test litellm with gpt-5-mini
assistant: reply_to_current Tested.
gpt-5-mini currently fails on your LiteLLM with:
`no healthy deployments for this model (model=gpt-5-mini)`
So it isn’t mapped/configured (or healthy) in your current LiteLLM setup yet.
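If the intent is to expose gpt-5-mini, an alias along these lines would need to exist in litellm-config.yaml (a sketch only; it assumes the upstream OpenAI-compatible provider actually serves a model by that name):

```yaml
model_list:
  - model_name: gpt-5-mini
    litellm_params:
      model: openai/gpt-5-mini
      api_key: os.environ/OPENAI_API_KEY
```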
user: [Tue 2026-03-03 11:08 UTC] Test litellm with openai/gpt-5-mini
assistant: reply_to_current Tested openai/gpt-5-mini.
Result: ❌ fails with:
`OpenAIException - Unknown parameter: 'credential_name'`
So this route is being attempted, but there’s a provider-side/request-shape mismatch in the current LiteLLM/OpenAI config for that model.