diff --git a/TOOLS.md b/TOOLS.md
index 7a87087..75839c6 100644
--- a/TOOLS.md
+++ b/TOOLS.md
@@ -59,10 +59,30 @@ Skills define *how* tools work. This file is for *your* specifics — the stuff
 - **K8s Tools:** k9s, kubectl, argocd CLI, krew, kubecolor
 - **Containers:** Docker, Podman, Distrobox
 
-### Local AI
-- **Ollama:** ✅ running
-- **llama-swap:** ✅
-- **Models:** Qwen3-4b, Gemma3-4b
+### Local AI (llama-swap)
+- **Endpoint:** `http://127.0.0.1:8080`
+- **Service:** `systemctl --user status llama-swap`
+- **Config:** `~/.config/llama-swap/config.yaml`
+- **GPU:** RTX 5070 Ti (12GB VRAM)
+
+**Available Models:**
+| Alias | Model | Notes |
+|-------|-------|-------|
+| `qwen3` | Qwen3-30B-A3B | General purpose MoE, 8k ctx |
+| `coder` | Qwen3-Coder-30B-A3B | Code specialist MoE |
+| `glm` | GLM-4.7-Flash | Fast reasoning |
+| `gemma` | Gemma-3-12B | Balanced, fits fully |
+| `reasoning` | Ministral-3-14B-Reasoning | Reasoning specialist |
+| `gpt-oss` | GPT-OSS-20B | Experimental |
+
+**Usage:**
+```bash
+curl http://127.0.0.1:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"model": "gemma", "messages": [{"role": "user", "content": "Hello"}]}'
+```
+
+**Web UI:** http://127.0.0.1:8080/ui
 
 ---
 