feat(external-llm): standardize tiers and optimize model selection

- Rename tiers: opus/sonnet/haiku → frontier/mid-tier/lightweight
- Align with industry benchmarks (MMLU, GPQA, Chatbot Arena)
- Add /external command for LLM mode control
- Fix invoke.py timeout passthrough (now 600s default)

Tier changes:
- Promote gemini-2.5-pro to frontier (benchmark-validated)
- Demote glm-4.7 to mid-tier, then remove it entirely (unreliable)
- Promote gemini-2.5-flash to mid-tier

New models added:
- gpt-5-mini, gpt-5-nano (GPT family coverage)
- grok-code (Grok/X family)
- glm-4.5-air (lightweight GLM)

Removed (redundant/unreliable):
- o3 (not available)
- glm-4.7 (timeouts)
- gpt-4o, big-pickle, glm-4.5-flash (redundant)

Final: 11 models across 3 tiers, 5 model families (GPT, Claude, Gemini, Grok, GLM)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
OpenCode Test
2026-01-12 03:30:51 -08:00
parent ff111ef278
commit f63172c4cf
7 changed files with 325 additions and 37 deletions


@@ -28,6 +28,7 @@ Slash commands for quick actions. User-invoked (type `/command` to trigger).
 | `/programmer` | | Code development tasks |
 | `/gcal` | `/calendar`, `/cal` | Google Calendar access |
 | `/usage` | `/stats` | View usage statistics |
+| `/external` | `/llm`, `/ext` | Toggle and use external LLM mode |

 ### Kubernetes (`/k8s:*`)

commands/external.md Normal file

@@ -0,0 +1,89 @@
---
name: external
description: Toggle and use external LLM mode (GPT-5.2, Gemini, etc.)
aliases: [llm, ext, external-llm]
---
# External LLM Mode
Route requests to external LLMs via opencode or gemini CLI.
## Usage
```
/external # Show current status
/external on [reason] # Enable external mode
/external off # Disable external mode
/external invoke <prompt> # Send prompt to default model
/external invoke --model <model> <prompt> # Send to specific model
/external invoke --task <task> <prompt> # Route by task type
/external models # List available models
```
## Implementation
### Status
```bash
~/.claude/mcp/llm-router/toggle.py status
```
### Toggle On/Off
```bash
~/.claude/mcp/llm-router/toggle.py on --reason "reason"
~/.claude/mcp/llm-router/toggle.py off
```
### Invoke
```bash
~/.claude/mcp/llm-router/invoke.py --model MODEL -p "prompt" [--json]
~/.claude/mcp/llm-router/invoke.py --task TASK -p "prompt" [--json]
```
## Available Models by Tier
### Frontier (strongest)
| Model | Provider | Best For |
|-------|----------|----------|
| `github-copilot/gpt-5.2` | opencode | reasoning, fallback |
| `github-copilot/gemini-3-pro-preview` | opencode | long context, reasoning |
| `gemini/gemini-2.5-pro` | gemini | long context, reasoning |
### Mid-tier (general purpose)
| Model | Provider | Best For |
|-------|----------|----------|
| `github-copilot/claude-sonnet-4.5` | opencode | general, fallback |
| `github-copilot/gemini-3-flash-preview` | opencode | fast, general |
| `github-copilot/gpt-5-mini` | opencode | fast, general |
| `opencode/grok-code` | opencode | code generation |
| `gemini/gemini-2.5-flash` | gemini | fast, general |
### Lightweight (simple tasks)
| Model | Provider | Best For |
|-------|----------|----------|
| `github-copilot/claude-haiku-4.5` | opencode | simple tasks |
| `opencode/gpt-5-nano` | opencode | fast, simple |
| `zai-coding-plan/glm-4.5-air` | opencode | simple, fast |
## Task Routing
| Task | Routes To | Tier |
|------|-----------|------|
| `reasoning` | github-copilot/gpt-5.2 | frontier |
| `code-generation` | github-copilot/gemini-3-pro-preview | frontier |
| `long-context` | gemini/gemini-2.5-pro | frontier |
| `fast` | github-copilot/gemini-3-flash-preview | mid-tier |
| `general` (default) | github-copilot/claude-sonnet-4.5 | mid-tier |
## State Files
- Mode state: `~/.claude/state/external-mode.json`
- Model policy: `~/.claude/state/model-policy.json`
## Examples
```
/external on testing # Enable for testing
/external invoke "Explain k8s pods" # Use default model (mid-tier)
/external invoke --model github-copilot/gpt-5.2 "Complex analysis" # frontier
/external invoke --task code-generation "Write a Python function" # routes to frontier
/external invoke --task fast "Quick question" # routes to mid-tier
/external off # Back to Claude
```


@@ -38,7 +38,7 @@ def resolve_model(args: argparse.Namespace, policy: dict) -> str:
     return policy.get("task_routing", {}).get("default", "copilot/sonnet-4.5")

-def invoke(model: str, prompt: str, policy: dict) -> str:
+def invoke(model: str, prompt: str, policy: dict, timeout: int = 600) -> str:
     """Invoke the appropriate provider for the given model."""
     external_models = policy.get("external_models", {})

@@ -53,11 +53,11 @@ def invoke(model: str, prompt: str, policy: dict) -> str:
     if cli == "opencode":
         sys.path.insert(0, str(ROUTER_DIR))
         from providers.opencode import invoke as opencode_invoke
-        return opencode_invoke(cli_args, prompt)
+        return opencode_invoke(cli_args, prompt, timeout=timeout)
     elif cli == "gemini":
         sys.path.insert(0, str(ROUTER_DIR))
         from providers.gemini import invoke as gemini_invoke
-        return gemini_invoke(cli_args, prompt)
+        return gemini_invoke(cli_args, prompt, timeout=timeout)
     else:
         raise ValueError(f"Unknown CLI: {cli}")

@@ -88,8 +88,8 @@ def main():
     parser.add_argument(
         "--timeout",
         type=int,
-        default=300,
-        help="Timeout in seconds (default: 300)"
+        default=600,
+        help="Timeout in seconds (default: 600)"
     )
     args = parser.parse_args()

@@ -97,7 +97,7 @@ def main():
     try:
         policy = load_policy()
         model = resolve_model(args, policy)
-        result = invoke(model, args.prompt, policy)
+        result = invoke(model, args.prompt, policy, timeout=args.timeout)
         if args.json:
             output = {
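The diff threads the `--timeout` flag from `main()` through `invoke()` into each provider module. A provider-side sketch of how the timeout might reach the subprocess (hypothetical; the real `providers/opencode.py` is not shown in this commit, and the `binary` parameter exists only for illustration):

```python
import subprocess

def invoke(cli_args, prompt, timeout=600, binary="opencode"):
    """Run the external CLI and capture stdout, enforcing a wall-clock timeout."""
    cmd = [binary, "run", *cli_args, prompt]
    # subprocess.run raises TimeoutExpired if the CLI exceeds `timeout` seconds
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    result.check_returncode()
    return result.stdout.strip()
```

With `timeout=timeout` passed through, a hung external model now fails after the configured 600s instead of an unrelated provider-local default.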


@@ -199,6 +199,15 @@
     ],
     "invokes": "skill:usage"
   },
+  "/external": {
+    "description": "Toggle and use external LLM mode (GPT-5.2, Gemini, etc.)",
+    "aliases": [
+      "/llm",
+      "/ext",
+      "/external-llm"
+    ],
+    "invokes": "command:external"
+  },
   "/README": {
     "description": "TODO",
     "aliases": [],


@@ -1 +1,171 @@
-{"infra":{"cluster":"k0s","nodes":3,"arch":"arm64"},"svc":{"gitops":"argocd","mon":"prometheus","alerts":"alertmanager"},"net":{},"hw":{"pi5_8gb":2,"pi3_1gb":1}}
+{
"infra": {
"cluster": "k0s",
"nodes": 3,
"arch": "arm64",
"storage": "longhorn",
"storage_class": "longhorn",
"backup": "longhorn-backup + minio-to-mega"
},
"hw": {
"pi5_8gb": 2,
"pi3_1gb": 1,
"roles": {
"control_plane": "pi5",
"workers": ["pi5", "pi3"]
}
},
"net": {
"metallb_pool": "192.168.153.240-192.168.153.254",
"ingress_nginx_ip": "192.168.153.240",
"ingress_haproxy_ip": "192.168.153.241",
"tailnet": "taildb3494.ts.net",
"dns_pattern": "<app>.<ns>.<ip>.nip.io"
},
"svc": {
"gitops": "argocd",
"monitoring": {
"metrics": "kube-prometheus-stack",
"logs": "loki-stack",
"alerts": "alertmanager",
"dashboards": "grafana"
},
"ingress": ["nginx-ingress-controller", "haproxy-ingress"],
"storage": ["longhorn", "local-path-storage", "minio"],
"networking": ["metallb", "tailscale-operator"]
},
"apps": {
"ai_stack": {
"namespace": "ai-stack",
"components": ["open-webui", "ollama", "litellm", "searxng", "n8n", "vllm"],
"models": ["gpt-oss:120b", "qwen3-coder"],
"ollama_host": "100.85.116.57:11434"
},
"home": ["home-assistant", "pihole", "plex"],
"infra": ["gitea", "docker-registry", "kubernetes-dashboard"],
"other": ["ghost", "tor-controller", "speedtest-tracker"]
},
"namespaces": [
"ai-stack", "argocd", "monitoring", "loki-system", "longhorn-system",
"metallb-system", "minio", "nginx-ingress-controller", "tailscale-operator",
"gitea", "home-assistant", "pihole", "pihole2", "plex", "ghost",
"kubernetes-dashboard", "docker-registry", "k8s-agent", "tools", "vpa"
],
"urls": {
"grafana": "grafana.monitoring.192.168.153.240.nip.io",
"longhorn": "ui.longhorn-system.192.168.153.240.nip.io",
"open_webui": "oi.ai-stack.192.168.153.240.nip.io",
"searxng": "sx.ai-stack.192.168.153.240.nip.io",
"n8n": "n8n.ai-stack.192.168.153.240.nip.io",
"minio_console": "console.minio.192.168.153.240.nip.io",
"pihole": "pihole.192.168.153.240.nip.io",
"k8s_dashboard": "dashboard.kubernetes-dashboard.192.168.153.240.nip.io",
"home_assistant": "ha.home-assistant.192.168.153.241.nip.io",
"plex": "player.plex.192.168.153.246.nip.io"
},
"external_llm": {
"description": "Route requests to external LLMs via opencode or gemini CLI",
"state_file": "~/.claude/state/external-mode.json",
"router_dir": "~/.claude/mcp/llm-router/",
"commands": {
"toggle_on": "~/.claude/mcp/llm-router/toggle.py on --reason 'reason'",
"toggle_off": "~/.claude/mcp/llm-router/toggle.py off",
"status": "~/.claude/mcp/llm-router/toggle.py status",
"invoke": "~/.claude/mcp/llm-router/invoke.py --model MODEL -p 'prompt'"
},
"providers": ["opencode", "gemini"],
"tiers": {
"frontier": ["github-copilot/gpt-5.2", "github-copilot/gemini-3-pro-preview", "gemini/gemini-2.5-pro"],
"mid-tier": ["github-copilot/gpt-5-mini", "github-copilot/claude-sonnet-4.5", "github-copilot/gemini-3-flash-preview", "opencode/grok-code", "gemini/gemini-2.5-flash"],
"lightweight": ["opencode/gpt-5-nano", "zai-coding-plan/glm-4.5-air", "github-copilot/claude-haiku-4.5"]
},
"task_routing": {
"reasoning": "github-copilot/gpt-5.2",
"code-generation": "github-copilot/gemini-3-pro-preview",
"long-context": "gemini/gemini-2.5-pro",
"fast": "github-copilot/gemini-3-flash-preview",
"default": "github-copilot/claude-sonnet-4.5"
},
"notes": {
"opencode_path": "/home/linuxbrew/.linuxbrew/bin/opencode (NOT /usr/bin/opencode which crashes)",
"o3_removed": "github-copilot/o3 not available via GitHub Copilot"
}
},
"workstation": {
"hostname": "willlaptop",
"ip": "192.168.153.117",
"os": "Arch Linux",
"desktop": "GNOME",
"shell": "fish",
"terminal": ["ghostty", "alacritty", "gnome-console"],
"network": "systemd-networkd + iwd",
"theme": "Dracula",
"editors": ["vscode", "zed", "vim"],
"browsers": ["firefox", "chromium", "google-chrome", "zen-browser", "epiphany"],
"virtualization": ["docker", "podman", "distrobox", "virt-manager", "virtualbox", "gnome-boxes"],
"k8s_tools": ["k9s", "k0s-bin", "k0sctl-bin", "argocd", "krew", "kubecolor"],
"dev_langs": ["go", "rust", "python", "typescript", "zig", "bun", "node/npm/pnpm"],
"ai_local": {
"ollama": true,
"llama_swap": true,
"models": ["Qwen3-4b", "Gemma3-4b"]
},
"backup": ["restic", "timeshift", "btrbk", "chezmoi"],
"dotfiles": "chezmoi"
},
"repos": {
"willlaptop": {
"path": "~/Code/active/devops/willlaptop",
"remote": "git@gitea-gitea-ssh.taildb3494.ts.net:will/willlaptop.git",
"purpose": "Workstation provisioning and config",
"structure": {
"ansible/": "Machine provisioning playbooks",
"ansible/roles/common/": "Hostname, network, users, SSH config",
"ansible/roles/packages/": "Package installation (pacman, AUR, flatpak, appimage)",
"ansible/roles/packages/files/": "Package lists (pkglist.txt, aur_pkglist.txt, etc)",
"docker/": "Local Docker stacks",
"scripts/": "Utility scripts (backup, sync, networking)",
"MCP/": "MCP server configs",
"local_ollama/": "Local Ollama data"
},
"ansible_tags": ["network", "wifi", "ethernet", "users", "sshd", "pacman", "aur", "flatpak", "appimage"],
"docker_stacks": ["file_browser", "minio-longhorn-backup", "rancher-cleanup"],
"scripts": ["bridge-up.sh", "chezmoi-sync.sh", "curl-s3.sh", "kvm-bridge-setup.sh",
"rclone-sync.sh", "restic-backup.sh", "restic-clean.sh"]
},
"homelab": {
"path": "~/Code/active/devops/homelab/homelab",
"remote": "git@github.com:will666/homelab.git",
"symlink": "~/.claude/repos/homelab",
"structure": {
"ansible/": "Ansible playbooks and templates for node provisioning",
"argocd/": "ArgoCD Application manifests (one per service)",
"charts/": "Helm values and raw manifests per service",
"charts/<svc>/values.yaml": "Helm chart values override",
"charts/<svc>/manifests/": "Raw K8s manifests (non-Helm resources)",
"docker/": "Docker Compose stacks for non-K8s workloads"
},
"charts": [
"ai-stack", "argocd", "argo-workflow", "cdi-operator",
"cloudflare-tunnel-ingress-controller", "docker-registry", "ghost",
"gitea", "haproxy-ingress", "harbor", "home-assistant", "k0s-backup",
"k8s-agent-dashboard", "kube-prometheus-stack", "kubernetes-dashboard",
"kubevirt", "local-path-storage", "loki-stack", "longhorn",
"longhorn-backup", "metallb", "minio", "minio-to-mega-backup",
"nfs-server-longhorn", "nginx-ingress-controller", "pihole", "pihole2",
"plex", "speedtest-tracker", "squareffect", "squareserver",
"tailscale-operator", "tools", "tor-controller", "traefik-ingress-controller",
"willlaptop-backup", "willlaptop-monitoring", "wills-portal"
],
"docker_stacks": [
"protonvpn-proxy", "squareffect", "squareserver", "stable-diffusion-webui"
],
"conventions": {
"argocd_app": "argocd/<name>.yaml points to charts/<name>/",
"helm_values": "charts/<name>/values.yaml for Helm overrides",
"raw_manifests": "charts/<name>/manifests/ for non-Helm K8s resources",
"naming": "ArgoCD app name = namespace name (usually)"
}
}
}
}


@@ -119,72 +119,79 @@
     "cli": "opencode",
     "cli_args": ["-m", "github-copilot/gpt-5.2"],
     "use_cases": ["reasoning", "fallback"],
-    "tier": "opus-equivalent"
+    "tier": "frontier"
   },
   "github-copilot/claude-sonnet-4.5": {
     "cli": "opencode",
     "cli_args": ["-m", "github-copilot/claude-sonnet-4.5"],
     "use_cases": ["general", "fallback"],
-    "tier": "sonnet-equivalent"
+    "tier": "mid-tier"
   },
   "github-copilot/claude-haiku-4.5": {
     "cli": "opencode",
     "cli_args": ["-m", "github-copilot/claude-haiku-4.5"],
     "use_cases": ["simple"],
-    "tier": "haiku-equivalent"
+    "tier": "lightweight"
   },
-  "zai-coding-plan/glm-4.7": {
-    "cli": "opencode",
-    "cli_args": ["-m", "zai-coding-plan/glm-4.7"],
-    "use_cases": ["code-generation"],
-    "tier": "opus-equivalent"
-  },
   "github-copilot/gemini-3-pro-preview": {
     "cli": "opencode",
     "cli_args": ["-m", "github-copilot/gemini-3-pro-preview"],
     "use_cases": ["long-context", "reasoning"],
-    "tier": "opus-equivalent"
+    "tier": "frontier"
   },
   "github-copilot/gemini-3-flash-preview": {
     "cli": "opencode",
     "cli_args": ["-m", "github-copilot/gemini-3-flash-preview"],
     "use_cases": ["fast", "general"],
-    "tier": "sonnet-equivalent"
+    "tier": "mid-tier"
   },
-  "github-copilot/o3": {
-    "cli": "opencode",
-    "cli_args": ["-m", "github-copilot/o3"],
-    "use_cases": ["complex-reasoning"],
-    "tier": "sonnet-equivalent"
-  },
-  "opencode/big-pickle": {
-    "cli": "opencode",
-    "cli_args": ["-m", "opencode/big-pickle"],
-    "use_cases": ["general"],
-    "tier": "sonnet-equivalent"
-  },
   "gemini/gemini-2.5-pro": {
     "cli": "gemini",
     "cli_args": ["-m", "gemini-2.5-pro"],
     "use_cases": ["long-context", "reasoning"],
-    "tier": "sonnet-equivalent"
+    "tier": "frontier"
   },
   "gemini/gemini-2.5-flash": {
     "cli": "gemini",
     "cli_args": ["-m", "gemini-2.5-flash"],
     "use_cases": ["fast", "general"],
-    "tier": "haiku-equivalent"
-  }
+    "tier": "mid-tier"
+  },
+  "github-copilot/gpt-5-mini": {
+    "cli": "opencode",
+    "cli_args": ["-m", "github-copilot/gpt-5-mini"],
+    "use_cases": ["fast", "general"],
+    "tier": "mid-tier"
+  },
+  "opencode/gpt-5-nano": {
+    "cli": "opencode",
+    "cli_args": ["-m", "opencode/gpt-5-nano"],
+    "use_cases": ["fast", "simple"],
+    "tier": "lightweight"
+  },
+  "zai-coding-plan/glm-4.5-air": {
+    "cli": "opencode",
+    "cli_args": ["-m", "zai-coding-plan/glm-4.5-air"],
+    "use_cases": ["simple", "fast"],
+    "tier": "lightweight"
+  },
+  "opencode/grok-code": {
+    "cli": "opencode",
+    "cli_args": ["-m", "opencode/grok-code"],
+    "use_cases": ["code-generation", "general"],
+    "tier": "mid-tier"
+  }
 },
-"claude_to_external_map": {
-  "opus": "github-copilot/gpt-5.2",
-  "sonnet": "github-copilot/claude-sonnet-4.5",
-  "haiku": "github-copilot/claude-haiku-4.5"
+"tier_to_external_map": {
+  "frontier": "github-copilot/gpt-5.2",
+  "mid-tier": "github-copilot/gpt-5-mini",
+  "lightweight": "opencode/gpt-5-nano"
 },
 "task_routing": {
   "reasoning": "github-copilot/gpt-5.2",
-  "code-generation": "zai-coding-plan/glm-4.7",
+  "code-generation": "github-copilot/gemini-3-pro-preview",
   "long-context": "gemini/gemini-2.5-pro",
+  "fast": "github-copilot/gemini-3-flash-preview",
   "default": "github-copilot/claude-sonnet-4.5"
 }
}
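With `claude_to_external_map` renamed to `tier_to_external_map`, resolving a tier name to its default model becomes a guarded lookup. A sketch, assuming the policy dict has the shape shown in the diff above:

```python
# Assumed shape of the relevant model-policy.json section
POLICY = {
    "tier_to_external_map": {
        "frontier": "github-copilot/gpt-5.2",
        "mid-tier": "github-copilot/gpt-5-mini",
        "lightweight": "opencode/gpt-5-nano",
    }
}

def model_for_tier(policy: dict, tier: str) -> str:
    """Map a tier name to its default external model, rejecting legacy names."""
    mapping = policy.get("tier_to_external_map", {})
    if tier not in mapping:
        # Old names like "opus"/"sonnet"/"haiku" now fail loudly
        raise ValueError(f"unknown tier {tier!r}; expected one of {sorted(mapping)}")
    return mapping[tier]
```

Failing loudly on the old opus/sonnet/haiku names surfaces any caller that missed the rename, instead of silently routing to a stale model.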


@@ -20,6 +20,18 @@
     "status": "active",
     "added": "2026-01-04"
   },
+  {
+    "id": "f6a7b8c9-0123-45fa-1234-666666666666",
+    "instruction": "After reinstalling gmail-mcp package, run ~/.claude/patches/apply-gmail-auth-patch.sh to restore auto re-auth on token expiry.",
+    "status": "active",
+    "added": "2026-01-09"
+  },
+  {
+    "id": "a7b8c9d0-1234-56ab-2345-777777777777",
+    "instruction": "Homelab repo is at ~/Code/active/devops/homelab/homelab (canonical). ~/.claude/repos/homelab is a symlink to it. Always use the canonical path for new work.",
+    "status": "active",
+    "added": "2026-01-09"
+  },
   {
     "id": "b2c3d4e5-6789-01bc-def0-222222222222",
     "instruction": "Git workflow: See CLAUDE.md for full process. Use rebase merges, not merge commits.",
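These memory entries are a plain JSON array, so filtering to the currently active instructions is a one-liner. A sketch, assuming each entry carries `instruction` and `status` fields as shown:

```python
import json

def active_instructions(raw: str) -> list[str]:
    """Parse the memory file and return only instructions still marked active."""
    return [e["instruction"] for e in json.loads(raw) if e.get("status") == "active"]
```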