---
name: external
description: Toggle and use external LLM mode (GPT-5.2, Gemini, etc.)
aliases: [llm, ext, external-llm]
---

# External LLM Mode

Route requests to external LLMs via opencode or gemini CLI.

## Usage

```
/external                                   # Show current status
/external on [reason]                       # Enable external mode
/external off                               # Disable external mode
/external invoke <prompt>                   # Send prompt to default model
/external invoke --model <model> <prompt>   # Send to specific model
/external invoke --task <task> <prompt>     # Route by task type
/external models                            # List available models
```

## Implementation

### Status

```bash
~/.claude/mcp/llm-router/toggle.py status
```

### Toggle On/Off

```bash
~/.claude/mcp/llm-router/toggle.py on --reason "reason"
~/.claude/mcp/llm-router/toggle.py off
```

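Under the hood, `on`/`off` presumably just rewrite the mode state file listed under State Files below. A hedged sketch of what that write might look like; the field names (`enabled`, `reason`, `updated_at`) are assumptions, not toggle.py's documented schema:

```python
import json
import time
from pathlib import Path

STATE = Path.home() / ".claude/state/external-mode.json"

def set_external_mode(enabled: bool, reason: str = "") -> None:
    # Assumed schema: {"enabled": bool, "reason": str, "updated_at": float}.
    # Adjust to whatever toggle.py actually writes.
    STATE.parent.mkdir(parents=True, exist_ok=True)
    STATE.write_text(json.dumps({
        "enabled": enabled,
        "reason": reason,
        "updated_at": time.time(),
    }, indent=2))
```
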
### Invoke

```bash
~/.claude/mcp/llm-router/invoke.py --model MODEL -p "prompt" [--json]
~/.claude/mcp/llm-router/invoke.py --task TASK -p "prompt" [--json]
```

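To call invoke.py from other scripts rather than through `/external`, a thin `subprocess` wrapper works. A minimal sketch: the `--model`/`-p`/`--json` flags are the ones shown above, but the JSON output shape (a `response` key) and the generous timeout are assumptions:

```python
import json
import subprocess
from pathlib import Path

INVOKE = Path.home() / ".claude/mcp/llm-router/invoke.py"

def invoke(prompt: str, model: str, timeout: int = 600) -> str:
    # External models can be slow, hence the generous default timeout.
    result = subprocess.run(
        ["python3", str(INVOKE), "--model", model, "-p", prompt, "--json"],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    # The "response" key is hypothetical; fall back to raw stdout.
    return json.loads(result.stdout).get("response", result.stdout)
```
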
## Available Models by Tier

### Frontier (strongest)

| Model | Provider | Best For |
|-------|----------|----------|
| `github-copilot/gpt-5.2` | opencode | reasoning, fallback |
| `github-copilot/gemini-3-pro-preview` | opencode | long context, reasoning |
| `gemini/gemini-2.5-pro` | gemini | long context, reasoning |

### Mid-tier (general purpose)

| Model | Provider | Best For |
|-------|----------|----------|
| `github-copilot/claude-sonnet-4.5` | opencode | general, fallback |
| `github-copilot/gemini-3-flash-preview` | opencode | fast |
| `zai-coding-plan/glm-4.7` | opencode | code generation |
| `opencode/big-pickle` | opencode | general |
| `gemini/gemini-2.5-flash` | gemini | fast |

### Lightweight (simple tasks)

| Model | Provider | Best For |
|-------|----------|----------|
| `github-copilot/claude-haiku-4.5` | opencode | simple tasks |

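For tooling that needs the tiers programmatically, they can be kept as a plain mapping. This sketch only mirrors the tables above; the `MODEL_TIERS` name and the idea that the router consumes such a structure are illustrative, not invoke.py's actual internals:

```python
# Hypothetical registry mirroring the tier tables above.
MODEL_TIERS: dict[str, list[str]] = {
    "frontier": [
        "github-copilot/gpt-5.2",
        "github-copilot/gemini-3-pro-preview",
        "gemini/gemini-2.5-pro",
    ],
    "mid-tier": [
        "github-copilot/claude-sonnet-4.5",
        "github-copilot/gemini-3-flash-preview",
        "zai-coding-plan/glm-4.7",
        "opencode/big-pickle",
        "gemini/gemini-2.5-flash",
    ],
    "lightweight": [
        "github-copilot/claude-haiku-4.5",
    ],
}
```
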
## Task Routing

| Task | Routes To | Tier |
|------|-----------|------|
| `reasoning` | github-copilot/gpt-5.2 | frontier |
| `code-generation` | github-copilot/gemini-3-pro-preview | frontier |
| `long-context` | gemini/gemini-2.5-pro | frontier |
| `fast` | github-copilot/gemini-3-flash-preview | mid-tier |
| `general` (default) | github-copilot/claude-sonnet-4.5 | mid-tier |

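In code, this table reduces to a lookup with a fallback to the default `general` route. A minimal sketch of the equivalent logic; invoke.py's real lookup may differ:

```python
# Hypothetical equivalent of the routing table above.
TASK_ROUTES = {
    "reasoning": "github-copilot/gpt-5.2",
    "code-generation": "github-copilot/gemini-3-pro-preview",
    "long-context": "gemini/gemini-2.5-pro",
    "fast": "github-copilot/gemini-3-flash-preview",
    "general": "github-copilot/claude-sonnet-4.5",
}

def route(task: str) -> str:
    # Unknown tasks fall back to the default "general" route.
    return TASK_ROUTES.get(task, TASK_ROUTES["general"])
```
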
## State Files

- Mode state: `~/.claude/state/external-mode.json`
- Model policy: `~/.claude/state/model-policy.json`

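A quick way to inspect the mode state without going through toggle.py; the `enabled` field in this sketch is an assumption about the file's schema (matching the toggle sketch earlier), not a documented contract:

```python
import json
from pathlib import Path

STATE = Path.home() / ".claude/state/external-mode.json"

def external_mode_enabled() -> bool:
    # Assumed schema: {"enabled": bool, ...}. Missing file means "off".
    if not STATE.exists():
        return False
    return bool(json.loads(STATE.read_text()).get("enabled", False))
```
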
## Examples

```
/external on testing                                                # Enable for testing
/external invoke "Explain k8s pods"                                 # Use default model (mid-tier)
/external invoke --model github-copilot/gpt-5.2 "Complex analysis"  # frontier
/external invoke --task code-generation "Write a Python function"   # routes to frontier
/external invoke --task fast "Quick question"                       # routes to mid-tier
/external off                                                       # Back to Claude
```