feat(external-llm): standardize tiers and optimize model selection

- Rename tiers: opus/sonnet/haiku → frontier/mid-tier/lightweight
- Align with industry benchmarks (MMLU, GPQA, Chatbot Arena)
- Add /external command for LLM mode control
- Fix invoke.py timeout passthrough (now 600s default)

Tier changes:
- Promote gemini-2.5-pro to frontier (benchmark-validated)
- Demote glm-4.7 to mid-tier, then remove it entirely (unreliable)
- Promote gemini-2.5-flash to mid-tier

New models added:
- gpt-5-mini, gpt-5-nano (GPT family coverage)
- grok-code (Grok/X family)
- glm-4.5-air (lightweight GLM)

Removed (redundant/unreliable):
- o3 (not available)
- glm-4.7 (timeouts)
- gpt-4o, big-pickle, glm-4.5-flash (redundant)

Final: 11 models across 3 tiers, 4 model families

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
OpenCode Test
2026-01-12 03:30:51 -08:00
parent ff111ef278
commit f63172c4cf
7 changed files with 325 additions and 37 deletions

commands/external.md Normal file

@@ -0,0 +1,89 @@
---
name: external
description: Toggle and use external LLM mode (GPT-5.2, Gemini, etc.)
aliases: [llm, ext, external-llm]
---
# External LLM Mode
Route requests to external LLMs through the opencode or gemini CLIs.
## Usage
```
/external # Show current status
/external on [reason] # Enable external mode
/external off # Disable external mode
/external invoke <prompt> # Send prompt to default model
/external invoke --model <model> <prompt> # Send to specific model
/external invoke --task <task> <prompt> # Route by task type
/external models # List available models
```
## Implementation
### Status
```bash
~/.claude/mcp/llm-router/toggle.py status
```
### Toggle On/Off
```bash
~/.claude/mcp/llm-router/toggle.py on --reason "reason"
~/.claude/mcp/llm-router/toggle.py off
```
### Invoke
```bash
~/.claude/mcp/llm-router/invoke.py --model MODEL -p "prompt" [--json]
~/.claude/mcp/llm-router/invoke.py --task TASK -p "prompt" [--json]
```
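When a model call fails (e.g. the timeouts that got glm-4.7 removed), a thin wrapper can retry on the frontier fallback. This is an illustrative sketch, not part of invoke.py: the invoker command is passed in as a parameter, and the assumption that invoke.py signals failure with a non-zero exit code is not confirmed by this doc.

```bash
# Sketch: invoke a model, falling back to the frontier default on failure.
# Assumes the invoker exits non-zero when a call fails (unverified).
invoke_with_fallback() {
  invoker="$1" model="$2" prompt="$3"
  "$invoker" --model "$model" -p "$prompt" \
    || "$invoker" --model "github-copilot/gpt-5.2" -p "$prompt"
}

# Usage (with the real router):
# invoke_with_fallback ~/.claude/mcp/llm-router/invoke.py \
#   "github-copilot/claude-sonnet-4.5" "Explain k8s pods"
```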
## Available Models by Tier
### Frontier (strongest)
| Model | Provider | Best For |
|-------|----------|----------|
| `github-copilot/gpt-5.2` | opencode | reasoning, fallback |
| `github-copilot/gemini-3-pro-preview` | opencode | long context, reasoning |
| `gemini/gemini-2.5-pro` | gemini | long context, reasoning |
### Mid-tier (general purpose)
| Model | Provider | Best For |
|-------|----------|----------|
| `github-copilot/claude-sonnet-4.5` | opencode | general, fallback |
| `github-copilot/gemini-3-flash-preview` | opencode | fast |
| `zai-coding-plan/glm-4.7` | opencode | code generation |
| `opencode/big-pickle` | opencode | general |
| `gemini/gemini-2.5-flash` | gemini | fast |
### Lightweight (simple tasks)
| Model | Provider | Best For |
|-------|----------|----------|
| `github-copilot/claude-haiku-4.5` | opencode | simple tasks |
## Task Routing
| Task | Routes To | Tier |
|------|-----------|------|
| `reasoning` | github-copilot/gpt-5.2 | frontier |
| `code-generation` | github-copilot/gemini-3-pro-preview | frontier |
| `long-context` | gemini/gemini-2.5-pro | frontier |
| `fast` | github-copilot/gemini-3-flash-preview | mid-tier |
| `general` (default) | github-copilot/claude-sonnet-4.5 | mid-tier |
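The routing table can be mirrored as a small shell helper when scripting around the router. This is a sketch of the mapping above, not the actual invoke.py routing logic:

```bash
# Map a task type to the model it routes to (mirrors the table above).
route_task() {
  case "$1" in
    reasoning)       echo "github-copilot/gpt-5.2" ;;
    code-generation) echo "github-copilot/gemini-3-pro-preview" ;;
    long-context)    echo "gemini/gemini-2.5-pro" ;;
    fast)            echo "github-copilot/gemini-3-flash-preview" ;;
    *)               echo "github-copilot/claude-sonnet-4.5" ;;   # general (default)
  esac
}

route_task long-context   # → gemini/gemini-2.5-pro
```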
## State Files
- Mode state: `~/.claude/state/external-mode.json`
- Model policy: `~/.claude/state/model-policy.json`
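Scripts that need to branch on the current mode without calling toggle.py can probe the state file directly. The `enabled` field name here is an assumption about the JSON schema, not confirmed by this document:

```bash
# Return success if external mode is enabled, per the state file.
# The "enabled" field name is assumed; adjust to the real schema.
external_mode_enabled() {
  state="${1:-$HOME/.claude/state/external-mode.json}"
  grep -q '"enabled"[[:space:]]*:[[:space:]]*true' "$state" 2>/dev/null
}

if external_mode_enabled; then echo "external mode: on"; else echo "external mode: off"; fi
```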
## Examples
```
/external on testing # Enable for testing
/external invoke "Explain k8s pods" # Use default model (mid-tier)
/external invoke --model github-copilot/gpt-5.2 "Complex analysis" # frontier
/external invoke --task code-generation "Write a Python function" # routes to frontier
/external invoke --task fast "Quick question" # routes to mid-tier
/external off # Back to Claude
```