# LLM Routing Guide

Use the right model for the job. Cost and speed matter.

## Available CLIs

| CLI | Auth | Best For |
|-----|------|----------|
| `claude` | Pro subscription | Complex reasoning, this workspace |
| `opencode` | GitHub Copilot subscription | Code, free Copilot models |
| `gemini` | Google account (free tier available) | Long context, multimodal |

## Model Tiers

### ⚡ Fast & Cheap (Simple Tasks)

```bash
# Quick parsing, extraction, formatting, simple questions
opencode run -m github-copilot/claude-haiku-4.5 "parse this JSON and extract emails"
opencode run -m zai-coding-plan/glm-4.5-flash "summarize in 2 sentences"
gemini -m gemini-2.0-flash "quick question here"
```

**Use for:** Log parsing, data extraction, simple formatting, yes/no questions, summarization

### 🔧 Balanced (Standard Work)

```bash
# Code review, analysis, standard coding tasks
opencode run -m github-copilot/claude-sonnet-4.5 "review this code"
opencode run -m github-copilot/gpt-5-mini "explain this error"
gemini -m gemini-2.5-pro "analyze this architecture"
```

**Use for:** Code generation, debugging, analysis, documentation

### 🧠 Powerful (Complex Reasoning)

```bash
# Complex reasoning, multi-step planning, difficult problems
claude -p --model opus "design a system for X"
opencode run -m github-copilot/gpt-5.2 "complex reasoning task"
opencode run -m github-copilot/gemini-3-pro-preview "architectural decision"
```

**Use for:** Architecture decisions, complex debugging, multi-step planning

### 📚 Long Context

```bash
# Large codebases, long documents, big context windows
gemini -m gemini-2.5-pro "analyze this entire codebase" < large_file.txt
opencode run -m github-copilot/gemini-3-pro-preview "summarize all these files"
```

**Use for:** Analyzing large files, long documents, full codebase understanding

## Quick Reference

| Task | Model | CLI Command |
|------|-------|-------------|
| Parse JSON/logs | haiku | `opencode run -m github-copilot/claude-haiku-4.5 "..."` |
| Simple summary | flash | `gemini -m gemini-2.0-flash "..."` |
| Code review | sonnet | `opencode run -m github-copilot/claude-sonnet-4.5 "..."` |
| Write code | codex | `opencode run -m github-copilot/gpt-5.1-codex "..."` |
| Debug complex issue | sonnet/opus | `claude -p --model sonnet "..."` |
| Architecture design | opus | `claude -p --model opus "..."` |
| Analyze large file | gemini-pro | `gemini -m gemini-2.5-pro "..." < file` |
| Quick kubectl help | flash | `opencode run -m zai-coding-plan/glm-4.5-flash "..."` |

## Cost Optimization Rules

1. **Start small** — Try haiku/flash first, escalate only if needed
2. **Batch similar tasks** — One opus call > five haiku calls for complex work
3. **Use subscriptions** — GitHub Copilot models are "free" with subscription
4. **Cache results** — Don't re-ask the same question
5. **Context matters** — Smaller context = faster + cheaper

## Example Workflows

### Triage emails (cheap)

```bash
opencode run -m github-copilot/claude-haiku-4.5 "categorize these emails as urgent/normal/spam"
```

### Code review (balanced)

```bash
opencode run -m github-copilot/claude-sonnet-4.5 "review this PR for issues"
```

### Architectural decision (powerful)

```bash
claude -p --model opus "given these constraints, design the best approach for..."
```

### Summarize long doc (long context)

```bash
cat huge_document.md | gemini -m gemini-2.5-pro "summarize key points"
```

## For Flynn (Clawdbot)

When spawning sub-agents or doing background work:

- Use `sessions_spawn` with appropriate model hints
- For simple extraction: spawn with default (cheaper model)
- For complex analysis: explicitly request opus

When using exec to call CLIs:

- Prefer `opencode run` for one-shot tasks (GitHub Copilot = included)
- Use `claude -p` when you need Claude-specific capabilities
- Use `gemini` for very long context or multimodal

---

*Principle: Don't use a sledgehammer to hang a picture.*
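The "start small" and "cache results" rules above can be sketched as a single shell helper. Everything here is illustrative: the `ask` function, the `CHEAP_CMD`/`STRONG_CMD` defaults, and the empty-output escalation heuristic are assumptions for the sketch, not features of any of these CLIs.

```bash
# Sketch of rules 1 and 4: try a cheap model first, escalate only when the
# cheap call produces nothing, and cache answers keyed by a prompt hash.
CACHE_DIR="${CACHE_DIR:-$HOME/.cache/llm-answers}"
CHEAP_CMD="${CHEAP_CMD:-opencode run -m github-copilot/claude-haiku-4.5}"
STRONG_CMD="${STRONG_CMD:-claude -p --model opus}"

ask() {
  prompt="$1"
  mkdir -p "$CACHE_DIR"
  # Rule 4: never re-ask; key the cached answer on a hash of the prompt.
  key=$(printf '%s' "$prompt" | sha256sum | cut -d' ' -f1)
  cached="$CACHE_DIR/$key"
  if [ -f "$cached" ]; then
    cat "$cached"
    return 0
  fi
  # Rule 1: start small; escalate only when the cheap model returns nothing.
  answer=$($CHEAP_CMD "$prompt" 2>/dev/null)
  if [ -z "$answer" ]; then
    answer=$($STRONG_CMD "$prompt")
  fi
  printf '%s\n' "$answer" | tee "$cached"
}
```

Swap the empty-output check for whatever "good enough" test fits your task: an exit-code check or a minimum-length threshold are equally reasonable.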