From 3aea7b4050664c60dfcf8eed22a77d135624804e Mon Sep 17 00:00:00 2001
From: William Valentin
Date: Mon, 26 Jan 2026 22:45:19 -0800
Subject: [PATCH] Add behaviors borrowed from Claude Code

- Categorized memory (preference/decision/fact/project/lesson)
- Session summarization protocol
- Parallel status checks during heartbeats
- Task-based LLM routing
- Local availability checking
- Multi-agent parallelism guidance
---
 AGENTS.md | 123 ++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 102 insertions(+), 21 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 5c87250..bc2e1cd 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -41,6 +41,51 @@ Capture what matters. Decisions, context, things to remember. Skip the secrets u
 - When you make a mistake → document it so future-you doesn't repeat it
 - **Text > Brain** 📝
 
+### 📂 Categorized Memory
+When saving information, be explicit about what type it is:
+
+| Category | Examples | Where to Save |
+|----------|----------|---------------|
+| **Preference** | "Always use rebase", "Prefers dark mode" | MEMORY.md |
+| **Decision** | "Chose llama-swap over Ollama", "Using Gitea for repos" | MEMORY.md |
+| **Fact** | "RTX 5070 Ti has 16GB", "Tailnet is taildb3494" | TOOLS.md |
+| **Project** | "clawdbot repo at gitea", "homelab uses ArgoCD" | TOOLS.md |
+| **Lesson** | "Check local LLM availability first", "MoE models need less VRAM" | MEMORY.md |
+
+This makes memory more searchable and useful for future-you.
+
+### 📋 Session Summarization
+At the end of productive sessions, proactively extract and save:
+
+1. **Decisions made** — What did we choose? Why?
+2. **Preferences learned** — How does the user like things?
+3. **Facts discovered** — New info about the environment
+4. **Lessons learned** — What worked? What didn't?
+
+**When to summarize:**
+- End of a long productive session
+- After making significant decisions
+- When asked to "remember this session"
+- Before the user signs off for a while
+
+**How to summarize:**
+```markdown
+### YYYY-MM-DD - Session Summary
+**Decisions:**
+- Chose X over Y because Z
+
+**Preferences:**
+- User prefers A approach
+
+**Facts:**
+- Discovered B about the system
+
+**Lessons:**
+- Learned that C works better than D
+```
+
+Offer to summarize rather than doing it silently — the user might want to add context.
+
 ## Safety
 
 - Don't exfiltrate private data. Ever.
@@ -172,15 +217,30 @@ You are free to edit `HEARTBEAT.md` with a short checklist or reminders. Keep it
 **Things to check (rotate through these, 2-4 times per day):**
 - **Emails** - Any urgent unread messages?
 - **Calendar** - Upcoming events in next 24-48h?
+- **Local LLMs** - Is llama-swap running? (`curl -sf http://127.0.0.1:8080/health`)
 - **Mentions** - Twitter/social notifications?
 - **Weather** - Relevant if your human might go out?
 
+### ⚡ Parallel Status Checks
+During heartbeats, run multiple checks **in parallel** for speed:
+
+```bash
+# Fire checks simultaneously, not sequentially (kubectl assumed; add monitored service endpoints the same way)
+{ df -h /; free -m; uptime; } > /tmp/system.txt &          # System: disk, memory, load
+kubectl get nodes,pods -A > /tmp/k8s.txt 2>&1 &            # K8s: node status, pod health
+curl -sfm 5 http://127.0.0.1:8080/health > /tmp/llm.txt &  # Local LLM: llama-swap health
+wait  # then aggregate the results from /tmp/*.txt
+```
+
+**Pattern:** Fire off independent checks together, then aggregate results. Don't wait for one to finish before starting the next.
+
 **Track your checks** in `memory/heartbeat-state.json`:
 ```json
 {
   "lastChecks": {
     "email": 1703275200,
     "calendar": 1703260800,
+    "localLLM": 1703275200,
     "weather": null
   }
 }
@@ -218,31 +278,52 @@ The goal: Be helpful without being annoying. Check in a few times a day, do usef
 ## 🤖 Using Other LLMs
 
-You have access to multiple LLM CLIs. Use the right tool for the job:
-
-```bash
-# Fast & cheap (simple tasks)
-opencode run -m github-copilot/claude-haiku-4.5 "parse this data"
-
-# Balanced (standard work)
-opencode run -m github-copilot/claude-sonnet-4.5 "review this code"
-
-# Powerful (complex reasoning)
-opencode run -m github-copilot/gpt-5.2 "design this system"
-
-# Long context
-cat large_file.md | gemini -m gemini-2.5-pro "summarize"
-```
+You have access to multiple LLM CLIs. Use the right tool for the job. **See LLM-ROUTING.md for the full guide.**
 
-**When to delegate vs do yourself:**
-- If the task is simple extraction/parsing → delegate to haiku/flash
-- If the task needs your full context → do it yourself
-- If the task is isolated and doesn't need conversation history → delegate
-- If the task is complex and you're opus anyway → just do it
+### 🎯 Task-Based Routing
+Think about the **task type first**, then pick the model:
 
-**Cost principle:** GitHub Copilot models are "free" with subscription. Use them for one-shot tasks instead of burning your own tokens.
+| Task Type | Route To | Why |
+|-----------|----------|-----|
+| **Private/Sensitive** | Local only (`qwen3`, `gemma`) | Data never leaves the machine |
+| **Long-running** | Local | No API costs, no timeouts |
+| **Code generation** | Local `coder` or Copilot sonnet | Specialized models |
+| **Fast/simple** | Local `gemma` or Copilot haiku | Quick response |
+| **Complex reasoning** | Cloud (opus) or local `qwen3` | Quality matters |
+| **Massive context** | Gemini 2.5 Pro | 1M token window |
+| **Parallel work** | Multi-agent (any) | Speed through parallelism |
+
+### 🔌 Check Local Availability First
+Before routing to local LLMs, check that llama-swap is up:
+```bash
+curl -sf http://127.0.0.1:8080/health && echo "UP" || echo "DOWN"
+```
+
+If local is down, fall back to Copilot or cloud.
+
+### 📍 Routing Priority
+```
+1. Local (free, private, no limits)
+2. GitHub Copilot (free with subscription)
+3. Cloud APIs (paid, most capable)
+```
+
+### 🚀 Multi-Agent Parallelism
+For bulk work, spawn multiple agents (see the sketch below):
+- Each agent can target a different LLM
+- Local: best for privacy + no rate limits
+- Cloud: best for complex sub-tasks
+- Mix based on each sub-task's requirements
+
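+A minimal sketch tying the above together (the sub-task prompts and output paths are hypothetical, `gemma` and the Copilot haiku model come from the tables above, and llama-swap is assumed to expose an OpenAI-style `/v1/chat/completions` endpoint alongside its health check):
+```bash
+# Illustrative only: health-gate the routing, then fan sub-tasks out in parallel
+ask_local() {  # one sub-task against llama-swap's assumed OpenAI-compatible API
+  curl -s http://127.0.0.1:8080/v1/chat/completions -H 'Content-Type: application/json' \
+    -d "{\"model\":\"gemma\",\"messages\":[{\"role\":\"user\",\"content\":\"$1\"}]}"
+}
+if curl -sf http://127.0.0.1:8080/health > /dev/null; then
+  ask_local "summarize notes.md" > /tmp/sub1.json &   # hypothetical sub-tasks
+  ask_local "extract TODOs" > /tmp/sub2.json &
+else  # local is down, so fall back per the routing priority above
+  opencode run -m github-copilot/claude-haiku-4.5 "summarize notes.md" > /tmp/sub1.txt &
+  opencode run -m github-copilot/claude-haiku-4.5 "extract TODOs" > /tmp/sub2.txt &
+fi
+wait  # aggregate once every parallel sub-task returns
+```
+Either branch keeps the sub-tasks parallel; only the tier changes.
+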
+**When to delegate vs do yourself:**
+- Simple extraction/parsing → delegate to local or haiku
+- Needs your full context → do it yourself
+- Isolated task, no conversation history needed → delegate
+- Complex and you're opus anyway → just do it
+
+**Cost principle:** Local is free. GitHub Copilot models are "free" with a subscription. Use them instead of burning cloud API tokens.
 
 ## Make It Yours