diff --git a/memory/2026-03-05.md b/memory/2026-03-05.md
index 07e1bce..1bf9a9a 100644
--- a/memory/2026-03-05.md
+++ b/memory/2026-03-05.md
@@ -1,6 +1,6 @@
 # 2026-03-05
 
-## Council skill created
+## Council skill created and iterated
 - Built `skills/council/` — multi-perspective advisory council using subagents.
 - Design decisions (agreed with Will):
   - Implemented as a **skill** (not standalone agents).
@@ -9,8 +9,17 @@
   - Default flow: **Parallel + Synthesis**. Sequential and Debate flows also available.
   - Final output includes individual advisor perspectives (collapsed/summarized) + referee verdict.
   - Model tier chosen per-invocation based on topic complexity.
+- Two live tests run:
+  - Test 1: Parallel single-round on "Do LLM agents think?" — worked well.
+  - Test 2: Parallel 3-round debate on same topic — richer output, positions evolved meaningfully across rounds.
+- Post-test iteration: updated skill with configurable parameters:
+  - `flow` (parallel/sequential/debate), `rounds` (1-5), `tier` (light/medium/heavy)
+  - Round-specific prompt templates (opening, rebuttal, final position)
+  - Multi-round referee template that tracks position evolution
+  - Word count guidance that decreases per round to control token cost
+  - Subagent labeling convention: `council-r{round}-{role}`
 - Files: `SKILL.md`, `references/prompts.md`, `scripts/council.sh` (reference doc).
-- Validated with skill-creator quick_validate.
-- Two TODOs added to `memory/tasks.json`:
+- TODOs in `memory/tasks.json`:
   - Revisit advisor personality depth (richer backstories).
   - Revisit skill name ("council" is placeholder).
+  - Experiment with different round counts and flows for optimal depth/cost tradeoffs.
diff --git a/skills/council/SKILL.md b/skills/council/SKILL.md
index 6bcadb6..5edb68b 100644
--- a/skills/council/SKILL.md
+++ b/skills/council/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: council
-description: "Convene a council of AI advisor agents with distinct perspectives to deliberate on a topic, then synthesize their views into a verdict. Use when: (1) user asks for multi-perspective analysis, (2) wants to brainstorm with diverse viewpoints, (3) requests a council or advisors opinion, (4) needs a balanced decision on a complex question. Supports parallel (default), sequential, and debate flows. NOT for: simple factual lookups, single-perspective tasks, or quick one-liner answers."
+description: "Convene a council of AI advisor agents with distinct perspectives to deliberate on a topic, then synthesize their views into a verdict. Use when: (1) user asks for multi-perspective analysis, (2) wants to brainstorm with diverse viewpoints, (3) requests a council or advisors opinion, (4) needs a balanced decision on a complex question. Supports parallel (default), sequential, and debate flows with configurable round count. NOT for: simple factual lookups, single-perspective tasks, or quick one-liner answers."
 ---
 
 # Council Skill
@@ -9,6 +9,20 @@ Spawn a council of 3 advisor subagents + 1 referee subagent to deliberate on a t
 
 Each advisor has a distinct personality/lens. The referee synthesizes their output into a final verdict with collapsed advisor perspectives.
 
+## Parameters
+
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| flow | parallel | `parallel`, `sequential`, or `debate` |
+| rounds | 1 | Number of deliberation rounds (1-5). Round 1 = opening positions. Round 2+ = rebuttals where advisors see and respond to each other. |
+| tier | light | Model tier: `light`, `medium`, or `heavy` (see Model Selection) |
+
+**Quick reference:**
+- `flow=parallel, rounds=1` — fast single-shot, all advisors in parallel, then referee (default)
+- `flow=parallel, rounds=3` — parallel opening + 2 rebuttal rounds + referee (recommended for depth)
+- `flow=sequential, rounds=1` — each advisor sees prior outputs, then referee
+- `flow=debate, rounds=3` — parallel opening + cross-advisor rebuttals + referee synthesis
+
 ## Advisor Roster (default)
 
 | Role | Lens | System stance |
@@ -21,46 +35,67 @@ The referee is a separate agent: balanced, fair, synthesis-oriented.
 
 ## Flows
 
-Three deliberation flows are available. Default is **parallel**.
-
 ### 1. Parallel + Synthesis (default)
 
+Single-round version (rounds=1):
 1. Spawn all 3 advisors simultaneously via `sessions_spawn` (mode=run).
 2. Each advisor receives the same topic prompt with their personality instructions.
-3. Wait for all 3 to complete (push-based — they announce when done).
-4. Spawn the referee with all 3 advisor outputs as context.
+3. Wait for all 3 to complete (push-based).
+4. Spawn the referee with all 3 advisor outputs.
 5. Referee produces the final verdict.
 
+Multi-round version (rounds=N):
+1. **Round 1**: Spawn all 3 advisors in parallel with opening position prompt.
+2. Collect all outputs.
+3. **Round 2..N**: For each rebuttal round, respawn all 3 advisors in parallel. Each receives:
+   - Their own prior position(s)
+   - All other advisors' prior round output
+   - Round-specific instructions (rebuttal prompt for middle rounds, final position prompt for last round)
+4. Collect outputs after each round.
+5. **Referee**: Spawn referee with the full debate transcript (all rounds, all advisors).
+
 ### 2. Sequential Rounds
 
+Single-round (rounds=1):
 1. Spawn advisors one at a time, each seeing prior advisor outputs.
-2. After all advisors, spawn referee with full thread.
-3. Optionally run a rebuttal round (advisors respond to each other).
+2. Spawn referee with full thread.
+
+Multi-round (rounds=N):
+1. **Round 1**: Advisors go sequentially, each seeing prior advisors in that round.
+2. **Round 2..N**: Each advisor sees ALL prior round outputs before giving their rebuttal/final take.
+3. **Referee**: Gets the full thread.
 
 ### 3. Debate
 
-1. Spawn advisors in parallel for initial takes.
-2. Share outputs across advisors for rebuttals (1-2 rounds).
-3. Referee moderates and calls convergence.
+Always multi-round (minimum rounds=2, default rounds=3 for this flow):
+1. **Round 1**: Parallel opening takes.
+2. **Round 2..N-1**: Cross-rebuttals — each advisor responds to all others.
+3. **Round N**: Final positions.
+4. **Referee**: Gets full debate transcript, notes evolution of positions.
 
 ## Model Selection
 
 Pick model tier based on topic complexity:
 
-- **Light topics** (casual brainstorm, simple pros/cons): use default model for advisors and referee.
-- **Medium topics** (architecture decisions, strategy): use default model for advisors, stronger model for referee.
-- **Heavy topics** (critical decisions, deep analysis): use stronger model for all agents.
+- **light** (casual brainstorm, simple pros/cons): default model for advisors and referee.
+- **medium** (architecture decisions, strategy): default model for advisors, stronger model for referee.
+- **heavy** (critical decisions, deep analysis): stronger model for all agents.
 
 The caller (main agent) determines tier before spawning.
 
-## Prompt Templates
+## Round-Specific Prompt Guidance
 
-See `references/prompts.md` for full advisor and referee prompt templates with placeholders.
+See `references/prompts.md` for all prompt templates. Key points:
+
+- **Round 1 (Opening)**: Full advisor system prompt + topic. Ask for opening position.
+- **Middle rounds (Rebuttals)**: Include prior positions from ALL advisors. Ask: where do you agree, push back, or change your mind? Keep shorter (200-300 words).
+- **Final round**: Ask for final synthesis — what changed, what held firm, final recommendation in 2-3 sentences. Keep shortest (150-250 words).
+- **Referee (multi-round)**: Include the FULL debate transcript organized by round. Ask referee to note position evolution, not just final states.
 
 ## Implementation
 
 Read `scripts/council.sh` for the orchestration logic.
-For programmatic invocation, the main agent can also call `sessions_spawn` directly
+For programmatic invocation, the main agent calls `sessions_spawn` directly
 following the patterns above.
 
 ## Configuration
@@ -71,3 +106,4 @@ Default roster and prompt templates live in `references/prompts.md`.
 ## TODO (revisit later)
 - Revisit subagent personality depth — richer backstories, communication styles
 - Revisit skill name — "council" works for now
+- Experiment with different round counts and flows to find optimal depth/cost tradeoffs
diff --git a/skills/council/references/prompts.md b/skills/council/references/prompts.md
index 7068c6b..424d069 100644
--- a/skills/council/references/prompts.md
+++ b/skills/council/references/prompts.md
@@ -20,7 +20,82 @@
 - **Stance**: "What could go wrong?"
 - **Style**: Cautious, thorough, devil's advocate. Not negative — protective.
 
-## Advisor System Prompt
+---
+
+## Round 1 — Opening Position
+
+```
+You are the {ROLE} advisor on a council deliberating a topic.
+
+Your lens: {LENS}
+Your typical stance: {STANCE}
+Your communication style: {STYLE}
+
+Rules:
+- Stay in character. Argue from your perspective consistently.
+- Be concise but substantive (200-400 words).
+- Acknowledge trade-offs honestly — don't strawman other views.
+- Reference specific aspects of the topic, not generic platitudes.
+- End with your key recommendation in 1-2 sentences.
+
+This is ROUND 1 of a {TOTAL_ROUNDS}-round debate. Give your opening position.
+
+Topic:
+{TOPIC}
+```
+
+## Middle Rounds — Rebuttal (rounds 2 to N-1)
+
+```
+You are the {ROLE} advisor on a council deliberating a topic.
+This is ROUND {N} of a {TOTAL_ROUNDS}-round debate.
+
+Your lens: {LENS}
+Your typical stance: {STANCE}
+Your communication style: {STYLE}
+
+You've seen the other advisors' positions from prior rounds. Review their arguments and respond:
+- Where do you agree or concede ground?
+- Where do you push back, and why?
+- Has anything changed your recommendation?
+
+Keep it to 200-300 words.
+
+---
+
+YOUR PRIOR POSITION(S):
+{OWN_PRIOR_OUTPUTS}
+
+OTHER ADVISORS (prior round):
+{OTHER_OUTPUTS}
+```
+
+## Final Round — Closing Position (round N)
+
+```
+You are the {ROLE} advisor on a council deliberating a topic.
+This is ROUND {N} — your FINAL position after {TOTAL_ROUNDS} rounds of debate.
+
+Your lens: {LENS}
+Your typical stance: {STANCE}
+Your communication style: {STYLE}
+
+Synthesize what you've learned from the debate. State your final position clearly:
+- What did you change your mind on?
+- What do you hold firm on?
+- Your final recommendation in 2-3 sentences.
+
+Keep it to 150-250 words.
+
+---
+
+DEBATE SO FAR:
+{FULL_DEBATE_TRANSCRIPT}
+```
+
+## Single-Round Advisor (when rounds=1)
+
+Use the Round 1 template but omit "This is ROUND 1 of a {TOTAL_ROUNDS}-round debate."
 
 ```
 You are the {ROLE} advisor on a council deliberating a topic.
@@ -40,7 +115,9 @@ Topic:
 {TOPIC}
 ```
 
-## Referee System Prompt
+---
+
+## Referee — Single Round
 
 ```
 You are the Referee of an advisory council. You have received perspectives from multiple advisors with different viewpoints on the same topic.
@@ -75,18 +152,38 @@ Advisor outputs below:
 {ADVISOR_OUTPUTS}
 ```
 
-## Rebuttal Round Prompt (for Sequential/Debate flows)
+## Referee — Multi-Round
 
 ```
-You are the {ROLE} advisor. You've seen the other advisors' perspectives on this topic.
+You are the Referee of an advisory council. You have received {TOTAL_ROUNDS} rounds of debate from {N} advisors on the topic: "{TOPIC}"
 
-Review their arguments and respond:
-- Where do you agree or concede ground?
-- Where do you push back, and why?
-- Has anything changed your recommendation?
+Your job:
+1. Identify points of agreement and disagreement across all advisors.
+2. Weigh the arguments fairly — no advisor gets preferential treatment.
+3. Produce a final verdict with clear reasoning.
+4. Be honest when the answer is genuinely uncertain.
+5. Note how positions evolved across rounds — where did minds change?
 
-Keep it to 100-200 words.
+Output format (use these exact headers):
 
-Other advisor outputs:
-{OTHER_OUTPUTS}
+## Advisor Perspectives (Summary)
+For each advisor, provide their final position and how it evolved over the {TOTAL_ROUNDS} rounds.
+
+## Points of Agreement
+What the advisors converged on through debate.
+
+## Key Tensions
+Where they still disagree after {TOTAL_ROUNDS} rounds, and why each side has merit.
+
+## Verdict
+Your synthesized recommendation with reasoning. Be specific and actionable.
+
+## Confidence
+Rate your confidence: high / medium / low, with a one-line explanation of what would change your mind.
+
+---
+
+Full debate transcript:
+
+{FULL_DEBATE_TRANSCRIPT}
 ```
diff --git a/skills/council/scripts/council.sh b/skills/council/scripts/council.sh
index d1d778d..6519bab 100644
--- a/skills/council/scripts/council.sh
+++ b/skills/council/scripts/council.sh
@@ -4,51 +4,70 @@
 # This script is NOT executed directly. It documents the orchestration
 # logic the main agent follows when invoking the council skill.
 #
-# The main agent uses sessions_spawn (mode=run) to create each subagent.
+# See references/prompts.md for all prompt templates.
 #
-# ─── PARALLEL FLOW (default) ───────────────────────────────────────
+# ─── PARAMETERS ────────────────────────────────────────────────────
 #
-# 1. Build advisor prompts from references/prompts.md templates.
-# 2. Spawn 3 advisor subagents simultaneously:
+# flow: parallel (default) | sequential | debate
+# rounds: 1 (default) | 2-5
+# tier: light (default) | medium | heavy
 #
-#   sessions_spawn(
-#     task = "\n\nTopic: ",
-#     mode = "run",
-#     label = "council-", # e.g. council-pragmatist
-#     model = "", # optional override
-#   )
+# ─── PARALLEL FLOW ─────────────────────────────────────────────────
 #
-# 3. Wait for all 3 completion events (push-based).
-# 4. Collect advisor outputs.
-# 5. Spawn referee subagent:
+# Single round (rounds=1):
+#   1. Spawn 3 advisors in parallel (sessions_spawn, mode=run)
+#   2. Collect all 3 outputs (push-based completion)
+#   3. Spawn referee with all outputs
+#   4. Deliver to user
 #
-#   sessions_spawn(
-#     task = "",
-#     mode = "run",
-#     label = "council-referee",
-#     model = "", # may be stronger than advisors
-#   )
-#
-# 6. Deliver referee output to user with individual advisor perspectives
-#    included as collapsed summaries.
+# Multi-round (rounds=N):
+#   1. ROUND 1: Spawn 3 advisors in parallel (opening position prompt)
+#   2. Collect outputs
+#   3. ROUND 2..N-1: Respawn all 3 in parallel (rebuttal prompt)
+#      - Each gets: own prior output + all other advisors' prior output
+#   4. Collect outputs each round
+#   5. ROUND N: Respawn all 3 in parallel (final position prompt)
+#      - Each gets: full debate transcript summary
+#   6. Collect final outputs
+#   7. Spawn referee with FULL debate transcript (all rounds)
+#   8. Deliver to user
 #
 # ─── SEQUENTIAL FLOW ──────────────────────────────────────────────
 #
-# Same as parallel but advisors are spawned one at a time.
-# Each subsequent advisor sees prior outputs in their prompt.
-# Optional rebuttal round before referee.
+# Single round (rounds=1):
+#   1. Spawn advisor 1 → collect output
+#   2. Spawn advisor 2 with advisor 1's output → collect
+#   3. Spawn advisor 3 with advisor 1+2 outputs → collect
+#   4. Spawn referee with all outputs
+#
+# Multi-round (rounds=N):
+#   1. ROUND 1: Sequential as above
+#   2. ROUND 2..N: Each advisor sees ALL prior round outputs
+#   3. Spawn referee with full thread
 #
 # ─── DEBATE FLOW ──────────────────────────────────────────────────
 #
-# 1. Parallel initial takes (same as parallel flow steps 1-4).
-# 2. Rebuttal round: respawn each advisor with all other outputs visible.
-# 3. Collect rebuttals.
-# 4. Spawn referee with initial takes + rebuttals.
+# Always multi-round (min 2, default 3):
+#   1. ROUND 1: Parallel opening takes
+#   2. ROUND 2..N-1: Cross-rebuttals (parallel, each sees all others)
+#   3. ROUND N: Final positions (parallel, full transcript)
+#   4. Spawn referee with full debate + evolution notes
 #
 # ─── MODEL TIER SELECTION ─────────────────────────────────────────
 #
-# Light: advisors=default, referee=default
-# Medium: advisors=default, referee=stronger (e.g. opus-tier)
-# Heavy: advisors=stronger, referee=stronger
+# light: advisors=default, referee=default
+# medium: advisors=default, referee=stronger (e.g. opus-tier)
+# heavy: advisors=stronger, referee=stronger
 #
-# The main agent decides tier before spawning based on topic complexity.
+# ─── SUBAGENT LABELING ────────────────────────────────────────────
+#
+# Labels follow pattern: council-r{round}-{role}
+# Examples: council-r1-pragmatist, council-r2-skeptic, council-referee
+# Single-round: council-pragmatist, council-referee (no round prefix)
+#
+# ─── WORD COUNT GUIDANCE ──────────────────────────────────────────
+#
+# Round 1 (opening): 200-400 words
+# Middle rounds: 200-300 words
+# Final round: 150-250 words
+# This keeps multi-round debates from exploding in token cost.
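
Reviewer note: the round, template, and labeling rules in this change are spread across `SKILL.md`, `references/prompts.md`, and `council.sh`, so a small sketch may help check that they agree. The Python below is illustrative only and is not part of the skill. Every name in it (`plan`, `resolve_rounds`, `template_for`, `label_for`, `Step`) is hypothetical, it does not call `sessions_spawn`, and the two roles used in the example are simply the ones that appear in the labeling examples above. It assumes that a missing `rounds` value falls back to 3 for the debate flow and to 1 otherwise, which is how the flow descriptions read.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple

FLOWS = ("parallel", "sequential", "debate")

# Word-count ranges copied from the WORD COUNT GUIDANCE block in council.sh.
WORD_RANGES = {
    "opening": (200, 400),
    "rebuttal": (200, 300),
    "final": (150, 250),
}


class Step(NamedTuple):
    round_no: int
    template: str              # "opening", "rebuttal", or "final"
    words: Tuple[int, int]     # (min, max) word guidance for this round
    labels: Tuple[str, ...]    # one subagent label per advisor role


def resolve_rounds(flow: str, rounds: Optional[int]) -> int:
    """Validate flow/rounds and apply the per-flow default (assumed, see lead-in)."""
    if flow not in FLOWS:
        raise ValueError(f"unknown flow: {flow!r}")
    if rounds is None:
        rounds = 3 if flow == "debate" else 1
    if not 1 <= rounds <= 5:
        raise ValueError("rounds must be between 1 and 5")
    if flow == "debate" and rounds < 2:
        raise ValueError("debate flow needs at least 2 rounds")
    return rounds


def template_for(round_no: int, total: int) -> str:
    """Round 1 opens, round N closes, anything in between is a rebuttal."""
    if round_no == 1:
        # With total == 1 the single-round variant of the opening prompt applies.
        return "opening"
    return "final" if round_no == total else "rebuttal"


def label_for(role: str, round_no: int, total: int) -> str:
    """council-{role} for single-round runs, council-r{round}-{role} otherwise."""
    if total == 1:
        return f"council-{role}"
    return f"council-r{round_no}-{role}"


def plan(flow: str, roles: Sequence[str], rounds: Optional[int] = None) -> List[Step]:
    """Return one Step per round; the referee (label council-referee) runs afterwards."""
    total = resolve_rounds(flow, rounds)
    steps = []
    for round_no in range(1, total + 1):
        template = template_for(round_no, total)
        labels = tuple(label_for(role, round_no, total) for role in roles)
        steps.append(Step(round_no, template, WORD_RANGES[template], labels))
    return steps


if __name__ == "__main__":
    for step in plan("debate", ["pragmatist", "skeptic"]):
        print(step.round_no, step.template, step.words, step.labels)
```

One thing the sketch makes visible: with `rounds=2` there is no middle round, so advisors go straight from the opening prompt to the final-position prompt. That matches the "rounds 2 to N-1" wording in `references/prompts.md`, but it may be worth stating explicitly in `SKILL.md`.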