feat(council): add configurable rounds, flow parameters, and round-specific prompts

- Parameters: flow (parallel/sequential/debate), rounds (1-5), tier (light/medium/heavy)
- Round-specific prompt templates: opening, rebuttal, final position
- Multi-round referee template tracks position evolution across rounds
- Word count guidance decreases per round to control token cost
- Subagent labeling convention: council-r{round}-{role}
- Updated from live testing with 1-round and 3-round parallel debates
2026-03-05 16:21:22 +00:00
parent 7274d399ce
commit da36000050
4 changed files with 224 additions and 63 deletions
@@ -1,6 +1,6 @@
 # 2026-03-05
-## Council skill created
+## Council skill created and iterated
 - Built `skills/council/` — multi-perspective advisory council using subagents.
 - Design decisions (agreed with Will):
  - Implemented as a **skill** (not standalone agents).
@@ -9,8 +9,17 @@
  - Default flow: **Parallel + Synthesis**. Sequential and Debate flows also available.
  - Final output includes individual advisor perspectives (collapsed/summarized) + referee verdict.
  - Model tier chosen per-invocation based on topic complexity.
 - Two live tests run:
  - Test 1: Parallel single-round on "Do LLM agents think?" — worked well.
  - Test 2: Parallel 3-round debate on same topic — richer output, positions evolved meaningfully across rounds.
 - Post-test iteration: updated skill with configurable parameters:
  - `flow` (parallel/sequential/debate), `rounds` (1-5), `tier` (light/medium/heavy)
  - Round-specific prompt templates (opening, rebuttal, final position)
  - Multi-round referee template that tracks position evolution
  - Word count guidance that decreases per round to control token cost
  - Subagent labeling convention: `council-r{round}-{role}`
 - Files: `SKILL.md`, `references/prompts.md`, `scripts/council.sh` (reference doc).
- Validated with skill-creator quick_validate.
+- TODOs in `memory/tasks.json`:
 - Two TODOs added to `memory/tasks.json`:
  - Revisit advisor personality depth (richer backstories).
  - Revisit skill name ("council" is placeholder).
  - Experiment with different round counts and flows for optimal depth/cost tradeoffs.
@@ -1,6 +1,6 @@
 ---
 name: council
-description: "Convene a council of AI advisor agents with distinct perspectives to deliberate on a topic, then synthesize their views into a verdict. Use when: (1) user asks for multi-perspective analysis, (2) wants to brainstorm with diverse viewpoints, (3) requests a council or advisors opinion, (4) needs a balanced decision on a complex question. Supports parallel (default), sequential, and debate flows. NOT for: simple factual lookups, single-perspective tasks, or quick one-liner answers."
+description: "Convene a council of AI advisor agents with distinct perspectives to deliberate on a topic, then synthesize their views into a verdict. Use when: (1) user asks for multi-perspective analysis, (2) wants to brainstorm with diverse viewpoints, (3) requests a council or advisors opinion, (4) needs a balanced decision on a complex question. Supports parallel (default), sequential, and debate flows with configurable round count. NOT for: simple factual lookups, single-perspective tasks, or quick one-liner answers."
 ---
 # Council Skill
@@ -9,6 +9,20 @@ Spawn a council of 3 advisor subagents + 1 referee subagent to deliberate on a t
 Each advisor has a distinct personality/lens. The referee synthesizes their output into a
 final verdict with collapsed advisor perspectives.
 ## Parameters
 | Parameter | Default | Description |
 |-----------|---------|-------------|
 | flow      | parallel | `parallel`, `sequential`, or `debate` |
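 | rounds    | 1       | Number of deliberation rounds (1-5). Round 1 = opening positions. Later rounds = rebuttals where advisors see and respond to each other, with the last round used for final positions. The `debate` flow requires at least 2 rounds and defaults to 3. |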
 | tier      | light   | Model tier: `light`, `medium`, or `heavy` (see Model Selection) |
 **Quick reference:**
 - `flow=parallel, rounds=1` — fast single-shot, all advisors in parallel, then referee (default)
 - `flow=parallel, rounds=3` — parallel opening + 2 rebuttal rounds + referee (recommended for depth)
 - `flow=sequential, rounds=1` — each advisor sees prior outputs, then referee
 - `flow=debate, rounds=3` — parallel opening + cross-advisor rebuttals + referee synthesis
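
The parameter rules above can be checked before any subagent is spawned. The following is a minimal sketch, not part of the skill itself: `CouncilParams` and `resolve_params` are hypothetical names, and the debate-specific minimum (2) and default (3) are taken from the Debate flow section of this file.

```python
from dataclasses import dataclass

FLOWS = ("parallel", "sequential", "debate")
TIERS = ("light", "medium", "heavy")


@dataclass(frozen=True)
class CouncilParams:
    """Hypothetical container for the three skill parameters."""
    flow: str = "parallel"
    rounds: int = 1
    tier: str = "light"


def resolve_params(flow="parallel", rounds=None, tier="light"):
    """Apply defaults and reject values outside the documented ranges."""
    if flow not in FLOWS:
        raise ValueError(f"flow must be one of {FLOWS}, got {flow!r}")
    if tier not in TIERS:
        raise ValueError(f"tier must be one of {TIERS}, got {tier!r}")
    if rounds is None:
        # Debate defaults to 3 rounds; the other flows default to 1.
        rounds = 3 if flow == "debate" else 1
    if not 1 <= rounds <= 5:
        raise ValueError(f"rounds must be between 1 and 5, got {rounds}")
    if flow == "debate" and rounds < 2:
        raise ValueError("debate flow needs at least 2 rounds")
    return CouncilParams(flow=flow, rounds=rounds, tier=tier)
```

Leaving `rounds` unset lets the flow pick its own default, which avoids the table default of 1 conflicting with the debate minimum.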
 ## Advisor Roster (default)
 | Role         | Lens                            | System stance                    |
@@ -21,46 +35,67 @@ The referee is a separate agent: balanced, fair, synthesis-oriented.
 ## Flows
 Three deliberation flows are available. Default is **parallel**.
 ### 1. Parallel + Synthesis (default)
 Single-round version (rounds=1):
 1. Spawn all 3 advisors simultaneously via `sessions_spawn` (mode=run).
 2. Each advisor receives the same topic prompt with their personality instructions.
-3. Wait for all 3 to complete (push-based — they announce when done).
+3. Wait for all 3 to complete (push-based).
-4. Spawn the referee with all 3 advisor outputs as context.
+4. Spawn the referee with all 3 advisor outputs.
 5. Referee produces the final verdict.
 Multi-round version (rounds=N):
 1. **Round 1**: Spawn all 3 advisors in parallel with opening position prompt.
 2. Collect all outputs.
 3. **Round 2..N**: For each rebuttal round, respawn all 3 advisors in parallel. Each receives:
   - Middle rounds (2..N-1): their own prior position(s), all other advisors' prior-round output, and the rebuttal prompt
   - Last round (N): the full debate transcript so far and the final position prompt
 4. Collect outputs after each round.
 5. **Referee**: Spawn referee with the full debate transcript (all rounds, all advisors).
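
The multi-round steps above can be sketched as a loop. This is an illustration under assumptions, not the skill's implementation: `spawn` is a hypothetical stand-in for the real `sessions_spawn` call (whose exact signature is not shown here), and a thread pool stands in for the push-based completion events.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel_council(topic, roles, rounds, spawn):
    """Run `rounds` parallel advisor rounds, then the referee.

    `spawn` is a hypothetical callable that takes keyword arguments and
    returns the subagent's text output.
    Returns (transcript, verdict); transcript[i][role] is the round i+1 output.
    """
    transcript = []
    with ThreadPoolExecutor(max_workers=len(roles)) as pool:
        for round_no in range(1, rounds + 1):
            prior = list(transcript)  # snapshot of all completed rounds

            def advise(role, round_no=round_no, prior=prior):
                text = spawn(role=role, topic=topic, round_no=round_no,
                             total_rounds=rounds, prior=prior)
                return role, text

            # All advisors in a round run concurrently; the round ends
            # once every output has been collected.
            transcript.append(dict(pool.map(advise, roles)))
    # The referee sees the full transcript (all rounds, all advisors).
    verdict = spawn(role="referee", topic=topic, round_no=None,
                    total_rounds=rounds, prior=transcript)
    return transcript, verdict
```

Passing `spawn` in as an argument keeps the loop testable with a fake and leaves prompt selection to the caller.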
 ### 2. Sequential Rounds
 Single-round (rounds=1):
 1. Spawn advisors one at a time, each seeing prior advisor outputs.
-2. After all advisors, spawn referee with full thread.
+2. Spawn referee with full thread.
-3. Optionally run a rebuttal round (advisors respond to each other).
+
 Multi-round (rounds=N):
 1. **Round 1**: Advisors go sequentially, each seeing prior advisors in that round.
 2. **Round 2..N**: Each advisor sees ALL prior round outputs before giving their rebuttal/final take.
 3. **Referee**: Gets the full thread.
 ### 3. Debate
-1. Spawn advisors in parallel for initial takes.
+Always multi-round (minimum rounds=2, default rounds=3 for this flow):
-2. Share outputs across advisors for rebuttals (1-2 rounds).
+1. **Round 1**: Parallel opening takes.
-3. Referee moderates and calls convergence.
+2. **Round 2..N-1**: Cross-rebuttals — each advisor responds to all others.
 3. **Round N**: Final positions.
 4. **Referee**: Gets full debate transcript, notes evolution of positions.
 ## Model Selection
 Pick model tier based on topic complexity:
- **Light topics** (casual brainstorm, simple pros/cons): use default model for advisors and referee.
+- **light** (casual brainstorm, simple pros/cons): default model for advisors and referee.
- **Medium topics** (architecture decisions, strategy): use default model for advisors, stronger model for referee.
+- **medium** (architecture decisions, strategy): default model for advisors, stronger model for referee.
- **Heavy topics** (critical decisions, deep analysis): use stronger model for all agents.
+- **heavy** (critical decisions, deep analysis): stronger model for all agents.
 The caller (main agent) determines tier before spawning.
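
The tier rules above reduce to a small lookup table. The sketch below uses the placeholder strings `"default"` and `"stronger"` because this file does not name concrete models; `model_for` is a hypothetical helper.

```python
# Placeholder model names; substitute whatever the deployment actually uses.
DEFAULT, STRONGER = "default", "stronger"

TIER_MODELS = {
    "light": {"advisor": DEFAULT, "referee": DEFAULT},
    "medium": {"advisor": DEFAULT, "referee": STRONGER},
    "heavy": {"advisor": STRONGER, "referee": STRONGER},
}


def model_for(tier, agent):
    """Return the model class for an agent kind ('advisor' or 'referee')."""
    try:
        return TIER_MODELS[tier][agent]
    except KeyError:
        raise ValueError(f"unknown tier/agent: {tier!r}/{agent!r}") from None
```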
-## Prompt Templates
+## Round-Specific Prompt Guidance
-See `references/prompts.md` for full advisor and referee prompt templates with placeholders.
+See `references/prompts.md` for all prompt templates. Key points:
 - **Round 1 (Opening)**: Full advisor system prompt + topic. Ask for opening position.
 - **Middle rounds (Rebuttals)**: Include prior positions from ALL advisors. Ask: where do you agree, push back, or change your mind? Keep shorter (200-300 words).
 - **Final round**: Ask for final synthesis — what changed, what held firm, final recommendation in 2-3 sentences. Keep shortest (150-250 words).
 - **Referee (multi-round)**: Include the FULL debate transcript organized by round. Ask referee to note position evolution, not just final states.
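
One way to choose the advisor template and word budget for a given round is sketched below. `prompt_kind` and `WORD_RANGE` are hypothetical names; the ranges are the ones stated in `references/prompts.md`, and the single-round case is assumed to reuse the opening range because it reuses the Round 1 template.

```python
# Word ranges per prompt kind, as (min, max).
WORD_RANGE = {
    "single": (200, 400),
    "opening": (200, 400),
    "rebuttal": (200, 300),
    "final": (150, 250),
}


def prompt_kind(round_no, total_rounds):
    """Map a 1-based round number to the advisor template to use."""
    if not 1 <= round_no <= total_rounds:
        raise ValueError("round_no must be within 1..total_rounds")
    if total_rounds == 1:
        return "single"
    if round_no == 1:
        return "opening"
    if round_no == total_rounds:
        return "final"
    return "rebuttal"
```

With `rounds=2` there is no middle round, so the sequence is opening followed directly by final.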
 ## Implementation
 Read `scripts/council.sh` for the orchestration logic.
-For programmatic invocation, the main agent can also call `sessions_spawn` directly
+For programmatic invocation, the main agent calls `sessions_spawn` directly
 following the patterns above.
 ## Configuration
@@ -71,3 +106,4 @@ Default roster and prompt templates live in `references/prompts.md`.
 ## TODO (revisit later)
 - Revisit subagent personality depth — richer backstories, communication styles
 - Revisit skill name — "council" works for now
 - Experiment with different round counts and flows to find optimal depth/cost tradeoffs
@@ -20,7 +20,82 @@
 - **Stance**: "What could go wrong?"
 - **Style**: Cautious, thorough, devil's advocate. Not negative — protective.
-## Advisor System Prompt
+---
 ## Round 1 — Opening Position
 ```
 You are the {ROLE} advisor on a council deliberating a topic.
 Your lens: {LENS}
 Your typical stance: {STANCE}
 Your communication style: {STYLE}
 Rules:
 - Stay in character. Argue from your perspective consistently.
 - Be concise but substantive (200-400 words).
 - Acknowledge trade-offs honestly — don't strawman other views.
 - Reference specific aspects of the topic, not generic platitudes.
 - End with your key recommendation in 1-2 sentences.
 This is ROUND 1 of a {TOTAL_ROUNDS}-round debate. Give your opening position.
 Topic:
 {TOPIC}
 ```
 ## Middle Rounds — Rebuttal (rounds 2 to N-1)
 ```
 You are the {ROLE} advisor on a council deliberating a topic.
 This is ROUND {N} of a {TOTAL_ROUNDS}-round debate.
 Your lens: {LENS}
 Your typical stance: {STANCE}
 Your communication style: {STYLE}
 You've seen the other advisors' positions from prior rounds. Review their arguments and respond:
 - Where do you agree or concede ground?
 - Where do you push back, and why?
 - Has anything changed your recommendation?
 Keep it to 200-300 words.
 ---
 YOUR PRIOR POSITION(S):
 {OWN_PRIOR_OUTPUTS}
 OTHER ADVISORS (prior round):
 {OTHER_OUTPUTS}
 ```
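
The two context placeholders in the rebuttal template could be filled from a per-round transcript roughly as follows. This is a sketch under assumptions: `rebuttal_context` is a hypothetical helper, the transcript is assumed to be a list of `{role: text}` dicts (one per completed round), and the `Round i:` and `[role]` labels are illustrative formatting, not something the templates require.

```python
def rebuttal_context(transcript, role):
    """Build OWN_PRIOR_OUTPUTS and OTHER_OUTPUTS for one advisor.

    transcript: list of {role: text} dicts, one per completed round.
    """
    # The advisor's own output from every completed round.
    own = "\n\n".join(
        f"Round {i}: {outputs[role]}" for i, outputs in enumerate(transcript, 1)
    )
    # Every other advisor's output from the most recent round only.
    others = "\n\n".join(
        f"[{other}] {text}"
        for other, text in transcript[-1].items()
        if other != role
    )
    return {"OWN_PRIOR_OUTPUTS": own, "OTHER_OUTPUTS": others}
```

Because the templates use `{NAME}` placeholders, the result can be passed to Python's `str.format` together with the remaining fields, for example `template.format(ROLE=..., N=..., **rebuttal_context(transcript, role))`.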
 ## Final Round — Closing Position (round N)
 ```
 You are the {ROLE} advisor on a council deliberating a topic.
 This is ROUND {N} — your FINAL position after {TOTAL_ROUNDS} rounds of debate.
 Your lens: {LENS}
 Your typical stance: {STANCE}
 Your communication style: {STYLE}
 Synthesize what you've learned from the debate. State your final position clearly:
 - What did you change your mind on?
 - What do you hold firm on?
 - Your final recommendation in 2-3 sentences.
 Keep it to 150-250 words.
 ---
 DEBATE SO FAR:
 {FULL_DEBATE_TRANSCRIPT}
 ```
 ## Single-Round Advisor (when rounds=1)
 Use the Round 1 template but omit "This is ROUND 1 of a {TOTAL_ROUNDS}-round debate."
 ```
 You are the {ROLE} advisor on a council deliberating a topic.
@@ -40,7 +115,9 @@ Topic:
 {TOPIC}
 ```
-## Referee System Prompt
+---
 ## Referee — Single Round
 ```
 You are the Referee of an advisory council. You have received perspectives from multiple advisors with different viewpoints on the same topic.
@@ -75,18 +152,38 @@ Advisor outputs below:
 {ADVISOR_OUTPUTS}
 ```
-## Rebuttal Round Prompt (for Sequential/Debate flows)
+## Referee — Multi-Round
 ```
-You are the {ROLE} advisor. You've seen the other advisors' perspectives on this topic.
+You are the Referee of an advisory council. You have received {TOTAL_ROUNDS} rounds of debate from {N} advisors on the topic: "{TOPIC}"
-Review their arguments and respond:
+Your job:
- Where do you agree or concede ground?
+1. Identify points of agreement and disagreement across all advisors.
- Where do you push back, and why?
+2. Weigh the arguments fairly — no advisor gets preferential treatment.
- Has anything changed your recommendation?
+3. Produce a final verdict with clear reasoning.
 4. Be honest when the answer is genuinely uncertain.
 5. Note how positions evolved across rounds — where did minds change?
-Keep it to 100-200 words.
+Output format (use these exact headers):
-Other advisor outputs:
+## Advisor Perspectives (Summary)
-{OTHER_OUTPUTS}
+For each advisor, provide their final position and how it evolved over the {TOTAL_ROUNDS} rounds.
 ## Points of Agreement
 What the advisors converged on through debate.
 ## Key Tensions
 Where they still disagree after {TOTAL_ROUNDS} rounds, and why each side has merit.
 ## Verdict
 Your synthesized recommendation with reasoning. Be specific and actionable.
 ## Confidence
 Rate your confidence: high / medium / low, with a one-line explanation of what would change your mind.
 ---
 Full debate transcript:
 {FULL_DEBATE_TRANSCRIPT}
 ```
@@ -4,51 +4,70 @@
 # This script is NOT executed directly. It documents the orchestration
 # logic the main agent follows when invoking the council skill.
 #
-# The main agent uses sessions_spawn (mode=run) to create each subagent.
+# See references/prompts.md for all prompt templates.
 #
-# ─── PARALLEL FLOW (default) ───────────────────────────────────────
+# ─── PARAMETERS ────────────────────────────────────────────────────
 #
-# 1. Build advisor prompts from references/prompts.md templates.
+# flow:   parallel (default) | sequential | debate
-# 2. Spawn 3 advisor subagents simultaneously:
+# rounds: 1 (default) | 2-5
 # tier:   light (default) | medium | heavy
 #
-#    sessions_spawn(
+# ─── PARALLEL FLOW ─────────────────────────────────────────────────
 #      task = "<advisor system prompt>\n\nTopic: <topic>",
 #      mode = "run",
 #      label = "council-<role>",          # e.g. council-pragmatist
 #      model = "<chosen model tier>",     # optional override
 #    )
 #
-# 3. Wait for all 3 completion events (push-based).
+# Single round (rounds=1):
-# 4. Collect advisor outputs.
+#   1. Spawn 3 advisors in parallel (sessions_spawn, mode=run)
-# 5. Spawn referee subagent:
+#   2. Collect all 3 outputs (push-based completion)
 #   3. Spawn referee with all outputs
 #   4. Deliver to user
 #
-#    sessions_spawn(
+# Multi-round (rounds=N):
-#      task = "<referee system prompt with all advisor outputs>",
+#   1. ROUND 1: Spawn 3 advisors in parallel (opening position prompt)
-#      mode = "run",
+#   2. Collect outputs
-#      label = "council-referee",
+#   3. ROUND 2..N-1: Respawn all 3 in parallel (rebuttal prompt)
-#      model = "<chosen model tier>",     # may be stronger than advisors
+#      - Each gets: own prior output + all other advisors' prior output
-#    )
+#   4. Collect outputs each round
-#
+#   5. ROUND N: Respawn all 3 in parallel (final position prompt)
-# 6. Deliver referee output to user with individual advisor perspectives
+#      - Each gets: full debate transcript (all prior rounds)
-#    included as collapsed summaries.
+#   6. Collect final outputs
 #   7. Spawn referee with FULL debate transcript (all rounds)
 #   8. Deliver to user
 #
 # ─── SEQUENTIAL FLOW ──────────────────────────────────────────────
 #
-# Same as parallel but advisors are spawned one at a time.
+# Single round (rounds=1):
-# Each subsequent advisor sees prior outputs in their prompt.
+#   1. Spawn advisor 1 → collect output
-# Optional rebuttal round before referee.
+#   2. Spawn advisor 2 with advisor 1's output → collect
 #   3. Spawn advisor 3 with advisor 1+2 outputs → collect
 #   4. Spawn referee with all outputs
 #
 # Multi-round (rounds=N):
 #   1. ROUND 1: Sequential as above
 #   2. ROUND 2..N: Each advisor sees ALL prior round outputs
 #   3. Spawn referee with full thread
 #
 # ─── DEBATE FLOW ──────────────────────────────────────────────────
 #
-# 1. Parallel initial takes (same as parallel flow steps 1-4).
+# Always multi-round (min 2, default 3):
-# 2. Rebuttal round: respawn each advisor with all other outputs visible.
+#   1. ROUND 1: Parallel opening takes
-# 3. Collect rebuttals.
+#   2. ROUND 2..N-1: Cross-rebuttals (parallel, each sees all others)
-# 4. Spawn referee with initial takes + rebuttals.
+#   3. ROUND N: Final positions (parallel, full transcript)
 #   4. Spawn referee with full debate + evolution notes
 #
 # ─── MODEL TIER SELECTION ─────────────────────────────────────────
 #
-# Light:  advisors=default, referee=default
+# light:  advisors=default, referee=default
-# Medium: advisors=default, referee=stronger (e.g. opus-tier)
+# medium: advisors=default, referee=stronger (e.g. opus-tier)
-# Heavy:  advisors=stronger, referee=stronger
+# heavy:  advisors=stronger, referee=stronger
 #
-# The main agent decides tier before spawning based on topic complexity.
+# ─── SUBAGENT LABELING ────────────────────────────────────────────
 #
 # Labels follow pattern: council-r{round}-{role}
 # Examples: council-r1-pragmatist, council-r2-skeptic, council-referee
 # Single-round: council-pragmatist, council-referee (no round prefix)
 #
 # ─── WORD COUNT GUIDANCE ──────────────────────────────────────────
 #
 # Round 1 (opening):  200-400 words
 # Middle rounds:      200-300 words
 # Final round:        150-250 words
 # This keeps multi-round debates from exploding in token cost.
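
The labeling convention documented in the script can be expressed as a small helper. This is a sketch only; `subagent_label` is a hypothetical name, and it assumes the referee label never carries a round prefix, as in the examples above.

```python
def subagent_label(role, round_no, total_rounds):
    """Return the subagent label for a role in a given round."""
    # Referee and single-round runs use the plain label without a round prefix.
    if role == "referee" or total_rounds == 1:
        return f"council-{role}"
    return f"council-r{round_no}-{role}"
```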