feat(council): add configurable rounds, flow parameters, and round-specific prompts

- Parameters: flow (parallel/sequential/debate), rounds (1-5), tier (light/medium/heavy)
- Round-specific prompt templates: opening, rebuttal, final position
- Multi-round referee template tracks position evolution across rounds
- Word count guidance decreases per round to control token cost
- Subagent labeling convention: council-r{round}-{role}
- Updated from live testing with 1-round and 3-round parallel debates
2026-03-05 16:21:22 +00:00
parent 7274d399ce
commit da36000050
4 changed files with 224 additions and 63 deletions
--- a/memory/2026-03-05.md
+++ b/memory/2026-03-05.md
@@ -1,6 +1,6 @@
 # 2026-03-05

-## Council skill created
+## Council skill created and iterated
 - Built `skills/council/` — multi-perspective advisory council using subagents.
 - Design decisions (agreed with Will):
  - Implemented as a **skill** (not standalone agents).
@@ -9,8 +9,17 @@
  - Default flow: **Parallel + Synthesis**. Sequential and Debate flows also available.
  - Final output includes individual advisor perspectives (collapsed/summarized) + referee verdict.
  - Model tier chosen per-invocation based on topic complexity.
+- Two live tests run:
+  - Test 1: Parallel single-round on "Do LLM agents think?" — worked well.
+  - Test 2: Parallel 3-round debate on same topic — richer output, positions evolved meaningfully across rounds.
+- Post-test iteration: updated skill with configurable parameters:
+  - `flow` (parallel/sequential/debate), `rounds` (1-5), `tier` (light/medium/heavy)
+  - Round-specific prompt templates (opening, rebuttal, final position)
+  - Multi-round referee template that tracks position evolution
+  - Word count guidance that decreases per round to control token cost
+  - Subagent labeling convention: `council-r{round}-{role}`
 - Files: `SKILL.md`, `references/prompts.md`, `scripts/council.sh` (reference doc).
-- Validated with skill-creator quick_validate.
-- Two TODOs added to `memory/tasks.json`:
+- TODOs in `memory/tasks.json`:
  - Revisit advisor personality depth (richer backstories).
  - Revisit skill name ("council" is placeholder).
+  - Experiment with different round counts and flows for optimal depth/cost tradeoffs.
--- a/skills/council/SKILL.md
+++ b/skills/council/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: council
-description: "Convene a council of AI advisor agents with distinct perspectives to deliberate on a topic, then synthesize their views into a verdict. Use when: (1) user asks for multi-perspective analysis, (2) wants to brainstorm with diverse viewpoints, (3) requests a council or advisors' opinion, (4) needs a balanced decision on a complex question. Supports parallel (default), sequential, and debate flows. NOT for: simple factual lookups, single-perspective tasks, or quick one-liner answers."
+description: "Convene a council of AI advisor agents with distinct perspectives to deliberate on a topic, then synthesize their views into a verdict. Use when: (1) user asks for multi-perspective analysis, (2) wants to brainstorm with diverse viewpoints, (3) requests a council or advisors' opinion, (4) needs a balanced decision on a complex question. Supports parallel (default), sequential, and debate flows with configurable round count. NOT for: simple factual lookups, single-perspective tasks, or quick one-liner answers."
 ---

 # Council Skill
@@ -9,6 +9,20 @@ Spawn a council of 3 advisor subagents + 1 referee subagent to deliberate on a t
 Each advisor has a distinct personality/lens. The referee synthesizes their output into a
 final verdict with collapsed advisor perspectives.

+## Parameters
+
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| flow      | parallel | `parallel`, `sequential`, or `debate` |
+| rounds    | 1       | Number of deliberation rounds (1-5). Round 1 = opening positions. Round 2+ = rebuttals where advisors see and respond to each other. |
+| tier      | light   | Model tier: `light`, `medium`, or `heavy` (see Model Selection) |
+
+**Quick reference:**
+- `flow=parallel, rounds=1` — fast single-shot, all advisors in parallel, then referee (default)
+- `flow=parallel, rounds=3` — parallel opening + 2 rebuttal rounds + referee (recommended for depth)
+- `flow=sequential, rounds=1` — each advisor sees prior outputs, then referee
+- `flow=debate, rounds=3` — parallel opening + cross-advisor rebuttals + referee synthesis
+
 ## Advisor Roster (default)

 | Role         | Lens                            | System stance                    |
@@ -21,46 +35,67 @@ The referee is a separate agent: balanced, fair, synthesis-oriented.

 ## Flows

-Three deliberation flows are available. Default is **parallel**.
-
 ### 1. Parallel + Synthesis (default)

+Single-round version (rounds=1):
 1. Spawn all 3 advisors simultaneously via `sessions_spawn` (mode=run).
 2. Each advisor receives the same topic prompt with their personality instructions.
-3. Wait for all 3 to complete (push-based — they announce when done).
-4. Spawn the referee with all 3 advisor outputs as context.
+3. Wait for all 3 to complete (push-based).
+4. Spawn the referee with all 3 advisor outputs.
 5. Referee produces the final verdict.

+Multi-round version (rounds=N):
+1. **Round 1**: Spawn all 3 advisors in parallel with opening position prompt.
+2. Collect all outputs.
+3. **Round 2..N**: For each rebuttal round, respawn all 3 advisors in parallel. Each receives:
+   - Their own prior position(s)
+   - All other advisors' prior round output
+   - Round-specific instructions (rebuttal prompt for middle rounds, final position prompt for last round)
+4. Collect outputs after each round.
+5. **Referee**: Spawn referee with the full debate transcript (all rounds, all advisors).
+
 ### 2. Sequential Rounds

+Single-round (rounds=1):
 1. Spawn advisors one at a time, each seeing prior advisor outputs.
-2. After all advisors, spawn referee with full thread.
-3. Optionally run a rebuttal round (advisors respond to each other).
+2. Spawn referee with full thread.
+
+Multi-round (rounds=N):
+1. **Round 1**: Advisors go sequentially, each seeing prior advisors in that round.
+2. **Round 2..N**: Each advisor sees ALL prior round outputs before giving their rebuttal/final take.
+3. **Referee**: Gets the full thread.

 ### 3. Debate

-1. Spawn advisors in parallel for initial takes.
-2. Share outputs across advisors for rebuttals (1-2 rounds).
-3. Referee moderates and calls convergence.
+Always multi-round (minimum rounds=2, default rounds=3 for this flow):
+1. **Round 1**: Parallel opening takes.
+2. **Round 2..N-1**: Cross-rebuttals — each advisor responds to all others.
+3. **Round N**: Final positions.
+4. **Referee**: Gets full debate transcript, notes evolution of positions.

 ## Model Selection

 Pick model tier based on topic complexity:

-- **Light topics** (casual brainstorm, simple pros/cons): use default model for advisors and referee.
-- **Medium topics** (architecture decisions, strategy): use default model for advisors, stronger model for referee.
-- **Heavy topics** (critical decisions, deep analysis): use stronger model for all agents.
+- **light** (casual brainstorm, simple pros/cons): default model for advisors and referee.
+- **medium** (architecture decisions, strategy): default model for advisors, stronger model for referee.
+- **heavy** (critical decisions, deep analysis): stronger model for all agents.

 The caller (main agent) determines tier before spawning.

-## Prompt Templates
+## Round-Specific Prompt Guidance

-See `references/prompts.md` for full advisor and referee prompt templates with placeholders.
+See `references/prompts.md` for all prompt templates. Key points:
+
+- **Round 1 (Opening)**: Full advisor system prompt + topic. Ask for opening position.
+- **Middle rounds (Rebuttals)**: Include prior positions from ALL advisors. Ask: where do you agree, push back, or change your mind? Keep shorter (200-300 words).
+- **Final round**: Ask for final synthesis — what changed, what held firm, final recommendation in 2-3 sentences. Keep shortest (150-250 words).
+- **Referee (multi-round)**: Include the FULL debate transcript organized by round. Ask referee to note position evolution, not just final states.

 ## Implementation

 Read `scripts/council.sh` for the orchestration logic.
-For programmatic invocation, the main agent can also call `sessions_spawn` directly
+For programmatic invocation, the main agent calls `sessions_spawn` directly
 following the patterns above.

 ## Configuration
@@ -71,3 +106,4 @@ Default roster and prompt templates live in `references/prompts.md`.
 ## TODO (revisit later)
 - Revisit subagent personality depth — richer backstories, communication styles
 - Revisit skill name — "council" works for now
+- Experiment with different round counts and flows to find optimal depth/cost tradeoffs
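
Reviewer sketch (not part of the commit): the parameter table, the debate-flow minimum, and the tier rules in `SKILL.md` could be checked before any subagent is spawned. The names `CouncilConfig`, `make_config`, and `models_for_tier` are hypothetical, and the strings `"default"` and `"stronger"` stand in for whatever model identifiers the caller actually uses.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

FLOWS = ("parallel", "sequential", "debate")
TIERS = ("light", "medium", "heavy")


@dataclass(frozen=True)
class CouncilConfig:
    """Hypothetical container for the three documented parameters."""
    flow: str
    rounds: int
    tier: str


def make_config(flow: str = "parallel",
                rounds: Optional[int] = None,
                tier: str = "light") -> CouncilConfig:
    """Apply the documented defaults and reject out-of-range values."""
    if flow not in FLOWS:
        raise ValueError(f"unknown flow: {flow!r}")
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    if rounds is None:
        # SKILL.md: default is 1 round, except debate, which defaults to 3.
        rounds = 3 if flow == "debate" else 1
    if not 1 <= rounds <= 5:
        raise ValueError("rounds must be between 1 and 5")
    if flow == "debate" and rounds < 2:
        raise ValueError("debate flow needs at least 2 rounds")
    return CouncilConfig(flow=flow, rounds=rounds, tier=tier)


def models_for_tier(tier: str) -> Tuple[str, str]:
    """Return (advisor_model, referee_model) as placeholder names."""
    table = {
        "light": ("default", "default"),
        "medium": ("default", "stronger"),
        "heavy": ("stronger", "stronger"),
    }
    return table[tier]
```

Rejecting `flow=debate, rounds=1` up front is one reading of "minimum rounds=2"; silently raising the count to 2 would be an equally defensible choice.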
--- a/skills/council/references/prompts.md
+++ b/skills/council/references/prompts.md
@@ -20,7 +20,82 @@
 - **Stance**: "What could go wrong?"
 - **Style**: Cautious, thorough, devil's advocate. Not negative — protective.

-## Advisor System Prompt
+---
+
+## Round 1 — Opening Position
+
+```
+You are the {ROLE} advisor on a council deliberating a topic.
+
+Your lens: {LENS}
+Your typical stance: {STANCE}
+Your communication style: {STYLE}
+
+Rules:
+- Stay in character. Argue from your perspective consistently.
+- Be concise but substantive (200-400 words).
+- Acknowledge trade-offs honestly — don't strawman other views.
+- Reference specific aspects of the topic, not generic platitudes.
+- End with your key recommendation in 1-2 sentences.
+
+This is ROUND 1 of a {TOTAL_ROUNDS}-round debate. Give your opening position.
+
+Topic:
+{TOPIC}
+```
+
+## Middle Rounds — Rebuttal (rounds 2 to N-1)
+
+```
+You are the {ROLE} advisor on a council deliberating a topic.
+This is ROUND {N} of a {TOTAL_ROUNDS}-round debate.
+
+Your lens: {LENS}
+Your typical stance: {STANCE}
+Your communication style: {STYLE}
+
+You've seen the other advisors' positions from prior rounds. Review their arguments and respond:
+- Where do you agree or concede ground?
+- Where do you push back, and why?
+- Has anything changed your recommendation?
+
+Keep it to 200-300 words.
+
+---
+
+YOUR PRIOR POSITION(S):
+{OWN_PRIOR_OUTPUTS}
+
+OTHER ADVISORS (prior round):
+{OTHER_OUTPUTS}
+```
+
+## Final Round — Closing Position (round N)
+
+```
+You are the {ROLE} advisor on a council deliberating a topic.
+This is ROUND {N} — your FINAL position after {TOTAL_ROUNDS} rounds of debate.
+
+Your lens: {LENS}
+Your typical stance: {STANCE}
+Your communication style: {STYLE}
+
+Synthesize what you've learned from the debate. State your final position clearly:
+- What did you change your mind on?
+- What do you hold firm on?
+- Your final recommendation in 2-3 sentences.
+
+Keep it to 150-250 words.
+
+---
+
+DEBATE SO FAR:
+{FULL_DEBATE_TRANSCRIPT}
+```
+
+## Single-Round Advisor (when rounds=1)
+
+Use the Round 1 template but omit "This is ROUND 1 of a {TOTAL_ROUNDS}-round debate."

 ```
 You are the {ROLE} advisor on a council deliberating a topic.
@@ -40,7 +115,9 @@ Topic:
 {TOPIC}
 ```

-## Referee System Prompt
+---
+
+## Referee — Single Round

 ```
 You are the Referee of an advisory council. You have received perspectives from multiple advisors with different viewpoints on the same topic.
@@ -75,18 +152,38 @@ Advisor outputs below:
 {ADVISOR_OUTPUTS}
 ```

-## Rebuttal Round Prompt (for Sequential/Debate flows)
+## Referee — Multi-Round

 ```
-You are the {ROLE} advisor. You've seen the other advisors' perspectives on this topic.
+You are the Referee of an advisory council. You have received {TOTAL_ROUNDS} rounds of debate from {N} advisors on the topic: "{TOPIC}"

-Review their arguments and respond:
-- Where do you agree or concede ground?
-- Where do you push back, and why?
-- Has anything changed your recommendation?
+Your job:
+1. Identify points of agreement and disagreement across all advisors.
+2. Weigh the arguments fairly — no advisor gets preferential treatment.
+3. Produce a final verdict with clear reasoning.
+4. Be honest when the answer is genuinely uncertain.
+5. Note how positions evolved across rounds — where did minds change?

-Keep it to 100-200 words.
+Output format (use these exact headers):

-Other advisor outputs:
-{OTHER_OUTPUTS}
+## Advisor Perspectives (Summary)
+For each advisor, provide their final position and how it evolved over the {TOTAL_ROUNDS} rounds.
+
+## Points of Agreement
+What the advisors converged on through debate.
+
+## Key Tensions
+Where they still disagree after {TOTAL_ROUNDS} rounds, and why each side has merit.
+
+## Verdict
+Your synthesized recommendation with reasoning. Be specific and actionable.
+
+## Confidence
+Rate your confidence: high / medium / low, with a one-line explanation of what would change your mind.
+
+---
+
+Full debate transcript:
+
+{FULL_DEBATE_TRANSCRIPT}
 ```
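
Reviewer sketch (not part of the commit): one way the caller might choose among the templates above and assemble the `{OWN_PRIOR_OUTPUTS}` and `{OTHER_OUTPUTS}` values for a rebuttal round. The function names and the transcript shape (round number mapped to a role-to-text dictionary) are assumptions, and `advisor3` is a placeholder role name.

```python
from typing import Dict

# Assumed transcript shape: {round_number: {role: output_text}}
Transcript = Dict[int, Dict[str, str]]


def template_for_round(round_no: int, total_rounds: int) -> str:
    """Pick which prompt template applies to a given round."""
    if not 1 <= round_no <= total_rounds:
        raise ValueError("round_no must be within 1..total_rounds")
    if total_rounds == 1:
        return "single"      # Round 1 template without the round banner
    if round_no == 1:
        return "opening"
    if round_no == total_rounds:
        return "final"
    return "rebuttal"        # rounds 2 to N-1


def rebuttal_context(transcript: Transcript, role: str, round_no: int) -> Dict[str, str]:
    """Build the two placeholder values used by the rebuttal template."""
    # The advisor's own positions from every earlier round.
    own = "\n\n".join(
        f"[Round {r}] {transcript[r][role]}" for r in range(1, round_no)
    )
    # Every other advisor's output from the immediately preceding round.
    previous = transcript[round_no - 1]
    others = "\n\n".join(
        f"{name}: {text}" for name, text in previous.items() if name != role
    )
    return {"OWN_PRIOR_OUTPUTS": own, "OTHER_OUTPUTS": others}
```

With `rounds=2` this selection yields an opening round followed directly by a final round, so the rebuttal template is only used when there are three or more rounds.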
--- a/skills/council/scripts/council.sh
+++ b/skills/council/scripts/council.sh
@@ -4,51 +4,70 @@
 # This script is NOT executed directly. It documents the orchestration
 # logic the main agent follows when invoking the council skill.
 #
-# The main agent uses sessions_spawn (mode=run) to create each subagent.
+# See references/prompts.md for all prompt templates.
 #
-# ─── PARALLEL FLOW (default) ───────────────────────────────────────
+# ─── PARAMETERS ────────────────────────────────────────────────────
 #
-# 1. Build advisor prompts from references/prompts.md templates.
-# 2. Spawn 3 advisor subagents simultaneously:
+# flow:   parallel (default) | sequential | debate
+# rounds: 1 (default) | 2-5
+# tier:   light (default) | medium | heavy
 #
-#    sessions_spawn(
-#      task = "<advisor system prompt>\n\nTopic: <topic>",
-#      mode = "run",
-#      label = "council-<role>",          # e.g. council-pragmatist
-#      model = "<chosen model tier>",     # optional override
-#    )
+# ─── PARALLEL FLOW ─────────────────────────────────────────────────
 #
-# 3. Wait for all 3 completion events (push-based).
-# 4. Collect advisor outputs.
-# 5. Spawn referee subagent:
+# Single round (rounds=1):
+#   1. Spawn 3 advisors in parallel (sessions_spawn, mode=run)
+#   2. Collect all 3 outputs (push-based completion)
+#   3. Spawn referee with all outputs
+#   4. Deliver to user
 #
-#    sessions_spawn(
-#      task = "<referee system prompt with all advisor outputs>",
-#      mode = "run",
-#      label = "council-referee",
-#      model = "<chosen model tier>",     # may be stronger than advisors
-#    )
-#
-# 6. Deliver referee output to user with individual advisor perspectives
-#    included as collapsed summaries.
+# Multi-round (rounds=N):
+#   1. ROUND 1: Spawn 3 advisors in parallel (opening position prompt)
+#   2. Collect outputs
+#   3. ROUND 2..N-1: Respawn all 3 in parallel (rebuttal prompt)
+#      - Each gets: own prior output + all other advisors' prior output
+#   4. Collect outputs each round
+#   5. ROUND N: Respawn all 3 in parallel (final position prompt)
+#      - Each gets: full debate transcript summary
+#   6. Collect final outputs
+#   7. Spawn referee with FULL debate transcript (all rounds)
+#   8. Deliver to user
 #
 # ─── SEQUENTIAL FLOW ──────────────────────────────────────────────
 #
-# Same as parallel but advisors are spawned one at a time.
-# Each subsequent advisor sees prior outputs in their prompt.
-# Optional rebuttal round before referee.
+# Single round (rounds=1):
+#   1. Spawn advisor 1 → collect output
+#   2. Spawn advisor 2 with advisor 1's output → collect
+#   3. Spawn advisor 3 with advisor 1+2 outputs → collect
+#   4. Spawn referee with all outputs
+#
+# Multi-round (rounds=N):
+#   1. ROUND 1: Sequential as above
+#   2. ROUND 2..N: Each advisor sees ALL prior round outputs
+#   3. Spawn referee with full thread
 #
 # ─── DEBATE FLOW ──────────────────────────────────────────────────
 #
-# 1. Parallel initial takes (same as parallel flow steps 1-4).
-# 2. Rebuttal round: respawn each advisor with all other outputs visible.
-# 3. Collect rebuttals.
-# 4. Spawn referee with initial takes + rebuttals.
+# Always multi-round (min 2, default 3):
+#   1. ROUND 1: Parallel opening takes
+#   2. ROUND 2..N-1: Cross-rebuttals (parallel, each sees all others)
+#   3. ROUND N: Final positions (parallel, full transcript)
+#   4. Spawn referee with full debate + evolution notes
 #
 # ─── MODEL TIER SELECTION ─────────────────────────────────────────
 #
-# Light:  advisors=default, referee=default
-# Medium: advisors=default, referee=stronger (e.g. opus-tier)
-# Heavy:  advisors=stronger, referee=stronger
+# light:  advisors=default, referee=default
+# medium: advisors=default, referee=stronger (e.g. opus-tier)
+# heavy:  advisors=stronger, referee=stronger
 #
-# The main agent decides tier before spawning based on topic complexity.
+# ─── SUBAGENT LABELING ────────────────────────────────────────────
+#
+# Labels follow pattern: council-r{round}-{role}
+# Examples: council-r1-pragmatist, council-r2-skeptic, council-referee
+# Single-round: council-pragmatist, council-referee (no round prefix)
+#
+# ─── WORD COUNT GUIDANCE ──────────────────────────────────────────
+#
+# Round 1 (opening):  200-400 words
+# Middle rounds:      200-300 words
+# Final round:        150-250 words
+# This keeps multi-round debates from exploding in token cost.
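
Reviewer sketch (not part of the commit): the labeling convention and word-count guidance from `council.sh` written as two small helpers. `subagent_label` and `word_range` are hypothetical names; the values come from the comments above.

```python
from typing import Tuple


def subagent_label(role: str, round_no: int, total_rounds: int) -> str:
    """Build a label following the council-r{round}-{role} convention."""
    if role == "referee":
        return "council-referee"          # the referee never gets a round prefix
    if total_rounds == 1:
        return f"council-{role}"          # single-round: no round prefix
    return f"council-r{round_no}-{role}"


def word_range(round_no: int, total_rounds: int) -> Tuple[int, int]:
    """Return the (min, max) word guidance for an advisor in this round."""
    if round_no == 1:
        return (200, 400)                 # opening position
    if round_no == total_rounds:
        return (150, 250)                 # final position
    return (200, 300)                     # middle rebuttal rounds
```

For a 3-round run, `subagent_label("skeptic", 2, 3)` gives `council-r2-skeptic`, which matches the example in the script comments.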