
zap e08e3d65e9 feat(council): add D/P (Deterministic/Probabilistic) dual-group mode

- New 'mode' parameter: personality (default) or dp
- D group: grounded, feasibility-first (Freethinker + Arbiter)
- P group: exploratory, reframing-first (Freethinker + Arbiter)
- Meta-Arbiter merges best ideas from both groups
- Full prompt templates for ideation, assessment, bridge, and merge
- Orchestration docs for single-round and multi-round D/P flows
- Inspired by Flynn's dual-council architecture, adapted for OpenClaw subagents

2026-03-05 19:18:44 +00:00



name	description
council	Convene a council of AI advisor agents with distinct perspectives to deliberate on a topic, then synthesize their views into a verdict. Use when: (1) user asks for multi-perspective analysis, (2) wants to brainstorm with diverse viewpoints, (3) requests a council's or advisors' opinion, (4) needs a balanced decision on a complex question. Supports parallel (default), sequential, and debate flows with configurable round count. NOT for: simple factual lookups, single-perspective tasks, or quick one-liner answers.


Council Skill

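Spawn a council of advisor subagents plus a referee subagent to deliberate on a topic. In the default personality mode this is 3 advisors, each with a distinct personality/lens, and 1 referee; in D/P mode it is two Freethinker/Arbiter pairs and a Meta-Arbiter. The referee synthesizes the advisors' output into a final verdict with collapsed advisor perspectives.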

Parameters

Parameter	Default	Description
mode	personality	`personality` (3 advisors) or `dp` (D/P dual-group)
flow	parallel	`parallel`, `sequential`, or `debate`
rounds	1	Number of deliberation rounds (1-5). Round 1 = opening positions. Round 2+ = rebuttals where advisors see and respond to each other. `flow=debate` requires at least 2 and defaults to 3.
tier	light	Model tier: `light`, `medium`, or `heavy` (see Model Selection)

Quick reference:

flow=parallel, rounds=1 — fast single-shot, all advisors in parallel, then referee (default)
flow=parallel, rounds=3 — parallel opening + 2 rebuttal rounds + referee (recommended for depth)
flow=sequential, rounds=1 — each advisor sees prior outputs, then referee
flow=debate, rounds=3 — parallel opening + cross-advisor rebuttals + referee synthesis
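
The parameter rules above can be sketched as a small validation helper. This is only an illustration: `resolve_config` is a hypothetical name that does not appear in the skill, and the debate-specific minimum and default are taken from the Debate flow description in this document.

```python
MODES = ("personality", "dp")
FLOWS = ("parallel", "sequential", "debate")
TIERS = ("light", "medium", "heavy")


def resolve_config(mode="personality", flow="parallel", rounds=None, tier="light"):
    """Hypothetical helper: apply documented defaults and reject invalid values."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    if flow not in FLOWS:
        raise ValueError(f"flow must be one of {FLOWS}")
    if tier not in TIERS:
        raise ValueError(f"tier must be one of {TIERS}")
    if rounds is None:
        # Documented defaults: 1 round in general, 3 for the debate flow.
        rounds = 3 if flow == "debate" else 1
    if not 1 <= rounds <= 5:
        raise ValueError("rounds must be between 1 and 5")
    if flow == "debate" and rounds < 2:
        raise ValueError("flow=debate is always multi-round (rounds >= 2)")
    return {"mode": mode, "flow": flow, "rounds": rounds, "tier": tier}
```

Calling `resolve_config()` with no arguments yields the fast single-shot default; `resolve_config(flow="debate")` resolves to three rounds.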

Modes

Personality Mode (default)

Three advisors with distinct personality lenses. Best for opinion, strategy, and brainstorming topics.

D/P Mode (Deterministic/Probabilistic)

Two groups of advisors with opposing cognitive styles, inspired by Flynn's dual-council architecture. Best for complex problem-solving, technical design, and situations where you want both grounded AND creative solutions.

Group D (Deterministic): Grounded, feasibility-first, risk-averse. Optimizes for "boring-but-true."
Group P (Probabilistic): Exploratory, reframing-first, risk-tolerant. Optimizes for "non-obvious leverage."

Each group has a Freethinker (generates ideas) and an Arbiter (evaluates/ranks them). The Referee (Meta-Arbiter) merges the best from both groups.

D/P Subagent Roster

Role	Group	Lens	Stance
D-Freethinker	Deterministic	Proven approaches, minimal assumptions	"What's the most reliable path?"
D-Arbiter	Deterministic	Feasibility scoring, risk assessment	"Does this hold up under scrutiny?"
P-Freethinker	Probabilistic	Reframing, lateral thinking	"What if the question is wrong?"
P-Arbiter	Probabilistic	Novelty scoring, upside potential	"Is this different enough to matter?"
Meta-Arbiter	—	Cross-group synthesis	"What survives both worldviews?"

D/P Flow (mode=dp)

Single-round (rounds=1):

1. Spawn D-Freethinker and P-Freethinker in parallel: each generates ideas.
2. Spawn D-Arbiter and P-Arbiter in parallel: each evaluates its own group's ideas and produces a shortlist.
3. Spawn Meta-Arbiter with both group shortlists: it selects primary/secondary ideas and identifies merges.

Multi-round (rounds=N):

Round 1: Parallel ideation (both Freethinkers), then parallel assessment (both Arbiters). Each group then produces a bridge packet for the other group.
Round 2..N: Each group receives the other group's bridge packet. Freethinkers revise/extend their ideas. Arbiters re-evaluate.
Final: Meta-Arbiter receives the final shortlists from both groups.

Total subagent calls: 4N+1 for N rounds (2 Freethinkers + 2 Arbiters per round, plus 1 Meta-Arbiter call), so 5 for a single round.
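
As a sanity check on the call count, the D/P flow can be written out as a list of spawn batches. This is a sketch only: `dp_schedule` and `dp_call_count` are hypothetical names, and the batch structure is an assumption based on the flow described above.

```python
def dp_schedule(rounds):
    """Return (round label, roles spawned together) batches for mode=dp."""
    if rounds < 1:
        raise ValueError("rounds must be >= 1")
    batches = []
    for round_no in range(1, rounds + 1):
        # Ideation: both Freethinkers run in parallel.
        batches.append((round_no, ["D-Freethinker", "P-Freethinker"]))
        # Assessment: both Arbiters run in parallel on their own group's ideas.
        batches.append((round_no, ["D-Arbiter", "P-Arbiter"]))
    # The Meta-Arbiter runs once, after the last round.
    batches.append(("final", ["Meta-Arbiter"]))
    return batches


def dp_call_count(rounds):
    """Count subagent calls by summing over the schedule."""
    return sum(len(roles) for _, roles in dp_schedule(rounds))
```

Summing over the schedule gives 5 calls for one round and 13 for three, matching 4N+1.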

When to use D/P vs Personality

Personality (mode=personality): "Should we do X?" — opinion/judgment calls, strategy debates
D/P (mode=dp): "How should we solve X?" — problem-solving, technical design, generating concrete approaches

Personality Mode — Advisor Roster

Role	Lens	System stance
Pragmatist	Feasibility, cost, effort	"Can we actually do this?"
Visionary	Long-term potential, innovation	"What if we went bigger?"
Skeptic	Risk, failure modes, edge cases	"What could go wrong?"

The referee is a separate agent: balanced, fair, synthesis-oriented.

Flows

1. Parallel + Synthesis (default)

Single-round version (rounds=1):

1. Spawn all 3 advisors simultaneously via sessions_spawn (spawn mode=run, which is separate from the council's own mode parameter).
2. Each advisor receives the same topic prompt with their personality instructions.
3. Wait for all 3 to complete (push-based).
4. Spawn the referee with all 3 advisor outputs.
5. Referee produces the final verdict.

Multi-round version (rounds=N):

1. Round 1: Spawn all 3 advisors in parallel with the opening position prompt. Collect all outputs.
2. Round 2..N: For each rebuttal round, respawn all 3 advisors in parallel. Each receives:
   - Their own prior position(s)
   - All other advisors' prior-round output
   - Round-specific instructions (rebuttal prompt for middle rounds, final position prompt for the last round)
   Collect outputs after each round.
3. Referee: Spawn the referee with the full debate transcript (all rounds, all advisors).
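
The multi-round parallel flow can be sketched with `asyncio`. This is a minimal illustration, not the skill's actual orchestration (that lives in scripts/council.sh): the `spawn(role, prompt)` coroutine is a hypothetical stand-in for sessions_spawn, and `build_prompt` is a placeholder for the real templates in references/prompts.md.

```python
import asyncio

ADVISORS = ("Pragmatist", "Visionary", "Skeptic")


def build_prompt(name, topic, round_no, rounds, transcript):
    """Hypothetical prompt builder; real templates live in references/prompts.md."""
    if round_no == 1:
        return f"[{name}] Opening position on: {topic}"
    # The last round gets the final-position prompt, middle rounds get rebuttals.
    kind = "Final position" if round_no == rounds else "Rebuttal"
    own = [outputs[name] for outputs in transcript]
    others = {k: v for k, v in transcript[-1].items() if k != name}
    return (
        f"[{name}] {kind} on: {topic}\n"
        f"Your prior positions: {own}\n"
        f"Other advisors, previous round: {others}"
    )


async def run_parallel_council(topic, rounds, spawn):
    """Run N parallel rounds, then one referee call over the full transcript."""
    transcript = []  # one {advisor: output} dict per completed round
    for round_no in range(1, rounds + 1):
        # All three advisors of a round are spawned together and awaited together.
        outputs = await asyncio.gather(*(
            spawn(name, build_prompt(name, topic, round_no, rounds, transcript))
            for name in ADVISORS
        ))
        transcript.append(dict(zip(ADVISORS, outputs)))
    verdict = await spawn("Referee", f"Topic: {topic}\nTranscript: {transcript}")
    return transcript, verdict
```

Passing `spawn` in as an argument keeps the sketch independent of any particular subagent API; a stub coroutine is enough to exercise the control flow.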

2. Sequential Rounds

Single-round (rounds=1):

Spawn advisors one at a time, each seeing prior advisor outputs.
Spawn referee with full thread.

Multi-round (rounds=N):

Round 1: Advisors go sequentially, each seeing prior advisors in that round.
Round 2..N: Each advisor sees ALL prior round outputs before giving their rebuttal/final take.
Referee: Gets the full thread.
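
A sequential run can be sketched as a plain loop. The `spawn(role, topic, round_no, context)` callable is a hypothetical stand-in for sessions_spawn. One assumption goes beyond the text: in rounds 2..N this sketch also shows each advisor the earlier speakers of the current round, in addition to all prior rounds.

```python
ADVISORS = ("Pragmatist", "Visionary", "Skeptic")


def run_sequential_council(topic, rounds, spawn):
    """Spawn advisors one at a time; each sees everything said before it."""
    thread = []  # (round_no, advisor, output) in speaking order
    for round_no in range(1, rounds + 1):
        for name in ADVISORS:
            # Copy the thread so the callee cannot mutate our record.
            output = spawn(name, topic, round_no, list(thread))
            thread.append((round_no, name, output))
    # The referee gets the full thread.
    verdict = spawn("Referee", topic, None, list(thread))
    return thread, verdict
```

With `rounds=1` the first advisor sees an empty context, the second sees one prior output, the third sees two, and the referee sees all three.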

3. Debate

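Always multi-round (minimum rounds=2; this flow defaults to rounds=3 rather than the general default of 1). With rounds=2 there is no cross-rebuttal round: advisors go straight from opening takes to final positions.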

Round 1: Parallel opening takes.
Round 2..N-1: Cross-rebuttals — each advisor responds to all others.
Round N: Final positions.
Referee: Gets full debate transcript, notes evolution of positions.

Model Selection

Pick model tier based on topic complexity:

light (casual brainstorm, simple pros/cons): default model for advisors and referee.
medium (architecture decisions, strategy): default model for advisors, stronger model for referee.
heavy (critical decisions, deep analysis): stronger model for all agents.

The caller (main agent) determines tier before spawning.
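
The tier rules amount to a small lookup table. In the sketch below, `"default"` and `"stronger"` are placeholders rather than real model identifiers, and `model_for` is a hypothetical helper name.

```python
# Placeholder labels only: substitute whatever model names the deployment uses.
TIER_MODELS = {
    "light": {"advisor": "default", "referee": "default"},
    "medium": {"advisor": "default", "referee": "stronger"},
    "heavy": {"advisor": "stronger", "referee": "stronger"},
}


def model_for(tier, role):
    """Return the model label for a role; anything that is not the referee counts as an advisor."""
    if tier not in TIER_MODELS:
        raise ValueError(f"unknown tier: {tier}")
    kind = "referee" if role == "referee" else "advisor"
    return TIER_MODELS[tier][kind]
```

For example, `model_for("medium", "referee")` returns `"stronger"` while `model_for("medium", "Skeptic")` returns `"default"`.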

Round-Specific Prompt Guidance

See references/prompts.md for all prompt templates. Key points:

Round 1 (Opening): Full advisor system prompt + topic. Ask for opening position.
Middle rounds (Rebuttals): Include prior positions from ALL advisors. Ask: where do you agree, push back, or change your mind? Keep responses shorter than the opening (200-300 words).
Final round: Ask for a final synthesis: what changed, what held firm, and a final recommendation in 2-3 sentences. Keep responses shortest of all (150-250 words).
Referee (multi-round): Include the FULL debate transcript organized by round. Ask referee to note position evolution, not just final states.
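
The per-round guidance can be captured as a function of the round number. This is a sketch: `round_instructions` is a hypothetical name, the wording of each `ask` paraphrases the guidance above, and the opening round carries no word range because the document does not state one.

```python
def round_instructions(round_no, rounds):
    """Map a round number to its prompt kind, ask, and target word range."""
    if not 1 <= round_no <= rounds:
        raise ValueError("round_no must be between 1 and rounds")
    if round_no == 1:
        # Opening: full advisor system prompt + topic; no length target documented.
        return {"kind": "opening", "ask": "Give your opening position.", "words": None}
    if round_no < rounds:
        return {
            "kind": "rebuttal",
            "ask": "Where do you agree, push back, or change your mind?",
            "words": (200, 300),
        }
    return {
        "kind": "final",
        "ask": "What changed, what held firm, and your final recommendation in 2-3 sentences.",
        "words": (150, 250),
    }
```

With `rounds=3` the sequence is opening, rebuttal, final; with `rounds=2` the rebuttal step drops out and round 2 is the final round.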

Experimental Findings

Tested all 3 flows on the same topic ("Should AI assistants have persistent memory?"):

Parallel 1-round vs Parallel 3-round

1-round: Fast, good for quick takes. Advisors give independent positions, referee synthesizes. Clean but no cross-pollination — advisors can't respond to each other's arguments.
3-round: Significantly richer. Positions evolved meaningfully — the Visionary stepped back from always-on after engaging with Skeptic's arguments, the Skeptic softened on trajectory. Referee captured evolution. Best overall depth-to-cost ratio.
Takeaway: 3 rounds is the sweet spot. 1 round works for quick brainstorms. More than 3 likely hits diminishing returns (in this test, positions had converged by round 3).

Sequential vs Parallel

Sequential: Later advisors build directly on earlier ones — less redundancy, more focused rebuttals. The Skeptic (speaking last) gave the sharpest response because they could address both prior positions directly. But earlier advisors can't respond to later ones without extra rounds.
Parallel: Advisors are more independent, sometimes overlapping. But each brings a genuinely uninfluenced perspective in round 1, which can surface blind spots that sequential misses.
Takeaway: Sequential produces tighter dialogue in fewer total subagent calls (3 advisors + 1 referee = 4 calls for one round). Parallel gives more independent coverage but needs multiple rounds for depth (3 advisors × 3 rounds + 1 referee = 10 calls).

Debate (parallel 3-round) vs Parallel 3-round

The flows are mechanically identical in our implementation. The distinction is mainly about prompt framing — debate prompts emphasize direct engagement ("respond to the Visionary's claim that...") while parallel rebuttal prompts are more general ("where do you agree or push back?").
Takeaway: These can be unified. The "debate" label is useful for user-facing intent ("I want them to argue") but doesn't need a separate mechanical flow.

Cost profile (approximate, per run on default model tier)

Parallel 1-round: 4 subagent calls, ~60k tokens total
Sequential 1-round: 4 subagent calls, ~55k tokens total (slightly fewer tokens, since advisors who see earlier outputs repeat less)
Parallel/Debate 3-round: 10 subagent calls, ~130k tokens total
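
The call counts follow from 3 advisors per round plus one referee. The sketch below encodes that formula and derives a rough per-call token figure from the measurements listed above. The helper names are hypothetical, and the token totals are the approximate figures from this section, not new data.

```python
ADVISOR_COUNT = 3

# Approximate totals copied from the cost profile above (default model tier).
MEASURED_TOKENS = {
    ("parallel", 1): 60_000,
    ("sequential", 1): 55_000,
    ("parallel", 3): 130_000,
    ("debate", 3): 130_000,
}


def personality_call_count(rounds):
    """Subagent calls in personality mode: every advisor each round, plus one referee."""
    return ADVISOR_COUNT * rounds + 1


def tokens_per_call(flow, rounds):
    """Average tokens per subagent call for a measured configuration."""
    return MEASURED_TOKENS[(flow, rounds)] / personality_call_count(rounds)
```

On these figures a 3-round parallel run averages about 13k tokens per call against about 15k for a single round, so the total cost grows somewhat more slowly than the call count.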

Recommended defaults by use case

Quick brainstorm: flow=parallel, rounds=1 — fast, cheap, good enough for casual topics
Balanced analysis: flow=parallel, rounds=3 — best depth-to-cost ratio, recommended default for substantive topics
Tight dialogue: flow=sequential, rounds=1 — fewest calls, good for focused topics where building on each other matters
Deep dive: flow=debate, rounds=3 — same as parallel 3-round with more combative prompting

Implementation

Read scripts/council.sh for the orchestration logic. For programmatic invocation, the main agent calls sessions_spawn directly following the patterns above.

Configuration

Advisor personalities can be customized per-invocation by overriding the roster. Default roster and prompt templates live in references/prompts.md.
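
One way a per-invocation roster override could work is a shallow merge over the defaults. This is an assumption about the override semantics, not a description of the actual implementation; `resolve_roster` and the dictionary layout are hypothetical, while the lens and stance text is taken from the advisor roster table.

```python
DEFAULT_ROSTER = {
    "Pragmatist": {"lens": "Feasibility, cost, effort", "stance": "Can we actually do this?"},
    "Visionary": {"lens": "Long-term potential, innovation", "stance": "What if we went bigger?"},
    "Skeptic": {"lens": "Risk, failure modes, edge cases", "stance": "What could go wrong?"},
}


def resolve_roster(overrides=None):
    """Merge per-role overrides into a copy of the default roster."""
    # Copy each entry so the defaults are never mutated.
    roster = {name: dict(spec) for name, spec in DEFAULT_ROSTER.items()}
    for name, spec in (overrides or {}).items():
        # Unspecified fields fall back to the default for that role, if any.
        roster[name] = {**roster.get(name, {}), **spec}
    return roster
```

For example, `resolve_roster({"Skeptic": {"stance": "What is the worst realistic outcome?"}})` changes only the Skeptic's stance and leaves the lens and the other advisors untouched.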

TODO (revisit later)

Revisit subagent personality depth — richer backstories, communication styles
Revisit skill name — "council" works for now
Consider unifying debate and parallel flows (mechanically identical, differ only in prompt tone)
Explore whether 2 rounds is sufficient for most topics (vs 3)
Test with different model tiers for advisors vs referee
Test D/P mode end-to-end — validate prompt templates produce useful structured output
Tune D/P ideas_per_round and scoring thresholds
Consider hybrid mode: D/P groups for ideation then personality advisors for evaluation
Bridge packet design: what info to exchange between groups in multi-round D/P
