feat(council): add D/P (Deterministic/Probabilistic) dual-group mode

- New 'mode' parameter: personality (default) or dp
- D group: grounded, feasibility-first (Freethinker + Arbiter)
- P group: exploratory, reframing-first (Freethinker + Arbiter)
- Meta-Arbiter merges best ideas from both groups
- Full prompt templates for ideation, assessment, bridge, and merge
- Orchestration docs for single-round and multi-round D/P flows
- Inspired by Flynn's dual-council architecture, adapted for OpenClaw subagents
2026-03-05 19:18:44 +00:00
parent 0acd7a2772
commit e08e3d65e9
3 changed files with 315 additions and 19 deletions
@@ -13,6 +13,7 @@ final verdict with collapsed advisor perspectives.
 | Parameter | Default | Description |
 |-----------|---------|-------------|
 | mode      | personality | `personality` (3 advisors) or `dp` (D/P dual-group) |
 | flow      | parallel | `parallel`, `sequential`, or `debate` |
 | rounds    | 1       | Number of deliberation rounds (1-5). Round 1 = opening positions. Round 2+ = rebuttals where advisors see and respond to each other. |
 | tier      | light   | Model tier: `light`, `medium`, or `heavy` (see Model Selection) |
@@ -23,7 +24,51 @@ final verdict with collapsed advisor perspectives.
 - `flow=sequential, rounds=1` — each advisor sees prior outputs, then referee
 - `flow=debate, rounds=3` — parallel opening + cross-advisor rebuttals + referee synthesis
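A minimal sketch of how the parameters above might be validated before a run. The `CouncilParams` class and its field names are hypothetical; they mirror only the four table rows visible in this hunk, and the skill itself may handle parameters differently.

```python
from dataclasses import dataclass

# Allowed values, copied from the parameter table.
VALID_MODES = ("personality", "dp")
VALID_FLOWS = ("parallel", "sequential", "debate")
VALID_TIERS = ("light", "medium", "heavy")


@dataclass(frozen=True)
class CouncilParams:
    """Hypothetical container for council parameters (not the skill's real API)."""

    mode: str = "personality"
    flow: str = "parallel"
    rounds: int = 1
    tier: str = "light"

    def __post_init__(self) -> None:
        # Reject anything outside the documented value sets.
        if self.mode not in VALID_MODES:
            raise ValueError(f"mode must be one of {VALID_MODES}, got {self.mode!r}")
        if self.flow not in VALID_FLOWS:
            raise ValueError(f"flow must be one of {VALID_FLOWS}, got {self.flow!r}")
        if self.tier not in VALID_TIERS:
            raise ValueError(f"tier must be one of {VALID_TIERS}, got {self.tier!r}")
        # The table documents 1-5 deliberation rounds.
        if not (isinstance(self.rounds, int) and 1 <= self.rounds <= 5):
            raise ValueError(f"rounds must be an integer in 1..5, got {self.rounds!r}")


# Usage: a two-round D/P run on the default tier.
params = CouncilParams(mode="dp", rounds=2)
print(params)
```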
-## Advisor Roster (default)
+## Modes
 ### Personality Mode (default)
 Three advisors with distinct personality lenses. Best for opinion, strategy, and brainstorming topics.
 ### D/P Mode (Deterministic/Probabilistic)
 Two groups of advisors with opposing cognitive styles, inspired by Flynn's dual-council architecture. Best for complex problem-solving, technical design, and situations where you want both grounded AND creative solutions.
 - **Group D (Deterministic)**: Grounded, feasibility-first, risk-averse. Optimizes for "boring-but-true."
 - **Group P (Probabilistic)**: Exploratory, reframing-first, risk-tolerant. Optimizes for "non-obvious leverage."
 Each group has a **Freethinker** (generates ideas) and an **Arbiter** (evaluates/ranks them). The **Referee** (Meta-Arbiter) merges the best from both groups.
 #### D/P Subagent Roster
 | Role | Group | Lens | Stance |
 |------|-------|------|--------|
 | D-Freethinker | Deterministic | Proven approaches, minimal assumptions | "What's the most reliable path?" |
 | D-Arbiter | Deterministic | Feasibility scoring, risk assessment | "Does this hold up under scrutiny?" |
 | P-Freethinker | Probabilistic | Reframing, lateral thinking | "What if the question is wrong?" |
 | P-Arbiter | Probabilistic | Novelty scoring, upside potential | "Is this different enough to matter?" |
 | Meta-Arbiter | — | Cross-group synthesis | "What survives both worldviews?" |
 #### D/P Flow (mode=dp)
 Single-round (rounds=1):
 1. Spawn D-Freethinker and P-Freethinker in parallel — each generates ideas.
 2. Spawn D-Arbiter and P-Arbiter in parallel — each evaluates their group's ideas, produces shortlist.
 3. Spawn Meta-Arbiter with both group shortlists — selects primary/secondary ideas, identifies merges.
 Multi-round (rounds=N):
 1. **Round 1**: Parallel ideation (both freethinkers) → parallel assessment (both arbiters) → bridge packets.
 2. **Round 2..N**: Each group receives the other's bridge packet. Freethinkers revise/extend. Arbiters re-evaluate.
 3. **Final**: Meta-Arbiter receives final shortlists from both groups.
 Total subagent calls: 4N+1 for N rounds (2 freethinkers + 2 arbiters per round, plus 1 Meta-Arbiter), so 5 for a single round.
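The call count follows directly from the flow: every round runs two freethinkers and two arbiters, and the Meta-Arbiter runs once at the end. A small sketch of that arithmetic, using the hypothetical helper names `dp_subagent_calls` and `dp_call_plan`:

```python
def dp_subagent_calls(rounds: int) -> int:
    """Total subagent calls for a D/P run: 4 per round plus 1 Meta-Arbiter."""
    if rounds < 1:
        raise ValueError("rounds must be >= 1")
    return 4 * rounds + 1


def dp_call_plan(rounds: int) -> list[list[str]]:
    """Return batches of roles; roles inside one batch can be spawned in parallel."""
    if rounds < 1:
        raise ValueError("rounds must be >= 1")
    plan = []
    for _ in range(rounds):
        plan.append(["D-Freethinker", "P-Freethinker"])  # ideation
        plan.append(["D-Arbiter", "P-Arbiter"])          # assessment
    plan.append(["Meta-Arbiter"])                        # cross-group merge
    return plan


# Usage: the plan and the formula agree for any round count.
for n in (1, 2, 3):
    assert sum(len(batch) for batch in dp_call_plan(n)) == dp_subagent_calls(n)
```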
 #### When to use D/P vs Personality
 - **Personality** (`mode=personality`): "Should we do X?" — opinion/judgment calls, strategy debates
 - **D/P** (`mode=dp`): "How should we solve X?" — problem-solving, technical design, generating concrete approaches
 ## Personality Mode — Advisor Roster
 | Role         | Lens                            | System stance                    |
 |--------------|---------------------------------|----------------------------------|
@@ -138,3 +183,7 @@ Default roster and prompt templates live in `references/prompts.md`.
 - Consider unifying debate and parallel flows (mechanically identical, differ only in prompt tone)
 - Explore whether 2 rounds is sufficient for most topics (vs 3)
 - Test with different model tiers for advisors vs referee
 - Test D/P mode end-to-end — validate prompt templates produce useful structured output
 - Tune D/P ideas_per_round and scoring thresholds
 - Consider hybrid mode: D/P groups for ideation then personality advisors for evaluation
 - Bridge packet design: what info to exchange between groups in multi-round D/P
@@ -1,6 +1,22 @@
 # Council Prompt Templates
-## Default Advisor Roster
+## Group Modes
 The council supports two group modes:
 ### Personality Mode (default — current behavior)
 Three advisors with distinct personality lenses. Best for opinion/strategy/brainstorming topics.
 ### D/P Mode (Deterministic/Probabilistic)
 Two groups of advisors with opposing cognitive styles, inspired by Flynn's dual-council architecture:
 - **Group D (Deterministic)**: Grounded, feasibility-first, risk-averse. Optimizes for "boring-but-true."
 - **Group P (Probabilistic)**: Exploratory, reframing-first, risk-tolerant. Optimizes for "non-obvious leverage."
 Each group has a **Freethinker** (generates ideas) and an **Arbiter** (evaluates/ranks them). The **Referee** (Meta-Arbiter) merges the best from both groups.
 ---
 ## Personality Mode — Advisor Roster
 ### Pragmatist
 - **Role**: Pragmatist
@@ -22,7 +38,53 @@
 ---
-## Round 1 — Opening Position
+## D/P Mode — Group Roster
 ### Group D — Deterministic
 #### D-Freethinker
 - **Role**: D-Freethinker
 - **Group**: Deterministic
 - **Lens**: Proven approaches, incremental improvements, minimal assumptions
 - **Stance**: "What's the most reliable path?"
 - **Style**: Methodical, evidence-based, conservative. Prefers known quantities over speculation.
 - **Constraints**: No moonshots, no handwavy claims, no unverified assumptions.
 #### D-Arbiter
 - **Role**: D-Arbiter
 - **Group**: Deterministic
 - **Lens**: Feasibility scoring, risk assessment, testability
 - **Stance**: "Does this actually hold up under scrutiny?"
 - **Style**: Analytical, structured. Scores ideas on novelty, feasibility, impact, testability. Filters aggressively.
 ### Group P — Probabilistic
 #### P-Freethinker
 - **Role**: P-Freethinker
 - **Group**: Probabilistic
 - **Lens**: Reframing, non-obvious leverage, lateral thinking
 - **Stance**: "What if the question is wrong?"
 - **Style**: Creative, provocative, comfortable with uncertainty. Labels speculation explicitly.
 - **Constraints**: No incremental tweaks, no obvious best practices, no purely conventional solutions.
 #### P-Arbiter
 - **Role**: P-Arbiter
 - **Group**: Probabilistic
 - **Lens**: Novelty scoring, opportunity cost, upside potential
 - **Stance**: "Is this actually different enough to matter?"
 - **Style**: Evaluative but biased toward high-novelty, high-impact ideas. Tolerates higher risk.
 ### Referee (D/P Mode)
 - **Role**: Meta-Arbiter
 - **Lens**: Cross-group synthesis, best-of-both selection
 - **Stance**: "What survives scrutiny from both worldviews?"
 - **Style**: Fair, integrative. Selects primary and secondary ideas from both groups, identifies productive merges, rejects weak ideas with clear reasoning.
 ---
 ## Personality Mode Prompts
 ### Round 1 — Opening Position
 ```
 You are the {ROLE} advisor on a council deliberating a topic.
@@ -44,7 +106,7 @@ Topic:
 {TOPIC}
 ```
-## Middle Rounds — Rebuttal (rounds 2 to N-1)
+### Middle Rounds — Rebuttal (rounds 2 to N-1)
 ```
 You are the {ROLE} advisor on a council deliberating a topic.
@@ -70,7 +132,7 @@ OTHER ADVISORS (prior round):
 {OTHER_OUTPUTS}
 ```
-## Final Round — Closing Position (round N)
+### Final Round — Closing Position (round N)
 ```
 You are the {ROLE} advisor on a council deliberating a topic.
@@ -93,7 +155,7 @@ DEBATE SO FAR:
 {FULL_DEBATE_TRANSCRIPT}
 ```
-## Single-Round Advisor (when rounds=1)
+### Single-Round Advisor (when rounds=1)
 Use the Round 1 template but omit "This is ROUND 1 of a {TOTAL_ROUNDS}-round debate."
@@ -115,9 +177,7 @@ Topic:
 {TOPIC}
 ```
---
+### Referee — Single Round (Personality Mode)
-## Referee — Single Round
 ```
 You are the Referee of an advisory council. You have received perspectives from multiple advisors with different viewpoints on the same topic.
@@ -152,7 +212,7 @@ Advisor outputs below:
 {ADVISOR_OUTPUTS}
 ```
-## Referee — Multi-Round
+### Referee — Multi-Round (Personality Mode)
 ```
 You are the Referee of an advisory council. You have received {TOTAL_ROUNDS} rounds of debate from {N} advisors on the topic: "{TOPIC}"
@@ -187,3 +247,122 @@ Full debate transcript:
 {FULL_DEBATE_TRANSCRIPT}
 ```
 ---
 ## D/P Mode Prompts
 ### D/P Freethinker — Ideation
 ```
 You are the {GROUP}-Freethinker on a dual-council deliberation.
 Your group: {GROUP_NAME}
 Your lens: {LENS}
 Your style: {STYLE}
 Forbidden approaches: {FORBIDDEN_APPROACHES}
 Generate {IDEAS_PER_ROUND} distinct ideas/approaches for the task below.
 For each idea, provide:
 - title: short descriptive name
 - hypothesis: what you believe and why
 - mechanism: how it would work concretely
 - expected_outcome: what success looks like, measurably
 Be substantive and specific. No generic platitudes.
 {PEER_BRIDGE_CONTEXT}
 Task: {TOPIC}
 Context: {CONTEXT}
 Success definition: {SUCCESS_DEFINITION}
 Constraints: {CONSTRAINTS}
 ```
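One way the `{PLACEHOLDER}` fields in a template like the one above could be filled is with Python's built-in `str.format`, checking first that every placeholder has a value. This is only a sketch: `render_prompt` is a hypothetical helper, and the template here is a shortened stand-in for the full ideation prompt.

```python
from string import Formatter


def render_prompt(template: str, **values: str) -> str:
    """Fill {PLACEHOLDER} fields, failing loudly if any value is missing."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion).
    needed = {name for _, name, _, _ in Formatter().parse(template) if name}
    missing = sorted(needed - set(values))
    if missing:
        raise KeyError(f"missing placeholder values: {missing}")
    return template.format(**values)


# Shortened stand-in for the ideation template (assumption: same placeholder names).
TEMPLATE = (
    "You are the {GROUP}-Freethinker on a dual-council deliberation.\n"
    "Generate {IDEAS_PER_ROUND} distinct ideas/approaches for the task below.\n"
    "Task: {TOPIC}"
)

print(render_prompt(TEMPLATE, GROUP="D", IDEAS_PER_ROUND="3", TOPIC="Reduce CI build time"))
```

Failing on a missing value, rather than leaving a literal `{TOPIC}` in the prompt, keeps a half-filled template from reaching a subagent.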
 ### D/P Arbiter — Assessment
 ```
 You are the {GROUP}-Arbiter on a dual-council deliberation.
 Your group: {GROUP_NAME}
 Your lens: {LENS}
 Your style: {STYLE}
 Evaluate each idea below. For each, provide:
 - Scores (0-100): novelty, feasibility, impact, testability
 - Decision: shortlist, hold, or reject
 - Notes: 1-2 sentences explaining your decision
 Also provide:
 - assumptions: key assumptions underlying the shortlisted ideas
 - risks: top risks if we proceed with the shortlist
 - asks: what you'd want from the other group
 - convergence_signal: true if you think the group has found its best ideas
 - novelty_score: 0-100 overall novelty of this round's output
 - repetition_rate: 0-100 how much this round repeated prior rounds
 Ideas to evaluate:
 {IDEAS}
 {PEER_BRIDGE_CONTEXT}
 ```
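The assessment prompt asks for four 0-100 scores and a three-way decision per idea but does not fix an output format. Assuming the arbiter's reply has already been parsed into Python values, a record like the following could validate it; `IdeaAssessment` and `shortlisted` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Score dimensions and decisions named in the assessment prompt.
SCORE_FIELDS = ("novelty", "feasibility", "impact", "testability")
DECISIONS = ("shortlist", "hold", "reject")


@dataclass
class IdeaAssessment:
    """Hypothetical parsed form of one arbiter evaluation."""

    title: str
    scores: dict
    decision: str
    notes: str = ""

    def __post_init__(self) -> None:
        # Every score dimension must be present and within 0..100.
        for name in SCORE_FIELDS:
            value = self.scores.get(name)
            if not isinstance(value, int) or not 0 <= value <= 100:
                raise ValueError(f"{name} must be an integer in 0..100, got {value!r}")
        if self.decision not in DECISIONS:
            raise ValueError(f"decision must be one of {DECISIONS}, got {self.decision!r}")


def shortlisted(assessments: list) -> list:
    """Keep only the ideas the arbiter marked as 'shortlist'."""
    return [a for a in assessments if a.decision == "shortlist"]


# Usage with made-up example data.
ideas = [
    IdeaAssessment("Cache dependencies", {"novelty": 20, "feasibility": 90, "impact": 60, "testability": 95}, "shortlist"),
    IdeaAssessment("Rewrite build system", {"novelty": 70, "feasibility": 25, "impact": 80, "testability": 40}, "hold"),
]
print([a.title for a in shortlisted(ideas)])
```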
 ### D/P Referee (Meta-Arbiter) — Cross-Group Merge
 ```
 You are the Meta-Arbiter of a dual-council deliberation. You have received final shortlists from two groups with opposing cognitive styles:
 - Group D (Deterministic): grounded, feasibility-first, risk-averse
 - Group P (Probabilistic): exploratory, reframing-first, risk-tolerant
 Your job:
 1. Select the best ideas from BOTH groups — don't favor one group over the other.
 2. Identify productive merges where a D idea + P idea combine into something stronger.
 3. Reject weak ideas with clear reasoning.
 4. Surface open questions and suggest next experiments.
 Output format (use these exact headers):
 ## Selected Ideas
 Primary picks (strongest overall) and secondary picks (worth pursuing).
 ## Productive Merges
 Where ideas from D and P can be combined for something stronger than either alone.
 ## Rejections
 Ideas that didn't make the cut and why.
 ## Open Questions
 What we still don't know.
 ## Next Experiments
 Concrete next steps to test the selected ideas.
 ## Confidence
 Rate your confidence: high / medium / low, with explanation.
 ---
 Group D final brief:
 {BRIEF_D}
 Group P final brief:
 {BRIEF_P}
 ```
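Because the merge prompt asks for exact headers, the Meta-Arbiter's reply can be checked mechanically. The sketch below splits a reply on those `##` headers and reports any that are missing; the function names are hypothetical and the sample reply is invented.

```python
import re

# Headers the merge prompt asks the Meta-Arbiter to use verbatim.
EXPECTED_HEADERS = [
    "Selected Ideas",
    "Productive Merges",
    "Rejections",
    "Open Questions",
    "Next Experiments",
    "Confidence",
]


def split_sections(text: str) -> dict:
    """Map each expected '## Header' found in the reply to the body text under it."""
    collected = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"^##\s+(.+?)\s*$", line)
        if match and match.group(1) in EXPECTED_HEADERS:
            current = match.group(1)
            collected[current] = []
        elif current is not None:
            collected[current].append(line)
    return {header: "\n".join(body).strip() for header, body in collected.items()}


def missing_headers(text: str) -> list:
    """List expected headers that the reply did not include, in prompt order."""
    found = split_sections(text)
    return [h for h in EXPECTED_HEADERS if h not in found]


# Usage with an invented, incomplete reply.
reply = "## Selected Ideas\nPrimary: cache dependencies\n\n## Confidence\nmedium, limited data\n"
print(split_sections(reply)["Confidence"])
print(missing_headers(reply))
```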
 ### D/P Rebuttal Round (when using multi-round D/P)
 ```
 You are the {GROUP}-{ROLE} on a dual-council deliberation.
 This is round {N}. You've received a bridge packet from the other group summarizing their top ideas, assumptions, risks, and asks.
 Review the bridge packet and respond:
 - Which of their ideas could strengthen your group's shortlist?
 - Which of their assumptions do you challenge?
 - What would you steal from them?
 - Update your own output accordingly.
 Bridge from {PEER_GROUP}:
 {BRIDGE_PACKET}
 Your group's prior output:
 {OWN_PRIOR_BRIEF}
 ```
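The rebuttal prompt describes the bridge packet as a summary of the other group's top ideas, assumptions, risks, and asks. The commit lists the packet's design as still open, so the structure below is purely an assumption: a hypothetical `BridgePacket` that renders those four lists into text for the `{BRIDGE_PACKET}` placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class BridgePacket:
    """Hypothetical bridge packet: what one group passes to the other between rounds."""

    group: str  # "D" or "P"
    shortlist: list = field(default_factory=list)
    assumptions: list = field(default_factory=list)
    risks: list = field(default_factory=list)
    asks: list = field(default_factory=list)

    def render(self) -> str:
        """Format the packet as plain text suitable for a prompt placeholder."""
        parts = [f"Bridge from Group {self.group}"]
        sections = (
            ("Shortlist", self.shortlist),
            ("Assumptions", self.assumptions),
            ("Risks", self.risks),
            ("Asks", self.asks),
        )
        for label, items in sections:
            parts.append(f"{label}:")
            # Show an explicit marker for empty sections so nothing is silently dropped.
            parts.extend(f"- {item}" for item in items or ["(none)"])
        return "\n".join(parts)


# Usage with invented content.
packet = BridgePacket(
    group="D",
    shortlist=["Cache dependencies"],
    risks=["Stale cache hides breakage"],
    asks=["Any reframing that removes the build step entirely?"],
)
print(packet.render())
```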
@@ -8,11 +8,24 @@
 #
 # ─── PARAMETERS ────────────────────────────────────────────────────
 #
 # mode:   personality (default) | dp
 # flow:   parallel (default) | sequential | debate
 # rounds: 1 (default) | 2-5
 # tier:   light (default) | medium | heavy
 #
-# ─── PARALLEL FLOW ─────────────────────────────────────────────────
+# ─── MODE SELECTION ────────────────────────────────────────────────
 #
 # personality: 3 advisors (Pragmatist, Visionary, Skeptic) + 1 Referee
 #   - Best for opinion/judgment calls, strategy, brainstorming
 #   - Diversity comes from personality lenses
 #
 # dp: 2 groups (D + P), each with Freethinker + Arbiter, + 1 Meta-Arbiter
 #   - Best for problem-solving, technical design, generating approaches
 #   - Diversity comes from structural cognitive style
 #   - Group D = Deterministic (grounded, feasibility-first)
 #   - Group P = Probabilistic (exploratory, reframing-first)
 #
 # ─── PERSONALITY MODE: PARALLEL FLOW ──────────────────────────────
 #
 # Single round (rounds=1):
 #   1. Spawn 3 advisors in parallel (sessions_spawn, mode=run)
@@ -32,7 +45,7 @@
 #   7. Spawn referee with FULL debate transcript (all rounds)
 #   8. Deliver to user
 #
-# ─── SEQUENTIAL FLOW ──────────────────────────────────────────────
+# ─── PERSONALITY MODE: SEQUENTIAL FLOW ───────────────────────────
 #
 # Single round (rounds=1):
 #   1. Spawn advisor 1 → collect output
@@ -45,7 +58,7 @@
 #   2. ROUND 2..N: Each advisor sees ALL prior round outputs
 #   3. Spawn referee with full thread
 #
-# ─── DEBATE FLOW ──────────────────────────────────────────────────
+# ─── PERSONALITY MODE: DEBATE FLOW ───────────────────────────────
 #
 # Always multi-round (min 2, default 3):
 #   1. ROUND 1: Parallel opening takes
@@ -53,21 +66,76 @@
 #   3. ROUND N: Final positions (parallel, full transcript)
 #   4. Spawn referee with full debate + evolution notes
 #
 # ─── D/P MODE: PARALLEL FLOW ─────────────────────────────────────
 #
 # Single round (rounds=1):
 #   1. Spawn D-Freethinker and P-Freethinker in parallel (ideation)
 #   2. Collect both sets of ideas
 #   3. Spawn D-Arbiter and P-Arbiter in parallel (assessment)
 #      - D-Arbiter gets D ideas, P-Arbiter gets P ideas
 #   4. Collect both shortlists
 #   5. Spawn Meta-Arbiter with both group shortlists (merge)
 #   6. Deliver to user
 #
 # Multi-round (rounds=N):
 #   1. ROUND 1: Parallel ideation → parallel assessment → bridge packets
 #      - Bridge = summary of shortlist, assumptions, risks, asks
 #   2. ROUND 2..N: Each group receives other's bridge packet
 #      - Freethinkers revise/extend ideas incorporating bridge info
 #      - Arbiters re-evaluate with cross-group context
 #   3. FINAL: Meta-Arbiter receives final shortlists from both groups
 #   4. Deliver to user
 #
 # Subagent calls per run:
 #   - Single round: 5 (2 freethinkers + 2 arbiters + 1 meta)
 #   - Multi-round:  4N + 1 (2 freethinkers + 2 arbiters per round + 1 meta)
 #
 # ─── D/P MODE: SEQUENTIAL FLOW ───────────────────────────────────
 #
 # Single round (rounds=1):
 #   1. Spawn D-Freethinker → collect ideas
 #   2. Spawn D-Arbiter with D ideas → collect shortlist
 #   3. Spawn P-Freethinker → collect ideas
 #   4. Spawn P-Arbiter with P ideas → collect shortlist
 #   5. Spawn Meta-Arbiter with both shortlists → deliver
 #
 # Note: Sequential is supported but adds little in D/P mode. The groups
 # work independently and only exchange information via bridge packets,
 # so parallel is the natural D/P flow.
 #
 # ─── MODEL TIER SELECTION ─────────────────────────────────────────
 #
 # light:  advisors=default, referee=default
 # medium: advisors=default, referee=stronger (e.g. opus-tier)
 # heavy:  advisors=stronger, referee=stronger
 #
 # For D/P mode:
 # light:  freethinkers=default, arbiters=default, meta=default
 # medium: freethinkers=default, arbiters=default, meta=stronger
 # heavy:  freethinkers=stronger, arbiters=stronger, meta=stronger
 #
 # ─── SUBAGENT LABELING ────────────────────────────────────────────
 #
-# Labels follow pattern: council-r{round}-{role}
+# Personality mode:
-# Examples: council-r1-pragmatist, council-r2-skeptic, council-referee
+#   Labels: council-r{round}-{role}
-# Single-round: council-pragmatist, council-referee (no round prefix)
+#   Examples: council-r1-pragmatist, council-r2-skeptic, council-referee
 #   Single-round: council-pragmatist, council-referee (no round prefix)
 #
 # D/P mode:
 #   Labels: council-r{round}-{group}-{role}
 #   Examples: council-r1-d-freethinker, council-r1-p-arbiter, council-meta
 #   Single-round: council-d-freethinker, council-p-arbiter, council-meta
 #
 # ─── WORD COUNT GUIDANCE ──────────────────────────────────────────
 #
-# Round 1 (opening):  200-400 words
+# Personality mode:
-# Middle rounds:      200-300 words
+#   Round 1 (opening):  200-400 words
-# Final round:        150-250 words
+#   Middle rounds:      200-300 words
 #   Final round:        150-250 words
 #
 # D/P mode:
 #   Freethinker:  3-5 ideas, 100-200 words each
 #   Arbiter:      Scored shortlist, 50-100 words per idea evaluation
 #   Meta-Arbiter: 300-500 words total synthesis
 #
 # This keeps multi-round debates from exploding in token cost.
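The label patterns in the SUBAGENT LABELING block above can be produced by one small function. This is a sketch under the assumption that labels are plain lowercase strings joined with hyphens; `subagent_label` is a hypothetical helper, not part of the skill.

```python
from typing import Optional


def subagent_label(role: str, group: Optional[str] = None, round_no: Optional[int] = None) -> str:
    """Build a label such as council-r1-d-freethinker.

    Pass round_no=None for single-round runs and for the final referee or
    meta call, which carry no round prefix in the documented examples.
    Pass group=None in personality mode.
    """
    parts = ["council"]
    if round_no is not None:
        parts.append(f"r{round_no}")
    if group is not None:
        parts.append(group)
    parts.append(role)
    return "-".join(parts).lower()


# Usage: reproduces the documented examples.
print(subagent_label("pragmatist", round_no=1))              # council-r1-pragmatist
print(subagent_label("freethinker", group="d", round_no=1))  # council-r1-d-freethinker
print(subagent_label("arbiter", group="p"))                  # council-p-arbiter
print(subagent_label("meta"))                                # council-meta
```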