docs(council): add experimental findings from all 3 flow types

- Tested parallel 1-round, sequential 1-round, debate/parallel 3-round
- 3 rounds is sweet spot: positions converge, meaningful evolution
- Sequential most token-efficient; parallel 3-round best depth-to-cost
- Debate and parallel 3-round mechanically identical (prompt tone differs)
- Added cost profiles, recommended defaults by use case
- Updated TODOs: unify flows, test 2-round, test mixed model tiers
@@ -23,3 +23,16 @@
- Revisit advisor personality depth (richer backstories).
- Revisit skill name ("council" is placeholder).
- Experiment with different round counts and flows for optimal depth/cost tradeoffs.

## Council experiments completed
- Ran all 3 flow types on the same topic ("Should AI assistants have persistent memory?"):
  1. **Parallel 1-round** (Experiment 1): Fast, clean, independent perspectives. 4 subagent calls, ~60k tokens.
  2. **Sequential 1-round** (Experiment 2): Tighter dialogue, since later advisors build on earlier ones. 4 subagent calls, ~55k tokens. Less redundancy.
  3. **Debate/Parallel 3-round** (Experiment 3): Richest output. Positions evolved significantly across rounds (Visionary backed off always-on, Skeptic softened on trajectory). 10 subagent calls, ~130k tokens.
- Key findings:
  - 3 rounds is the sweet spot for depth so far: positions converge by round 3 (2 rounds not yet tested).
  - Sequential is the most token-efficient flow for focused topics.
  - Parallel 3-round has the best depth-to-cost ratio for substantive topics.
  - Debate and parallel 3-round are mechanically identical; they differ only in prompt tone.
- Updated SKILL.md with experimental findings, recommended defaults by use case, and cost profiles.
- New TODOs added: unify debate/parallel flows, test whether 2 rounds is sufficient, test mixed model tiers.
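
The cost figures above can be compared with a little arithmetic. The sketch below is illustrative only: the call and token counts are the approximate values from the notes, while `FlowRun`, `expected_calls`, and `summarize` are hypothetical names. The formula "one call per advisor per round plus one synthesis call" is an assumption; it happens to reproduce the reported 4 and 10 calls for 3 advisors, but the notes do not state how the calls break down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRun:
    name: str
    calls: int           # subagent calls reported in the notes
    approx_tokens: int   # approximate total tokens reported in the notes

    @property
    def tokens_per_call(self) -> float:
        return self.approx_tokens / self.calls


# Figures copied from the experiment notes (all approximate).
RUNS = [
    FlowRun("parallel 1-round", 4, 60_000),
    FlowRun("sequential 1-round", 4, 55_000),
    FlowRun("debate/parallel 3-round", 10, 130_000),
]


def expected_calls(advisors: int, rounds: int) -> int:
    """ASSUMPTION: one call per advisor per round, plus one final synthesis call."""
    return advisors * rounds + 1


def summarize(runs):
    """Return (name, tokens per call, total cost relative to the first run)."""
    baseline = runs[0].approx_tokens
    return [
        (r.name, round(r.tokens_per_call), round(r.approx_tokens / baseline, 2))
        for r in runs
    ]


if __name__ == "__main__":
    for name, per_call, relative in summarize(RUNS):
        print(f"{name:26s} ~{per_call} tokens/call, {relative}x parallel 1-round")
```

On these numbers the 3-round run makes 2.5x the calls of a 1-round run for roughly 2.17x the tokens, so the per-call cost drops slightly as rounds are added.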
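
Since debate and parallel 3-round reportedly differ only in prompt tone, the "unify flows" TODO could be approached with a single round loop that takes the tone as a parameter. This is a minimal sketch under stated assumptions, not the skill's actual implementation: `TONES`, `build_prompt`, `run_rounds`, and the `ask` callback (a stand-in for whatever dispatches a subagent call) are all hypothetical, and the tone wording is invented for illustration.

```python
from typing import Callable, Dict, List

# ASSUMPTION: hypothetical tone instructions; the real prompts are not shown in the notes.
TONES: Dict[str, str] = {
    "parallel": "Give your independent perspective on the topic.",
    "debate": "Argue your position and challenge the other advisors directly.",
}


def build_prompt(topic: str, advisor: str, round_no: int,
                 tone_line: str, prior: List[dict]) -> str:
    """Assemble one advisor prompt from the topic, tone, and earlier rounds."""
    history = "\n".join(
        f"[round {e['round']}] {e['advisor']}: {e['text']}" for e in prior
    )
    return (
        f"You are the {advisor}.\n{tone_line}\n"
        f"Topic: {topic}\nRound {round_no}.\n"
        f"Prior rounds:\n{history or '(none)'}"
    )


def run_rounds(topic: str, advisors: List[str], rounds: int, tone: str,
               ask: Callable[[str], str]) -> List[dict]:
    """Run `rounds` rounds; within a round every advisor sees only earlier rounds."""
    transcript: List[dict] = []
    for round_no in range(1, rounds + 1):
        prior = list(transcript)  # snapshot, so advisors in a round stay independent
        replies = []
        for advisor in advisors:
            prompt = build_prompt(topic, advisor, round_no, TONES[tone], prior)
            replies.append({"round": round_no, "advisor": advisor, "text": ask(prompt)})
        transcript.extend(replies)
    return transcript
```

With this shape, `run_rounds(topic, advisors, 3, "debate", ask)` and `run_rounds(topic, advisors, 3, "parallel", ask)` issue the same number of calls in the same order; only the tone line inside each prompt changes.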