swarm-master/swarm-common/obsidian-vault/will/will-shared-zap/Runbooks/Atlas Kanban Durable Project Workflow.md
2026-05-20 17:36:42 -07:00


type: runbook
system: atlas-kanban
status: active
owner: Will / Atlas
created: 2026-05-14
related: Atlas Capability Upgrade Program; Kanban Task Graph Templates

Atlas Kanban Durable Project Workflow

Use this runbook when a request should become durable, reviewable work instead of staying inside one chat thread.

When to create or reuse a board

Create or reuse a Kanban board when any of these are true:

  • The work spans multiple roles, e.g. research, engineering, ops, review, writing.
  • The work should survive an interrupted chat, gateway restart, or laptop sleep.
  • A human may need to unblock, review, approve, or redirect a step.
  • The output is an artifact, code/config change, runbook, or decision with audit value.
  • The work can be parallelized or needs dependency gates.

Use a dedicated board for durable programs and domains. Example:

hermes kanban init atlas-capability-upgrades

Prefer one board per project/domain when unrelated tasks would make status hard to read. Use tenants only as soft namespacing inside a board; use separate boards for hard isolation.

Default workflow

Default graph:

plan/spec -> task graph -> specialist execution -> reviewer gate -> Atlas synthesis/status report

Common stage mapping:

  1. Spec/discovery: ops, researcher, or writer creates an implementation-ready spec.
  2. Orchestration: orchestrator verifies available profiles, then creates child cards and parent links.
  3. Specialist execution: engineer, ops, researcher, writer, or other named specialist does the work in the right workspace.
  4. Reviewer gate: reviewer or human reviews code/config/ops changes before they are considered done.
  5. Synthesis/status: atlas or writer summarizes outcomes, artifacts, verification, blocked items, and next actions.

Step 0: inspect current state

Before creating or changing tasks:

hermes profile list
hermes kanban --board <board> stats --json
hermes kanban --board <board> list

Before touching a repository or vault:

git -C <repo> status --short --branch

Do not route work to unknown profile names. The dispatcher will not invent missing assignees.
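
The Step 0 commands can be collected in one helper so they are issued the same way every time. This is a minimal sketch: `preflight_commands` is a hypothetical helper name, it uses only the commands shown above, and it builds argument lists without executing anything.

```python
from typing import Optional


def preflight_commands(board: str, repo: Optional[str] = None) -> list[list[str]]:
    """Build the Step 0 inspection commands as argv lists (nothing is executed here)."""
    commands = [
        ["hermes", "profile", "list"],
        ["hermes", "kanban", "--board", board, "stats", "--json"],
        ["hermes", "kanban", "--board", board, "list"],
    ]
    if repo is not None:
        # Only needed before touching a repository or vault.
        commands.append(["git", "-C", repo, "status", "--short", "--branch"])
    return commands
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` by the caller, which keeps argument quoting out of the picture.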

Workspace choice

Choose the narrowest workspace that preserves the needed state:

| Workspace kind | Use when | Notes |
| --- | --- | --- |
| scratch | Research, specs, status summaries, throwaway artifacts | Fresh per run. Good default for notes that will be copied to durable paths. |
| dir:<absolute-path> | Shared persistent notes/config/project directory | Use for Obsidian/vault or other durable shared directories. Inspect status first. |
| worktree | Code changes in a git repo | Use for implementation/review to isolate branches. Commit only intentional durable changes. |

For config or repo changes, preserve the stable default profile as the production gateway unless Will explicitly asks to migrate it.
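
The "narrowest workspace" rule can be read as a short decision order: code changes need a worktree, shared durable directories need `dir:<absolute-path>`, and everything else can stay in scratch. The sketch below assumes that reading; `choose_workspace` and its parameters are hypothetical names, not part of Hermes.

```python
import posixpath
from typing import Optional


def choose_workspace(changes_code: bool, shared_dir: Optional[str] = None) -> str:
    """Pick the workspace string for a card, following the workspace table."""
    if changes_code:
        # Implementation and review work is isolated on a branch.
        return "worktree"
    if shared_dir is not None:
        # The dir: form is documented with an absolute path.
        if not posixpath.isabs(shared_dir):
            raise ValueError(f"dir workspace needs an absolute path, got {shared_dir!r}")
        return f"dir:{shared_dir}"
    # Research, specs, and status summaries default to a fresh scratch space.
    return "scratch"
```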

Profile routing policy

Use actual profiles from hermes profile list. Current recommended routing:

  • orchestrator: decompose goals, create/link tasks, status synthesis. Do not implement.
  • researcher: source discovery, comparisons, requirements investigation.
  • engineer: code changes, tests, PR-ready diffs in worktree workspaces.
  • ops: Hermes/profile/gateway/config/runbook operations and host/service diagnostics.
  • reviewer: code/config/artifact review, approval, or blocking findings.
  • writer: durable notes, reports, summaries, and polished user-facing artifacts.
  • glm-simple: cheap/simple low-risk text cleanup or bounded deterministic tasks after an explicit routing decision.
  • atlas: synthesis/status for Will-facing updates when a durable project needs a named Atlas output.
  • default: stable production gateway only; do not use as a worker target unless Will explicitly asks.
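
A small guard can enforce two of the routing rules before a card is created: the assignee must be a real profile, and `default` is off limits unless Will has asked for it. This is a sketch under assumptions: `check_assignee` is a hypothetical helper, and `known_profiles` is whatever set of names the caller has read from `hermes profile list` (the output format of that command is not assumed here).

```python
def check_assignee(name: str, known_profiles: set[str], allow_default: bool = False) -> None:
    """Raise ValueError if a card should not be routed to this profile."""
    if name not in known_profiles:
        raise ValueError(f"unknown profile {name!r}; the dispatcher will not invent it")
    if name == "default" and not allow_default:
        # default is the stable production gateway, not a worker target.
        raise ValueError("'default' is reserved for the production gateway")
```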

Creating dependencies

Use parent links at creation time whenever one task truly needs another task's output.

Example shape:

SPEC=$(hermes kanban --board <board> create "spec: <feature>" --assignee ops --workspace scratch --priority 10 --json | jq -r .task_id)
IMPL=$(hermes kanban --board <board> create "implement: <feature>" --assignee engineer --parent "$SPEC" --workspace worktree --skill test-driven-development --max-runtime 2h --priority 8 --json | jq -r .task_id)
REVIEW=$(hermes kanban --board <board> create "review: <feature>" --assignee reviewer --parent "$IMPL" --workspace worktree --skill github-code-review --priority 7 --json | jq -r .task_id)
hermes kanban --board <board> create "synthesize: <feature> status" --assignee writer --parent "$REVIEW" --workspace scratch --priority 5

Rules:

  • Parallel lanes should be siblings with no parent links.
  • Dependent stages should use --parent/parents=[...]; do not rely on prose like "wait for X".
  • Create parents first, capture returned ids, then create children with parent ids immediately.
  • Avoid linking after creation when a child could be claimed before the dependency exists.
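
The "create parents first" rule is a topological ordering of the task graph. The sketch below uses Python's standard `graphlib` to compute a safe creation order from a mapping of task key to parent keys. The keys (`spec`, `impl`, and so on) and the `creation_order` helper are hypothetical and exist only for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def creation_order(parents: dict[str, list[str]]) -> list[str]:
    """Order task keys so every parent is created before its children.

    `parents` maps a task key to the keys it depends on; independent
    (parallel) lanes map to an empty list.
    """
    for key, deps in parents.items():
        missing = [d for d in deps if d not in parents]
        if missing:
            raise ValueError(f"{key} depends on undeclared tasks: {missing}")
    try:
        # TopologicalSorter treats the mapped values as predecessors.
        return list(TopologicalSorter(parents).static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc
```

Walking the returned list in order and capturing each new task id before creating the next card keeps children from being claimed before their dependency exists.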

Worker start and handoff standard

Every worker starts by reading its card and parent handoffs. In a worker, use the Kanban tools; as a human, use:

hermes kanban --board <board> show <task_id>
hermes kanban --board <board> runs <task_id>
hermes kanban --board <board> log <task_id>

Use structured metadata in completions/comments:

{
  "changed_files": ["path/to/file"],
  "artifact_paths": ["path/to/artifact.md"],
  "verification": ["exact command or check performed"],
  "decisions": ["short decision and rationale"],
  "residual_risk": ["known risk or empty list"],
  "retry_notes": "what a retry should avoid or inspect first"
}

Never put secrets, raw tokens, raw personal data, long logs, or unrelated transcript text in task bodies, comments, or metadata.
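
A worker can check its metadata shape before posting it. The following sketch treats every field in the example above as required, which is an assumption rather than a documented rule, and `handoff_problems` is a hypothetical helper name. It checks structure only and does not scan for secrets.

```python
LIST_FIELDS = ("changed_files", "artifact_paths", "verification", "decisions", "residual_risk")


def handoff_problems(meta: dict) -> list[str]:
    """Return a list of shape problems; an empty list means the metadata looks well-formed."""
    problems = []
    for field in LIST_FIELDS:
        value = meta.get(field)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{field} must be a list of strings")
    if not isinstance(meta.get("retry_notes"), str):
        problems.append("retry_notes must be a string")
    return problems
```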

Reviewer-gate policy

The implementer normally does not complete code, config, or ops tasks directly; a reviewer gate comes first.

Implementer workflow:

  1. Make narrow, reviewable changes.
  2. Verify with targeted commands.
  3. Add a Kanban comment containing the review-required handoff and structured metadata.
  4. Block the task with reason="review-required: <one-line summary>".

Example worker-side shape:

import json  # needed for json.dumps below

kanban_comment(
    body="review-required handoff:\n" + json.dumps({
        "changed_files": ["..."],
        "tests_run": ["..."],
        "verification": ["..."],
        "diff_path": "...",
        "decisions": ["..."],
        "residual_risk": []
    }, indent=2)
)
kanban_block(reason="review-required: implementation finished; needs review before merge")

Reviewer workflow:

  • Complete the review task if approved, with findings and verification in metadata.
  • Block with exact required changes if not approved.
  • If changes are needed, create a new follow-up implementation task assigned to the original specialist; do not turn the reviewer card into implementation work.

Docs, research, and spec-only tasks may complete directly when the artifact itself is the deliverable and no risk-bearing change was made.
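
The gate policy reduces to one decision at the end of a task: block for review or complete directly. The sketch below assumes a coarse `change_kind` label per task. The label sets and the `finish_action` helper are hypothetical; only the `review-required:` reason prefix comes from this runbook.

```python
RISK_BEARING = {"code", "config", "ops"}
DIRECT_OK = {"docs", "research", "spec"}


def finish_action(change_kind: str, summary: str) -> tuple[str, str]:
    """Return ("block", reason) for review-gated work, or ("complete", summary) otherwise."""
    one_line = " ".join(summary.split())  # the block reason is a one-line summary
    if change_kind in RISK_BEARING:
        return ("block", f"review-required: {one_line}")
    if change_kind in DIRECT_OK:
        return ("complete", one_line)
    raise ValueError(f"unknown change kind: {change_kind!r}")
```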

Recovery procedure

Do not blindly unblock. Inspect prior state first:

hermes kanban --board <board> show <task_id>
hermes kanban --board <board> runs <task_id>
hermes kanban --board <board> log <task_id>

Blocked task:

  1. Read the block reason and comments.
  2. If it asks for a human decision, answer in a comment, then unblock.
  3. If it is review-required, route review or perform review before unblocking.
  4. Unblock only after the missing input is present:
     hermes kanban --board <board> unblock <task_id>

Spawn failure or gave-up task:

  1. Confirm the assignee exists:
     hermes profile list
  2. Check for a missing skill, credential, model, or PATH entry in that profile without printing secrets.
  3. Reassign only to an existing profile when needed:
     hermes kanban --board <board> reassign <task_id> <profile> --reclaim --reason "profile recovery: <short reason>"

Crashed, timed-out, or stale running task:

  1. Read runs/logs/comments first.
  2. Understand the prior failure first; only then chunk scope, reduce memory use, or raise max runtime.
  3. Reclaim only when the worker is stale or known dead:
     hermes kanban --board <board> reclaim <task_id> --reason "stale claim: <evidence>"

Watch active failures:

hermes kanban --board <board> watch --kinds completed,blocked,gave_up,crashed,timed_out
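
The recovery branches above share one shape: inspect first, then take a state-specific follow-up. The sketch below turns that into a printable plan. `recovery_plan` is a hypothetical helper, and the state labels `spawn_failed` and `stale` are names chosen here for illustration. The returned strings are meant for reading, not for shell execution, since they are not quoted.

```python
def recovery_plan(board: str, task_id: str, state: str) -> list[str]:
    """Return ordered recovery steps for a task; inspection always comes first."""
    base = f"hermes kanban --board {board}"
    steps = [f"{base} show {task_id}", f"{base} runs {task_id}", f"{base} log {task_id}"]
    if state == "blocked":
        steps.append("confirm the missing input (decision or review) is present")
        steps.append(f"{base} unblock {task_id}")
    elif state in ("gave_up", "spawn_failed"):
        steps.append("hermes profile list")
        steps.append("check skill, credential, model, and PATH in the profile without printing secrets")
    elif state in ("crashed", "timed_out", "stale"):
        steps.append("understand the prior failure before changing scope, memory, or max runtime")
        if state == "stale":
            # Reclaim only when the worker is stale or known dead.
            steps.append(f'{base} reclaim {task_id} --reason "stale claim: <evidence>"')
    else:
        raise ValueError(f"unknown state: {state!r}")
    return steps
```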

Status reporting standard

A daily or phase status report should be produced from board state, not from memory alone.

Inputs:

hermes kanban --board <board> stats --json
hermes kanban --board <board> list
hermes kanban --board <board> watch --kinds completed,blocked,gave_up,crashed,timed_out

Report format:

# <Board/project> status - <date>

## Completed since last report
- <task_id/title>: artifact, verification, downstream effect.

## Running/ready by profile
- ops: ...
- engineer: ...
- reviewer: ...
- writer: ...

## Blocked/gave_up items
- <task_id/title>: owner/profile, exact blocker, next action, who owns it.

## Risks / decisions needed
- ...

## Next planned graph expansion
- ...
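
A report in this format can be rendered from board state instead of being typed by hand. The sketch below assumes each task is a dict with `id`, `title`, `status`, `assignee`, and `note` keys. That shape is hypothetical and is not the documented output of `hermes kanban list`, so a real script would need an adapter. The last two sections need human judgment and are left as placeholders.

```python
from datetime import date


def render_status(board: str, day: date, tasks: list[dict]) -> str:
    """Render the status report skeleton from a list of task dicts."""

    def pick(*statuses: str) -> list[dict]:
        return [t for t in tasks if t["status"] in statuses]

    lines = [f"# {board} status - {day.isoformat()}", "", "## Completed since last report"]
    lines += [f"- {t['id']}/{t['title']}: {t['note']}" for t in pick("completed")]

    lines += ["", "## Running/ready by profile"]
    by_profile: dict[str, list[str]] = {}
    for t in pick("running", "ready"):
        by_profile.setdefault(t["assignee"], []).append(t["title"])
    lines += [f"- {p}: {', '.join(titles)}" for p, titles in sorted(by_profile.items())]

    lines += ["", "## Blocked/gave_up items"]
    lines += [f"- {t['id']}/{t['title']}: {t['assignee']}, {t['note']}" for t in pick("blocked", "gave_up")]

    # These sections are filled in by the report author.
    lines += ["", "## Risks / decisions needed", "- ...", "", "## Next planned graph expansion", "- ..."]
    return "\n".join(lines) + "\n"
```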

delegate_task vs Kanban

Use delegate_task for short, synchronous, non-durable reasoning inside the current turn when the parent needs the answer immediately.

Use Kanban when work:

  • Must survive restarts or interruptions.
  • Needs named persistent profiles.
  • Requires human review/unblock.
  • Spans multiple roles or artifacts.
  • Needs dependency tracking or audit history.

A Kanban worker may call delegate_task internally for bounded subtasks, but not as a substitute for board handoffs.
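
The choice between the two mechanisms can be expressed as a checklist where any single "yes" tips the decision toward Kanban. This is a minimal sketch; `WorkTraits` and `choose_mechanism` are hypothetical names that mirror the bullet list above.

```python
from dataclasses import asdict, dataclass


@dataclass
class WorkTraits:
    must_survive_restart: bool = False
    needs_named_profiles: bool = False
    needs_human_review: bool = False
    spans_roles_or_artifacts: bool = False
    needs_dependencies_or_audit: bool = False


def choose_mechanism(traits: WorkTraits) -> str:
    """Return "kanban" if any durability trait holds, else "delegate_task"."""
    if any(asdict(traits).values()):
        return "kanban"
    # Short, synchronous, non-durable reasoning stays in the current turn.
    return "delegate_task"
```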

Verification checklist

Before considering a durable project graph healthy:

  • hermes profile list confirms every assignee exists.
  • hermes kanban --board <board> stats --json returns successfully.
  • Config parse confirms kanban.dispatch_in_gateway=true unless intentionally using manual dispatch.
  • Task graph has parent links for true dependencies and no links for independent lanes.
  • Code/config/ops changes end in reviewer-gated handoffs.
  • Status reports include blocked/gave_up items and exact next actions.
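
Two of the checklist items are mechanical enough to automate once the inputs have been gathered. The sketch below assumes the config has already been parsed into a nested dict with a `kanban` section. The config file format is not stated in this runbook, and `checklist_failures` is a hypothetical helper.

```python
def checklist_failures(
    assignees: list[str],
    known_profiles: set[str],
    config: dict,
    manual_dispatch: bool = False,
) -> list[str]:
    """Return failed checklist items; an empty list means these checks passed."""
    failures = []
    unknown = sorted(set(assignees) - known_profiles)
    if unknown:
        failures.append(f"unknown assignees: {', '.join(unknown)}")
    dispatch = config.get("kanban", {}).get("dispatch_in_gateway")
    if dispatch is not True and not manual_dispatch:
        # Acceptable only when manual dispatch is intentional.
        failures.append("kanban.dispatch_in_gateway is not true")
    return failures
```

The remaining items (parent links, reviewer-gated handoffs, report contents) still need a look at the board itself.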