From cfdd4484950de7e5b9a0ccf21b62bc1d9115f619 Mon Sep 17 00:00:00 2001
From: William Valentin <william.valentin.info@gmail.com>
Date: Fri, 6 Feb 2026 16:52:38 -0800
Subject: [PATCH] docs: add Docker sandbox and multi-agent routing
 design/implementation plans

---
 ...06-p2-docker-sandbox-multi-agent-design.md |  173 ++
 ...cker-sandbox-multi-agent-implementation.md | 1832 +++++++++++++++++
 2 files changed, 2005 insertions(+)
 create mode 100644 docs/plans/2026-02-06-p2-docker-sandbox-multi-agent-design.md
 create mode 100644 docs/plans/2026-02-06-p2-docker-sandbox-multi-agent-implementation.md

diff --git a/docs/plans/2026-02-06-p2-docker-sandbox-multi-agent-design.md b/docs/plans/2026-02-06-p2-docker-sandbox-multi-agent-design.md
new file mode 100644
index 0000000..2b4509f
--- /dev/null
+++ b/docs/plans/2026-02-06-p2-docker-sandbox-multi-agent-design.md
@@ -0,0 +1,173 @@
+# P2: Docker Sandboxing + Multi-Agent Routing — Design
+
+**Date:** 2026-02-06
+**Status:** Approved
+**Priority:** P2 (completes all P2 work)
+
+---
+
+## Feature 1: Docker Sandboxing
+
+### Goal
+
+Channel sessions (Telegram, Discord, Slack, WhatsApp) execute `shell.exec` and `process.start` inside Docker containers. TUI and local WebSocket sessions continue running on the host.
+
+### Architecture
+
+Tool-level wrapping: sandboxed versions of dangerous tools (`shell.exec`, `process.start`) delegate to `docker exec` inside a per-session container. All other tools (`file.read`, `web.fetch`, `memory.*`, etc.) run on the host unchanged.
+
+```
+src/sandbox/
+  docker.ts         — DockerSandbox class (create/exec/destroy containers via CLI)
+  docker.test.ts    — Tests (mocked Docker CLI)
+  manager.ts        — SandboxManager (session→container mapping + lifecycle)
+  manager.test.ts   — Tests
+  tools.ts          — createSandboxedShellTool(), createSandboxedProcessStartTool()
+  tools.test.ts     — Tests
+  index.ts          — Barrel export
+```
+
+### Config Schema
+
+```yaml
+sandbox:
+  enabled: false              # opt-in
+  image: "node:22-slim"       # base container image
+  workspace_dir: "/workspace" # mount path inside container
+  network: "none"             # container network mode (none/bridge/host)
+  memory_limit: "512m"        # memory limit per container
+  cpu_limit: "1.0"            # CPU limit per container
+  timeout_seconds: 300        # auto-kill timeout per container
+```
+
+### DockerSandbox Class
+
+Wraps Docker CLI via `child_process.execFile` (no Docker SDK dependency):
+- `create()` — `docker create` with resource limits, bind mount, network mode
+- `start()` — `docker start`
+- `exec(command, opts)` — `docker exec` with timeout, returns stdout/stderr
+- `destroy()` — `docker rm -f`
+- `isRunning()` — `docker inspect` check
+
+### SandboxManager
+
+- `getOrCreate(sessionId, config)` — Lazy container creation on first tool call
+- `destroy(sessionId)` — Stop and remove container
+- `destroyAll()` — Shutdown hook for daemon cleanup
+
+### Sandboxed Tools
+
+- `createSandboxedShellTool(sandbox)` — Same `Tool` interface as `shell.exec`, but runs via `sandbox.exec(command)`. Preserves cwd (translated to container path), timeout, output truncation.
+- `createSandboxedProcessStartTool(sandbox)` — Wraps `process.start` to spawn via `docker exec -d` (detached mode).
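+
+The cwd translation mentioned for `createSandboxedShellTool` could look roughly like the sketch below. It is an assumption, not the planned implementation: `toContainerPath` is a hypothetical helper, `hostRoot` stands for whichever host directory is bind-mounted at `workspace_dir`, and falling back to the mount point for paths outside that directory is one possible policy among several.
+
+```typescript
+import * as path from 'node:path';
+
+// Hypothetical helper: map a host cwd to the matching path inside the container.
+// hostRoot is the host side of the bind mount, containerRoot is workspace_dir.
+function toContainerPath(hostCwd: string, hostRoot: string, containerRoot: string): string {
+  const rel = path.posix.relative(hostRoot, hostCwd);
+
+  // cwd is the mount root itself.
+  if (rel === '') return containerRoot;
+
+  // cwd lies outside the mounted directory: fall back to the mount point
+  // rather than exposing a path that does not exist in the container.
+  if (rel === '..' || rel.startsWith('../')) return containerRoot;
+
+  return path.posix.join(containerRoot, rel);
+}
+```
+
+With a mount of `/tmp/flynn-sandbox-s1` at `/workspace`, a host cwd of `/tmp/flynn-sandbox-s1/project` would map to `/workspace/project` under these assumptions.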
+
+### Per-Session ToolRegistry
+
+When the sandbox is active for a channel session, the daemon creates a cloned `ToolRegistry` that replaces `shell.exec` and `process.start` with sandboxed versions. All other tools in the clone are the same tool objects held by the shared host registry.
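+
+A minimal sketch of the clone-and-override idea follows. The `Tool` and `ToolRegistry` definitions here are simplified stand-ins rather than the project's real classes, and `withSandboxedTools` is a hypothetical helper name. The sketch assumes that registering a tool under an existing name replaces the previous entry.
+
+```typescript
+// Simplified stand-ins; the project's real Tool and ToolRegistry are richer.
+interface Tool {
+  name: string;
+  execute(args: Record<string, unknown>): Promise<unknown>;
+}
+
+class ToolRegistry {
+  private tools = new Map<string, Tool>();
+
+  register(tool: Tool): void {
+    // Registering under an existing name overwrites the previous entry.
+    this.tools.set(tool.name, tool);
+  }
+
+  get(name: string): Tool | undefined {
+    return this.tools.get(name);
+  }
+
+  /** Shallow copy: a new map holding the same Tool object references. */
+  clone(): ToolRegistry {
+    const copy = new ToolRegistry();
+    for (const tool of this.tools.values()) copy.register(tool);
+    return copy;
+  }
+}
+
+// Build a per-session registry and swap in sandboxed replacements by name.
+// The host registry is left untouched.
+function withSandboxedTools(host: ToolRegistry, sandboxed: Tool[]): ToolRegistry {
+  const session = host.clone();
+  for (const tool of sandboxed) session.register(tool);
+  return session;
+}
+```
+
+Because the copy is shallow, tools that are not overridden remain shared between the host registry and every session registry, which matches the intent that only `shell.exec` and `process.start` differ per session.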
+
+### Error Handling
+
+- Docker not installed → log warning at startup, fall back to host execution
+- Container creation fails → log error, return tool error (not crash)
+- Container timeout → `docker rm -f`, return timeout error
+- Docker daemon unavailable → graceful degradation with clear error messages
+
+---
+
+## Feature 2: Multi-Agent Routing
+
+### Goal
+
+Named agent configurations that can be assigned to channels, senders, or sender patterns. Each agent config specifies its own system prompt, model tier, tool profile, and sandbox setting.
+
+### Architecture
+
+```
+src/agents/
+  registry.ts        — AgentConfigRegistry (stores named AgentConfig objects)
+  router.ts          — AgentRouter (resolves {channel, senderId} → AgentConfig)
+  router.test.ts     — Tests
+  index.ts           — Barrel export
+```
+
+### Config Schema
+
+```yaml
+agent_configs:
+  assistant:
+    system_prompt: "You are a helpful assistant."
+    model_tier: default
+    tool_profile: messaging
+    sandbox: true
+
+  coder:
+    system_prompt: "You are a coding assistant. Focus on writing clean code."
+    model_tier: complex
+    tool_profile: coding
+    sandbox: true
+
+routing:
+  default_agent: assistant
+  channels:
+    discord: coder
+  senders:
+    "telegram:12345": coder
+    "slack:U0*": assistant
+```
+
+### AgentConfigRegistry
+
+Stores parsed `AgentConfig` objects by name:
+- `register(config)` — Add a named config
+- `get(name)` — Look up by name
+- `list()` — All registered configs
+- `loadFromConfig(rawConfig)` — Parse from validated YAML
+
+### AgentConfig Type
+
+```typescript
+interface AgentConfig {
+  name: string;
+  systemPrompt?: string;     // overrides global system prompt
+  modelTier?: ModelTier;     // fast/default/complex/local
+  toolProfile?: ToolProfile; // minimal/messaging/coding/full
+  toolOverrides?: ToolOverrideConfig;
+  sandbox?: boolean;         // use Docker sandbox (if globally enabled)
+}
+```
+
+### AgentRouter
+
+Resolves which `AgentConfig` to use for a given message:
+1. Check `senders` map — exact match first, then glob patterns (via `minimatch`)
+2. Check `channels` map — channel name match
+3. Fall back to `routing.default_agent`
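+
+The resolution order above can be sketched as a single function. This is a minimal illustration, not the planned `AgentRouter`: `resolveAgentName`, `globToRegExp`, and the local `RoutingConfig` interface are hypothetical names, the `channel:senderId` key format is inferred from the `senders` examples in the config schema, and the `*`-only matcher is a stand-in for `minimatch`.
+
+```typescript
+// Hypothetical shape mirroring the `routing` block of the config schema.
+interface RoutingConfig {
+  default_agent?: string;
+  channels: Record<string, string>;
+  senders: Record<string, string>;
+}
+
+// Convert a simple glob (only `*` is supported here) into an anchored RegExp.
+// This is a stand-in for minimatch, which supports far more syntax.
+function globToRegExp(pattern: string): RegExp {
+  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
+  return new RegExp(`^${escaped.replace(/\*/g, '.*')}$`);
+}
+
+function resolveAgentName(
+  routing: RoutingConfig,
+  channel: string,
+  senderId: string,
+): string | undefined {
+  const key = `${channel}:${senderId}`;
+
+  // 1a. An exact sender match wins.
+  const exact = routing.senders[key];
+  if (exact !== undefined) return exact;
+
+  // 1b. Then sender glob patterns, in config order.
+  for (const [pattern, agent] of Object.entries(routing.senders)) {
+    if (pattern.includes('*') && globToRegExp(pattern).test(key)) return agent;
+  }
+
+  // 2. Channel-level mapping.
+  const byChannel = routing.channels[channel];
+  if (byChannel !== undefined) return byChannel;
+
+  // 3. Global default (undefined when none is configured).
+  return routing.default_agent;
+}
+```
+
+With the example config above, a Slack sender `U0ABC` would match the `slack:U0*` pattern, while an unlisted WhatsApp sender would fall through to `default_agent`.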
+
+### Daemon Integration
+
+The `createMessageRouter()` function changes:
+1. On message: `agentRouter.resolve(channel, senderId)` returns agent config name
+2. Cache key: `${channel}:${senderId}:${agentConfigName}` (agent change = new orchestrator)
+3. Create `AgentOrchestrator` with resolved config's system prompt, model tier, tool policy
+4. If the sandbox is enabled both for this config and globally: create a per-session sandboxed `ToolRegistry`
+5. Otherwise: use the shared host `ToolRegistry`
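+
+The cache key and the sandbox decision in the steps above can be sketched as follows. All names here (`orchestratorCacheKey`, `shouldSandbox`, `OrchestratorCache`) are hypothetical, and the `AgentConfig` interface is cut down to the two fields the sketch needs.
+
+```typescript
+// Hypothetical minimal shape; the design's AgentConfig has more fields.
+interface AgentConfig {
+  name: string;
+  sandbox?: boolean;
+}
+
+// Step 2: the agent name is part of the key, so re-routing a sender to a
+// different agent yields a fresh orchestrator instead of reusing the old one.
+function orchestratorCacheKey(channel: string, senderId: string, agentConfigName: string): string {
+  return `${channel}:${senderId}:${agentConfigName}`;
+}
+
+// Steps 4-5: sandbox only when both the global switch and the agent opt in.
+function shouldSandbox(globalSandboxEnabled: boolean, agent: AgentConfig): boolean {
+  return globalSandboxEnabled && agent.sandbox === true;
+}
+
+// Generic get-or-create cache keyed by the string built above.
+class OrchestratorCache<T> {
+  private entries = new Map<string, T>();
+
+  getOrCreate(key: string, factory: () => T): T {
+    let entry = this.entries.get(key);
+    if (entry === undefined) {
+      entry = factory();
+      this.entries.set(key, entry);
+    }
+    return entry;
+  }
+}
+```
+
+One consequence of this key shape is that orchestrators built for a previous agent assignment stay in the cache until something evicts them; the design as written does not say when that happens.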
+
+---
+
+## Modified Files
+
+- `src/config/schema.ts` — Add `sandboxSchema`, `agentConfigSchema`, `routingSchema`
+- `src/config/index.ts` — Export new types
+- `src/daemon/index.ts` — Wire SandboxManager + AgentRouter into message handler
+- `src/tools/registry.ts` — Add `clone()` method for per-session copies
+
+## Testing
+
+- All Docker interactions mocked (no real Docker in tests)
+- Agent router tested with config fixtures (exact, glob, channel, default fallback)
+- Sandboxed tools tested with mocked Docker CLI exec
+- Integration tested via daemon message handler with mocked dependencies
+
+## Dependencies
+
+- No new npm dependencies: the Docker CLI is invoked directly, and glob matching uses `minimatch` if it is already available, or a small hand-written matcher otherwise
+- Runtime: Docker must be installed on host for sandbox feature to work (graceful degradation if absent)
diff --git a/docs/plans/2026-02-06-p2-docker-sandbox-multi-agent-implementation.md b/docs/plans/2026-02-06-p2-docker-sandbox-multi-agent-implementation.md
new file mode 100644
index 0000000..25e8360
--- /dev/null
+++ b/docs/plans/2026-02-06-p2-docker-sandbox-multi-agent-implementation.md
@@ -0,0 +1,1832 @@
+# P2: Docker Sandboxing + Multi-Agent Routing — Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
+**Goal:** Add Docker container sandboxing for channel tool execution and named agent configuration with config-based routing.
+
+**Architecture:** Tool-level wrapping — sandboxed `shell.exec` and `process.start` delegate to `docker exec` inside per-session containers. Agent config registry stores named agent definitions (system prompt, model tier, tool profile, sandbox flag) with config-based routing that maps channels/senders to agent configs.
+
+**Tech Stack:** TypeScript (ES2022, NodeNext), Zod schemas, Vitest tests, Docker CLI (no SDK dependency), `child_process.execFile`.
+
+---
+
+## Task 1: Config Schema — Sandbox + Agent Configs + Routing
+
+**Files:**
+- Modify: `src/config/schema.ts:164-231`
+- Modify: `src/config/index.ts:1-3`
+
+**Step 1: Write the failing test**
+
+Create file: `src/config/schema.test.ts`
+
+```typescript
+import { describe, it, expect } from 'vitest';
+import { configSchema } from './schema.js';
+
+describe('configSchema — sandbox', () => {
+  const minimalConfig = {
+    telegram: { bot_token: 'test', allowed_chat_ids: [1] },
+    models: { default: { provider: 'anthropic', model: 'claude-3' } },
+  };
+
+  it('defaults sandbox to disabled', () => {
+    const result = configSchema.parse(minimalConfig);
+    expect(result.sandbox.enabled).toBe(false);
+    expect(result.sandbox.image).toBe('node:22-slim');
+    expect(result.sandbox.network).toBe('none');
+    expect(result.sandbox.memory_limit).toBe('512m');
+    expect(result.sandbox.cpu_limit).toBe('1.0');
+    expect(result.sandbox.timeout_seconds).toBe(300);
+  });
+
+  it('accepts sandbox config', () => {
+    const result = configSchema.parse({
+      ...minimalConfig,
+      sandbox: { enabled: true, image: 'ubuntu:24.04', network: 'bridge' },
+    });
+    expect(result.sandbox.enabled).toBe(true);
+    expect(result.sandbox.image).toBe('ubuntu:24.04');
+    expect(result.sandbox.network).toBe('bridge');
+  });
+});
+
+describe('configSchema — agent_configs', () => {
+  const minimalConfig = {
+    telegram: { bot_token: 'test', allowed_chat_ids: [1] },
+    models: { default: { provider: 'anthropic', model: 'claude-3' } },
+  };
+
+  it('defaults agent_configs to empty', () => {
+    const result = configSchema.parse(minimalConfig);
+    expect(result.agent_configs).toEqual({});
+  });
+
+  it('accepts named agent configs', () => {
+    const result = configSchema.parse({
+      ...minimalConfig,
+      agent_configs: {
+        assistant: {
+          system_prompt: 'You are helpful.',
+          model_tier: 'default',
+          tool_profile: 'messaging',
+        },
+        coder: {
+          model_tier: 'complex',
+          tool_profile: 'coding',
+          sandbox: true,
+        },
+      },
+    });
+    expect(result.agent_configs.assistant.system_prompt).toBe('You are helpful.');
+    expect(result.agent_configs.assistant.tool_profile).toBe('messaging');
+    expect(result.agent_configs.coder.sandbox).toBe(true);
+  });
+});
+
+describe('configSchema — routing', () => {
+  const minimalConfig = {
+    telegram: { bot_token: 'test', allowed_chat_ids: [1] },
+    models: { default: { provider: 'anthropic', model: 'claude-3' } },
+  };
+
+  it('defaults routing to empty', () => {
+    const result = configSchema.parse(minimalConfig);
+    expect(result.routing.default_agent).toBeUndefined();
+    expect(result.routing.channels).toEqual({});
+    expect(result.routing.senders).toEqual({});
+  });
+
+  it('accepts routing config', () => {
+    const result = configSchema.parse({
+      ...minimalConfig,
+      routing: {
+        default_agent: 'assistant',
+        channels: { discord: 'coder' },
+        senders: { 'telegram:12345': 'coder' },
+      },
+    });
+    expect(result.routing.default_agent).toBe('assistant');
+    expect(result.routing.channels.discord).toBe('coder');
+    expect(result.routing.senders['telegram:12345']).toBe('coder');
+  });
+});
+```
+
+**Step 2: Run test to verify it fails**
+
+Run: `pnpm vitest run src/config/schema.test.ts`
+Expected: FAIL — `sandbox`, `agent_configs`, and `routing` properties don't exist on config
+
+**Step 3: Implement the schema additions**
+
+Add to `src/config/schema.ts` before the `configSchema` definition (before line 192):
+
+```typescript
+// ── Sandbox schemas ───────────────────────────────────────────────────
+
+const sandboxSchema = z.object({
+  enabled: z.boolean().default(false),
+  image: z.string().default('node:22-slim'),
+  workspace_dir: z.string().default('/workspace'),
+  network: z.enum(['none', 'bridge', 'host']).default('none'),
+  memory_limit: z.string().default('512m'),
+  cpu_limit: z.string().default('1.0'),
+  timeout_seconds: z.number().min(10).max(3600).default(300),
+}).default({});
+
+// ── Agent config + routing schemas ────────────────────────────────────
+
+const modelTierEnum = z.enum(['fast', 'default', 'complex', 'local']);
+
+const agentConfigEntrySchema = z.object({
+  system_prompt: z.string().optional(),
+  model_tier: modelTierEnum.optional(),
+  tool_profile: toolProfileEnum.optional(),
+  tool_overrides: toolOverrideSchema.optional(),
+  sandbox: z.boolean().default(false),
+});
+
+const agentConfigsSchema = z.record(z.string(), agentConfigEntrySchema).default({});
+
+const routingSchema = z.object({
+  default_agent: z.string().optional(),
+  channels: z.record(z.string(), z.string()).default({}),
+  senders: z.record(z.string(), z.string()).default({}),
+}).default({});
+```
+
+Then add these three new fields to the `configSchema` `z.object` (around lines 192-212):
+
+```typescript
+  sandbox: sandboxSchema,
+  agent_configs: agentConfigsSchema,
+  routing: routingSchema,
+```
+
+And add type exports at the end (after line 230):
+
+```typescript
+export type SandboxConfig = z.infer<typeof sandboxSchema>;
+export type AgentConfigEntry = z.infer<typeof agentConfigEntrySchema>;
+export type RoutingConfig = z.infer<typeof routingSchema>;
+```
+
+**Step 4: Update `src/config/index.ts` barrel export**
+
+Add the new types to the export line:
+
+```typescript
+export { configSchema, type Config, type TelegramConfig, type ModelConfig, type CronJobConfig, type AgentsConfig, type CompactionConfig, type ToolProfile, type ToolOverrideConfig, type ToolsConfig, type SandboxConfig, type AgentConfigEntry, type RoutingConfig } from './schema.js';
+```
+
+**Step 5: Run test to verify it passes**
+
+Run: `pnpm vitest run src/config/schema.test.ts`
+Expected: PASS (all 6 tests)
+
+**Step 6: Run full test suite**
+
+Run: `pnpm test:run`
+Expected: All 606+ tests pass
+
+**Step 7: Commit**
+
+```bash
+git add src/config/schema.ts src/config/schema.test.ts src/config/index.ts
+git commit -m "feat: add sandbox, agent_configs, and routing config schemas"
+```
+
+---
+
+## Task 2: DockerSandbox Class
+
+**Files:**
+- Create: `src/sandbox/docker.ts`
+- Create: `src/sandbox/docker.test.ts`
+
+**Step 1: Write the failing test**
+
+Create file: `src/sandbox/docker.test.ts`
+
+```typescript
+import { describe, it, expect, vi, beforeEach } from 'vitest';
+import { DockerSandbox, type DockerSandboxConfig } from './docker.js';
+import * as childProcess from 'child_process';
+
+// Mock child_process.execFile
+vi.mock('child_process', () => ({
+  execFile: vi.fn(),
+}));
+
+const mockedExecFile = vi.mocked(childProcess.execFile);
+
+function mockExecFileSuccess(stdout = '', stderr = '') {
+  mockedExecFile.mockImplementation(
+    (_cmd: unknown, _args: unknown, _opts: unknown, callback: unknown) => {
+      (callback as (err: null, stdout: string, stderr: string) => void)(null, stdout, stderr);
+      return {} as ReturnType<typeof childProcess.execFile>;
+    },
+  );
+}
+
+function mockExecFileError(message: string) {
+  mockedExecFile.mockImplementation(
+    (_cmd: unknown, _args: unknown, _opts: unknown, callback: unknown) => {
+      (callback as (err: Error) => void)(new Error(message));
+      return {} as ReturnType<typeof childProcess.execFile>;
+    },
+  );
+}
+
+describe('DockerSandbox', () => {
+  const defaultConfig: DockerSandboxConfig = {
+    sessionId: 'test-session',
+    image: 'node:22-slim',
+    workspaceDir: '/workspace',
+    network: 'none',
+    memoryLimit: '512m',
+    cpuLimit: '1.0',
+    timeoutSeconds: 300,
+  };
+
+  beforeEach(() => {
+    vi.clearAllMocks();
+  });
+
+  describe('create()', () => {
+    it('creates a docker container with correct args', async () => {
+      mockExecFileSuccess('container-abc123');
+      const sandbox = new DockerSandbox(defaultConfig);
+      await sandbox.create();
+
+      expect(mockedExecFile).toHaveBeenCalledWith(
+        'docker',
+        expect.arrayContaining([
+          'create',
+          '--name', expect.stringContaining('flynn-test-session'),
+          '--memory', '512m',
+          '--cpus', '1.0',
+          '--network', 'none',
+          '-v', expect.stringContaining(':/workspace'),
+          'node:22-slim',
+          'sleep', 'infinity',
+        ]),
+        expect.any(Object),
+        expect.any(Function),
+      );
+      expect(sandbox.containerId).toBe('container-abc123');
+    });
+
+    it('starts the container after creating', async () => {
+      mockExecFileSuccess('container-abc123');
+      const sandbox = new DockerSandbox(defaultConfig);
+      await sandbox.create();
+
+      // Second call should be docker start
+      expect(mockedExecFile).toHaveBeenCalledTimes(2);
+      expect(mockedExecFile).toHaveBeenNthCalledWith(
+        2, 'docker', ['start', 'container-abc123'],
+        expect.any(Object), expect.any(Function),
+      );
+    });
+
+    it('throws if docker create fails', async () => {
+      mockExecFileError('docker not found');
+      const sandbox = new DockerSandbox(defaultConfig);
+      await expect(sandbox.create()).rejects.toThrow('docker not found');
+    });
+  });
+
+  describe('exec()', () => {
+    it('runs command inside container', async () => {
+      const sandbox = new DockerSandbox(defaultConfig);
+      // Manually set container ID to skip create
+      (sandbox as unknown as { _containerId: string })._containerId = 'container-abc';
+
+      mockExecFileSuccess('hello world\n');
+      const result = await sandbox.exec('echo hello world');
+
+      expect(mockedExecFile).toHaveBeenCalledWith(
+        'docker',
+        ['exec', 'container-abc', 'bash', '-c', 'echo hello world'],
+        expect.objectContaining({ timeout: expect.any(Number) }),
+        expect.any(Function),
+      );
+      expect(result).toEqual({ stdout: 'hello world\n', stderr: '' });
+    });
+
+    it('passes cwd as workdir option', async () => {
+      const sandbox = new DockerSandbox(defaultConfig);
+      (sandbox as unknown as { _containerId: string })._containerId = 'container-abc';
+
+      mockExecFileSuccess('');
+      await sandbox.exec('ls', { cwd: '/workspace/project' });
+
+      expect(mockedExecFile).toHaveBeenCalledWith(
+        'docker',
+        ['exec', '-w', '/workspace/project', 'container-abc', 'bash', '-c', 'ls'],
+        expect.any(Object),
+        expect.any(Function),
+      );
+    });
+
+    it('throws if no container created', async () => {
+      const sandbox = new DockerSandbox(defaultConfig);
+      await expect(sandbox.exec('echo hi')).rejects.toThrow('not created');
+    });
+  });
+
+  describe('destroy()', () => {
+    it('force-removes the container', async () => {
+      const sandbox = new DockerSandbox(defaultConfig);
+      (sandbox as unknown as { _containerId: string })._containerId = 'container-abc';
+
+      mockExecFileSuccess();
+      await sandbox.destroy();
+
+      expect(mockedExecFile).toHaveBeenCalledWith(
+        'docker', ['rm', '-f', 'container-abc'],
+        expect.any(Object), expect.any(Function),
+      );
+    });
+
+    it('does nothing if no container', async () => {
+      const sandbox = new DockerSandbox(defaultConfig);
+      await sandbox.destroy(); // should not throw
+      expect(mockedExecFile).not.toHaveBeenCalled();
+    });
+  });
+
+  describe('isAvailable()', () => {
+    it('returns true when docker is installed', async () => {
+      mockExecFileSuccess('Docker version 27.0.0');
+      const result = await DockerSandbox.isAvailable();
+      expect(result).toBe(true);
+    });
+
+    it('returns false when docker is not installed', async () => {
+      mockExecFileError('command not found');
+      const result = await DockerSandbox.isAvailable();
+      expect(result).toBe(false);
+    });
+  });
+});
+```
+
+**Step 2: Run test to verify it fails**
+
+Run: `pnpm vitest run src/sandbox/docker.test.ts`
+Expected: FAIL — cannot find module `./docker.js`
+
+**Step 3: Implement DockerSandbox**
+
+Create file: `src/sandbox/docker.ts`
+
+```typescript
+import { execFile } from 'child_process';
+
+export interface DockerSandboxConfig {
+  sessionId: string;
+  image: string;
+  workspaceDir: string;
+  network: 'none' | 'bridge' | 'host';
+  memoryLimit: string;
+  cpuLimit: string;
+  timeoutSeconds: number;
+}
+
+export interface ExecOptions {
+  cwd?: string;
+  timeout?: number;
+}
+
+export interface ExecResult {
+  stdout: string;
+  stderr: string;
+}
+
+/**
+ * Manages a single Docker container for sandboxed tool execution.
+ * Uses the Docker CLI directly (no SDK dependency).
+ */
+export class DockerSandbox {
+  private config: DockerSandboxConfig;
+  private _containerId: string | null = null;
+  private _hostWorkdir: string;
+
+  constructor(config: DockerSandboxConfig) {
+    this.config = config;
+    // Use a temp directory on the host, named by session
+    const sanitizedId = config.sessionId.replace(/[^a-zA-Z0-9_-]/g, '_');
+    this._hostWorkdir = `/tmp/flynn-sandbox-${sanitizedId}`;
+  }
+
+  get containerId(): string | null {
+    return this._containerId;
+  }
+
+  get containerName(): string {
+    const sanitizedId = this.config.sessionId.replace(/[^a-zA-Z0-9_-]/g, '_');
+    return `flynn-${sanitizedId}`;
+  }
+
+  /** Create and start the sandbox container. */
+  async create(): Promise<void> {
+    const args = [
+      'create',
+      '--name', this.containerName,
+      '--memory', this.config.memoryLimit,
+      '--cpus', this.config.cpuLimit,
+      '--network', this.config.network,
+      '-v', `${this._hostWorkdir}:${this.config.workspaceDir}`,
+      this.config.image,
+      'sleep', 'infinity',
+    ];
+
+    const createResult = await this.dockerCmd(args);
+    this._containerId = createResult.stdout.trim();
+
+    await this.dockerCmd(['start', this._containerId]);
+  }
+
+  /** Execute a command inside the container. */
+  async exec(command: string, opts?: ExecOptions): Promise<ExecResult> {
+    if (!this._containerId) {
+      throw new Error('Sandbox container not created. Call create() first.');
+    }
+
+    const args = ['exec'];
+    if (opts?.cwd) {
+      args.push('-w', opts.cwd);
+    }
+    args.push(this._containerId, 'bash', '-c', command);
+
+    const timeout = opts?.timeout ?? this.config.timeoutSeconds * 1000;
+    return this.dockerCmd(args, timeout);
+  }
+
+  /** Force-remove the container. */
+  async destroy(): Promise<void> {
+    if (!this._containerId) return;
+
+    try {
+      await this.dockerCmd(['rm', '-f', this._containerId]);
+    } catch {
+      // Ignore errors during cleanup
+    }
+    this._containerId = null;
+  }
+
+  /** Check if Docker is available on this host. */
+  static async isAvailable(): Promise<boolean> {
+    try {
+      await new Promise<string>((resolve, reject) => {
+        execFile('docker', ['version', '--format', '{{.Server.Version}}'], {
+          timeout: 5000,
+        }, (error, stdout) => {
+          if (error) reject(error);
+          else resolve(stdout);
+        });
+      });
+      return true;
+    } catch {
+      return false;
+    }
+  }
+
+  /** Run a docker CLI command. */
+  private dockerCmd(args: string[], timeout = 30_000): Promise<ExecResult> {
+    return new Promise((resolve, reject) => {
+      execFile('docker', args, { timeout, maxBuffer: 1024 * 1024 }, (error, stdout, stderr) => {
+        if (error) {
+          reject(error);
+          return;
+        }
+        resolve({ stdout, stderr });
+      });
+    });
+  }
+}
+```
+
+**Step 4: Run test to verify it passes**
+
+Run: `pnpm vitest run src/sandbox/docker.test.ts`
+Expected: PASS (all tests)
+
+**Step 5: Commit**
+
+```bash
+git add src/sandbox/docker.ts src/sandbox/docker.test.ts
+git commit -m "feat: add DockerSandbox class for container lifecycle"
+```
+
+---
+
+## Task 3: SandboxManager
+
+**Files:**
+- Create: `src/sandbox/manager.ts`
+- Create: `src/sandbox/manager.test.ts`
+
+**Step 1: Write the failing test**
+
+Create file: `src/sandbox/manager.test.ts`
+
+```typescript
+import { describe, it, expect, vi, beforeEach } from 'vitest';
+import { SandboxManager } from './manager.js';
+import { DockerSandbox } from './docker.js';
+import type { SandboxConfig } from '../config/schema.js';
+
+// Mock DockerSandbox
+vi.mock('./docker.js', () => ({
+  DockerSandbox: vi.fn().mockImplementation(() => ({
+    create: vi.fn().mockResolvedValue(undefined),
+    destroy: vi.fn().mockResolvedValue(undefined),
+    exec: vi.fn().mockResolvedValue({ stdout: '', stderr: '' }),
+    containerId: 'mock-container',
+  })),
+}));
+
+describe('SandboxManager', () => {
+  const defaultConfig: SandboxConfig = {
+    enabled: true,
+    image: 'node:22-slim',
+    workspace_dir: '/workspace',
+    network: 'none',
+    memory_limit: '512m',
+    cpu_limit: '1.0',
+    timeout_seconds: 300,
+  };
+
+  beforeEach(() => {
+    vi.clearAllMocks();
+  });
+
+  describe('getOrCreate()', () => {
+    it('creates a new sandbox for unknown session', async () => {
+      const manager = new SandboxManager(defaultConfig);
+      const sandbox = await manager.getOrCreate('session-1');
+
+      expect(DockerSandbox).toHaveBeenCalledWith(expect.objectContaining({
+        sessionId: 'session-1',
+        image: 'node:22-slim',
+      }));
+      expect(sandbox.create).toHaveBeenCalled();
+    });
+
+    it('reuses existing sandbox for same session', async () => {
+      const manager = new SandboxManager(defaultConfig);
+      const first = await manager.getOrCreate('session-1');
+      const second = await manager.getOrCreate('session-1');
+
+      expect(first).toBe(second);
+      expect(DockerSandbox).toHaveBeenCalledTimes(1);
+    });
+
+    it('creates separate sandboxes for different sessions', async () => {
+      const manager = new SandboxManager(defaultConfig);
+      await manager.getOrCreate('session-1');
+      await manager.getOrCreate('session-2');
+
+      expect(DockerSandbox).toHaveBeenCalledTimes(2);
+    });
+  });
+
+  describe('destroy()', () => {
+    it('destroys sandbox and removes from cache', async () => {
+      const manager = new SandboxManager(defaultConfig);
+      const sandbox = await manager.getOrCreate('session-1');
+
+      await manager.destroy('session-1');
+      expect(sandbox.destroy).toHaveBeenCalled();
+
+      // Should create a new one now
+      await manager.getOrCreate('session-1');
+      expect(DockerSandbox).toHaveBeenCalledTimes(2);
+    });
+
+    it('does nothing for unknown session', async () => {
+      const manager = new SandboxManager(defaultConfig);
+      await manager.destroy('nonexistent'); // should not throw
+    });
+  });
+
+  describe('destroyAll()', () => {
+    it('destroys all sandboxes', async () => {
+      const manager = new SandboxManager(defaultConfig);
+      const s1 = await manager.getOrCreate('session-1');
+      const s2 = await manager.getOrCreate('session-2');
+
+      await manager.destroyAll();
+      expect(s1.destroy).toHaveBeenCalled();
+      expect(s2.destroy).toHaveBeenCalled();
+    });
+  });
+});
+```
+
+**Step 2: Run test to verify it fails**
+
+Run: `pnpm vitest run src/sandbox/manager.test.ts`
+Expected: FAIL — cannot find module `./manager.js`
+
+**Step 3: Implement SandboxManager**
+
+Create file: `src/sandbox/manager.ts`
+
+```typescript
+import { DockerSandbox } from './docker.js';
+import type { SandboxConfig } from '../config/schema.js';
+
+/**
+ * Manages per-session Docker sandboxes.
+ * Creates containers lazily on first access, destroys on session cleanup.
+ */
+export class SandboxManager {
+  private sandboxes = new Map<string, DockerSandbox>();
+  private config: SandboxConfig;
+
+  constructor(config: SandboxConfig) {
+    this.config = config;
+  }
+
+  /** Get or create a sandbox for a session. */
+  async getOrCreate(sessionId: string): Promise<DockerSandbox> {
+    let sandbox = this.sandboxes.get(sessionId);
+    if (sandbox) return sandbox;
+
+    sandbox = new DockerSandbox({
+      sessionId,
+      image: this.config.image,
+      workspaceDir: this.config.workspace_dir,
+      network: this.config.network,
+      memoryLimit: this.config.memory_limit,
+      cpuLimit: this.config.cpu_limit,
+      timeoutSeconds: this.config.timeout_seconds,
+    });
+
+    await sandbox.create();
+    this.sandboxes.set(sessionId, sandbox);
+    return sandbox;
+  }
+
+  /** Destroy a specific session's sandbox. */
+  async destroy(sessionId: string): Promise<void> {
+    const sandbox = this.sandboxes.get(sessionId);
+    if (!sandbox) return;
+
+    await sandbox.destroy();
+    this.sandboxes.delete(sessionId);
+  }
+
+  /** Destroy all sandboxes (daemon shutdown). */
+  async destroyAll(): Promise<void> {
+    const entries = Array.from(this.sandboxes.entries());
+    await Promise.allSettled(
+      entries.map(async ([id, sandbox]) => {
+        await sandbox.destroy();
+        this.sandboxes.delete(id);
+      }),
+    );
+  }
+}
+```
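+
+One case the `getOrCreate()` above does not cover is two tool calls for the same session arriving before the first `create()` resolves: the map is only populated after the `await`, so both calls would construct a sandbox, and the second `docker create` would likely fail on the duplicate container name. A possible hardening, which is not part of this plan, is to cache the in-flight promise instead of the finished sandbox. The sketch below shows the idea with a generic, hypothetical `LazyAsyncMap` helper.
+
+```typescript
+// Hypothetical helper, not part of the plan's SandboxManager.
+// Stores the promise for each key so concurrent callers share one creation.
+class LazyAsyncMap<V> {
+  private pending = new Map<string, Promise<V>>();
+
+  getOrCreate(key: string, create: () => Promise<V>): Promise<V> {
+    const existing = this.pending.get(key);
+    if (existing) return existing;
+
+    const promise = create().catch((err) => {
+      // Drop failed attempts so a later call can retry.
+      this.pending.delete(key);
+      throw err;
+    });
+    this.pending.set(key, promise);
+    return promise;
+  }
+
+  delete(key: string): void {
+    this.pending.delete(key);
+  }
+}
+```
+
+Under this approach `SandboxManager` would hold a `LazyAsyncMap<DockerSandbox>` and pass a factory that constructs the sandbox and awaits `create()`; whether the extra indirection is worth it depends on how often a session issues concurrent tool calls.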
+
+**Step 4: Run test to verify it passes**
+
+Run: `pnpm vitest run src/sandbox/manager.test.ts`
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add src/sandbox/manager.ts src/sandbox/manager.test.ts
+git commit -m "feat: add SandboxManager for per-session container lifecycle"
+```
+
+---
+
+## Task 4: Sandboxed Tool Wrappers
+
+**Files:**
+- Create: `src/sandbox/tools.ts`
+- Create: `src/sandbox/tools.test.ts`
+
+**Step 1: Write the failing test**
+
+Create file: `src/sandbox/tools.test.ts`
+
+```typescript
+import { describe, it, expect, vi, beforeEach } from 'vitest';
+import { createSandboxedShellTool, createSandboxedProcessStartTool } from './tools.js';
+import type { DockerSandbox } from './docker.js';
+
+function mockSandbox(): DockerSandbox {
+  return {
+    exec: vi.fn().mockResolvedValue({ stdout: 'output', stderr: '' }),
+    create: vi.fn(),
+    destroy: vi.fn(),
+    containerId: 'test-container',
+    containerName: 'flynn-test',
+    config: {},
+  } as unknown as DockerSandbox;
+}
+
+describe('createSandboxedShellTool', () => {
+  let sandbox: DockerSandbox;
+
+  beforeEach(() => {
+    sandbox = mockSandbox();
+  });
+
+  it('has the same name as shell.exec', () => {
+    const tool = createSandboxedShellTool(sandbox);
+    expect(tool.name).toBe('shell.exec');
+  });
+
+  it('delegates to sandbox.exec', async () => {
+    const tool = createSandboxedShellTool(sandbox);
+    const result = await tool.execute({ command: 'echo hello' });
+
+    expect(sandbox.exec).toHaveBeenCalledWith('echo hello', { cwd: undefined, timeout: 30000 });
+    expect(result.success).toBe(true);
+    expect(result.output).toBe('output');
+  });
+
+  it('passes cwd to sandbox.exec', async () => {
+    const tool = createSandboxedShellTool(sandbox);
+    await tool.execute({ command: 'ls', cwd: '/workspace/project' });
+
+    expect(sandbox.exec).toHaveBeenCalledWith('ls', { cwd: '/workspace/project', timeout: 30000 });
+  });
+
+  it('passes timeout to sandbox.exec', async () => {
+    const tool = createSandboxedShellTool(sandbox);
+    await tool.execute({ command: 'sleep 10', timeout: 5000 });
+
+    expect(sandbox.exec).toHaveBeenCalledWith('sleep 10', { cwd: undefined, timeout: 5000 });
+  });
+
+  it('returns error on sandbox.exec failure', async () => {
+    (sandbox.exec as ReturnType<typeof vi.fn>).mockRejectedValue(new Error('container dead'));
+    const tool = createSandboxedShellTool(sandbox);
+    const result = await tool.execute({ command: 'fail' });
+
+    expect(result.success).toBe(false);
+    expect(result.error).toBe('container dead');
+  });
+
+  it('includes stderr in output', async () => {
+    (sandbox.exec as ReturnType<typeof vi.fn>).mockResolvedValue({ stdout: 'out', stderr: 'warn' });
+    const tool = createSandboxedShellTool(sandbox);
+    const result = await tool.execute({ command: 'cmd' });
+
+    expect(result.output).toContain('out');
+    expect(result.output).toContain('stderr: warn');
+  });
+});
+
+describe('createSandboxedProcessStartTool', () => {
+  let sandbox: DockerSandbox;
+
+  beforeEach(() => {
+    sandbox = mockSandbox();
+  });
+
+  it('has the same name as process.start', () => {
+    const tool = createSandboxedProcessStartTool(sandbox);
+    expect(tool.name).toBe('process.start');
+  });
+
+  it('runs detached command via sandbox', async () => {
+    const tool = createSandboxedProcessStartTool(sandbox);
+    const result = await tool.execute({ command: 'npm run dev' });
+
+    expect(sandbox.exec).toHaveBeenCalledWith(
+      expect.stringContaining('npm run dev'),
+      expect.any(Object),
+    );
+    expect(result.success).toBe(true);
+    expect(result.output).toContain('Started sandboxed background process');
+  });
+});
+```
+
+**Step 2: Run test to verify it fails**
+
+Run: `pnpm vitest run src/sandbox/tools.test.ts`
+Expected: FAIL — cannot find module `./tools.js`
+
+**Step 3: Implement sandboxed tools**
+
+Create file: `src/sandbox/tools.ts`
+
+```typescript
+import type { Tool, ToolResult } from '../tools/types.js';
+import type { DockerSandbox } from './docker.js';
+
+interface ShellExecArgs {
+  command: string;
+  cwd?: string;
+  timeout?: number;
+}
+
+interface ProcessStartArgs {
+  command: string;
+  cwd?: string;
+}
+
+/**
+ * Create a sandboxed version of shell.exec that delegates to docker exec.
+ * Same Tool interface — drop-in replacement for the host shell.exec.
+ */
+export function createSandboxedShellTool(sandbox: DockerSandbox): Tool {
+  return {
+    name: 'shell.exec',
+    description: 'Execute a shell command inside a sandboxed container and return stdout/stderr.',
+    inputSchema: {
+      type: 'object',
+      properties: {
+        command: { type: 'string', description: 'The shell command to execute' },
+        cwd: { type: 'string', description: 'Working directory inside the container (optional)' },
+        timeout: { type: 'number', description: 'Timeout in milliseconds (default 30000)' },
+      },
+      required: ['command'],
+    },
+    execute: async (rawArgs: unknown): Promise<ToolResult> => {
+      const args = rawArgs as ShellExecArgs;
+      const timeout = args.timeout ?? 30_000;
+
+      try {
+        const result = await sandbox.exec(args.command, {
+          cwd: args.cwd,
+          timeout,
+        });
+
+        const output = result.stdout + (result.stderr ? `\nstderr: ${result.stderr}` : '');
+        return { success: true, output };
+      } catch (error) {
+        return {
+          success: false,
+          output: '',
+          error: error instanceof Error ? error.message : String(error),
+        };
+      }
+    },
+  };
+}
+
+/**
+ * Create a sandboxed version of process.start that runs in the container.
+ * Uses `nohup ... &` via docker exec since we can't spawn detached inside containers.
+ */
+export function createSandboxedProcessStartTool(sandbox: DockerSandbox): Tool {
+  return {
+    name: 'process.start',
+    description: 'Start a command in the background inside a sandboxed container.',
+    inputSchema: {
+      type: 'object',
+      properties: {
+        command: { type: 'string', description: 'The shell command to run in the background' },
+        cwd: { type: 'string', description: 'Working directory inside the container (optional)' },
+      },
+      required: ['command'],
+    },
+    execute: async (rawArgs: unknown): Promise<ToolResult> => {
+      const args = rawArgs as ProcessStartArgs;
+
+      try {
+        // Run via nohup + background in the container
+        const wrappedCmd = `nohup bash -c '${args.command.replace(/'/g, "'\\''")}' > /tmp/proc.log 2>&1 & echo $!`;
+        const result = await sandbox.exec(wrappedCmd, { cwd: args.cwd });
+
+        const pid = result.stdout.trim();
+        return {
+          success: true,
+          output: `Started sandboxed background process (PID ${pid})\nCommand: ${args.command}`,
+        };
+      } catch (error) {
+        return {
+          success: false,
+          output: '',
+          error: error instanceof Error ? error.message : 'Failed to start sandboxed process',
+        };
+      }
+    },
+  };
+}
+```
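+
+The `'\''` replacement in `wrappedCmd` is the usual POSIX idiom for embedding a single quote inside a single-quoted string: close the quote, emit an escaped quote, reopen the quote. A minimal sketch of that quoting in isolation, assuming hypothetical helper names `shellSingleQuote` and `wrapBackground` that are not part of `src/sandbox/tools.ts`:
+
+```typescript
+// Hypothetical helpers for illustration; they are not part of src/sandbox/tools.ts.
+
+// Wraps a string in single quotes for a POSIX shell, replacing each embedded
+// single quote with '\'' (close quote, escaped quote, reopen quote).
+function shellSingleQuote(value: string): string {
+  return `'${value.replace(/'/g, "'\\''")}'`;
+}
+
+// Builds the same wrapper string that the sandboxed process.start tool passes to sandbox.exec.
+function wrapBackground(command: string): string {
+  return `nohup bash -c ${shellSingleQuote(command)} > /tmp/proc.log 2>&1 & echo $!`;
+}
+
+// Example: a command that itself contains single quotes.
+console.log(wrapBackground("echo 'hi'"));
+```
+
+Note that the wrapper as written redirects every background process to the same `/tmp/proc.log`, so a later process truncates the log of an earlier one.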
+
+**Step 4: Run test to verify it passes**
+
+Run: `pnpm vitest run src/sandbox/tools.test.ts`
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add src/sandbox/tools.ts src/sandbox/tools.test.ts
+git commit -m "feat: add sandboxed tool wrappers for shell.exec and process.start"
+```
+
+---
+
+## Task 5: Sandbox Barrel Export + ToolRegistry.clone()
+
+**Files:**
+- Create: `src/sandbox/index.ts`
+- Create or extend: `src/tools/registry.test.ts`
+- Modify: `src/tools/registry.ts:19-97`
+
+**Step 1: Write the failing test for ToolRegistry.clone()**
+
+Create `src/tools/registry.test.ts` with the tests below. If the file already exists, add the two `describe` blocks to it instead:
+
+```typescript
+import { describe, it, expect, vi } from 'vitest';
+import { ToolRegistry } from './registry.js';
+import type { Tool } from './types.js';
+
+function makeTool(name: string): Tool {
+  return {
+    name,
+    description: `Mock ${name}`,
+    inputSchema: { type: 'object', properties: {} },
+    execute: async () => ({ success: true, output: '' }),
+  };
+}
+
+describe('ToolRegistry', () => {
+  describe('clone()', () => {
+    it('creates a copy with all tools', () => {
+      const reg = new ToolRegistry();
+      reg.register(makeTool('tool.a'));
+      reg.register(makeTool('tool.b'));
+
+      const cloned = reg.clone();
+      expect(cloned.list().map(t => t.name).sort()).toEqual(['tool.a', 'tool.b']);
+    });
+
+    it('inherits the policy from original', () => {
+      const reg = new ToolRegistry();
+      const mockPolicy = { filterTools: vi.fn(), isAllowed: vi.fn(), resolveAllowedNames: vi.fn(), getEffectiveProfile: vi.fn() };
+      reg.setPolicy(mockPolicy as any);
+
+      const cloned = reg.clone();
+      expect(cloned.getPolicy()).toBe(mockPolicy);
+    });
+
+    it('allows replacing tools in clone without affecting original', () => {
+      const reg = new ToolRegistry();
+      const originalTool = makeTool('shell.exec');
+      reg.register(originalTool);
+
+      const cloned = reg.clone();
+      const replacementTool = makeTool('shell.exec');
+      replacementTool.description = 'Sandboxed version';
+
+      cloned.replace(replacementTool);
+      expect(cloned.get('shell.exec')!.description).toBe('Sandboxed version');
+      expect(reg.get('shell.exec')!.description).toBe('Mock shell.exec');
+    });
+  });
+
+  describe('replace()', () => {
+    it('replaces an existing tool', () => {
+      const reg = new ToolRegistry();
+      reg.register(makeTool('tool.a'));
+      const replacement = makeTool('tool.a');
+      replacement.description = 'New description';
+
+      reg.replace(replacement);
+      expect(reg.get('tool.a')!.description).toBe('New description');
+    });
+
+    it('throws if tool does not exist', () => {
+      const reg = new ToolRegistry();
+      expect(() => reg.replace(makeTool('nonexistent'))).toThrow('not registered');
+    });
+  });
+});
+```
+
+Note: the `inherits the policy from original` test relies on `vi.fn()`, so `vi` must be part of the `vitest` import at the top of the file.
+
+**Step 2: Run test to verify it fails**
+
+Run: `pnpm vitest run src/tools/registry.test.ts`
+Expected: FAIL — `clone()` and `replace()` don't exist on ToolRegistry
+
+**Step 3: Add clone() and replace() to ToolRegistry**
+
+In `src/tools/registry.ts`, add these two methods to the `ToolRegistry` class (after the `unregister` method, around line 32):
+
+```typescript
+  /** Replace an existing tool with a new implementation. Throws if not registered. */
+  replace(tool: Tool): void {
+    if (!this.tools.has(tool.name)) {
+      throw new Error(`Tool '${tool.name}' is not registered — cannot replace`);
+    }
+    this.tools.set(tool.name, tool);
+  }
+
+  /** Create a shallow clone of this registry (new Map, same Tool objects + policy). */
+  clone(): ToolRegistry {
+    const cloned = new ToolRegistry();
+    for (const tool of this.tools.values()) {
+      cloned.register(tool);
+    }
+    if (this._policy) {
+      cloned.setPolicy(this._policy);
+    }
+    return cloned;
+  }
+```
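+
+Because `clone()` is shallow, the clone gets its own `Map` but shares the `Tool` objects with the original. The sketch below illustrates the consequence with stand-in types (`MiniTool` and `MiniRegistry` are hypothetical and much smaller than the real `ToolRegistry`): `replace()` on the clone leaves the original untouched, while mutating a shared tool object is visible through both registries.
+
+```typescript
+// Stand-in types for illustration only; the real Tool and ToolRegistry live in src/tools/.
+interface MiniTool {
+  name: string;
+  description: string;
+}
+
+class MiniRegistry {
+  private tools = new Map<string, MiniTool>();
+
+  register(tool: MiniTool): void {
+    this.tools.set(tool.name, tool);
+  }
+
+  replace(tool: MiniTool): void {
+    if (!this.tools.has(tool.name)) {
+      throw new Error(`Tool '${tool.name}' is not registered`);
+    }
+    this.tools.set(tool.name, tool);
+  }
+
+  get(name: string): MiniTool | undefined {
+    return this.tools.get(name);
+  }
+
+  // Shallow clone: a new Map whose entries point at the same tool objects.
+  clone(): MiniRegistry {
+    const cloned = new MiniRegistry();
+    for (const tool of this.tools.values()) {
+      cloned.register(tool);
+    }
+    return cloned;
+  }
+}
+
+const base = new MiniRegistry();
+base.register({ name: 'shell.exec', description: 'host' });
+
+// replace() swaps the Map entry in the clone only.
+const perSession = base.clone();
+perSession.replace({ name: 'shell.exec', description: 'sandboxed' });
+
+// Mutating a tool object that both registries reference affects both views.
+const shared: MiniTool = { name: 'file.read', description: 'v1' };
+base.register(shared);
+const second = base.clone();
+shared.description = 'v2';
+```
+
+For per-session sandboxing this means tools should be swapped with `replace()` rather than edited in place.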
+
+**Step 4: Create the sandbox barrel export**
+
+Create file: `src/sandbox/index.ts`
+
+```typescript
+export { DockerSandbox, type DockerSandboxConfig, type ExecOptions, type ExecResult } from './docker.js';
+export { SandboxManager } from './manager.js';
+export { createSandboxedShellTool, createSandboxedProcessStartTool } from './tools.js';
+```
+
+**Step 5: Run tests to verify they pass**
+
+Run: `pnpm vitest run src/tools/registry.test.ts`
+Expected: PASS
+
+**Step 6: Run full test suite**
+
+Run: `pnpm test:run`
+Expected: All tests pass
+
+**Step 7: Commit**
+
+```bash
+git add src/sandbox/index.ts src/tools/registry.ts src/tools/registry.test.ts
+git commit -m "feat: add ToolRegistry.clone() and replace() for per-session registries"
+```
+
+---
+
+## Task 6: Agent Config Registry
+
+**Files:**
+- Create: `src/agents/registry.ts`
+- Create: `src/agents/registry.test.ts`
+
+**Step 1: Write the failing test**
+
+Create file: `src/agents/registry.test.ts`
+
+```typescript
+import { describe, it, expect } from 'vitest';
+import { AgentConfigRegistry, type AgentConfig } from './registry.js';
+
+describe('AgentConfigRegistry', () => {
+  describe('register()', () => {
+    it('registers a named agent config', () => {
+      const registry = new AgentConfigRegistry();
+      const config: AgentConfig = { name: 'assistant', systemPrompt: 'Be helpful.' };
+      registry.register(config);
+
+      expect(registry.get('assistant')).toEqual(config);
+    });
+
+    it('throws on duplicate name', () => {
+      const registry = new AgentConfigRegistry();
+      registry.register({ name: 'assistant' });
+      expect(() => registry.register({ name: 'assistant' })).toThrow('already registered');
+    });
+  });
+
+  describe('get()', () => {
+    it('returns undefined for unknown name', () => {
+      const registry = new AgentConfigRegistry();
+      expect(registry.get('nonexistent')).toBeUndefined();
+    });
+  });
+
+  describe('list()', () => {
+    it('returns all registered configs', () => {
+      const registry = new AgentConfigRegistry();
+      registry.register({ name: 'a' });
+      registry.register({ name: 'b' });
+      expect(registry.list().map(c => c.name).sort()).toEqual(['a', 'b']);
+    });
+  });
+
+  describe('loadFromConfig()', () => {
+    it('loads configs from a raw config object', () => {
+      const registry = new AgentConfigRegistry();
+      registry.loadFromConfig({
+        assistant: {
+          system_prompt: 'Be helpful.',
+          model_tier: 'default',
+          tool_profile: 'messaging',
+          sandbox: false,
+        },
+        coder: {
+          model_tier: 'complex',
+          tool_profile: 'coding',
+          sandbox: true,
+        },
+      });
+
+      expect(registry.list()).toHaveLength(2);
+      const assistant = registry.get('assistant')!;
+      expect(assistant.systemPrompt).toBe('Be helpful.');
+      expect(assistant.modelTier).toBe('default');
+      expect(assistant.toolProfile).toBe('messaging');
+
+      const coder = registry.get('coder')!;
+      expect(coder.sandbox).toBe(true);
+    });
+  });
+});
+```
+
+**Step 2: Run test to verify it fails**
+
+Run: `pnpm vitest run src/agents/registry.test.ts`
+Expected: FAIL — cannot find module `./registry.js`
+
+**Step 3: Implement AgentConfigRegistry**
+
+Create file: `src/agents/registry.ts`
+
+```typescript
+import type { ToolProfile, ToolOverrideConfig } from '../config/schema.js';
+import type { ModelTier } from '../models/router.js';
+
+export interface AgentConfig {
+  name: string;
+  systemPrompt?: string;
+  modelTier?: ModelTier;
+  toolProfile?: ToolProfile;
+  toolOverrides?: ToolOverrideConfig;
+  sandbox?: boolean;
+}
+
+/**
+ * AgentConfigRegistry — stores named agent configurations.
+ * Loaded from YAML config at startup.
+ */
+export class AgentConfigRegistry {
+  private configs = new Map<string, AgentConfig>();
+
+  register(config: AgentConfig): void {
+    if (this.configs.has(config.name)) {
+      throw new Error(`Agent config '${config.name}' is already registered`);
+    }
+    this.configs.set(config.name, config);
+  }
+
+  get(name: string): AgentConfig | undefined {
+    return this.configs.get(name);
+  }
+
+  list(): AgentConfig[] {
+    return Array.from(this.configs.values());
+  }
+
+  /**
+   * Load agent configs from the parsed YAML config.
+   * Maps from the config schema format to the internal AgentConfig format.
+   */
+  loadFromConfig(rawConfigs: Record<string, {
+    system_prompt?: string;
+    model_tier?: string;
+    tool_profile?: string;
+    tool_overrides?: ToolOverrideConfig;
+    sandbox?: boolean;
+  }>): void {
+    for (const [name, raw] of Object.entries(rawConfigs)) {
+      this.register({
+        name,
+        systemPrompt: raw.system_prompt,
+        modelTier: raw.model_tier as ModelTier | undefined,
+        toolProfile: raw.tool_profile as ToolProfile | undefined,
+        toolOverrides: raw.tool_overrides,
+        sandbox: raw.sandbox,
+      });
+    }
+  }
+}
+```
+
+**Step 4: Run test to verify it passes**
+
+Run: `pnpm vitest run src/agents/registry.test.ts`
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add src/agents/registry.ts src/agents/registry.test.ts
+git commit -m "feat: add AgentConfigRegistry for named agent configurations"
+```
+
+---
+
+## Task 7: Agent Router
+
+**Files:**
+- Create: `src/agents/router.ts`
+- Create: `src/agents/router.test.ts`
+
+**Step 1: Write the failing test**
+
+Create file: `src/agents/router.test.ts`
+
+```typescript
+import { describe, it, expect } from 'vitest';
+import { AgentRouter, type RoutingConfig } from './router.js';
+
+describe('AgentRouter', () => {
+  describe('resolve()', () => {
+    it('returns default_agent when no specific match', () => {
+      const router = new AgentRouter({
+        default_agent: 'assistant',
+        channels: {},
+        senders: {},
+      });
+      expect(router.resolve('telegram', '12345')).toBe('assistant');
+    });
+
+    it('returns undefined when no default and no match', () => {
+      const router = new AgentRouter({
+        channels: {},
+        senders: {},
+      });
+      expect(router.resolve('telegram', '12345')).toBeUndefined();
+    });
+
+    it('matches exact sender', () => {
+      const router = new AgentRouter({
+        default_agent: 'assistant',
+        channels: {},
+        senders: { 'telegram:12345': 'coder' },
+      });
+      expect(router.resolve('telegram', '12345')).toBe('coder');
+    });
+
+    it('matches sender with glob pattern', () => {
+      const router = new AgentRouter({
+        default_agent: 'assistant',
+        channels: {},
+        senders: { 'slack:U0*': 'coder' },
+      });
+      expect(router.resolve('slack', 'U0ABC')).toBe('coder');
+      expect(router.resolve('slack', 'U1ABC')).toBe('assistant'); // no sender or channel match, falls back to default_agent
+    });
+
+    it('matches channel when no sender match', () => {
+      const router = new AgentRouter({
+        default_agent: 'assistant',
+        channels: { discord: 'coder' },
+        senders: {},
+      });
+      expect(router.resolve('discord', 'any-user')).toBe('coder');
+    });
+
+    it('sender match takes priority over channel match', () => {
+      const router = new AgentRouter({
+        default_agent: 'assistant',
+        channels: { discord: 'coder' },
+        senders: { 'discord:special-user': 'vip' },
+      });
+      expect(router.resolve('discord', 'special-user')).toBe('vip');
+      expect(router.resolve('discord', 'normal-user')).toBe('coder');
+    });
+
+    it('falls through: sender → channel → default', () => {
+      const router = new AgentRouter({
+        default_agent: 'fallback',
+        channels: { discord: 'guild-agent' },
+        senders: { 'discord:admin': 'admin-agent' },
+      });
+      expect(router.resolve('discord', 'admin')).toBe('admin-agent');
+      expect(router.resolve('discord', 'regular')).toBe('guild-agent');
+      expect(router.resolve('telegram', 'someone')).toBe('fallback');
+    });
+  });
+});
+```
+
+**Step 2: Run test to verify it fails**
+
+Run: `pnpm vitest run src/agents/router.test.ts`
+Expected: FAIL — cannot find module `./router.js`
+
+**Step 3: Implement AgentRouter**
+
+Create file: `src/agents/router.ts`
+
+```typescript
+/**
+ * AgentRouter resolves which agent config to use for a given channel+sender.
+ *
+ * Resolution order:
+ * 1. Exact sender match (channel:senderId)
+ * 2. Glob pattern sender match
+ * 3. Channel match
+ * 4. default_agent fallback
+ */
+
+export interface RoutingConfig {
+  default_agent?: string;
+  channels: Record<string, string>;
+  senders: Record<string, string>;
+}
+
+/**
+ * Convert a simple glob pattern to regex.
+ * Supports `*` (any chars); all other regex metacharacters are escaped.
+ */
+function patternToRegex(pattern: string): RegExp {
+  const escaped = pattern
+    .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
+    .replace(/\*/g, '.*');
+  return new RegExp(`^${escaped}$`);
+}
+
+export class AgentRouter {
+  private config: RoutingConfig;
+
+  constructor(config: RoutingConfig) {
+    this.config = config;
+  }
+
+  /**
+   * Resolve the agent config name for a channel + sender pair.
+   * Returns undefined if no match and no default.
+   */
+  resolve(channel: string, senderId: string): string | undefined {
+    const senderKey = `${channel}:${senderId}`;
+
+    // 1. Exact sender match
+    if (this.config.senders[senderKey]) {
+      return this.config.senders[senderKey];
+    }
+
+    // 2. Glob pattern sender match
+    for (const [pattern, agentName] of Object.entries(this.config.senders)) {
+      if (pattern.includes('*') && patternToRegex(pattern).test(senderKey)) {
+        return agentName;
+      }
+    }
+
+    // 3. Channel match
+    if (this.config.channels[channel]) {
+      return this.config.channels[channel];
+    }
+
+    // 4. Default fallback
+    return this.config.default_agent;
+  }
+}
+```
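+
+One property of the glob step worth keeping in mind: `resolve()` walks `Object.entries(this.config.senders)` and returns on the first pattern that matches, so when two patterns overlap, the order of the keys in the routing config decides the result. The sketch below isolates that loop; `firstGlobMatch` is a hypothetical stand-in, not a function in `src/agents/router.ts`.
+
+```typescript
+// Stand-in for the router's glob matching: escapes regex metacharacters and maps * to .*
+function patternToRegex(pattern: string): RegExp {
+  const escaped = pattern
+    .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
+    .replace(/\*/g, '.*');
+  return new RegExp(`^${escaped}$`);
+}
+
+// Hypothetical helper: returns the agent of the first glob pattern that matches,
+// iterating the sender map in insertion order.
+function firstGlobMatch(senders: Record<string, string>, senderKey: string): string | undefined {
+  for (const [pattern, agentName] of Object.entries(senders)) {
+    if (pattern.includes('*') && patternToRegex(pattern).test(senderKey)) {
+      return agentName;
+    }
+  }
+  return undefined;
+}
+
+// The same two patterns, listed in different orders.
+const specificFirst = { 'slack:U0*': 'coder', 'slack:*': 'assistant' };
+const broadFirst = { 'slack:*': 'assistant', 'slack:U0*': 'coder' };
+
+const specificResult = firstGlobMatch(specificFirst, 'slack:U0ABC');
+const broadResult = firstGlobMatch(broadFirst, 'slack:U0ABC');
+console.log(specificResult, broadResult);
+```
+
+Listing narrower patterns before broader ones in the `senders` map keeps the routing predictable.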
+
+**Step 4: Run test to verify it passes**
+
+Run: `pnpm vitest run src/agents/router.test.ts`
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add src/agents/router.ts src/agents/router.test.ts
+git commit -m "feat: add AgentRouter for config-based sender/channel routing"
+```
+
+---
+
+## Task 8: Agents Barrel Export
+
+**Files:**
+- Create: `src/agents/index.ts`
+
+**Step 1: Create the barrel file**
+
+Create file: `src/agents/index.ts`
+
+```typescript
+export { AgentConfigRegistry, type AgentConfig } from './registry.js';
+export { AgentRouter, type RoutingConfig } from './router.js';
+```
+
+**Step 2: Verify build**
+
+Run: `pnpm typecheck`
+Expected: No errors
+
+**Step 3: Commit**
+
+```bash
+git add src/agents/index.ts
+git commit -m "feat: add agents barrel export"
+```
+
+---
+
+## Task 9: Wire Everything Into the Daemon
+
+**Files:**
+- Modify: `src/daemon/index.ts`
+- Create: `src/daemon/routing.test.ts`
+
+This is the integration task. The daemon's `createMessageRouter()` needs to use the `AgentRouter` and `SandboxManager`.
+
+**Step 1: Write the integration test**
+
+Create file: `src/daemon/routing.test.ts`
+
+```typescript
+import { describe, it, expect, vi } from 'vitest';
+import { AgentRouter } from '../agents/router.js';
+import { AgentConfigRegistry } from '../agents/registry.js';
+
+describe('daemon agent routing integration', () => {
+  it('resolves agent config for channel messages', () => {
+    const registry = new AgentConfigRegistry();
+    registry.loadFromConfig({
+      assistant: { system_prompt: 'Be helpful.', model_tier: 'default', tool_profile: 'messaging', sandbox: false },
+      coder: { system_prompt: 'Write code.', model_tier: 'complex', tool_profile: 'coding', sandbox: true },
+    });
+
+    const router = new AgentRouter({
+      default_agent: 'assistant',
+      channels: { discord: 'coder' },
+      senders: { 'telegram:admin': 'coder' },
+    });
+
+    // Discord user gets coder
+    const discordAgent = router.resolve('discord', 'user123');
+    expect(discordAgent).toBe('coder');
+    expect(registry.get(discordAgent!)!.systemPrompt).toBe('Write code.');
+
+    // Telegram admin gets coder
+    const telegramAdmin = router.resolve('telegram', 'admin');
+    expect(telegramAdmin).toBe('coder');
+
+    // Random telegram user gets assistant
+    const telegramUser = router.resolve('telegram', 'random');
+    expect(telegramUser).toBe('assistant');
+    expect(registry.get(telegramUser!)!.systemPrompt).toBe('Be helpful.');
+  });
+
+  it('returns undefined when no routing is configured', () => {
+    const router = new AgentRouter({ channels: {}, senders: {} });
+    expect(router.resolve('telegram', '123')).toBeUndefined();
+  });
+});
+```
+
+**Step 2: Run test to verify it passes**
+
+Run: `pnpm vitest run src/daemon/routing.test.ts`
+Expected: PASS (the test only exercises `AgentConfigRegistry` and `AgentRouter`, which were built in Tasks 6 and 7)
+
+**Step 3: Modify daemon/index.ts**
+
+Add imports at the top of `src/daemon/index.ts` (after existing imports):
+
+```typescript
+import { AgentConfigRegistry, AgentRouter } from '../agents/index.js';
+import { SandboxManager, createSandboxedShellTool, createSandboxedProcessStartTool } from '../sandbox/index.js';
+```
+
+Add to `DaemonContext` interface:
+
+```typescript
+  agentConfigRegistry: AgentConfigRegistry;
+  agentRouter: AgentRouter;
+  sandboxManager?: SandboxManager;
+```
+
+Modify `createMessageRouter()` to accept additional dependencies:
+
+```typescript
+function createMessageRouter(deps: {
+  sessionManager: SessionManager;
+  modelRouter: ModelRouter;
+  systemPrompt: string;
+  toolRegistry: ToolRegistry;
+  toolExecutor: ToolExecutor;
+  config: Config;
+  memoryStore?: MemoryStore;
+  agentConfigRegistry?: AgentConfigRegistry;
+  agentRouter?: AgentRouter;
+  sandboxManager?: SandboxManager;
+}) {
+```
+
+Inside `getOrCreateAgent()`, resolve the agent config and create sandboxed registries:
+
+```typescript
+  function getOrCreateAgent(channel: string, senderId: string): AgentOrchestrator {
+    // Resolve agent config name from routing
+    const agentConfigName = deps.agentRouter?.resolve(channel, senderId);
+    const agentConfig = agentConfigName ? deps.agentConfigRegistry?.get(agentConfigName) : undefined;
+
+    const cacheKey = agentConfigName
+      ? `${channel}:${senderId}:${agentConfigName}`
+      : `${channel}:${senderId}`;
+
+    let agent = agents.get(cacheKey);
+    if (!agent) {
+      const session = deps.sessionManager.getSession(channel, senderId);
+
+      // Determine system prompt — agent config overrides global
+      const systemPrompt = agentConfig?.systemPrompt ?? deps.systemPrompt;
+
+      // Determine primary tier
+      const primaryTier = agentConfig?.modelTier ?? deps.config.agents.primary_tier ?? 'default';
+
+      // Determine tool policy context
+      const toolPolicyContext: ToolPolicyContext = {
+        agent: primaryTier,
+        provider: deps.config.models.default.provider,
+      };
+
+      // Determine tool registry — sandbox if configured
+      let toolRegistry = deps.toolRegistry;
+      if (agentConfig?.sandbox && deps.sandboxManager && deps.config.sandbox.enabled) {
+        // Create a cloned registry with sandboxed tools
+        toolRegistry = deps.toolRegistry.clone();
+        // The container is created lazily on the first tool call, so each
+        // replacement tool resolves its sandbox inside execute().
+        const sessionId = `${channel}:${senderId}`;
+        const sandbox = deps.sandboxManager;
+
+        // Replace shell.exec and process.start with lazy-sandboxed versions
+        const lazySandboxedShell: Tool = {
+          name: 'shell.exec',
+          description: 'Execute a shell command inside a sandboxed container.',
+          inputSchema: {
+            type: 'object',
+            properties: {
+              command: { type: 'string', description: 'The shell command to execute' },
+              cwd: { type: 'string', description: 'Working directory (optional)' },
+              timeout: { type: 'number', description: 'Timeout in milliseconds (default 30000)' },
+            },
+            required: ['command'],
+          },
+          execute: async (rawArgs: unknown) => {
+            const dockerSandbox = await sandbox.getOrCreate(sessionId);
+            const tool = createSandboxedShellTool(dockerSandbox);
+            return tool.execute(rawArgs);
+          },
+        };
+
+        const lazySandboxedProcessStart: Tool = {
+          name: 'process.start',
+          description: 'Start a command in the background inside a sandboxed container.',
+          inputSchema: {
+            type: 'object',
+            properties: {
+              command: { type: 'string', description: 'The shell command to run' },
+              cwd: { type: 'string', description: 'Working directory (optional)' },
+            },
+            required: ['command'],
+          },
+          execute: async (rawArgs: unknown) => {
+            const dockerSandbox = await sandbox.getOrCreate(sessionId);
+            const tool = createSandboxedProcessStartTool(dockerSandbox);
+            return tool.execute(rawArgs);
+          },
+        };
+
+        toolRegistry.replace(lazySandboxedShell);
+        toolRegistry.replace(lazySandboxedProcessStart);
+      }
+
+      const delegationConfig: DelegationConfig = {
+        compaction: deps.config.agents.delegation.compaction ?? 'fast',
+        memory_extraction: deps.config.agents.delegation.memory_extraction ?? 'fast',
+        classification: deps.config.agents.delegation.classification ?? 'fast',
+        tool_summarisation: deps.config.agents.delegation.tool_summarisation ?? 'fast',
+        complex_reasoning: deps.config.agents.delegation.complex_reasoning ?? 'complex',
+      };
+
+      agent = new AgentOrchestrator({
+        modelRouter: deps.modelRouter,
+        systemPrompt,
+        session,
+        toolRegistry,
+        toolExecutor: deps.toolExecutor,
+        primaryTier,
+        delegation: delegationConfig,
+        maxDelegationDepth: deps.config.agents.max_delegation_depth ?? 3,
+        compaction: deps.config.compaction.enabled ? {
+          thresholdPct: deps.config.compaction.threshold_pct,
+          keepTurns: deps.config.compaction.keep_turns,
+          summaryMaxTokens: deps.config.compaction.summary_max_tokens,
+        } : undefined,
+        modelName: deps.config.models.default.model,
+        contextWindow: deps.config.models.default.context_window,
+        memoryStore: deps.memoryStore,
+        toolPolicyContext,
+      });
+      agents.set(cacheKey, agent);
+    }
+    return agent;
+  }
+```
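+
+The two lazy wrappers in `getOrCreateAgent()` share one shape: acquire the sandbox inside `execute()`, build the real tool, delegate. If the duplication becomes a maintenance burden, it could be factored into a helper. The sketch below is one possible way to do that, using stand-in types and a fake sandbox; `makeLazyTool`, `SketchTool`, and `SketchResult` are hypothetical names, and the stand-in omits `inputSchema` for brevity.
+
+```typescript
+// Stand-in shapes for illustration; the real Tool and ToolResult live in src/tools/types.ts.
+interface SketchResult {
+  success: boolean;
+  output: string;
+}
+
+interface SketchTool {
+  name: string;
+  description: string;
+  execute: (rawArgs: unknown) => Promise<SketchResult>;
+}
+
+// Hypothetical helper: builds a tool whose backing resource is acquired
+// only when execute() is called, then delegates to the tool built from it.
+function makeLazyTool<R>(
+  name: string,
+  description: string,
+  acquire: () => Promise<R>,
+  build: (resource: R) => SketchTool,
+): SketchTool {
+  return {
+    name,
+    description,
+    execute: async (rawArgs: unknown) => {
+      const resource = await acquire();
+      return build(resource).execute(rawArgs);
+    },
+  };
+}
+
+// Fake sandbox acquisition that counts how often it is called.
+let acquireCount = 0;
+const acquireFakeSandbox = async (): Promise<string> => {
+  acquireCount += 1;
+  return 'fake-sandbox';
+};
+
+// Nothing is acquired here; acquisition happens on the first execute().
+const lazyShell = makeLazyTool(
+  'shell.exec',
+  'Execute a shell command inside a sandboxed container.',
+  acquireFakeSandbox,
+  (sandboxId) => ({
+    name: 'shell.exec',
+    description: 'inner tool',
+    execute: async () => ({ success: true, output: `ran in ${sandboxId}` }),
+  }),
+);
+```
+
+In the daemon, `acquire` would correspond to `sandbox.getOrCreate(sessionId)` and `build` to `createSandboxedShellTool` or `createSandboxedProcessStartTool`.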
+
+In `startDaemon()`, add agent config registry and router initialization after skills loading (around line 385):
+
+```typescript
+  // Initialize agent config registry and router
+  const agentConfigRegistry = new AgentConfigRegistry();
+  if (config.agent_configs && Object.keys(config.agent_configs).length > 0) {
+    agentConfigRegistry.loadFromConfig(config.agent_configs);
+    console.log(`Loaded ${Object.keys(config.agent_configs).length} agent config(s): ${Object.keys(config.agent_configs).join(', ')}`);
+  }
+
+  const agentRouter = new AgentRouter(config.routing);
+
+  // Initialize sandbox manager if enabled
+  let sandboxManager: SandboxManager | undefined;
+  if (config.sandbox.enabled) {
+    const dockerAvailable = await DockerSandbox.isAvailable();
+    if (dockerAvailable) {
+      sandboxManager = new SandboxManager(config.sandbox);
+      console.log(`Docker sandbox enabled: image=${config.sandbox.image}, network=${config.sandbox.network}`);
+    } else {
+      console.warn('Docker sandbox enabled in config but Docker is not available — falling back to host execution');
+    }
+  }
+```
+
+Add sandbox shutdown hook:
+
+```typescript
+  if (sandboxManager) {
+    lifecycle.onShutdown(async () => {
+      await sandboxManager!.destroyAll();
+      console.log('Docker sandboxes destroyed');
+    });
+  }
+```
+
+Pass new deps to `createMessageRouter()`:
+
+```typescript
+  channelRegistry.setMessageHandler(createMessageRouter({
+    sessionManager,
+    modelRouter,
+    systemPrompt,
+    toolRegistry,
+    toolExecutor,
+    config,
+    memoryStore,
+    agentConfigRegistry,
+    agentRouter,
+    sandboxManager,
+  }));
+```
+
+Add to DaemonContext return:
+
+```typescript
+  return {
+    config,
+    lifecycle,
+    sessionStore,
+    sessionManager,
+    hookEngine,
+    modelRouter,
+    toolRegistry,
+    toolExecutor,
+    gateway,
+    channelRegistry,
+    mcpManager,
+    skillRegistry,
+    skillInstaller,
+    agentConfigRegistry,
+    agentRouter,
+    sandboxManager,
+  };
+```
+
+Note: the snippets above also need `DockerSandbox`, the `Tool` type, and `ToolPolicyContext` imported at the top of `src/daemon/index.ts`:
+
+```typescript
+import { DockerSandbox } from '../sandbox/index.js';
+import type { Tool } from '../tools/types.js';
+import type { ToolPolicyContext } from '../tools/policy.js';
+```
+
+**Step 4: Run full test suite**
+
+Run: `pnpm test:run`
+Expected: All tests pass
+
+**Step 5: Run typecheck**
+
+Run: `pnpm typecheck`
+Expected: No errors
+
+**Step 6: Commit**
+
+```bash
+git add src/daemon/index.ts src/daemon/routing.test.ts
+git commit -m "feat: wire Docker sandboxing and agent routing into daemon"
+```
+
+---
+
+## Task 10: Update state.json + Final Verification
+
+**Files:**
+- Modify: `docs/plans/state.json`
+
+**Step 1: Run full test suite and typecheck**
+
+Run: `pnpm test:run && pnpm typecheck`
+Expected: All tests pass, no type errors
+
+**Step 2: Update state.json**
+
+Add the new P2 entries to `docs/plans/state.json` under the `p2-implementation` plan's `phases` object:
+
+```json
+"docker_sandboxing": {
+  "priority": "P2",
+  "status": "completed",
+  "description": "Docker container sandboxing for channel tool execution (shell.exec, process.start)",
+  "files_created": [
+    "src/sandbox/docker.ts",
+    "src/sandbox/docker.test.ts",
+    "src/sandbox/manager.ts",
+    "src/sandbox/manager.test.ts",
+    "src/sandbox/tools.ts",
+    "src/sandbox/tools.test.ts",
+    "src/sandbox/index.ts"
+  ],
+  "files_modified": [
+    "src/config/schema.ts",
+    "src/config/index.ts",
+    "src/tools/registry.ts",
+    "src/daemon/index.ts"
+  ],
+  "test_status": "N/N passing"
+},
+"multi_agent_routing": {
+  "priority": "P2",
+  "status": "completed",
+  "description": "Named agent configs with config-based channel/sender routing",
+  "files_created": [
+    "src/agents/registry.ts",
+    "src/agents/registry.test.ts",
+    "src/agents/router.ts",
+    "src/agents/router.test.ts",
+    "src/agents/index.ts",
+    "src/daemon/routing.test.ts",
+    "src/config/schema.test.ts"
+  ],
+  "files_modified": [
+    "src/config/schema.ts",
+    "src/config/index.ts",
+    "src/daemon/index.ts"
+  ],
+  "test_status": "N/N passing"
+}
+```
+
+Update `overall_progress.p2_completion` to `"7/7 (100%)"` and `next_up` to `"p3 (group chat, gateway auth, gemini provider, browser control, additional providers)"`.
+
+Update `overall_progress.total_test_count` with the actual count.
+
+**Step 3: Commit**
+
+```bash
+git add docs/plans/state.json
+git commit -m "docs: update state.json with Docker sandbox and multi-agent routing"
+```
+
+---
+
+## Summary
+
+| Task | Component | Est. Time |
+|------|-----------|-----------|
+| 1 | Config schemas (sandbox + agent_configs + routing) | 5 min |
+| 2 | DockerSandbox class | 5 min |
+| 3 | SandboxManager | 3 min |
+| 4 | Sandboxed tool wrappers | 5 min |
+| 5 | Barrel export + ToolRegistry.clone() | 3 min |
+| 6 | AgentConfigRegistry | 3 min |
+| 7 | AgentRouter | 3 min |
+| 8 | Agents barrel export | 1 min |
+| 9 | Daemon integration | 10 min |
+| 10 | State update + verification | 3 min |
+
+**Total estimated: ~40 minutes**