P2: Docker Sandboxing + Multi-Agent Routing — Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Add Docker container sandboxing for channel tool execution and named agent configuration with config-based routing.
Architecture: Tool-level wrapping. The sandboxed shell.exec and process.start tools delegate to docker exec inside per-session containers. An agent config registry stores named agent definitions (system prompt, model tier, tool profile, sandbox flag), and config-based routing maps channels and senders to those definitions.
Tech Stack: TypeScript (ES2022, NodeNext), Zod schemas, Vitest tests, Docker CLI (no SDK dependency), child_process.execFile.
Task 1: Config Schema — Sandbox + Agent Configs + Routing
Files:
- Modify: src/config/schema.ts:164-231
- Modify: src/config/index.ts:1-3
Step 1: Write the failing test
Create file: src/config/schema.test.ts
import { describe, it, expect } from 'vitest';
import { configSchema } from './schema.js';
describe('configSchema — sandbox', () => {
const minimalConfig = {
telegram: { bot_token: 'test', allowed_chat_ids: [1] },
models: { default: { provider: 'anthropic', model: 'claude-3' } },
};
it('defaults sandbox to disabled', () => {
const result = configSchema.parse(minimalConfig);
expect(result.sandbox.enabled).toBe(false);
expect(result.sandbox.image).toBe('node:22-slim');
expect(result.sandbox.network).toBe('none');
expect(result.sandbox.memory_limit).toBe('512m');
expect(result.sandbox.cpu_limit).toBe('1.0');
expect(result.sandbox.timeout_seconds).toBe(300);
});
it('accepts sandbox config', () => {
const result = configSchema.parse({
...minimalConfig,
sandbox: { enabled: true, image: 'ubuntu:24.04', network: 'bridge' },
});
expect(result.sandbox.enabled).toBe(true);
expect(result.sandbox.image).toBe('ubuntu:24.04');
expect(result.sandbox.network).toBe('bridge');
});
});
describe('configSchema — agent_configs', () => {
const minimalConfig = {
telegram: { bot_token: 'test', allowed_chat_ids: [1] },
models: { default: { provider: 'anthropic', model: 'claude-3' } },
};
it('defaults agent_configs to empty', () => {
const result = configSchema.parse(minimalConfig);
expect(result.agent_configs).toEqual({});
});
it('accepts named agent configs', () => {
const result = configSchema.parse({
...minimalConfig,
agent_configs: {
assistant: {
system_prompt: 'You are helpful.',
model_tier: 'default',
tool_profile: 'messaging',
},
coder: {
model_tier: 'complex',
tool_profile: 'coding',
sandbox: true,
},
},
});
expect(result.agent_configs.assistant.system_prompt).toBe('You are helpful.');
expect(result.agent_configs.assistant.tool_profile).toBe('messaging');
expect(result.agent_configs.coder.sandbox).toBe(true);
});
});
describe('configSchema — routing', () => {
const minimalConfig = {
telegram: { bot_token: 'test', allowed_chat_ids: [1] },
models: { default: { provider: 'anthropic', model: 'claude-3' } },
};
it('defaults routing to empty', () => {
const result = configSchema.parse(minimalConfig);
expect(result.routing.default_agent).toBeUndefined();
expect(result.routing.channels).toEqual({});
expect(result.routing.senders).toEqual({});
});
it('accepts routing config', () => {
const result = configSchema.parse({
...minimalConfig,
routing: {
default_agent: 'assistant',
channels: { discord: 'coder' },
senders: { 'telegram:12345': 'coder' },
},
});
expect(result.routing.default_agent).toBe('assistant');
expect(result.routing.channels.discord).toBe('coder');
expect(result.routing.senders['telegram:12345']).toBe('coder');
});
});
Step 2: Run test to verify it fails
Run: pnpm vitest run src/config/schema.test.ts
Expected: FAIL — sandbox, agent_configs, and routing properties don't exist on config
Step 3: Implement the schema additions
Add to src/config/schema.ts before the configSchema definition (before line 192):
// ── Sandbox schemas ───────────────────────────────────────────────────
const sandboxSchema = z.object({
enabled: z.boolean().default(false),
image: z.string().default('node:22-slim'),
workspace_dir: z.string().default('/workspace'),
network: z.enum(['none', 'bridge', 'host']).default('none'),
memory_limit: z.string().default('512m'),
cpu_limit: z.string().default('1.0'),
timeout_seconds: z.number().min(10).max(3600).default(300),
}).default({});
// ── Agent config + routing schemas ────────────────────────────────────
const modelTierEnum = z.enum(['fast', 'default', 'complex', 'local']);
const agentConfigEntrySchema = z.object({
system_prompt: z.string().optional(),
model_tier: modelTierEnum.optional(),
tool_profile: toolProfileEnum.optional(),
tool_overrides: toolOverrideSchema.optional(),
sandbox: z.boolean().default(false),
});
const agentConfigsSchema = z.record(z.string(), agentConfigEntrySchema).default({});
const routingSchema = z.object({
default_agent: z.string().optional(),
channels: z.record(z.string(), z.string()).default({}),
senders: z.record(z.string(), z.string()).default({}),
}).default({});
Then add these three new fields to the configSchema z.object (around lines 192-212):
sandbox: sandboxSchema,
agent_configs: agentConfigsSchema,
routing: routingSchema,
And add type exports at the end (after line 230):
export type SandboxConfig = z.infer<typeof sandboxSchema>;
export type AgentConfigEntry = z.infer<typeof agentConfigEntrySchema>;
export type RoutingConfig = z.infer<typeof routingSchema>;
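The schema accepts any string as a routing target, so nothing in it ties routing.default_agent, routing.channels, or routing.senders back to the keys of agent_configs. A minimal sketch of a cross-reference check follows, assuming a hand-written stand-in type rather than the Zod-inferred one; findUnknownRoutingTargets and RoutingShape are hypothetical names and not part of this plan.

```typescript
// Stand-in shape mirroring the parsed routing config; the real type comes from z.infer.
interface RoutingShape {
  default_agent?: string;
  channels: Record<string, string>;
  senders: Record<string, string>;
}

// Hypothetical helper: returns every routing target that names no agent config.
function findUnknownRoutingTargets(
  agentNames: string[],
  routing: RoutingShape,
): string[] {
  const known = new Set(agentNames);
  const targets = [
    ...(routing.default_agent !== undefined ? [routing.default_agent] : []),
    ...Object.values(routing.channels),
    ...Object.values(routing.senders),
  ];
  // Deduplicate while keeping first-seen order, then keep only unknown names.
  return Array.from(new Set(targets)).filter((name) => !known.has(name));
}

const unknownTargets = findUnknownRoutingTargets(['assistant', 'coder'], {
  default_agent: 'assistant',
  channels: { discord: 'coder' },
  senders: { 'telegram:12345': 'reviewer' },
});
console.log(unknownTargets); // only 'reviewer' is not a known agent name
```

A check like this could run once at startup, after configSchema.parse, so that a typo in a routing target fails fast instead of silently falling through at message time.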
Step 4: Update src/config/index.ts barrel export
Add the new types to the export line:
export { configSchema, type Config, type TelegramConfig, type ModelConfig, type CronJobConfig, type AgentsConfig, type CompactionConfig, type ToolProfile, type ToolOverrideConfig, type ToolsConfig, type SandboxConfig, type AgentConfigEntry, type RoutingConfig } from './schema.js';
Step 5: Run test to verify it passes
Run: pnpm vitest run src/config/schema.test.ts
Expected: PASS (all 6 tests)
Step 6: Run full test suite
Run: pnpm test:run
Expected: All 606+ tests pass
Step 7: Commit
git add src/config/schema.ts src/config/schema.test.ts src/config/index.ts
git commit -m "feat: add sandbox, agent_configs, and routing config schemas"
Task 2: DockerSandbox Class
Files:
- Create: src/sandbox/docker.ts
- Create: src/sandbox/docker.test.ts
Step 1: Write the failing test
Create file: src/sandbox/docker.test.ts
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { DockerSandbox, type DockerSandboxConfig } from './docker.js';
import * as childProcess from 'child_process';
// Mock child_process.execFile
vi.mock('child_process', () => ({
execFile: vi.fn(),
}));
const mockedExecFile = vi.mocked(childProcess.execFile);
function mockExecFileSuccess(stdout = '', stderr = '') {
mockedExecFile.mockImplementation(
(_cmd: unknown, _args: unknown, _opts: unknown, callback: unknown) => {
(callback as (err: null, stdout: string, stderr: string) => void)(null, stdout, stderr);
return {} as ReturnType<typeof childProcess.execFile>;
},
);
}
function mockExecFileError(message: string) {
mockedExecFile.mockImplementation(
(_cmd: unknown, _args: unknown, _opts: unknown, callback: unknown) => {
(callback as (err: Error) => void)(new Error(message));
return {} as ReturnType<typeof childProcess.execFile>;
},
);
}
describe('DockerSandbox', () => {
const defaultConfig: DockerSandboxConfig = {
sessionId: 'test-session',
image: 'node:22-slim',
workspaceDir: '/workspace',
network: 'none',
memoryLimit: '512m',
cpuLimit: '1.0',
timeoutSeconds: 300,
};
beforeEach(() => {
vi.clearAllMocks();
});
describe('create()', () => {
it('creates a docker container with correct args', async () => {
mockExecFileSuccess('container-abc123');
const sandbox = new DockerSandbox(defaultConfig);
await sandbox.create();
expect(mockedExecFile).toHaveBeenCalledWith(
'docker',
expect.arrayContaining([
'create',
'--name', expect.stringContaining('flynn-test-session'),
'--memory', '512m',
'--cpus', '1.0',
'--network', 'none',
'-v', expect.stringContaining(':/workspace'),
'node:22-slim',
'sleep', 'infinity',
]),
expect.any(Object),
expect.any(Function),
);
expect(sandbox.containerId).toBe('container-abc123');
});
it('starts the container after creating', async () => {
mockExecFileSuccess('container-abc123');
const sandbox = new DockerSandbox(defaultConfig);
await sandbox.create();
// Second call should be docker start
expect(mockedExecFile).toHaveBeenCalledTimes(2);
expect(mockedExecFile).toHaveBeenNthCalledWith(
2, 'docker', ['start', 'container-abc123'],
expect.any(Object), expect.any(Function),
);
});
it('throws if docker create fails', async () => {
mockExecFileError('docker not found');
const sandbox = new DockerSandbox(defaultConfig);
await expect(sandbox.create()).rejects.toThrow('docker not found');
});
});
describe('exec()', () => {
it('runs command inside container', async () => {
const sandbox = new DockerSandbox(defaultConfig);
// Manually set container ID to skip create
(sandbox as unknown as { _containerId: string })._containerId = 'container-abc';
mockExecFileSuccess('hello world\n');
const result = await sandbox.exec('echo hello world');
expect(mockedExecFile).toHaveBeenCalledWith(
'docker',
['exec', 'container-abc', 'bash', '-c', 'echo hello world'],
expect.objectContaining({ timeout: expect.any(Number) }),
expect.any(Function),
);
expect(result).toEqual({ stdout: 'hello world\n', stderr: '' });
});
it('passes cwd as workdir option', async () => {
const sandbox = new DockerSandbox(defaultConfig);
(sandbox as unknown as { _containerId: string })._containerId = 'container-abc';
mockExecFileSuccess('');
await sandbox.exec('ls', { cwd: '/workspace/project' });
expect(mockedExecFile).toHaveBeenCalledWith(
'docker',
['exec', '-w', '/workspace/project', 'container-abc', 'bash', '-c', 'ls'],
expect.any(Object),
expect.any(Function),
);
});
it('throws if no container created', async () => {
const sandbox = new DockerSandbox(defaultConfig);
await expect(sandbox.exec('echo hi')).rejects.toThrow('not created');
});
});
describe('destroy()', () => {
it('force-removes the container', async () => {
const sandbox = new DockerSandbox(defaultConfig);
(sandbox as unknown as { _containerId: string })._containerId = 'container-abc';
mockExecFileSuccess();
await sandbox.destroy();
expect(mockedExecFile).toHaveBeenCalledWith(
'docker', ['rm', '-f', 'container-abc'],
expect.any(Object), expect.any(Function),
);
});
it('does nothing if no container', async () => {
const sandbox = new DockerSandbox(defaultConfig);
await sandbox.destroy(); // should not throw
expect(mockedExecFile).not.toHaveBeenCalled();
});
});
describe('isAvailable()', () => {
it('returns true when docker is installed', async () => {
mockExecFileSuccess('Docker version 27.0.0');
const result = await DockerSandbox.isAvailable();
expect(result).toBe(true);
});
it('returns false when docker is not installed', async () => {
mockExecFileError('command not found');
const result = await DockerSandbox.isAvailable();
expect(result).toBe(false);
});
});
});
Step 2: Run test to verify it fails
Run: pnpm vitest run src/sandbox/docker.test.ts
Expected: FAIL — cannot find module ./docker.js
Step 3: Implement DockerSandbox
Create file: src/sandbox/docker.ts
import { execFile } from 'child_process';
export interface DockerSandboxConfig {
sessionId: string;
image: string;
workspaceDir: string;
network: 'none' | 'bridge' | 'host';
memoryLimit: string;
cpuLimit: string;
timeoutSeconds: number;
}
export interface ExecOptions {
cwd?: string;
timeout?: number;
}
export interface ExecResult {
stdout: string;
stderr: string;
}
/**
* Manages a single Docker container for sandboxed tool execution.
* Uses the Docker CLI directly (no SDK dependency).
*/
export class DockerSandbox {
private config: DockerSandboxConfig;
private _containerId: string | null = null;
private _hostWorkdir: string;
constructor(config: DockerSandboxConfig) {
this.config = config;
// Use a temp directory on the host, named by session
const sanitizedId = config.sessionId.replace(/[^a-zA-Z0-9_-]/g, '_');
this._hostWorkdir = `/tmp/flynn-sandbox-${sanitizedId}`;
}
get containerId(): string | null {
return this._containerId;
}
get containerName(): string {
const sanitizedId = this.config.sessionId.replace(/[^a-zA-Z0-9_-]/g, '_');
return `flynn-${sanitizedId}`;
}
/** Create and start the sandbox container. */
async create(): Promise<void> {
const args = [
'create',
'--name', this.containerName,
'--memory', this.config.memoryLimit,
'--cpus', this.config.cpuLimit,
'--network', this.config.network,
'-v', `${this._hostWorkdir}:${this.config.workspaceDir}`,
this.config.image,
'sleep', 'infinity',
];
const createResult = await this.dockerCmd(args);
this._containerId = createResult.stdout.trim();
await this.dockerCmd(['start', this._containerId]);
}
/** Execute a command inside the container. */
async exec(command: string, opts?: ExecOptions): Promise<ExecResult> {
if (!this._containerId) {
throw new Error('Sandbox container not created. Call create() first.');
}
const args = ['exec'];
if (opts?.cwd) {
args.push('-w', opts.cwd);
}
args.push(this._containerId, 'bash', '-c', command);
const timeout = opts?.timeout ?? this.config.timeoutSeconds * 1000;
return this.dockerCmd(args, timeout);
}
/** Force-remove the container. */
async destroy(): Promise<void> {
if (!this._containerId) return;
try {
await this.dockerCmd(['rm', '-f', this._containerId]);
} catch {
// Ignore errors during cleanup
}
this._containerId = null;
}
/** Check if Docker is available on this host. */
static async isAvailable(): Promise<boolean> {
try {
await new Promise<string>((resolve, reject) => {
execFile('docker', ['version', '--format', '{{.Server.Version}}'], {
timeout: 5000,
}, (error, stdout) => {
if (error) reject(error);
else resolve(stdout);
});
});
return true;
} catch {
return false;
}
}
/** Run a docker CLI command. */
private dockerCmd(args: string[], timeout = 30_000): Promise<ExecResult> {
return new Promise((resolve, reject) => {
execFile('docker', args, { timeout, maxBuffer: 1024 * 1024 }, (error, stdout, stderr) => {
if (error) {
reject(error);
return;
}
resolve({ stdout, stderr });
});
});
}
}
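exec() places -w before the container ID because the docker exec usage line is `docker exec [OPTIONS] CONTAINER COMMAND [ARG...]`: options come first, and everything after the container is the command to run. The argument construction can be sketched as a pure function, which also makes it checkable without mocking child_process. buildExecArgs is a hypothetical name introduced here for illustration only.

```typescript
// Hypothetical pure helper mirroring the argv that DockerSandbox.exec() assembles.
function buildExecArgs(containerId: string, command: string, cwd?: string): string[] {
  const args = ['exec'];
  if (cwd) {
    // Options such as -w must come before the container ID.
    args.push('-w', cwd);
  }
  // The command travels as one argv entry to bash -c, so no host shell
  // interprets it; only bash inside the container does.
  args.push(containerId, 'bash', '-c', command);
  return args;
}

const withCwd = buildExecArgs('container-abc', 'ls', '/workspace/project');
const withoutCwd = buildExecArgs('container-abc', 'echo hello world');
console.log(withCwd.join(' '));
console.log(withoutCwd.join(' '));
```

Because execFile takes an argument array rather than a command string, the only shell parsing happens inside the container.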
Step 4: Run test to verify it passes
Run: pnpm vitest run src/sandbox/docker.test.ts
Expected: PASS (all tests)
Step 5: Commit
git add src/sandbox/docker.ts src/sandbox/docker.test.ts
git commit -m "feat: add DockerSandbox class for container lifecycle"
Task 3: SandboxManager
Files:
- Create: src/sandbox/manager.ts
- Create: src/sandbox/manager.test.ts
Step 1: Write the failing test
Create file: src/sandbox/manager.test.ts
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { SandboxManager } from './manager.js';
import { DockerSandbox } from './docker.js';
import type { SandboxConfig } from '../config/schema.js';
// Mock DockerSandbox
vi.mock('./docker.js', () => ({
DockerSandbox: vi.fn().mockImplementation(() => ({
create: vi.fn().mockResolvedValue(undefined),
destroy: vi.fn().mockResolvedValue(undefined),
exec: vi.fn().mockResolvedValue({ stdout: '', stderr: '' }),
containerId: 'mock-container',
})),
}));
describe('SandboxManager', () => {
const defaultConfig: SandboxConfig = {
enabled: true,
image: 'node:22-slim',
workspace_dir: '/workspace',
network: 'none',
memory_limit: '512m',
cpu_limit: '1.0',
timeout_seconds: 300,
};
beforeEach(() => {
vi.clearAllMocks();
});
describe('getOrCreate()', () => {
it('creates a new sandbox for unknown session', async () => {
const manager = new SandboxManager(defaultConfig);
const sandbox = await manager.getOrCreate('session-1');
expect(DockerSandbox).toHaveBeenCalledWith(expect.objectContaining({
sessionId: 'session-1',
image: 'node:22-slim',
}));
expect(sandbox.create).toHaveBeenCalled();
});
it('reuses existing sandbox for same session', async () => {
const manager = new SandboxManager(defaultConfig);
const first = await manager.getOrCreate('session-1');
const second = await manager.getOrCreate('session-1');
expect(first).toBe(second);
expect(DockerSandbox).toHaveBeenCalledTimes(1);
});
it('creates separate sandboxes for different sessions', async () => {
const manager = new SandboxManager(defaultConfig);
await manager.getOrCreate('session-1');
await manager.getOrCreate('session-2');
expect(DockerSandbox).toHaveBeenCalledTimes(2);
});
});
describe('destroy()', () => {
it('destroys sandbox and removes from cache', async () => {
const manager = new SandboxManager(defaultConfig);
const sandbox = await manager.getOrCreate('session-1');
await manager.destroy('session-1');
expect(sandbox.destroy).toHaveBeenCalled();
// Should create a new one now
await manager.getOrCreate('session-1');
expect(DockerSandbox).toHaveBeenCalledTimes(2);
});
it('does nothing for unknown session', async () => {
const manager = new SandboxManager(defaultConfig);
await manager.destroy('nonexistent'); // should not throw
});
});
describe('destroyAll()', () => {
it('destroys all sandboxes', async () => {
const manager = new SandboxManager(defaultConfig);
const s1 = await manager.getOrCreate('session-1');
const s2 = await manager.getOrCreate('session-2');
await manager.destroyAll();
expect(s1.destroy).toHaveBeenCalled();
expect(s2.destroy).toHaveBeenCalled();
});
});
});
Step 2: Run test to verify it fails
Run: pnpm vitest run src/sandbox/manager.test.ts
Expected: FAIL — cannot find module ./manager.js
Step 3: Implement SandboxManager
Create file: src/sandbox/manager.ts
import { DockerSandbox } from './docker.js';
import type { SandboxConfig } from '../config/schema.js';
/**
* Manages per-session Docker sandboxes.
* Creates containers lazily on first access, destroys on session cleanup.
*/
export class SandboxManager {
private sandboxes = new Map<string, DockerSandbox>();
private config: SandboxConfig;
constructor(config: SandboxConfig) {
this.config = config;
}
/** Get or create a sandbox for a session. */
async getOrCreate(sessionId: string): Promise<DockerSandbox> {
let sandbox = this.sandboxes.get(sessionId);
if (sandbox) return sandbox;
sandbox = new DockerSandbox({
sessionId,
image: this.config.image,
workspaceDir: this.config.workspace_dir,
network: this.config.network,
memoryLimit: this.config.memory_limit,
cpuLimit: this.config.cpu_limit,
timeoutSeconds: this.config.timeout_seconds,
});
await sandbox.create();
this.sandboxes.set(sessionId, sandbox);
return sandbox;
}
/** Destroy a specific session's sandbox. */
async destroy(sessionId: string): Promise<void> {
const sandbox = this.sandboxes.get(sessionId);
if (!sandbox) return;
await sandbox.destroy();
this.sandboxes.delete(sessionId);
}
/** Destroy all sandboxes (daemon shutdown). */
async destroyAll(): Promise<void> {
const entries = Array.from(this.sandboxes.entries());
await Promise.allSettled(
entries.map(async ([id, sandbox]) => {
await sandbox.destroy();
this.sandboxes.delete(id);
}),
);
}
}
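getOrCreate() stores the sandbox only after create() resolves, so two overlapping calls for the same session could each start a container. One possible hardening, sketched here as a generic helper rather than a change to this plan, is to cache the in-flight promise instead of the finished value. PendingCache and its factory parameter are hypothetical names.

```typescript
// Hypothetical generic cache: it stores the promise, so concurrent callers share one creation.
class PendingCache<T> {
  private entries = new Map<string, Promise<T>>();
  private factory: (key: string) => Promise<T>;

  constructor(factory: (key: string) => Promise<T>) {
    this.factory = factory;
  }

  getOrCreate(key: string): Promise<T> {
    let pending = this.entries.get(key);
    if (!pending) {
      pending = this.factory(key);
      // Drop failed creations so that a later call can retry.
      pending.catch(() => {
        if (this.entries.get(key) === pending) this.entries.delete(key);
      });
      this.entries.set(key, pending);
    }
    return pending;
  }
}

let created = 0;
const cache = new PendingCache(async (sessionId: string) => {
  created += 1;
  return `sandbox-for-${sessionId}`;
});

// Both calls happen before the first creation settles.
const first = cache.getOrCreate('session-1');
const second = cache.getOrCreate('session-1');
console.log(first === second, created); // true 1
```

The trade-off is that callers receive a promise that may still reject, so error handling moves to each call site.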
Step 4: Run test to verify it passes
Run: pnpm vitest run src/sandbox/manager.test.ts
Expected: PASS
Step 5: Commit
git add src/sandbox/manager.ts src/sandbox/manager.test.ts
git commit -m "feat: add SandboxManager for per-session container lifecycle"
Task 4: Sandboxed Tool Wrappers
Files:
- Create: src/sandbox/tools.ts
- Create: src/sandbox/tools.test.ts
Step 1: Write the failing test
Create file: src/sandbox/tools.test.ts
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { createSandboxedShellTool, createSandboxedProcessStartTool } from './tools.js';
import type { DockerSandbox } from './docker.js';
function mockSandbox(): DockerSandbox {
return {
exec: vi.fn().mockResolvedValue({ stdout: 'output', stderr: '' }),
create: vi.fn(),
destroy: vi.fn(),
containerId: 'test-container',
containerName: 'flynn-test',
config: {},
} as unknown as DockerSandbox;
}
describe('createSandboxedShellTool', () => {
let sandbox: DockerSandbox;
beforeEach(() => {
sandbox = mockSandbox();
});
it('has the same name as shell.exec', () => {
const tool = createSandboxedShellTool(sandbox);
expect(tool.name).toBe('shell.exec');
});
it('delegates to sandbox.exec', async () => {
const tool = createSandboxedShellTool(sandbox);
const result = await tool.execute({ command: 'echo hello' });
expect(sandbox.exec).toHaveBeenCalledWith('echo hello', { cwd: undefined, timeout: 30000 });
expect(result.success).toBe(true);
expect(result.output).toBe('output');
});
it('passes cwd to sandbox.exec', async () => {
const tool = createSandboxedShellTool(sandbox);
await tool.execute({ command: 'ls', cwd: '/workspace/project' });
expect(sandbox.exec).toHaveBeenCalledWith('ls', { cwd: '/workspace/project', timeout: 30000 });
});
it('passes timeout to sandbox.exec', async () => {
const tool = createSandboxedShellTool(sandbox);
await tool.execute({ command: 'sleep 10', timeout: 5000 });
expect(sandbox.exec).toHaveBeenCalledWith('sleep 10', { cwd: undefined, timeout: 5000 });
});
it('returns error on sandbox.exec failure', async () => {
(sandbox.exec as ReturnType<typeof vi.fn>).mockRejectedValue(new Error('container dead'));
const tool = createSandboxedShellTool(sandbox);
const result = await tool.execute({ command: 'fail' });
expect(result.success).toBe(false);
expect(result.error).toBe('container dead');
});
it('includes stderr in output', async () => {
(sandbox.exec as ReturnType<typeof vi.fn>).mockResolvedValue({ stdout: 'out', stderr: 'warn' });
const tool = createSandboxedShellTool(sandbox);
const result = await tool.execute({ command: 'cmd' });
expect(result.output).toContain('out');
expect(result.output).toContain('stderr: warn');
});
});
describe('createSandboxedProcessStartTool', () => {
let sandbox: DockerSandbox;
beforeEach(() => {
sandbox = mockSandbox();
});
it('has the same name as process.start', () => {
const tool = createSandboxedProcessStartTool(sandbox);
expect(tool.name).toBe('process.start');
});
it('runs detached command via sandbox', async () => {
const tool = createSandboxedProcessStartTool(sandbox);
const result = await tool.execute({ command: 'npm run dev' });
expect(sandbox.exec).toHaveBeenCalledWith(
expect.stringContaining('npm run dev'),
expect.any(Object),
);
expect(result.success).toBe(true);
expect(result.output).toContain('Started sandboxed background process');
});
});
Step 2: Run test to verify it fails
Run: pnpm vitest run src/sandbox/tools.test.ts
Expected: FAIL — cannot find module ./tools.js
Step 3: Implement sandboxed tools
Create file: src/sandbox/tools.ts
import type { Tool, ToolResult } from '../tools/types.js';
import type { DockerSandbox } from './docker.js';
interface ShellExecArgs {
command: string;
cwd?: string;
timeout?: number;
}
interface ProcessStartArgs {
command: string;
cwd?: string;
}
/**
* Create a sandboxed version of shell.exec that delegates to docker exec.
* Same Tool interface — drop-in replacement for the host shell.exec.
*/
export function createSandboxedShellTool(sandbox: DockerSandbox): Tool {
return {
name: 'shell.exec',
description: 'Execute a shell command inside a sandboxed container and return stdout/stderr.',
inputSchema: {
type: 'object',
properties: {
command: { type: 'string', description: 'The shell command to execute' },
cwd: { type: 'string', description: 'Working directory inside the container (optional)' },
timeout: { type: 'number', description: 'Timeout in milliseconds (default 30000)' },
},
required: ['command'],
},
execute: async (rawArgs: unknown): Promise<ToolResult> => {
const args = rawArgs as ShellExecArgs;
const timeout = args.timeout ?? 30_000;
try {
const result = await sandbox.exec(args.command, {
cwd: args.cwd,
timeout,
});
const output = result.stdout + (result.stderr ? `\nstderr: ${result.stderr}` : '');
return { success: true, output };
} catch (error) {
return {
success: false,
output: '',
error: error instanceof Error ? error.message : String(error),
};
}
},
};
}
/**
* Create a sandboxed version of process.start that runs in the container.
* Uses `nohup ... &` via docker exec since we can't spawn detached inside containers.
*/
export function createSandboxedProcessStartTool(sandbox: DockerSandbox): Tool {
return {
name: 'process.start',
description: 'Start a command in the background inside a sandboxed container.',
inputSchema: {
type: 'object',
properties: {
command: { type: 'string', description: 'The shell command to run in the background' },
cwd: { type: 'string', description: 'Working directory inside the container (optional)' },
},
required: ['command'],
},
execute: async (rawArgs: unknown): Promise<ToolResult> => {
const args = rawArgs as ProcessStartArgs;
try {
// Run via nohup + background in the container
const wrappedCmd = `nohup bash -c '${args.command.replace(/'/g, "'\\''")}' > /tmp/proc.log 2>&1 & echo $!`;
const result = await sandbox.exec(wrappedCmd, { cwd: args.cwd });
const pid = result.stdout.trim();
return {
success: true,
output: `Started sandboxed background process (PID ${pid})\nCommand: ${args.command}`,
};
} catch (error) {
return {
success: false,
output: '',
error: error instanceof Error ? error.message : 'Failed to start sandboxed process',
};
}
},
};
}
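The wrapped command in createSandboxedProcessStartTool relies on the usual POSIX shell technique for embedding a single quote inside a single-quoted string: close the quote, emit an escaped quote, and reopen. A small sketch of that step in isolation, with quoteForSingleQuotes and wrapInBackground as hypothetical helper names:

```typescript
// Hypothetical helper: makes a string safe to place between single quotes in sh/bash.
function quoteForSingleQuotes(text: string): string {
  // Each ' becomes '\'' (end quote, escaped literal quote, start quote).
  return text.replace(/'/g, "'\\''");
}

// Hypothetical helper mirroring the wrapper string built by process.start.
function wrapInBackground(command: string): string {
  return `nohup bash -c '${quoteForSingleQuotes(command)}' > /tmp/proc.log 2>&1 & echo $!`;
}

console.log(wrapInBackground("echo 'hi'"));
// nohup bash -c 'echo '\''hi'\''' > /tmp/proc.log 2>&1 & echo $!
```

Note that every background process started this way writes to the same /tmp/proc.log inside the container, so logs from successive calls overwrite each other.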
Step 4: Run test to verify it passes
Run: pnpm vitest run src/sandbox/tools.test.ts
Expected: PASS
Step 5: Commit
git add src/sandbox/tools.ts src/sandbox/tools.test.ts
git commit -m "feat: add sandboxed tool wrappers for shell.exec and process.start"
Task 5: Sandbox Barrel Export + ToolRegistry.clone()
Files:
- Create: src/sandbox/index.ts
- Modify: src/tools/registry.ts:19-97
Step 1: Write the failing test for ToolRegistry.clone()
If src/tools/registry.test.ts already exists, extend it with the tests below; otherwise create it:
import { describe, it, expect, vi } from 'vitest';
import { ToolRegistry } from './registry.js';
import type { Tool } from './types.js';
function makeTool(name: string): Tool {
return {
name,
description: `Mock ${name}`,
inputSchema: { type: 'object', properties: {} },
execute: async () => ({ success: true, output: '' }),
};
}
describe('ToolRegistry', () => {
describe('clone()', () => {
it('creates a copy with all tools', () => {
const reg = new ToolRegistry();
reg.register(makeTool('tool.a'));
reg.register(makeTool('tool.b'));
const cloned = reg.clone();
expect(cloned.list().map(t => t.name).sort()).toEqual(['tool.a', 'tool.b']);
});
it('inherits the policy from original', () => {
const reg = new ToolRegistry();
const mockPolicy = { filterTools: vi.fn(), isAllowed: vi.fn(), resolveAllowedNames: vi.fn(), getEffectiveProfile: vi.fn() };
reg.setPolicy(mockPolicy as any);
const cloned = reg.clone();
expect(cloned.getPolicy()).toBe(mockPolicy);
});
it('allows replacing tools in clone without affecting original', () => {
const reg = new ToolRegistry();
const originalTool = makeTool('shell.exec');
reg.register(originalTool);
const cloned = reg.clone();
const replacementTool = makeTool('shell.exec');
replacementTool.description = 'Sandboxed version';
cloned.replace(replacementTool);
expect(cloned.get('shell.exec')!.description).toBe('Sandboxed version');
expect(reg.get('shell.exec')!.description).toBe('Mock shell.exec');
});
});
describe('replace()', () => {
it('replaces an existing tool', () => {
const reg = new ToolRegistry();
reg.register(makeTool('tool.a'));
const replacement = makeTool('tool.a');
replacement.description = 'New description';
reg.replace(replacement);
expect(reg.get('tool.a')!.description).toBe('New description');
});
it('throws if tool does not exist', () => {
const reg = new ToolRegistry();
expect(() => reg.replace(makeTool('nonexistent'))).toThrow('not registered');
});
});
});
Note: The policy test uses vi.fn(), so vi must be part of the vitest import at the top of the file.
Step 2: Run test to verify it fails
Run: pnpm vitest run src/tools/registry.test.ts
Expected: FAIL — clone() and replace() don't exist on ToolRegistry
Step 3: Add clone() and replace() to ToolRegistry
In src/tools/registry.ts, add these two methods to the ToolRegistry class (after the unregister method, around line 32):
/** Replace an existing tool with a new implementation. Throws if not registered. */
replace(tool: Tool): void {
if (!this.tools.has(tool.name)) {
throw new Error(`Tool '${tool.name}' is not registered — cannot replace`);
}
this.tools.set(tool.name, tool);
}
/** Create a shallow clone of this registry (new Map, same Tool objects + policy). */
clone(): ToolRegistry {
const cloned = new ToolRegistry();
for (const tool of this.tools.values()) {
cloned.register(tool);
}
if (this._policy) {
cloned.setPolicy(this._policy);
}
return cloned;
}
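Together, clone() and replace() let a sandboxed session get its own registry in which shell.exec points at the container while every other tool stays shared with the original registry. A self-contained sketch of that flow follows, using stripped-down MiniRegistry and MiniTool stand-ins for the real ToolRegistry and Tool (both names are hypothetical, and policy handling is omitted).

```typescript
// Stand-in for the Tool interface; only the fields this sketch needs.
interface MiniTool {
  name: string;
  description: string;
}

// Stand-in for ToolRegistry with the same clone()/replace() semantics.
class MiniRegistry {
  private tools = new Map<string, MiniTool>();

  register(tool: MiniTool): void {
    this.tools.set(tool.name, tool);
  }

  get(name: string): MiniTool | undefined {
    return this.tools.get(name);
  }

  replace(tool: MiniTool): void {
    if (!this.tools.has(tool.name)) {
      throw new Error(`Tool '${tool.name}' is not registered`);
    }
    this.tools.set(tool.name, tool);
  }

  clone(): MiniRegistry {
    const cloned = new MiniRegistry();
    // New Map, same tool objects: a shallow copy.
    this.tools.forEach((tool) => cloned.register(tool));
    return cloned;
  }
}

const hostRegistry = new MiniRegistry();
hostRegistry.register({ name: 'shell.exec', description: 'host shell' });
hostRegistry.register({ name: 'tool.a', description: 'shared tool' });

// Per-session registry: swap only the tool that must run in the container.
const sessionRegistry = hostRegistry.clone();
sessionRegistry.replace({ name: 'shell.exec', description: 'sandboxed shell' });

console.log(sessionRegistry.get('shell.exec')?.description); // sandboxed shell
console.log(hostRegistry.get('shell.exec')?.description); // host shell
```

Because the clone is shallow, replacing an entry in the session registry leaves the host registry untouched, but mutating a shared tool object in place would affect both.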
Step 4: Create the sandbox barrel export
Create file: src/sandbox/index.ts
export { DockerSandbox, type DockerSandboxConfig, type ExecOptions, type ExecResult } from './docker.js';
export { SandboxManager } from './manager.js';
export { createSandboxedShellTool, createSandboxedProcessStartTool } from './tools.js';
Step 5: Run tests to verify they pass
Run: pnpm vitest run src/tools/registry.test.ts
Expected: PASS
Step 6: Run full test suite
Run: pnpm test:run
Expected: All tests pass
Step 7: Commit
git add src/sandbox/index.ts src/tools/registry.ts src/tools/registry.test.ts
git commit -m "feat: add ToolRegistry.clone() and replace() for per-session registries"
Task 6: Agent Config Registry
Files:
- Create: src/agents/registry.ts
- Create: src/agents/registry.test.ts
Step 1: Write the failing test
Create file: src/agents/registry.test.ts
import { describe, it, expect } from 'vitest';
import { AgentConfigRegistry, type AgentConfig } from './registry.js';
describe('AgentConfigRegistry', () => {
describe('register()', () => {
it('registers a named agent config', () => {
const registry = new AgentConfigRegistry();
const config: AgentConfig = { name: 'assistant', systemPrompt: 'Be helpful.' };
registry.register(config);
expect(registry.get('assistant')).toEqual(config);
});
it('throws on duplicate name', () => {
const registry = new AgentConfigRegistry();
registry.register({ name: 'assistant' });
expect(() => registry.register({ name: 'assistant' })).toThrow('already registered');
});
});
describe('get()', () => {
it('returns undefined for unknown name', () => {
const registry = new AgentConfigRegistry();
expect(registry.get('nonexistent')).toBeUndefined();
});
});
describe('list()', () => {
it('returns all registered configs', () => {
const registry = new AgentConfigRegistry();
registry.register({ name: 'a' });
registry.register({ name: 'b' });
expect(registry.list().map(c => c.name).sort()).toEqual(['a', 'b']);
});
});
describe('loadFromConfig()', () => {
it('loads configs from a raw config object', () => {
const registry = new AgentConfigRegistry();
registry.loadFromConfig({
assistant: {
system_prompt: 'Be helpful.',
model_tier: 'default',
tool_profile: 'messaging',
sandbox: false,
},
coder: {
model_tier: 'complex',
tool_profile: 'coding',
sandbox: true,
},
});
expect(registry.list()).toHaveLength(2);
const assistant = registry.get('assistant')!;
expect(assistant.systemPrompt).toBe('Be helpful.');
expect(assistant.modelTier).toBe('default');
expect(assistant.toolProfile).toBe('messaging');
const coder = registry.get('coder')!;
expect(coder.sandbox).toBe(true);
});
});
});
Step 2: Run test to verify it fails
Run: pnpm vitest run src/agents/registry.test.ts
Expected: FAIL — cannot find module ./registry.js
Step 3: Implement AgentConfigRegistry
Create file: src/agents/registry.ts
import type { ToolProfile, ToolOverrideConfig } from '../config/schema.js';
import type { ModelTier } from '../models/router.js';
export interface AgentConfig {
name: string;
systemPrompt?: string;
modelTier?: ModelTier;
toolProfile?: ToolProfile;
toolOverrides?: ToolOverrideConfig;
sandbox?: boolean;
}
/**
* AgentConfigRegistry — stores named agent configurations.
* Loaded from YAML config at startup.
*/
export class AgentConfigRegistry {
private configs = new Map<string, AgentConfig>();
register(config: AgentConfig): void {
if (this.configs.has(config.name)) {
throw new Error(`Agent config '${config.name}' is already registered`);
}
this.configs.set(config.name, config);
}
get(name: string): AgentConfig | undefined {
return this.configs.get(name);
}
list(): AgentConfig[] {
return Array.from(this.configs.values());
}
/**
* Load agent configs from the parsed YAML config.
* Maps from the config schema format to the internal AgentConfig format.
*/
loadFromConfig(rawConfigs: Record<string, {
system_prompt?: string;
model_tier?: string;
tool_profile?: string;
tool_overrides?: ToolOverrideConfig;
sandbox?: boolean;
}>): void {
for (const [name, raw] of Object.entries(rawConfigs)) {
this.register({
name,
systemPrompt: raw.system_prompt,
modelTier: raw.model_tier as ModelTier | undefined,
toolProfile: raw.tool_profile as ToolProfile | undefined,
toolOverrides: raw.tool_overrides,
sandbox: raw.sandbox,
});
}
}
}
Step 4: Run test to verify it passes
Run: pnpm vitest run src/agents/registry.test.ts
Expected: PASS
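The `as ModelTier` and `as ToolProfile` casts in loadFromConfig() are only safe if the config schema has already rejected unknown strings. Where that guarantee is in doubt, a runtime guard is one option. The sketch below is illustrative only: the tier and profile lists are assumptions inferred from values that appear in this plan and may not match the real unions in src/models/router.ts and src/config/schema.ts, and `narrow` is a hypothetical helper.

```typescript
// Hypothetical value lists, inferred from examples in this plan; the real
// unions live in src/models/router.ts and src/config/schema.ts.
const MODEL_TIERS = ['default', 'fast', 'complex'] as const;
const TOOL_PROFILES = ['messaging', 'coding'] as const;

type ModelTier = (typeof MODEL_TIERS)[number];
type ToolProfile = (typeof TOOL_PROFILES)[number];

/**
 * Narrow an optional raw string to one of the allowed literals.
 * Returns undefined when the value is absent and throws when it is unknown.
 */
function narrow<T extends string>(
  value: string | undefined,
  allowed: readonly T[],
  field: string,
): T | undefined {
  if (value === undefined) return undefined;
  if ((allowed as readonly string[]).includes(value)) return value as T;
  throw new Error(`Invalid ${field} '${value}'; expected one of: ${allowed.join(', ')}`);
}

const tier: ModelTier | undefined = narrow('complex', MODEL_TIERS, 'model_tier');
const profile: ToolProfile | undefined = narrow(undefined, TOOL_PROFILES, 'tool_profile');
```

Inside loadFromConfig(), calls like these could stand in for the two casts, so that a typo in the YAML surfaces at startup instead of at first use.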
Step 5: Commit
git add src/agents/registry.ts src/agents/registry.test.ts
git commit -m "feat: add AgentConfigRegistry for named agent configurations"
Task 7: Agent Router
Files:
- Create: src/agents/router.ts
- Create: src/agents/router.test.ts
Step 1: Write the failing test
Create file: src/agents/router.test.ts
import { describe, it, expect } from 'vitest';
import { AgentRouter, type RoutingConfig } from './router.js';
describe('AgentRouter', () => {
describe('resolve()', () => {
it('returns default_agent when no specific match', () => {
const router = new AgentRouter({
default_agent: 'assistant',
channels: {},
senders: {},
});
expect(router.resolve('telegram', '12345')).toBe('assistant');
});
it('returns undefined when no default and no match', () => {
const router = new AgentRouter({
channels: {},
senders: {},
});
expect(router.resolve('telegram', '12345')).toBeUndefined();
});
it('matches exact sender', () => {
const router = new AgentRouter({
default_agent: 'assistant',
channels: {},
senders: { 'telegram:12345': 'coder' },
});
expect(router.resolve('telegram', '12345')).toBe('coder');
});
it('matches sender with glob pattern', () => {
const router = new AgentRouter({
default_agent: 'assistant',
channels: {},
senders: { 'slack:U0*': 'coder' },
});
expect(router.resolve('slack', 'U0ABC')).toBe('coder');
expect(router.resolve('slack', 'U1ABC')).toBe('assistant'); // no sender or channel match, so default_agent applies
});
it('matches channel when no sender match', () => {
const router = new AgentRouter({
default_agent: 'assistant',
channels: { discord: 'coder' },
senders: {},
});
expect(router.resolve('discord', 'any-user')).toBe('coder');
});
it('sender match takes priority over channel match', () => {
const router = new AgentRouter({
default_agent: 'assistant',
channels: { discord: 'coder' },
senders: { 'discord:special-user': 'vip' },
});
expect(router.resolve('discord', 'special-user')).toBe('vip');
expect(router.resolve('discord', 'normal-user')).toBe('coder');
});
it('falls through: sender → channel → default', () => {
const router = new AgentRouter({
default_agent: 'fallback',
channels: { discord: 'guild-agent' },
senders: { 'discord:admin': 'admin-agent' },
});
expect(router.resolve('discord', 'admin')).toBe('admin-agent');
expect(router.resolve('discord', 'regular')).toBe('guild-agent');
expect(router.resolve('telegram', 'someone')).toBe('fallback');
});
});
});
Step 2: Run test to verify it fails
Run: pnpm vitest run src/agents/router.test.ts
Expected: FAIL — cannot find module ./router.js
Step 3: Implement AgentRouter
Create file: src/agents/router.ts
/**
* AgentRouter resolves which agent config to use for a given channel+sender.
*
* Resolution order:
* 1. Exact sender match (channel:senderId)
* 2. Glob pattern sender match
* 3. Channel match
* 4. default_agent fallback
*/
export interface RoutingConfig {
default_agent?: string;
channels: Record<string, string>;
senders: Record<string, string>;
}
/**
* Convert a simple glob pattern to regex.
* Supports `*` (any run of characters); all other regex metacharacters are escaped.
*/
function patternToRegex(pattern: string): RegExp {
const escaped = pattern
.replace(/[.+?^${}()|[\]\\]/g, '\\$&')
.replace(/\*/g, '.*');
return new RegExp(`^${escaped}$`);
}
export class AgentRouter {
private config: RoutingConfig;
constructor(config: RoutingConfig) {
this.config = config;
}
/**
* Resolve the agent config name for a channel + sender pair.
* Returns undefined if no match and no default.
*/
resolve(channel: string, senderId: string): string | undefined {
const senderKey = `${channel}:${senderId}`;
// 1. Exact sender match
if (this.config.senders[senderKey]) {
return this.config.senders[senderKey];
}
// 2. Glob pattern sender match
for (const [pattern, agentName] of Object.entries(this.config.senders)) {
if (pattern.includes('*') && patternToRegex(pattern).test(senderKey)) {
return agentName;
}
}
// 3. Channel match
if (this.config.channels[channel]) {
return this.config.channels[channel];
}
// 4. Default fallback
return this.config.default_agent;
}
}
Step 4: Run test to verify it passes
Run: pnpm vitest run src/agents/router.test.ts
Expected: PASS
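The glob handling in Step 3 can be checked in isolation. The following is a standalone sketch of the same idea (escape every regex metacharacter, including `?`, then expand `*`), not the file to commit; `globToRegex` and the sample sender keys are hypothetical names chosen for illustration.

```typescript
/**
 * Convert a simple glob to an anchored RegExp.
 * Every regex metacharacter except `*` is escaped, then `*` becomes `.*`.
 */
function globToRegex(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
    .replace(/\*/g, '.*');
  return new RegExp(`^${escaped}$`);
}

// Hypothetical sender keys in the `channel:senderId` form used by AgentRouter.
const slackPrefix = globToRegex('slack:U0*');
const matchesPrefix = slackPrefix.test('slack:U0ABC');
const rejectsOther = slackPrefix.test('slack:U1ABC');
// The pattern is anchored at both ends, so a longer key does not match.
const rejectsLonger = slackPrefix.test('xslack:U0ABC');

// A literal dot stays literal instead of matching any character.
const dotted = globToRegex('mail:a.b*');
const dotIsLiteral = dotted.test('mail:a.b1') && !dotted.test('mail:axb1');
```

When several glob patterns match the same key, resolve() returns the first one in Object.entries order, which for these string keys is insertion order. More specific patterns should therefore be listed before broader ones in the senders map.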
Step 5: Commit
git add src/agents/router.ts src/agents/router.test.ts
git commit -m "feat: add AgentRouter for config-based sender/channel routing"
Task 8: Agents Barrel Export
Files:
- Create: src/agents/index.ts
Step 1: Create the barrel file
Create file: src/agents/index.ts
export { AgentConfigRegistry, type AgentConfig } from './registry.js';
export { AgentRouter, type RoutingConfig } from './router.js';
Step 2: Verify build
Run: pnpm typecheck
Expected: No errors
Step 3: Commit
git add src/agents/index.ts
git commit -m "feat: add agents barrel export"
Task 9: Wire Everything Into the Daemon
Files:
- Modify: src/daemon/index.ts
This is the integration task. The daemon's createMessageRouter() needs to use the AgentRouter and SandboxManager.
Step 1: Write the integration test
Create file: src/daemon/routing.test.ts
import { describe, it, expect } from 'vitest';
import { AgentRouter } from '../agents/router.js';
import { AgentConfigRegistry } from '../agents/registry.js';
describe('daemon agent routing integration', () => {
it('resolves agent config for channel messages', () => {
const registry = new AgentConfigRegistry();
registry.loadFromConfig({
assistant: { system_prompt: 'Be helpful.', model_tier: 'default', tool_profile: 'messaging', sandbox: false },
coder: { system_prompt: 'Write code.', model_tier: 'complex', tool_profile: 'coding', sandbox: true },
});
const router = new AgentRouter({
default_agent: 'assistant',
channels: { discord: 'coder' },
senders: { 'telegram:admin': 'coder' },
});
// Discord user gets coder
const discordAgent = router.resolve('discord', 'user123');
expect(discordAgent).toBe('coder');
expect(registry.get(discordAgent!)!.systemPrompt).toBe('Write code.');
// Telegram admin gets coder
const telegramAdmin = router.resolve('telegram', 'admin');
expect(telegramAdmin).toBe('coder');
// Random telegram user gets assistant
const telegramUser = router.resolve('telegram', 'random');
expect(telegramUser).toBe('assistant');
expect(registry.get(telegramUser!)!.systemPrompt).toBe('Be helpful.');
});
it('returns undefined when no routing is configured', () => {
const router = new AgentRouter({ channels: {}, senders: {} });
expect(router.resolve('telegram', '123')).toBeUndefined();
});
});
Step 2: Run test to verify it passes
Run: pnpm vitest run src/daemon/routing.test.ts
Expected: PASS (these tests only combine components built in Tasks 6 and 7, so there is no failing step here)
Step 3: Modify daemon/index.ts
Add imports at the top of src/daemon/index.ts (after existing imports):
import { AgentConfigRegistry, AgentRouter } from '../agents/index.js';
import { SandboxManager, createSandboxedShellTool, createSandboxedProcessStartTool } from '../sandbox/index.js';
Add to DaemonContext interface:
agentConfigRegistry: AgentConfigRegistry;
agentRouter: AgentRouter;
sandboxManager?: SandboxManager;
Modify createMessageRouter() to accept additional dependencies:
function createMessageRouter(deps: {
sessionManager: SessionManager;
modelRouter: ModelRouter;
systemPrompt: string;
toolRegistry: ToolRegistry;
toolExecutor: ToolExecutor;
config: Config;
memoryStore?: MemoryStore;
agentConfigRegistry?: AgentConfigRegistry;
agentRouter?: AgentRouter;
sandboxManager?: SandboxManager;
}) {
Inside getOrCreateAgent(), resolve the agent config and create sandboxed registries:
function getOrCreateAgent(channel: string, senderId: string): AgentOrchestrator {
// Resolve agent config name from routing
const agentConfigName = deps.agentRouter?.resolve(channel, senderId);
const agentConfig = agentConfigName ? deps.agentConfigRegistry?.get(agentConfigName) : undefined;
const cacheKey = agentConfigName
? `${channel}:${senderId}:${agentConfigName}`
: `${channel}:${senderId}`;
let agent = agents.get(cacheKey);
if (!agent) {
const session = deps.sessionManager.getSession(channel, senderId);
// Determine system prompt — agent config overrides global
const systemPrompt = agentConfig?.systemPrompt ?? deps.systemPrompt;
// Determine primary tier
const primaryTier = agentConfig?.modelTier ?? deps.config.agents.primary_tier ?? 'default';
// Determine tool policy context
const toolPolicyContext: ToolPolicyContext = {
agent: primaryTier,
provider: deps.config.models.default.provider,
};
// Determine tool registry — sandbox if configured
let toolRegistry = deps.toolRegistry;
if (agentConfig?.sandbox && deps.sandboxManager && deps.config.sandbox.enabled) {
// Create a cloned registry with sandboxed tools
toolRegistry = deps.toolRegistry.clone();
// Sandbox will be created lazily on first tool call
// For now, create a wrapper that handles lazy initialization
const sessionId = `${channel}:${senderId}`;
const sandbox = deps.sandboxManager;
const sandboxConfig = deps.config.sandbox;
// Replace shell.exec and process.start with lazy-sandboxed versions
const lazySandboxedShell: Tool = {
name: 'shell.exec',
description: 'Execute a shell command inside a sandboxed container.',
inputSchema: {
type: 'object',
properties: {
command: { type: 'string', description: 'The shell command to execute' },
cwd: { type: 'string', description: 'Working directory (optional)' },
timeout: { type: 'number', description: 'Timeout in milliseconds (default 30000)' },
},
required: ['command'],
},
execute: async (rawArgs: unknown) => {
const dockerSandbox = await sandbox.getOrCreate(sessionId);
const tool = createSandboxedShellTool(dockerSandbox);
return tool.execute(rawArgs);
},
};
const lazySandboxedProcessStart: Tool = {
name: 'process.start',
description: 'Start a command in the background inside a sandboxed container.',
inputSchema: {
type: 'object',
properties: {
command: { type: 'string', description: 'The shell command to run' },
cwd: { type: 'string', description: 'Working directory (optional)' },
},
required: ['command'],
},
execute: async (rawArgs: unknown) => {
const dockerSandbox = await sandbox.getOrCreate(sessionId);
const tool = createSandboxedProcessStartTool(dockerSandbox);
return tool.execute(rawArgs);
},
};
toolRegistry.replace(lazySandboxedShell);
toolRegistry.replace(lazySandboxedProcessStart);
}
const delegationConfig: DelegationConfig = {
compaction: deps.config.agents.delegation.compaction ?? 'fast',
memory_extraction: deps.config.agents.delegation.memory_extraction ?? 'fast',
classification: deps.config.agents.delegation.classification ?? 'fast',
tool_summarisation: deps.config.agents.delegation.tool_summarisation ?? 'fast',
complex_reasoning: deps.config.agents.delegation.complex_reasoning ?? 'complex',
};
agent = new AgentOrchestrator({
modelRouter: deps.modelRouter,
systemPrompt,
session,
toolRegistry,
toolExecutor: deps.toolExecutor,
primaryTier,
delegation: delegationConfig,
maxDelegationDepth: deps.config.agents.max_delegation_depth ?? 3,
compaction: deps.config.compaction.enabled ? {
thresholdPct: deps.config.compaction.threshold_pct,
keepTurns: deps.config.compaction.keep_turns,
summaryMaxTokens: deps.config.compaction.summary_max_tokens,
} : undefined,
modelName: deps.config.models.default.model,
contextWindow: deps.config.models.default.context_window,
memoryStore: deps.memoryStore,
toolPolicyContext,
});
agents.set(cacheKey, agent);
}
return agent;
}
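The two lazy wrappers in getOrCreateAgent() share one shape: static metadata plus an execute() that resolves the real tool on each call. One way to express that shape once is sketched below. It is a sketch under assumptions: the `Tool` interface here is a minimal stand-in for the one in src/tools/types.ts, `makeLazyTool` is a hypothetical helper, and the factory is a fake standing in for sandbox.getOrCreate() followed by createSandboxedShellTool().

```typescript
// Minimal stand-in for the Tool type defined elsewhere in the codebase (assumption).
interface Tool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  execute: (rawArgs: unknown) => Promise<unknown>;
}

/**
 * Build a tool whose real implementation is resolved on use.
 * `resolveTool` runs on every execute(); caching of the underlying sandbox
 * is left to the caller (SandboxManager.getOrCreate in this plan).
 */
function makeLazyTool(
  meta: Omit<Tool, 'execute'>,
  resolveTool: () => Promise<Tool>,
): Tool {
  return {
    ...meta,
    execute: async (rawArgs: unknown) => {
      const tool = await resolveTool();
      return tool.execute(rawArgs);
    },
  };
}

// Fake tool and factory, used only to show when resolution happens.
let resolveCalls = 0;
const fakeShell: Tool = {
  name: 'shell.exec',
  description: 'fake',
  inputSchema: { type: 'object' },
  execute: async (rawArgs) => ({ echoed: rawArgs }),
};

const lazyShell = makeLazyTool(
  {
    name: 'shell.exec',
    description: 'Execute a shell command inside a sandboxed container.',
    inputSchema: { type: 'object' },
  },
  async () => {
    resolveCalls += 1;
    return fakeShell;
  },
);

// Nothing is resolved at registration time; the first execute() triggers it.
const callsBeforeUse = resolveCalls;
const pending = lazyShell.execute({ command: 'ls' });
```

The point of the design is that registering the tool costs nothing: no container is started for sessions that never run a shell command.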
In startDaemon(), add agent config registry and router initialization after skills loading (around line 385):
// Initialize agent config registry and router
const agentConfigRegistry = new AgentConfigRegistry();
if (config.agent_configs && Object.keys(config.agent_configs).length > 0) {
agentConfigRegistry.loadFromConfig(config.agent_configs);
console.log(`Loaded ${Object.keys(config.agent_configs).length} agent config(s): ${Object.keys(config.agent_configs).join(', ')}`);
}
const agentRouter = new AgentRouter(config.routing);
// Initialize sandbox manager if enabled
let sandboxManager: SandboxManager | undefined;
if (config.sandbox.enabled) {
const dockerAvailable = await DockerSandbox.isAvailable();
if (dockerAvailable) {
sandboxManager = new SandboxManager(config.sandbox);
console.log(`Docker sandbox enabled: image=${config.sandbox.image}, network=${config.sandbox.network}`);
} else {
console.warn('Docker sandbox enabled in config but Docker is not available — falling back to host execution');
}
}
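One gap worth noting in this wiring: if routing names an agent that is missing from agent_configs, getOrCreateAgent() gets undefined from the registry and falls back to the global system prompt and the unsandboxed tool registry without any signal. A startup check could catch that. The sketch below is a hypothetical helper, not part of the plan's AgentRouter; `findUnknownAgents` and the sample names are assumptions for illustration.

```typescript
interface RoutingConfig {
  default_agent?: string;
  channels: Record<string, string>;
  senders: Record<string, string>;
}

/**
 * Return the agent names referenced by routing that are not in `known`,
 * sorted for stable output. Hypothetical helper.
 */
function findUnknownAgents(routing: RoutingConfig, known: Iterable<string>): string[] {
  const knownSet = new Set(known);
  const referenced = new Set<string>();
  if (routing.default_agent) referenced.add(routing.default_agent);
  for (const name of Object.values(routing.channels)) referenced.add(name);
  for (const name of Object.values(routing.senders)) referenced.add(name);
  return Array.from(referenced)
    .filter((name) => !knownSet.has(name))
    .sort();
}

// 'reviewer' is routed to but never defined, so it is reported.
const unknownAgents = findUnknownAgents(
  {
    default_agent: 'assistant',
    channels: { discord: 'coder' },
    senders: { 'telegram:admin': 'reviewer' },
  },
  ['assistant', 'coder'],
);
```

In startDaemon(), the second argument could be agentConfigRegistry.list().map(c => c.name), with a warning or a hard failure when the result is non-empty.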
Add sandbox shutdown hook:
if (sandboxManager) {
lifecycle.onShutdown(async () => {
await sandboxManager!.destroyAll();
console.log('Docker sandboxes destroyed');
});
}
Pass new deps to createMessageRouter():
channelRegistry.setMessageHandler(createMessageRouter({
sessionManager,
modelRouter,
systemPrompt,
toolRegistry,
toolExecutor,
config,
memoryStore,
agentConfigRegistry,
agentRouter,
sandboxManager,
}));
Add to DaemonContext return:
return {
config,
lifecycle,
sessionStore,
sessionManager,
hookEngine,
modelRouter,
toolRegistry,
toolExecutor,
gateway,
channelRegistry,
mcpManager,
skillRegistry,
skillInstaller,
agentConfigRegistry,
agentRouter,
sandboxManager,
};
Note: The snippets above also reference DockerSandbox, the Tool type, and ToolPolicyContext, so add these imports at the top of src/daemon/index.ts:
import { DockerSandbox } from '../sandbox/index.js';
import type { Tool } from '../tools/types.js';
import type { ToolPolicyContext } from '../tools/policy.js';
Step 4: Run full test suite
Run: pnpm test:run
Expected: All tests pass
Step 5: Run typecheck
Run: pnpm typecheck
Expected: No errors
Step 6: Commit
git add src/daemon/index.ts src/daemon/routing.test.ts
git commit -m "feat: wire Docker sandboxing and agent routing into daemon"
Task 10: Update state.json + Final Verification
Files:
- Modify: docs/plans/state.json
Step 1: Run full test suite and typecheck
Run: pnpm test:run && pnpm typecheck
Expected: All tests pass, no type errors
Step 2: Update state.json
Add the new P2 entries to docs/plans/state.json under the p2-implementation plan's phases object:
"docker_sandboxing": {
"priority": "P2",
"status": "completed",
"description": "Docker container sandboxing for channel tool execution (shell.exec, process.start)",
"files_created": [
"src/sandbox/docker.ts",
"src/sandbox/docker.test.ts",
"src/sandbox/manager.ts",
"src/sandbox/manager.test.ts",
"src/sandbox/tools.ts",
"src/sandbox/tools.test.ts",
"src/sandbox/index.ts"
],
"files_modified": [
"src/config/schema.ts",
"src/config/index.ts",
"src/tools/registry.ts",
"src/daemon/index.ts"
],
"test_status": "N/N passing"
},
"multi_agent_routing": {
"priority": "P2",
"status": "completed",
"description": "Named agent configs with config-based channel/sender routing",
"files_created": [
"src/agents/registry.ts",
"src/agents/registry.test.ts",
"src/agents/router.ts",
"src/agents/router.test.ts",
"src/agents/index.ts",
"src/daemon/routing.test.ts",
"src/config/schema.test.ts"
],
"files_modified": [
"src/config/schema.ts",
"src/config/index.ts",
"src/daemon/index.ts"
],
"test_status": "N/N passing"
}
Update overall_progress.p2_completion to "7/7 (100%)" and next_up to "p3 (group chat, gateway auth, gemini provider, browser control, additional providers)".
Update overall_progress.total_test_count with the actual count.
Step 3: Commit
git add docs/plans/state.json
git commit -m "docs: update state.json with Docker sandbox and multi-agent routing"
Summary
| Task | Component | Est. Time |
|---|---|---|
| 1 | Config schemas (sandbox + agent_configs + routing) | 5 min |
| 2 | DockerSandbox class | 5 min |
| 3 | SandboxManager | 3 min |
| 4 | Sandboxed tool wrappers | 5 min |
| 5 | Barrel export + ToolRegistry.clone() | 3 min |
| 6 | AgentConfigRegistry | 3 min |
| 7 | AgentRouter | 3 min |
| 8 | Agents barrel export | 1 min |
| 9 | Daemon integration | 10 min |
| 10 | State update + verification | 3 min |
Total estimated: ~40 minutes