# P2: Docker Sandboxing + Multi-Agent Routing - Design

**Date:** 2026-02-06
**Status:** Approved
**Priority:** P2 (completes all P2 work)

---

## Feature 1: Docker Sandboxing

### Goal

Channel sessions (Telegram, Discord, Slack, WhatsApp) execute `shell.exec` and `process.start` inside Docker containers. TUI and local WebSocket sessions continue running on the host.

### Architecture

Tool-level wrapping: sandboxed versions of dangerous tools (`shell.exec`, `process.start`) delegate to `docker exec` inside a per-session container. All other tools (file.read, web.fetch, memory.*, etc.) run on the host unchanged.

```
src/sandbox/
  docker.ts        # DockerSandbox class (create/exec/destroy containers via CLI)
  docker.test.ts   # Tests (mocked Docker CLI)
  manager.ts       # SandboxManager (session→container mapping + lifecycle)
  manager.test.ts  # Tests
  tools.ts         # createSandboxedShellTool(), createSandboxedProcessStartTool()
  tools.test.ts    # Tests
  index.ts         # Barrel export
```

### Config Schema

```yaml
sandbox:
  enabled: false               # opt-in
  image: "node:22-slim"        # base container image
  workspace_dir: "/workspace"  # mount path inside container
  network: "none"              # container network mode (none/bridge/host)
  memory_limit: "512m"         # memory limit per container
  cpu_limit: "1.0"             # CPU limit per container
  timeout_seconds: 300         # auto-kill timeout per container
```

### DockerSandbox Class

Wraps Docker CLI via `child_process.execFile` (no Docker SDK dependency):

- `create()`: `docker create` with resource limits, bind mount, network mode
- `start()`: `docker start`
- `exec(command, opts)`: `docker exec` with timeout, returns stdout/stderr
- `destroy()`: `docker rm -f`
- `isRunning()`: `docker inspect` check

### SandboxManager

- `getOrCreate(sessionId, config)`: Lazy container creation on first tool call
- `destroy(sessionId)`: Stop and remove container
- `destroyAll()`: Shutdown hook for daemon cleanup

### Sandboxed Tools

- `createSandboxedShellTool(sandbox)`: Same `Tool` interface as `shell.exec`, but runs via `sandbox.exec(command)`. Preserves cwd (translated to container path), timeout, output truncation.
- `createSandboxedProcessStartTool(sandbox)`: Wraps `process.start` to spawn via `docker exec -d` (detached mode).

### Per-Session ToolRegistry

When sandbox is active for a channel session, the daemon creates a cloned `ToolRegistry` that replaces `shell.exec` and `process.start` with sandboxed versions. All other tools reference the shared host registry.

### Error Handling

- Docker not installed → log warning at startup, fall through to host execution
- Container creation fails → log error, return tool error (not crash)
- Container timeout → `docker rm -f`, return timeout error
- Docker daemon unavailable → graceful degradation with clear error messages

---

## Feature 2: Multi-Agent Routing

### Goal

Named agent configurations that can be assigned to channels, senders, or sender patterns. Each agent config specifies its own system prompt, model tier, tool profile, and sandbox setting.

### Architecture

```
src/agents/
  registry.ts     # AgentConfigRegistry (stores named AgentConfig objects)
  router.ts       # AgentRouter (resolves {channel, senderId} → AgentConfig)
  router.test.ts  # Tests
  index.ts        # Barrel export
```

### Config Schema

```yaml
agent_configs:
  assistant:
    system_prompt: "You are a helpful assistant."
    model_tier: default
    tool_profile: messaging
    sandbox: true
  coder:
    system_prompt: "You are a coding assistant. Focus on writing clean code."
    model_tier: complex
    tool_profile: coding
    sandbox: true

routing:
  default_agent: assistant
  channels:
    discord: coder
  senders:
    "telegram:12345": coder
    "slack:U0*": assistant
```

### AgentConfigRegistry

Stores parsed `AgentConfig` objects by name:

- `register(config)`: Add a named config
- `get(name)`: Look up by name
- `list()`: All registered configs
- `loadFromConfig(rawConfig)`: Parse from validated YAML

### AgentConfig Type

```typescript
interface AgentConfig {
  name: string;
  systemPrompt?: string;        // overrides global system prompt
  modelTier?: ModelTier;        // fast/default/complex/local
  toolProfile?: ToolProfile;    // minimal/messaging/coding/full
  toolOverrides?: ToolOverrideConfig;
  sandbox?: boolean;            // use Docker sandbox (if globally enabled)
}
```

### AgentRouter

Resolves which `AgentConfig` to use for a given message:

1. Check `senders` map: exact match first, then glob patterns (via `minimatch`)
2. Check `channels` map: channel name match
3. Fall back to `routing.default_agent`

### Daemon Integration

The `createMessageRouter()` function changes:

1. On message: `agentRouter.resolve(channel, senderId)` returns agent config name
2. Cache key: `${channel}:${senderId}:${agentConfigName}` (agent change = new orchestrator)
3. Create `AgentOrchestrator` with resolved config's system prompt, model tier, tool policy
4. If sandbox enabled for this config + globally: create per-session sandboxed ToolRegistry
5. Otherwise: use shared host ToolRegistry

---

## Modified Files

- `src/config/schema.ts`: Add `sandboxSchema`, `agentConfigSchema`, `routingSchema`
- `src/config/index.ts`: Export new types
- `src/daemon/index.ts`: Wire SandboxManager + AgentRouter into message handler
- `src/tools/registry.ts`: Add `clone()` method for per-session copies

## Testing

- All Docker interactions mocked (no real Docker in tests)
- Agent router tested with config fixtures (exact, glob, channel, default fallback)
- Sandboxed tools tested with mocked Docker CLI exec
- Integration tested via daemon message handler with mocked dependencies

## Dependencies

- No new npm dependencies (Docker CLI, `minimatch` already available or trivially implemented)
- Runtime: Docker must be installed on host for sandbox feature to work (graceful degradation if absent)
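
## Appendix: AgentRouter Resolution Sketch

The three-step resolution order in the AgentRouter section (sender exact match, sender glob, channel, default) can be illustrated with a small function. This is a minimal sketch, not the planned implementation. It assumes sender keys have the form `channel:senderId` (as the `"telegram:12345"` example suggests), that glob patterns are tried in declaration order, and it uses a hypothetical `globToRegExp` helper that supports only `*` as a stand-in for `minimatch`. The `RoutingConfig` shape and the `resolveAgent` name are illustrative.

```typescript
// Hypothetical shape mirroring the `routing` YAML block.
interface RoutingConfig {
  default_agent: string;
  channels?: Record<string, string>;
  senders?: Record<string, string>;
}

// Stand-in for minimatch (assumption): only `*` is supported and matches
// any run of characters. Everything else is treated literally.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Returns the agent config name for a message, following the documented order.
function resolveAgent(
  routing: RoutingConfig,
  channel: string,
  senderId: string,
): string {
  const senderKey = `${channel}:${senderId}`; // assumed key format
  const senders = routing.senders ?? {};

  // 1a. An exact sender entry wins over any pattern.
  const exact = senders[senderKey];
  if (exact !== undefined) return exact;

  // 1b. Glob patterns, tried in declaration order (assumption).
  for (const [pattern, agent] of Object.entries(senders)) {
    if (pattern.includes("*") && globToRegExp(pattern).test(senderKey)) {
      return agent;
    }
  }

  // 2. Channel-level assignment.
  const byChannel = routing.channels?.[channel];
  if (byChannel !== undefined) return byChannel;

  // 3. Global default.
  return routing.default_agent;
}

// Fixture taken from the Config Schema example.
const routing: RoutingConfig = {
  default_agent: "assistant",
  channels: { discord: "coder" },
  senders: { "telegram:12345": "coder", "slack:U0*": "assistant" },
};

const exampleAgent = resolveAgent(routing, "slack", "U0ABC"); // "assistant" via the glob entry
```

With this fixture, an unlisted Discord sender resolves to `coder` through the channel map, and a WhatsApp sender falls through to the default `assistant`.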
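
## Appendix: `docker create` Argument Sketch

To make the mapping from the `sandbox` config block to the `docker create` call in `DockerSandbox.create()` concrete, the sketch below builds the argument list as a pure function. Several details are assumptions, not part of the design: the `buildCreateArgs` and `containerName` names, the `sandbox-<sessionId>` naming scheme, the `hostWorkspaceDir` parameter, and the use of `sleep infinity` to keep the container alive for later `docker exec` calls. The flags themselves (`--name`, `--network`, `--memory`, `--cpus`, `--volume`, `--workdir`) are standard Docker CLI options.

```typescript
// Mirrors the `sandbox` YAML block; field names follow the YAML keys.
interface SandboxConfig {
  image: string;
  workspace_dir: string;
  network: "none" | "bridge" | "host";
  memory_limit: string;
  cpu_limit: string;
  timeout_seconds: number;
}

// Docker container names may contain only [a-zA-Z0-9_.-], so any other
// character in the session id (for example ":") is replaced with "_".
// The "sandbox-" prefix is a hypothetical naming scheme.
function containerName(sessionId: string): string {
  return `sandbox-${sessionId.replace(/[^a-zA-Z0-9_.-]/g, "_")}`;
}

// Builds the argv for `docker create`. Keeping this pure makes it easy to
// unit-test without a Docker daemon, in line with the mocked-CLI test plan.
function buildCreateArgs(
  sessionId: string,
  hostWorkspaceDir: string, // hypothetical: host directory to bind-mount
  config: SandboxConfig,
): string[] {
  return [
    "create",
    "--name", containerName(sessionId),
    "--network", config.network,
    "--memory", config.memory_limit,
    "--cpus", config.cpu_limit,
    "--volume", `${hostWorkspaceDir}:${config.workspace_dir}`,
    "--workdir", config.workspace_dir,
    config.image,
    // Assumption: a long-running no-op keeps the container up for `docker exec`.
    "sleep", "infinity",
  ];
}

const exampleConfig: SandboxConfig = {
  image: "node:22-slim",
  workspace_dir: "/workspace",
  network: "none",
  memory_limit: "512m",
  cpu_limit: "1.0",
  timeout_seconds: 300,
};

const exampleArgs = buildCreateArgs("telegram:12345", "/tmp/ws", exampleConfig);
```

The resulting array would be passed as the second argument to `child_process.execFile("docker", args, ...)`. Passing arguments as an array avoids shell interpolation of session ids or paths.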