docs: add Docker sandbox and multi-agent routing design/implementation plans

2026-02-06 16:52:38 -08:00
parent 880744846f
commit cfdd448495
2 changed files with 2005 additions and 0 deletions
@@ -0,0 +1,173 @@
 # P2: Docker Sandboxing + Multi-Agent Routing — Design
 **Date:** 2026-02-06
 **Status:** Approved
 **Priority:** P2 (completes all P2 work)
 ---
 ## Feature 1: Docker Sandboxing
 ### Goal
 Channel sessions (Telegram, Discord, Slack, WhatsApp) execute `shell.exec` and `process.start` inside Docker containers. TUI and local WebSocket sessions continue running on the host.
 ### Architecture
 Tool-level wrapping: sandboxed versions of dangerous tools (`shell.exec`, `process.start`) delegate to `docker exec` inside a per-session container. All other tools (`file.read`, `web.fetch`, `memory.*`, etc.) run on the host unchanged.
 ```
 src/sandbox/
  docker.ts         — DockerSandbox class (create/exec/destroy containers via CLI)
  docker.test.ts    — Tests (mocked Docker CLI)
  manager.ts        — SandboxManager (session→container mapping + lifecycle)
  manager.test.ts   — Tests
  tools.ts          — createSandboxedShellTool(), createSandboxedProcessStartTool()
  tools.test.ts     — Tests
  index.ts          — Barrel export
 ```
 ### Config Schema
 ```yaml
 sandbox:
  enabled: false              # opt-in
  image: "node:22-slim"       # base container image
  workspace_dir: "/workspace" # mount path inside container
  network: "none"             # container network mode (none/bridge/host)
  memory_limit: "512m"        # memory limit per container
  cpu_limit: "1.0"            # CPU limit per container
  timeout_seconds: 300        # auto-kill timeout per container
 ```
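The schema above could map onto a typed config with defaults roughly as follows. This is a minimal sketch, not the actual `sandboxSchema`: the `SandboxConfig` interface and the `resolveSandboxConfig` helper are hypothetical names, and only the keys and default values come from the YAML above.

```typescript
// Hypothetical typed view of the `sandbox:` block; keys mirror the YAML above.
type NetworkMode = "none" | "bridge" | "host";

interface SandboxConfig {
  enabled: boolean;
  image: string;
  workspace_dir: string;
  network: NetworkMode;
  memory_limit: string;
  cpu_limit: string;
  timeout_seconds: number;
}

// Defaults copied from the schema above.
const SANDBOX_DEFAULTS: SandboxConfig = {
  enabled: false,
  image: "node:22-slim",
  workspace_dir: "/workspace",
  network: "none",
  memory_limit: "512m",
  cpu_limit: "1.0",
  timeout_seconds: 300,
};

// Merge a partial user config over the defaults and reject obviously bad values.
function resolveSandboxConfig(raw: Partial<SandboxConfig> = {}): SandboxConfig {
  const merged: SandboxConfig = { ...SANDBOX_DEFAULTS, ...raw };
  const modes: NetworkMode[] = ["none", "bridge", "host"];
  if (modes.indexOf(merged.network) === -1) {
    throw new Error(`sandbox.network must be one of ${modes.join("/")}`);
  }
  if (!(merged.timeout_seconds > 0)) {
    throw new Error("sandbox.timeout_seconds must be positive");
  }
  return merged;
}
```

Calling `resolveSandboxConfig({ enabled: true })` would turn the sandbox on while keeping every other default, which matches the opt-in intent of `enabled: false`.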
 ### DockerSandbox Class
 Wraps Docker CLI via `child_process.execFile` (no Docker SDK dependency):
 - `create()` — `docker create` with resource limits, bind mount, network mode
 - `start()` — `docker start`
 - `exec(command, opts)` — `docker exec` with timeout, returns stdout/stderr
 - `destroy()` — `docker rm -f`
 - `isRunning()` — `docker inspect` check
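One way the method list above might translate into CLI calls is sketched below. Several things here are assumptions: the `CliRunner` type (standing in for a promisified `child_process.execFile` so the sketch runs without Docker), the `SandboxOptions` shape, the container naming, and the `sleep infinity` keep-alive command. The per-exec timeout is omitted for brevity.

```typescript
// Runs a CLI command. In a real build this would wrap a promisified
// child_process.execFile; it is injected here so tests can mock the Docker CLI.
type CliResult = { stdout: string; stderr: string };
type CliRunner = (file: string, args: string[]) => Promise<CliResult>;

// Hypothetical option bag; values would come from the sandbox config.
interface SandboxOptions {
  name: string;         // container name (naming scheme is an assumption)
  image: string;
  hostDir: string;      // host directory to bind-mount
  workspaceDir: string; // mount path inside the container
  network: string;
  memoryLimit: string;
  cpuLimit: string;
}

// Build the argument list for `docker create` with limits, mount, and network mode.
function buildCreateArgs(o: SandboxOptions): string[] {
  return [
    "create",
    "--name", o.name,
    "--memory", o.memoryLimit,
    "--cpus", o.cpuLimit,
    "--network", o.network,
    "-v", `${o.hostDir}:${o.workspaceDir}`,
    "-w", o.workspaceDir,
    o.image,
    "sleep", "infinity", // assumed keep-alive so later `docker exec` calls have a target
  ];
}

class DockerSandbox {
  private readonly opts: SandboxOptions;
  private readonly run: CliRunner;

  constructor(opts: SandboxOptions, run: CliRunner) {
    this.opts = opts;
    this.run = run;
  }

  async create(): Promise<void> {
    await this.run("docker", buildCreateArgs(this.opts));
  }

  async start(): Promise<void> {
    await this.run("docker", ["start", this.opts.name]);
  }

  exec(command: string): Promise<CliResult> {
    return this.run("docker", ["exec", this.opts.name, "sh", "-c", command]);
  }

  async destroy(): Promise<void> {
    await this.run("docker", ["rm", "-f", this.opts.name]);
  }

  // Treat any inspect failure (for example, no such container) as "not running".
  async isRunning(): Promise<boolean> {
    try {
      const { stdout } = await this.run("docker", [
        "inspect", "-f", "{{.State.Running}}", this.opts.name,
      ]);
      return stdout.trim() === "true";
    } catch {
      return false;
    }
  }
}
```

Injecting the runner keeps the class free of a Docker SDK dependency and lines up with the plan to mock the Docker CLI in `docker.test.ts`.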
 ### SandboxManager
 - `getOrCreate(sessionId, config)` — Lazy container creation on first tool call
 - `destroy(sessionId)` — Stop and remove container
 - `destroyAll()` — Shutdown hook for daemon cleanup
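The session-to-container mapping described above can be sketched with a `Map` of pending creations. This is an illustration under assumptions: the `Sandbox` interface and the injected `SandboxFactory` are hypothetical, and storing promises (so concurrent first calls share one container) is a design choice of this sketch rather than something the plan specifies.

```typescript
// Minimal surface the manager needs from a sandbox (hypothetical interface).
interface Sandbox {
  destroy(): Promise<void>;
}

// Creates and starts a container for a session; injected so it can be mocked.
type SandboxFactory<C> = (sessionId: string, config: C) => Promise<Sandbox>;

class SandboxManager<C> {
  private readonly containers = new Map<string, Promise<Sandbox>>();
  private readonly factory: SandboxFactory<C>;

  constructor(factory: SandboxFactory<C>) {
    this.factory = factory;
  }

  // Lazy creation: the first call creates, later calls reuse the same promise.
  getOrCreate(sessionId: string, config: C): Promise<Sandbox> {
    const existing = this.containers.get(sessionId);
    if (existing) return existing;
    const pending = this.factory(sessionId, config);
    this.containers.set(sessionId, pending);
    // Forget failed creations so a later tool call can retry.
    pending.catch(() => {
      if (this.containers.get(sessionId) === pending) {
        this.containers.delete(sessionId);
      }
    });
    return pending;
  }

  // Stop and remove the container for one session, if any.
  async destroy(sessionId: string): Promise<void> {
    const pending = this.containers.get(sessionId);
    if (!pending) return;
    this.containers.delete(sessionId);
    const sandbox = await pending;
    await sandbox.destroy();
  }

  // Shutdown hook: tear down every known container.
  async destroyAll(): Promise<void> {
    const ids = Array.from(this.containers.keys());
    await Promise.all(ids.map((id) => this.destroy(id)));
  }
}
```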
 ### Sandboxed Tools
 - `createSandboxedShellTool(sandbox)` — Same `Tool` interface as `shell.exec`, but runs via `sandbox.exec(command)`. Preserves cwd (translated to container path), timeout, output truncation.
 - `createSandboxedProcessStartTool(sandbox)` — Wraps `process.start` to spawn via `docker exec -d` (detached mode).
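The cwd translation and output truncation mentioned above might look like the following. This is a sketch under assumptions: the helper names `toContainerPath` and `truncateOutput`, the extra parameters on `createSandboxedShellTool`, the `ExecSandbox` shape, and the truncation marker are all hypothetical, and paths are assumed to be absolute, normalized POSIX paths.

```typescript
// Map a host cwd to the equivalent path under the container mount.
// Paths outside the host workspace fall back to the container workspace root.
function toContainerPath(hostCwd: string, hostWorkspace: string, containerWorkspace: string): string {
  const hostRoot = hostWorkspace.replace(/\/+$/, "");
  const containerRoot = containerWorkspace.replace(/\/+$/, "");
  if (hostCwd === hostRoot) return containerRoot;
  if (hostCwd.startsWith(hostRoot + "/")) {
    return containerRoot + hostCwd.slice(hostRoot.length);
  }
  return containerRoot;
}

// Cap tool output at maxChars and note how much was dropped.
function truncateOutput(output: string, maxChars: number): string {
  if (output.length <= maxChars) return output;
  return output.slice(0, maxChars) + `\n[truncated ${output.length - maxChars} chars]`;
}

// Minimal sandbox surface this tool needs (hypothetical).
interface ExecSandbox {
  exec(command: string, opts: { cwd: string }): Promise<{ stdout: string; stderr: string }>;
}

interface ShellToolInput {
  command: string;
  cwd?: string;
}

// Same name as the host tool, so it can replace it in a per-session registry.
function createSandboxedShellTool(
  sandbox: ExecSandbox,
  hostWorkspace: string,
  containerWorkspace: string,
  maxChars = 10000,
) {
  return {
    name: "shell.exec",
    async execute(input: ShellToolInput): Promise<string> {
      const hostCwd = input.cwd === undefined ? hostWorkspace : input.cwd;
      const cwd = toContainerPath(hostCwd, hostWorkspace, containerWorkspace);
      const { stdout, stderr } = await sandbox.exec(input.command, { cwd });
      return truncateOutput(stdout + stderr, maxChars);
    },
  };
}
```

Comparing against `hostRoot + "/"` rather than a bare prefix keeps a sibling directory such as `/home/u/project2` from being mistaken for a path inside `/home/u/proj`.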
 ### Per-Session ToolRegistry
 When the sandbox is active for a channel session, the daemon clones the shared `ToolRegistry` and replaces `shell.exec` and `process.start` in the clone with their sandboxed versions. Every other tool in the clone is the same host implementation the shared registry uses.
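A shallow `clone()` is enough to get this behavior. The sketch below assumes a simplified `Tool` interface and a name-keyed registry; the real `ToolRegistry` API may differ, and `createSessionRegistry` is a hypothetical helper.

```typescript
// Simplified tool shape for illustration; the real Tool interface may differ.
interface Tool {
  name: string;
  execute(input: unknown): Promise<unknown>;
}

class ToolRegistry {
  private readonly tools = new Map<string, Tool>();

  // Registering a tool under an existing name replaces the previous entry.
  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  get(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  // Shallow copy: a new map that points at the same Tool objects.
  clone(): ToolRegistry {
    const copy = new ToolRegistry();
    this.tools.forEach((tool) => copy.register(tool));
    return copy;
  }
}

// Build a per-session registry whose dangerous tools are swapped for sandboxed ones.
function createSessionRegistry(shared: ToolRegistry, sandboxedTools: Tool[]): ToolRegistry {
  const session = shared.clone();
  for (const tool of sandboxedTools) {
    session.register(tool);
  }
  return session;
}
```

Because the copy is shallow, swapping `shell.exec` in a session registry leaves the shared host registry untouched.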
 ### Error Handling
 - Docker not installed → log a warning at startup and fall back to host execution
 - Container creation fails → log the error and return a tool error instead of crashing
 - Container timeout → `docker rm -f`, return timeout error
 - Docker daemon unavailable → graceful degradation with clear error messages
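A startup probe could tell the "not installed" and "daemon unavailable" cases apart. In the sketch below, `checkDocker` and the reason strings are hypothetical, and the `CliRunner` parameter stands in for a promisified `child_process.execFile`, which reports a missing binary with `code === "ENOENT"`.

```typescript
type CliRunner = (file: string, args: string[]) => Promise<{ stdout: string; stderr: string }>;

type DockerStatus =
  | { available: true }
  | { available: false; reason: string };

// `docker version` with a server-side format field has to reach the daemon,
// so it fails both when the CLI is missing and when the daemon is down.
async function checkDocker(run: CliRunner): Promise<DockerStatus> {
  try {
    await run("docker", ["version", "--format", "{{.Server.Version}}"]);
    return { available: true };
  } catch (err) {
    const missingCli =
      typeof err === "object" &&
      err !== null &&
      (err as { code?: unknown }).code === "ENOENT";
    return {
      available: false,
      reason: missingCli ? "Docker CLI not found on PATH" : "Docker daemon unavailable",
    };
  }
}
```

The daemon could log `reason` once at startup and consult `available` before wiring sandboxed tools into a session.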
 ---
 ## Feature 2: Multi-Agent Routing
 ### Goal
 Support named agent configurations that can be assigned to channels, senders, or sender patterns. Each agent config specifies its own system prompt, model tier, tool profile, and sandbox setting.
 ### Architecture
 ```
 src/agents/
  registry.ts        — AgentConfigRegistry (stores named AgentConfig objects)
  router.ts          — AgentRouter (resolves {channel, senderId} → AgentConfig)
  router.test.ts     — Tests
  index.ts           — Barrel export
 ```
 ### Config Schema
 ```yaml
 agent_configs:
  assistant:
    system_prompt: "You are a helpful assistant."
    model_tier: default
    tool_profile: messaging
    sandbox: true
  coder:
    system_prompt: "You are a coding assistant. Focus on writing clean code."
    model_tier: complex
    tool_profile: coding
    sandbox: true
 routing:
  default_agent: assistant
  channels:
    discord: coder
  senders:
    "telegram:12345": coder
    "slack:U0*": assistant
 ```
 ### AgentConfigRegistry
 Stores parsed `AgentConfig` objects by name:
 - `register(config)` — Add a named config
 - `get(name)` — Look up by name
 - `list()` — All registered configs
 - `loadFromConfig(rawConfig)` — Parse from validated YAML
 ### AgentConfig Type
 ```typescript
 interface AgentConfig {
  name: string;
  systemPrompt?: string;     // overrides global system prompt
  modelTier?: ModelTier;     // fast/default/complex/local
  toolProfile?: ToolProfile; // minimal/messaging/coding/full
  toolOverrides?: ToolOverrideConfig;
  sandbox?: boolean;         // use Docker sandbox (if globally enabled)
 }
 ```
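Putting the registry methods and the `AgentConfig` type together, `loadFromConfig` mostly maps snake_case YAML keys onto camelCase fields. This is a sketch, not the planned implementation: `RawAgentConfig` is a hypothetical name, `toolOverrides` is left out because its shape is not defined here, and the `ModelTier` and `ToolProfile` unions are inferred from the comments in the type above.

```typescript
type ModelTier = "fast" | "default" | "complex" | "local";
type ToolProfile = "minimal" | "messaging" | "coding" | "full";

interface AgentConfig {
  name: string;
  systemPrompt?: string;
  modelTier?: ModelTier;
  toolProfile?: ToolProfile;
  sandbox?: boolean;
}

// Shape of one entry under `agent_configs:` after validation (hypothetical name).
interface RawAgentConfig {
  system_prompt?: string;
  model_tier?: ModelTier;
  tool_profile?: ToolProfile;
  sandbox?: boolean;
}

class AgentConfigRegistry {
  private readonly configs = new Map<string, AgentConfig>();

  register(config: AgentConfig): void {
    this.configs.set(config.name, config);
  }

  get(name: string): AgentConfig | undefined {
    return this.configs.get(name);
  }

  list(): AgentConfig[] {
    return Array.from(this.configs.values());
  }

  // The YAML map key becomes the config name; snake_case keys become camelCase.
  loadFromConfig(raw: Record<string, RawAgentConfig>): void {
    for (const name of Object.keys(raw)) {
      const entry = raw[name];
      this.register({
        name,
        systemPrompt: entry.system_prompt,
        modelTier: entry.model_tier,
        toolProfile: entry.tool_profile,
        sandbox: entry.sandbox,
      });
    }
  }
}
```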
 ### AgentRouter
 Resolves which `AgentConfig` to use for a given message:
 1. Check `senders` map — exact match first, then glob patterns (via `minimatch`)
 2. Check `channels` map — channel name match
 3. Fall back to `routing.default_agent`
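The resolution order above can be sketched as a single function. Two assumptions are baked in: sender keys take the form `channel:senderId` (inferred from the `"telegram:12345"` example), and `globToRegExp` is a deliberately tiny stand-in for `minimatch` that understands only the `*` wildcard.

```typescript
interface RoutingConfig {
  default_agent: string;
  channels?: Record<string, string>;
  senders?: Record<string, string>;
}

// Stand-in for minimatch: supports only `*` (any run of characters).
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + escaped + "$");
}

// Own-property lookup so keys like "constructor" never hit the prototype.
function lookup(map: Record<string, string> | undefined, key: string): string | undefined {
  if (map && Object.prototype.hasOwnProperty.call(map, key)) return map[key];
  return undefined;
}

// Returns the name of the agent config to use for this message.
function resolveAgent(routing: RoutingConfig, channel: string, senderId: string): string {
  const senderKey = `${channel}:${senderId}`;

  // 1a. Exact sender match.
  const exact = lookup(routing.senders, senderKey);
  if (exact !== undefined) return exact;

  // 1b. Sender glob patterns, in config order.
  const senders = routing.senders || {};
  for (const pattern of Object.keys(senders)) {
    if (pattern.indexOf("*") !== -1 && globToRegExp(pattern).test(senderKey)) {
      return senders[pattern];
    }
  }

  // 2. Channel match.
  const byChannel = lookup(routing.channels, channel);
  if (byChannel !== undefined) return byChannel;

  // 3. Default.
  return routing.default_agent;
}
```

With the example config, a Slack sender whose ID starts with `U0` would resolve to `assistant` through the glob rule, and any Discord sender without a sender entry would resolve to `coder` through the channel rule.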
 ### Daemon Integration
 The `createMessageRouter()` function changes:
 1. On message: `agentRouter.resolve(channel, senderId)` resolves the agent config for the sender; its name is the `agentConfigName` used below
 2. Cache key: `${channel}:${senderId}:${agentConfigName}` (agent change = new orchestrator)
 3. Create `AgentOrchestrator` with resolved config's system prompt, model tier, tool policy
 4. If the sandbox is enabled both globally and for this config: create a per-session sandboxed `ToolRegistry`
 5. Otherwise: use the shared host `ToolRegistry`
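The caching and sandbox decision in these steps can be sketched as follows. The names `createOrchestratorCache`, `Orchestrator`, and the `build` callback are hypothetical, and the sketch assumes the resolver hands back a full `AgentConfig` whose `name` supplies the last segment of the cache key.

```typescript
interface AgentConfig {
  name: string;
  sandbox?: boolean;
}

// Placeholder for AgentOrchestrator; only the fields the sketch needs.
interface Orchestrator {
  agent: string;
  sandboxed: boolean;
}

type Resolve = (channel: string, senderId: string) => AgentConfig;
type Build = (config: AgentConfig, sandboxed: boolean) => Orchestrator;

function createOrchestratorCache(resolve: Resolve, sandboxGloballyEnabled: boolean, build: Build) {
  const cache = new Map<string, Orchestrator>();

  return function getOrchestrator(channel: string, senderId: string): Orchestrator {
    const config = resolve(channel, senderId);
    // Including the agent name means a routing change yields a fresh orchestrator.
    const key = `${channel}:${senderId}:${config.name}`;
    const cached = cache.get(key);
    if (cached) return cached;

    // Sandbox only when both the global switch and the agent config ask for it.
    const sandboxed = sandboxGloballyEnabled && config.sandbox === true;
    const orchestrator = build(config, sandboxed);
    cache.set(key, orchestrator);
    return orchestrator;
  };
}
```

In this sketch `build` is where the per-session sandboxed `ToolRegistry` (or the shared host one) would be chosen based on the `sandboxed` flag.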
 ---
 ## Modified Files
 - `src/config/schema.ts` — Add `sandboxSchema`, `agentConfigSchema`, `routingSchema`
 - `src/config/index.ts` — Export new types
 - `src/daemon/index.ts` — Wire SandboxManager + AgentRouter into message handler
 - `src/tools/registry.ts` — Add `clone()` method for per-session copies
 ## Testing
 - All Docker interactions mocked (no real Docker in tests)
 - Agent router tested with config fixtures (exact, glob, channel, default fallback)
 - Sandboxed tools tested with mocked Docker CLI exec
 - Integration tested via daemon message handler with mocked dependencies
 ## Dependencies
 - No new npm dependencies: Docker is driven through its CLI, and `minimatch` is either already available or its glob matching is trivial to implement
 - Runtime: Docker must be installed on the host for the sandbox feature to work (graceful degradation if absent)