William Valentin cfdd448495 docs: add Docker sandbox and multi-agent routing design/implementation plans

2026-02-06 16:52:38 -08:00


P2: Docker Sandboxing + Multi-Agent Routing — Design

Date: 2026-02-06
Status: Approved
Priority: P2 (completes all P2 work)

Feature 1: Docker Sandboxing

Goal

Channel sessions (Telegram, Discord, Slack, WhatsApp) execute shell.exec and process.start inside Docker containers. TUI and local WebSocket sessions continue running on the host.

Architecture

Tool-level wrapping: sandboxed versions of dangerous tools (shell.exec, process.start) delegate to docker exec inside a per-session container. All other tools (file.read, web.fetch, memory.*, etc.) run on the host unchanged.

src/sandbox/
  docker.ts         — DockerSandbox class (create/exec/destroy containers via CLI)
  docker.test.ts    — Tests (mocked Docker CLI)
  manager.ts        — SandboxManager (session→container mapping + lifecycle)
  manager.test.ts   — Tests
  tools.ts          — createSandboxedShellTool(), createSandboxedProcessStartTool()
  tools.test.ts     — Tests
  index.ts          — Barrel export

Config Schema

sandbox:
  enabled: false              # opt-in
  image: "node:22-slim"       # base container image
  workspace_dir: "/workspace" # mount path inside container
  network: "none"             # container network mode (none/bridge/host)
  memory_limit: "512m"        # memory limit per container
  cpu_limit: "1.0"            # CPU limit per container
  timeout_seconds: 300        # auto-kill timeout per container
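
As a rough illustration of how this block might be consumed, the sketch below merges a partial user config over the defaults listed above. The names `SandboxConfig`, `SANDBOX_DEFAULTS`, and `resolveSandboxConfig` are hypothetical; the actual `sandboxSchema` in src/config/schema.ts may use a validation library and different field names.

```typescript
// Hypothetical typed view of the `sandbox:` block; keys mirror the YAML above.
type NetworkMode = "none" | "bridge" | "host";

interface SandboxConfig {
  enabled: boolean;
  image: string;
  workspace_dir: string;
  network: NetworkMode;
  memory_limit: string;
  cpu_limit: string;
  timeout_seconds: number;
}

// Defaults copied from the schema above.
const SANDBOX_DEFAULTS: SandboxConfig = {
  enabled: false,
  image: "node:22-slim",
  workspace_dir: "/workspace",
  network: "none",
  memory_limit: "512m",
  cpu_limit: "1.0",
  timeout_seconds: 300,
};

// Merge a partial user config over the defaults and reject obviously bad values.
function resolveSandboxConfig(raw: Partial<SandboxConfig> = {}): SandboxConfig {
  const cfg = { ...SANDBOX_DEFAULTS, ...raw };
  if (!["none", "bridge", "host"].includes(cfg.network)) {
    throw new Error(`sandbox.network must be none, bridge, or host (got ${cfg.network})`);
  }
  if (!Number.isFinite(cfg.timeout_seconds) || cfg.timeout_seconds <= 0) {
    throw new Error("sandbox.timeout_seconds must be a positive number");
  }
  return cfg;
}
```

Because `enabled` defaults to `false`, a config file that omits the `sandbox:` block entirely keeps today's host-execution behavior.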

DockerSandbox Class

Wraps Docker CLI via child_process.execFile (no Docker SDK dependency):

create() — docker create with resource limits, bind mount, network mode
start() — docker start
exec(command, opts) — docker exec with timeout, returns stdout/stderr
destroy() — docker rm -f
isRunning() — docker inspect check
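
The methods above could map onto Docker CLI invocations roughly as follows. This is a sketch, not the project's implementation: `CreateOptions`, `buildCreateArgs`, `buildExecArgs`, and `runDocker` are hypothetical names, and keeping the container alive with `sleep infinity` is an assumption about how later `docker exec` calls get a running target.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical option shape; field names are illustrative only.
interface CreateOptions {
  name: string;         // container name, e.g. derived from the session id
  image: string;
  hostDir: string;      // host directory to bind-mount
  workspaceDir: string; // mount path inside the container
  network: string;
  memoryLimit: string;
  cpuLimit: string;
}

// Build argv for `docker create`. Passing an argv array to execFile means
// none of these values are interpreted by a host shell.
function buildCreateArgs(o: CreateOptions): string[] {
  return [
    "create",
    "--name", o.name,
    "--network", o.network,
    "--memory", o.memoryLimit,
    "--cpus", o.cpuLimit,
    "-v", `${o.hostDir}:${o.workspaceDir}`,
    "-w", o.workspaceDir,
    o.image,
    "sleep", "infinity", // assumption: idle main process so the container stays up
  ];
}

// Build argv for `docker exec`; the user command runs under `sh -c` in the container.
function buildExecArgs(name: string, command: string, cwd?: string): string[] {
  const args = ["exec"];
  if (cwd) args.push("-w", cwd);
  args.push(name, "sh", "-c", command);
  return args;
}

// Run the Docker CLI. `timeout` (ms) makes Node kill the CLI process if it hangs.
async function runDocker(
  args: string[],
  timeoutMs: number,
): Promise<{ stdout: string; stderr: string }> {
  return execFileAsync("docker", args, { timeout: timeoutMs, maxBuffer: 1024 * 1024 });
}
```

Killing the `docker exec` client on timeout may leave the process running inside the container, which is one reason the design pairs timeouts with `docker rm -f`.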

SandboxManager

getOrCreate(sessionId, config) — Lazy container creation on first tool call
destroy(sessionId) — Stop and remove container
destroyAll() — Shutdown hook for daemon cleanup
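
One way to get lazy, per-session creation without racing is to store the creation promise rather than the finished container. The sketch below assumes a minimal `Sandbox` handle and an injected `SandboxFactory`; both are hypothetical stand-ins for the real `DockerSandbox` wiring.

```typescript
// Hypothetical minimal container handle; the real DockerSandbox exposes more.
interface Sandbox {
  destroy(): Promise<void>;
}

type SandboxFactory<C> = (sessionId: string, config: C) => Promise<Sandbox>;

class SandboxManager<C> {
  private readonly sandboxes = new Map<string, Promise<Sandbox>>();
  private readonly factory: SandboxFactory<C>;

  constructor(factory: SandboxFactory<C>) {
    this.factory = factory;
  }

  // Store the promise, not the resolved sandbox, so two concurrent tool calls
  // for the same session share a single container creation.
  getOrCreate(sessionId: string, config: C): Promise<Sandbox> {
    let pending = this.sandboxes.get(sessionId);
    if (!pending) {
      pending = this.factory(sessionId, config);
      // Forget failed creations so the next tool call can retry.
      pending.catch(() => {
        if (this.sandboxes.get(sessionId) === pending) this.sandboxes.delete(sessionId);
      });
      this.sandboxes.set(sessionId, pending);
    }
    return pending;
  }

  async destroy(sessionId: string): Promise<void> {
    const pending = this.sandboxes.get(sessionId);
    if (!pending) return;
    this.sandboxes.delete(sessionId);
    const sandbox = await pending.catch(() => undefined);
    await sandbox?.destroy();
  }

  // Best-effort cleanup for the daemon shutdown hook: one failure does not
  // prevent the remaining containers from being removed.
  async destroyAll(): Promise<void> {
    const ids = Array.from(this.sandboxes.keys());
    await Promise.all(ids.map((id) => this.destroy(id).catch(() => undefined)));
  }
}
```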

Sandboxed Tools

createSandboxedShellTool(sandbox) — Same Tool interface as shell.exec, but runs via sandbox.exec(command). Preserves cwd (translated to a container path), timeout, and output truncation.
createSandboxedProcessStartTool(sandbox) — Wraps process.start to spawn via docker exec -d (detached mode).
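
The cwd translation mentioned above is the part most likely to go wrong, so here is a minimal sketch of it. The function names and the decision to reject a cwd outside the mounted directory are assumptions; the design does not say how out-of-workspace paths are handled. The example assumes a POSIX host.

```typescript
import * as path from "node:path";

// Map a host cwd to its path inside the container. Returns the mount root when
// cwd is omitted, and throws if cwd lies outside the mounted host directory.
function toContainerPath(hostRoot: string, containerRoot: string, cwd?: string): string {
  if (!cwd) return containerRoot;
  const root = path.resolve(hostRoot);
  const rel = path.relative(root, path.resolve(root, cwd));
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error(`cwd is outside the sandbox workspace: ${cwd}`);
  }
  // Container paths are always POSIX, whatever the host platform is.
  return path.posix.join(containerRoot, ...rel.split(path.sep));
}

// Cap tool output so a noisy command cannot flood the conversation.
// The marker format is illustrative.
function truncateOutput(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;
  return text.slice(0, maxChars) + `\n[truncated ${text.length - maxChars} chars]`;
}
```

A relative cwd is resolved against the host workspace root here, so `src` and `/home/u/proj/src` translate to the same container path when `/home/u/proj` is the mounted directory.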

Per-Session ToolRegistry

When sandbox is active for a channel session, the daemon creates a cloned ToolRegistry that replaces shell.exec and process.start with sandboxed versions. All other tools reference the shared host registry.
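
A shallow clone is enough for this: the per-session registry gets its own name-to-tool map but shares the tool objects themselves. The sketch below uses a deliberately minimal `Tool` shape and a hypothetical `createSessionRegistry` helper; the project's real interfaces likely carry more fields.

```typescript
// Hypothetical minimal Tool shape for illustration.
interface Tool {
  name: string;
  execute(args: Record<string, unknown>): Promise<string>;
}

class ToolRegistry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    this.tools.set(tool.name, tool); // registering an existing name replaces it
  }

  get(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  // Shallow copy: the Map is new, the Tool objects are shared with the original.
  clone(): ToolRegistry {
    const copy = new ToolRegistry();
    this.tools.forEach((tool) => copy.register(tool));
    return copy;
  }
}

// Build the per-session registry: clone the host registry, then overwrite the
// dangerous tools with their sandboxed counterparts (matched by name).
function createSessionRegistry(host: ToolRegistry, sandboxedTools: Tool[]): ToolRegistry {
  const session = host.clone();
  for (const tool of sandboxedTools) session.register(tool);
  return session;
}
```

Because replacement happens on the clone, the host registry used by TUI and local WebSocket sessions is never mutated.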

Error Handling

Docker not installed → log warning at startup, fall through to host execution
Container creation fails → log error, return tool error (not crash)
Container timeout → docker rm -f, return timeout error
Docker daemon unavailable → graceful degradation with clear error messages
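
The cases above can be told apart from the error that `child_process.execFile` produces. The sketch below relies on two documented Node behaviors: `code` is `"ENOENT"` when the executable is not found, and `killed` is `true` when the `timeout` option terminated the child. The `SandboxFailure` categories and message wording are illustrative, not taken from the source.

```typescript
type SandboxFailure = "docker_not_installed" | "timeout" | "failed";

// Classify an error thrown by execFile("docker", ...).
function classifyDockerError(err: unknown): SandboxFailure {
  const e = (err ?? {}) as { code?: unknown; killed?: unknown };
  if (e.code === "ENOENT") return "docker_not_installed"; // binary not on PATH
  if (e.killed === true) return "timeout";                 // killed by the timeout option
  return "failed"; // non-zero exit: daemon unreachable, bad image, command error, ...
}

// Convert the failure into a tool-level error object instead of throwing,
// so a sandbox problem surfaces in the conversation rather than crashing the daemon.
function toToolError(err: unknown): { ok: false; error: string } {
  const messages: Record<SandboxFailure, string> = {
    docker_not_installed: "Sandbox unavailable: the docker CLI was not found on this host.",
    timeout: "Sandboxed command timed out; the container was removed.",
    failed: "Sandboxed command failed; check that the Docker daemon is running.",
  };
  return { ok: false, error: messages[classifyDockerError(err)] };
}
```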

Feature 2: Multi-Agent Routing

Goal

Support named agent configurations that can be assigned to channels, individual senders, or sender patterns. Each agent config specifies its own system prompt, model tier, tool profile, and sandbox setting.

Architecture

src/agents/
  registry.ts        — AgentConfigRegistry (stores named AgentConfig objects)
  router.ts          — AgentRouter (resolves {channel, senderId} → AgentConfig)
  router.test.ts     — Tests
  index.ts           — Barrel export

Config Schema

agent_configs:
  assistant:
    system_prompt: "You are a helpful assistant."
    model_tier: default
    tool_profile: messaging
    sandbox: true

  coder:
    system_prompt: "You are a coding assistant. Focus on writing clean code."
    model_tier: complex
    tool_profile: coding
    sandbox: true

routing:
  default_agent: assistant
  channels:
    discord: coder
  senders:
    "telegram:12345": coder
    "slack:U0*": assistant

AgentConfigRegistry

Stores parsed AgentConfig objects by name:

register(config) — Add a named config
get(name) — Look up by name
list() — All registered configs
loadFromConfig(rawConfig) — Parse from validated YAML

AgentConfig Type

interface AgentConfig {
  name: string;
  systemPrompt?: string;     // overrides global system prompt
  modelTier?: ModelTier;     // fast/default/complex/local
  toolProfile?: ToolProfile; // minimal/messaging/coding/full
  toolOverrides?: ToolOverrideConfig;
  sandbox?: boolean;         // use Docker sandbox (if globally enabled)
}
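
Putting the registry operations and the type together, a minimal sketch might look like this. It assumes `loadFromConfig` receives the already-validated `agent_configs` map with snake_case keys; `RawAgentConfig` is a hypothetical name, and `toolOverrides` is omitted because its shape is not given here.

```typescript
type ModelTier = "fast" | "default" | "complex" | "local";
type ToolProfile = "minimal" | "messaging" | "coding" | "full";

interface AgentConfig {
  name: string;
  systemPrompt?: string;
  modelTier?: ModelTier;
  toolProfile?: ToolProfile;
  sandbox?: boolean;
}

// Hypothetical shape of one entry under `agent_configs:` after validation.
interface RawAgentConfig {
  system_prompt?: string;
  model_tier?: ModelTier;
  tool_profile?: ToolProfile;
  sandbox?: boolean;
}

class AgentConfigRegistry {
  private readonly configs = new Map<string, AgentConfig>();

  // Assumption: a later registration with the same name replaces the earlier one.
  register(config: AgentConfig): void {
    this.configs.set(config.name, config);
  }

  get(name: string): AgentConfig | undefined {
    return this.configs.get(name);
  }

  list(): AgentConfig[] {
    return Array.from(this.configs.values());
  }

  // Translate snake_case YAML keys into the camelCase AgentConfig fields.
  loadFromConfig(raw: Record<string, RawAgentConfig>): void {
    for (const name of Object.keys(raw)) {
      const entry = raw[name];
      this.register({
        name,
        systemPrompt: entry.system_prompt,
        modelTier: entry.model_tier,
        toolProfile: entry.tool_profile,
        sandbox: entry.sandbox,
      });
    }
  }
}
```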

AgentRouter

Resolves which AgentConfig to use for a given message:

Check senders map — exact match first, then glob patterns (via minimatch)
Check channels map — channel name match
Fall back to routing.default_agent
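
The resolution order above can be sketched as a single function. Two assumptions are worth flagging: sender keys are taken to be `channel:senderId`, inferred from entries such as `"telegram:12345"` in the config, and the glob matcher below is a `*`-only stand-in written for this example rather than minimatch itself.

```typescript
interface RoutingConfig {
  default_agent: string;
  channels?: Record<string, string>;
  senders?: Record<string, string>;
}

// Minimal `*`-only glob: escape regex metacharacters, then let `*` match any run
// of characters. A stand-in for minimatch, which supports far more syntax.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Returns the name of the agent config to use for this message.
function resolveAgent(routing: RoutingConfig, channel: string, senderId: string): string {
  const senders = routing.senders ?? {};
  const key = `${channel}:${senderId}`;

  // 1a. Exact sender match wins over everything else.
  if (Object.prototype.hasOwnProperty.call(senders, key)) return senders[key];

  // 1b. Sender glob patterns, in config order.
  for (const pattern of Object.keys(senders)) {
    if (pattern.includes("*") && globToRegExp(pattern).test(key)) return senders[pattern];
  }

  // 2. Channel-wide assignment.
  const byChannel = routing.channels?.[channel];
  if (byChannel) return byChannel;

  // 3. Default.
  return routing.default_agent;
}
```

With the example config, a Slack sender `U0ABC` resolves to `assistant` through the `slack:U0*` pattern, and any Discord sender resolves to `coder` through the channel map.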

Daemon Integration

The createMessageRouter() function changes:

On message: agentRouter.resolve(channel, senderId) returns agent config name
Cache key: ${channel}:${senderId}:${agentConfigName} (agent change = new orchestrator)
Create AgentOrchestrator with resolved config's system prompt, model tier, tool policy
If sandbox enabled for this config + globally: create per-session sandboxed ToolRegistry
Otherwise: use shared host ToolRegistry
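
The caching and sandbox decision in these steps might be sketched as follows. `OrchestratorCache` and `shouldSandbox` are hypothetical names, and the orchestrator is left generic because its constructor is not described here.

```typescript
// Cache orchestrators by channel, sender, and agent config name. Including the
// agent name in the key means a routing change yields a fresh orchestrator
// instead of reusing one built with the old system prompt and tools.
class OrchestratorCache<T> {
  private readonly entries = new Map<string, T>();
  private readonly create: (agentName: string) => T;

  constructor(create: (agentName: string) => T) {
    this.create = create;
  }

  static key(channel: string, senderId: string, agentName: string): string {
    return `${channel}:${senderId}:${agentName}`;
  }

  get(channel: string, senderId: string, agentName: string): T {
    const key = OrchestratorCache.key(channel, senderId, agentName);
    let entry = this.entries.get(key);
    if (entry === undefined) {
      entry = this.create(agentName);
      this.entries.set(key, entry);
    }
    return entry;
  }
}

// Sandbox only when the global switch and the agent's own flag are both on.
function shouldSandbox(globalEnabled: boolean, agentSandbox: boolean | undefined): boolean {
  return globalEnabled && agentSandbox === true;
}
```

This sketch never evicts entries, so an orchestrator cached under a previous agent name stays in memory until the process exits; whether that matters depends on how often routing changes.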

Modified Files

src/config/schema.ts — Add sandboxSchema, agentConfigSchema, routingSchema
src/config/index.ts — Export new types
src/daemon/index.ts — Wire SandboxManager + AgentRouter into message handler
src/tools/registry.ts — Add clone() method for per-session copies

Testing

All Docker interactions mocked (no real Docker in tests)
Agent router tested with config fixtures (exact, glob, channel, default fallback)
Sandboxed tools tested with mocked Docker CLI exec
Integration tested via daemon message handler with mocked dependencies
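
One way to keep real Docker out of the test suite is to pass the CLI runner in as a parameter and substitute a recording fake. The sketch below is framework-agnostic; `DockerCli`, `createFakeDockerCli`, and this standalone `isRunning` are hypothetical and only illustrate the pattern.

```typescript
type DockerCli = (args: string[]) => Promise<{ stdout: string; stderr: string }>;

// Fake CLI: records every argv it receives and answers from a canned table
// keyed by subcommand. Unknown subcommands fail loudly.
function createFakeDockerCli(responses: Record<string, string>) {
  const calls: string[][] = [];
  const cli: DockerCli = async (args) => {
    calls.push(args);
    const subcommand = args[0] ?? "";
    if (!(subcommand in responses)) throw new Error(`unexpected docker ${subcommand}`);
    return { stdout: responses[subcommand], stderr: "" };
  };
  return { cli, calls };
}

// Code under test takes the CLI as a dependency instead of calling execFile directly.
async function isRunning(cli: DockerCli, containerName: string): Promise<boolean> {
  const { stdout } = await cli(["inspect", "--format", "{{.State.Running}}", containerName]);
  return stdout.trim() === "true";
}
```

A test can then assert both on the result and on `calls`, which checks that the expected `docker` arguments were built without spawning a process.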

Dependencies

No new npm dependencies: the Docker CLI is invoked through child_process, and glob matching uses minimatch if it is already available or a trivial in-house matcher otherwise
Runtime: Docker must be installed on host for sandbox feature to work (graceful degradation if absent)
