docs: add safety docs and OpenClaw gap roadmap
@@ -0,0 +1,206 @@
# Agent-Oriented Project Diagram

This is a high-signal, agent-oriented view of Flynn's structure and execution flow.

If you're new to the codebase, start here, then jump to the referenced files.

## Big Picture (Runtime Data Flow)

```text
Inbound Message
(Telegram/Discord/Slack/WhatsApp/WebChat)
      |
      v
ChannelAdapter -> ChannelRegistry
      |                  |
      |                  v
      |        createMessageRouter()
      |                  |
      |                  v
      |           SessionManager
      |                  |
      |                  v
      |          AgentOrchestrator
      |                  |
      |                  v
      |             NativeAgent
      |                  |
      |         ModelRouter.chat()
      |                  |
      |                  v
      |             ModelClient
      |
      +----> (optional) PairingManager gate for unknown senders

Tool Calls (inside NativeAgent loop)
NativeAgent -> ToolRegistry (policy-filtered) -> ToolExecutor
     |                                                |
     |                                                v
     |                                      HookEngine + autonomy
     |                                                |
     |                                                v
     |                                         Tool.execute()
     |                                                |
     |                                                v
     +-----------------------------------> AuditLogger (redacted)

Outbound Reply
  -> ChannelAdapter.send() (text + optional attachments)
```

Key files:

- Routing + per-session agent creation: `src/daemon/routing.ts`
- Orchestration: `src/backends/native/orchestrator.ts`
- Tool loop: `src/backends/native/agent.ts`
- Model routing: `src/models/router.ts`
- Tool policy + execution: `src/tools/policy.ts`, `src/tools/executor.ts`

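The inbound path in the diagram can be sketched roughly as follows. This is a minimal illustration, not Flynn's actual code: the `InboundMessage` fields, the `isPaired` callback, and the `makeRouter` helper are hypothetical names. Only the ordering (optional pairing gate, then session lookup, then one agent per session) is taken from the diagram.

```typescript
// Hypothetical shapes for illustration; the real types live in the files listed above.
interface InboundMessage {
  channel: string; // e.g. "telegram"
  sender: string;
  text: string;
}

interface Agent {
  process(userMessage: string): string;
}

type AgentFactory = (sessionKey: string) => Agent;

// Returns a router that gates unknown senders, then reuses one agent per session.
function makeRouter(
  createAgent: AgentFactory,
  isPaired: (msg: InboundMessage) => boolean = () => true, // optional pairing gate
) {
  const agents = new Map<string, Agent>();

  return function route(msg: InboundMessage): string | null {
    // Unknown senders are stopped before any session or agent is created.
    if (!isPaired(msg)) return null;

    // Assumption: a session is identified by the (channel, sender) pair.
    const sessionKey = `${msg.channel}:${msg.sender}`;
    let agent = agents.get(sessionKey);
    if (!agent) {
      agent = createAgent(sessionKey);
      agents.set(sessionKey, agent);
    }
    return agent.process(msg.text);
  };
}

// Usage: an echo agent that counts turns per session.
const route = makeRouter(
  (key) => {
    let turns = 0;
    return { process: (text) => `[${key} #${++turns}] ${text}` };
  },
  (msg) => msg.sender !== "stranger",
);
```

Keeping the gate ahead of the session lookup means an unpaired sender never allocates a session or an agent, which is the property the `PairingManager` branch in the diagram suggests.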

## Component Graph (Agent-Safety Boundary)

```text
                      +-------------+-------------+
                      |          Config           |
                      |    (Zod schema + YAML)    |
                      |   src/config/schema.ts    |
                      +-------------+-------------+
                                    |
                                    v
+------------------------+   +--------------+   +-----------------------+
|     SkillRegistry      |   |  ToolPolicy  |   |      HookEngine       |
|      src/skills/*      |   | src/tools/*  |   |      src/hooks/*      |
+-----------+------------+   +------+-------+   +-----------+-----------+
            |                       |                       |
            | (system prompt)       | (allow/deny)          | (confirm/log/silent)
            v                       v                       v
+------------------------+   +--------------+   +-----------------------+
|     System Prompt      |   | ToolRegistry |   |     ToolExecutor      |
| src/daemon/services.ts |   | src/tools/*  |   | src/tools/executor.ts |
+-----------+------------+   +------+-------+   +-----------+-----------+
            |                       |                       |
            v                       |                       |
+------------------------+          |                       v
|   AgentOrchestrator    |          |                +-------------+
|     src/backends/*     |          +--------------> | AuditLogger |
+-----------+------------+                           | src/audit/* |
            |                                        +-------------+
            v
+------------------------+
|      NativeAgent       |
|     src/backends/*     |
+-----------+------------+
            |
            v
+------------------------+
|      ModelRouter       |
|      src/models/*      |
+------------------------+
```

## Skills + Capabilities (What Gets Enforced)

Skills are local directories with:

- `SKILL.md` (instructions injected into the system prompt)
- `manifest.json` (metadata + optional `permissions`)

### Skill permissions enforcement points

- Tool availability: `ToolPolicy.resolveAllowedNames()` intersects the allowed tools with the `permissions` declared in the skill's `manifest.json`.
- Tool execution (defense in depth): `ToolExecutor.execute()` enforces:
  - fs allowlists (`permissions.fs.read` / `permissions.fs.write`)
  - net allowlists (best-effort for `web.fetch`)
  - secret scopes (tools declare `requiredSecretScopes`, skills allow `permissions.secrets`)
  - injection guard when untrusted content is present

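Two of the executor-side checks listed above, the fs allowlists and the secret scopes, might look roughly like this. It is a sketch under stated assumptions rather than the code in `src/tools/executor.ts`: the helper names (`normalizePosix`, `isPathAllowed`, `missingSecretScopes`) are hypothetical, allowlist entries are assumed to be absolute POSIX directory roots, and symlinks are not resolved.

```typescript
// Hypothetical permission shape; the field names fs.read, fs.write, and secrets
// come from the text, everything else is an assumption.
interface SkillPermissions {
  fs?: { read?: string[]; write?: string[] };
  secrets?: string[];
}

// Collapse "." and ".." segments so "../" tricks cannot escape an allowed root.
// Assumes absolute POSIX paths; relative input is treated as rooted at "/".
function normalizePosix(path: string): string {
  const out: string[] = [];
  for (const seg of path.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return "/" + out.join("/");
}

// A path is allowed when it equals an allowlisted root or sits underneath one.
function isPathAllowed(path: string, roots: string[] = []): boolean {
  const target = normalizePosix(path);
  return roots.some((root) => {
    const base = normalizePosix(root);
    return target === base || target.startsWith(base === "/" ? "/" : base + "/");
  });
}

// Scopes a tool requires that the skill has not granted.
function missingSecretScopes(required: string[], perms: SkillPermissions): string[] {
  const granted = new Set(perms.secrets ?? []);
  return required.filter((scope) => !granted.has(scope));
}

// Example permissions for a skill that reads a workspace and writes to one subfolder.
const perms: SkillPermissions = {
  fs: { read: ["/workspace"], write: ["/workspace/out"] },
  secrets: ["github"],
};
```

Comparing against `base + "/"` rather than the bare prefix keeps a sibling such as `/workspace-evil` from matching `/workspace`, and an absent allowlist denies everything, in line with the deny-leaning defaults described in this section.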

Important default:

- If a request is routed into a skill context but the skill's `manifest.json` has no `permissions` block, **tool access is denied**.

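The intersection and the deny-by-default rule can be illustrated with a small sketch. The signature of `ToolPolicy.resolveAllowedNames()` is not shown in this document, so the free function `intersectAllowedTools` and the `permissions.tools` field below are hypothetical stand-ins.

```typescript
// Hypothetical manifest shape: a `tools` list under `permissions` is an assumption.
interface SkillManifest {
  name: string;
  permissions?: { tools?: string[] };
}

// Tools the model may see: policy-allowed names narrowed by the active skill.
function intersectAllowedTools(policyAllowed: string[], skill?: SkillManifest): string[] {
  // No skill context: the global policy alone decides.
  if (!skill) return [...policyAllowed];

  // Skill context without a permissions block: deny all tool access.
  if (!skill.permissions) return [];

  // Otherwise keep only names that both the policy and the skill allow.
  const granted = new Set(skill.permissions.tools ?? []);
  return policyAllowed.filter((name) => granted.has(name));
}

const policyAllowed = ["web.fetch", "shell.exec", "process.start"];
```

Because the result is an intersection, a skill can only narrow what the policy already allows; listing a tool in the manifest never grants access the policy has withheld.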

Key files:

- Skill manifest types: `src/skills/types.ts`
- Loader validation: `src/skills/loader.ts`
- Policy intersection: `src/tools/policy.ts`
- Executor enforcement: `src/tools/executor.ts`

## Sandbox Execution (High-Risk Tools)

Flynn supports per-session Docker sandboxes.

Where sandboxing is applied today:

- `shell.exec` and `process.start` can be replaced with sandboxed implementations.
- Replacement is wired in `src/daemon/routing.ts` by cloning the ToolRegistry and swapping the tool implementations.

Skill context default:

- High-risk tool execution defaults to `sandbox` in skill context (when available).
- A skill can opt into host execution only by setting `permissions.execution_environment: "host"`.

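The clone-and-swap wiring could be sketched as below. The `Tool` and `ToolRegistry` shapes here are deliberately tiny, hypothetical stand-ins for the real classes, and `registryForSession` is an invented helper that models only the skill-context default described above.

```typescript
// Hypothetical minimal tool and registry shapes, for illustration only.
interface Tool {
  name: string;
  execute(args: string): string;
}

class ToolRegistry {
  private tools = new Map<string, Tool>();

  register(tool: Tool): this {
    this.tools.set(tool.name, tool);
    return this;
  }

  get(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  // Shallow copy: the clone can be modified without touching the shared registry.
  clone(): ToolRegistry {
    const copy = new ToolRegistry();
    this.tools.forEach((tool) => copy.register(tool));
    return copy;
  }
}

type ExecutionEnvironment = "sandbox" | "host";
const HIGH_RISK = ["shell.exec", "process.start"];

// Builds the per-session registry, swapping high-risk tools for sandboxed
// versions unless the skill explicitly opted into host execution.
function registryForSession(
  base: ToolRegistry,
  sandboxed: Map<string, Tool>,
  requested?: ExecutionEnvironment, // stands in for permissions.execution_environment
): ToolRegistry {
  const registry = base.clone();
  if (requested === "host") return registry;
  for (const name of HIGH_RISK) {
    const replacement = sandboxed.get(name);
    if (replacement) registry.register(replacement); // only "when available"
  }
  return registry;
}

// Usage: one host tool and one sandboxed replacement with the same name.
const base = new ToolRegistry().register({
  name: "shell.exec",
  execute: (cmd) => `host:${cmd}`,
});
const sandboxed = new Map<string, Tool>([
  ["shell.exec", { name: "shell.exec", execute: (cmd) => `sandbox:${cmd}` }],
]);
```

Swapping on a clone leaves the shared registry untouched, so one session's sandbox choice cannot leak into another session.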

Key files:

- Sandbox lifecycle: `src/sandbox/manager.ts`, `src/sandbox/docker.ts`
- Sandboxed tool wrappers: `src/sandbox/tools.ts`
- Wiring: `src/daemon/routing.ts`

## Prompt Injection Hardening (Practical)

Flynn treats content provenance as part of the control boundary:

- `web.fetch`, `web.search`, and `browser.content` outputs are treated as untrusted "fetched_content".
- Tool results are wrapped in provenance markers inside the tool loop.
- Once untrusted content is seen, ToolExecutor applies stricter gating (blocks obvious injection patterns for high-risk tools).

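A rough sketch of how provenance wrapping and the stricter gate could fit together is shown here. The marker format, the `LoopState` flag, the two regex patterns, and the choice of `shell.exec` and `process.start` as the high-risk set are all assumptions made for illustration; the actual patterns and markers are defined in the files listed below.

```typescript
// Tools whose output is treated as untrusted (from the list above).
const UNTRUSTED_SOURCES = new Set(["web.fetch", "web.search", "browser.content"]);
// Assumption: the high-risk set matches the sandboxed tools.
const HIGH_RISK_TOOLS = new Set(["shell.exec", "process.start"]);

// Illustrative patterns only; a real guard would need a broader, tested list.
const INJECTION_PATTERNS = [
  /ignore (all )?previous instructions/i,
  /curl[^|]*\|\s*(ba)?sh/i,
];

interface LoopState {
  sawUntrusted: boolean;
}

// Wrap a tool result in a provenance marker and remember whether it was untrusted.
function wrapResult(tool: string, output: string, state: LoopState): string {
  const untrusted = UNTRUSTED_SOURCES.has(tool);
  if (untrusted) state.sawUntrusted = true;
  const kind = untrusted ? "fetched_content" : "tool_output";
  return `<${kind} source="${tool}">\n${output}\n</${kind}>`;
}

// Stricter gate: applies only after untrusted content was seen, and only to high-risk tools.
function guardToolCall(tool: string, args: string, state: LoopState): "allow" | "block" {
  if (!state.sawUntrusted || !HIGH_RISK_TOOLS.has(tool)) return "allow";
  return INJECTION_PATTERNS.some((pattern) => pattern.test(args)) ? "block" : "allow";
}
```

The flag is sticky for the rest of the loop: once any fetched content has entered the conversation, later high-risk calls are checked even if they are several steps removed from the fetch.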

Key files:

- Provenance wrapping: `src/backends/native/agent.ts`
- Tool-call guard: `src/tools/executor.ts`
- System prompt safety guidance: `src/daemon/services.ts`

## Mermaid (For Fast Visual Scanning)

If your renderer supports Mermaid, the sequence diagram below shows the same flow.

```mermaid
sequenceDiagram
  autonumber
  participant U as User
  participant CA as ChannelAdapter
  participant CR as ChannelRegistry
  participant SM as SessionManager
  participant AR as AgentOrchestrator
  participant NA as NativeAgent
  participant MR as ModelRouter
  participant MC as ModelClient
  participant TP as ToolPolicy/Registry
  participant TE as ToolExecutor
  participant HE as HookEngine
  participant AL as AuditLogger

  U->>CA: message
  CA->>CR: onMessage(InboundMessage)
  CR->>SM: getSession(channel, sender)
  SM-->>CR: Session
  CR->>AR: getOrCreateAgent(session + routing)
  AR->>NA: process(userMessage)
  NA->>MR: chat(messages + tools)
  MR->>MC: provider request
  MC-->>MR: response (content or tool_calls)
  MR-->>NA: ChatResponse

  alt model requests tool use
    NA->>TP: filtered tool list (skill + policy)
    NA->>TE: execute(tool, args, context)
    TE->>HE: confirm/log/silent (autonomy)
    HE-->>TE: approved/denied
    TE->>AL: audit (redacted)
    TE-->>NA: ToolResult
    NA->>MR: chat(tool_result blocks)
  end

  NA-->>AR: assistant response
  AR-->>CR: OutboundMessage
  CR-->>CA: send()
  CA-->>U: reply
```
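The `alt model requests tool use` branch is, in effect, a loop: the agent keeps calling the model until a response arrives without tool calls. A minimal sketch of that loop follows. The types are simplified and hypothetical, the functions are synchronous for brevity (a real model call would be async), and the `maxSteps` cap is an added assumption, not something this document specifies.

```typescript
// Hypothetical, simplified shapes; real provider responses carry more fields.
interface ToolCall {
  name: string;
  args: string;
}
type ChatResponse = { content: string; toolCalls?: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };

type Chat = (messages: Message[]) => ChatResponse; // stands in for ModelRouter.chat()
type Execute = (call: ToolCall) => string; // stands in for ToolExecutor.execute()

// Keep calling the model until it answers without requesting tools,
// feeding each tool result back as a new message.
function runToolLoop(chat: Chat, execute: Execute, userMessage: string, maxSteps = 8): string {
  const messages: Message[] = [{ role: "user", content: userMessage }];
  for (let step = 0; step < maxSteps; step++) {
    const response = chat(messages);
    if (!response.toolCalls || response.toolCalls.length === 0) return response.content;
    messages.push({ role: "assistant", content: response.content });
    for (const call of response.toolCalls) {
      messages.push({ role: "tool", content: execute(call) });
    }
  }
  throw new Error(`tool loop did not finish within ${maxSteps} steps`);
}

// Usage: a fake model that requests one tool call, then answers with its result.
const fakeChat: Chat = (messages) =>
  messages.some((m) => m.role === "tool")
    ? { content: `done: ${messages[messages.length - 1].content}` }
    : { content: "", toolCalls: [{ name: "web.fetch", args: "https://example.com" }] };
```

Policy filtering, hook approval, and audit logging from the diagram would all sit inside `execute`, which is why the loop itself can stay this small.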