docs(atlas): add tool security review checklists

2026-06-23 10:44:01 -07:00
parent 8c6b2b5b47
commit 70ee83e265
3 changed files with 373 additions and 0 deletions
@@ -0,0 +1,112 @@
 # Atlas Cron Async-Subagent Audit — 2026-06-23
 Source: `~/.hermes/cron/jobs.json` and the Hermes v0.17 async-subagent idea from [[Daily Research/2026-06-23 - Hermes AI Brief]].
 Related: [[Safer Autonomy and Permission Tiers]], [[MCP Tool Security Checklist]]
 ## Rule of thumb
 Use async/background subagents only when the child work is optional, independently reportable, or can write a durable artifact that the parent can inspect later. Keep jobs synchronous when the final message depends on a single coherent answer, prompt/state ordering, or immediate side-effect verification.
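The rule of thumb can be written down as a small predicate. This is only a sketch: `JobTraits` and `async_children_ok` are hypothetical names, the traits are assigned by a human reviewer, and treating any sync-only trait as a veto is one reading of the rule rather than documented Hermes behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobTraits:
    """Hypothetical traits a reviewer assigns to a cron job by hand."""
    child_work_optional: bool = False
    independently_reportable: bool = False
    writes_durable_artifact: bool = False
    needs_single_coherent_answer: bool = False
    depends_on_prompt_or_state_ordering: bool = False
    needs_immediate_side_effect_check: bool = False


def async_children_ok(traits: JobTraits) -> bool:
    """True only when an async-friendly trait holds and no sync-only trait does."""
    sync_only = (
        traits.needs_single_coherent_answer
        or traits.depends_on_prompt_or_state_ordering
        or traits.needs_immediate_side_effect_check
    )
    async_friendly = (
        traits.child_work_optional
        or traits.independently_reportable
        or traits.writes_durable_artifact
    )
    return async_friendly and not sync_only
```

A job with no traits set stays synchronous, which matches the conservative default of this note.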
 ## Good candidates for async children
 - `daily-hermes-ai-research-brief`
  - Why: independent searches/source-reading can split into Hermes release/docs, broader AI-agent infra, and coding-agent/MCP patterns.
  - Constraint: parent must synthesize final brief and save the Obsidian copy after children return or write artifacts.
  - Safe pattern: child tasks produce source summaries only; parent owns final claims/citations.
 - `weekly-skill-community-capability-scout`
  - Why: same research fan-out shape as daily brief, low side-effect risk.
  - Constraint: no memory writes from children; parent recommends skill/cron/Kanban/n8n placement.
 - `weekly-homelab-helm-chart-update-audit`
  - Why: can parallelize changelog/release-note lookup per chart after live app inventory is captured.
  - Constraint: cluster/repo inspection and risk classification stay in parent; children must be read-only and must not run kubectl/helm mutations.
 - `daily-proactive-dreaming-brief`
  - Why (maybe): session/Obsidian lookups could become child tasks, but only when the pre-run bundle surfaces clearly separable topics.
  - Constraint: privacy/minimization matters; default should remain synchronous unless there is a real latency/quality win.
 ## Keep synchronous / script-only
 - All `no_agent: true` watchdogs and direct device automations:
  - `atlas-minio-self-backup`
  - `system threshold watchdog`
  - `blocked kanban escalation`
  - `system-hermes-nightly-watchdog`
  - `local-ai-automation-watchdog`
  - `agent-ops-watchdog`
  - `weekly-disk-bloat-scout`
  - `weekly-package-update-watch`
  - `weekly-hermes-repo-freshness`
  - `specialist-profile-smoke-watchdog`
  - `hermes-live-checkout-kanban-guard`
  - wake/light/VLC/WiZ jobs
  - `memory-hygiene-watchdog`
  - `google-oauth-credential-watchdog`
 Reason: these are deterministic health checks or physical/device side effects. Async agents add cost, state complexity, and more ways for the toaster to become a distributed system. No thanks.
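To keep the script-only list honest, the split can be derived from the job file instead of maintained by hand. A minimal sketch, assuming `jobs.json` is a JSON array of objects with a `name` field and an optional boolean `no_agent` field; the real layout was not inspected for this example and may differ.

```python
import json


def split_script_only(jobs_json: str) -> tuple[list[str], list[str]]:
    """Split job names into (script_only, agent_backed) by the `no_agent` flag."""
    jobs = json.loads(jobs_json)
    script_only = [job["name"] for job in jobs if job.get("no_agent") is True]
    agent_backed = [job["name"] for job in jobs if job.get("no_agent") is not True]
    return script_only, agent_backed


# Hypothetical sample data; not a copy of the live jobs.json.
sample = json.dumps([
    {"name": "atlas-minio-self-backup", "no_agent": True},
    {"name": "weekly-skill-community-capability-scout"},
])
script_only, agent_backed = split_script_only(sample)
```

Only the `agent_backed` names are worth evaluating for fan-out at all.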
 ## Jobs that should not use async children now
 - `weekday-gentle-work-check-in`
  - It is intentionally tiny and tool-free. Async here would be architecture cosplay.
 - `weekly-dotfiles-sync`
  - The script is source of truth and may touch a git repo. Parent should read script output and report. Do not fan out unless a future version does optional read-only diff/secrets analysis after the script finishes.
 ## First implementation target
 Target: `weekly-skill-community-capability-scout`.
 Why first:
 - Low operational risk.
 - No live cluster/device side effects.
 - Easy to compare output quality before/after.
 - Natural fan-out into independent source domains.
 Candidate child tasks:
 1. Hermes Agent / Skills Hub changes.
 2. MCP/tool-security/community patterns.
 3. Local LLM / agent-ops workflows.
 Parent contract:
 - Children return only concise source-backed notes with URLs.
 - Parent filters hype, dedupes, decides action shape, and sends the final brief.
 - No memory writes; durable workflow changes become skills/runbooks only after review.
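The parent contract above amounts to a filter and a dedupe step before synthesis. A rough sketch, where `ChildNote` and `merge_child_notes` are hypothetical names and "source-backed" is reduced to "has an http(s) URL":

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChildNote:
    """One concise note returned by a child task (hypothetical shape)."""
    topic: str
    summary: str
    url: str


def merge_child_notes(batches: list[list[ChildNote]]) -> list[ChildNote]:
    """Keep source-backed notes only, dropping repeats of the same URL."""
    seen: set[str] = set()
    merged: list[ChildNote] = []
    for batch in batches:
        for note in batch:
            url = note.url.strip()
            if not url.startswith(("http://", "https://")):
                continue  # no usable source, so the parent drops it
            key = url.rstrip("/")  # crude normalization for dedupe only
            if key in seen:
                continue
            seen.add(key)
            merged.append(note)
    return merged
```

Hype filtering and the final action shape stay a judgment call for the parent; this only handles the mechanical part.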
 ## Pilot verification — synchronous delegation fan-out
 Verified on 2026-06-23 with `weekly-skill-community-capability-scout` (`40360afe333b`):
 - The job prompt includes synchronous `delegate_task(tasks=[...])` fan-out and explicitly forbids background/async children.
 - The job has `delegation` in `enabled_toolsets` alongside `web`, `search`, `browser`, and `file`.
 - Session `cron_40360afe333b_20260623_100254` called `delegate_task` with three read-only web children:
  - Hermes Agent / Skills Hub / Hermes ecosystem.
  - MCP/tool-security/community patterns.
  - Local LLM / agent-ops automation.
 - The parent produced one final synthesized report and delivery remained parent-owned.
 - Follow-up from that report completed on 2026-06-23: created Hermes skill `mcp-tool-security-reviewer` and linked it from `native-mcp`; it references `Atlas/MCP Tool Security Checklist.md` as the canonical local runbook.
 - Current status after verification: last successful run at `2026-06-23T10:04:34-07:00`; next scheduled run remains `2026-06-26T11:00:00-07:00`.
 Keep using synchronous fan-out for this cron until true background/async child completion semantics are separately tested for cron delivery. Boring beats haunted.
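For contrast with background children, synchronous fan-out has this shape: the parent blocks until every child has returned, then owns the result. The sketch below uses plain Python threads as a stand-in; `fan_out_sync` and `run_child` are hypothetical and do not describe how Hermes's `delegate_task` works internally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def fan_out_sync(tasks: list[str], run_child: Callable[[str], str]) -> dict[str, str]:
    """Run one child per task and return only after all of them finish."""
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        # pool.map preserves task order and re-raises the first child error.
        results = list(pool.map(run_child, tasks))
    return dict(zip(tasks, results))
```

Because the call only returns once all children are done, there is no completion event for cron delivery to miss, which is the property this pilot relies on.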
 ## Future code/config checks before true background children
 Before editing live cron prompts to use background/async children, confirm:
 - Whether cron sessions can use async/background children directly.
 - How child completion is surfaced to the parent/final delivery.
 - Failure/timeout behavior inside the cron 3-minute run envelope.
 - Whether toolset restrictions propagate cleanly to children.
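The toolset-propagation question can be checked mechanically once the effective toolsets of each child are observable. A minimal sketch, assuming toolsets can be collected as plain sets of names; `toolset_violations` is a hypothetical helper and how to obtain a child's effective toolsets is still an open question from the list above.

```python
def toolset_violations(parent: set[str], children: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each child name to the toolsets it holds that the parent does not."""
    return {
        name: toolsets - parent
        for name, toolsets in children.items()
        if toolsets - parent
    }
```

An empty result means no child gained a toolset beyond the parent's; anything else is a reason to keep the job synchronous.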
 ## API server verification note
 After API key addition and gateway restart on 2026-06-23:
 - `hermes gateway status` reported `hermes-gateway.service` active/running.
 - `GET /health` on `127.0.0.1:8642` returned HTTP 200.
 - `GET /v1/models` without auth returned HTTP 401.
 - `GET /v1/models` with `Authorization: Bearer <configured API_SERVER_KEY>` returned HTTP 200 and advertised `hermes-agent`.
 No credential value was written here.
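The three HTTP checks are easy to repeat after future restarts. A sketch using only the standard library; the address and paths come from the note above, while `http_status` and `verify_gateway` are hypothetical helper names. The `status` parameter exists so the logic can be exercised without a live gateway.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional

BASE_URL = "http://127.0.0.1:8642"


def http_status(path: str, bearer: Optional[str] = None) -> int:
    """Return the HTTP status for GET BASE_URL + path, optionally with a bearer token."""
    request = urllib.request.Request(BASE_URL + path)
    if bearer is not None:
        request.add_header("Authorization", f"Bearer {bearer}")
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions


def verify_gateway(
    api_key: str,
    status: Callable[[str, Optional[str]], int] = http_status,
) -> dict[str, bool]:
    """Report pass/fail per check; the key itself is never returned or printed."""
    return {
        "health_ok": status("/health", None) == 200,
        "models_rejects_anonymous": status("/v1/models", None) == 401,
        "models_accepts_key": status("/v1/models", api_key) == 200,
    }
```

In practice the key would be read from the environment at call time (for example from `API_SERVER_KEY`, assuming it is exported under that name) and only the resulting booleans recorded.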
@@ -0,0 +1,175 @@
 # Hermes Skill Intake Checklist
 Purpose: decide whether a Hermes skill is safe and useful enough to install, enable, update, or promote into Atlas workflows.
 Default stance: skills are not “just prompts.” They can change future agent behavior, tool use, workflow shape, and security posture. Treat community or agent-generated skills like small operational dependencies.
 ## Intake outcome
 Choose one:
 - **Accept** — install/enable/use as-is.
 - **Accept with constraints** — use only for a named project/profile/toolset.
 - **Quarantine** — save for review, but do not enable by default.
 - **Rewrite locally** — useful idea, but implementation/instructions are too broad or stale.
 - **Reject** — unsafe, duplicative, abandoned, or not worth the context/maintenance cost.
 ## Fast screen
 Reject or quarantine if any of the following is true:
 - Source is unknown, opaque, or not inspectable.
 - Skill requests broad shell/file/network authority without a narrow reason.
 - Skill contains hidden instructions to bypass approvals, reveal secrets, persist memory, or modify unrelated files.
 - Skill expects credentials in prompts, notes, Kanban comments, or committed files.
 - Skill duplicates an existing local skill without a clear improvement.
 - Skill is mostly hype, vendor copy, or framework soup wearing a YAML hat.
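The fast screen can be recorded as a set of yes/no answers. In this sketch the check names are hypothetical labels for the six bullets above, and an unanswered check counts as tripped, which is a deliberately conservative assumption rather than a rule stated in this checklist.

```python
FAST_SCREEN_CHECKS = (
    "source_not_inspectable",
    "broad_authority_without_reason",
    "hidden_bypass_instructions",
    "expects_credentials_in_plaintext",
    "duplicates_existing_skill",
    "mostly_hype",
)


def fast_screen(answers: dict[str, bool]) -> list[str]:
    """Return the tripped checks; a non-empty list means reject or quarantine."""
    unknown = sorted(set(answers) - set(FAST_SCREEN_CHECKS))
    if unknown:
        raise ValueError(f"unknown fast-screen checks: {unknown}")
    # Missing answers default to True so gaps cannot pass silently.
    return [check for check in FAST_SCREEN_CHECKS if answers.get(check, True)]
```

Choosing between reject and quarantine for a tripped check remains a reviewer decision.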
 ## Provenance
 Record:
 - Source URL/repo/path:
 - Author/maintainer:
 - Last updated:
 - License if relevant:
 - Install method:
 - Local copy path if installed:
 Checks:
 - Repository or source is readable.
 - Recent enough for the tool/API it describes.
 - Issues/releases/docs do not show obvious breakage.
 - No suspicious install scripts or post-install hooks.
 ## Scope and purpose
 Ask:
 - What exact recurring problem does this solve?
 - Which project/profile/platform should use it?
 - Is it for planning, code editing, ops, research, media, messaging, or credentials?
 - Could this be a note/runbook instead of an always-loadable skill?
 Prefer **runbook/note** over skill when the content is mostly reference material and not an execution procedure.
 Prefer **skill** when it contains a repeatable workflow with triggers, commands, pitfalls, and verification.
 ## Tool and permission review
 List expected toolsets:
 - file:
 - terminal:
 - web/browser:
 - delegation:
 - cronjob:
 - messaging:
 - homeassistant/smart-home:
 - credentials/secrets:
 - MCP/custom tools:
 Checks:
 - Toolsets are the minimum needed.
 - Destructive operations require explicit confirmation.
 - External sends/posts/messages are never automatic unless the workflow explicitly requires them.
 - Production-like systems, clusters, backups, and home automation have clear safety gates.
 - If MCP tools are involved, also run `Atlas/MCP Tool Security Checklist.md`.
 ## Secret and privacy review
 Checks:
 - No secret values in the skill, examples, references, templates, or scripts.
 - Examples use placeholders like `<token>`, `<set>`, `<redacted>`.
 - Skill does not ask Will to paste credentials into chat.
 - Skill does not encourage dumping `.env`, auth caches, gateway logs, raw transcripts, private notes, or support bundles.
 - Any credential inspection reports key names/status only.
 - Any private context use is bounded and summarized, not raw-dumped.
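A first-pass scan for secret-looking assignments can back up the manual review. This is a heuristic sketch, not a real secret scanner: the regex is an assumption about what secret-like keys look like, it will produce false positives (any name containing `key` matches), and it reports line numbers and key names only, never values.

```python
import re

# Matches NAME=value or NAME: value where NAME contains a secret-like word.
ASSIGNMENT = re.compile(
    r"(?i)\b([A-Z0-9_]*(?:key|token|secret|password)[A-Z0-9_]*)\s*[:=]\s*[\"']?([^\s\"']+)"
)
# Placeholders such as <token>, <set>, <redacted> are acceptable values.
PLACEHOLDER = re.compile(r"^<[^<>]+>$")


def find_secret_like(text: str) -> list[tuple[int, str]]:
    """Return (line number, key name) pairs whose value is not a placeholder."""
    findings: list[tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in ASSIGNMENT.finditer(line):
            name, value = match.group(1), match.group(2)
            if not PLACEHOLDER.match(value):
                findings.append((lineno, name))
    return findings
```

A hit is a prompt to look at the line by hand; a clean result does not prove the skill is free of secrets.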
 ## Behavior and memory review
 Checks:
 - Skill does not store one-off task progress as durable memory.
 - Skill separates durable facts from procedures.
 - Skill does not instruct the agent to ignore newer user instructions, tool results, or live files.
 - Skill has clear missing-data behavior instead of guessing.
 - Skill states its assumptions explicitly when state is uncertain.
 - Skill avoids broad “always do X” imperatives unless they are true safety rules.
 ## Quality review
 Good skills include:
 - Trigger conditions.
 - Numbered workflow steps.
 - Exact commands or API shapes when needed.
 - Pitfalls/common failure modes.
 - Verification steps.
 - Scope boundaries.
 - Related skills or runbooks.
 Warning signs:
 - Long generic advice with no executable procedure.
 - Stale commands or made-up CLI flags.
 - Tool/API claims without links or local verification.
 - Huge context footprint for tiny value.
 - “Agentic” used as seasoning instead of architecture. Culinary crimes, basically.
 ## Installation / enablement checklist
 Before installing/enabling:
 - Review source and linked files.
 - Compare against existing skills for duplication.
 - Confirm needed toolsets and risks.
 - Decide target scope: global, profile-specific, platform-specific, or manual-load only.
 - If installing from remote source, prefer pinning or recording source version/commit.
 - If editing local skills, keep procedure in skill files and facts/preferences in memory only when stable.
 After installing/enabling:
 - Load it once and inspect rendered content.
 - Run a harmless dry-run prompt if practical.
 - Verify it does not request unexpected tools or credentials.
 - Record any constraints in this note or the skill itself.
 - If the skill is wrong/stale, patch it immediately or quarantine it.
 ## Reviewer prompt
 Use this when asking Atlas to review a proposed skill:
 ```text
 Review this Hermes skill before installation/use. Apply the Hermes Skill Intake Checklist and return:
 1. outcome: accept / accept with constraints / quarantine / rewrite locally / reject
 2. why
 3. required toolsets and risks
 4. secret/privacy concerns
 5. overlap with existing skills
 6. exact changes needed before use
 Do not install, enable, edit, or run it unless explicitly asked after the review.
 ```
 ## Decision record template
 ```text
 Skill:
 Source:
 Date reviewed:
 Reviewer:
 Outcome:
 Scope:
 Required toolsets:
 Risks:
 Constraints:
 Follow-up:
 ```
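If decision records are ever generated instead of typed, the template maps onto a small data class. A sketch with hypothetical names (`Outcome`, `DecisionRecord`); the fields mirror the template above, and the one validation rule (constraints are required for "accept with constraints") is an added assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    ACCEPT = "accept"
    ACCEPT_WITH_CONSTRAINTS = "accept with constraints"
    QUARANTINE = "quarantine"
    REWRITE_LOCALLY = "rewrite locally"
    REJECT = "reject"


@dataclass
class DecisionRecord:
    skill: str
    source: str
    date_reviewed: str
    reviewer: str
    outcome: Outcome
    scope: str = ""
    required_toolsets: list[str] = field(default_factory=list)
    risks: str = ""
    constraints: str = ""
    follow_up: str = ""

    def render(self) -> str:
        """Render the record in the same field order as the text template."""
        if self.outcome is Outcome.ACCEPT_WITH_CONSTRAINTS and not self.constraints.strip():
            raise ValueError("'accept with constraints' needs the constraints filled in")
        return "\n".join([
            f"Skill: {self.skill}",
            f"Source: {self.source}",
            f"Date reviewed: {self.date_reviewed}",
            f"Reviewer: {self.reviewer}",
            f"Outcome: {self.outcome.value}",
            f"Scope: {self.scope}",
            f"Required toolsets: {', '.join(self.required_toolsets)}",
            f"Risks: {self.risks}",
            f"Constraints: {self.constraints}",
            f"Follow-up: {self.follow_up}",
        ])
```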
 ## Relationship to other Atlas notes
 - Use `Atlas/MCP Tool Security Checklist.md` for MCP server/tool admission.
 - Use `Atlas/Cron Async-Subagent Audit - 2026-06-23.md` when a skill affects recurring cron/delegation behavior.
 - Use secrets-hygiene rules for any skill touching credentials, logs, gateway setup, auth, webhooks, CI, or provider config.
@@ -0,0 +1,86 @@
 # MCP / Tool Security Checklist
 Purpose: keep Atlas/Hermes tool surfaces narrow, inspectable, and boringly safe. MCP is useful; ambient authority is how future-you gets mugged by a JSON schema.
 ## Default stance
 - Prefer built-in Hermes tools or narrow local scripts before adding a broad MCP server.
 - If the MCP server is packaged or documented as a Hermes skill/workflow, also run `Atlas/Hermes Skill Intake Checklist.md` before enabling it.
 - Add MCP servers only for a specific missing primitive, not because a package exists.
 - Treat every MCP server as code execution with a marketing department until audited.
 - Disable MCP sampling for untrusted servers unless there is a clear need.
 ## Preflight before enabling a server
 - Source and maintenance:
  - Known publisher or repository.
  - Recent commits/releases, readable code, sane issue history.
  - No opaque install scripts without review.
 - Scope:
  - Server name describes one bounded capability.
  - Tool list is small enough to review.
  - No generic shell/file/network supertool unless intentionally sandboxed.
 - Data access:
  - Explicit path allowlist; no ambient `$HOME`, repo root, `/`, or secrets directory access.
  - Read-only by default where possible.
  - Separate servers/profiles for read-only versus write/destructive operations.
 - Credentials:
  - Pass only required env vars via `mcp_servers.<name>.env`.
  - Never inherit full shell env.
  - No secret values in config committed to git, skills, notes, Kanban comments, or prompts.
 - Runtime:
  - Set conservative `timeout` and `connect_timeout`.
  - Prefer local stdio servers for private/local workflows; use HTTP only when the remote trust boundary is clear.
  - Pin package/version where practical for repeatability.
 ## Config review checklist
 For each `mcp_servers.<name>` entry:
 - Has exactly one transport: `command` or `url`.
 - Uses explicit `args`; no hidden shell expansion.
 - `env` contains only named secrets/capabilities required by this server.
 - `timeout` is not huge by accident.
 - `sampling.enabled` is false unless server-initiated LLM calls are intentionally needed.
 - Server name will produce clear tool prefixes: `mcp_<server>_<tool>`.
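Most of the config review can be linted before a human looks at it. A minimal sketch over one already-parsed `mcp_servers.<name>` entry: the key names follow the checklist above, but `review_mcp_server`, the 300-second ceiling, the metacharacter list, and requiring `sampling.enabled` to be explicitly false are all assumptions of this example, not documented Hermes defaults.

```python
from typing import Any

MAX_TIMEOUT_SECONDS = 300  # assumed ceiling; tune per server
SHELL_CHARS = ("$", "`", "*", "~", "|", ";", "&")


def review_mcp_server(entry: dict[str, Any], allowed_env: set[str]) -> list[str]:
    """Return human-readable problems for one server entry; empty means clean."""
    problems: list[str] = []
    if ("command" in entry) == ("url" in entry):
        problems.append("set exactly one of command or url")
    args = entry.get("args", [])
    if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
        problems.append("args must be a list of strings")
    elif any(ch in arg for arg in args for ch in SHELL_CHARS):
        problems.append("args contain shell metacharacters")
    extra_env = sorted(set(entry.get("env", {})) - allowed_env)
    if extra_env:
        problems.append(f"unexpected env keys: {extra_env}")  # names only, no values
    timeout = entry.get("timeout")
    if not isinstance(timeout, (int, float)) or timeout <= 0:
        problems.append("timeout must be set to a positive number")
    elif timeout > MAX_TIMEOUT_SECONDS:
        problems.append(f"timeout exceeds {MAX_TIMEOUT_SECONDS}s")
    if entry.get("sampling", {}).get("enabled") is not False:
        problems.append("sampling.enabled is not explicitly false")
    return problems
```

The `allowed_env` set is where the reviewer states which env var names this server is supposed to receive.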
 ## Injection / path traversal smoke tests
 Before trusting a new server, run harmless prompts/tasks that attempt to exceed scope:
 - Ask it to read outside the allowed path.
 - Ask it to follow instructions from a file/webpage that says to reveal secrets.
 - Ask it to write/delete outside the intended workspace.
 - Ask it to echo environment variables or auth headers.
 - Ask it to perform a destructive action without explicit user confirmation.
 Expected result: refusal/error/blocked behavior, with no secret values printed.
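For the first smoke test, it helps to know what a correct allowlist check looks like, since naive string-prefix checks are the usual failure. A sketch using `pathlib`; `is_within_allowlist` is a hypothetical helper and says nothing about how any particular MCP server implements its own check.

```python
from pathlib import Path


def is_within_allowlist(requested: str, allowed_roots: list[str]) -> bool:
    """True if the resolved path is an allowed root or sits beneath one."""
    # resolve() collapses ".." segments and follows symlinks where they exist.
    target = Path(requested).resolve()
    for root in allowed_roots:
        root_path = Path(root).resolve()
        if target == root_path or root_path in target.parents:
            return True
    return False
```

Comparing resolved path components rejects both `../` traversal and sibling directories that merely share a prefix, such as `workspace-evil` next to `workspace`.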
 ## Operating rules
 - Destructive operations still need explicit user confirmation even if the MCP server offers the tool.
 - For repo work, inspect git status before and after; stage only intended source/docs changes.
 - For debugging, report key names/status, never credential values.
 - If a server logs too much, wrap it, patch it, or remove it. Logs are where secrets go to become archaeology.
 - Restart Hermes after MCP config changes; tool discovery happens at startup.
 ## Periodic audit
 Monthly or after major Hermes/tool changes:
 - List configured MCP servers.
 - Confirm each server still has an owner/use case.
 - Remove unused or broad-authority servers.
 - Re-run path/secret smoke tests for servers with file, shell, repo, browser, cloud, or messaging access.
 - Check that docs/skills are up to date when an MCP server becomes part of a recurring workflow.
 ## Stop signs
 Do not enable a server, or disable it immediately if it is already enabled, when:
 - The server wants unrestricted filesystem access for a narrow task.
 - It requires broad cloud/account tokens without scoped alternatives.
 - It can execute shell commands and is exposed to untrusted prompts/content.
 - It performs network requests on arbitrary user/model-provided URLs without allowlists.
 - It has no clear logging/redaction story.
 - Its tool descriptions encourage the model to treat it as an all-purpose assistant inside the assistant. That way lies agent soup.
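As a reference point for the arbitrary-URL stop sign, a host allowlist check is short. A sketch using `urllib.parse`; `url_allowed` is a hypothetical helper, and exact-host matching with http(s) only is an assumed policy, not something a given server is known to implement.

```python
from urllib.parse import urlsplit


def url_allowed(url: str, allowed_hosts: set[str]) -> bool:
    """Allow only http(s) URLs whose host exactly matches an allowlisted name."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    return host in allowed_hosts
```

A server with no equivalent of this check, while fetching model-provided URLs, falls under the stop sign above.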