Swarm Infrastructure
This document is the source-of-truth overview for Will's local swarm/agent infrastructure on the zap workstation. It focuses on the runtime services that support Atlas/Hermes, n8n automation, local model/search/voice tooling, Obsidian/RAG automation, and the new agentmon monitoring layer.
High-level topology
Telegram / Discord / Email
  |
  v
Hermes / Atlas gateway (default profile)
  |
  +--> local tools and specialist profiles
  +--> n8n automation workflows on :18808

n8n automation
  |
  +--> direct watchdog probes for key service ports
  +--> Agentmon Health Watchdog -> agentmon-query :8081
  +--> Obsidian, RAG, voice memo, URL capture, digest workflows

agentmon
  |
  +--> agentmon-swarm-monitor -> Docker labels agentmon.monitor=true
  +--> agentmon-openclaw-monitor -> OpenClaw VM snapshots
  +--> NATS JetStream -> event processor -> Postgres
  +--> query API / UI on :8081 / :8082

local AI/search/voice services
  |
  +--> LiteLLM :18804
  +--> SearXNG :18803
  +--> Brave MCP :18802
  +--> llama.cpp :18806
  +--> Ollama embeddings :18807 (legacy/CPU fallback)
  +--> OpenVINO NPU embeddings :18817
  +--> Kokoro TTS :18805
  +--> Whisper NPU :18816
  +--> approved/not-live NPU sidecars: reranker :18818, router/classifier :18819, GenAI worker :18820, doc/image triage optional :18829
See also:
- swarm-infrastructure.html - visual architecture diagram
- diagram-maintenance.md - how to keep diagrams updated and when to create new ones
Runtime layers
1. Messaging and agent gateway
- Hermes / Atlas default profile is the production messaging gateway.
- Connected platforms include Telegram, Discord, and email.
- Atlas uses local swarm services where suitable, especially search, local LLMs, embeddings, STT/TTS, n8n, and agentmon.
- Specialist Hermes profiles are available for delegated work, but the default profile remains the stable production gateway.
2. n8n automation
- Container/service: n8n-agent
- Host URL: http://127.0.0.1:18808
- Container URL: http://127.0.0.1:5678
- Compose project: /home/will/lab/swarm/docker-compose.yaml
Important workflow source exports live under:
swarm-common/n8n-workflows/
Current health/automation patterns:
- Swarm Health Watchdog: direct endpoint checks for search, LLM, voice, n8n, Docker health, etc.
- Agentmon Health Watchdog: polls agentmon aggregate snapshots and alerts on stale/degraded monitoring state.
- RAG and Embedding Health Watchdog: checks RAG/search/embedding path.
- Obsidian workflows: health/reindex, inbox triage, daily review, URL-to-note, chat summary capture, weekly decision/runbook extraction.
3. Agentmon monitoring layer
Repo:
/home/will/lab/agentmon
Compose services:
- agentmon-ingest on :8080 - ingestion gateway, /healthz
- agentmon-query on :8081 - query API, /healthz, /v1/events, /v1/stats/summary
- agentmon-ui on :8082 - web UI, /healthz
- agentmon-processor - NATS to Postgres event processor
- agentmon-swarm-monitor - monitors Docker containers labeled agentmon.monitor=true
- agentmon-openclaw-monitor - emits OpenClaw VM snapshots
- agentmon-db - Postgres
- agentmon-nats - NATS JetStream
Key query endpoints:
http://127.0.0.1:8080/healthz
http://127.0.0.1:8081/healthz
http://127.0.0.1:8082/healthz
http://127.0.0.1:8081/v1/stats/summary
http://127.0.0.1:8081/v1/events?event_type=swarm.snapshot&limit=1
http://127.0.0.1:8081/v1/events?event_type=swarm.service.snapshot&limit=20
http://127.0.0.1:8081/v1/events?event_type=openclaw.snapshot&limit=3
From inside n8n-agent, use the Docker bridge gateway:
http://172.19.0.1:8081/v1/events?event_type=swarm.snapshot&limit=1
4. Local AI, search, and voice services
Docker services:
- litellm - :18804, OpenAI-compatible LLM router
- litellm-db - Postgres backing LiteLLM
- searxng - :18803, local metasearch
- brave-search - :18802, Brave Search MCP server
- kokoro-tts - :18805, local TTS
- whisper-server-npu - :18816, OpenVINO NPU local transcription
- n8n-agent - :18808, automation
Host/user services:
- llama-server.service - :18806, local llama.cpp OpenAI-compatible LLM
- ollama.service - :18807, legacy/CPU embeddings API fallback
- openvino-embeddings.service - :18817, OpenVINO NPU embeddings API (/v1/embeddings, /api/embed, /api/embeddings)
- docker-health-endpoint.service - :18809, read-only container health for n8n
- obsidian-reindex-endpoint.service - :18810, Obsidian/RAG reindex trigger; default collection obsidian_bge_npu using OpenVINO NPU embeddings
- url-content-extractor.service - :18812, YouTube/PDF/web extraction
- voice-memo-processor.service - :18813, voice memo processing
- rag-embedding-health.service - :18814, RAG/embedding health wrapper
Approved but not live-routed OpenVINO NPU sidecars:
| Port | Component | State | Safety boundary |
|---|---|---|---|
| 18818 | reranker | approved prototype; optional foreground/user-systemd only | request-time only; no Chroma/vector mutation; no live RAG integration unless Will approves |
| 18819 | router/classifier | approved prototype; dry-run only | no Hermes/Atlas routing, memory writes, service restarts, or outbound messages |
| 18820 | bounded GenAI worker | approved prototype | background jobs only; not primary Atlas/Hermes model routing |
| 18829 | document/image triage | CLI-first; optional localhost server | synthetic/non-private smoke data only; no private directory processing; NPU stage is embeddings via :18817 |
These sidecars must bind to 127.0.0.1 by default and must not be enabled persistently or wired into live Atlas/Hermes/RAG paths without Will's explicit approval. Any claim of NPU use requires a positive delta in /sys/class/accel/accel0/device/npu_busy_time_us measured before and after inference; HTTP 200 alone is not proof.
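The busy-time rule can be sketched as a few small shell helpers. This is only an illustration, not the contents of scripts/npu-service-health.sh: the function names and the NPU_BUSY_FILE override are hypothetical, and only the sysfs path comes from this document.

```shell
#!/bin/sh
# Sketch: treat an NPU claim as proven only when the busy-time counter grows.
# NPU_BUSY_FILE is a hypothetical override; the default path is the one named above.
NPU_BUSY_FILE="${NPU_BUSY_FILE:-/sys/class/accel/accel0/device/npu_busy_time_us}"

# Print the current busy-time counter (microseconds).
npu_busy_read() {
    cat "$NPU_BUSY_FILE"
}

# Print the difference between two counter readings: after - before.
# Args: before after
npu_busy_delta() {
    echo $(( $2 - $1 ))
}

# Succeed only when the counter strictly increased between the two readings.
# Args: before after
npu_inference_proven() {
    [ "$(npu_busy_delta "$1" "$2")" -gt 0 ]
}

# Usage:
#   before=$(npu_busy_read)
#   ... run exactly one inference request against the service under test ...
#   after=$(npu_busy_read)
#   if npu_inference_proven "$before" "$after"; then
#       echo "NPU busy time increased by $(npu_busy_delta "$before" "$after") us"
#   else
#       echo "no NPU proof: counter did not move"
#   fi
```

Keeping the comparison separate from the sysfs read makes the rule easy to check by hand: a request that returns HTTP 200 while the delta stays at zero still counts as unproven.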
5. Obsidian and RAG
Vault:
/home/will/lab/swarm/swarm-common/obsidian-vault/will/will-shared-zap
Local REST API:
- HTTP: 127.0.0.1:27123
- HTTPS: 127.0.0.1:27124
RAG/vector store:
- ChromaDB path: ~/.hermes/data/rag-search/chroma/
- Reindex state/progress: active BGE/NPU state in ~/.hermes/data/rag-search/obsidian_bge_npu_index_state.json and obsidian_bge_npu_reindex_progress.json; legacy Ollama state in obsidian_index_state.json remains for comparison/fallback.
- Active RAG query/reindex embedding backend: OpenVINO NPU embeddings service on :18817, currently bge-base-en-v1.5-int8-ov, collection obsidian_bge_npu.
- Legacy comparison/fallback collection: obsidian, built with Ollama on :18807 using nomic-embed-text.
- Reindex endpoint: POST :18810/reindex for incremental updates, POST :18810/reindex?full=true for full semantic rebuilds, GET :18810/semantic-health to verify vectors plus a search smoke test.
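The three reindex calls could be wrapped roughly as follows. This is a sketch under assumptions: the loopback host 127.0.0.1 is inferred from the service being a host/user service, and reindex_url and REINDEX_BASE are hypothetical names. Only the port, paths, and HTTP methods come from this document.

```shell
#!/bin/sh
# Sketch: build the reindex endpoint URLs listed above.
# REINDEX_BASE is a hypothetical variable; the 127.0.0.1 host is an assumption.
REINDEX_BASE="${REINDEX_BASE:-http://127.0.0.1:18810}"

# Print the URL for one mode. Args: incremental | full | health
reindex_url() {
    case "$1" in
        incremental) echo "$REINDEX_BASE/reindex" ;;
        full)        echo "$REINDEX_BASE/reindex?full=true" ;;
        health)      echo "$REINDEX_BASE/semantic-health" ;;
        *)           echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   curl -fsS -X POST "$(reindex_url incremental)"   # incremental update
#   curl -fsS -X POST "$(reindex_url full)"          # full semantic rebuild
#   curl -fsS "$(reindex_url health)"                # vectors plus search smoke test
```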
Monitoring model
The monitoring design is intentionally layered:
- n8n direct probes check critical service endpoints and send deduped alerts.
- agentmon continuously observes labeled Docker services and OpenClaw state, then writes snapshots through NATS/Postgres.
- n8n Agentmon Health Watchdog polls agentmon's aggregate state and alerts if the monitoring pipeline itself becomes stale/degraded.
- Hermes/Atlas can inspect both n8n and agentmon when troubleshooting, and can use the same endpoints as part of operational checks.
This means a single process being alive is not enough: the important signal is whether collection, ingestion, processing, storage, query, and alerting are all functioning.
Agentmon Health Watchdog
Workflow source:
swarm-common/n8n-workflows/agentmon-health-watchdog.json
Installed n8n workflow:
- Name: Agentmon Health Watchdog
- ID: AgentmonHealthWatchdog
- Schedule: every 5 minutes
Alert conditions:
- agentmon-ingest, agentmon-query, or agentmon-ui /healthz fails.
- Latest swarm.snapshot is missing.
- Latest swarm.snapshot is older than 3 minutes.
- Snapshot issues are non-empty.
- Required agentmon services are missing or not healthy/running:
  - agentmon-ingest
  - agentmon-query
  - agentmon-ui
  - agentmon-processor
  - agentmon-swarm-monitor
  - agentmon-db
  - agentmon-nats
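The freshness condition can be illustrated with a small helper. This is a sketch, not the workflow's actual code: it assumes the caller has already converted the latest swarm.snapshot timestamp to epoch seconds (the event JSON shape is not described here), and the function and variable names are hypothetical.

```shell
#!/bin/sh
# Sketch: flag a swarm.snapshot as stale when it is older than 3 minutes.
# MAX_SNAPSHOT_AGE_S mirrors the 3-minute threshold in the alert conditions.
MAX_SNAPSHOT_AGE_S=180

# Print the snapshot age in seconds. Args: snapshot_epoch now_epoch
snapshot_age_s() {
    echo $(( $2 - $1 ))
}

# Succeed when the snapshot is strictly older than the threshold.
# Args: snapshot_epoch now_epoch
snapshot_is_stale() {
    [ "$(snapshot_age_s "$1" "$2")" -gt "$MAX_SNAPSHOT_AGE_S" ]
}

# Usage (snapshot_epoch must come from the latest swarm.snapshot event):
#   if snapshot_is_stale "$snapshot_epoch" "$(date +%s)"; then
#       echo "monitoring pipeline looks stale"
#   fi
```

Passing the current time as an argument keeps the comparison deterministic and easy to check with fixed values.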
Deduplication:
- Alert after 2 failed checks.
- Reminder every 6 failed runs.
- Recovery message when state returns healthy.
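One way to read these deduplication rules is as a decision on the consecutive failure count. The sketch below is an interpretation, not the installed workflow logic: it assumes reminders are counted from the first alert (so they fire at failure 8, 14, and so on), and watchdog_action and its arguments are hypothetical.

```shell
#!/bin/sh
# Sketch: decide what the watchdog should send on a given run.
ALERT_AFTER=2     # alert after 2 failed checks
REMIND_EVERY=6    # reminder every 6 failed runs (assumed: counted from the alert)

# Args: consecutive_failures (0 when the current run is healthy)
#       was_alerting (1 if an alert is currently outstanding, else 0)
# Prints one of: alert | reminder | recovery | none
# The body runs in a subshell so its variables do not leak to the caller.
watchdog_action() (
    failures=$1
    was_alerting=$2
    if [ "$failures" -eq 0 ]; then
        if [ "$was_alerting" -eq 1 ]; then echo recovery; else echo none; fi
    elif [ "$failures" -eq "$ALERT_AFTER" ]; then
        echo alert
    elif [ "$failures" -gt "$ALERT_AFTER" ] &&
         [ $(( ($failures - $ALERT_AFTER) % $REMIND_EVERY )) -eq 0 ]; then
        echo reminder
    else
        echo none
    fi
)

# Usage:
#   action=$(watchdog_action "$consecutive_failures" "$was_alerting")
#   [ "$action" != "none" ] && echo "send: $action"
```

Under this reading a single failed check stays silent, and a recovery message is sent only if an alert was actually raised.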
Operational quick checks
From the host:
cd /home/will/lab/swarm
make status
make local-ai-health
./scripts/npu-service-health.sh # read-only; includes sysfs busy-time proof for :18817
curl -fsS http://127.0.0.1:18808/healthz
curl -fsS http://127.0.0.1:8081/healthz
curl -fsS 'http://127.0.0.1:8081/v1/events?event_type=swarm.snapshot&limit=1' | jq .
From inside n8n-agent:
docker exec n8n-agent /bin/sh -lc '
wget -qO- -T 5 http://172.19.0.1:8081/healthz
wget -qO- -T 5 "http://172.19.0.1:8081/v1/events?event_type=swarm.snapshot&limit=1" | head -c 500
'
Verify n8n workflow activation:
docker exec -u node n8n-agent n8n export:workflow \
--id=AgentmonHealthWatchdog \
--output=/tmp/agentmon-export.json
docker cp n8n-agent:/tmp/agentmon-export.json /tmp/agentmon-export.json
jq '.[0] | {id,name,active,nodes:(.nodes|length)}' /tmp/agentmon-export.json
Notes and pitfalls
- Do not commit .env, decrypted credentials, raw credential exports, or runtime DB files.
- n8n workflow backups can contain sensitive operational data; keep timestamped raw backups untracked unless intentionally sanitized.
- From host, use 127.0.0.1:<host-port>.
- From n8n-agent, use 127.0.0.1:5678 for n8n itself and 172.19.0.1:<host-port> for host-published swarm services.
- Agentmon /healthz only proves the web/API process is alive; pair it with snapshot freshness to prove the monitoring pipeline is flowing.
- OpenClaw is intentionally dormant unless explicitly re-enabled; do not alert on VMs being shut off by default.
- OpenVINO NPU sidecars on :18818, :18819, :18820, and optional :18829 are prototypes/not-live unless a later approved change installs and routes them. Do not draw live Atlas/Hermes/RAG arrows to them in diagrams until that approval and implementation actually exist.
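The host-versus-container addressing rule in the notes above can be captured in one helper. This is an illustrative sketch: service_base_url and BRIDGE_GATEWAY are hypothetical names, and the gateway address is the one quoted in this document, which may differ on another Docker network.

```shell
#!/bin/sh
# Sketch: choose the base URL for a host-published swarm service.
# BRIDGE_GATEWAY is a hypothetical override; 172.19.0.1 is the address used above.
BRIDGE_GATEWAY="${BRIDGE_GATEWAY:-172.19.0.1}"

# Args: context (host | container), host-published port
service_base_url() {
    case "$1" in
        host)      echo "http://127.0.0.1:$2" ;;
        container) echo "http://$BRIDGE_GATEWAY:$2" ;;
        *)         echo "unknown context: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   curl -fsS "$(service_base_url host 8081)/healthz"            # from the host
#   wget -qO- -T 5 "$(service_base_url container 8081)/healthz"  # from n8n-agent
# n8n itself is the exception inside n8n-agent: use http://127.0.0.1:5678 there.
```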