feat(npu): add advisory gateway wrapper

William Valentin
2026-06-04 16:03:52 -07:00
parent 401321a6d5
commit 59c5fd3e57
6 changed files with 591 additions and 3 deletions
+3 -1
@@ -133,6 +133,7 @@ Host/user services:
- `openvino-router-classifier.service` `:18819`, local-only dry-run Atlas/Hermes message classifier; advisory only
- `openvino-genai-npu-worker.service` `:18820`, local-only bounded GenAI worker for small background generation jobs
- `openvino-doc-image-triage.service` `:18829`, local-only document/image triage HTTP wrapper with allowed-root enforcement
- `openvino-advisory-gateway.service` `:18830`, local-only advisory envelope wrapper over classifier, GenAI, and doc/image triage; explicit no-authority contract
Local-only OpenVINO NPU sidecars:
@@ -142,8 +143,9 @@ Local-only OpenVINO NPU sidecars:
| `18819` | router/classifier | live user service; dry-run only | no Hermes/Atlas routing, memory writes, service restarts, or outbound messages |
| `18820` | bounded GenAI worker | live user service | background jobs only; not primary Atlas/Hermes model routing |
| `18829` | document/image triage | live localhost server | allowed-root limited; no private directory processing unless explicitly approved; NPU stage is embeddings via `:18817` |
| `18830` | advisory gateway | live user service; host-local Hermes-accessible wrapper | returns `openvino_advisory_v1` envelopes only; no routing, memory writes, external sends, tool execution, restarts, or process-root broadening from request payloads; n8n bridge access intentionally disabled unless separately approved |
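The no-authority contract in the `18830` row can be illustrated with a minimal sketch. Only the envelope name `openvino_advisory_v1` comes from the table. The field names (`schema`, `source`, `advice`, `authority`) and the flag names are hypothetical, chosen to mirror the prohibited actions listed there, and may not match the gateway's actual payload.

```python
from typing import Any

SCHEMA = "openvino_advisory_v1"  # envelope name taken from the table row

# Hypothetical flag names, one per action the gateway must never perform.
NO_AUTHORITY = {
    "routing": False,
    "memory_writes": False,
    "external_sends": False,
    "tool_execution": False,
    "restarts": False,
}


def make_envelope(source: str, advice: Any) -> dict:
    """Wrap a sidecar result in an advisory-only envelope (assumed layout)."""
    return {
        "schema": SCHEMA,
        "source": source,
        "advice": advice,
        "authority": dict(NO_AUTHORITY),  # copy so callers cannot mutate the template
    }


def is_advisory_only(envelope: dict) -> bool:
    """Return True only if the envelope claims the schema and grants no authority."""
    authority = envelope.get("authority")
    return (
        envelope.get("schema") == SCHEMA
        and isinstance(authority, dict)
        and set(authority) == set(NO_AUTHORITY)
        and not any(authority.values())
    )
```

A consumer such as Hermes could call `is_advisory_only` before reading `advice` and discard any envelope that fails the check, so a malformed or tampered payload is never treated as an instruction.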
-These sidecars bind to `127.0.0.1` by default and must not be wired into live Atlas/Hermes routing, memory writes, broad private document processing, or primary model paths without explicit Will approval. Any NPU claim requires a positive `/sys/class/accel/accel0/device/npu_busy_time_us` delta before/after inference. HTTP 200 alone is not proof.
+These sidecars bind to `127.0.0.1` by default and must not be wired into live Atlas/Hermes routing, memory writes, broad private document processing, or primary model paths without explicit Will approval. Any NPU claim requires a positive `/sys/class/accel/accel0/device/npu_busy_time_us` delta before/after inference or service-reported equivalent. HTTP 200 alone is not proof.
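The busy-time rule above could be checked with a short sketch along these lines. The sysfs path is the one quoted in the text. The helper names and the `run_inference` callable are hypothetical, and the sketch assumes the counter file holds a single integer in microseconds.

```python
from pathlib import Path
from typing import Callable

# Counter path quoted in the README text.
NPU_BUSY_PATH = Path("/sys/class/accel/accel0/device/npu_busy_time_us")


def read_busy_us(path: Path = NPU_BUSY_PATH) -> int:
    """Read the cumulative NPU busy time (assumed to be one integer, in microseconds)."""
    return int(path.read_text().strip())


def npu_busy_delta(run_inference: Callable[[], object],
                   path: Path = NPU_BUSY_PATH) -> int:
    """Run one inference call and return the change in the busy-time counter."""
    before = read_busy_us(path)
    run_inference()  # hypothetical callable, e.g. an HTTP request to a sidecar
    after = read_busy_us(path)
    return after - before


def ran_on_npu(run_inference: Callable[[], object],
               path: Path = NPU_BUSY_PATH) -> bool:
    """Treat only a strictly positive delta as evidence of NPU work."""
    return npu_busy_delta(run_inference, path) > 0
```

Because the counter is device-wide, a positive delta shows that the NPU was busy during the call, not that this particular request caused it. Running the check while no other NPU workload is active makes the result easier to interpret.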
### 5. Obsidian and RAG