feat(npu): add advisory gateway wrapper
@@ -133,6 +133,7 @@ Host/user services:
 - `openvino-router-classifier.service` — `:18819`, local-only dry-run Atlas/Hermes message classifier; advisory only
 - `openvino-genai-npu-worker.service` — `:18820`, local-only bounded GenAI worker for small background generation jobs
 - `openvino-doc-image-triage.service` — `:18829`, local-only document/image triage HTTP wrapper with allowed-root enforcement
+- `openvino-advisory-gateway.service` — `:18830`, local-only advisory envelope wrapper over classifier, GenAI, and doc/image triage; explicit no-authority contract

 Local-only OpenVINO NPU sidecars:

@@ -142,8 +143,9 @@ Local-only OpenVINO NPU sidecars:
 | `18819` | router/classifier | live user service; dry-run only | no Hermes/Atlas routing, memory writes, service restarts, or outbound messages |
 | `18820` | bounded GenAI worker | live user service | background jobs only; not primary Atlas/Hermes model routing |
 | `18829` | document/image triage | live localhost server | allowed-root limited; no private directory processing unless explicitly approved; NPU stage is embeddings via `:18817` |
+| `18830` | advisory gateway | live user service; host-local Hermes-accessible wrapper | returns `openvino_advisory_v1` envelopes only; no routing, memory writes, external sends, tool execution, restarts, or process-root broadening from request payloads; n8n bridge access intentionally disabled unless separately approved |

-These sidecars bind to `127.0.0.1` by default and must not be wired into live Atlas/Hermes routing, memory writes, broad private document processing, or primary model paths without explicit Will approval. Any NPU claim requires a positive `/sys/class/accel/accel0/device/npu_busy_time_us` delta before/after inference. HTTP 200 alone is not proof.
+These sidecars bind to `127.0.0.1` by default and must not be wired into live Atlas/Hermes routing, memory writes, broad private document processing, or primary model paths without explicit Will approval. Any NPU claim requires a positive `/sys/class/accel/accel0/device/npu_busy_time_us` delta before/after inference or service-reported equivalent. HTTP 200 alone is not proof.

 ### 5. Obsidian and RAG

@@ -0,0 +1,85 @@
# OpenVINO NPU advisory gateway

Local-only bounded wrapper for the classifier, GenAI worker, and doc/image triage sidecars.

- HTTP bind: `127.0.0.1:18830` only; Docker/n8n bridge access is intentionally not enabled by default
- Service: `openvino-advisory-gateway.service`
- Mode: advisory/shadow/draft only
- Metadata log: `~/.local/state/openvino-advisory-gateway/events.sqlite`

## Authority boundary

Every response includes an explicit authority block:

```json
{
  "may_route": false,
  "may_write_memory": false,
  "may_send_external": false,
  "may_process_private_dirs": false,
  "may_execute_tools": false,
  "may_restart_services": false
}
```

This service may provide hints and drafts. It must not become the live Atlas/Hermes router, memory writer, primary chat model, external sender, tool executor, service restarter, or broad private document processor without a separate approved integration.
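
A caller that consumes these envelopes could verify the boundary before acting on any hint. The sketch below is a hypothetical client-side check and is not part of the gateway: the name `assert_no_authority` is an assumption made for illustration, while the flag names are taken from the authority block shown above.

```python
from typing import Any, Mapping

# Flag names copied from the documented authority block.
EXPECTED_FLAGS = (
    "may_route",
    "may_write_memory",
    "may_send_external",
    "may_process_private_dirs",
    "may_execute_tools",
    "may_restart_services",
)


def assert_no_authority(envelope: Mapping[str, Any]) -> None:
    """Raise ValueError unless every documented flag is exactly False.

    Hypothetical helper: a missing flag or any non-False value is rejected.
    """
    authority = envelope.get("authority")
    if not isinstance(authority, Mapping):
        raise ValueError("envelope has no authority block")
    bad = [flag for flag in EXPECTED_FLAGS if authority.get(flag) is not False]
    if bad:
        raise ValueError("unexpected authority flags: " + ", ".join(bad))
```

Checking with `is not False` rather than plain truthiness means a missing or `null` flag is treated as a violation instead of silently passing.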

## Endpoints

```text
GET /healthz
POST /v1/advisory/classify
POST /v1/advisory/generate
POST /v1/advisory/triage
```
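
The POST endpoints listed above can also be called from Python with only the standard library. This is a minimal sketch, assuming the default bind documented in this README; `build_advisory_request` and `call_advisory` are hypothetical helper names and do not exist in the gateway.

```python
import json
import urllib.request
from typing import Any

BASE_URL = "http://127.0.0.1:18830"  # default bind documented in this README


def build_advisory_request(endpoint: str, payload: dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request for one of the /v1/advisory/* endpoints."""
    if endpoint not in ("classify", "generate", "triage"):
        raise ValueError(f"unknown advisory endpoint: {endpoint}")
    return urllib.request.Request(
        f"{BASE_URL}/v1/advisory/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_advisory(endpoint: str, payload: dict[str, Any], timeout_s: float = 30.0) -> dict[str, Any]:
    """Send the request and decode the JSON envelope (needs a running gateway)."""
    request = build_advisory_request(endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The 30-second default timeout is an arbitrary choice for the sketch; generation jobs may need a longer value.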

### Classifier shadow call

```bash
curl -fsS http://127.0.0.1:18830/v1/advisory/classify \
  -H 'Content-Type: application/json' \
  -d '{"trace_id":"smoke","text":"Urgent: inspect service health and systemd status."}' | jq .
```

### Bounded GenAI draft

Allowed jobs: `title`, `summary`, `notification`, `memory_candidate`.

```bash
curl -fsS http://127.0.0.1:18830/v1/advisory/generate \
  -H 'Content-Type: application/json' \
  -d '{"job":"title","input":"Summarize a local health check.","max_new_tokens":24}' | jq .
```

### Explicit-file doc/image triage

```bash
curl -fsS http://127.0.0.1:18830/v1/advisory/triage \
  -H 'Content-Type: application/json' \
  -d '{"path":"/home/will/lab/swarm/openvino-doc-image-triage-npu/samples/synthetic_invoice.png","allowed_roots":["/home/will/lab/swarm/openvino-doc-image-triage-npu"]}' | jq .
```

The gateway requires the path to be inside both:

1. a configured allowed root on the gateway process; and
2. the request's explicit `allowed_roots` list, if one is provided.

Requests cannot broaden the process-configured roots. Do not broaden configured roots to private folders without explicit approval for that root and task.
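
The two-root rule can be illustrated with a short containment check. This is a sketch under stated assumptions, not the gateway's own code: `is_path_permitted` and `_inside` are hypothetical names, `Path.is_relative_to` needs Python 3.9 or later, and the sketch skips the file-existence check.

```python
from pathlib import Path
from typing import Iterable, Optional


def _inside(path: Path, root: Path) -> bool:
    """True if path equals root or lies beneath it (both already resolved)."""
    return path == root or path.is_relative_to(root)


def is_path_permitted(
    path: str,
    configured_roots: Iterable[str],
    requested_roots: Optional[Iterable[str]] = None,
) -> bool:
    """Apply both checks: process-configured roots, then optional request roots."""
    # resolve() collapses ".." segments and symlinks before any comparison.
    candidate = Path(path).resolve()
    configured = [Path(r).resolve() for r in configured_roots]
    if not any(_inside(candidate, root) for root in configured):
        return False
    requested = [Path(r).resolve() for r in (requested_roots or [])]
    if not requested:
        # No request-level list: the configured roots alone decide.
        return True
    # A request may narrow the configured roots but never widen them.
    if not all(any(_inside(req, cfg) for cfg in configured) for req in requested):
        return False
    return any(_inside(candidate, root) for root in requested)
```

Resolving every path before comparing is the design point: a request such as `allowed/../private/file.png` is judged by where it actually lands, not by its textual prefix.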

## Install / run

```bash
install -m 0644 openvino-advisory-gateway.service ~/.config/systemd/user/openvino-advisory-gateway.service
systemctl --user daemon-reload
systemctl --user enable --now openvino-advisory-gateway.service
systemctl --user status openvino-advisory-gateway.service --no-pager
```

`--allowed-root` may be repeated in the systemd unit when additional non-private fixture/review directories are approved. Keep the service bound to `127.0.0.1` unless Will explicitly approves a Docker-bridge exposure plan.

## Tests

```bash
cd /home/will/lab/swarm/openvino-advisory-gateway
python -m pytest tests/test_gateway.py -q
```
@@ -0,0 +1,350 @@
#!/usr/bin/env python3
"""Local-only advisory gateway for OpenVINO NPU sidecars.

This service deliberately returns bounded advisory envelopes. It never routes,
writes memory, sends external messages, executes tools, restarts services, or
broadens document processing authority. Atlas/Hermes may use these outputs as
hints only.
"""
from __future__ import annotations

import argparse
import hashlib
import json
import os
import sqlite3
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path
from typing import Any, Callable
from urllib.parse import urlparse

HOST = "127.0.0.1"
PORT = 18830
CLASSIFIER_URL = "http://127.0.0.1:18819/v1/classify"
GENAI_URL = "http://127.0.0.1:18820/v1/worker/generate"
DOC_TRIAGE_URL = "http://127.0.0.1:18829/triage"
DEFAULT_LOG_DB = Path(os.environ.get("NPU_ADVISORY_LOG_DB", "/home/will/.local/state/openvino-advisory-gateway/events.sqlite"))
DEFAULT_ALLOWED_ROOT = Path("/home/will/lab/swarm/openvino-doc-image-triage-npu")
DEFAULT_ALLOWED_ROOTS = [Path(p) for p in os.environ.get("NPU_ADVISORY_ALLOWED_ROOTS", str(DEFAULT_ALLOWED_ROOT)).split(os.pathsep) if p]
ALLOWED_GENAI_JOBS = {"title", "summary", "notification", "memory_candidate"}

AUTHORITY = {
    "may_route": False,
    "may_write_memory": False,
    "may_send_external": False,
    "may_process_private_dirs": False,
    "may_execute_tools": False,
    "may_restart_services": False,
}


def sha256_text(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def http_post_json(url: str, payload: dict[str, Any], timeout_s: float = 20.0) -> dict[str, Any]:
    req = urllib.request.Request(url, data=json.dumps(payload).encode("utf-8"), headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


def http_get_json(url: str, timeout_s: float = 8.0) -> dict[str, Any]:
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        body = resp.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return {"ok": True, "raw_text": body[:120]}


def _npu_delta_from(result: dict[str, Any], fallback: int | None = None) -> int | None:
    for key in ("npu_busy_delta_us", "sysfs_npu_busy_delta_us"):
        value = result.get(key)
        if isinstance(value, int):
            return value
        if isinstance(value, float):
            return int(value)
    return fallback


def _doc_triage_npu_delta(result: dict[str, Any]) -> int | None:
    pages = ((result.get("result") or {}).get("pages") or []) if isinstance(result, dict) else []
    best: int | None = None
    for page in pages:
        emb = ((page.get("needs_attention") or {}).get("embedding") or {}) if isinstance(page, dict) else {}
        delta = emb.get("npu_busy_delta_us")
        if isinstance(delta, int):
            best = max(best or 0, delta)
    return best


def build_envelope(
    *,
    service: str,
    operation: str,
    result: dict[str, Any],
    mode: str = "advisory",
    input_scope: str,
    npu_busy_delta_us: int | None,
    trace_id: str | None = None,
    warnings: list[str] | None = None,
) -> dict[str, Any]:
    npu_ok = bool(isinstance(npu_busy_delta_us, int) and npu_busy_delta_us > 0)
    return {
        "ok": True,
        "schema": "openvino_advisory_v1",
        "service": service,
        "operation": operation,
        "mode": mode,
        "trace_id": trace_id,
        "input_scope": input_scope,
        "result": result,
        "npu_proof": {"required": True, "ok": npu_ok, "npu_busy_delta_us": npu_busy_delta_us},
        "authority": dict(AUTHORITY),
        "warnings": warnings or [],
    }


class AdvisoryLogger:
    def __init__(self, db_path: str | Path = DEFAULT_LOG_DB):
        self.db_path = Path(db_path)
        self.db_path.parent.mkdir(parents=True, exist_ok=True)
        self._init()

    def _init(self) -> None:
        with sqlite3.connect(self.db_path) as con:
            con.execute(
                """
                CREATE TABLE IF NOT EXISTS advisory_events (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    created_at REAL NOT NULL,
                    service TEXT NOT NULL,
                    operation TEXT NOT NULL,
                    mode TEXT NOT NULL,
                    input_scope TEXT NOT NULL,
                    input_ref TEXT NOT NULL,
                    npu_busy_delta_us INTEGER,
                    ok INTEGER NOT NULL,
                    raw_payload TEXT
                )
                """
            )

    def log(self, envelope: dict[str, Any], *, input_ref: str) -> None:
        proof = envelope.get("npu_proof") or {}
        with sqlite3.connect(self.db_path) as con:
            con.execute(
                """
                INSERT INTO advisory_events(created_at, service, operation, mode, input_scope, input_ref,
                                            npu_busy_delta_us, ok, raw_payload)
                VALUES (?, ?, ?, ?, ?, ?, ?, ?, NULL)
                """,
                (
                    time.time(),
                    str(envelope.get("service")),
                    str(envelope.get("operation")),
                    str(envelope.get("mode")),
                    str(envelope.get("input_scope")),
                    input_ref,
                    proof.get("npu_busy_delta_us"),
                    1 if envelope.get("ok") else 0,
                ),
            )


def classify_text(
    text: str,
    *,
    trace_id: str | None = None,
    http_post_json: Callable[[str, dict[str, Any], float], dict[str, Any]] = http_post_json,
    logger: AdvisoryLogger | None = None,
    timeout_s: float = 20.0,
) -> dict[str, Any]:
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    payload = {"id": trace_id or "advisory", "text": text, "options": {"include_evidence": False, "dry_run": True}}
    result = http_post_json(CLASSIFIER_URL, payload, timeout_s)
    envelope = build_envelope(
        service="classifier",
        operation="classify",
        mode="shadow",
        input_scope="explicit_text",
        trace_id=trace_id,
        result={"labels": result.get("labels", {}), "model": result.get("model"), "service_mode": result.get("mode", "dry_run")},
        npu_busy_delta_us=_npu_delta_from(result),
    )
    if logger:
        logger.log(envelope, input_ref="text:sha256:" + sha256_text(text))
    return envelope


def generate_bounded(
    job: str,
    text: str,
    *,
    max_new_tokens: int | None = None,
    trace_id: str | None = None,
    http_post_json: Callable[[str, dict[str, Any], float], dict[str, Any]] = http_post_json,
    logger: AdvisoryLogger | None = None,
    timeout_s: float = 180.0,
) -> dict[str, Any]:
    if job not in ALLOWED_GENAI_JOBS:
        raise ValueError("unsupported advisory generation job")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("input must be a non-empty string")
    payload: dict[str, Any] = {"job": job, "input": text}
    if max_new_tokens is not None:
        payload["max_new_tokens"] = max_new_tokens
    result = http_post_json(GENAI_URL, payload, timeout_s)
    envelope = build_envelope(
        service="genai",
        operation=f"generate:{job}",
        mode="draft",
        input_scope="explicit_text",
        trace_id=trace_id,
        result={"draft_text": result.get("text", ""), "json": result.get("json"), "timing_ms": result.get("timing_ms"), "final_authority": False},
        npu_busy_delta_us=_npu_delta_from(result),
    )
    if logger:
        logger.log(envelope, input_ref="text:sha256:" + sha256_text(text))
    return envelope


def _resolve_allowed(path: str, allowed_roots: list[str] | None, configured_roots: list[Path] | None = None) -> tuple[Path, list[Path]]:
    configured = [p.expanduser().resolve() for p in (configured_roots or DEFAULT_ALLOWED_ROOTS)]
    if not configured:
        raise ValueError("at least one configured allowed root is required")
    requested = [Path(p).expanduser().resolve() for p in (allowed_roots or [str(p) for p in configured])]
    if not requested:
        raise ValueError("at least one requested allowed root is required")
    for root in requested:
        if not any(root == base or root.is_relative_to(base) for base in configured):
            raise ValueError("requested allowed root is outside configured roots")
    roots = requested
    candidate = Path(path).expanduser().resolve()
    if not any(candidate == root or candidate.is_relative_to(root) for root in roots):
        raise ValueError("path must be inside an allowed root")
    if not candidate.exists() or not candidate.is_file():
        raise ValueError("path must be an existing file")
    return candidate, roots


def triage_file(
    path: str,
    *,
    allowed_roots: list[str] | None = None,
    configured_roots: list[Path] | None = None,
    trace_id: str | None = None,
    http_post_json: Callable[[str, dict[str, Any], float], dict[str, Any]] = http_post_json,
    logger: AdvisoryLogger | None = None,
    timeout_s: float = 60.0,
) -> dict[str, Any]:
    candidate, roots = _resolve_allowed(path, allowed_roots, configured_roots)
    payload = {"path": str(candidate), "options": {"allowed_roots": [str(r) for r in roots], "max_pages": 3}}
    result = http_post_json(DOC_TRIAGE_URL, payload, timeout_s)
    delta = _doc_triage_npu_delta(result)
    envelope = build_envelope(
        service="doc_triage",
        operation="triage_file",
        mode="reviewable_artifact",
        input_scope="explicit_file",
        trace_id=trace_id,
        result={"triage": result.get("result"), "final_authority": False},
        npu_busy_delta_us=delta,
    )
    if logger:
        envelope["warnings"].append("metadata-only log; raw file contents are not logged")
        logger.log(envelope, input_ref="file:sha256path:" + sha256_text(str(candidate)))
    return envelope


def health(*, http_get_json: Callable[[str, float], dict[str, Any]] = http_get_json) -> dict[str, Any]:
    deps = {
        "classifier": "http://127.0.0.1:18819/healthz",
        "genai": "http://127.0.0.1:18820/healthz",
        "doc_triage": "http://127.0.0.1:18829/healthz",
    }
    out: dict[str, Any] = {"ok": True, "service": "openvino-advisory-gateway", "mode": "advisory_only", "authority": dict(AUTHORITY), "dependencies": {}}
    for name, url in deps.items():
        try:
            data = http_get_json(url, 8.0)
            out["dependencies"][name] = {"ok": bool(data.get("ok", data.get("status") == "ok")), "service": data.get("service"), "device": data.get("device")}
        except Exception as exc:
            out["ok"] = False
            out["dependencies"][name] = {"ok": False, "error": str(exc)}
    return out


def _read_json(handler: BaseHTTPRequestHandler, max_bytes: int = 256 * 1024) -> dict[str, Any]:
    length = int(handler.headers.get("Content-Length", "0"))
    if length > max_bytes:
        raise ValueError("request JSON too large")
    raw = handler.rfile.read(length)
    if not raw:
        return {}
    return json.loads(raw.decode("utf-8"))


def make_handler(logger: AdvisoryLogger, configured_roots: list[Path]):
    class Handler(BaseHTTPRequestHandler):
        server_version = "openvino-advisory-gateway/0.1"

        def log_message(self, format: str, *args: Any) -> None:  # noqa: A002 - stdlib override name
            # Do not log request bodies or private paths.
            print(f"{self.client_address[0]} {format % args}")

        def send_json(self, status: int, payload: Any) -> None:
            body = json.dumps(payload, indent=2, sort_keys=True).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def do_GET(self) -> None:  # noqa: N802
            if urlparse(self.path).path in ("/", "/health", "/healthz"):
                self.send_json(200, health())
                return
            self.send_json(404, {"ok": False, "error": "not_found"})

        def do_POST(self) -> None:  # noqa: N802
            path = urlparse(self.path).path
            try:
                payload = _read_json(self)
                if path == "/v1/advisory/classify":
                    self.send_json(200, classify_text(str(payload.get("text", "")), trace_id=payload.get("trace_id"), logger=logger))
                    return
                if path == "/v1/advisory/generate":
                    self.send_json(200, generate_bounded(str(payload.get("job", "summary")), str(payload.get("input", "")), max_new_tokens=payload.get("max_new_tokens"), trace_id=payload.get("trace_id"), logger=logger))
                    return
                if path == "/v1/advisory/triage":
                    self.send_json(200, triage_file(str(payload.get("path", "")), allowed_roots=payload.get("allowed_roots"), configured_roots=configured_roots, trace_id=payload.get("trace_id"), logger=logger))
                    return
                self.send_json(404, {"ok": False, "error": "not_found"})
            except Exception as exc:
                self.send_json(400, {"ok": False, "error": type(exc).__name__, "message": str(exc), "authority": dict(AUTHORITY)})

    return Handler


def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser(description="Local-only OpenVINO NPU advisory gateway")
    parser.add_argument("--host", default=os.environ.get("NPU_ADVISORY_HOST", HOST))
    parser.add_argument("--port", type=int, default=int(os.environ.get("NPU_ADVISORY_PORT", str(PORT))))
    parser.add_argument("--log-db", default=str(DEFAULT_LOG_DB))
    parser.add_argument("--allowed-root", action="append", dest="allowed_roots", default=None, help="Configured file root allowed for advisory doc/image triage. May be repeated.")
    args = parser.parse_args(argv)
    if args.host != "127.0.0.1":
        raise SystemExit("refusing non-local bind")
    configured_roots = [Path(p).expanduser().resolve() for p in (args.allowed_roots or DEFAULT_ALLOWED_ROOTS)]
    logger = AdvisoryLogger(args.log_db)
    server = ThreadingHTTPServer((args.host, args.port), make_handler(logger, configured_roots))
    print(json.dumps({"service": "openvino-advisory-gateway", "host": args.host, "port": args.port, "mode": "advisory_only"}), flush=True)
    server.serve_forever()
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
@@ -0,0 +1,17 @@
[Unit]
Description=OpenVINO NPU advisory gateway (local-only, port 18830)
After=network.target openvino-router-classifier.service openvino-genai-npu-worker.service openvino-doc-image-triage.service
Wants=openvino-router-classifier.service openvino-genai-npu-worker.service openvino-doc-image-triage.service

[Service]
Type=simple
WorkingDirectory=/home/will/lab/swarm/openvino-advisory-gateway
Environment=NPU_ADVISORY_HOST=127.0.0.1
Environment=NPU_ADVISORY_PORT=18830
Environment=NPU_ADVISORY_LOG_DB=/home/will/.local/state/openvino-advisory-gateway/events.sqlite
ExecStart=/home/will/.venvs/npu/bin/python /home/will/lab/swarm/openvino-advisory-gateway/gateway.py --host 127.0.0.1 --port 18830 --allowed-root /home/will/lab/swarm/openvino-doc-image-triage-npu
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target
@@ -0,0 +1,134 @@
from __future__ import annotations

import json
import sqlite3
import sys
from pathlib import Path

import pytest

sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
import gateway


def test_authority_envelope_is_advisory_and_forbids_side_effects() -> None:
    env = gateway.build_envelope(
        service="classifier",
        operation="classify",
        mode="shadow",
        result={"labels": {"workflow_category": {"value": "devops"}}},
        npu_busy_delta_us=123,
        input_scope="explicit_text",
    )

    assert env["ok"] is True
    assert env["mode"] == "shadow"
    assert env["authority"] == {
        "may_route": False,
        "may_write_memory": False,
        "may_send_external": False,
        "may_process_private_dirs": False,
        "may_execute_tools": False,
        "may_restart_services": False,
    }
    assert env["npu_proof"] == {"required": True, "ok": True, "npu_busy_delta_us": 123}


def test_classify_calls_sidecar_and_logs_metadata_only(tmp_path: Path) -> None:
    calls: list[tuple[str, dict]] = []

    def fake_post(url: str, payload: dict, timeout_s: float) -> dict:
        calls.append((url, payload))
        return {
            "labels": {"tool_needed": {"value": True}},
            "npu_busy_delta_us": 55,
            "sysfs_npu_busy_delta_us": 55,
        }

    logger = gateway.AdvisoryLogger(tmp_path / "events.sqlite")
    env = gateway.classify_text(
        "Inspect live service status",
        trace_id="t1",
        http_post_json=fake_post,
        logger=logger,
    )

    assert calls[0][0].endswith(":18819/v1/classify")
    assert calls[0][1]["options"]["dry_run"] is True
    assert env["service"] == "classifier"
    assert env["authority"]["may_route"] is False
    assert env["npu_proof"]["ok"] is True

    with sqlite3.connect(tmp_path / "events.sqlite") as con:
        row = con.execute("select service, operation, input_ref, raw_payload from advisory_events").fetchone()
    assert row == ("classifier", "classify", "text:sha256:" + gateway.sha256_text("Inspect live service status"), None)


def test_generate_allows_only_bounded_jobs() -> None:
    with pytest.raises(ValueError, match="unsupported advisory generation job"):
        gateway.generate_bounded("primary_chat", "hello", http_post_json=lambda *_: {})


def test_generate_wraps_draft_without_final_authority() -> None:
    def fake_post(url: str, payload: dict, timeout_s: float) -> dict:
        return {"text": "Short title", "npu_busy_delta_us": 99, "timing_ms": {"total": 10}}

    env = gateway.generate_bounded("title", "Summarize this local health check", http_post_json=fake_post)

    assert env["service"] == "genai"
    assert env["operation"] == "generate:title"
    assert env["result"]["draft_text"] == "Short title"
    assert env["result"]["final_authority"] is False
    assert env["authority"]["may_send_external"] is False


def test_doc_triage_requires_explicit_file_under_allowed_root(tmp_path: Path) -> None:
    allowed = tmp_path / "allowed"
    allowed.mkdir()
    target = allowed / "synthetic.png"
    target.write_bytes(b"not real image for unit test")

    def fake_post(url: str, payload: dict, timeout_s: float) -> dict:
        assert payload["path"] == str(target.resolve())
        assert payload["options"]["allowed_roots"] == [str(allowed.resolve())]
        return {"ok": True, "result": {"pages": [{"needs_attention": {"embedding": {"verified_npu": True, "npu_busy_delta_us": 42}}}]}}

    env = gateway.triage_file(str(target), allowed_roots=[str(allowed)], configured_roots=[allowed], http_post_json=fake_post)

    assert env["service"] == "doc_triage"
    assert env["input_scope"] == "explicit_file"
    assert env["npu_proof"]["ok"] is True


def test_doc_triage_rejects_private_root_broadening(tmp_path: Path) -> None:
    allowed = tmp_path / "allowed"
    allowed.mkdir()
    with pytest.raises(ValueError, match="path must be inside an allowed root"):
        gateway.triage_file(str(tmp_path / "outside.png"), allowed_roots=[str(allowed)], configured_roots=[allowed], http_post_json=lambda *_: {})


def test_doc_triage_rejects_requested_root_outside_configured_roots(tmp_path: Path) -> None:
    configured = tmp_path / "configured"
    requested = tmp_path / "private"
    requested.mkdir()
    target = requested / "file.png"
    target.write_bytes(b"synthetic")

    with pytest.raises(ValueError, match="requested allowed root is outside configured roots"):
        gateway.triage_file(
            str(target),
            allowed_roots=[str(requested)],
            configured_roots=[configured],
            http_post_json=lambda *_: {},
        )


def test_health_aggregates_dependencies_without_raw_private_data() -> None:
    def fake_get(url: str, timeout_s: float) -> dict:
        return {"ok": True, "service": url.rsplit(":", 1)[-1]}

    health = gateway.health(http_get_json=fake_get)

    assert health["ok"] is True
    assert set(health["dependencies"]) == {"classifier", "genai", "doc_triage"}
    assert "raw" not in json.dumps(health).lower()
@@ -46,10 +46,10 @@ printf 'busy_time_us=%s\n' "$(busy_value)"

 section "Listeners"
 # Required OpenVINO/NPU program ports: live baseline 18810/18816/18817,
-# approved prototypes 18818/18819/18820, and optional doc/image triage 18829.
+# reranker 18818, local-only specialists 18819/18820/18829, and advisory gateway 18830.
 # 18814 is the existing RAG/embedding health wrapper; 18828 is a review-only
 # alternate used to avoid collisions during prior smoke tests.
-ss -ltnp | grep -E ':(18810|18814|18816|18817|18818|18819|18820|18828|18829)\b' || true
+ss -ltnp | grep -E ':(18810|18814|18816|18817|18818|18819|18820|18828|18829|18830)\b' || true

 section "User service states"
 for unit in \