# OpenVINO NPU router classifier prototype

Dry-run Atlas/Hermes message classifier/router prototype.

The detailed dry-run contract is in [`CONTRACT.md`](./CONTRACT.md), including the recommended model/runtime, HTTP/CLI schema, smoke-test plan, NPU busy-time proof, docs/diagram implications, and no-go/defer criteria.

It reuses the existing OpenVINO NPU embeddings service on `127.0.0.1:18817` and serves an inspectable stdlib HTTP API on `127.0.0.1:18819`. It does not change live Hermes/Atlas routing, write memory, mutate vector collections, restart services, or send external messages.

## Runtime shape

- Service: `atlas-router-classifier`
- Default port: `18819`
- Default bind: `127.0.0.1`
- Upstream: `http://127.0.0.1:18817/v1/embeddings`
- Batch limit: `OPENVINO_CLASSIFIER_MAX_BATCH_SIZE`, default `32`
- Model label: `bge-base-en-v1.5-int8-ov/prototype-router-v0`
- NPU proof: `/sys/class/accel/accel0/device/npu_busy_time_us` before/after plus upstream `npu_busy_delta_us`

The classifier uses deterministic high-precision rules for safety/urgency/tool signals plus cosine similarity against curated embedding prototypes for workflow and memory recommendations. This is intentionally tunable without model training.

## API

### GET `/healthz`

Returns service metadata, labels, prototype count, NPU sysfs counter, and warmup NPU delta.

### GET `/v1/labels`

Returns label enum values, thresholds, and prototype IDs without dumping private fixtures.
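
To make the thresholds and prototype IDs reported by this endpoint more concrete, here is a minimal sketch of the rules-plus-prototypes approach described under "Runtime shape". It is not the code in `router_classifier.py`: the prototype IDs, the toy three-dimensional vectors, the `0.6` threshold, the urgency keywords, and the function names are all hypothetical, chosen only to illustrate the idea.

```python
import math
import re

# Hypothetical keyword rule; the real rule set is not documented here.
URGENT_RE = re.compile(r"\b(urgent|asap|immediately)\b", re.IGNORECASE)

# Hypothetical prototypes: (prototype_id, label, embedding vector).
# Real vectors would come from the upstream embeddings service.
PROTOTYPES = [
    ("debug-01", "debugging", [0.9, 0.1, 0.0]),
    ("chat-01", "chat", [0.1, 0.9, 0.1]),
    ("home-01", "smart_home", [0.0, 0.2, 0.9]),
]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rule_urgency(text):
    """Deterministic rule: an urgency keyword maps to 'high', anything else to 'normal'."""
    return "high" if URGENT_RE.search(text) else "normal"


def classify_workflow(embedding, prototypes=PROTOTYPES, threshold=0.6):
    """Pick the label of the nearest prototype, or 'unknown' below the threshold."""
    best_id, best_label, best_score = None, "unknown", -1.0
    for proto_id, label, vector in prototypes:
        score = cosine(embedding, vector)
        if score > best_score:
            best_id, best_label, best_score = proto_id, label, score
    if best_score < threshold:
        return {"value": "unknown", "confidence": best_score, "prototype_id": None}
    return {"value": best_label, "confidence": best_score, "prototype_id": best_id}
```

In a scheme like this, tuning means editing prototypes, keywords, or thresholds rather than retraining a model, which matches the "tunable without model training" goal stated above.
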
### POST `/v1/classify`

Request:

```json
{
  "id": "optional trace id",
  "text": "User message or task body to classify.",
  "context": {"platform": "cli", "source": "user"},
  "options": {
    "include_evidence": true,
    "include_embedding_debug": false,
    "dry_run": true
  }
}
```

Response includes:

- `labels.tool_needed`: boolean, confidence, threshold, reason codes
- `labels.memory_candidate`: `none | user_preference | durable_user_fact | environment_fact | workflow_convention | skill_candidate`
- `labels.urgency`: `low | normal | high | critical`
- `labels.workflow_category`: `chat | research | coding | debugging | devops | smart_home | media | note_taking | productivity | kanban | unknown`
- `labels.safety_confirmation_required`: boolean, confidence, reason codes
- `npu_busy_delta_us` and `sysfs_npu_busy_delta_us`
- `evidence` when requested

### POST `/v1/batch_classify`

Request:

```json
{
  "items": [{"id": "m1", "text": "What time is it?"}],
  "options": {"include_evidence": false, "dry_run": true}
}
```

## Local smoke test

Check that the proposed port is free first:

```bash
ss -ltnp | grep ':18819' || true
```

Run without installing anything extra; `/home/will/.venvs/npu` already has the stdlib plus requests/openvino stack used by the upstream embeddings service:

```bash
cd /home/will/lab/swarm/openvino-classifier-npu
/home/will/.venvs/npu/bin/python router_classifier.py --host 127.0.0.1 --port 18819
```

Environment variables mirror the flags: `OPENVINO_CLASSIFIER_HOST`, `OPENVINO_CLASSIFIER_PORT`, `OPENVINO_CLASSIFIER_EMBED_URL`, `OPENVINO_CLASSIFIER_TIMEOUT_S`, and `OPENVINO_CLASSIFIER_MAX_BATCH_SIZE`.

Then from another shell:

```bash
curl -fsS http://127.0.0.1:18819/healthz | jq .
curl -fsS http://127.0.0.1:18819/v1/classify \
  -H 'Content-Type: application/json' \
  -d '{"id":"smoke","text":"Urgent: check whether port 18817 is listening and inspect systemd logs.","options":{"include_evidence":true}}' | jq .
```

A valid NPU-backed response must have positive `npu_busy_delta_us`; HTTP 200 by itself is not considered proof.

Synthetic fixture smoke helper, after the foreground service is running:

```bash
/home/will/.venvs/npu/bin/python smoke_classifier.py --base-url http://127.0.0.1:18819
```

The helper refuses non-local URLs, checks fixture label expectations, and prints response plus outer sysfs NPU busy deltas.

## Tests

Unit tests use a fake embedding client and do not touch the NPU:

```bash
/home/will/.venvs/npu/bin/python -m unittest discover -s openvino-classifier-npu/tests -v
```

Fixture messages live at `fixtures/atlas_hermes_messages.jsonl`.

## Optional systemd user unit

A draft unit is included as `openvino-router-classifier.service`. Install only after review/approval:

```bash
cp openvino-router-classifier.service ~/.config/systemd/user/openvino-router-classifier.service
systemctl --user daemon-reload
systemctl --user start openvino-router-classifier.service
systemctl --user status openvino-router-classifier.service --no-pager
```

Do not enable it at boot or connect it to live Atlas/Hermes routing as part of this prototype task without explicit approval. Keep classifier decisions dry-run until a separate approved routing change lands.
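
As an illustration of what keeping decisions dry-run could look like on the caller side, the sketch below always sends `"dry_run": true` and treats a response as valid only when `npu_busy_delta_us` is positive, following the NPU proof rule from the smoke test section. The helper names are hypothetical, and the sketch assumes `npu_busy_delta_us` sits at the top level of the response JSON; check `CONTRACT.md` for the actual schema.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:18819"  # default bind and port from this README


def build_request(text, trace_id=None):
    """Build a /v1/classify payload that always asks for a dry run."""
    payload = {
        "text": text,
        "options": {"include_evidence": True, "dry_run": True},
    }
    if trace_id is not None:
        payload["id"] = trace_id
    return payload


def has_npu_proof(response):
    """True only if the response reports a positive NPU busy-time delta.

    Assumes the field is top-level; booleans are rejected explicitly
    because bool is a subclass of int in Python.
    """
    delta = response.get("npu_busy_delta_us")
    if isinstance(delta, bool) or not isinstance(delta, (int, float)):
        return False
    return delta > 0


def classify_dry_run(text, base_url=BASE_URL, timeout_s=10.0):
    """POST one message to the classifier and reject responses without NPU proof."""
    body = json.dumps(build_request(text)).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/v1/classify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        result = json.load(resp)
    if not has_npu_proof(result):
        # HTTP 200 by itself is not considered proof of NPU execution.
        raise RuntimeError("response lacks a positive npu_busy_delta_us")
    return result
```

With the foreground service running, `classify_dry_run("What time is it?")` would return the parsed response or raise if the NPU counter did not move. Nothing in this sketch acts on the returned labels, which is the point of a dry run.
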