[verified] refresh OpenVINO NPU reranker prototype

William Valentin
2026-06-04 12:16:15 -07:00
parent 4dc77bb0c7
commit 418be69f96
4 changed files with 93 additions and 3 deletions
+13 -2
@@ -13,8 +13,9 @@ This service is intentionally not wired into live RAG by default.
## Files
- `SPEC.md` — endpoint/CLI contract, model/runtime recommendation, smoke/NPU proof plan, RAG integration plan, docs implications, and no-go criteria.
- `server.py` — stdlib HTTP OpenVINO Runtime service.
- `server.py` — stdlib HTTP OpenVINO Runtime service with fail-fast localhost listener conflict checks and request validation.
- `smoke.py` — non-private API/ranking/NPU busy-time smoke test.
- `tests/test_server_validation.py` — stdlib unit checks for request validation and listener conflict detection.
- `openvino-reranker.service` — optional user-systemd unit.
## One-time setup
@@ -62,7 +63,7 @@ OPENVINO_RERANKER_MODEL_DIR=/home/will/.cache/openvino-models/rerankers/ms-marco
python /home/will/lab/swarm/openvino-reranker-npu/server.py
```
Startup performs a non-private smoke inference and fails closed when `OPENVINO_RERANKER_DEVICE=NPU` but `npu_busy_time_us` does not increase.
Startup performs a non-private smoke inference and fails closed when `OPENVINO_RERANKER_DEVICE=NPU` but `npu_busy_time_us` does not increase. It also checks whether the requested listener can bind before compiling the OpenVINO model, so obvious port conflicts fail fast; the real server bind still happens immediately after model load.
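The preflight listener check described above could look roughly like the following. This is a minimal sketch, not the code in `server.py`: the helper name `listener_conflict` is hypothetical, and it assumes an IPv4 TCP listener. The probe socket is closed before returning, so the real bind after model load can still fail if another process takes the port in between.

```python
import errno
import socket
from typing import Optional


def listener_conflict(host: str, port: int) -> Optional[str]:
    """Hypothetical helper: return a conflict description, or None if bindable.

    The probe socket is closed on exit, so this only detects conflicts that
    exist right now; it does not reserve the port for the real server.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return f"{host}:{port} is already in use"
            return f"cannot bind {host}:{port}: {exc}"
    return None


if __name__ == "__main__":
    # Port 0 asks the OS for any free port, so this should report no conflict.
    print(listener_conflict("127.0.0.1", 0))
```

The probe deliberately does not set `SO_REUSEADDR`, which keeps the check strict but may report a conflict while old connections on the port are still in `TIME_WAIT`. A server that enables address reuse may want the probe to match.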
## API
@@ -110,6 +111,16 @@ Expected:
- The top result matches the non-private fixture expectation.
- Both the `npu_busy_delta_us` reported in the response and the delta read from sysfs are positive.
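For illustration, a busy-time delta of this kind can be computed by reading the counter before and after an inference. This is a sketch under assumptions rather than the logic in `smoke.py`: the function names are hypothetical, the counter is assumed to be a text file holding a single integer in microseconds, and its sysfs location is passed in because it depends on the machine.

```python
from pathlib import Path
from typing import Callable


def read_busy_time_us(counter_path: Path) -> int:
    """Read a cumulative busy-time counter stored as a single integer."""
    return int(counter_path.read_text().strip())


def busy_delta_us(counter_path: Path, run_inference: Callable[[], object]) -> int:
    """Return how much the counter grew while run_inference() executed.

    A result of 0 suggests the work did not run on the device that owns
    the counter, which is the condition the fail-closed check looks for.
    """
    before = read_busy_time_us(counter_path)
    run_inference()
    after = read_busy_time_us(counter_path)
    return after - before
```

A caller would pass the `npu_busy_time_us` path for its own system and a callable that performs one smoke inference, then treat a non-positive delta as a failure.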
## Validation checks
```bash
source /home/will/.venvs/openvino-reranker/bin/activate
PYTHONPATH=/home/will/lab/swarm/openvino-reranker-npu \
python -m unittest discover -s /home/will/lab/swarm/openvino-reranker-npu/tests
```
These checks do not compile the OpenVINO model; they cover request validation and fail-fast listener conflict detection.
## Optional systemd user service
Install the unit only after the foreground command and smoke test pass: