[verified] refresh OpenVINO router classifier prototype
@@ -17,6 +17,7 @@ services, or send external messages.
- Default port: `18819`
- Default bind: `127.0.0.1`
- Upstream: `http://127.0.0.1:18817/v1/embeddings`
- Batch limit: `OPENVINO_CLASSIFIER_MAX_BATCH_SIZE`, default `32`
- Model label: `bge-base-en-v1.5-int8-ov/prototype-router-v0`
- NPU proof: `/sys/class/accel/accel0/device/npu_busy_time_us` before/after plus upstream `npu_busy_delta_us`
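
The NPU proof in the list above can be sketched as a before/after read of the sysfs counter around one request. This is a minimal illustration rather than the service's own code: `read_busy_us` and `npu_proof` are hypothetical names, the counter file is assumed to hold a single integer, and `npu_busy_delta_us` is assumed to be a top-level key of the response. Requiring both deltas to be positive is also an assumption of this sketch.

```python
from pathlib import Path
from typing import Callable

# Sysfs counter path taken from the list above.
NPU_BUSY_PATH = Path("/sys/class/accel/accel0/device/npu_busy_time_us")


def read_busy_us(path: Path = NPU_BUSY_PATH) -> int:
    """Read the cumulative NPU busy time (assumes the file holds one integer)."""
    return int(path.read_text().strip())


def npu_proof(
    run_request: Callable[[], dict],
    read_counter: Callable[[], int] = read_busy_us,
) -> dict:
    """Run one request and report the outer and upstream busy deltas."""
    before = read_counter()
    response = run_request()
    after = read_counter()
    outer_delta = after - before
    # Assumption: the response carries the upstream delta as a top-level key.
    upstream_delta = response.get("npu_busy_delta_us", 0)
    return {
        "outer_delta_us": outer_delta,
        "upstream_delta_us": upstream_delta,
        # Assumption: both deltas must be positive to count as NPU-backed.
        "npu_backed": outer_delta > 0 and upstream_delta > 0,
    }
```

Passing `read_counter` explicitly lets the check run against a stubbed counter on machines without the NPU device.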
@@ -90,6 +91,10 @@ cd /home/will/lab/swarm/openvino-classifier-npu
/home/will/.venvs/npu/bin/python router_classifier.py --host 127.0.0.1 --port 18819
```

Environment variables mirror the flags: `OPENVINO_CLASSIFIER_HOST`,
`OPENVINO_CLASSIFIER_PORT`, `OPENVINO_CLASSIFIER_EMBED_URL`,
`OPENVINO_CLASSIFIER_TIMEOUT_S`, and `OPENVINO_CLASSIFIER_MAX_BATCH_SIZE`.
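
One way environment variables can mirror flags is to use them as `argparse` defaults, so an explicit flag still wins. The sketch below is illustrative only and is not taken from `router_classifier.py`: `build_parser` is a hypothetical name, the flag names `--embed-url`, `--timeout-s`, and `--max-batch-size` are guesses (only `--host` and `--port` appear in this README), and the `30` second timeout default is a placeholder.

```python
import argparse
import os


def build_parser(env=None) -> argparse.ArgumentParser:
    """Build a parser whose defaults come from the environment (sketch)."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="router classifier (sketch)")
    parser.add_argument(
        "--host", default=env.get("OPENVINO_CLASSIFIER_HOST", "127.0.0.1")
    )
    parser.add_argument(
        "--port", type=int, default=int(env.get("OPENVINO_CLASSIFIER_PORT", "18819"))
    )
    # The three flag names below are assumptions, not confirmed by the README.
    parser.add_argument(
        "--embed-url",
        default=env.get(
            "OPENVINO_CLASSIFIER_EMBED_URL", "http://127.0.0.1:18817/v1/embeddings"
        ),
    )
    parser.add_argument(
        "--timeout-s",
        type=float,
        # Placeholder default; the real value is not stated in the README.
        default=float(env.get("OPENVINO_CLASSIFIER_TIMEOUT_S", "30")),
    )
    parser.add_argument(
        "--max-batch-size",
        type=int,
        default=int(env.get("OPENVINO_CLASSIFIER_MAX_BATCH_SIZE", "32")),
    )
    return parser
```

With this layout, the precedence is flag, then environment variable, then built-in default. Whether the actual service follows the same precedence is not stated here.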

Then from another shell:

```bash
@@ -102,6 +107,15 @@ curl -fsS http://127.0.0.1:18819/v1/classify \
A valid NPU-backed response must have positive `npu_busy_delta_us`; HTTP 200 by
itself is not considered proof.

Synthetic fixture smoke helper, after the foreground service is running:

```bash
/home/will/.venvs/npu/bin/python smoke_classifier.py --base-url http://127.0.0.1:18819
```

The helper refuses non-local URLs, checks fixture label expectations, and prints
response plus outer sysfs NPU busy deltas.
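
A non-local URL guard of the kind described above might look like the following. This is a sketch under assumptions, not the logic in `smoke_classifier.py`: `is_local_url` is a hypothetical name, and "local" is taken here to mean the literal host `localhost` or a loopback IP address.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host is loopback (sketch)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        # Covers 127.0.0.0/8 and ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as non-local without resolving it.
        return False
```

The sketch deliberately avoids DNS resolution, so a name that happens to resolve to loopback is still refused.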

## Tests

Unit tests use a fake embedding client and do not touch the NPU: