OpenVINO NPU document/image triage prototype
Local-only, CLI-first prototype for triaging screenshots, photos/scans, and PDF page images.
It returns structured JSON metadata and explicitly reports CPU vs NPU stages.
The optional HTTP server is a loopback-only prototype that listens on 127.0.0.1:18829, and only when explicitly started. Non-loopback binds are rejected, and it is not a live Atlas/Hermes/RAG integration.
Location: /home/will/lab/swarm/openvino-doc-image-triage-npu/
Privacy and safety
- No external uploads.
- The only network call is optional localhost-only embeddings at 127.0.0.1:18817.
- Raw OCR/sidecar text is redacted by default and is not logged.
- Full source paths are omitted by default; responses include basename and SHA-256.
- Allowed roots are enforced for CLI/server requests.
- This prototype does not mutate Obsidian, RAG, Chroma, vector collections, routing, or gateway services.
- Do not process broad private document/image directories; use generated synthetic fixtures unless Will explicitly approves a narrow source root.
- See SPEC.md for the full CLI contract, smoke-test plan, NPU verification plan, docs implications, and no-go/defer criteria.
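The allowed-root and path-omission rules above can be sketched as follows. This is a minimal illustration, not the code in triage.py: the function names `is_under_allowed_root` and `public_file_identity` are hypothetical, and only the `file_id` and `source_path_basename` field names are taken from the example output in this README.

```python
import hashlib
from pathlib import Path


def is_under_allowed_root(path: str, allowed_roots: list[str]) -> bool:
    """Return True if the resolved path sits inside one of the allowed roots.

    Hypothetical helper: resolving first means ".." segments and symlinks
    that point outside a root are rejected.
    """
    resolved = Path(path).resolve()
    for root in allowed_roots:
        if resolved.is_relative_to(Path(root).resolve()):  # Python 3.9+
            return True
    return False


def public_file_identity(path: str) -> dict:
    """Describe a file by basename and SHA-256 only, omitting the full path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in 1 MiB chunks so large scans are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "file_id": f"sha256:{digest.hexdigest()}",
        "source_path_basename": Path(path).name,
    }
```

Checking the resolved path rather than the raw string is the design choice that matters here: a plain prefix comparison would accept a path such as `<root>/../private/scan.png`.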
CPU vs NPU stages
CPU:
- file intake, allowed-root checks, size checks, hashing
- image/PDF decoding/rendering and normalization
- optional local text extraction from sidecars or PDF text libraries
- regex metadata extraction and rule-based category fallback
- final needs-attention rules
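As an illustration of the CPU-side regex metadata step, the sketch below counts dates and amounts without returning the matched values, which is how the redacted `metadata` object in the example output is shaped. The two patterns and the name `extract_metadata` are assumptions made for this example; the actual patterns in triage.py are not shown in this README and are likely broader.

```python
import re

# Assumed, deliberately narrow patterns for illustration only.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")             # ISO dates such as 2024-05-01
AMOUNT_RE = re.compile(r"[$€£]\s?\d[\d,]*(?:\.\d{2})?")    # e.g. $1,234.50


def extract_metadata(text: str) -> dict:
    """Return counts only, so raw OCR/sidecar values never leave this function."""
    return {
        "dates_count": len(DATE_RE.findall(text)),
        "amounts_count": len(AMOUNT_RE.findall(text)),
        "raw_values_redacted": True,
    }
```

Returning counts rather than matches keeps the default response free of raw text, in line with the redaction rule in the privacy section.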
NPU:
- needs-attention semantic embedding, via existing local OpenVINO embeddings service on :18817
- verified with /sys/class/accel/accel0/device/npu_busy_time_us before/after each embedding call
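The before/after busy-time check can be sketched like this. It is a hedged illustration under stated assumptions: the helper names `read_busy_us` and `run_with_npu_check` are hypothetical, the embedding call is passed in as an opaque callable because the API of the :18817 service is not described here, and only the sysfs path and the `verified_npu` / `npu_busy_delta_us` field names come from this README.

```python
from pathlib import Path
from typing import Callable, Optional

NPU_BUSY_PATH = Path("/sys/class/accel/accel0/device/npu_busy_time_us")


def read_busy_us(path: Path = NPU_BUSY_PATH) -> Optional[int]:
    """Read the cumulative NPU busy time in microseconds, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def run_with_npu_check(call: Callable[[], object], path: Path = NPU_BUSY_PATH) -> dict:
    """Run `call` and report whether the busy counter advanced while it ran."""
    before = read_busy_us(path)
    result = call()
    after = read_busy_us(path)
    if before is None or after is None:
        # Counter missing or unreadable: never claim NPU verification.
        return {"result": result, "verified_npu": False, "npu_busy_delta_us": None}
    delta = after - before
    return {"result": result, "verified_npu": delta > 0, "npu_busy_delta_us": delta}
```

A positive delta shows that the NPU was busy during the call, not that this call alone caused it; another NPU workload running at the same time would also advance the counter.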
Not configured in v1:
- image category classifier on NPU. The JSON reports this as "CPU rule fallback (NPU model not configured in prototype v1)". A future task can add a static-shape MobileNet/EfficientNet/ResNet OpenVINO IR model.
- OCR on NPU. OCR remains CPU/local plumbing in v1.
Files
- triage.py: core library and CLI.
- server.py: stdlib HTTP server with /healthz, /models, /triage, /triage/batch.
- make_samples.py: creates synthetic non-private image/PDF samples.
- tests/smoke_test.py: end-to-end smoke test, including NPU busy-time verification when :18817 is reachable.
- samples/: generated synthetic fixtures.
Requirements
Use the existing NPU venv when available:
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python -m pip install pillow
pillow is already present in /home/will/.venvs/npu, so the install above is normally a no-op. Optional local PDF text/rendering libraries improve PDF support:
/home/will/.venvs/npu/bin/python -m pip install pypdf pypdfium2
The smoke tests do not require external services except the existing localhost :18817 embeddings service for positive NPU verification.
CLI usage
Generate synthetic samples:
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python make_samples.py
Triage local files:
/home/will/.venvs/npu/bin/python triage.py \
--allowed-root /home/will/lab/swarm/openvino-doc-image-triage-npu \
--pretty \
samples/synthetic_invoice.png samples/synthetic_invoice.pdf
Disable the local NPU embeddings call if needed:
/home/will/.venvs/npu/bin/python triage.py --no-embeddings --allowed-root "$PWD" samples/synthetic_receipt.png
Include OCR/sidecar text in a single response only when explicitly requested:
/home/will/.venvs/npu/bin/python triage.py --include-ocr-text --allowed-root "$PWD" samples/synthetic_invoice.png
HTTP usage
The prototype is CLI-first. HTTP is optional and not enabled by default. If a foreground HTTP server is needed for review, prefer optional port 18829 so it does not collide with the GenAI worker prototype on 18820. Check the port first:
ss -ltnp | grep ':18829\b' || true
Start a local-only server in the foreground and stop it once the smoke test is done:
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python server.py --host 127.0.0.1 --port 18829 --allowed-root "$PWD"
Call it with synthetic/non-private fixtures only:
curl -sS http://127.0.0.1:18829/healthz | jq
curl -sS http://127.0.0.1:18829/models | jq
curl -sS -X POST http://127.0.0.1:18829/triage \
-H 'Content-Type: application/json' \
-d '{"path":"/home/will/lab/swarm/openvino-doc-image-triage-npu/samples/synthetic_invoice.png","options":{"allowed_roots":["/home/will/lab/swarm/openvino-doc-image-triage-npu"]}}' | jq
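For scripted checks, the POST above can also be issued from Python with the standard library. This is a sketch of a client, assuming the server was started on 127.0.0.1:18829 as shown; `build_triage_request` and `post_triage` are hypothetical names, and the request body mirrors the curl payload exactly.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:18829"  # loopback only, matching the server command above


def build_triage_request(path: str, allowed_roots: list[str]) -> urllib.request.Request:
    """Build the same JSON POST that the curl example sends to /triage."""
    body = json.dumps(
        {"path": path, "options": {"allowed_roots": allowed_roots}}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/triage",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_triage(path: str, allowed_roots: list[str], timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    request = build_triage_request(path, allowed_roots)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the payload inspectable without a running server.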
Do not install or enable a persistent service for this prototype without explicit approval, and do not point it at private document/image directories during smoke tests.
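The rule that non-loopback binds are rejected could be enforced with a check along these lines. This is a sketch rather than the code in server.py; `require_loopback` is a hypothetical name, and rejecting hostnames such as `localhost` in favour of literal IP addresses is an assumption made here to avoid depending on name resolution.

```python
import ipaddress


def require_loopback(host: str) -> str:
    """Return `host` unchanged if it is a loopback IP literal, else raise ValueError."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Assumption for this sketch: hostnames are refused, only IP literals pass.
        raise ValueError(f"host must be a literal IP address, got {host!r}") from None
    if not address.is_loopback:
        raise ValueError(f"refusing non-loopback bind: {host}")
    return host
```

With this check, `require_loopback("127.0.0.1")` and `require_loopback("::1")` pass, while `0.0.0.0` or a LAN address raises before any socket is opened.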
Smoke test
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python tests/smoke_test.py
Expected: JSON ending with "ok": true. The smoke test generates only synthetic fixtures, verifies non-loopback HTTP binds are rejected, starts its temporary server on a preflighted free localhost port, and terminates it before exit. If the embeddings service is up, the result should show positive NPU busy-time delta and each embedded page should report verified_npu: true.
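One way to preflight a free localhost port, as the smoke test is described as doing, is sketched below. How tests/smoke_test.py actually picks its port is not shown in this README, so both helper names are hypothetical.

```python
import socket


def find_free_loopback_port() -> int:
    """Ask the OS for an unused TCP port on 127.0.0.1 by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def port_is_free(port: int) -> bool:
    """Return True if nothing is currently bound to 127.0.0.1:<port>."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind(("127.0.0.1", port))
        except OSError:
            return False
        return True
```

Both helpers are subject to a small race: the port can be taken by another process between the check and the server start, so the server should still handle a bind failure.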
Example output shape
{
  "file_id": "sha256:...",
  "source_path_basename": "synthetic_invoice.png",
  "media_type": "image",
  "page_count": 1,
  "pages": [
    {
      "page_index": 0,
      "classification": {
        "label": "bill_or_invoice",
        "confidence": 0.71,
        "device": "CPU",
        "method": "rule_based_fallback"
      },
      "needs_attention": {
        "value": true,
        "device": "NPU+CPU",
        "reasons": ["amount_due", "due_date_present"],
        "embedding": {"verified_npu": true, "npu_busy_delta_us": 12345}
      },
      "metadata": {"dates_count": 1, "amounts_count": 1, "raw_values_redacted": true},
      "ocr": {"available": true, "device": "CPU"}
    }
  ],
  "processing_device_summary": {
    "file_intake": "CPU",
    "image_category_classification": "CPU rule fallback (NPU model not configured in prototype v1)",
    "needs_attention_embedding": "NPU via local :18817",
    "metadata_extraction": "CPU",
    "npu_verified": true
  },
  "privacy": {"external_uploads": false, "raw_text_logged": false}
}
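A consumer of this output might want to confirm that every embedded page was verified on the NPU. The sketch below does that using only field names from the example above; `npu_verification_report` is a hypothetical helper, and treating a page without an `embedding` object as "not embedded" (for example when run with --no-embeddings) is an assumption.

```python
def npu_verification_report(result: dict) -> dict:
    """Summarize per-page NPU verification from one triage result."""
    pages = result.get("pages", [])
    # Assumption: pages without an "embedding" object were not embedded at all.
    embedded = [
        page for page in pages
        if "embedding" in page.get("needs_attention", {})
    ]
    unverified = [
        page["page_index"]
        for page in embedded
        if not page["needs_attention"]["embedding"].get("verified_npu")
    ]
    return {
        "embedded_pages": len(embedded),
        "unverified_page_indexes": unverified,
        "all_verified": bool(embedded) and not unverified,
    }
```

Requiring at least one embedded page for `all_verified` avoids reporting success on a run where the embeddings service was never called.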