Files
swarm-master/openvino-doc-image-triage-npu/README.md
T
2026-06-04 15:32:32 -07:00

175 lines
6.8 KiB
Markdown

# OpenVINO NPU document/image triage prototype
Local-only, CLI-first prototype for triaging screenshots, photos/scans, and PDF page images.
It returns structured JSON metadata and explicitly reports CPU vs NPU stages.
Optional HTTP is a localhost/loopback-only prototype on `127.0.0.1:18829` when explicitly started; non-loopback binds are rejected and it is not a live Atlas/Hermes/RAG integration.
Location: `/home/will/lab/swarm/openvino-doc-image-triage-npu/`
## Privacy and safety
- No external uploads.
- The only network call is optional localhost-only embeddings at `127.0.0.1:18817`.
- Raw OCR/sidecar text is redacted by default and is not logged.
- Full source paths are omitted by default; responses include basename and SHA-256.
- Allowed roots are enforced for CLI/server requests.
- This prototype does not mutate Obsidian, RAG, Chroma, vector collections, routing, or gateway services.
- Do not process broad private document/image directories; use generated synthetic fixtures unless Will explicitly approves a narrow source root.
- See `SPEC.md` for the full CLI contract, smoke-test plan, NPU verification plan, docs implications, and no-go/defer criteria.
## CPU vs NPU stages
CPU:
- file intake, allowed-root checks, size checks, hashing
- image/PDF decoding/rendering and normalization
- optional local text extraction from sidecars or PDF text libraries
- regex metadata extraction and rule-based category fallback
- final needs-attention rules
NPU:
- needs-attention semantic embedding, via existing local OpenVINO embeddings service on `:18817`
- verified with `/sys/class/accel/accel0/device/npu_busy_time_us` before/after each embedding call
Not configured in v1:
- image category classifier on NPU. The JSON reports this as `CPU rule fallback (NPU model not configured in prototype v1)`. A future task can add a static-shape MobileNet/EfficientNet/ResNet OpenVINO IR model.
- OCR on NPU. OCR remains CPU/local plumbing in v1.
## Files
- `triage.py` — core library and CLI.
- `server.py` — stdlib HTTP server with `/healthz`, `/models`, `/triage`, `/triage/batch`.
- `openvino-doc-image-triage.service` — local-only user-systemd service template for `127.0.0.1:18829`, limited to this prototype directory as its default allowed root.
- `make_samples.py` — creates synthetic non-private image/PDF samples.
- `tests/smoke_test.py` — end-to-end smoke test, including NPU busy-time verification when `:18817` is reachable.
- `samples/` — generated synthetic fixtures.
## Requirements
Use the existing NPU venv when available:
```bash
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python -m pip install pillow
```
`pillow` is already present in the discovered `/home/will/.venvs/npu`. Optional local PDF text/rendering improves PDF support:
```bash
/home/will/.venvs/npu/bin/python -m pip install pypdf pypdfium2
```
The smoke tests do not require external services except the existing localhost `:18817` embeddings service for positive NPU verification.
## CLI usage
Generate synthetic samples:
```bash
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python make_samples.py
```
Triage local files:
```bash
/home/will/.venvs/npu/bin/python triage.py \
--allowed-root /home/will/lab/swarm/openvino-doc-image-triage-npu \
--pretty \
samples/synthetic_invoice.png samples/synthetic_invoice.pdf
```
Disable the local NPU embeddings call if needed:
```bash
/home/will/.venvs/npu/bin/python triage.py --no-embeddings --allowed-root "$PWD" samples/synthetic_receipt.png
```
Include OCR/sidecar text in a single response only when explicitly requested:
```bash
/home/will/.venvs/npu/bin/python triage.py --include-ocr-text --allowed-root "$PWD" samples/synthetic_invoice.png
```
## HTTP usage
The prototype is CLI-first, and the local HTTP wrapper can be run as a reviewed user-systemd service on `127.0.0.1:18829` with an allowlist rooted at this prototype directory. Keep it local-only and do not broaden allowed roots to private document/image directories without explicit approval. Check the port first:
```bash
ss -ltnp | grep ':18829\b' || true
```
Start a local-only server and stop it after the smoke:
```bash
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python server.py --host 127.0.0.1 --port 18829 --allowed-root "$PWD"
```
Install/enable the reviewed local-only service template when the HTTP wrapper should persist across logins:
```bash
install -m 0644 openvino-doc-image-triage.service ~/.config/systemd/user/openvino-doc-image-triage.service
systemctl --user daemon-reload
systemctl --user enable --now openvino-doc-image-triage.service
systemctl --user status openvino-doc-image-triage.service --no-pager
```
Call it with synthetic/non-private fixtures only:
```bash
curl -sS http://127.0.0.1:18829/healthz | jq
curl -sS http://127.0.0.1:18829/models | jq
curl -sS -X POST http://127.0.0.1:18829/triage \
-H 'Content-Type: application/json' \
-d '{"path":"/home/will/lab/swarm/openvino-doc-image-triage-npu/samples/synthetic_invoice.png","options":{"allowed_roots":["/home/will/lab/swarm/openvino-doc-image-triage-npu"]}}' | jq
```
Do not point it at private document/image directories during smoke tests unless Will explicitly approves the exact source root.
## Smoke test
```bash
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python tests/smoke_test.py
```
Expected: JSON ending with `"ok": true`. The smoke test generates only synthetic fixtures, verifies non-loopback HTTP binds are rejected, starts its temporary server on a preflighted free localhost port, and terminates it before exit. If the embeddings service is up, the result should show positive NPU busy-time delta and each embedded page should report `verified_npu: true`.
## Example output shape
```json
{
"file_id": "sha256:...",
"source_path_basename": "synthetic_invoice.png",
"media_type": "image",
"page_count": 1,
"pages": [
{
"page_index": 0,
"classification": {
"label": "bill_or_invoice",
"confidence": 0.71,
"device": "CPU",
"method": "rule_based_fallback"
},
"needs_attention": {
"value": true,
"device": "NPU+CPU",
"reasons": ["amount_due", "due_date_present"],
"embedding": {"verified_npu": true, "npu_busy_delta_us": 12345}
},
"metadata": {"dates_count": 1, "amounts_count": 1, "raw_values_redacted": true},
"ocr": {"available": true, "device": "CPU"}
}
],
"processing_device_summary": {
"file_intake": "CPU",
"image_category_classification": "CPU rule fallback (NPU model not configured in prototype v1)",
"needs_attention_embedding": "NPU via local :18817",
"metadata_extraction": "CPU",
"npu_verified": true
},
"privacy": {"external_uploads": false, "raw_text_logged": false}
}
```