# OpenVINO NPU document/image triage prototype

Local-only, CLI-first prototype for triaging screenshots, photos/scans, and PDF page images.
It returns structured JSON metadata and explicitly reports which stages ran on the CPU and which on the NPU.
The optional HTTP server is a loopback-only prototype on `127.0.0.1:18829` that runs only when explicitly started; non-loopback binds are rejected, and it is not a live Atlas/Hermes/RAG integration.

Location: `/home/will/lab/swarm/openvino-doc-image-triage-npu/`

## Privacy and safety

- No external uploads.
- The only network call is optional localhost-only embeddings at `127.0.0.1:18817`.
- Raw OCR/sidecar text is redacted by default and is not logged.
- Full source paths are omitted by default; responses include basename and SHA-256.
- Allowed roots are enforced for CLI/server requests.
- This prototype does not mutate Obsidian, RAG, Chroma, vector collections, routing, or gateway services.
- Do not process broad private document/image directories; use generated synthetic fixtures unless Will explicitly approves a narrow source root.
- See `SPEC.md` for the full CLI contract, smoke-test plan, NPU verification plan, docs implications, and no-go/defer criteria.

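The allowed-root check and the basename-plus-SHA-256 reporting listed above can be sketched as follows. This is a minimal illustration, not the code in `triage.py`: the function names `is_under_allowed_root` and `describe_file` are hypothetical, and the real implementation may stream large files or apply extra size checks.

```python
import hashlib
import tempfile
from pathlib import Path


def is_under_allowed_root(path: Path, allowed_roots: list[Path]) -> bool:
    """Return True if the resolved path sits inside one of the allowed roots."""
    resolved = path.resolve()  # collapses ".." segments and follows symlinks
    return any(resolved.is_relative_to(root.resolve()) for root in allowed_roots)


def describe_file(path: Path) -> dict:
    """Report only the basename and a SHA-256 digest, never the full path."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return {"file_id": f"sha256:{digest}", "source_path_basename": path.name}


if __name__ == "__main__":
    # Demo on a throwaway synthetic file; no private directories are touched.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        sample = root / "synthetic_note.txt"
        sample.write_bytes(b"synthetic fixture")
        print(is_under_allowed_root(sample, [root]))
        print(is_under_allowed_root(root / ".." / "elsewhere.txt", [root]))
        print(describe_file(sample))
```

Resolving the path before comparing it is the important design choice here: a plain string-prefix test would accept `root/../elsewhere.txt`, while the resolved comparison rejects it. `Path.is_relative_to` requires Python 3.9 or newer.
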
## CPU vs NPU stages

CPU:
- file intake, allowed-root checks, size checks, hashing
- image/PDF decoding/rendering and normalization
- optional local text extraction from sidecars or PDF text libraries
- regex metadata extraction and rule-based category fallback
- final needs-attention rules

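As a rough illustration of the CPU-side regex metadata extraction and rule-based category fallback, the sketch below counts matches rather than returning them, so raw values stay redacted. The patterns, the keyword rules, the `receipt` and `unknown` labels, and the function names are all assumptions made for this example; the rules in `triage.py` may differ.

```python
import re

# Illustrative patterns only; the real patterns in triage.py may differ.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
AMOUNT_RE = re.compile(r"[$€£]\s?\d[\d,]*(?:\.\d{2})?")

# Hypothetical keyword rules, checked in order; the first match wins.
CATEGORY_RULES = [
    ("bill_or_invoice", ("invoice", "amount due", "bill to")),
    ("receipt", ("receipt", "change due", "subtotal")),
]


def extract_metadata(text: str) -> dict:
    """Count dates and amounts without exposing the matched values."""
    return {
        "dates_count": len(DATE_RE.findall(text)),
        "amounts_count": len(AMOUNT_RE.findall(text)),
        "raw_values_redacted": True,
    }


def classify_by_rules(text: str) -> str:
    """Return the first label whose keywords appear in the text."""
    lowered = text.lower()
    for label, keywords in CATEGORY_RULES:
        if any(keyword in lowered for keyword in keywords):
            return label
    return "unknown"


if __name__ == "__main__":
    sample = "Invoice 42\nAmount due: $1,234.50\nDue 2024-05-01"
    print(extract_metadata(sample))
    print(classify_by_rules(sample))
```
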
NPU:
- needs-attention semantic embedding, via existing local OpenVINO embeddings service on `:18817`
- verified with `/sys/class/accel/accel0/device/npu_busy_time_us` before/after each embedding call

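The before/after busy-time check can be sketched as below. Only the sysfs path comes from this README; the helper names `read_busy_us` and `run_with_npu_check` are hypothetical, and the sketch assumes the counter file holds a single integer in microseconds.

```python
from pathlib import Path
from typing import Callable, Optional

NPU_BUSY_PATH = Path("/sys/class/accel/accel0/device/npu_busy_time_us")


def read_busy_us(path: Path = NPU_BUSY_PATH) -> Optional[int]:
    """Return the counter value, or None if it is missing or unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def run_with_npu_check(work: Callable[[], object], path: Path = NPU_BUSY_PATH) -> dict:
    """Run work() and report whether the busy-time counter advanced meanwhile."""
    before = read_busy_us(path)
    result = work()
    after = read_busy_us(path)
    if before is None or after is None:
        # No counter available: never claim NPU usage that was not observed.
        return {"result": result, "verified_npu": False, "npu_busy_delta_us": None}
    delta = after - before
    return {"result": result, "verified_npu": delta > 0, "npu_busy_delta_us": delta}
```

In use, `work` would wrap a single embedding call to the `:18817` service. The counter is read device-wide, so a positive delta presumably only shows that the NPU was busy during the call; if another NPU workload runs at the same time, the delta alone cannot attribute the activity to this request.
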
Not configured in v1:
- image category classifier on NPU. The JSON reports this as `CPU rule fallback (NPU model not configured in prototype v1)`. A future task can add a static-shape MobileNet/EfficientNet/ResNet OpenVINO IR model.
- OCR on NPU. OCR remains CPU/local plumbing in v1.

## Files

- `triage.py`: core library and CLI.
- `server.py`: stdlib HTTP server with `/healthz`, `/models`, `/triage`, `/triage/batch`.
- `make_samples.py`: creates synthetic non-private image/PDF samples.
- `tests/smoke_test.py`: end-to-end smoke test, including NPU busy-time verification when `:18817` is reachable.
- `samples/`: generated synthetic fixtures.

## Requirements

Use the existing NPU venv when available:

```bash
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python -m pip install pillow
```

`pillow` is already present in the discovered `/home/will/.venvs/npu`, so the command above is normally a no-op there. Optional local PDF text-extraction and rendering libraries improve PDF support:

```bash
/home/will/.venvs/npu/bin/python -m pip install pypdf pypdfium2
```

The smoke tests require no external services. Positive NPU verification additionally needs the existing localhost `:18817` embeddings service.

## CLI usage

Generate synthetic samples:

```bash
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python make_samples.py
```

Triage local files:

```bash
/home/will/.venvs/npu/bin/python triage.py \
  --allowed-root /home/will/lab/swarm/openvino-doc-image-triage-npu \
  --pretty \
  samples/synthetic_invoice.png samples/synthetic_invoice.pdf
```

Disable the local NPU embeddings call if needed:

```bash
/home/will/.venvs/npu/bin/python triage.py --no-embeddings --allowed-root "$PWD" samples/synthetic_receipt.png
```

Include OCR/sidecar text in a single response only when explicitly requested:

```bash
/home/will/.venvs/npu/bin/python triage.py --include-ocr-text --allowed-root "$PWD" samples/synthetic_invoice.png
```

## HTTP usage

The prototype is CLI-first. HTTP is optional and not enabled by default. If a foreground HTTP server is needed for review, prefer optional port `18829` so it does not collide with the GenAI worker prototype on `18820`. Check the port first:

```bash
ss -ltnp | grep ':18829\b' || true
```

Start a local-only server in the foreground and stop it once the smoke check is done:

```bash
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python server.py --host 127.0.0.1 --port 18829 --allowed-root "$PWD"
```

Call it with synthetic/non-private fixtures only:

```bash
curl -sS http://127.0.0.1:18829/healthz | jq
curl -sS http://127.0.0.1:18829/models | jq
curl -sS -X POST http://127.0.0.1:18829/triage \
  -H 'Content-Type: application/json' \
  -d '{"path":"/home/will/lab/swarm/openvino-doc-image-triage-npu/samples/synthetic_invoice.png","options":{"allowed_roots":["/home/will/lab/swarm/openvino-doc-image-triage-npu"]}}' | jq
```

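For callers that prefer Python over `curl`, the same `/triage` request could be built with the standard library, roughly as sketched here. The request body mirrors the `curl` example; the helper names `build_triage_request` and `post_triage`, the client-side loopback guard, and the 30-second timeout are assumptions for illustration and are not part of `server.py`.

```python
import json
import urllib.request
from urllib.parse import urlsplit

BASE_URL = "http://127.0.0.1:18829"
LOOPBACK_HOSTS = ("127.0.0.1", "localhost", "::1")


def build_triage_request(
    path: str, allowed_roots: list[str], base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build the POST /triage request, refusing any non-loopback target."""
    host = urlsplit(base_url).hostname
    if host not in LOOPBACK_HOSTS:
        raise ValueError(f"refusing non-loopback host: {host!r}")
    payload = {"path": path, "options": {"allowed_roots": allowed_roots}}
    return urllib.request.Request(
        f"{base_url}/triage",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_triage(path: str, allowed_roots: list[str], timeout: float = 30.0) -> dict:
    """Send the request to a running local server and decode the JSON reply."""
    request = build_triage_request(path, allowed_roots)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

`post_triage` needs the foreground server from the previous step to be running; `build_triage_request` can be inspected without any network access.
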
Do not install or enable a persistent service for this prototype without explicit approval, and do not point it at private document/image directories during smoke tests.

## Smoke test

```bash
cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python tests/smoke_test.py
```

Expected: JSON ending with `"ok": true`. The smoke test generates only synthetic fixtures, verifies that non-loopback HTTP binds are rejected, starts its temporary server on a preflighted free localhost port, and terminates that server before exit. If the embeddings service is up, the result should show a positive NPU busy-time delta, and each embedded page should report `verified_npu: true`.

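The two server-side safeguards the smoke test exercises, rejecting non-loopback binds and preflighting a free localhost port, might look roughly like this. It is a sketch under assumptions: `require_loopback` and `find_free_loopback_port` are hypothetical names, and the actual checks in `server.py` and `tests/smoke_test.py` may be implemented differently.

```python
import ipaddress
import socket


def require_loopback(host: str) -> str:
    """Return host unchanged if it is a loopback IP literal, else raise ValueError."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        raise ValueError(f"host must be an IP literal, got {host!r}") from None
    if not address.is_loopback:
        raise ValueError(f"non-loopback bind rejected: {host}")
    return host


def find_free_loopback_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port on the IPv4 loopback interface."""
    require_loopback(host)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the kernel pick a free port
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(find_free_loopback_port())
    for candidate in ("127.0.0.1", "0.0.0.0"):
        try:
            print(candidate, "accepted:", require_loopback(candidate))
        except ValueError as error:
            print(candidate, "rejected:", error)
```

The preflighted port is released before the server binds it, so another process could in principle claim it in between; for a short-lived local smoke test that race is usually acceptable.
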
## Example output shape

```json
{
  "file_id": "sha256:...",
  "source_path_basename": "synthetic_invoice.png",
  "media_type": "image",
  "page_count": 1,
  "pages": [
    {
      "page_index": 0,
      "classification": {
        "label": "bill_or_invoice",
        "confidence": 0.71,
        "device": "CPU",
        "method": "rule_based_fallback"
      },
      "needs_attention": {
        "value": true,
        "device": "NPU+CPU",
        "reasons": ["amount_due", "due_date_present"],
        "embedding": {"verified_npu": true, "npu_busy_delta_us": 12345}
      },
      "metadata": {"dates_count": 1, "amounts_count": 1, "raw_values_redacted": true},
      "ocr": {"available": true, "device": "CPU"}
    }
  ],
  "processing_device_summary": {
    "file_intake": "CPU",
    "image_category_classification": "CPU rule fallback (NPU model not configured in prototype v1)",
    "needs_attention_embedding": "NPU via local :18817",
    "metadata_extraction": "CPU",
    "npu_verified": true
  },
  "privacy": {"external_uploads": false, "raw_text_logged": false}
}
```

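A consumer of this output might reduce it to the few fields that matter for review, for example which pages need attention and whether NPU use was actually verified. The sketch below relies only on the field names shown in the example above; the `summarize` function and the shape of its return value are hypothetical.

```python
import json


def summarize(result: dict) -> dict:
    """Reduce one triage result to a short review summary."""
    pages = result.get("pages", [])
    flagged = [
        page["page_index"]
        for page in pages
        if page.get("needs_attention", {}).get("value")
    ]
    # Pages that carry an embedding record but were not verified on the NPU.
    unverified = [
        page["page_index"]
        for page in pages
        if "embedding" in page.get("needs_attention", {})
        and not page["needs_attention"]["embedding"].get("verified_npu")
    ]
    device_summary = result.get("processing_device_summary", {})
    return {
        "file": result.get("source_path_basename"),
        "pages_needing_attention": flagged,
        "pages_without_npu_verification": unverified,
        "npu_verified": bool(device_summary.get("npu_verified")),
    }


if __name__ == "__main__":
    example = json.loads(
        '{"source_path_basename": "synthetic_invoice.png",'
        ' "pages": [{"page_index": 0, "needs_attention": {"value": true,'
        ' "embedding": {"verified_npu": true, "npu_busy_delta_us": 12345}}}],'
        ' "processing_device_summary": {"npu_verified": true}}'
    )
    print(summarize(example))
```
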