OpenVINO NPU document/image triage prototype

Local-only, CLI-first prototype for triaging screenshots, photos/scans, and PDF page images. It returns structured JSON metadata and explicitly reports which stages ran on CPU and which on NPU. An optional HTTP server, started only on explicit request, listens on loopback at 127.0.0.1:18829; non-loopback binds are rejected. It is not a live Atlas/Hermes/RAG integration.

Location: /home/will/lab/swarm/openvino-doc-image-triage-npu/

Privacy and safety

  • No external uploads.
  • The only network call is an optional request to the local embeddings service at 127.0.0.1:18817.
  • Raw OCR/sidecar text is redacted by default and is not logged.
  • Full source paths are omitted by default; responses include basename and SHA-256.
  • Allowed roots are enforced for CLI/server requests.
  • This prototype does not mutate Obsidian, RAG, Chroma, vector collections, routing, or gateway services.
  • Do not process broad private document/image directories; use generated synthetic fixtures unless Will explicitly approves a narrow source root.
  • See SPEC.md for the full CLI contract, smoke-test plan, NPU verification plan, docs implications, and no-go/defer criteria.
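The allowed-root and path-redaction rules above can be sketched as follows. This is a minimal illustration, not the code in triage.py: the function names resolve_under_roots and public_identity are hypothetical, and only the file_id and source_path_basename field names come from the example output in this README. It assumes Python 3.9+ for Path.is_relative_to.

```python
import hashlib
from pathlib import Path


def resolve_under_roots(path, allowed_roots):
    """Return the resolved path if it sits under an allowed root, else raise."""
    resolved = Path(path).resolve()  # follows symlinks and collapses ".."
    for root in allowed_roots:
        if resolved.is_relative_to(Path(root).resolve()):
            return resolved
    # The message deliberately omits the path so it cannot leak into logs.
    raise PermissionError("path is outside the allowed roots")


def public_identity(path):
    """Report only the basename and a SHA-256 digest, never the full path."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return {"file_id": f"sha256:{digest}", "source_path_basename": Path(path).name}
```

Resolving before comparing matters: a symlink or a "../" segment inside an allowed root would otherwise pass a plain string-prefix check.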

CPU vs NPU stages

CPU:

  • file intake, allowed-root checks, size checks, hashing
  • image/PDF decoding/rendering and normalization
  • optional local text extraction from sidecars or PDF text libraries
  • regex metadata extraction and rule-based category fallback
  • final needs-attention rules
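As a rough sketch of the regex metadata extraction and rule-based category fallback listed above: the patterns, keyword lists, and the "receipt" and "unknown" labels below are assumptions for illustration, and the real rules in triage.py may differ. Only the output field names and the bill_or_invoice label are taken from the example output in this README.

```python
import re

# Hypothetical patterns; the real ones in triage.py may differ.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
AMOUNT_RE = re.compile(r"[$€£]\s?\d+(?:,\d{3})*(?:\.\d{2})?")

# Hypothetical keyword rules, checked in order; first match wins.
CATEGORY_RULES = [
    ("bill_or_invoice", ("invoice", "amount due", "bill")),
    ("receipt", ("receipt", "total paid")),
]


def extract_metadata(text):
    """Count matches only, so raw values never leave this function."""
    return {
        "dates_count": len(DATE_RE.findall(text)),
        "amounts_count": len(AMOUNT_RE.findall(text)),
        "raw_values_redacted": True,
    }


def classify_by_rules(text):
    """Pick the first category whose keywords appear in the text."""
    lowered = text.lower()
    for label, keywords in CATEGORY_RULES:
        if any(keyword in lowered for keyword in keywords):
            return {"label": label, "device": "CPU", "method": "rule_based_fallback"}
    return {"label": "unknown", "device": "CPU", "method": "rule_based_fallback"}
```

Returning counts instead of matched strings keeps the default response consistent with the redaction rule in the privacy section.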

NPU:

  • needs-attention semantic embedding, via the existing local OpenVINO embeddings service on :18817
  • verified by reading /sys/class/accel/accel0/device/npu_busy_time_us before and after each embedding call
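The before/after busy-time check can be sketched like this. The counter path is the one named above; the helper names read_busy_us and call_with_npu_check are hypothetical, and the returned field names mirror the example output in this README. The counter_path parameter exists only so the helper can be exercised on a machine without the NPU driver.

```python
from pathlib import Path

# Counter path from this README; present only on hosts with the NPU driver loaded.
NPU_BUSY_PATH = "/sys/class/accel/accel0/device/npu_busy_time_us"


def read_busy_us(counter_path=NPU_BUSY_PATH):
    """Read the cumulative busy-time counter as an integer of microseconds."""
    return int(Path(counter_path).read_text().strip())


def call_with_npu_check(fn, counter_path=NPU_BUSY_PATH):
    """Run fn() and report whether the busy-time counter advanced during it."""
    before = read_busy_us(counter_path)
    result = fn()
    delta = read_busy_us(counter_path) - before
    return result, {"verified_npu": delta > 0, "npu_busy_delta_us": delta}
```

A positive delta shows the NPU was busy during the call, not that this call caused it. If another NPU workload runs at the same time, the check can report a false positive, so it is best read as supporting evidence.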

Not configured in v1:

  • image category classifier on NPU. The JSON reports this as CPU rule fallback (NPU model not configured in prototype v1). A future task can add a static-shape MobileNet/EfficientNet/ResNet OpenVINO IR model.
  • OCR on NPU. OCR remains CPU/local plumbing in v1.

Files

  • triage.py — core library and CLI.
  • server.py — stdlib HTTP server with /healthz, /models, /triage, /triage/batch.
  • make_samples.py — creates synthetic non-private image/PDF samples.
  • tests/smoke_test.py — end-to-end smoke test, including NPU busy-time verification when :18817 is reachable.
  • samples/ — generated synthetic fixtures.

Requirements

Use the existing NPU venv when available:

cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python -m pip install pillow

pillow is already installed in the existing venv at /home/will/.venvs/npu, so the command above is only needed for a fresh environment. Optional PDF libraries improve local PDF text extraction and rendering:

/home/will/.venvs/npu/bin/python -m pip install pypdf pypdfium2

The smoke tests require no external services. The only dependency is the existing localhost :18817 embeddings service, and it is needed only for positive NPU verification.

CLI usage

Generate synthetic samples:

cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python make_samples.py

Triage local files:

/home/will/.venvs/npu/bin/python triage.py \
  --allowed-root /home/will/lab/swarm/openvino-doc-image-triage-npu \
  --pretty \
  samples/synthetic_invoice.png samples/synthetic_invoice.pdf

Disable the local NPU embeddings call if needed:

/home/will/.venvs/npu/bin/python triage.py --no-embeddings --allowed-root "$PWD" samples/synthetic_receipt.png

Include OCR/sidecar text in a single response only when explicitly requested:

/home/will/.venvs/npu/bin/python triage.py --include-ocr-text --allowed-root "$PWD" samples/synthetic_invoice.png

HTTP usage

The prototype is CLI-first. HTTP is optional and not enabled by default. If a foreground HTTP server is needed for review, use port 18829 so it does not collide with the GenAI worker prototype on 18820. Check that the port is free first:

ss -ltnp | grep ':18829\b' || true

Start a local-only server and stop it after the smoke test:

cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python server.py --host 127.0.0.1 --port 18829 --allowed-root "$PWD"

Call it with synthetic/non-private fixtures only:

curl -sS http://127.0.0.1:18829/healthz | jq
curl -sS http://127.0.0.1:18829/models | jq
curl -sS -X POST http://127.0.0.1:18829/triage \
  -H 'Content-Type: application/json' \
  -d '{"path":"/home/will/lab/swarm/openvino-doc-image-triage-npu/samples/synthetic_invoice.png","options":{"allowed_roots":["/home/will/lab/swarm/openvino-doc-image-triage-npu"]}}' | jq
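For callers that prefer Python over curl, the same /triage request can be sketched with the standard library. The request body mirrors the curl example above; the helper names build_triage_request and triage and the 30-second timeout are assumptions for illustration.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:18829"  # loopback only, as in the curl examples


def build_triage_request(path, allowed_roots):
    """Build the POST request without sending it."""
    body = json.dumps({"path": path, "options": {"allowed_roots": allowed_roots}})
    return urllib.request.Request(
        f"{BASE_URL}/triage",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def triage(path, allowed_roots, timeout=30):
    """Send the request; needs server.py running on 127.0.0.1:18829."""
    request = build_triage_request(path, allowed_roots)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the payload inspectable without a running server. As with curl, pass only synthetic fixtures.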

Do not install or enable a persistent service for this prototype without explicit approval, and do not point it at private document/image directories during smoke tests.

Smoke test

cd /home/will/lab/swarm/openvino-doc-image-triage-npu
/home/will/.venvs/npu/bin/python tests/smoke_test.py

Expected: JSON ending with "ok": true. The smoke test generates only synthetic fixtures, verifies that non-loopback HTTP binds are rejected, starts its temporary server on a preflighted free localhost port, and terminates it before exit. If the embeddings service is up, the result should show a positive NPU busy-time delta, and each embedded page should report verified_npu: true.
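The two server-side checks mentioned above, loopback-only binds and a preflighted free port, can be approximated with the standard library. This is a sketch under assumptions, not the logic in server.py or tests/smoke_test.py: the function names are hypothetical, and this version accepts only IP literals, so a hostname such as "localhost" is rejected instead of being resolved.

```python
import ipaddress
import socket


def is_loopback_host(host):
    """True only for loopback IP literals such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # not an IP literal; reject rather than resolve it


def find_free_loopback_port():
    """Ask the OS for a currently unused TCP port on 127.0.0.1."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        return sock.getsockname()[1]
```

The port is released when the socket closes, so another process could claim it before the server binds. A short-lived smoke test can usually tolerate that race.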

Example output shape

{
  "file_id": "sha256:...",
  "source_path_basename": "synthetic_invoice.png",
  "media_type": "image",
  "page_count": 1,
  "pages": [
    {
      "page_index": 0,
      "classification": {
        "label": "bill_or_invoice",
        "confidence": 0.71,
        "device": "CPU",
        "method": "rule_based_fallback"
      },
      "needs_attention": {
        "value": true,
        "device": "NPU+CPU",
        "reasons": ["amount_due", "due_date_present"],
        "embedding": {"verified_npu": true, "npu_busy_delta_us": 12345}
      },
      "metadata": {"dates_count": 1, "amounts_count": 1, "raw_values_redacted": true},
      "ocr": {"available": true, "device": "CPU"}
    }
  ],
  "processing_device_summary": {
    "file_intake": "CPU",
    "image_category_classification": "CPU rule fallback (NPU model not configured in prototype v1)",
    "needs_attention_embedding": "NPU via local :18817",
    "metadata_extraction": "CPU",
    "npu_verified": true
  },
  "privacy": {"external_uploads": false, "raw_text_logged": false}
}
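A consumer of this output might condense the per-page device information as sketched below. It assumes one result object of the shape shown above; whether the CLI emits a single object or a list for several files is not stated here. The function name summarize_devices and its return keys are hypothetical.

```python
def summarize_devices(result):
    """Collect per-page device facts from one triage result dict."""
    pages = []
    for page in result.get("pages", []):
        embedding = page.get("needs_attention", {}).get("embedding", {})
        pages.append({
            "page_index": page.get("page_index"),
            "classification_device": page.get("classification", {}).get("device"),
            # Missing embedding info (for example with --no-embeddings) counts as unverified.
            "verified_npu": bool(embedding.get("verified_npu", False)),
        })
    return {
        "pages": pages,
        "all_pages_npu_verified": bool(pages) and all(p["verified_npu"] for p in pages),
    }
```

Using .get with defaults means a response produced without the embeddings call is reported as unverified and does not raise.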