Compare commits
7 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| 08fb9ca686 | |||
| 9e5ffa0fd0 | |||
| d2bad88596 | |||
| 6906c2079b | |||
| 6155b54ab5 | |||
| 5a14adaf58 | |||
| b7b4edf0f5 |
@@ -37,6 +37,8 @@ For the current host-side AI/search/voice automation stack, n8n watchdogs, and a
|
||||
- [`docs/swarm-infrastructure.md`](docs/swarm-infrastructure.md) — operational overview and quick checks
|
||||
- [`docs/swarm-infrastructure.html`](docs/swarm-infrastructure.html) — dark SVG architecture diagram
|
||||
- [`docs/diagram-maintenance.md`](docs/diagram-maintenance.md) — diagram upkeep conventions
|
||||
- [`docs/npu-utilization-digest.md`](docs/npu-utilization-digest.md) — compact on-demand NPU proof/utilization digest runbook
|
||||
- [`docs/npu-integrated-health-ops.md`](docs/npu-integrated-health-ops.md) — integrated operator health-check workflow combining `npu-service-health.sh` and the utilization digest
|
||||
- OpenVINO NPU services and prototypes are documented in `swarm-common/obsidian-vault/will/will-shared-zap/Runbooks/OpenVINO NPU Services Runbook.md` and the component READMEs under `openvino-*-npu*/`. Live baseline ports are RAG `:18810`, Whisper NPU `:18816`, and embeddings `:18817`; sidecar ports `:18818`, `:18819`, `:18820`, and optional doc/image triage `:18829` are approved prototypes only, not live Atlas/Hermes routing.
|
||||
|
||||
## VM: zap
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
version: 1
|
||||
policy:
|
||||
default_mode: dry_run
|
||||
require_explicit_root: true
|
||||
allow_external_uploads: false
|
||||
allow_mutations: false
|
||||
log_raw_text: false
|
||||
include_full_paths_default: false
|
||||
npu_proof_path: /sys/class/accel/accel0/device/npu_busy_time_us
|
||||
|
||||
# Copy to config/triage-roots.local.yaml and approve exactly one narrow,
|
||||
# lane-specific staging root. The committed template is intentionally
|
||||
# unapproved/fail-closed; do not point any lane at broad home, Downloads,
|
||||
# vault, screenshot, photo-library, or historical audio roots without explicit
|
||||
# approval for that exact lane/root.
|
||||
roots:
|
||||
screenshots:
|
||||
approved: false
|
||||
root: null
|
||||
allowed_extensions: [.png, .jpg, .jpeg, .webp, .heic]
|
||||
max_files: 50
|
||||
max_file_mb: 25
|
||||
receipts:
|
||||
approved: false
|
||||
root: null
|
||||
allowed_extensions: [.png, .jpg, .jpeg, .pdf, .webp]
|
||||
max_files: 50
|
||||
max_file_mb: 25
|
||||
downloads:
|
||||
approved: false
|
||||
root: null
|
||||
allowed_extensions: [.pdf, .png, .jpg, .jpeg, .webp]
|
||||
max_files: 50
|
||||
max_file_mb: 25
|
||||
obsidian_attachments:
|
||||
approved: false
|
||||
root: null
|
||||
allowed_extensions: [.pdf, .png, .jpg, .jpeg, .webp, .mp3, .m4a, .wav, .ogg]
|
||||
max_files: 50
|
||||
max_file_mb: 50
|
||||
voice_memos:
|
||||
approved: false
|
||||
root: null
|
||||
allowed_extensions: [.mp3, .m4a, .wav, .ogg, .opus]
|
||||
max_files: 25
|
||||
max_file_mb: 100
|
||||
meeting_snippets:
|
||||
approved: false
|
||||
root: null
|
||||
allowed_extensions: [.mp3, .m4a, .wav, .ogg, .opus]
|
||||
max_files: 25
|
||||
max_file_mb: 200
|
||||
@@ -0,0 +1,46 @@
|
||||
version: 1
|
||||
policy:
|
||||
default_mode: dry_run
|
||||
require_explicit_root: true
|
||||
allow_external_uploads: false
|
||||
allow_mutations: false
|
||||
log_raw_text: false
|
||||
include_full_paths_default: false
|
||||
npu_proof_path: /sys/class/accel/accel0/device/npu_busy_time_us
|
||||
roots:
|
||||
screenshots:
|
||||
approved: true
|
||||
root: ../openvino-doc-image-triage-npu/samples
|
||||
allowed_extensions: [.png, .jpg, .jpeg, .webp, .heic]
|
||||
max_files: 50
|
||||
max_file_mb: 25
|
||||
receipts:
|
||||
approved: true
|
||||
root: ../openvino-doc-image-triage-npu/samples
|
||||
allowed_extensions: [.png, .jpg, .jpeg, .pdf, .webp]
|
||||
max_files: 50
|
||||
max_file_mb: 25
|
||||
downloads:
|
||||
approved: true
|
||||
root: ../openvino-doc-image-triage-npu/samples
|
||||
allowed_extensions: [.pdf, .png, .jpg, .jpeg, .webp]
|
||||
max_files: 50
|
||||
max_file_mb: 25
|
||||
obsidian_attachments:
|
||||
approved: true
|
||||
root: ../openvino-doc-image-triage-npu/samples
|
||||
allowed_extensions: [.pdf, .png, .jpg, .jpeg, .webp, .mp3, .m4a, .wav, .ogg]
|
||||
max_files: 50
|
||||
max_file_mb: 50
|
||||
voice_memos:
|
||||
approved: true
|
||||
root: ../tmp/synthetic-voice-memos
|
||||
allowed_extensions: [.mp3, .m4a, .wav, .ogg, .opus]
|
||||
max_files: 25
|
||||
max_file_mb: 100
|
||||
meeting_snippets:
|
||||
approved: true
|
||||
root: ../tmp/synthetic-meeting-snippets
|
||||
allowed_extensions: [.mp3, .m4a, .wav, .ogg, .opus]
|
||||
max_files: 25
|
||||
max_file_mb: 200
|
||||
@@ -0,0 +1,65 @@
|
||||
# Explicit-root NPU batch triage dry-run examples
|
||||
|
||||
These examples are wrappers only. They do not install cron jobs, enable services,
|
||||
change Atlas/Hermes routing, write Obsidian/RAG/vector DBs, move/delete files, or
|
||||
send outbound messages.
|
||||
|
||||
The committed manifest template at `config/triage-roots.example.yaml` is
|
||||
intentionally unapproved. For real private data, copy it to
|
||||
`config/triage-roots.local.yaml` and approve exactly one narrow lane-specific
|
||||
staging folder. Request-level `--root` may narrow that manifest root but cannot
|
||||
broaden it.
|
||||
|
||||
Synthetic document/image smoke, CPU-only/no NPU claim:
|
||||
|
||||
```bash
|
||||
python scripts/npu-batch-triage-dry-run.py \
|
||||
--manifest config/triage-roots.test.yaml \
|
||||
--lane screenshots \
|
||||
--root openvino-doc-image-triage-npu/samples \
|
||||
--limit 5 \
|
||||
--dry-run \
|
||||
--no-npu \
|
||||
--json
|
||||
```
|
||||
|
||||
Synthetic document/image smoke with the existing local embeddings NPU service,
|
||||
if `127.0.0.1:18817` is healthy. Treat NPU as proven only when `npu.proof_ok` is
|
||||
true and `npu.busy_delta_us` (or item-level delta) is positive:
|
||||
|
||||
```bash
|
||||
python scripts/npu-batch-triage-dry-run.py \
|
||||
--manifest config/triage-roots.test.yaml \
|
||||
--lane receipts \
|
||||
--root openvino-doc-image-triage-npu/samples \
|
||||
--limit 5 \
|
||||
--dry-run \
|
||||
--json
|
||||
```
|
||||
|
||||
Audio smoke should use generated/public synthetic audio only until a private
|
||||
audio staging root is approved:
|
||||
|
||||
```bash
|
||||
python scripts/npu-batch-triage-dry-run.py \
|
||||
--manifest config/triage-roots.test.yaml \
|
||||
--lane voice_memos \
|
||||
--root tmp/synthetic-voice-memos \
|
||||
--limit 3 \
|
||||
--dry-run \
|
||||
--no-npu \
|
||||
--json
|
||||
```
|
||||
|
||||
Cron/n8n shape (disabled example only):
|
||||
|
||||
```text
|
||||
Manual Trigger / disabled cron
|
||||
-> Execute Command: python /home/will/lab/swarm/scripts/npu-batch-triage-dry-run.py --manifest /home/will/lab/swarm/config/triage-roots.local.yaml --lane receipts --limit 25 --dry-run --json
|
||||
-> IF ok && npu.proof_ok && files_processed > 0
|
||||
-> local dashboard/report only
|
||||
```
|
||||
|
||||
Do not connect this output to Telegram/Discord/email sends, Obsidian writes,
|
||||
RAG/vector reindex, file moves/deletes, Kanban mutation, service restarts, or
|
||||
Atlas/Hermes routing without a separate reviewed approval gate.
|
||||
@@ -0,0 +1,201 @@
|
||||
# NPU integrated health checks — operator runbook notes
|
||||
|
||||
Compact, read-only operator workflow that combines the existing
|
||||
`scripts/npu-service-health.sh` listener/systemd/embedding-proof probe with the
|
||||
reviewer-approved `scripts/npu-utilization-digest.py` per-service utilization
|
||||
and fallback report. Together they form a single safe daily / on-demand NPU
|
||||
health pass.
|
||||
|
||||
Scope:
|
||||
|
||||
- Read-only against live services. No restarts, route changes, vector mutation,
|
||||
advisory POSTs, outbound sends, or memory writes.
|
||||
- No new persistent services, timers, sockets, compose services, or Dockerfiles
|
||||
are introduced by this integration. Both scripts are foreground / on-demand.
|
||||
- Binds verified local-only or on the approved Docker bridge (`172.19.0.1:18830`).
|
||||
Pre-existing broader binds on the live baseline ports (`18810`, `18814`,
|
||||
`18816`, `18817`) are noted in the runbook and unchanged here.
|
||||
- NPU proof requires real inference plus a positive
|
||||
`/sys/class/accel/accel0/device/npu_busy_time_us` delta. HTTP 200 alone is
|
||||
not sufficient.
|
||||
|
||||
## When to run
|
||||
|
||||
- Daily / on-demand ops check.
|
||||
- After upgrades that touch the NPU stack, OpenVINO, or any of the live
|
||||
specialists.
|
||||
- Before any approval-gated change that depends on the NPU reflex layer.
|
||||
- As the read-only verification step of a deploy or recovery runbook.
|
||||
|
||||
## Required artifacts on the branch
|
||||
|
||||
| Path | Role |
|
||||
| --- | --- |
|
||||
| `scripts/npu-service-health.sh` | Listener / systemd / Docker / health endpoint / single embedding proof. Existing baseline script. |
|
||||
| `scripts/npu-utilization-digest.py` | Per-service utilization digest with NPU proof per probe, compact text or JSONL output, optional JSONL artifact. |
|
||||
| `docs/npu-utilization-digest.md` | Per-service digest reference. |
|
||||
| `tests/test_npu_utilization_digest.py` | Offline unit tests for the digest (no live services required). |
|
||||
|
||||
## Integrated workflow
|
||||
|
||||
### Step 1 — Listener and service-state snapshot
|
||||
|
||||
```bash
|
||||
cd ~/lab/swarm
|
||||
./scripts/npu-service-health.sh
|
||||
```
|
||||
|
||||
What it verifies, in order:
|
||||
|
||||
1. `npu_busy_time_us` counter is readable.
|
||||
2. Required listeners are present on `18810 / 18814 / 18816 / 18817 / 18818 /
|
||||
18819 / 18820 / 18829 / 18830`.
|
||||
3. User systemd services are active/enabled for embeddings, RAG health,
|
||||
reranker, router/classifier, and the small GenAI worker.
|
||||
4. Docker Compose `whisper-server-npu` is up.
|
||||
5. Health endpoints return JSON for the live baseline and local specialists.
|
||||
6. A single non-private embeddings request to `:18817` produces a positive
|
||||
sysfs `npu_busy_time_us` delta; the script exits nonzero if there is no
|
||||
positive delta.
|
||||
|
||||
Read the last block (`== Embeddings NPU busy-time proof ==`) first. If
|
||||
`result=ok` and `sysfs_delta_us > 0`, the central NPU path is healthy. If not,
|
||||
do not run the digest; triage the embeddings service first.
|
||||
|
||||
### Step 2 — Per-service utilization digest
|
||||
|
||||
```bash
|
||||
scripts/npu-utilization-digest.py --no-write --include-genai-smoke false --format text
|
||||
```
|
||||
|
||||
Compact output shape:
|
||||
|
||||
```text
|
||||
NPU utilization digest <timestamp>
|
||||
counter=/sys/class/accel/accel0/device/npu_busy_time_us delta_us=<total>
|
||||
services_ok=<ok>/<total> proof_ok=<ok>/<proof-capable> fallbacks=<n> gates_closed=<n>
|
||||
- embeddings: ok=true calls=1 avg_ms=... npu_delta_us=... proof=true mode=NPU
|
||||
- rerank: ok=true calls=1 docs=2 avg_ms=... npu_delta_us=... proof=true mode=NPU
|
||||
- whisper: ok=true calls=1 jobs=1 avg_ms=... npu_delta_us=... proof=true mode=NPU
|
||||
- classifier: ok=true calls=1 events=1 avg_ms=... npu_delta_us=... proof=true dry_run=true ...
|
||||
- genai: ok=true jobs=0 loaded=false mode=loaded=false reason=skipped_cold_load
|
||||
- doc_triage: ok=true calls=1 files=1 avg_ms=... npu_delta_us=... proof=true gate=closed:private-root
|
||||
- rag_endpoint: ok=true mode=health_only gate=closed:vector-mutation
|
||||
- rag_health: ok=true mode=health_only
|
||||
- advisory_gateway: ok=true mode=health_only gate=closed:advisory-post
|
||||
fallbacks: skipped_cold_load=1
|
||||
```
|
||||
|
||||
Read order for ops:
|
||||
|
||||
1. `services_ok` row — anything below `9/9` means a service is down or unhealthy.
|
||||
2. `proof_ok` row — `proof_ok=5/5` means every probe that ran with a real
|
||||
inference request produced a positive sysfs NPU delta.
|
||||
3. `fallbacks:` line — `skipped_cold_load=1` is expected (GenAI worker is
|
||||
intentionally not cold-loaded). Any other fallback label is a triage signal.
|
||||
4. `gate=` labels — closed gates that remain closed by design.
|
||||
|
||||
### Step 3 — Optional artifact for trend tracking
|
||||
|
||||
```bash
|
||||
scripts/npu-utilization-digest.py --format jsonl
|
||||
```
|
||||
|
||||
Writes a single JSONL line per digest under
|
||||
`/home/will/.local/state/npu-utilization/digests/<timestamp>.jsonl`. The first
|
||||
line is the summary; subsequent lines are per-service rows. No JSONL write
|
||||
happens with `--no-write`.
|
||||
|
||||
### Step 4 — Offline unit tests
|
||||
|
||||
```bash
|
||||
python -m pytest tests/test_npu_utilization_digest.py -q
|
||||
```
|
||||
|
||||
Does not require live services. Use to validate digest logic after edits or
|
||||
before merging.
|
||||
|
||||
## Compact proof interpretation
|
||||
|
||||
For each proof-capable service, both the response-level `npu_busy_delta_us`
|
||||
(when the service reports it) and the script's own sysfs before/after delta
|
||||
must agree and be `> 0`. The proof is only valid when an actual inference
|
||||
request ran. If a probe was skipped (`reason=skipped_cold_load` or
|
||||
`reason=smoke_disabled`), `proof_ok` for that row is `None` and the row
|
||||
contributes a labeled fallback instead of a proof failure.
|
||||
|
||||
Proof currently runs on:
|
||||
|
||||
- `embeddings` (`:18817`)
|
||||
- `rerank` (`:18818`)
|
||||
- `whisper` (`:18816`) when `--include-whisper-smoke=true` (default)
|
||||
- `classifier` (`:18819`)
|
||||
- `doc_triage` (`:18829`) when `--include-doc-triage-smoke=true` (default);
|
||||
proof is via the embeddings service, not directly on the NPU device, so the
|
||||
row reports `mode=NPU-via-embedding-service`.
|
||||
|
||||
Intentionally health-only (no proof row):
|
||||
|
||||
- `rag_endpoint` (`:18810`) — closed:vector-mutation
|
||||
- `rag_health` (`:18814`)
|
||||
- `advisory_gateway` (`172.19.0.1:18830`) — closed:advisory-post
|
||||
|
||||
Intentionally skipped by default:
|
||||
|
||||
- `genai` (`:18820`) — `loaded=false` until first use; cold-loading just to
|
||||
prove the NPU is not free, so it is treated as a labeled fallback rather
|
||||
than a proof failure. Opt in with `--include-genai-smoke=true` only when the
|
||||
task actually needs a generation smoke.
|
||||
|
||||
## Exit codes and triage gates
|
||||
|
||||
`scripts/npu-service-health.sh`:
|
||||
|
||||
| Exit | Meaning | Next |
|
||||
| ---: | --- | --- |
|
||||
| 0 | All checks passed including embeddings proof. | Continue to digest. |
|
||||
| 2 | `npu_busy_time_us` not readable. | Check kernel/driver; do not run digest. |
|
||||
| 3 | Embedding request failed. | Triage `openvino-embeddings.service` and port `:18817`. |
|
||||
| 4 | Embedding request succeeded but sysfs delta `<= 0`. | Service reachable but not on the NPU; check service logs and device bind. |
|
||||
|
||||
`scripts/npu-utilization-digest.py`:
|
||||
|
||||
| Exit | Meaning | Next |
|
||||
| ---: | --- | --- |
|
||||
| 0 | All reachable services handled; proof/fallback accounting completed. | Inspect `proof_ok` and `fallbacks:` for any unexpected labels. |
|
||||
| 2 | `--strict-proof` was set and at least one proof-required probe ran without a positive sysfs delta. | Triage the named service's NPU path. |
|
||||
|
||||
## Approval gates left closed
|
||||
|
||||
The integrated workflow intentionally does not:
|
||||
|
||||
- start, stop, restart, enable, or disable any user systemd unit or Docker
|
||||
Compose service;
|
||||
- write to or mutate the Chroma collection `obsidian_bge_npu` or any other
|
||||
vector store;
|
||||
- change Atlas/Hermes routing or model defaults;
|
||||
- post classification/generation/triage events to the advisory gateway;
|
||||
- broaden private document, image, or audio roots;
|
||||
- bind any new listener, including on `0.0.0.0`;
|
||||
- write memory, send messages, execute tools, or mutate Kanban state.
|
||||
|
||||
These remain approval-gated and are tracked on the `npu-maximization` board.
|
||||
|
||||
## Quick reference
|
||||
|
||||
```bash
|
||||
# Single-pass NPU health check (listener + systemd + embeddings proof).
|
||||
cd ~/lab/swarm && ./scripts/npu-service-health.sh
|
||||
|
||||
# Compact digest with per-service proof and fallback accounting.
|
||||
scripts/npu-utilization-digest.py --no-write --include-genai-smoke false --format text
|
||||
|
||||
# Same, with a JSONL artifact for trend tracking.
|
||||
scripts/npu-utilization-digest.py --format jsonl
|
||||
|
||||
# Strict mode for CI / pre-merge.
|
||||
scripts/npu-utilization-digest.py --no-write --strict-proof
|
||||
|
||||
# Offline digest logic tests.
|
||||
python -m pytest tests/test_npu_utilization_digest.py -q
|
||||
```
|
||||
@@ -0,0 +1,49 @@
|
||||
# NPU utilization digest
|
||||
|
||||
Compact on-demand observability for Will's local OpenVINO/NPU specialists.
|
||||
|
||||
Script:
|
||||
|
||||
```bash
|
||||
/home/will/lab/swarm/scripts/npu-utilization-digest.py --format text
|
||||
```
|
||||
|
||||
Safe defaults:
|
||||
|
||||
- read-only for services; no service starts/stops/restarts, routing changes, vector DB mutation, advisory POSTs, outbound sends, or memory writes;
|
||||
- writes only a compact JSONL artifact under `/home/will/.local/state/npu-utilization/digests` unless `--no-write` is passed;
|
||||
- uses synthetic/non-private requests for embeddings, rerank, classifier dry-run, and doc triage;
|
||||
- keeps GenAI generation disabled by default when the worker is not loaded, to avoid cold-load side effects;
|
||||
- advisory gateway remains health-only because POSTs write metadata/events;
|
||||
- NPU proof is only true when an inference probe ran and `/sys/class/accel/accel0/device/npu_busy_time_us` increased around that probe.
|
||||
|
||||
Common commands:
|
||||
|
||||
```bash
|
||||
# Compact CLI digest, plus JSONL artifact.
|
||||
scripts/npu-utilization-digest.py --format text
|
||||
|
||||
# No artifact write; useful during reviews.
|
||||
scripts/npu-utilization-digest.py --no-write --include-genai-smoke false
|
||||
|
||||
# Machine-readable stdout.
|
||||
scripts/npu-utilization-digest.py --format jsonl --no-write
|
||||
|
||||
# CI/unit tests; live services not required.
|
||||
python -m pytest tests/test_npu_utilization_digest.py -q
|
||||
```
|
||||
|
||||
Output shape is intentionally small: service booleans, counts, average probe ms, sysfs deltas, proof flags, fallback warning counts, artifact path, and closed gates. `fallbacks` includes unavailable services, failed/missing proof, and skipped proof-capable smokes such as disabled Whisper/doc-triage probes or GenAI cold-load skips; intentionally health-only RAG/advisory rows are not fallbacks unless unavailable. It does not print raw embeddings, transcripts, OCR text, model completions, request headers, or full upstream JSON.
|
||||
|
||||
Covered rows:
|
||||
|
||||
- `embeddings`: `/v1/embeddings` synthetic string, positive sysfs delta required.
|
||||
- `rerank`: `/rerank` with two synthetic docs, positive sysfs delta required.
|
||||
- `whisper`: health-only unless the bounded generated-WAV smoke is enabled.
|
||||
- `classifier`: `/v1/classify` with `dry_run=true` and `include_evidence=false`, positive sysfs delta required.
|
||||
- `genai`: health-only by default; skips when `loaded=false` unless explicitly opted in.
|
||||
- `doc_triage`: one approved synthetic sample under the service sample root, with `allowed_roots` narrowed to that sample directory; NPU proof is via embeddings.
|
||||
- `rag_endpoint` and `rag_health`: health-only; no vector mutation.
|
||||
- `advisory_gateway`: health-only; `closed:advisory-post` gate remains closed.
|
||||
|
||||
Closed gates left for later approval: sending/delivery, recurring timer, GenAI cold-load smoke, advisory POSTs, Atlas/Hermes routing changes, vector mutation/reindex, and broad private document/audio/image roots.
|
||||
@@ -0,0 +1,135 @@
|
||||
# NPU voice/audio local-file pipeline
|
||||
|
||||
This is the first-slice local-file voice/audio path for the NPU maximization program:
|
||||
|
||||
```text
|
||||
local audio file or already-staged attachment
|
||||
-> OpenVINO NPU Whisper (:18816)
|
||||
-> OpenVINO NPU classifier (:18819)
|
||||
-> explicit advisory gate
|
||||
-> Atlas/Hermes only after separate approval
|
||||
```
|
||||
|
||||
The implementation is `scripts/npu_voice_audio_pipeline.py`. It is a CLI wrapper only; it starts no listener and performs no outbound sends, Obsidian writes, memory writes, vector DB mutations, Kanban mutations, service restarts, platform API calls, or live Atlas/Hermes routing changes.
|
||||
|
||||
## Safety gates
|
||||
|
||||
Closed unless explicitly approved later:
|
||||
|
||||
- Telegram/Discord fetching by bot token or attachment URL.
|
||||
- Outbound messages or auto-sends.
|
||||
- Obsidian/vault writes.
|
||||
- Memory writes.
|
||||
- Vector DB mutation or reindex.
|
||||
- Automatic Kanban mutation.
|
||||
- Service restarts or new persistent listeners.
|
||||
- Private-directory root broadening.
|
||||
- Live Atlas/Hermes routing authority changes.
|
||||
|
||||
HTTP success is not NPU proof. For NPU claims, require real inference plus positive `/sys/class/accel/accel0/device/npu_busy_time_us` deltas. The CLI reports response deltas and observed sysfs deltas for Whisper and classifier calls.
|
||||
|
||||
## Example: synthetic local WAV smoke
|
||||
|
||||
```bash
|
||||
cd /home/will/lab/swarm
|
||||
python - <<'PY'
|
||||
import math, struct, wave
|
||||
path = '/tmp/npu-voice-smoke.wav'
|
||||
sr = 16000
|
||||
with wave.open(path, 'wb') as w:
|
||||
w.setnchannels(1)
|
||||
w.setsampwidth(2)
|
||||
w.setframerate(sr)
|
||||
frames = bytearray()
|
||||
for i in range(int(sr * 0.6)):
|
||||
frames.extend(struct.pack('<h', int(12000 * math.sin(2 * math.pi * 440 * i / sr))))
|
||||
w.writeframes(frames)
|
||||
print(path)
|
||||
PY
|
||||
```
|
||||
|
||||
Run the local-file wrapper:
|
||||
|
||||
```bash
|
||||
/home/will/.venvs/npu/bin/python scripts/npu_voice_audio_pipeline.py \
|
||||
--audio /tmp/npu-voice-smoke.wav \
|
||||
--title "synthetic smoke" \
|
||||
--source manual_smoke \
|
||||
--json
|
||||
```
|
||||
|
||||
Compact output shape:
|
||||
|
||||
```json
|
||||
{
|
||||
"ok": true,
|
||||
"source": "manual_smoke",
|
||||
"transcript_chars": 3,
|
||||
"action_worthy": false,
|
||||
"atlas_gate": "suppressed_not_action_worthy",
|
||||
"whisper_npu_delta_us": 85441,
|
||||
"whisper_sysfs_delta_us": 85441,
|
||||
"classifier_npu_delta_us": 85908,
|
||||
"classifier_sysfs_delta_us": 85908,
|
||||
"classifier_observed_sysfs_delta_us": 85908,
|
||||
"external_sends": 0,
|
||||
"writes": 0
|
||||
}
|
||||
```
|
||||
|
||||
A non-actionable smoke should stay `suppressed_not_action_worthy`. A transcript with a reminder, task, follow-up, explicit question, or classifier `tool_needed=true` should become `advisory_only_not_sent`, not sent.
|
||||
|
||||
## Example: already-staged platform voice file
|
||||
|
||||
This example assumes another approved process has already placed the audio file locally. The wrapper does not fetch from Telegram/Discord and does not read bot tokens.
|
||||
|
||||
```bash
|
||||
/home/will/.venvs/npu/bin/python scripts/npu_voice_audio_pipeline.py \
|
||||
--audio /tmp/staged-voice-message.ogg \
|
||||
--source staged_telegram \
|
||||
--title "staged local Telegram voice memo" \
|
||||
--json
|
||||
```
|
||||
|
||||
## Compact fields
|
||||
|
||||
The CLI always reports:
|
||||
|
||||
- `ok`
|
||||
- `id`
|
||||
- `source`
|
||||
- `transcript_chars`
|
||||
- `action_worthy`
|
||||
- `atlas_gate`
|
||||
- `next_gate`
|
||||
- `whisper_npu_delta_us`
|
||||
- `whisper_sysfs_delta_us`
|
||||
- `classifier_npu_delta_us`
|
||||
- `classifier_sysfs_delta_us`
|
||||
- `classifier_observed_sysfs_delta_us`
|
||||
- `labels.workflow_category`
|
||||
- `labels.tool_needed`
|
||||
- `labels.urgency`
|
||||
- `labels.safety_confirmation_required`
|
||||
- `external_sends`
|
||||
- `writes`
|
||||
|
||||
Transcript text is omitted by default. Use `--include-transcript` or `--include-transcript-preview-chars N` only for explicit local debugging.
|
||||
|
||||
## Input limits
|
||||
|
||||
- `--audio` must be an absolute local path.
|
||||
- Symlinks, directories, missing files, empty files, unsupported extensions, and files over `--max-bytes` are refused.
|
||||
- WAV duration is capped by `--max-audio-seconds`; other codecs remain size-capped in this first slice.
|
||||
- Classifier transcript payload is bounded by `--max-transcript-chars`.
|
||||
|
||||
## Health prerequisites
|
||||
|
||||
Read-only checks:
|
||||
|
||||
```bash
|
||||
curl -fsS http://127.0.0.1:18816/health
|
||||
curl -fsS http://127.0.0.1:18819/healthz
|
||||
```
|
||||
|
||||
Do not restart services from this runbook. If either endpoint is unhealthy, stop and request an ops/remediation task.
|
||||
@@ -33,6 +33,18 @@ POST /v1/advisory/generate
|
||||
POST /v1/advisory/triage
|
||||
```
|
||||
|
||||
## Cron and n8n advisory dry-run contract
|
||||
|
||||
For cron/n8n event classification, use the dry-run contract in `docs/cron-n8n-advisory-classifier.md`.
|
||||
It defines the normalized event envelope, decision envelope, `suppress|log|summarize|escalate` recommendation mapping, and duplicate/stale/no-op/action-required examples.
|
||||
|
||||
Example artifacts:
|
||||
|
||||
- `examples/cron-advisory-dry-run.sh` — host-local cron wrapper that prints one compact decision line and performs no side effects.
|
||||
- `examples/n8n-advisory-dry-run-fragment.json` — sanitized inactive n8n node fragment for Set -> HTTP Request -> Code decision mapping.
|
||||
|
||||
Both examples preserve the gateway authority boundary: advisory only, no send/restart/memory/tool/routing authority.
|
||||
|
||||
### Classifier shadow call
|
||||
|
||||
```bash
|
||||
|
||||
@@ -0,0 +1,256 @@
|
||||
# Cron and n8n advisory classifier contract
|
||||
|
||||
Status: dry-run specification and integration examples
|
||||
Scope: cron and n8n alert/event classification through the OpenVINO advisory gateway
|
||||
Gateway: `http://172.19.0.1:18830` from `n8n-agent` and host-local cron on the current bridge-bound service. Override `NPU_ADVISORY_GATEWAY_URL=http://127.0.0.1:18830` only if a localhost-bound instance is explicitly running.
|
||||
|
||||
## Authority boundary
|
||||
|
||||
This contract is advisory only. It may recommend one of `suppress`, `log`, `summarize`, or `escalate`, but it must not perform the action itself.
|
||||
|
||||
Every integration must preserve these authority flags:
|
||||
|
||||
```json
|
||||
{
|
||||
"may_route": false,
|
||||
"may_write_memory": false,
|
||||
"may_send_external": false,
|
||||
"may_process_private_dirs": false,
|
||||
"may_execute_tools": false,
|
||||
"may_restart_services": false
|
||||
}
|
||||
```
|
||||
|
||||
Allowed side effects in dry-run mode:
|
||||
|
||||
- read an explicit cron/n8n event payload;
|
||||
- call the advisory gateway classifier/generator;
|
||||
- write compact local stdout or n8n execution logs;
|
||||
- store metadata-only advisory counters if an existing log sink already does so.
|
||||
|
||||
Forbidden without separate explicit approval:
|
||||
|
||||
- outbound sends/pages/Discord/Telegram/email;
|
||||
- service restarts, command execution, or tool calls;
|
||||
- Hermes/Atlas routing changes;
|
||||
- memory writes;
|
||||
- broad private-directory processing;
|
||||
- vector database mutation or reindexing.
|
||||
|
||||
## Input event envelope
|
||||
|
||||
Cron and n8n producers should normalize events before classification. Keep this input small and avoid raw private payloads.
|
||||
|
||||
```json
|
||||
{
|
||||
"schema": "cron_n8n_event_v1",
|
||||
"trace_id": "cron:service-health:2026-06-05T14:30:00Z",
|
||||
"source": "cron",
|
||||
"workflow": "npu-service-health",
|
||||
"event_kind": "health_check",
|
||||
"severity": "warning",
|
||||
"subject": "openvino-reranker health check repeated warning",
|
||||
"summary": "Two consecutive health probes reported timeout, no restart attempted.",
|
||||
"dedupe_key": "service:openvino-reranker:timeout",
|
||||
"observed_at": "2026-06-05T14:30:00Z",
|
||||
"stale_after_s": 900,
|
||||
"action_requested": false,
|
||||
"dry_run": true
|
||||
}
|
||||
```
|
||||
|
||||
Field rules:
|
||||
|
||||
- `source`: `cron` or `n8n`.
|
||||
- `workflow`: compact job/workflow name, not a private URL.
|
||||
- `subject` + `summary`: the only text sent to the classifier.
|
||||
- `dedupe_key`: stable non-secret key for duplicate detection by the caller.
|
||||
- `stale_after_s`: caller-side freshness gate; stale events should not page.
|
||||
- `action_requested`: true only when an upstream job is asking a human/Atlas to consider action.
|
||||
- `dry_run`: must remain true for this phase.
|
||||
|
||||
## Gateway classifier call
|
||||
|
||||
The current gateway `/v1/advisory/classify` accepts explicit text and wraps the classifier response in `openvino_advisory_v1` with NPU proof and authority fields.
|
||||
|
||||
Host cron example for the current bridge-bound service:
|
||||
|
||||
```bash
|
||||
curl -fsS http://172.19.0.1:18830/v1/advisory/classify \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{
|
||||
"trace_id":"cron:service-health:sample",
|
||||
"text":"source=cron workflow=npu-service-health severity=warning kind=health_check subject=openvino-reranker repeated timeout summary=Two consecutive health probes reported timeout; no restart attempted; dry_run=true"
|
||||
}' | jq '{schema, mode, trace_id, npu_ok: .npu_proof.ok, npu_delta: .npu_proof.npu_busy_delta_us, authority, labels: .result.labels}'
|
||||
```
|
||||
|
||||
n8n Docker-bridge example:
|
||||
|
||||
```bash
|
||||
curl -fsS http://172.19.0.1:18830/v1/advisory/classify \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"trace_id":"n8n:swarm-health:sample","text":"source=n8n workflow=swarm-health-watchdog severity=critical kind=health_check subject=multiple services unhealthy summary=Health probe failed for three services; dry_run=true"}' \
|
||||
| jq '{mode, npu_ok: .npu_proof.ok, npu_delta: .npu_proof.npu_busy_delta_us, may_send_external: .authority.may_send_external}'
|
||||
```
|
||||
|
||||
NPU proof gate: an HTTP 200 is not enough. Treat the classifier as NPU-backed only when `.npu_proof.ok == true` and `.npu_proof.npu_busy_delta_us > 0` for real inference.
|
||||
|
||||
## Advisory decision envelope
|
||||
|
||||
Cron/n8n wrappers should map the gateway response plus caller-side freshness/deduplication state into this compact decision envelope:
|
||||
|
||||
```json
|
||||
{
|
||||
"schema": "cron_n8n_advisory_decision_v1",
|
||||
"trace_id": "cron:service-health:2026-06-05T14:30:00Z",
|
||||
"source": "cron",
|
||||
"workflow": "npu-service-health",
|
||||
"dry_run": true,
|
||||
"recommendation": "summarize",
|
||||
"classification": "action_required",
|
||||
"confidence": 0.84,
|
||||
"reason_codes": ["warning_or_high_urgency", "fresh_event", "not_duplicate"],
|
||||
"npu_proof": {"required": true, "ok": true, "npu_busy_delta_us": 1234},
|
||||
"authority": {
|
||||
"may_route": false,
|
||||
"may_write_memory": false,
|
||||
"may_send_external": false,
|
||||
"may_process_private_dirs": false,
|
||||
"may_execute_tools": false,
|
||||
"may_restart_services": false
|
||||
},
|
||||
"next_gate": "human_or_atlas_review_required_before_any_side_effect"
|
||||
}
|
||||
```
|
||||
|
||||
Decision fields:
|
||||
|
||||
- `recommendation`: `suppress`, `log`, `summarize`, or `escalate`.
|
||||
- `classification`: `duplicate`, `stale`, `no_op`, or `action_required` for v1 examples.
|
||||
- `confidence`: use classifier urgency/category confidence when available; otherwise use a conservative wrapper score.
|
||||
- `reason_codes`: compact machine-readable rationale, not raw payload text.
|
||||
- `next_gate`: always a review/approval gate before side effects.
|
||||
|
||||
## Recommendation mapping
|
||||
|
||||
This is the v1 dry-run mapping. It is intentionally conservative and caller-side; the NPU classifier advises, the wrapper chooses a recommendation, and humans/Atlas retain authority.
|
||||
|
||||
| Caller/classifier signal | Classification | Recommendation | Dry-run behavior |
|
||||
|---|---|---|---|
|
||||
| Same `dedupe_key` observed inside caller cooldown | `duplicate` | `suppress` | Log compact duplicate count only. Do not send. |
|
||||
| `observed_at + stale_after_s` is older than now | `stale` | `log` | Log stale event and age. Do not summarize/page. |
|
||||
| Severity low/normal, no action requested, classifier urgency low/normal | `no_op` | `log` | Keep normal execution log only. |
|
||||
| Warning/high urgency or action requested, NPU proof ok | `action_required` | `summarize` | Draft a local summary for review; no send/restart. |
|
||||
| Critical severity or repeated failures and NPU proof ok | `action_required` | `escalate` | Recommend escalation to Atlas/human; wrapper still must not send/restart. |
|
||||
| NPU proof missing or false | `action_required` or caller-specific | `log` | Log `npu_proof_failed`; do not claim NPU-backed advice. |
|
||||
|
||||
## Required examples
|
||||
|
||||
### Duplicate -> suppress
|
||||
|
||||
Input summary:
|
||||
|
||||
```json
|
||||
{"source":"cron","workflow":"npu-service-health","severity":"warning","dedupe_key":"service:reranker:timeout","summary":"Same timeout as prior run inside cooldown.","dry_run":true}
|
||||
```
|
||||
|
||||
Decision:
|
||||
|
||||
```json
|
||||
{"classification":"duplicate","recommendation":"suppress","reason_codes":["dedupe_key_in_cooldown"],"next_gate":"none_in_dry_run"}
|
||||
```
|
||||
|
||||
### Stale -> log
|
||||
|
||||
Input summary:
|
||||
|
||||
```json
|
||||
{"source":"n8n","workflow":"swarm-health-watchdog","severity":"warning","observed_at":"older_than_stale_after","stale_after_s":900,"summary":"Delayed webhook replay for an old probe.","dry_run":true}
|
||||
```
|
||||
|
||||
Decision:
|
||||
|
||||
```json
|
||||
{"classification":"stale","recommendation":"log","reason_codes":["event_stale"],"next_gate":"none_in_dry_run"}
|
||||
```
|
||||
|
||||
### No-op -> log
|
||||
|
||||
Input summary:
|
||||
|
||||
```json
|
||||
{"source":"cron","workflow":"backup-check","severity":"normal","action_requested":false,"summary":"Backup completed and all expected files are present.","dry_run":true}
|
||||
```
|
||||
|
||||
Decision:
|
||||
|
||||
```json
|
||||
{"classification":"no_op","recommendation":"log","reason_codes":["normal_severity","no_action_requested"],"next_gate":"none_in_dry_run"}
|
||||
```
|
||||
|
||||
### Action required -> summarize/escalate
|
||||
|
||||
Input summary:
|
||||
|
||||
```json
|
||||
{"source":"n8n","workflow":"swarm-health-watchdog","severity":"critical","action_requested":true,"summary":"RAG and embeddings health failed repeatedly; no restart attempted.","dry_run":true}
|
||||
```
|
||||
|
||||
Decision:
|
||||
|
||||
```json
|
||||
{"classification":"action_required","recommendation":"escalate","reason_codes":["critical_severity","action_requested","fresh_event"],"next_gate":"human_or_atlas_review_required_before_any_side_effect"}
|
||||
```
|
||||
|
||||
## Optional local summary draft
|
||||
|
||||
If the decision is `summarize` or `escalate`, a wrapper may request a bounded draft from `/v1/advisory/generate`:
|
||||
|
||||
```bash
|
||||
curl -fsS http://172.19.0.1:18830/v1/advisory/generate \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"trace_id":"cron:service-health:sample","job":"summary","input":"Health check warning: openvino-reranker timed out twice; no restart attempted.","max_new_tokens":48}' \
|
||||
| jq '{mode, trace_id, npu_ok: .npu_proof.ok, authority, draft: .result.draft_text, final_authority: .result.final_authority}'
|
||||
```
|
||||
|
||||
The draft remains non-authoritative. It must not be automatically sent externally or written to memory.
|
||||
|
||||
## n8n integration pattern
|
||||
|
||||
Recommended node chain for dry-run workflows:
|
||||
|
||||
```text
|
||||
Schedule/Webhook/Failure Trigger
|
||||
-> Set normalized event envelope
|
||||
-> HTTP Request POST /v1/advisory/classify
|
||||
-> Code node maps decision envelope
|
||||
-> IF node on recommendation
|
||||
suppress/log: execution log only
|
||||
summarize/escalate: optional local summary draft, then execution log only
|
||||
```
|
||||
|
||||
The IF node must not connect to outbound messaging, service restart, memory write, or Hermes routing nodes until a separate approval changes the authority boundary.
|
||||
|
||||
See `../examples/n8n-advisory-dry-run-fragment.json` for a sanitized node fragment.
|
||||
|
||||
## Cron integration pattern
|
||||
|
||||
Cron jobs should call a wrapper script that prints one compact line and exits successfully unless the wrapper itself fails. The wrapper should not page or restart.
|
||||
|
||||
Example crontab shape:
|
||||
|
||||
```text
|
||||
*/15 * * * * /home/will/lab/swarm/openvino-advisory-gateway/examples/cron-advisory-dry-run.sh npu-service-health warning health_check "openvino-reranker timeout twice" "service:openvino-reranker:timeout" >> /home/will/.local/state/npu-advisory/cron.log 2>&1
|
||||
```
|
||||
|
||||
See `../examples/cron-advisory-dry-run.sh`.
|
||||
|
||||
## Verification checklist
|
||||
|
||||
- Gateway health is reachable on the intended interface.
|
||||
- Classifier response includes `schema=openvino_advisory_v1`.
|
||||
- `.authority.*` flags are all false for side-effect authority.
|
||||
- `.npu_proof.ok` is true and `npu_busy_delta_us > 0` before claiming NPU-backed advice.
|
||||
- Decision envelope is compact and contains only booleans/counts/paths/deltas/gates.
|
||||
- Duplicate/stale/no-op/action-required examples remain dry-run only.
|
||||
- No n8n workflow activation, outbound send, service restart, memory write, routing change, private-dir broadening, or vector DB mutation occurred.
|
||||
@@ -0,0 +1,48 @@
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
# Dry-run cron/n8n advisory wrapper.
|
||||
# It calls the advisory classifier and prints one compact decision line.
|
||||
# It does not send, restart, route, execute tools, or write memory.
|
||||
|
||||
GATEWAY_URL="${NPU_ADVISORY_GATEWAY_URL:-http://172.19.0.1:18830}"
|
||||
WORKFLOW="${1:-cron-advisory-sample}"
|
||||
SEVERITY="${2:-normal}"
|
||||
EVENT_KIND="${3:-health_check}"
|
||||
SUBJECT="${4:-sample advisory event}"
|
||||
DEDUPE_KEY="${5:-sample}"
|
||||
TRACE_ID="${NPU_ADVISORY_TRACE_ID:-cron:${WORKFLOW}:$(date -u +%Y%m%dT%H%M%SZ)}"
|
||||
|
||||
TEXT="source=cron workflow=${WORKFLOW} severity=${SEVERITY} kind=${EVENT_KIND} subject=${SUBJECT} dedupe_key=${DEDUPE_KEY} dry_run=true authority=no-send,no-restart,no-memory"
|
||||
|
||||
payload=$(jq -nc --arg trace_id "$TRACE_ID" --arg text "$TEXT" '{trace_id:$trace_id,text:$text}')
|
||||
response=$(curl -fsS "${GATEWAY_URL%/}/v1/advisory/classify" -H 'Content-Type: application/json' -d "$payload")
|
||||
|
||||
printf '%s\n' "$response" | jq -c --arg source cron --arg workflow "$WORKFLOW" --arg severity "$SEVERITY" --arg dedupe_key "$DEDUPE_KEY" '
|
||||
. as $env
|
||||
| ($env.result.labels.urgency.value // "normal") as $urgency
|
||||
| ($env.result.labels.urgency.confidence // 0) as $confidence
|
||||
| ($env.npu_proof.ok == true and (($env.npu_proof.npu_busy_delta_us // 0) > 0)) as $npu_ok
|
||||
| (if ($npu_ok | not) then "log"
|
||||
elif ($severity == "critical") then "escalate"
|
||||
elif ($severity == "warning" or $urgency == "high" or $urgency == "critical") then "summarize"
|
||||
else "log" end) as $recommendation
|
||||
| (if ($recommendation == "log" and $severity == "normal") then "no_op" else "action_required" end) as $classification
|
||||
| {
|
||||
schema: "cron_n8n_advisory_decision_v1",
|
||||
trace_id: $env.trace_id,
|
||||
source: $source,
|
||||
workflow: $workflow,
|
||||
dry_run: true,
|
||||
recommendation: $recommendation,
|
||||
classification: $classification,
|
||||
confidence: $confidence,
|
||||
reason_codes: ([
|
||||
(if $npu_ok then "npu_proof_ok" else "npu_proof_failed" end),
|
||||
("severity_" + $severity),
|
||||
("urgency_" + $urgency)
|
||||
]),
|
||||
npu_proof: $env.npu_proof,
|
||||
authority: $env.authority,
|
||||
next_gate: (if $recommendation == "escalate" or $recommendation == "summarize" then "human_or_atlas_review_required_before_any_side_effect" else "none_in_dry_run" end)
|
||||
}'
|
||||
@@ -0,0 +1,70 @@
|
||||
{
|
||||
"name": "OpenVINO Advisory Dry-Run Fragment",
|
||||
"active": false,
|
||||
"nodes": [
|
||||
{
|
||||
"parameters": {
|
||||
"values": {
|
||||
"string": [
|
||||
{"name": "schema", "value": "cron_n8n_event_v1"},
|
||||
{"name": "source", "value": "n8n"},
|
||||
{"name": "workflow", "value": "swarm-health-watchdog"},
|
||||
{"name": "event_kind", "value": "health_check"},
|
||||
{"name": "severity", "value": "warning"},
|
||||
{"name": "subject", "value": "OpenVINO service health warning"},
|
||||
{"name": "summary", "value": "Health probe reported a warning; no restart or send is authorized."},
|
||||
{"name": "dedupe_key", "value": "service:openvino:warning"},
|
||||
{"name": "dry_run", "value": "true"}
|
||||
]
|
||||
},
|
||||
"options": {}
|
||||
},
|
||||
"id": "set-normalized-event",
|
||||
"name": "Set normalized advisory event",
|
||||
"type": "n8n-nodes-base.set",
|
||||
"typeVersion": 2,
|
||||
"position": [260, 300]
|
||||
},
|
||||
{
|
||||
"parameters": {
|
||||
"method": "POST",
|
||||
"url": "http://172.19.0.1:18830/v1/advisory/classify",
|
||||
"sendBody": true,
|
||||
"contentType": "json",
|
||||
"jsonBody": "={{ JSON.stringify({ trace_id: 'n8n:' + $json.workflow + ':' + $now.toISO(), text: 'source=n8n workflow=' + $json.workflow + ' severity=' + $json.severity + ' kind=' + $json.event_kind + ' subject=' + $json.subject + ' summary=' + $json.summary + ' dedupe_key=' + $json.dedupe_key + ' dry_run=true authority=no-send,no-restart,no-memory' }) }}",
|
||||
"options": {
|
||||
"timeout": 20000
|
||||
}
|
||||
},
|
||||
"id": "http-advisory-classify",
|
||||
"name": "HTTP advisory classify dry-run",
|
||||
"type": "n8n-nodes-base.httpRequest",
|
||||
"typeVersion": 4,
|
||||
"position": [520, 300]
|
||||
},
|
||||
{
|
||||
"parameters": {
|
||||
"jsCode": "const env = $json;\nconst labels = env.result?.labels || {};\nconst urgency = labels.urgency?.value || 'normal';\nconst severity = $('Set normalized advisory event').first().json.severity || 'normal';\nconst npuOk = env.npu_proof?.ok === true && (env.npu_proof?.npu_busy_delta_us || 0) > 0;\nlet recommendation = 'log';\nlet classification = 'no_op';\nconst reason_codes = [npuOk ? 'npu_proof_ok' : 'npu_proof_failed', `severity_${severity}`, `urgency_${urgency}`];\nif (npuOk && severity === 'critical') { recommendation = 'escalate'; classification = 'action_required'; }\nelse if (npuOk && (severity === 'warning' || urgency === 'high' || urgency === 'critical')) { recommendation = 'summarize'; classification = 'action_required'; }\nif (!npuOk) reason_codes.push('log_only_no_npu_claim');\nreturn [{ json: { schema: 'cron_n8n_advisory_decision_v1', trace_id: env.trace_id, source: 'n8n', workflow: $('Set normalized advisory event').first().json.workflow, dry_run: true, recommendation, classification, confidence: labels.urgency?.confidence || 0, reason_codes, npu_proof: env.npu_proof, authority: env.authority, next_gate: (recommendation === 'summarize' || recommendation === 'escalate') ? 'human_or_atlas_review_required_before_any_side_effect' : 'none_in_dry_run' } } }];"
|
||||
},
|
||||
"id": "map-dry-run-decision",
|
||||
"name": "Map dry-run decision (no side effects)",
|
||||
"type": "n8n-nodes-base.code",
|
||||
"typeVersion": 2,
|
||||
"position": [780, 300]
|
||||
}
|
||||
],
|
||||
"connections": {
|
||||
"Set normalized advisory event": {
|
||||
"main": [[{"node": "HTTP advisory classify dry-run", "type": "main", "index": 0}]]
|
||||
},
|
||||
"HTTP advisory classify dry-run": {
|
||||
"main": [[{"node": "Map dry-run decision (no side effects)", "type": "main", "index": 0}]]
|
||||
}
|
||||
},
|
||||
"settings": {
|
||||
"executionOrder": "v1"
|
||||
},
|
||||
"pinData": {},
|
||||
"staticData": null,
|
||||
"tags": ["dry-run", "openvino", "advisory"]
|
||||
}
|
||||
@@ -0,0 +1,89 @@
|
||||
# OpenVINO Context Gate
|
||||
|
||||
Local-only Atlas/Hermes context-gate advisory prototype.
|
||||
|
||||
This first slice is CLI-only and dry-run by design. It takes a non-private query,
|
||||
optionally asks the localhost classifier on `127.0.0.1:18819` for advisory labels,
|
||||
and emits a compact typed context bundle plan. It does not retrieve private
|
||||
content or change live Atlas/Hermes behavior.
|
||||
|
||||
## Safety invariants
|
||||
|
||||
Closed in v1:
|
||||
|
||||
- live Atlas/Hermes routing changes
|
||||
- memory writes
|
||||
- outbound sends
|
||||
- tool execution by the sidecar
|
||||
- service restarts
|
||||
- vector DB mutation or reindexing
|
||||
- private root broadening
|
||||
- live config changes
|
||||
|
||||
The CLI only plans which source classes an authoritative Atlas/Hermes agent might
|
||||
use later: `durable_memory`, `session_search`, `rag_search`, `repo_files`,
|
||||
`live_system`, `web`, or `no_retrieval`.
|
||||
|
||||
NPU proof is strict: `npu_verified=true` is only emitted when a live classifier
|
||||
request reports a positive endpoint NPU delta and a positive sysfs/endpoint sysfs
|
||||
busy delta. HTTP 200 alone is never treated as proof. Offline and fallback modes
|
||||
set `npu_verified=false` and include a warning.
|
||||
|
||||
## Usage
|
||||
|
||||
Live classifier path, with compact terminal output:
|
||||
|
||||
```bash
|
||||
python scripts/context-gate-advisory.py \
|
||||
--query "How do I check whether the RAG reranker is using the NPU?" \
|
||||
--format compact
|
||||
```
|
||||
|
||||
Deterministic offline smoke, safe for unit-test hosts without NPU services:
|
||||
|
||||
```bash
|
||||
python scripts/context-gate-advisory.py \
|
||||
--offline \
|
||||
--query "Write a haiku about Seattle rain." \
|
||||
--format compact-json
|
||||
```
|
||||
|
||||
Fallback plan if the classifier is down:
|
||||
|
||||
```bash
|
||||
python scripts/context-gate-advisory.py \
|
||||
--allow-offline-fallback \
|
||||
--query "Where did we leave the NPU context gate implementation plan?" \
|
||||
--context platform=kanban \
|
||||
--context repo_path=/home/will/lab/swarm \
|
||||
--format compact-json
|
||||
```
|
||||
|
||||
## Output shape
|
||||
|
||||
Full JSON includes:
|
||||
|
||||
- `schema=atlas_context_gate_plan_v1`
|
||||
- `dry_run=true`
|
||||
- `query_class`
|
||||
- `source_plan`
|
||||
- `bundle_plan`
|
||||
- `npu_proof`
|
||||
- closed `authority`
|
||||
- closed approval `gates`
|
||||
- compact `warnings`
|
||||
|
||||
Compact output intentionally avoids raw private snippets and raw JSON dumps:
|
||||
|
||||
```text
|
||||
ok=true schema=atlas_context_gate_plan_v1 bundle=OpsDebugBundle sources=live_system,repo_files,rag_search source_count=3 npu_verified=false classifier_delta_us=None outer_sysfs_delta_us=None gates=closed:route,memory,send,tools,restart,vector,private_roots,config warnings=offline_heuristic_classifier_no_npu_claim,npu_proof_inconclusive
|
||||
```
|
||||
|
||||
## Notes for reviewers
|
||||
|
||||
- No HTTP service or systemd unit is added in this slice.
|
||||
- The prototype does not call RAG, memory, session search, web, filesystem tools,
|
||||
or the advisory gateway. It only emits a plan.
|
||||
- Unit tests use fake/offline classifier results and do not require live NPU.
|
||||
- Optional live smoke may call only the local classifier endpoint and read
|
||||
`/sys/class/accel/accel0/device/npu_busy_time_us` for positive delta proof.
|
||||
@@ -0,0 +1,5 @@
|
||||
"""Atlas/Hermes local advisory context-gate prototype."""
|
||||
|
||||
from .context_gate import SCHEMA, ContextGateError, build_plan, compact_json, compact_line, validate_plan
|
||||
|
||||
__all__ = ["SCHEMA", "ContextGateError", "build_plan", "compact_json", "compact_line", "validate_plan"]
|
||||
@@ -0,0 +1,90 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import sys
|
||||
from typing import Any
|
||||
|
||||
from .context_gate import (
|
||||
DEFAULT_CLASSIFIER_URL,
|
||||
ContextGateError,
|
||||
build_plan,
|
||||
classify_live,
|
||||
classify_offline,
|
||||
compact_json,
|
||||
compact_line,
|
||||
)
|
||||
|
||||
|
||||
def _parse_context(raw_items: list[str]) -> dict[str, Any]:
|
||||
context: dict[str, Any] = {}
|
||||
for item in raw_items:
|
||||
if "=" not in item:
|
||||
raise ContextGateError(f"invalid_context_item:{item}")
|
||||
key, value = item.split("=", 1)
|
||||
if not key:
|
||||
raise ContextGateError("invalid_context_key")
|
||||
if value.lower() == "true":
|
||||
parsed: Any = True
|
||||
elif value.lower() == "false":
|
||||
parsed = False
|
||||
else:
|
||||
parsed = value
|
||||
context[key] = parsed
|
||||
return context
|
||||
|
||||
|
||||
def build_arg_parser() -> argparse.ArgumentParser:
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Emit a local-only Atlas/Hermes advisory context bundle plan. No routing, retrieval, memory writes, sends, restarts, or vector mutations are performed.",
|
||||
)
|
||||
parser.add_argument("--query", required=True, help="Non-private query to plan for")
|
||||
parser.add_argument("--format", choices=["compact", "compact-json", "json"], default="compact")
|
||||
parser.add_argument("--context", action="append", default=[], metavar="KEY=VALUE", help="Optional compact request context, e.g. platform=kanban repo_path=/path")
|
||||
parser.add_argument("--max-sources", type=int, default=4)
|
||||
parser.add_argument("--trace-id")
|
||||
parser.add_argument("--classifier-url", default=DEFAULT_CLASSIFIER_URL)
|
||||
parser.add_argument("--classifier-timeout", type=float, default=8.0)
|
||||
parser.add_argument("--offline", action="store_true", help="Use deterministic heuristic labels; makes no NPU claim")
|
||||
parser.add_argument("--allow-offline-fallback", action="store_true", help="If live classifier is unavailable, emit an advisory fallback plan with npu_verified=false")
|
||||
parser.add_argument("--no-require-npu-proof", action="store_true", help="Do not add npu_proof_inconclusive warning when running offline/fallback")
|
||||
return parser
|
||||
|
||||
|
||||
def main(argv: list[str] | None = None) -> int:
|
||||
parser = build_arg_parser()
|
||||
args = parser.parse_args(argv)
|
||||
try:
|
||||
context = _parse_context(args.context)
|
||||
options = {
|
||||
"dry_run": True,
|
||||
"max_sources": args.max_sources,
|
||||
"include_private_text": False,
|
||||
"require_npu_proof": not args.no_require_npu_proof,
|
||||
"trace_id": args.trace_id,
|
||||
}
|
||||
if args.offline:
|
||||
classifier = classify_offline(args.query, context)
|
||||
else:
|
||||
try:
|
||||
classifier = classify_live(args.query, context, classifier_url=args.classifier_url, timeout=args.classifier_timeout)
|
||||
except ContextGateError as exc:
|
||||
if not args.allow_offline_fallback:
|
||||
raise
|
||||
classifier = classify_offline(args.query, context, warning=str(exc))
|
||||
plan = build_plan(args.query, context=context, options=options, classifier=classifier)
|
||||
except ContextGateError as exc:
|
||||
print(f"error={exc}", file=sys.stderr)
|
||||
return 2
|
||||
|
||||
if args.format == "json":
|
||||
print(json.dumps(plan, indent=2, sort_keys=True))
|
||||
elif args.format == "compact-json":
|
||||
print(compact_json(plan))
|
||||
else:
|
||||
print(compact_line(plan))
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__": # pragma: no cover
|
||||
raise SystemExit(main())
|
||||
@@ -0,0 +1,482 @@
|
||||
"""Local-only advisory context bundle planner for Atlas/Hermes.
|
||||
|
||||
This module intentionally emits a retrieval/authority plan only. It does not call
|
||||
Hermes memory/session/RAG/web tools, mutate vector stores, broaden private roots,
|
||||
or change live routing.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import ipaddress
|
||||
import re
|
||||
import time
|
||||
import urllib.error
|
||||
import urllib.parse
|
||||
import urllib.request
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import Any, Mapping, Sequence
|
||||
|
||||
SCHEMA = "atlas_context_gate_plan_v1"
|
||||
NPU_BUSY_PATH = Path("/sys/class/accel/accel0/device/npu_busy_time_us")
|
||||
DEFAULT_CLASSIFIER_URL = "http://127.0.0.1:18819/v1/classify"
|
||||
|
||||
AUTHORITY = {
|
||||
"may_route": False,
|
||||
"may_write_memory": False,
|
||||
"may_send_external": False,
|
||||
"may_process_private_dirs": False,
|
||||
"may_execute_tools": False,
|
||||
"may_restart_services": False,
|
||||
"may_mutate_vector_db": False,
|
||||
"may_change_live_config": False,
|
||||
}
|
||||
|
||||
GATES = {
|
||||
"live_routing_change": "closed_requires_explicit_approval",
|
||||
"memory_write": "closed_requires_explicit_approval",
|
||||
"outbound_send": "closed_requires_explicit_approval",
|
||||
"tool_execution": "closed_requires_explicit_approval",
|
||||
"service_restart": "closed_requires_explicit_approval",
|
||||
"vector_mutation": "closed_requires_explicit_approval",
|
||||
"private_root_broadening": "closed_requires_explicit_approval",
|
||||
}
|
||||
|
||||
_ALLOWED_SOURCES = {
|
||||
"durable_memory",
|
||||
"session_search",
|
||||
"rag_search",
|
||||
"repo_files",
|
||||
"live_system",
|
||||
"web",
|
||||
"no_retrieval",
|
||||
}
|
||||
|
||||
|
||||
class ContextGateError(ValueError):
|
||||
"""Raised for invalid requests or unavailable required local stages."""
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class ClassifierResult:
|
||||
labels: Mapping[str, Any]
|
||||
npu_busy_delta_us: int | None
|
||||
sysfs_npu_busy_delta_us: int | None
|
||||
outer_sysfs_delta_us: int | None
|
||||
live: bool
|
||||
warning: str | None = None
|
||||
|
||||
|
||||
def read_npu_busy_time_us(path: Path = NPU_BUSY_PATH) -> int | None:
|
||||
try:
|
||||
return int(path.read_text(encoding="utf-8").strip())
|
||||
except (FileNotFoundError, PermissionError, ValueError, OSError):
|
||||
return None
|
||||
|
||||
|
||||
def _label_value(labels: Mapping[str, Any], name: str, default: Any) -> Any:
|
||||
value = labels.get(name, default)
|
||||
if isinstance(value, Mapping) and "value" in value:
|
||||
return value.get("value", default)
|
||||
return value
|
||||
|
||||
|
||||
def _label_confidence(labels: Mapping[str, Any], name: str, default: float = 0.5) -> float:
|
||||
value = labels.get(name)
|
||||
if isinstance(value, Mapping):
|
||||
try:
|
||||
return float(value.get("confidence", default))
|
||||
except (TypeError, ValueError):
|
||||
return default
|
||||
return default
|
||||
|
||||
|
||||
def heuristic_labels(query: str, context: Mapping[str, Any] | None = None) -> dict[str, Any]:
|
||||
"""Small transparent fallback used by tests and explicit offline smoke mode."""
|
||||
text = query.lower()
|
||||
platform = str((context or {}).get("platform", "unknown")).lower()
|
||||
|
||||
current_words = ["current", "now", "health", "port", "process", "systemd", "status", "npu", "listening", "logs"]
|
||||
prior_words = ["where did we leave", "what did we decide", "previous", "earlier", "handoff", "plan"]
|
||||
coding_words = ["implement", "code", "repo", "test", "pytest", "diff", "branch", "hermes"]
|
||||
research_words = ["research", "compare", "summarize", "explain", "what is", "how do i"]
|
||||
unsafe_words = ["change live routing", "live routing", "restart", "send", "write memory", "reindex", "mutate", "delete"]
|
||||
|
||||
safety = any(w in text for w in unsafe_words)
|
||||
tool_needed = any(w in text for w in current_words + coding_words) or safety
|
||||
|
||||
if platform == "kanban" or "kanban" in text or any(w in text for w in coding_words):
|
||||
category = "coding"
|
||||
elif any(w in text for w in current_words):
|
||||
category = "devops"
|
||||
elif any(w in text for w in research_words + prior_words):
|
||||
category = "research"
|
||||
else:
|
||||
category = "chat"
|
||||
|
||||
if "remember" in text or "preference" in text:
|
||||
memory_candidate = "durable_user_fact"
|
||||
elif "convention" in text or "workflow" in text:
|
||||
memory_candidate = "workflow_convention"
|
||||
else:
|
||||
memory_candidate = "none"
|
||||
|
||||
urgency = "high" if any(w in text for w in ["urgent", "critical", "down", "broken"]) else "normal"
|
||||
return {
|
||||
"tool_needed": {"value": tool_needed, "confidence": 0.76 if tool_needed else 0.68},
|
||||
"memory_candidate": {"value": memory_candidate, "confidence": 0.8 if memory_candidate != "none" else 0.35},
|
||||
"urgency": {"value": urgency, "confidence": 0.8 if urgency == "high" else 0.65},
|
||||
"workflow_category": {"value": category, "confidence": 0.78 if category != "chat" else 0.7},
|
||||
"safety_confirmation_required": {"value": safety, "confidence": 0.9 if safety else 0.2},
|
||||
}
|
||||
|
||||
|
||||
class _NoClassifierRedirectHandler(urllib.request.HTTPRedirectHandler):
|
||||
"""Fail closed instead of following redirects away from a validated local URL."""
|
||||
|
||||
def redirect_request(self, req, fp, code, msg, headers, newurl): # type: ignore[no-untyped-def]
|
||||
return None
|
||||
|
||||
|
||||
_CLASSIFIER_OPENER = urllib.request.build_opener(_NoClassifierRedirectHandler)
|
||||
|
||||
|
||||
def classify_live(
|
||||
query: str,
|
||||
context: Mapping[str, Any] | None = None,
|
||||
classifier_url: str = DEFAULT_CLASSIFIER_URL,
|
||||
timeout: float = 8.0,
|
||||
) -> ClassifierResult:
|
||||
classifier_url = validate_classifier_url(classifier_url)
|
||||
before = read_npu_busy_time_us()
|
||||
payload = {
|
||||
"id": f"context-gate-{int(time.time())}",
|
||||
"text": query,
|
||||
"context": {"platform": (context or {}).get("platform", "cli"), "source": "context_gate"},
|
||||
"options": {"include_evidence": False, "include_embedding_debug": False, "dry_run": True},
|
||||
}
|
||||
req = urllib.request.Request(
|
||||
classifier_url,
|
||||
data=json.dumps(payload).encode("utf-8"),
|
||||
headers={"Content-Type": "application/json"},
|
||||
method="POST",
|
||||
)
|
||||
try:
|
||||
with _CLASSIFIER_OPENER.open(req, timeout=timeout) as resp: # noqa: S310 - local configured endpoint only
|
||||
raw = resp.read(256_000)
|
||||
except (urllib.error.URLError, TimeoutError, OSError) as exc:
|
||||
raise ContextGateError(f"classifier_unavailable: {exc}") from exc
|
||||
after = read_npu_busy_time_us()
|
||||
try:
|
||||
data = json.loads(raw.decode("utf-8"))
|
||||
except json.JSONDecodeError as exc:
|
||||
raise ContextGateError("classifier_invalid_json") from exc
|
||||
labels = data.get("labels")
|
||||
if not isinstance(labels, Mapping):
|
||||
raise ContextGateError("classifier_missing_labels")
|
||||
outer = after - before if before is not None and after is not None else None
|
||||
return ClassifierResult(
|
||||
labels=labels,
|
||||
npu_busy_delta_us=_as_int_or_none(data.get("npu_busy_delta_us")),
|
||||
sysfs_npu_busy_delta_us=_as_int_or_none(data.get("sysfs_npu_busy_delta_us")),
|
||||
outer_sysfs_delta_us=outer,
|
||||
live=True,
|
||||
)
|
||||
|
||||
|
||||
def validate_classifier_url(classifier_url: str) -> str:
|
||||
"""Validate the local-only classifier endpoint before any POST is attempted."""
|
||||
parsed = urllib.parse.urlparse(classifier_url)
|
||||
if parsed.scheme not in {"http", "https"}:
|
||||
raise ContextGateError("invalid_classifier_url:scheme_must_be_http_or_https")
|
||||
host = parsed.hostname
|
||||
if not host:
|
||||
raise ContextGateError("invalid_classifier_url:missing_host")
|
||||
host_normalized = host.lower().rstrip(".")
|
||||
if host_normalized == "localhost":
|
||||
return classifier_url
|
||||
try:
|
||||
address = ipaddress.ip_address(host_normalized)
|
||||
except ValueError as exc:
|
||||
raise ContextGateError("invalid_classifier_url:host_must_be_loopback") from exc
|
||||
if not address.is_loopback:
|
||||
raise ContextGateError("invalid_classifier_url:host_must_be_loopback")
|
||||
return classifier_url
|
||||
|
||||
|
||||
def _as_int_or_none(value: Any) -> int | None:
|
||||
try:
|
||||
return int(value)
|
||||
except (TypeError, ValueError):
|
||||
return None
|
||||
|
||||
|
||||
def classify_offline(query: str, context: Mapping[str, Any] | None = None, warning: str | None = None) -> ClassifierResult:
|
||||
return ClassifierResult(
|
||||
labels=heuristic_labels(query, context),
|
||||
npu_busy_delta_us=None,
|
||||
sysfs_npu_busy_delta_us=None,
|
||||
outer_sysfs_delta_us=None,
|
||||
live=False,
|
||||
warning=warning or "offline_heuristic_classifier_no_npu_claim",
|
||||
)
|
||||
|
||||
|
||||
def _has_any(text: str, needles: list[str]) -> bool:
|
||||
return any(n in text for n in needles)
|
||||
|
||||
|
||||
def _source(source: str, action: str, reason: str, priority: int, freshness: str, confidence: float) -> dict[str, Any]:
|
||||
assert source in _ALLOWED_SOURCES
|
||||
return {
|
||||
"source": source,
|
||||
"action": action,
|
||||
"reason": reason,
|
||||
"priority": priority,
|
||||
"freshness": freshness,
|
||||
"permission": "tool_required_by_authoritative_agent" if source != "no_retrieval" else "none",
|
||||
"missing_behavior": "retrieve_or_mark_missing" if source != "no_retrieval" else "skip_retrieval",
|
||||
"confidence": round(confidence, 2),
|
||||
}
|
||||
|
||||
|
||||
def select_sources(query: str, labels: Mapping[str, Any], context: Mapping[str, Any], max_sources: int) -> list[dict[str, Any]]:
|
||||
text = query.lower()
|
||||
sources: list[dict[str, Any]] = []
|
||||
category = str(_label_value(labels, "workflow_category", "unknown"))
|
||||
memory_candidate = str(_label_value(labels, "memory_candidate", "none"))
|
||||
tool_needed = bool(_label_value(labels, "tool_needed", False))
|
||||
|
||||
if tool_needed or _has_any(text, ["current", "now", "health", "port", "process", "systemd", "status", "npu", "listening", "logs", "time", "date"]):
|
||||
sources.append(_source("live_system", "inspect_with_terminal_or_domain_tool", "current service/system state requested", 1, "live_required", 0.9))
|
||||
|
||||
if context.get("repo_path") or category == "coding" or _has_any(text, ["repo", "code", "file", "test", "pytest", "diff", "implementation", "hermes", "atlas"]):
|
||||
sources.append(_source("repo_files", "inspect_explicit_repo_paths", "repo-specific implementation or config context", 2, "current_filesystem", 0.84))
|
||||
|
||||
if _has_any(text, ["where did we leave", "what did we decide", "previous", "earlier", "handoff", "prior", "last time"]):
|
||||
sources.append(_source("session_search", "search_prior_sessions_or_kanban_handoffs", "prior decision or handoff requested", 3, "session-era", 0.82))
|
||||
|
||||
if _has_any(text, ["runbook", "note", "obsidian", "rag", "docs", "knowledge", "plan"]):
|
||||
sources.append(_source("rag_search", "query_local_index_read_only", "local docs or indexed knowledge likely useful", 4, "cached_index", 0.76))
|
||||
|
||||
if memory_candidate != "none" or _has_any(text, ["preference", "remember", "profile", "durable fact"]):
|
||||
sources.append(_source("durable_memory", "read_stable_facts_only", "stable preference/environment facts may be relevant", 5, "static", 0.72))
|
||||
|
||||
if _has_any(text, ["latest", "news", "version", "release", "public", "web"]):
|
||||
sources.append(_source("web", "search_public_current_sources", "current external public fact requested", 6, "live_external", 0.7))
|
||||
|
||||
if not sources:
|
||||
sources.append(_source("no_retrieval", "answer_directly", "no factual retrieval dependency detected", 1, "none", 0.78))
|
||||
|
||||
# Stable priority order and bounded compact plan.
|
||||
seen: set[str] = set()
|
||||
deduped = []
|
||||
for item in sorted(sources, key=lambda x: x["priority"]):
|
||||
if item["source"] not in seen:
|
||||
seen.add(item["source"])
|
||||
deduped.append(item)
|
||||
return deduped[:max_sources]
|
||||
|
||||
|
||||
def select_bundle_name(query: str, labels: Mapping[str, Any], context: Mapping[str, Any]) -> str:
|
||||
text = query.lower()
|
||||
category = str(_label_value(labels, "workflow_category", "unknown"))
|
||||
if context.get("platform") == "kanban" or context.get("task_id") or category == "coding":
|
||||
return "CodingTaskBundle"
|
||||
if category in {"devops", "debugging"} or _has_any(text, ["health", "port", "systemd", "npu", "service", "logs"]):
|
||||
return "OpsDebugBundle"
|
||||
if category in {"note_taking", "productivity"} or _has_any(text, ["preference", "remember", "profile"]):
|
||||
return "PersonalAssistantBundle"
|
||||
if "no_retrieval" in [s["source"] for s in select_sources(query, labels, context, 1)]:
|
||||
return "SimpleResponseBundle"
|
||||
return "ResearchBundle"
|
||||
|
||||
|
||||
def _field(field: str, shape: str, source: str, freshness: str, missing: str, privacy: str, confidence: float = 0.8) -> dict[str, Any]:
|
||||
return {
|
||||
"field": field,
|
||||
"shape": shape,
|
||||
"source_of_truth": source,
|
||||
"freshness": freshness,
|
||||
"provenance_required": True,
|
||||
"missing_behavior": missing,
|
||||
"privacy": privacy,
|
||||
"confidence": round(confidence, 2),
|
||||
}
|
||||
|
||||
|
||||
def build_bundle_plan(bundle_name: str, sources: Sequence[Mapping[str, Any]], query: str, labels: Mapping[str, Any]) -> dict[str, Any]:
|
||||
safety_required = bool(_label_value(labels, "safety_confirmation_required", False))
|
||||
source_names = {s["source"] for s in sources}
|
||||
if bundle_name == "OpsDebugBundle":
|
||||
required = [
|
||||
_field("problem_statement", "compact_text", "user", "request", "mark_missing", "query_text_only"),
|
||||
_field("target_scope", "service_repo_or_host", "query_or_classifier", "request", "ask_or_infer_low_confidence", "no_private_paths_beyond_explicit"),
|
||||
_field("live_state", "status_table", "live_system", "live_required", "retrieve_or_fail_closed", "no_raw_logs_by_default"),
|
||||
_field("safety_gates", "closed_gate_map", "policy", "static", "fail_closed", "no_private_data"),
|
||||
_field("provenance", "tool_names_and_paths", "executing_agent", "run", "mark_missing", "paths_only"),
|
||||
]
|
||||
elif bundle_name == "CodingTaskBundle":
|
||||
required = [
|
||||
_field("repo_root", "absolute_path", "task_or_context", "current", "ask_or_fail", "explicit_path_only"),
|
||||
_field("git_state", "branch_dirty_counts", "live_system", "live_required", "retrieve_or_fail_closed", "no_diff_dump_by_default"),
|
||||
_field("requirements", "bullet_summary", "user_kanban_files", "current", "retrieve_or_mark_missing", "no_private_snippets"),
|
||||
_field("relevant_paths", "path_list", "repo_files", "current_filesystem", "search_narrowly", "paths_only"),
|
||||
_field("tests_or_smokes", "command_list", "repo_files", "current_filesystem", "mark_missing", "commands_only"),
|
||||
_field("review_gates", "closed_gate_map", "policy", "static", "fail_closed", "no_private_data"),
|
||||
]
|
||||
elif bundle_name == "PersonalAssistantBundle":
|
||||
required = [
|
||||
_field("user_intent", "compact_text", "user", "request", "mark_missing", "query_text_only"),
|
||||
_field("durable_facts_needed", "fact_keys", "durable_memory", "static", "retrieve_or_mark_missing", "no_raw_memory_dump"),
|
||||
_field("prior_decisions_needed", "session_refs", "session_search", "session-era", "retrieve_or_mark_missing", "summaries_only"),
|
||||
_field("privacy_boundary", "closed_gate_map", "policy", "static", "fail_closed", "no_private_data"),
|
||||
_field("action_authority", "closed_gate_map", "policy", "static", "fail_closed", "no_private_data"),
|
||||
]
|
||||
elif bundle_name == "SimpleResponseBundle":
|
||||
required = []
|
||||
else:
|
||||
required = [
|
||||
_field("research_question", "compact_text", "user", "request", "mark_missing", "query_text_only"),
|
||||
_field("source_plan", "ordered_source_list", "context_gate", "run", "mark_missing", "no_private_snippets"),
|
||||
_field("evidence_requirements", "provenance_rules", "policy", "static", "fail_closed", "no_private_data"),
|
||||
_field("freshness_cutoff", "freshness_policy", "classifier_query", "request", "mark_missing", "no_private_data"),
|
||||
_field("missing_data_behavior", "policy_enum", "policy", "static", "fail_closed", "no_private_data"),
|
||||
]
|
||||
|
||||
blocked = []
|
||||
if safety_required or re.search(r"\b(route|routing|restart|send|write memory|reindex|delete|mutate)\b", query.lower()):
|
||||
blocked.append(_field("authority_side_effect", "approval_required", "policy", "static", "fail_closed", "no_side_effects_in_v1", 0.95))
|
||||
if "rag_search" in source_names:
|
||||
blocked.append(_field("vector_db_mutation", "not_allowed", "policy", "static", "fail_closed", "read_only_query_plan", 0.95))
|
||||
return {"bundle_name": bundle_name, "required_fields": required, "optional_fields": [], "blocked_fields": blocked}
|
||||
|
||||
|
||||
def summarize_query_class(labels: Mapping[str, Any]) -> dict[str, Any]:
|
||||
return {
|
||||
"workflow_category": _label_value(labels, "workflow_category", "unknown"),
|
||||
"urgency": _label_value(labels, "urgency", "normal"),
|
||||
"tool_needed": bool(_label_value(labels, "tool_needed", False)),
|
||||
"memory_candidate": _label_value(labels, "memory_candidate", "none"),
|
||||
"safety_confirmation_required": bool(_label_value(labels, "safety_confirmation_required", False)),
|
||||
"confidence": round(max(
|
||||
_label_confidence(labels, "workflow_category", 0.5),
|
||||
_label_confidence(labels, "tool_needed", 0.5),
|
||||
_label_confidence(labels, "safety_confirmation_required", 0.5),
|
||||
), 2),
|
||||
}
|
||||
|
||||
|
||||
def npu_proof_from_classifier(result: ClassifierResult, require_npu_proof: bool) -> tuple[dict[str, Any], list[str]]:
|
||||
endpoint_delta = result.npu_busy_delta_us
|
||||
endpoint_sysfs_delta = result.sysfs_npu_busy_delta_us
|
||||
outer_delta = result.outer_sysfs_delta_us
|
||||
positive_endpoint_sysfs = endpoint_sysfs_delta is not None and endpoint_sysfs_delta > 0
|
||||
positive_outer = outer_delta is not None and outer_delta > 0
|
||||
verified = bool(result.live and (positive_endpoint_sysfs or positive_outer))
|
||||
warnings: list[str] = []
|
||||
if result.warning:
|
||||
warnings.append(result.warning)
|
||||
if require_npu_proof and not verified:
|
||||
warnings.append("npu_proof_inconclusive")
|
||||
return {
|
||||
"classifier_delta_us": endpoint_delta,
|
||||
"classifier_sysfs_delta_us": endpoint_sysfs_delta,
|
||||
"outer_sysfs_delta_us": outer_delta,
|
||||
"rerank_delta_us": None,
|
||||
"verified": verified,
|
||||
"required": require_npu_proof,
|
||||
"classifier_live": result.live,
|
||||
}, warnings
|
||||
|
||||
|
||||
def build_plan(
|
||||
query: str,
|
||||
*,
|
||||
context: Mapping[str, Any] | None = None,
|
||||
options: Mapping[str, Any] | None = None,
|
||||
classifier: ClassifierResult | None = None,
|
||||
) -> dict[str, Any]:
|
||||
if not query or not query.strip():
|
||||
raise ContextGateError("query_required")
|
||||
context = dict(context or {})
|
||||
options = dict(options or {})
|
||||
if options.get("dry_run", True) is not True:
|
||||
raise ContextGateError("dry_run_must_remain_true_in_v1")
|
||||
if options.get("include_private_text", False):
|
||||
raise ContextGateError("include_private_text_not_allowed_in_v1")
|
||||
max_sources = max(1, min(6, int(options.get("max_sources", 4))))
|
||||
require_npu = bool(options.get("require_npu_proof", True))
|
||||
if classifier is None:
|
||||
classifier = classify_offline(query, context)
|
||||
labels = classifier.labels
|
||||
source_plan = select_sources(query, labels, context, max_sources)
|
||||
bundle_name = select_bundle_name(query, labels, context)
|
||||
npu_proof, warnings = npu_proof_from_classifier(classifier, require_npu)
|
||||
plan = {
|
||||
"schema": SCHEMA,
|
||||
"trace_id": options.get("trace_id") or context.get("trace_id"),
|
||||
"dry_run": True,
|
||||
"ok": True,
|
||||
"query_class": summarize_query_class(labels),
|
||||
"source_plan": source_plan,
|
||||
"bundle_plan": build_bundle_plan(bundle_name, source_plan, query, labels),
|
||||
"npu_proof": npu_proof,
|
||||
"authority": dict(AUTHORITY),
|
||||
"gates": dict(GATES),
|
||||
"warnings": warnings,
|
||||
}
|
||||
validate_plan(plan)
|
||||
return plan
|
||||
|
||||
|
||||
def validate_plan(plan: Mapping[str, Any]) -> None:
|
||||
if plan.get("schema") != SCHEMA:
|
||||
raise ContextGateError("invalid_schema")
|
||||
if plan.get("dry_run") is not True:
|
||||
raise ContextGateError("dry_run_missing")
|
||||
if plan.get("authority") != AUTHORITY:
|
||||
raise ContextGateError("authority_not_closed")
|
||||
sources = plan.get("source_plan")
|
||||
if not isinstance(sources, list) or not sources:
|
||||
raise ContextGateError("source_plan_required")
|
||||
for item in sources:
|
||||
if item.get("source") not in _ALLOWED_SOURCES:
|
||||
raise ContextGateError(f"invalid_source:{item.get('source')}")
|
||||
required_blocks = ["query_class", "bundle_plan", "npu_proof", "gates"]
|
||||
for block in required_blocks:
|
||||
if block not in plan:
|
||||
raise ContextGateError(f"missing_block:{block}")
|
||||
|
||||
|
||||
def compact_line(plan: Mapping[str, Any]) -> str:
|
||||
sources = ",".join(str(s["source"]) for s in plan["source_plan"])
|
||||
closed = "route,memory,send,tools,restart,vector,private_roots,config"
|
||||
warnings = ",".join(plan.get("warnings") or []) or "none"
|
||||
return (
|
||||
f"ok={str(plan['ok']).lower()} schema={plan['schema']} "
|
||||
f"bundle={plan['bundle_plan']['bundle_name']} sources={sources} "
|
||||
f"source_count={len(plan['source_plan'])} "
|
||||
f"npu_verified={str(plan['npu_proof']['verified']).lower()} "
|
||||
f"classifier_delta_us={plan['npu_proof'].get('classifier_delta_us')} "
|
||||
f"outer_sysfs_delta_us={plan['npu_proof'].get('outer_sysfs_delta_us')} "
|
||||
f"gates=closed:{closed} warnings={warnings}"
|
||||
)
|
||||
|
||||
|
||||
def compact_json(plan: Mapping[str, Any]) -> str:
|
||||
compact = {
|
||||
"schema": plan["schema"],
|
||||
"ok": plan["ok"],
|
||||
"dry_run": plan["dry_run"],
|
||||
"bundle_name": plan["bundle_plan"]["bundle_name"],
|
||||
"sources": [s["source"] for s in plan["source_plan"]],
|
||||
"source_count": len(plan["source_plan"]),
|
||||
"query_class": plan["query_class"],
|
||||
"npu_proof": plan["npu_proof"],
|
||||
"authority": plan["authority"],
|
||||
"gates_closed": list(plan["gates"].keys()),
|
||||
"warnings": plan.get("warnings", []),
|
||||
}
|
||||
return json.dumps(compact, sort_keys=True, separators=(",", ":"))
|
||||
@@ -0,0 +1,16 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Thin repo-local wrapper for the Atlas/Hermes context-gate advisory CLI."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
REPO_ROOT = Path(__file__).resolve().parents[1]
|
||||
if str(REPO_ROOT) not in sys.path:
|
||||
sys.path.insert(0, str(REPO_ROOT))
|
||||
|
||||
from openvino_context_gate.cli import main # noqa: E402
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
|
||||
Executable
+526
@@ -0,0 +1,526 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Dry-run Kanban hygiene advisory classifier.
|
||||
|
||||
Reads compact board/task summaries and emits bounded labels/next gates without
|
||||
mutating any Hermes Kanban state. Phase 1 is deterministic rules only; it does
|
||||
not call kanban tools, restart services, write memory, or send outbound data.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import re
|
||||
import sys
|
||||
import time
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
SCHEMA = "kanban_hygiene_advisory_v1"
|
||||
AUTHORITY = {
|
||||
"may_mutate_board": False,
|
||||
"may_assign": False,
|
||||
"may_block_or_unblock": False,
|
||||
"may_complete_or_archive": False,
|
||||
"may_create_tasks": False,
|
||||
"may_write_memory": False,
|
||||
"may_send_external": False,
|
||||
"may_restart_services": False,
|
||||
"may_execute_tools": False,
|
||||
}
|
||||
NPU_PROOF = {
|
||||
"required_for_npu_claims": True,
|
||||
"attempted": False,
|
||||
"ok": None,
|
||||
"npu_busy_delta_us": None,
|
||||
}
|
||||
|
||||
REQUIRED_TASK_FIELDS = {"id", "title", "status", "assignee", "created_at", "updated_at"}
|
||||
SUPPORTED_STATUSES = {
|
||||
"triage",
|
||||
"todo",
|
||||
"ready",
|
||||
"running",
|
||||
"blocked",
|
||||
"done",
|
||||
"archived",
|
||||
"failed",
|
||||
"cancelled",
|
||||
}
|
||||
TASK_TYPES = {
|
||||
"charter",
|
||||
"discovery",
|
||||
"spec",
|
||||
"implement",
|
||||
"test",
|
||||
"review",
|
||||
"docs",
|
||||
"ops",
|
||||
"integration",
|
||||
"final",
|
||||
"unknown",
|
||||
}
|
||||
LANES = {
|
||||
"observability_utilization",
|
||||
"cron_n8n_classifier",
|
||||
"rag_context_gate",
|
||||
"doc_image_audio_triage",
|
||||
"voice_audio_pipeline",
|
||||
"kanban_hygiene",
|
||||
"docs_runbook_service_map",
|
||||
"ops_integration",
|
||||
"final_closeout",
|
||||
"general",
|
||||
"unknown",
|
||||
}
|
||||
LIFECYCLE_PREFIXES = {
|
||||
"charter",
|
||||
"discovery",
|
||||
"spec",
|
||||
"implement",
|
||||
"test",
|
||||
"review",
|
||||
"docs",
|
||||
"doc",
|
||||
"ops",
|
||||
"integration",
|
||||
"final",
|
||||
}
|
||||
|
||||
|
||||
def compact_text(task: dict[str, Any]) -> str:
|
||||
parts = [str(task.get("title", "")), str(task.get("body_excerpt", "")), str(task.get("last_run_summary_excerpt", "")), str(task.get("last_comment_excerpt", ""))]
|
||||
return " ".join(part for part in parts if part).lower()
|
||||
|
||||
|
||||
def load_jsonl(raw: str) -> tuple[list[dict[str, Any]], dict[str, Any]]:
|
||||
tasks = []
|
||||
for line_no, line in enumerate(raw.splitlines(), start=1):
|
||||
if not line.strip():
|
||||
continue
|
||||
try:
|
||||
row = json.loads(line)
|
||||
except json.JSONDecodeError as exc:
|
||||
raise ValueError(f"invalid JSONL on line {line_no}: {exc.msg}") from exc
|
||||
if not isinstance(row, dict):
|
||||
raise ValueError(f"JSONL line {line_no} is not an object")
|
||||
tasks.append(row)
|
||||
return tasks, {}
|
||||
|
||||
|
||||
def load_input(path: str | None, fmt: str) -> tuple[list[dict[str, Any]], dict[str, Any]]:
|
||||
raw = sys.stdin.read() if not path or path == "-" else Path(path).read_text(encoding="utf-8")
|
||||
if not raw.strip():
|
||||
raise ValueError("input is empty")
|
||||
|
||||
parse_as_jsonl = fmt == "jsonl" or (fmt == "auto" and "\n" in raw.strip() and not raw.lstrip().startswith(("{", "[")))
|
||||
if parse_as_jsonl:
|
||||
return load_jsonl(raw)
|
||||
|
||||
try:
|
||||
data = json.loads(raw)
|
||||
except json.JSONDecodeError as exc:
|
||||
if fmt == "auto" and "\n" in raw.strip():
|
||||
return load_jsonl(raw)
|
||||
raise ValueError(f"invalid JSON input: {exc.msg}") from exc
|
||||
if isinstance(data, list):
|
||||
if not all(isinstance(item, dict) for item in data):
|
||||
raise ValueError("JSON list must contain task objects")
|
||||
return data, {}
|
||||
if isinstance(data, dict):
|
||||
tasks = data.get("tasks")
|
||||
if tasks is None:
|
||||
# Treat a single object with required task fields as a one-task summary.
|
||||
if REQUIRED_TASK_FIELDS.issubset(data):
|
||||
return [data], {}
|
||||
raise ValueError("JSON object must contain a 'tasks' list")
|
||||
if not isinstance(tasks, list) or not all(isinstance(item, dict) for item in tasks):
|
||||
raise ValueError("'tasks' must be a list of objects")
|
||||
metadata = {key: value for key, value in data.items() if key != "tasks"}
|
||||
return tasks, metadata
|
||||
raise ValueError("input must be JSON object, JSON list, or JSON Lines")
|
||||
|
||||
|
||||
def validate_task(task: dict[str, Any]) -> None:
|
||||
missing = sorted(REQUIRED_TASK_FIELDS - set(task))
|
||||
if missing:
|
||||
task_id = task.get("id", "<unknown>")
|
||||
raise ValueError(f"task {task_id} missing required fields: {', '.join(missing)}")
|
||||
status = str(task.get("status"))
|
||||
if status not in SUPPORTED_STATUSES:
|
||||
raise ValueError(f"task {task.get('id')} has unsupported status: {status}")
|
||||
for field in ("created_at", "updated_at"):
|
||||
if not isinstance(task.get(field), (int, float)):
|
||||
raise ValueError(f"task {task.get('id')} field {field} must be epoch seconds")
|
||||
|
||||
|
||||
def confidence(value: float) -> float:
|
||||
return round(max(0.0, min(1.0, value)), 2)
|
||||
|
||||
|
||||
def classify_task_type(task: dict[str, Any]) -> dict[str, Any]:
|
||||
title = str(task.get("title", "")).strip().lower()
|
||||
body = compact_text(task)
|
||||
prefix = title.split(":", 1)[0].strip() if ":" in title else ""
|
||||
prefix_map = {"doc": "docs"}
|
||||
if prefix in LIFECYCLE_PREFIXES:
|
||||
value = prefix_map.get(prefix, prefix)
|
||||
if value in TASK_TYPES:
|
||||
return {"value": value, "confidence": 0.95, "reason_codes": [f"title_prefix_{value}"]}
|
||||
keyword_rules = [
|
||||
("discovery", ["discover", "inventory", "repo map", "read-only"]),
|
||||
("spec", ["spec", "define", "contract", "schema"]),
|
||||
("implement", ["implement", "engineer", "script", "code", "build"]),
|
||||
("review", ["review", "approve", "findings"]),
|
||||
("docs", ["docs", "runbook", "readme"]),
|
||||
("ops", ["ops", "health", "monitor", "deploy", "cleanup"]),
|
||||
("integration", ["integration", "merge", "cherry-pick", "fan-in"]),
|
||||
("final", ["final", "closeout", "synthesis"]),
|
||||
("test", ["test", "smoke", "validate"]),
|
||||
("charter", ["charter", "program framing"]),
|
||||
]
|
||||
for value, needles in keyword_rules:
|
||||
if any(needle in body for needle in needles):
|
||||
return {"value": value, "confidence": 0.78, "reason_codes": [f"keyword_{value}"]}
|
||||
return {"value": "unknown", "confidence": 0.2, "reason_codes": ["insufficient_signal"]}
|
||||
|
||||
|
||||
def classify_lane(task: dict[str, Any]) -> dict[str, Any]:
|
||||
text = compact_text(task)
|
||||
rules = [
|
||||
("kanban_hygiene", ["kanban", "task hygiene", "board summaries", "review-needed", "next gate"]),
|
||||
("cron_n8n_classifier", ["cron", "n8n", "alert", "event classifier"]),
|
||||
("rag_context_gate", ["rag", "context gate", "retrieval", "bundle"]),
|
||||
("doc_image_audio_triage", ["document", "image", "audio triage", "ocr", "attachments"]),
|
||||
("voice_audio_pipeline", ["voice", "whisper", "memo", "transcribe"]),
|
||||
("docs_runbook_service_map", ["service map", "runbook", "readme"]),
|
||||
("observability_utilization", ["health", "utilization", "metrics", "digest"]),
|
||||
("ops_integration", ["merge", "integration", "cherry-pick", "fan-in"]),
|
||||
("final_closeout", ["final", "closeout", "synthesis"]),
|
||||
]
|
||||
for value, needles in rules:
|
||||
matched = [needle.replace(" ", "_") for needle in needles if needle in text]
|
||||
if matched:
|
||||
return {"value": value, "confidence": 0.9, "reason_codes": [f"mentions_{matched[0]}"]}
|
||||
if text:
|
||||
return {"value": "general", "confidence": 0.45, "reason_codes": ["no_lane_specific_signal"]}
|
||||
return {"value": "unknown", "confidence": 0.1, "reason_codes": ["insufficient_signal"]}
|
||||
|
||||
|
||||
def classify_project(task: dict[str, Any], board: str | None, input_metadata: dict[str, Any]) -> dict[str, Any]:
|
||||
explicit = task.get("project") or input_metadata.get("project")
|
||||
if explicit:
|
||||
return {"value": str(explicit), "confidence": 0.9, "source": "input"}
|
||||
board_name = board or input_metadata.get("board")
|
||||
if board_name:
|
||||
return {"value": str(board_name), "confidence": 0.98, "source": "board_name"}
|
||||
text = compact_text(task)
|
||||
if "npu" in text or "openvino" in text:
|
||||
return {"value": "npu-maximization", "confidence": 0.72, "source": "body"}
|
||||
return {"value": "unknown", "confidence": 0.1, "source": "unknown"}
|
||||
|
||||
|
||||
def classify_blocker(task: dict[str, Any]) -> dict[str, Any]:
|
||||
status = str(task.get("status"))
|
||||
text = compact_text(task)
|
||||
last_outcome = str(task.get("last_run_outcome") or "").lower()
|
||||
reason_codes: list[str] = []
|
||||
value = "none"
|
||||
blocked = False
|
||||
conf = 0.0
|
||||
|
||||
if status == "blocked":
|
||||
blocked = True
|
||||
conf = 0.85
|
||||
if "review-required" in text or "changes requested" in text:
|
||||
value = "review_changes_requested"
|
||||
reason_codes.append("blocked_review_required_or_changes")
|
||||
elif any(word in text for word in ("credential", "token", "path", "spawn_failed")):
|
||||
value = "missing_credentials"
|
||||
reason_codes.append("blocked_missing_credentials_or_path")
|
||||
elif any(word in text for word in ("human", "approval", "decision", "confirm")):
|
||||
value = "human_decision"
|
||||
reason_codes.append("blocked_human_decision")
|
||||
else:
|
||||
value = "unknown"
|
||||
reason_codes.append("status_blocked")
|
||||
elif status == "todo" and task.get("parents"):
|
||||
value = "missing_parent"
|
||||
conf = 0.75
|
||||
reason_codes.append("todo_with_parents")
|
||||
elif last_outcome in {"crashed", "timed_out", "failed"}:
|
||||
value = "failed_tests"
|
||||
conf = 0.65
|
||||
reason_codes.append(f"last_run_{last_outcome}")
|
||||
|
||||
return {"value": value, "blocked": blocked, "confidence": confidence(conf), "reason_codes": reason_codes}
|
||||
|
||||
|
||||
def age_hours(now: float, timestamp: Any) -> float | None:
|
||||
if not isinstance(timestamp, (int, float)):
|
||||
return None
|
||||
return round(max(0.0, now - float(timestamp)) / 3600.0, 2)
|
||||
|
||||
|
||||
def classify_staleness(task: dict[str, Any], now: float) -> dict[str, Any]:
|
||||
status = str(task.get("status"))
|
||||
created = float(task["created_at"])
|
||||
activity_ts = float(task.get("heartbeat_at") or task.get("last_activity_at") or task.get("updated_at") or created)
|
||||
age = age_hours(now, created)
|
||||
last_activity = age_hours(now, activity_ts)
|
||||
threshold = 24
|
||||
value = "fresh"
|
||||
reason_codes: list[str] = []
|
||||
|
||||
if status == "running":
|
||||
threshold = 1
|
||||
if last_activity is not None and last_activity > 1:
|
||||
value = "stale_lock"
|
||||
reason_codes.append("running_no_recent_heartbeat")
|
||||
elif status == "ready":
|
||||
threshold = 24
|
||||
if last_activity is not None and last_activity >= 72:
|
||||
value = "stale"
|
||||
reason_codes.append("ready_over_72h")
|
||||
elif last_activity is not None and last_activity >= 24:
|
||||
value = "aging"
|
||||
reason_codes.append("ready_over_24h")
|
||||
elif status == "blocked":
|
||||
review_required = "review-required" in compact_text(task)
|
||||
threshold = 24 if review_required else 48
|
||||
if last_activity is not None and last_activity >= 168:
|
||||
value = "stale"
|
||||
reason_codes.append("blocked_over_7d")
|
||||
elif review_required and last_activity is not None and last_activity >= 72:
|
||||
value = "stale"
|
||||
reason_codes.append("review_required_over_72h")
|
||||
elif last_activity is not None and last_activity >= threshold:
|
||||
value = "aging"
|
||||
reason_codes.append("blocked_or_review_aging")
|
||||
elif status == "todo" and not task.get("parents") and last_activity is not None and last_activity >= 72:
|
||||
value = "orphaned"
|
||||
threshold = 72
|
||||
reason_codes.append("todo_without_parents_over_72h")
|
||||
|
||||
return {
|
||||
"value": value,
|
||||
"age_hours": age,
|
||||
"last_activity_hours": last_activity,
|
||||
"threshold_hours": threshold,
|
||||
"reason_codes": reason_codes,
|
||||
}
|
||||
|
||||
|
||||
def normalize_title(title: str) -> str:
|
||||
text = title.lower().strip()
|
||||
text = re.sub(r"^(charter|discovery|spec|implement|test|review|docs?|ops|integration|final)\s*:\s*", "", text)
|
||||
text = re.sub(r"[^a-z0-9]+", " ", text)
|
||||
return re.sub(r"\s+", " ", text).strip()
|
||||
|
||||
|
||||
def find_duplicates(tasks: list[dict[str, Any]], labels: dict[str, dict[str, Any]]) -> dict[str, dict[str, Any]]:
|
||||
groups: dict[tuple[str, str, str], list[str]] = {}
|
||||
active_statuses = SUPPORTED_STATUSES - {"done", "archived", "cancelled"}
|
||||
for task in tasks:
|
||||
if str(task.get("status")) not in active_statuses:
|
||||
continue
|
||||
task_id = str(task["id"])
|
||||
key = (
|
||||
normalize_title(str(task.get("title", ""))),
|
||||
labels[task_id]["lane"]["value"],
|
||||
labels[task_id]["task_type"]["value"],
|
||||
)
|
||||
if key[0]:
|
||||
groups.setdefault(key, []).append(task_id)
|
||||
|
||||
result = {
|
||||
str(task["id"]): {
|
||||
"is_duplicate": False,
|
||||
"canonical_task_id": None,
|
||||
"candidate_ids": [],
|
||||
"confidence": 0.0,
|
||||
"reason_codes": [],
|
||||
}
|
||||
for task in tasks
|
||||
}
|
||||
for ids in groups.values():
|
||||
if len(ids) < 2:
|
||||
continue
|
||||
canonical = sorted(ids)[0]
|
||||
for task_id in ids:
|
||||
candidates = [candidate for candidate in ids if candidate != task_id]
|
||||
result[task_id] = {
|
||||
"is_duplicate": task_id != canonical,
|
||||
"canonical_task_id": canonical if task_id != canonical else None,
|
||||
"candidate_ids": candidates,
|
||||
"confidence": 0.86,
|
||||
"reason_codes": ["same_normalized_title_lane_and_task_type"],
|
||||
}
|
||||
return result
|
||||
|
||||
|
||||
def has_non_positive_npu_busy_delta(text: str) -> bool:
|
||||
if "npu" not in text and "busy" not in text:
|
||||
return False
|
||||
patterns = [
|
||||
r"\b(?:npu_)?busy(?:_time)?(?:_delta)?(?:_us)?\s*[=:]\s*([+-]?\d+(?:\.\d+)?)\b",
|
||||
r"\b(?:npu_)?delta(?:_us)?\s*[=:]\s*([+-]?\d+(?:\.\d+)?)\b",
|
||||
]
|
||||
for pattern in patterns:
|
||||
for match in re.finditer(pattern, text):
|
||||
try:
|
||||
if float(match.group(1)) <= 0:
|
||||
return True
|
||||
except ValueError:
|
||||
continue
|
||||
return False
|
||||
|
||||
|
||||
def classify_review_needed(task: dict[str, Any], task_type: str) -> dict[str, Any]:
|
||||
text = compact_text(task)
|
||||
changed_files = task.get("changed_files") or task.get("diff_path") or task.get("tests_run")
|
||||
if has_non_positive_npu_busy_delta(text):
|
||||
return {"value": True, "kind": "npu_proof_gate", "confidence": 0.84, "reason_codes": ["npu_claim_non_positive_busy_delta"]}
|
||||
if "npu" in text and ("http 200" in text or "no busy" in text or "missing busy" in text):
|
||||
return {"value": True, "kind": "npu_proof_gate", "confidence": 0.8, "reason_codes": ["npu_claim_needs_busy_delta"]}
|
||||
if "review-required" in text:
|
||||
kind = "code_change" if task_type == "implement" else "spec_review"
|
||||
return {"value": True, "kind": kind, "confidence": 0.92, "reason_codes": ["review_required_marker"]}
|
||||
if changed_files and task_type in {"implement", "ops", "docs"}:
|
||||
return {"value": True, "kind": "code_change", "confidence": 0.86, "reason_codes": ["reported_changed_files_or_tests"]}
|
||||
if any(needle in text for needle in ("routing authority", "restart service", "write memory", "send outbound", "private root", "wildcard bind", "vector db mutation")):
|
||||
return {"value": True, "kind": "human_approval", "confidence": 0.84, "reason_codes": ["authority_change_requires_approval"]}
|
||||
return {"value": False, "kind": "none", "confidence": 0.2, "reason_codes": []}
|
||||
|
||||
|
||||
def classify_next_gate(task: dict[str, Any], labels: dict[str, Any]) -> dict[str, Any]:
|
||||
task_type = labels["task_type"]["value"]
|
||||
status = str(task.get("status"))
|
||||
reason_codes: list[str] = []
|
||||
|
||||
if labels["duplicate"]["is_duplicate"]:
|
||||
return {"value": "dedupe_review", "confidence": 0.86, "reason_codes": ["duplicate_candidate"]}
|
||||
if labels["staleness"]["value"] == "stale_lock":
|
||||
return {"value": "investigate_stale_lock", "confidence": 0.88, "reason_codes": ["running_stale_lock"]}
|
||||
blocker = labels["blocker"]
|
||||
if blocker["value"] in {"human_decision", "missing_credentials", "unknown"} and blocker["blocked"]:
|
||||
return {"value": "needs_human_decision", "confidence": 0.85, "reason_codes": blocker["reason_codes"] or ["blocked"]}
|
||||
if blocker["value"] == "missing_parent":
|
||||
return {"value": "wait_for_parents", "confidence": 0.82, "reason_codes": ["unfinished_parents"]}
|
||||
if task_type == "implement" and not (task.get("tests_run") or task.get("test_evidence")) and status in {"blocked", "done"}:
|
||||
return {"value": "needs_test_evidence", "confidence": 0.78, "reason_codes": ["implementation_without_test_evidence"]}
|
||||
review_needed = labels["review_needed"]
|
||||
if review_needed["kind"] == "npu_proof_gate":
|
||||
return {"value": "needs_npu_proof", "confidence": 0.8, "reason_codes": review_needed["reason_codes"]}
|
||||
if review_needed["value"]:
|
||||
return {"value": "ready_for_review", "confidence": 0.86, "reason_codes": review_needed["reason_codes"]}
|
||||
|
||||
gate_by_type = {
|
||||
"spec": "ready_for_implementation",
|
||||
"implement": "ready_for_review",
|
||||
"review": "ready_for_integration",
|
||||
"docs": "ready_for_integration",
|
||||
"ops": "ready_for_ops_validation",
|
||||
"integration": "ready_for_closeout",
|
||||
"final": "safe_to_complete",
|
||||
"discovery": "safe_to_complete",
|
||||
"charter": "ready_for_spec",
|
||||
"test": "ready_for_review",
|
||||
}
|
||||
type_gate = gate_by_type.get(task_type, "unknown")
|
||||
if task_type in gate_by_type:
|
||||
reason_codes.append(f"task_type_{task_type}")
|
||||
return {"value": type_gate, "confidence": 0.74 if type_gate != "unknown" else 0.2, "reason_codes": reason_codes}
|
||||
|
||||
|
||||
def advisory(tasks: list[dict[str, Any]], *, board: str | None, now: float, input_metadata: dict[str, Any], include_evidence: bool) -> dict[str, Any]:
|
||||
for task in tasks:
|
||||
validate_task(task)
|
||||
|
||||
prelim: dict[str, dict[str, Any]] = {}
|
||||
for task in tasks:
|
||||
task_id = str(task["id"])
|
||||
prelim[task_id] = {
|
||||
"task_type": classify_task_type(task),
|
||||
"project": classify_project(task, board, input_metadata),
|
||||
"lane": classify_lane(task),
|
||||
"blocker": classify_blocker(task),
|
||||
"staleness": classify_staleness(task, now),
|
||||
}
|
||||
duplicates = find_duplicates(tasks, prelim)
|
||||
|
||||
items = []
|
||||
for task in tasks:
|
||||
task_id = str(task["id"])
|
||||
labels = dict(prelim[task_id])
|
||||
labels["duplicate"] = duplicates[task_id]
|
||||
labels["review_needed"] = classify_review_needed(task, labels["task_type"]["value"])
|
||||
labels["next_gate"] = classify_next_gate(task, labels)
|
||||
item = {
|
||||
"task_id": task_id,
|
||||
**labels,
|
||||
"warnings": [],
|
||||
}
|
||||
if include_evidence:
|
||||
item["evidence"] = {
|
||||
"normalized_title": normalize_title(str(task.get("title", ""))),
|
||||
"status": task.get("status"),
|
||||
"parents_count": len(task.get("parents") or []),
|
||||
"children_count": len(task.get("children") or []),
|
||||
}
|
||||
items.append(item)
|
||||
|
||||
counts = {
|
||||
"tasks": len(items),
|
||||
"duplicates": sum(1 for item in items if item["duplicate"]["is_duplicate"]),
|
||||
"review_needed": sum(1 for item in items if item["review_needed"]["value"]),
|
||||
"stale": sum(1 for item in items if item["staleness"]["value"] in {"stale", "stale_lock", "orphaned"}),
|
||||
"blocked": sum(1 for item in items if item["blocker"]["blocked"]),
|
||||
}
|
||||
return {
|
||||
"schema": SCHEMA,
|
||||
"dry_run": True,
|
||||
"created": int(now),
|
||||
"board": board or input_metadata.get("board") or None,
|
||||
"counts": counts,
|
||||
"authority": AUTHORITY,
|
||||
"npu_proof": NPU_PROOF,
|
||||
"items": items,
|
||||
}
|
||||
|
||||
|
||||
def build_parser() -> argparse.ArgumentParser:
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Dry-run Kanban hygiene advisory classifier",
|
||||
epilog="Input: JSON object with tasks[] or JSONL task objects. Required task fields: id,title,status,assignee,created_at,updated_at. Optional compact fields such as body_excerpt, parents, children, changed_files, tests_run, last_run_outcome, and last_comment_excerpt improve labels.",
|
||||
)
|
||||
parser.add_argument("--input", "-i", help="Input JSON/JSONL file; omit or '-' for stdin")
|
||||
parser.add_argument("--format", choices=["auto", "json", "jsonl"], default="auto", help="Input format")
|
||||
parser.add_argument("--board", help="Board/project name to include in output")
|
||||
parser.add_argument("--now", type=float, default=None, help="Epoch seconds for deterministic staleness tests")
|
||||
parser.add_argument("--compact", action="store_true", help="Accepted for compatibility; output is compact JSON by default")
|
||||
parser.add_argument("--include-evidence", action="store_true", help="Include short derived evidence fields")
|
||||
return parser
|
||||
|
||||
|
||||
def main(argv: list[str] | None = None) -> int:
|
||||
parser = build_parser()
|
||||
args = parser.parse_args(argv)
|
||||
try:
|
||||
tasks, metadata = load_input(args.input, args.format)
|
||||
output = advisory(
|
||||
tasks,
|
||||
board=args.board,
|
||||
now=args.now if args.now is not None else time.time(),
|
||||
input_metadata=metadata,
|
||||
include_evidence=args.include_evidence,
|
||||
)
|
||||
except (OSError, ValueError) as exc:
|
||||
print(f"kanban-hygiene-advisory: {exc}", file=sys.stderr)
|
||||
return 2
|
||||
print(json.dumps(output, sort_keys=True, separators=(",", ":")))
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
|
||||
Executable
+523
@@ -0,0 +1,523 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Explicit-root dry-run batch triage for local documents, images, and audio.
|
||||
|
||||
This wrapper is intentionally report-only. It requires a lane-scoped approved
|
||||
root in a manifest, rejects request roots that broaden that approval, redacts raw
|
||||
text/transcripts by default, and never mutates Obsidian, RAG/vector DBs, files,
|
||||
routing, memory, services, or sends.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import datetime as dt
|
||||
import hashlib
|
||||
import ipaddress
|
||||
import importlib.util
|
||||
import json
|
||||
import mimetypes
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
import time
|
||||
import urllib.error
|
||||
import urllib.parse
|
||||
import urllib.request
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
try:
|
||||
import yaml # type: ignore
|
||||
except Exception as exc: # pragma: no cover
|
||||
raise SystemExit("PyYAML is required to read triage root manifests") from exc
|
||||
|
||||
LANES = (
|
||||
"screenshots",
|
||||
"receipts",
|
||||
"downloads",
|
||||
"obsidian_attachments",
|
||||
"voice_memos",
|
||||
"meeting_snippets",
|
||||
)
|
||||
AUDIO_LANES = {"voice_memos", "meeting_snippets"}
|
||||
DOC_IMAGE_LANES = {"screenshots", "receipts", "downloads", "obsidian_attachments"}
|
||||
SKIP_DIR_NAMES = {".git", ".obsidian", "__pycache__", ".cache", "cache", "chroma", "chromadb", "vector_db", "vectors"}
|
||||
NPU_BUSY_PATH = Path("/sys/class/accel/accel0/device/npu_busy_time_us")
|
||||
DEFAULT_WHISPER_URL = "http://127.0.0.1:18816/v1/audio/transcriptions"
|
||||
MUTATIONS_FALSE = {
|
||||
"obsidian": False,
|
||||
"rag": False,
|
||||
"vector_db": False,
|
||||
"sends": False,
|
||||
"file_moves": False,
|
||||
"routing": False,
|
||||
"memory": False,
|
||||
"service_restarts": False,
|
||||
}
|
||||
ACTION_PATTERNS = {
|
||||
"follow_up": re.compile(r"\b(follow up|follow-up|circle back|reply|respond)\b", re.I),
|
||||
"date_or_deadline": re.compile(r"\b(deadline|due|by (?:mon|tue|wed|thu|fri|sat|sun)|20\d{2}[-/]\d{1,2}[-/]\d{1,2})\b", re.I),
|
||||
"decision": re.compile(r"\b(decided|decision|approved|rejected|go with|choose)\b", re.I),
|
||||
"task": re.compile(r"\b(todo|to-do|action item|assign|need to|please)\b", re.I),
|
||||
}
|
||||
|
||||
|
||||
class FailClosed(Exception):
|
||||
pass
|
||||
|
||||
|
||||
def sha256_text(text: str) -> str:
|
||||
return "sha256:" + hashlib.sha256(text.encode("utf-8", errors="replace")).hexdigest()
|
||||
|
||||
|
||||
def sha256_file(path: Path) -> str:
|
||||
h = hashlib.sha256()
|
||||
with path.open("rb") as f:
|
||||
for chunk in iter(lambda: f.read(1024 * 1024), b""):
|
||||
h.update(chunk)
|
||||
return "sha256:" + h.hexdigest()
|
||||
|
||||
|
||||
def read_busy(path: Path = NPU_BUSY_PATH) -> int | None:
|
||||
try:
|
||||
return int(path.read_text().strip())
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
|
||||
def validate_local_whisper_url(whisper_url: str) -> str:
|
||||
"""Fail closed unless Whisper transcription stays on the approved loopback service."""
|
||||
try:
|
||||
parsed = urllib.parse.urlparse(whisper_url)
|
||||
port = parsed.port
|
||||
except ValueError as exc:
|
||||
raise FailClosed("whisper_url_invalid") from exc
|
||||
|
||||
if parsed.scheme != "http":
|
||||
raise FailClosed("whisper_url_scheme_not_http")
|
||||
if parsed.username or parsed.password:
|
||||
raise FailClosed("whisper_url_credentials_not_allowed")
|
||||
if port != 18816:
|
||||
raise FailClosed("whisper_url_port_not_approved")
|
||||
|
||||
host = (parsed.hostname or "").strip().lower()
|
||||
if host == "localhost":
|
||||
return whisper_url
|
||||
try:
|
||||
if ipaddress.ip_address(host).is_loopback:
|
||||
return whisper_url
|
||||
except ValueError:
|
||||
pass
|
||||
raise FailClosed("whisper_url_not_loopback")
|
||||
|
||||
|
||||
def is_under(path: Path, root: Path) -> bool:
|
||||
try:
|
||||
path.resolve().relative_to(root.resolve())
|
||||
return True
|
||||
except ValueError:
|
||||
return False
|
||||
|
||||
|
||||
def load_manifest(path: Path) -> dict[str, Any]:
|
||||
if not path.exists():
|
||||
raise FailClosed(f"manifest_missing:{path}")
|
||||
data = yaml.safe_load(path.read_text())
|
||||
if not isinstance(data, dict):
|
||||
raise FailClosed("manifest_invalid:not_mapping")
|
||||
if data.get("version") != 1:
|
||||
raise FailClosed("manifest_invalid:version_must_be_1")
|
||||
policy = data.get("policy") or {}
|
||||
if policy.get("default_mode", "dry_run") != "dry_run":
|
||||
raise FailClosed("policy_invalid:default_mode_not_dry_run")
|
||||
for key, expected in {
|
||||
"require_explicit_root": True,
|
||||
"allow_external_uploads": False,
|
||||
"allow_mutations": False,
|
||||
"log_raw_text": False,
|
||||
}.items():
|
||||
if policy.get(key) is not expected:
|
||||
raise FailClosed(f"policy_invalid:{key}")
|
||||
if not isinstance(data.get("roots"), dict):
|
||||
raise FailClosed("manifest_invalid:roots_missing")
|
||||
return data
|
||||
|
||||
|
||||
def resolve_lane_root(manifest: dict[str, Any], manifest_path: Path, lane: str, requested_root: str | None) -> tuple[dict[str, Any], Path, Path]:
|
||||
lane_cfg = (manifest.get("roots") or {}).get(lane)
|
||||
if not isinstance(lane_cfg, dict):
|
||||
raise FailClosed(f"lane_missing:{lane}")
|
||||
if lane_cfg.get("approved") is not True:
|
||||
raise FailClosed(f"lane_unapproved:{lane}")
|
||||
root_value = lane_cfg.get("root")
|
||||
if not root_value:
|
||||
raise FailClosed(f"root_missing:{lane}")
|
||||
approved_root = Path(str(root_value)).expanduser()
|
||||
if not approved_root.is_absolute():
|
||||
approved_root = (manifest_path.parent / approved_root).resolve()
|
||||
else:
|
||||
approved_root = approved_root.resolve()
|
||||
if not approved_root.exists() or not approved_root.is_dir():
|
||||
raise FailClosed(f"approved_root_unavailable:{lane}")
|
||||
|
||||
selected_root = Path(requested_root).expanduser() if requested_root else approved_root
|
||||
selected_root = selected_root.resolve()
|
||||
if not selected_root.exists() or not selected_root.is_dir():
|
||||
raise FailClosed(f"request_root_unavailable:{lane}")
|
||||
if not is_under(selected_root, approved_root):
|
||||
raise FailClosed(f"request_root_broadens_approval:{lane}")
|
||||
return lane_cfg, approved_root, selected_root
|
||||
|
||||
|
||||
def allowed_exts(lane_cfg: dict[str, Any]) -> set[str]:
|
||||
return {str(e).lower() if str(e).startswith(".") else "." + str(e).lower() for e in lane_cfg.get("allowed_extensions", [])}
|
||||
|
||||
|
||||
def iter_files(root: Path, approved_root: Path, exts: set[str], max_file_mb: float, max_age_days: float | None) -> tuple[list[Path], dict[str, int], int]:
|
||||
skipped = {"extension": 0, "size": 0, "symlink_escape": 0, "not_regular_file": 0, "too_old": 0, "policy": 0}
|
||||
accepted: list[Path] = []
|
||||
files_seen = 0
|
||||
now = time.time()
|
||||
max_bytes = int(max_file_mb * 1024 * 1024)
|
||||
for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
|
||||
dirnames[:] = [d for d in dirnames if d not in SKIP_DIR_NAMES and not d.startswith(".")]
|
||||
current = Path(dirpath)
|
||||
if not is_under(current, approved_root):
|
||||
skipped["symlink_escape"] += 1
|
||||
dirnames[:] = []
|
||||
continue
|
||||
for name in filenames:
|
||||
path = current / name
|
||||
if name.startswith("."):
|
||||
skipped["policy"] += 1
|
||||
continue
|
||||
files_seen += 1
|
||||
try:
|
||||
resolved = path.resolve()
|
||||
except Exception:
|
||||
skipped["symlink_escape"] += 1
|
||||
continue
|
||||
if not is_under(resolved, approved_root):
|
||||
skipped["symlink_escape"] += 1
|
||||
continue
|
||||
if not resolved.is_file():
|
||||
skipped["not_regular_file"] += 1
|
||||
continue
|
||||
if resolved.suffix.lower() not in exts:
|
||||
skipped["extension"] += 1
|
||||
continue
|
||||
try:
|
||||
st = resolved.stat()
|
||||
except OSError:
|
||||
skipped["not_regular_file"] += 1
|
||||
continue
|
||||
if st.st_size > max_bytes:
|
||||
skipped["size"] += 1
|
||||
continue
|
||||
if max_age_days is not None and now - st.st_mtime > max_age_days * 86400:
|
||||
skipped["too_old"] += 1
|
||||
continue
|
||||
accepted.append(resolved)
|
||||
accepted.sort(key=lambda p: p.stat().st_mtime, reverse=True)
|
||||
return accepted, skipped, files_seen
|
||||
|
||||
|
||||
def load_doc_triage_module(repo_root: Path):
|
||||
module_path = repo_root / "openvino-doc-image-triage-npu" / "triage.py"
|
||||
spec = importlib.util.spec_from_file_location("doc_image_triage", module_path)
|
||||
if spec is None or spec.loader is None:
|
||||
raise RuntimeError("doc_image_triage_import_failed")
|
||||
module = importlib.util.module_from_spec(spec)
|
||||
sys.modules.setdefault("doc_image_triage", module)
|
||||
spec.loader.exec_module(module) # type: ignore[union-attr]
|
||||
return module
|
||||
|
||||
|
||||
def fallback_doc_item(path: Path, root: Path, lane: str) -> dict[str, Any]:
|
||||
sidecar = path.with_suffix(path.suffix + ".txt")
|
||||
text = ""
|
||||
if sidecar.exists() and sidecar.is_file():
|
||||
text = sidecar.read_text(errors="replace")[:12000]
|
||||
lower = text.lower()
|
||||
category = "unknown_or_low_confidence"
|
||||
if any(w in lower for w in ("receipt", "subtotal", "store")):
|
||||
category = "receipt"
|
||||
elif any(w in lower for w in ("invoice", "amount due", "payment due")):
|
||||
category = "bill_or_invoice"
|
||||
elif lane == "screenshots":
|
||||
category = "screenshot_web_or_app"
|
||||
reasons = [name for name, rx in ACTION_PATTERNS.items() if rx.search(text)]
|
||||
return {
|
||||
"basename": path.name,
|
||||
"relative_path_hash": sha256_text(path.relative_to(root).as_posix()),
|
||||
"file_id": sha256_file(path),
|
||||
"media_type": infer_media_type(path),
|
||||
"category": category,
|
||||
"needs_attention": bool(reasons),
|
||||
"reasons": sorted(reasons),
|
||||
"raw_text_redacted": True,
|
||||
"full_path_included": False,
|
||||
"metadata": {"dates_count": len(set(re.findall(r"\b20\d{2}[-/]\d{1,2}[-/]\d{1,2}\b", text))), "amounts_count": len(set(re.findall(r"\$\s?\d+(?:\.\d{2})?", text))), "raw_values_redacted": True},
|
||||
"processing": {"doc_image_triage": "fallback_cpu_sidecar_rules", "npu_verified": False},
|
||||
}
|
||||
|
||||
|
||||
def infer_media_type(path: Path) -> str:
|
||||
if path.suffix.lower() == ".pdf":
|
||||
return "pdf"
|
||||
mt, _ = mimetypes.guess_type(path.name)
|
||||
if mt and mt.startswith("image/"):
|
||||
return "image"
|
||||
if mt and mt.startswith("audio/"):
|
||||
return "audio"
|
||||
return "unknown"
|
||||
|
||||
|
||||
def compact_doc_item(path: Path, root: Path, lane: str, triage_result: dict[str, Any]) -> dict[str, Any]:
|
||||
pages = triage_result.get("pages") or []
|
||||
first = pages[0] if pages else {}
|
||||
cls = first.get("classification") or {}
|
||||
attn = first.get("needs_attention") or {}
|
||||
meta = first.get("metadata") or {}
|
||||
device_summary = triage_result.get("processing_device_summary") or {}
|
||||
item = {
|
||||
"basename": path.name,
|
||||
"relative_path_hash": sha256_text(path.relative_to(root).as_posix()),
|
||||
"file_id": triage_result.get("file_id") or sha256_file(path),
|
||||
"media_type": triage_result.get("media_type") or infer_media_type(path),
|
||||
"category": cls.get("label") or "unknown_or_low_confidence",
|
||||
"needs_attention": bool(attn.get("value")),
|
||||
"reasons": attn.get("reasons") or [],
|
||||
"raw_text_redacted": True,
|
||||
"full_path_included": False,
|
||||
"metadata": {
|
||||
"dates_count": meta.get("dates_count", 0),
|
||||
"amounts_count": meta.get("amounts_count", 0),
|
||||
"raw_values_redacted": True,
|
||||
},
|
||||
"processing": {
|
||||
"doc_image_triage": "openvino-doc-image-triage-npu",
|
||||
"image_category_device": (cls.get("device") or "CPU"),
|
||||
"needs_attention_device": attn.get("device") or "CPU",
|
||||
"npu_verified": bool(device_summary.get("npu_verified")),
|
||||
"npu_busy_delta_us": device_summary.get("npu_busy_delta_us"),
|
||||
},
|
||||
}
|
||||
if lane == "receipts":
|
||||
item["receipt_fields"] = {"vendor_present": bool((meta.get("detected_entities") or {}).get("org_present")), "amounts_count": item["metadata"]["amounts_count"], "dates_count": item["metadata"]["dates_count"]}
|
||||
return item
|
||||
|
||||
|
||||
def classify_transcript(text: str, lane: str) -> dict[str, Any]:
|
||||
reasons = [name for name, rx in ACTION_PATTERNS.items() if rx.search(text)]
|
||||
action_count = sum(1 for rx in (ACTION_PATTERNS["follow_up"], ACTION_PATTERNS["task"]) if rx.search(text))
|
||||
decisions = 1 if ACTION_PATTERNS["decision"].search(text) else 0
|
||||
followups = 1 if ACTION_PATTERNS["follow_up"].search(text) else 0
|
||||
return {
|
||||
"category": "meeting_snippet" if lane == "meeting_snippets" else "voice_memo",
|
||||
"action_worthy": bool(reasons),
|
||||
"reasons": sorted(reasons),
|
||||
"action_items_count": action_count,
|
||||
"decisions_count": decisions,
|
||||
"followups_count": followups,
|
||||
}
|
||||
|
||||
|
||||
def multipart_transcribe(path: Path, whisper_url: str, timeout: float) -> dict[str, Any]:
|
||||
whisper_url = validate_local_whisper_url(whisper_url)
|
||||
boundary = "----NpuBatchTriage" + hashlib.sha256(path.name.encode()).hexdigest()[:12]
|
||||
data = path.read_bytes()
|
||||
body = (
|
||||
f"--{boundary}\r\n"
|
||||
f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
|
||||
"Content-Type: application/octet-stream\r\n\r\n"
|
||||
).encode() + data + (
|
||||
f"\r\n--{boundary}\r\n"
|
||||
'Content-Disposition: form-data; name="model"\r\n\r\n'
|
||||
"whisper-1\r\n"
|
||||
f"--{boundary}--\r\n"
|
||||
).encode()
|
||||
before = read_busy()
|
||||
req = urllib.request.Request(whisper_url, data=body, headers={"Content-Type": f"multipart/form-data; boundary={boundary}"})
|
||||
t0 = time.perf_counter()
|
||||
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||
raw = resp.read(1024 * 1024)
|
||||
status = resp.status
|
||||
parsed = json.loads(raw.decode())
|
||||
after = read_busy()
|
||||
text = str(parsed.get("text") or parsed.get("transcription") or "").strip()
|
||||
service_delta = parsed.get("npu_busy_delta_us")
|
||||
sysfs_delta = None if before is None or after is None else after - before
|
||||
proof_delta = service_delta if isinstance(service_delta, int) else sysfs_delta
|
||||
return {
|
||||
"ok": status == 200 and bool(text),
|
||||
"text": text,
|
||||
"transcript_chars": len(text),
|
||||
"duration_seconds": parsed.get("duration_seconds"),
|
||||
"language": parsed.get("language"),
|
||||
"npu_busy_delta_us": proof_delta,
|
||||
"verified_npu": bool(proof_delta and proof_delta > 0),
|
||||
"wall_ms": round((time.perf_counter() - t0) * 1000, 2),
|
||||
}
|
||||
|
||||
|
||||
def compact_audio_item(path: Path, root: Path, lane: str, no_npu: bool, whisper_url: str, timeout: float) -> dict[str, Any]:
|
||||
transcript = ""
|
||||
transcribed = False
|
||||
npu_delta = 0
|
||||
proof_ok = False
|
||||
duration = None
|
||||
language = None
|
||||
error = None
|
||||
if not no_npu:
|
||||
try:
|
||||
result = multipart_transcribe(path, whisper_url, timeout)
|
||||
transcript = result["text"]
|
||||
transcribed = result["ok"]
|
||||
npu_delta = result.get("npu_busy_delta_us") or 0
|
||||
proof_ok = bool(result.get("verified_npu"))
|
||||
duration = result.get("duration_seconds")
|
||||
language = result.get("language")
|
||||
except (urllib.error.URLError, TimeoutError, OSError, json.JSONDecodeError) as exc:
|
||||
error = f"whisper_error:{type(exc).__name__}"
|
||||
summary = classify_transcript(transcript, lane)
|
||||
item = {
|
||||
"basename": path.name,
|
||||
"relative_path_hash": sha256_text(path.relative_to(root).as_posix()),
|
||||
"file_id": sha256_file(path),
|
||||
"media_type": "audio",
|
||||
"duration_seconds": duration,
|
||||
"transcribed": transcribed,
|
||||
"transcript_chars": len(transcript),
|
||||
"language": language,
|
||||
**summary,
|
||||
"npu_busy_delta_us": npu_delta,
|
||||
"raw_transcript_logged": False,
|
||||
"full_path_included": False,
|
||||
}
|
||||
if error:
|
||||
item["error"] = error
|
||||
return item
|
||||
|
||||
|
||||
def process(args: argparse.Namespace) -> dict[str, Any]:
|
||||
repo_root = Path(__file__).resolve().parents[1]
|
||||
manifest_path = Path(args.manifest).expanduser().resolve()
|
||||
manifest = load_manifest(manifest_path)
|
||||
lane_cfg, approved_root, root = resolve_lane_root(manifest, manifest_path, args.lane, args.root)
|
||||
exts = allowed_exts(lane_cfg)
|
||||
if not exts:
|
||||
raise FailClosed(f"extensions_missing:{args.lane}")
|
||||
manifest_limit = int(lane_cfg.get("max_files", 50))
|
||||
limit = min(args.limit if args.limit is not None else manifest_limit, manifest_limit)
|
||||
files, skipped, files_seen = iter_files(root, approved_root, exts, float(lane_cfg.get("max_file_mb", 25)), args.max_age_days)
|
||||
selected = files[:limit]
|
||||
npu_before = read_busy()
|
||||
|
||||
items: list[dict[str, Any]] = []
|
||||
errors: list[str] = []
|
||||
doc_module = None
|
||||
if args.lane in AUDIO_LANES and not args.no_npu:
|
||||
validate_local_whisper_url(args.whisper_url)
|
||||
if args.lane in DOC_IMAGE_LANES and not args.no_npu:
|
||||
try:
|
||||
doc_module = load_doc_triage_module(repo_root)
|
||||
except Exception as exc:
|
||||
errors.append(f"doc_triage_import_error:{type(exc).__name__}")
|
||||
|
||||
for path in selected:
|
||||
try:
|
||||
if args.lane in AUDIO_LANES:
|
||||
item = compact_audio_item(path, root, args.lane, args.no_npu, args.whisper_url, args.timeout_seconds)
|
||||
elif doc_module is not None:
|
||||
opts = doc_module.TriageOptions(
|
||||
dry_run=False,
|
||||
include_ocr_text=False,
|
||||
include_full_path=False,
|
||||
use_embeddings=not args.no_npu,
|
||||
allowed_roots=[approved_root],
|
||||
timeout_seconds=args.timeout_seconds,
|
||||
)
|
||||
item = compact_doc_item(path, root, args.lane, doc_module.triage_file(path, opts))
|
||||
else:
|
||||
item = fallback_doc_item(path, root, args.lane)
|
||||
if args.include_full_path:
|
||||
item["full_path"] = str(path)
|
||||
item["full_path_included"] = True
|
||||
if args.include_raw_text:
|
||||
item["raw_text_included"] = False
|
||||
item["raw_text_note"] = "unsupported_by_batch_wrapper"
|
||||
items.append(item)
|
||||
except FailClosed:
|
||||
raise
|
||||
except Exception as exc:
|
||||
errors.append(f"{path.name}:{type(exc).__name__}")
|
||||
items.append({"basename": path.name, "ok": False, "error": type(exc).__name__, "raw_text_redacted": True, "full_path_included": False})
|
||||
|
||||
npu_after = read_busy()
|
||||
sysfs_delta = None if npu_before is None or npu_after is None else npu_after - npu_before
|
||||
item_deltas = [i.get("npu_busy_delta_us") for i in items if isinstance(i.get("npu_busy_delta_us"), int)]
|
||||
claimed = not args.no_npu and any((d or 0) > 0 for d in item_deltas + ([sysfs_delta] if isinstance(sysfs_delta, int) else []))
|
||||
proof_ok = claimed and bool(sysfs_delta is None or sysfs_delta > 0 or any((d or 0) > 0 for d in item_deltas))
|
||||
return {
|
||||
"ok": not errors,
|
||||
"lane": args.lane,
|
||||
"dry_run": True,
|
||||
"approved_root": True,
|
||||
"root_basename": root.name,
|
||||
"files_seen": files_seen,
|
||||
"files_processed": len(items),
|
||||
"skipped": skipped,
|
||||
"npu": {"claimed": claimed, "busy_delta_us": sysfs_delta, "proof_ok": proof_ok},
|
||||
"mutations": MUTATIONS_FALSE.copy(),
|
||||
"items": items,
|
||||
"raw_content_redacted": not args.include_raw_text,
|
||||
"full_paths_included": bool(args.include_full_path),
|
||||
"errors": errors,
|
||||
"gates": {
|
||||
"external_uploads": False,
|
||||
"private_root_broadening": False,
|
||||
"obsidian_mutation": False,
|
||||
"vector_db_mutation": False,
|
||||
"outbound_sends": False,
|
||||
"routing_changes": False,
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
def build_parser() -> argparse.ArgumentParser:
|
||||
parser = argparse.ArgumentParser(description="Explicit-root dry-run batch triage wrapper")
|
||||
parser.add_argument("--manifest", required=True, help="lane approval manifest; missing/unapproved fails closed")
|
||||
parser.add_argument("--lane", required=True, choices=LANES)
|
||||
parser.add_argument("--root", help="optional narrower root under the manifest-approved lane root")
|
||||
parser.add_argument("--dry-run", action="store_true", help="required; mutation modes are not implemented")
|
||||
parser.add_argument("--limit", type=int, default=None)
|
||||
parser.add_argument("--max-age-days", type=float, default=None)
|
||||
parser.add_argument("--include-raw-text", action="store_true", help="kept redacted by this wrapper; present only for explicit operator attempts")
|
||||
parser.add_argument("--include-full-path", action="store_true", help="operator-only local debugging")
|
||||
parser.add_argument("--no-npu", action="store_true", help="CPU-only smoke; never claims NPU")
|
||||
parser.add_argument("--json", action="store_true", help="emit compact JSON")
|
||||
parser.add_argument("--pretty", action="store_true", help="pretty JSON for local debugging")
|
||||
parser.add_argument("--whisper-url", default=DEFAULT_WHISPER_URL)
|
||||
parser.add_argument("--timeout-seconds", type=float, default=20.0)
|
||||
return parser
|
||||
|
||||
|
||||
def main(argv: list[str] | None = None) -> int:
|
||||
parser = build_parser()
|
||||
args = parser.parse_args(argv)
|
||||
if not args.dry_run:
|
||||
print(json.dumps({"ok": False, "error": "dry_run_required", "mutations": MUTATIONS_FALSE}), file=sys.stderr)
|
||||
return 2
|
||||
if args.limit is not None and args.limit < 1:
|
||||
print(json.dumps({"ok": False, "error": "limit_must_be_positive"}), file=sys.stderr)
|
||||
return 2
|
||||
try:
|
||||
out = process(args)
|
||||
except FailClosed as exc:
|
||||
out = {"ok": False, "error": "fail_closed", "reason": str(exc), "dry_run": True, "mutations": MUTATIONS_FALSE.copy()}
|
||||
print(json.dumps(out, indent=2 if args.pretty else None, sort_keys=True))
|
||||
return 0 if out.get("ok") else 2
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
|
||||
Executable
+536
@@ -0,0 +1,536 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Compact, read-only NPU/OpenVINO utilization digest.
|
||||
|
||||
Default behavior is safe for on-demand or scheduled runs: health checks plus
|
||||
bounded synthetic probes, one compact JSONL artifact, and no service restarts,
|
||||
routing changes, advisory POSTs, vector mutations, outbound sends, or private
|
||||
root broadening.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import base64
|
||||
import datetime as dt
|
||||
import json
|
||||
import math
|
||||
import os
|
||||
import tempfile
|
||||
import time
|
||||
import urllib.error
|
||||
import urllib.parse
|
||||
import urllib.request
|
||||
import uuid
|
||||
import wave
|
||||
from dataclasses import asdict, dataclass, field
|
||||
from pathlib import Path
|
||||
from typing import Any, Callable
|
||||
|
||||
BUSY_PATH = Path("/sys/class/accel/accel0/device/npu_busy_time_us")
|
||||
DEFAULT_OUT_DIR = Path("/home/will/.local/state/npu-utilization/digests")
|
||||
|
||||
EMBED_URL = "http://127.0.0.1:18817/v1/embeddings"
|
||||
EMBED_HEALTH_URL = "http://127.0.0.1:18817/healthz"
|
||||
RERANK_URL = "http://127.0.0.1:18818/rerank"
|
||||
RERANK_HEALTH_URL = "http://127.0.0.1:18818/readyz"
|
||||
WHISPER_URL = "http://127.0.0.1:18816/v1/audio/transcriptions"
|
||||
WHISPER_HEALTH_URL = "http://127.0.0.1:18816/health"
|
||||
CLASSIFIER_URL = "http://127.0.0.1:18819/v1/classify"
|
||||
CLASSIFIER_HEALTH_URL = "http://127.0.0.1:18819/healthz"
|
||||
GENAI_HEALTH_URL = "http://127.0.0.1:18820/healthz"
|
||||
GENAI_GENERATE_URL = "http://127.0.0.1:18820/v1/generate"
|
||||
DOC_TRIAGE_URL = "http://127.0.0.1:18829/triage"
|
||||
DOC_TRIAGE_HEALTH_URL = "http://127.0.0.1:18829/healthz"
|
||||
RAG_ENDPOINT_HEALTH_URL = "http://127.0.0.1:18810/healthz"
|
||||
RAG_HEALTH_URL = "http://127.0.0.1:18814/healthz"
|
||||
ADVISORY_HEALTH_URL = "http://172.19.0.1:18830/healthz"
|
||||
|
||||
|
||||
@dataclass
|
||||
class ServiceRow:
|
||||
type: str = "service"
|
||||
service: str = ""
|
||||
reachable: bool = False
|
||||
probe_ran: bool = False
|
||||
proof_ok: bool | None = None
|
||||
calls: int = 0
|
||||
items: int = 0
|
||||
avg_ms: float | None = None
|
||||
npu_delta_us: int | None = None
|
||||
response_delta_us: int | None = None
|
||||
mode: str = "unavailable"
|
||||
fallbacks: int = 0
|
||||
warnings: list[str] = field(default_factory=list)
|
||||
gate: str = "none"
|
||||
jobs: int | None = None
|
||||
events: int | None = None
|
||||
files: int | None = None
|
||||
docs: int | None = None
|
||||
text_len: int | None = None
|
||||
sample_rate: int | None = None
|
||||
embedding_count: int | None = None
|
||||
embedding_dim: int | None = None
|
||||
dry_run: bool | None = None
|
||||
suppress: int | None = None
|
||||
escalate: int | None = None
|
||||
loaded: bool | None = None
|
||||
allowed_roots_count: int | None = None
|
||||
reason: str | None = None
|
||||
error: str | None = None
|
||||
|
||||
|
||||
def compact_dict(obj: Any) -> dict[str, Any]:
|
||||
data = asdict(obj) if hasattr(obj, "__dataclass_fields__") else dict(obj)
|
||||
return {k: v for k, v in data.items() if v is not None and v != []}
|
||||
|
||||
|
||||
def read_busy(path: Path = BUSY_PATH) -> int | None:
|
||||
try:
|
||||
return int(path.read_text().strip())
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
|
||||
def safe_error(exc: BaseException) -> str:
|
||||
return type(exc).__name__
|
||||
|
||||
|
||||
def http_get_json(url: str, timeout: float) -> tuple[int, dict[str, Any]]:
|
||||
try:
|
||||
req = urllib.request.Request(url, headers={"Accept": "application/json"})
|
||||
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||
body = resp.read(1024 * 1024).decode("utf-8", "replace")
|
||||
return int(resp.status), json.loads(body or "{}")
|
||||
except urllib.error.HTTPError as exc:
|
||||
try:
|
||||
body = exc.read(1024 * 1024).decode("utf-8", "replace")
|
||||
return int(exc.code), json.loads(body or "{}")
|
||||
except Exception:
|
||||
return int(exc.code), {"error": "http_error"}
|
||||
except Exception as exc:
|
||||
return 0, {"error": safe_error(exc)}
|
||||
|
||||
|
||||
def http_post_json(url: str, payload: dict[str, Any], timeout: float) -> tuple[int, dict[str, Any]]:
|
||||
body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
|
||||
req = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json", "Accept": "application/json"})
|
||||
try:
|
||||
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||
data = resp.read(2 * 1024 * 1024).decode("utf-8", "replace")
|
||||
return int(resp.status), json.loads(data or "{}")
|
||||
except urllib.error.HTTPError as exc:
|
||||
try:
|
||||
data = exc.read(1024 * 1024).decode("utf-8", "replace")
|
||||
return int(exc.code), json.loads(data or "{}")
|
||||
except Exception:
|
||||
return int(exc.code), {"error": "http_error"}
|
||||
except Exception as exc:
|
||||
return 0, {"error": safe_error(exc)}
|
||||
|
||||
|
||||
def health_row(service: str, url: str, timeout: float, gate: str = "none", mode: str = "health_only") -> tuple[ServiceRow, dict[str, Any]]:
|
||||
status, payload = http_get_json(url, timeout)
|
||||
ok = status == 200 and payload.get("ok", True) is not False
|
||||
row = ServiceRow(service=service, reachable=ok, mode=mode if ok else "unavailable", gate=gate)
|
||||
if not ok:
|
||||
row.fallbacks = 1
|
||||
row.warnings.append("unavailable")
|
||||
row.error = str(payload.get("error") or payload.get("ready_error") or f"http_{status}")[:80]
|
||||
return row, payload
|
||||
|
||||
|
||||
def measure_probe(fn: Callable[[], tuple[int, dict[str, Any]]], timeout_label: str, busy_path: Path = BUSY_PATH) -> tuple[int, dict[str, Any], float, int | None]:
|
||||
before = read_busy(busy_path)
|
||||
started = time.perf_counter()
|
||||
status, payload = fn()
|
||||
elapsed_ms = round((time.perf_counter() - started) * 1000, 3)
|
||||
after = read_busy(busy_path)
|
||||
delta = None if before is None or after is None else after - before
|
||||
return status, payload, elapsed_ms, delta
|
||||
|
||||
|
||||
def apply_proof(row: ServiceRow, delta: int | None) -> None:
|
||||
row.npu_delta_us = delta
|
||||
row.proof_ok = bool(delta is not None and delta > 0)
|
||||
if not row.proof_ok:
|
||||
row.fallbacks += 1
|
||||
row.warnings.append("no_positive_sysfs_delta" if delta is not None else "missing_sysfs_counter")
|
||||
|
||||
|
||||
def mark_skipped_fallback(row: ServiceRow, reason: str) -> None:
|
||||
"""Record a skipped/unloaded proof condition as a fallback.
|
||||
|
||||
Health-only rows that are intentionally never proof probes should keep
|
||||
fallbacks at zero. This helper is for proof-capable rows where a bounded
|
||||
smoke was disabled or skipped to avoid side effects such as cold-loading.
|
||||
"""
|
||||
row.fallbacks += 1
|
||||
row.warnings.append(reason)
|
||||
|
||||
|
||||
def probe_embeddings(timeout: float, busy_path: Path = BUSY_PATH, post_json: Callable[..., tuple[int, dict[str, Any]]] = http_post_json) -> ServiceRow:
|
||||
row, _ = health_row("embeddings", EMBED_HEALTH_URL, timeout)
|
||||
if not row.reachable:
|
||||
return row
|
||||
payload = {"input": "non-private npu utilization digest probe", "model": "bge-base-en-v1.5-int8-ov"}
|
||||
status, data, elapsed, delta = measure_probe(lambda: post_json(EMBED_URL, payload, timeout), "embeddings", busy_path)
|
||||
row.probe_ran = True
|
||||
row.calls = 1
|
||||
row.items = 1
|
||||
row.avg_ms = elapsed
|
||||
row.mode = "NPU"
|
||||
row.reachable = status == 200 and "data" in data
|
||||
row.embedding_count = len(data.get("data", [])) if isinstance(data.get("data"), list) else 0
|
||||
row.embedding_dim = data.get("embedding_dim")
|
||||
row.response_delta_us = data.get("npu_busy_delta_us") if isinstance(data.get("npu_busy_delta_us"), int) else None
|
||||
apply_proof(row, delta)
|
||||
if not row.reachable:
|
||||
row.warnings.append("probe_http_failed")
|
||||
row.error = str(data.get("error") or f"http_{status}")[:80]
|
||||
return row
|
||||
|
||||
|
||||
def probe_rerank(timeout: float, busy_path: Path = BUSY_PATH, post_json: Callable[..., tuple[int, dict[str, Any]]] = http_post_json) -> ServiceRow:
|
||||
row, _ = health_row("rerank", RERANK_HEALTH_URL, timeout)
|
||||
if not row.reachable:
|
||||
return row
|
||||
docs = ["Intel NPU accelerates OpenVINO inference.", "Bananas ripen on a kitchen counter."]
|
||||
payload = {"query": "OpenVINO NPU inference", "documents": docs, "top_k": 2, "return_documents": False}
|
||||
status, data, elapsed, delta = measure_probe(lambda: post_json(RERANK_URL, payload, timeout), "rerank", busy_path)
|
||||
row.probe_ran = True
|
||||
row.calls = 1
|
||||
row.docs = len(docs)
|
||||
row.avg_ms = float(data.get("duration_ms") or elapsed)
|
||||
row.mode = "NPU"
|
||||
row.reachable = status == 200 and data.get("ok", True) is not False
|
||||
row.response_delta_us = data.get("npu_busy_delta_us") if isinstance(data.get("npu_busy_delta_us"), int) else None
|
||||
apply_proof(row, delta)
|
||||
if not row.reachable:
|
||||
row.warnings.append("probe_http_failed")
|
||||
row.error = str(data.get("error") or f"http_{status}")[:80]
|
||||
return row
|
||||
|
||||
|
||||
def probe_classifier(timeout: float, busy_path: Path = BUSY_PATH, post_json: Callable[..., tuple[int, dict[str, Any]]] = http_post_json) -> ServiceRow:
|
||||
row, _ = health_row("classifier", CLASSIFIER_HEALTH_URL, timeout, mode="dry_run")
|
||||
if not row.reachable:
|
||||
return row
|
||||
payload = {
|
||||
"id": "npu-digest-probe",
|
||||
"text": "Non-private cron event: backup completed successfully, no user action required.",
|
||||
"options": {"dry_run": True, "include_evidence": False},
|
||||
}
|
||||
status, data, elapsed, delta = measure_probe(lambda: post_json(CLASSIFIER_URL, payload, timeout), "classifier", busy_path)
|
||||
row.probe_ran = True
|
||||
row.calls = 1
|
||||
row.events = 1
|
||||
row.avg_ms = elapsed
|
||||
row.mode = "dry_run"
|
||||
row.dry_run = True
|
||||
row.reachable = status == 200 and "error" not in data
|
||||
row.response_delta_us = next((data.get(k) for k in ("sysfs_npu_busy_delta_us", "npu_busy_delta_us") if isinstance(data.get(k), int)), None)
|
||||
raw_labels = data.get("labels")
|
||||
labels: dict[str, Any] = raw_labels if isinstance(raw_labels, dict) else {}
|
||||
raw_action = data.get("action")
|
||||
action: dict[str, Any] = raw_action if isinstance(raw_action, dict) else {}
|
||||
row.escalate = int(bool(action.get("escalate") or labels.get("action_required") or labels.get("tool_needed")))
|
||||
row.suppress = int(bool(action.get("suppress") or labels.get("no_op") or labels.get("duplicate")))
|
||||
row.items = len(labels)
|
||||
apply_proof(row, delta)
|
||||
if not row.reachable:
|
||||
row.warnings.append("probe_http_failed")
|
||||
row.error = str(data.get("error") or f"http_{status}")[:80]
|
||||
return row
|
||||
|
||||
|
||||
def write_tone_wav(path: Path, seconds: float = 0.35, sample_rate: int = 16000) -> None:
|
||||
frames = int(seconds * sample_rate)
|
||||
with wave.open(str(path), "wb") as wav:
|
||||
wav.setnchannels(1)
|
||||
wav.setsampwidth(2)
|
||||
wav.setframerate(sample_rate)
|
||||
for i in range(frames):
|
||||
value = int(9000 * math.sin(2 * math.pi * 440 * (i / sample_rate)))
|
||||
wav.writeframesraw(value.to_bytes(2, byteorder="little", signed=True))
|
||||
|
||||
|
||||
def post_multipart_file(url: str, file_path: Path, timeout: float) -> tuple[int, dict[str, Any]]:
|
||||
boundary = "----npu-digest-" + uuid.uuid4().hex
|
||||
file_bytes = file_path.read_bytes()
|
||||
parts = [
|
||||
f"--{boundary}\r\nContent-Disposition: form-data; name=\"model\"\r\n\r\nwhisper\r\n".encode(),
|
||||
f"--{boundary}\r\nContent-Disposition: form-data; name=\"response_format\"\r\n\r\njson\r\n".encode(),
|
||||
f"--{boundary}\r\nContent-Disposition: form-data; name=\"file\"; filename=\"npu-digest.wav\"\r\nContent-Type: audio/wav\r\n\r\n".encode(),
|
||||
file_bytes,
|
||||
f"\r\n--{boundary}--\r\n".encode(),
|
||||
]
|
||||
req = urllib.request.Request(url, data=b"".join(parts), headers={"Content-Type": f"multipart/form-data; boundary={boundary}"})
|
||||
try:
|
||||
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||
return int(resp.status), json.loads(resp.read(1024 * 1024).decode("utf-8", "replace") or "{}")
|
||||
except Exception as exc:
|
||||
return 0, {"error": safe_error(exc)}
|
||||
|
||||
|
||||
def probe_whisper(timeout: float, include_smoke: bool, busy_path: Path = BUSY_PATH) -> ServiceRow:
|
||||
row, _ = health_row("whisper", WHISPER_HEALTH_URL, timeout)
|
||||
row.jobs = 0
|
||||
if not row.reachable or not include_smoke:
|
||||
if row.reachable:
|
||||
row.mode = "health_only"
|
||||
row.reason = "smoke_disabled"
|
||||
mark_skipped_fallback(row, "skipped")
|
||||
return row
|
||||
with tempfile.TemporaryDirectory(prefix="npu-digest-whisper-") as tmp:
|
||||
wav_path = Path(tmp) / "probe.wav"
|
||||
write_tone_wav(wav_path)
|
||||
status, data, elapsed, delta = measure_probe(lambda: post_multipart_file(WHISPER_URL, wav_path, timeout), "whisper", busy_path)
|
||||
row.probe_ran = True
|
||||
row.calls = 1
|
||||
row.jobs = 1
|
||||
row.avg_ms = elapsed
|
||||
row.mode = "NPU"
|
||||
row.reachable = status == 200 and "error" not in data
|
||||
row.text_len = len(str(data.get("text") or ""))
|
||||
row.sample_rate = data.get("sample_rate") if isinstance(data.get("sample_rate"), int) else None
|
||||
row.response_delta_us = data.get("npu_busy_delta_us") if isinstance(data.get("npu_busy_delta_us"), int) else None
|
||||
apply_proof(row, delta)
|
||||
if not row.reachable:
|
||||
row.warnings.append("probe_http_failed")
|
||||
row.error = str(data.get("error") or f"http_{status}")[:80]
|
||||
return row
|
||||
|
||||
|
||||
def probe_genai(timeout: float, include_smoke: bool, busy_path: Path = BUSY_PATH, post_json: Callable[..., tuple[int, dict[str, Any]]] = http_post_json) -> ServiceRow:
|
||||
row, health = health_row("genai", GENAI_HEALTH_URL, timeout)
|
||||
row.loaded = bool(health.get("loaded")) if isinstance(health, dict) and "loaded" in health else None
|
||||
row.jobs = 0
|
||||
if not row.reachable:
|
||||
return row
|
||||
if not include_smoke or row.loaded is False:
|
||||
row.mode = "loaded=false" if row.loaded is False else "health_only"
|
||||
row.reason = "skipped_cold_load" if row.loaded is False else "smoke_disabled"
|
||||
mark_skipped_fallback(row, row.reason)
|
||||
return row
|
||||
payload = {"prompt": "Say pong.", "max_new_tokens": 8}
|
||||
status, data, elapsed, delta = measure_probe(lambda: post_json(GENAI_GENERATE_URL, payload, timeout), "genai", busy_path)
|
||||
row.probe_ran = True
|
||||
row.calls = 1
|
||||
row.jobs = 1
|
||||
row.avg_ms = elapsed
|
||||
row.mode = "NPU"
|
||||
row.reachable = status == 200 and "error" not in data
|
||||
apply_proof(row, delta)
|
||||
return row
|
||||
|
||||
|
||||
def doc_triage_sample_path() -> Path | None:
|
||||
candidates = [
|
||||
Path("/home/will/lab/swarm/openvino-doc-image-triage-npu/samples/synthetic_invoice.png"),
|
||||
Path(__file__).resolve().parents[1] / "openvino-doc-image-triage-npu" / "samples" / "synthetic_invoice.png",
|
||||
]
|
||||
for candidate in candidates:
|
||||
if candidate.exists() and candidate.with_suffix(".png.txt").exists():
|
||||
return candidate
|
||||
return None
|
||||
|
||||
|
||||
def probe_doc_triage(timeout: float, include_smoke: bool, busy_path: Path = BUSY_PATH, post_json: Callable[..., tuple[int, dict[str, Any]]] = http_post_json) -> ServiceRow:
|
||||
row, _ = health_row("doc_triage", DOC_TRIAGE_HEALTH_URL, timeout, gate="closed:private-root")
|
||||
row.files = 0
|
||||
if not row.reachable or not include_smoke:
|
||||
if row.reachable:
|
||||
row.mode = "health_only"
|
||||
row.reason = "smoke_disabled"
|
||||
mark_skipped_fallback(row, "skipped")
|
||||
return row
|
||||
sample = doc_triage_sample_path()
|
||||
if sample is not None:
|
||||
root = sample.parent.resolve()
|
||||
payload = {"path": str(sample), "options": {"allowed_roots": [str(root)], "include_ocr_text": False, "use_embeddings": True}}
|
||||
status, data, elapsed, delta = measure_probe(lambda: post_json(DOC_TRIAGE_URL, payload, timeout), "doc_triage", busy_path)
|
||||
else:
|
||||
with tempfile.TemporaryDirectory(prefix="npu-digest-doc-") as tmp:
|
||||
root = Path(tmp).resolve()
|
||||
sample = root / "synthetic-invoice.png"
|
||||
sample.write_bytes(base64.b64decode("iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/x8AAwMCAO+/p9sAAAAASUVORK5CYII="))
|
||||
sample.with_suffix(".png.txt").write_text("Synthetic invoice. Amount due $12.34 by 2026-06-30. No private data.\n")
|
||||
payload = {"path": str(sample), "options": {"allowed_roots": [str(root)], "include_ocr_text": False, "use_embeddings": True}}
|
||||
status, data, elapsed, delta = measure_probe(lambda: post_json(DOC_TRIAGE_URL, payload, timeout), "doc_triage", busy_path)
|
||||
row.probe_ran = True
|
||||
row.calls = 1
|
||||
row.files = 1
|
||||
row.avg_ms = elapsed
|
||||
row.mode = "NPU-via-embedding-service"
|
||||
row.allowed_roots_count = 1
|
||||
row.reachable = status == 200 and data.get("ok", True) is not False
|
||||
raw_result = data.get("result")
|
||||
result: dict[str, Any] = raw_result if isinstance(raw_result, dict) else {}
|
||||
raw_pages = result.get("pages")
|
||||
pages: list[Any] = raw_pages if isinstance(raw_pages, list) else []
|
||||
embedding: dict[str, Any] = {}
|
||||
if pages and isinstance(pages[0], dict):
|
||||
raw_attn = pages[0].get("needs_attention")
|
||||
attn: dict[str, Any] = raw_attn if isinstance(raw_attn, dict) else {}
|
||||
raw_embedding = attn.get("embedding")
|
||||
embedding = raw_embedding if isinstance(raw_embedding, dict) else {}
|
||||
row.response_delta_us = embedding.get("npu_busy_delta_us") if isinstance(embedding.get("npu_busy_delta_us"), int) else None
|
||||
apply_proof(row, delta)
|
||||
if not row.reachable:
|
||||
row.warnings.append("probe_http_failed")
|
||||
row.error = str(data.get("error") or f"http_{status}")[:80]
|
||||
return row
|
||||
|
||||
|
||||
def build_summary(rows: list[ServiceRow], artifact_path: str | None, counter_delta_us: int | None, started_at: str) -> dict[str, Any]:
|
||||
services_ok = sum(1 for r in rows if r.reachable)
|
||||
proof_rows = [r for r in rows if r.probe_ran and r.proof_ok is not None]
|
||||
proof_ok = sum(1 for r in proof_rows if r.proof_ok)
|
||||
gates_closed = sum(1 for r in rows if str(r.gate).startswith("closed:"))
|
||||
fallbacks = sum(r.fallbacks for r in rows)
|
||||
warnings: dict[str, int] = {}
|
||||
for row in rows:
|
||||
for warning in row.warnings:
|
||||
warnings[warning] = warnings.get(warning, 0) + 1
|
||||
return {
|
||||
"type": "summary",
|
||||
"timestamp": started_at,
|
||||
"counter": str(BUSY_PATH),
|
||||
"delta_us": counter_delta_us,
|
||||
"services_ok": services_ok,
|
||||
"services_total": len(rows),
|
||||
"proof_ok": proof_ok,
|
||||
"proof_total": len(proof_rows),
|
||||
"fallbacks": fallbacks,
|
||||
"gates_closed": gates_closed,
|
||||
"warnings": warnings,
|
||||
"artifact": artifact_path,
|
||||
}
|
||||
|
||||
|
||||
def render_text(summary: dict[str, Any], rows: list[ServiceRow]) -> str:
|
||||
lines = [
|
||||
f"NPU utilization digest {summary['timestamp']}",
|
||||
f"counter={summary['counter']} delta_us={summary.get('delta_us')}",
|
||||
f"services_ok={summary['services_ok']}/{summary['services_total']} proof_ok={summary['proof_ok']}/{summary['proof_total']} fallbacks={summary['fallbacks']} gates_closed={summary['gates_closed']}",
|
||||
]
|
||||
for r in rows:
|
||||
parts = [f"- {r.service}:", f"ok={str(r.reachable).lower()}"]
|
||||
if r.calls:
|
||||
parts.append(f"calls={r.calls}")
|
||||
if r.jobs is not None:
|
||||
parts.append(f"jobs={r.jobs}")
|
||||
if r.events is not None:
|
||||
parts.append(f"events={r.events}")
|
||||
if r.files is not None:
|
||||
parts.append(f"files={r.files}")
|
||||
if r.docs is not None:
|
||||
parts.append(f"docs={r.docs}")
|
||||
if r.avg_ms is not None:
|
||||
parts.append(f"avg_ms={r.avg_ms}")
|
||||
if r.npu_delta_us is not None:
|
||||
parts.append(f"npu_delta_us={r.npu_delta_us}")
|
||||
if r.proof_ok is not None:
|
||||
parts.append(f"proof={str(r.proof_ok).lower()}")
|
||||
if r.dry_run is not None:
|
||||
parts.append(f"dry_run={str(r.dry_run).lower()}")
|
||||
if r.suppress is not None:
|
||||
parts.append(f"suppress={r.suppress}")
|
||||
if r.escalate is not None:
|
||||
parts.append(f"escalate={r.escalate}")
|
||||
if r.loaded is not None:
|
||||
parts.append(f"loaded={str(r.loaded).lower()}")
|
||||
if r.allowed_roots_count is not None:
|
||||
parts.append(f"allowed_roots={r.allowed_roots_count}")
|
||||
if r.text_len is not None:
|
||||
parts.append(f"text_len={r.text_len}")
|
||||
if r.mode:
|
||||
parts.append(f"mode={r.mode}")
|
||||
if r.gate != "none":
|
||||
parts.append(f"gate={r.gate}")
|
||||
if r.reason:
|
||||
parts.append(f"reason={r.reason}")
|
||||
if r.warnings:
|
||||
parts.append("warnings=" + ",".join(sorted(set(r.warnings))))
|
||||
lines.append(" ".join(parts))
|
||||
warning_counts = summary.get("warnings") or {}
|
||||
lines.append("fallbacks: " + " ".join(f"{k}={v}" for k, v in sorted(warning_counts.items())) if warning_counts else "fallbacks: none")
|
||||
if summary.get("artifact"):
|
||||
lines.append(f"artifact: {summary['artifact']}")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def write_jsonl(summary: dict[str, Any], rows: list[ServiceRow], out_dir: Path) -> Path:
|
||||
out_dir.mkdir(parents=True, exist_ok=True)
|
||||
stamp = summary["timestamp"].replace(":", "").replace("+", "").replace("-", "")
|
||||
path = out_dir / f"{stamp}.jsonl"
|
||||
with path.open("w", encoding="utf-8") as f:
|
||||
f.write(json.dumps(summary, sort_keys=True, separators=(",", ":")) + "\n")
|
||||
for row in rows:
|
||||
f.write(json.dumps(compact_dict(row), sort_keys=True, separators=(",", ":")) + "\n")
|
||||
return path
|
||||
|
||||
|
||||
def str_bool(value: str) -> bool:
|
||||
lowered = value.lower()
|
||||
if lowered in {"1", "true", "yes", "y", "on"}:
|
||||
return True
|
||||
if lowered in {"0", "false", "no", "n", "off"}:
|
||||
return False
|
||||
raise argparse.ArgumentTypeError("expected true or false")
|
||||
|
||||
|
||||
def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
|
||||
parser = argparse.ArgumentParser(description="Compact NPU utilization digest")
|
||||
parser.add_argument("--format", choices=("text", "jsonl"), default="text")
|
||||
parser.add_argument("--out", default=str(DEFAULT_OUT_DIR))
|
||||
parser.add_argument("--timeout-s", type=float, default=8.0)
|
||||
parser.add_argument("--include-whisper-smoke", type=str_bool, default=True)
|
||||
parser.add_argument("--include-genai-smoke", type=str_bool, default=False)
|
||||
parser.add_argument("--include-doc-triage-smoke", type=str_bool, default=True)
|
||||
parser.add_argument("--no-write", action="store_true")
|
||||
parser.add_argument("--strict-proof", action="store_true", help="exit nonzero if a proof-required probe ran without positive sysfs delta")
|
||||
parser.add_argument("--verbose", action="store_true")
|
||||
return parser.parse_args(argv)
|
||||
|
||||
|
||||
def run(args: argparse.Namespace) -> tuple[dict[str, Any], list[ServiceRow]]:
|
||||
started_at = dt.datetime.now().astimezone().replace(microsecond=0).isoformat()
|
||||
before_all = read_busy(BUSY_PATH)
|
||||
rows = [
|
||||
probe_embeddings(args.timeout_s),
|
||||
probe_rerank(args.timeout_s),
|
||||
probe_whisper(args.timeout_s, args.include_whisper_smoke),
|
||||
probe_classifier(args.timeout_s),
|
||||
probe_genai(args.timeout_s, args.include_genai_smoke),
|
||||
probe_doc_triage(args.timeout_s, args.include_doc_triage_smoke),
|
||||
]
|
||||
rows.append(health_row("rag_endpoint", RAG_ENDPOINT_HEALTH_URL, args.timeout_s, gate="closed:vector-mutation")[0])
|
||||
rows.append(health_row("rag_health", RAG_HEALTH_URL, args.timeout_s)[0])
|
||||
rows.append(health_row("advisory_gateway", ADVISORY_HEALTH_URL, args.timeout_s, gate="closed:advisory-post")[0])
|
||||
after_all = read_busy(BUSY_PATH)
|
||||
delta_all = None if before_all is None or after_all is None else after_all - before_all
|
||||
summary = build_summary(rows, artifact_path=None, counter_delta_us=delta_all, started_at=started_at)
|
||||
return summary, rows
|
||||
|
||||
|
||||
def main(argv: list[str] | None = None) -> int:
|
||||
args = parse_args(argv)
|
||||
summary, rows = run(args)
|
||||
if not args.no_write:
|
||||
artifact = write_jsonl(summary, rows, Path(args.out).expanduser())
|
||||
summary["artifact"] = str(artifact)
|
||||
# rewrite with artifact path included in the summary line
|
||||
artifact.write_text("\n".join([json.dumps(summary, sort_keys=True, separators=(",", ":"))] + [json.dumps(compact_dict(r), sort_keys=True, separators=(",", ":")) for r in rows]) + "\n")
|
||||
if args.format == "jsonl":
|
||||
print(json.dumps(summary, sort_keys=True, separators=(",", ":")))
|
||||
for row in rows:
|
||||
print(json.dumps(compact_dict(row), sort_keys=True, separators=(",", ":")))
|
||||
else:
|
||||
print(render_text(summary, rows))
|
||||
if args.strict_proof and any(r.probe_ran and r.proof_ok is False for r in rows):
|
||||
return 2
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
|
||||
Executable
+339
@@ -0,0 +1,339 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Local-file voice/audio NPU advisory pipeline.
|
||||
|
||||
Side-effect-free first slice:
|
||||
local audio file -> Whisper NPU -> classifier NPU -> advisory gate
|
||||
|
||||
No platform fetching, outbound sends, Obsidian/memory/vector writes, service
|
||||
restarts, or live Atlas/Hermes routing changes are performed by this script.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import ipaddress
|
||||
import json
|
||||
import mimetypes
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
import time
|
||||
import uuid
|
||||
import wave
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
import urllib.error
|
||||
import urllib.parse
|
||||
import urllib.request
|
||||
|
||||
DEFAULT_WHISPER_URL = "http://127.0.0.1:18816/v1/audio/transcriptions"
|
||||
DEFAULT_CLASSIFIER_URL = "http://127.0.0.1:18819/v1/classify"
|
||||
NPU_BUSY_PATH = Path("/sys/class/accel/accel0/device/npu_busy_time_us")
|
||||
AUDIO_EXTENSIONS = {".wav", ".ogg", ".oga", ".opus", ".mp3", ".m4a", ".mp4", ".webm", ".flac"}
|
||||
ACTION_MARKERS = re.compile(
|
||||
r"\b(remind|todo|to-do|task|follow[- ]?up|schedule|call|email|send|draft|inspect|check|fix|review|question|ask)\b",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
|
||||
class PipelineError(RuntimeError):
|
||||
def __init__(self, message: str, *, status: int = 1, details: dict[str, Any] | None = None):
|
||||
super().__init__(message)
|
||||
self.status = status
|
||||
self.details = details or {}
|
||||
|
||||
|
||||
def validate_loopback_endpoint(url: str, *, label: str) -> str:
|
||||
"""Return url when it targets an explicit local HTTP(S) endpoint.
|
||||
|
||||
The pipeline reads local audio and posts transcripts/audio bytes, so endpoint
|
||||
overrides must not be able to exfiltrate data to remote hosts. Keep the
|
||||
policy intentionally narrow: localhost, IPv4 loopback, or IPv6 ::1 only.
|
||||
"""
|
||||
parsed = urllib.parse.urlparse(url)
|
||||
if parsed.scheme not in {"http", "https"}:
|
||||
raise PipelineError(
|
||||
f"{label}_url_scheme_not_allowed",
|
||||
details={"url_host": parsed.hostname or "", "allowed_schemes": ["http", "https"]},
|
||||
)
|
||||
host = parsed.hostname
|
||||
if not host:
|
||||
raise PipelineError(f"{label}_url_missing_host")
|
||||
normalized = host.rstrip(".").lower()
|
||||
if normalized == "localhost":
|
||||
return url
|
||||
try:
|
||||
address = ipaddress.ip_address(normalized)
|
||||
except ValueError as exc:
|
||||
raise PipelineError(
|
||||
f"{label}_url_host_not_loopback",
|
||||
details={"url_host": host, "allowed_hosts": ["localhost", "127.0.0.0/8", "::1"]},
|
||||
) from exc
|
||||
if not address.is_loopback:
|
||||
raise PipelineError(
|
||||
f"{label}_url_host_not_loopback",
|
||||
details={"url_host": host, "allowed_hosts": ["localhost", "127.0.0.0/8", "::1"]},
|
||||
)
|
||||
return url
|
||||
|
||||
|
||||
def read_npu_busy_us(path: Path = NPU_BUSY_PATH) -> int | None:
|
||||
try:
|
||||
return int(path.read_text().strip())
|
||||
except (OSError, ValueError):
|
||||
return None
|
||||
|
||||
|
||||
def delta_us(before: int | None, after: int | None) -> int | None:
|
||||
if before is None or after is None:
|
||||
return None
|
||||
return max(0, after - before)
|
||||
|
||||
|
||||
def encode_multipart(fields: dict[str, str], files: dict[str, tuple[str, bytes, str]]) -> tuple[bytes, str]:
|
||||
boundary = "----npu-voice-audio-" + uuid.uuid4().hex
|
||||
parts: list[bytes] = []
|
||||
for name, value in fields.items():
|
||||
parts.append(f"--{boundary}\r\n".encode())
|
||||
parts.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
|
||||
parts.append(str(value).encode())
|
||||
parts.append(b"\r\n")
|
||||
for name, (filename, data, content_type) in files.items():
|
||||
parts.append(f"--{boundary}\r\n".encode())
|
||||
parts.append(f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'.encode())
|
||||
parts.append(f"Content-Type: {content_type}\r\n\r\n".encode())
|
||||
parts.append(data)
|
||||
parts.append(b"\r\n")
|
||||
parts.append(f"--{boundary}--\r\n".encode())
|
||||
return b"".join(parts), f"multipart/form-data; boundary={boundary}"
|
||||
|
||||
|
||||
def post_json(url: str, payload: dict[str, Any], *, timeout: int) -> dict[str, Any]:
|
||||
url = validate_loopback_endpoint(url, label="classifier")
|
||||
req = urllib.request.Request(
|
||||
url,
|
||||
data=json.dumps(payload).encode(),
|
||||
headers={"Content-Type": "application/json"},
|
||||
method="POST",
|
||||
)
|
||||
try:
|
||||
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||
return json.loads(resp.read().decode())
|
||||
except urllib.error.HTTPError as exc:
|
||||
body = exc.read().decode(errors="replace")[:300]
|
||||
raise PipelineError(f"classifier_http_{exc.code}", details={"body_preview": body}) from exc
|
||||
except (urllib.error.URLError, TimeoutError, json.JSONDecodeError) as exc:
|
||||
raise PipelineError(f"classifier_request_failed: {exc}") from exc
|
||||
|
||||
|
||||
def post_whisper(url: str, audio_path: Path, audio_data: bytes, language: str, *, timeout: int) -> dict[str, Any]:
|
||||
url = validate_loopback_endpoint(url, label="whisper")
|
||||
content_type = mimetypes.guess_type(audio_path.name)[0] or "application/octet-stream"
|
||||
body, multipart_type = encode_multipart(
|
||||
{"model": "whisper-1", "language": language, "response_format": "json"},
|
||||
{"file": (audio_path.name, audio_data, content_type)},
|
||||
)
|
||||
req = urllib.request.Request(url, data=body, headers={"Content-Type": multipart_type}, method="POST")
|
||||
try:
|
||||
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||
return json.loads(resp.read().decode())
|
||||
except urllib.error.HTTPError as exc:
|
||||
body = exc.read().decode(errors="replace")[:300]
|
||||
raise PipelineError(f"whisper_http_{exc.code}", details={"body_preview": body}) from exc
|
||||
except (urllib.error.URLError, TimeoutError, json.JSONDecodeError) as exc:
|
||||
raise PipelineError(f"whisper_request_failed: {exc}") from exc
|
||||
|
||||
|
||||
def validate_audio_path(path_text: str, *, max_bytes: int, max_audio_seconds: float | None) -> tuple[Path, int]:
|
||||
path = Path(path_text).expanduser()
|
||||
if not path.is_absolute():
|
||||
raise PipelineError("audio_path_must_be_absolute")
|
||||
if path.is_symlink():
|
||||
raise PipelineError("audio_path_must_not_be_symlink")
|
||||
if not path.exists():
|
||||
raise PipelineError("audio_path_not_found")
|
||||
if not path.is_file():
|
||||
raise PipelineError("audio_path_not_file")
|
||||
if path.suffix.lower() not in AUDIO_EXTENSIONS:
|
||||
raise PipelineError("unsupported_audio_extension", details={"extension": path.suffix.lower()})
|
||||
size = path.stat().st_size
|
||||
if size <= 0:
|
||||
raise PipelineError("audio_file_empty")
|
||||
if size > max_bytes:
|
||||
raise PipelineError("audio_file_too_large", details={"bytes": size, "max_bytes": max_bytes})
|
||||
if max_audio_seconds is not None and path.suffix.lower() == ".wav":
|
||||
try:
|
||||
with wave.open(str(path), "rb") as wav:
|
||||
duration = wav.getnframes() / float(wav.getframerate())
|
||||
except wave.Error as exc:
|
||||
raise PipelineError(f"wav_decode_failed: {exc}") from exc
|
||||
if duration > max_audio_seconds:
|
||||
raise PipelineError("audio_duration_too_long", details={"duration_seconds": round(duration, 3), "max_audio_seconds": max_audio_seconds})
|
||||
return path, size
|
||||
|
||||
|
||||
def extract_transcript(payload: dict[str, Any]) -> str:
|
||||
text = payload.get("text") or payload.get("transcript") or payload.get("transcription")
|
||||
if not text and isinstance(payload.get("segments"), list):
|
||||
text = " ".join(str(seg.get("text", "")) for seg in payload["segments"] if isinstance(seg, dict))
|
||||
return str(text or "").strip()
|
||||
|
||||
|
||||
def label_value(labels: dict[str, Any], key: str, default: Any = None) -> Any:
|
||||
value = labels.get(key, default)
|
||||
if isinstance(value, dict) and "value" in value:
|
||||
return value.get("value")
|
||||
return value
|
||||
|
||||
|
||||
def compact_labels(classifier_payload: dict[str, Any]) -> dict[str, Any]:
|
||||
raw_labels = classifier_payload.get("labels")
|
||||
labels: dict[str, Any] = raw_labels if isinstance(raw_labels, dict) else {}
|
||||
return {
|
||||
"workflow_category": label_value(labels, "workflow_category"),
|
||||
"tool_needed": bool(label_value(labels, "tool_needed", False)),
|
||||
"urgency": label_value(labels, "urgency", "normal"),
|
||||
"safety_confirmation_required": bool(label_value(labels, "safety_confirmation_required", False)),
|
||||
}
|
||||
|
||||
|
||||
def classify_text(
|
||||
*,
|
||||
classifier_url: str,
|
||||
item_id: str,
|
||||
source: str,
|
||||
title: str,
|
||||
transcript: str,
|
||||
max_transcript_chars: int,
|
||||
dry_run: bool,
|
||||
timeout: int,
|
||||
) -> tuple[dict[str, Any], int | None]:
|
||||
bounded_transcript = transcript[:max_transcript_chars]
|
||||
title_line = f"Title: {title}\n" if title else ""
|
||||
text = "Voice memo transcript summary candidate.\n" f"Source: {source}\n" f"{title_line}Transcript:\n{bounded_transcript}"
|
||||
payload = {
|
||||
"id": item_id,
|
||||
"text": text,
|
||||
"context": {"source": source, "media": "audio"},
|
||||
"options": {"include_evidence": False, "dry_run": dry_run},
|
||||
}
|
||||
before = read_npu_busy_us()
|
||||
data = post_json(classifier_url, payload, timeout=timeout)
|
||||
after = read_npu_busy_us()
|
||||
return data, delta_us(before, after)
|
||||
|
||||
|
||||
def decide_gate(transcript: str, labels: dict[str, Any], whisper_proven: bool, classifier_proven: bool) -> tuple[bool, str, str]:
|
||||
safety_required = bool(labels.get("safety_confirmation_required"))
|
||||
urgency = str(labels.get("urgency") or "normal").lower()
|
||||
action_worthy = bool(labels.get("tool_needed")) or urgency in {"high", "critical"} or bool(ACTION_MARKERS.search(transcript))
|
||||
if not whisper_proven or not classifier_proven:
|
||||
return action_worthy, "blocked_missing_npu_proof", "npu_proof_required"
|
||||
if safety_required:
|
||||
return action_worthy, "blocked_safety_confirmation_required", "human_approval_required"
|
||||
if action_worthy:
|
||||
return True, "advisory_only_not_sent", "dry_run_no_side_effects"
|
||||
return False, "suppressed_not_action_worthy", "dry_run_no_side_effects"
|
||||
|
||||
|
||||
def run_pipeline(args: argparse.Namespace) -> dict[str, Any]:
|
||||
args.whisper_url = validate_loopback_endpoint(args.whisper_url, label="whisper")
|
||||
args.classifier_url = validate_loopback_endpoint(args.classifier_url, label="classifier")
|
||||
audio_path, audio_bytes = validate_audio_path(
|
||||
args.audio,
|
||||
max_bytes=args.max_bytes,
|
||||
max_audio_seconds=args.max_audio_seconds,
|
||||
)
|
||||
audio_data = audio_path.read_bytes()
|
||||
item_id = args.id or f"voice-audio-{int(time.time())}"
|
||||
|
||||
whisper_before = read_npu_busy_us()
|
||||
whisper_payload = post_whisper(args.whisper_url, audio_path, audio_data, args.language, timeout=args.timeout)
|
||||
whisper_after = read_npu_busy_us()
|
||||
whisper_sysfs_delta = delta_us(whisper_before, whisper_after)
|
||||
transcript = extract_transcript(whisper_payload)
|
||||
if not transcript:
|
||||
raise PipelineError("whisper_empty_transcript")
|
||||
|
||||
whisper_response_delta = int(whisper_payload.get("npu_busy_delta_us") or 0)
|
||||
whisper_proven = whisper_response_delta > 0 and (whisper_sysfs_delta is None or whisper_sysfs_delta > 0)
|
||||
|
||||
classifier_payload, classifier_sysfs_observed = classify_text(
|
||||
classifier_url=args.classifier_url,
|
||||
item_id=item_id,
|
||||
source=args.source,
|
||||
title=args.title or "",
|
||||
transcript=transcript,
|
||||
max_transcript_chars=args.max_transcript_chars,
|
||||
dry_run=args.dry_run,
|
||||
timeout=args.timeout,
|
||||
)
|
||||
labels = compact_labels(classifier_payload)
|
||||
classifier_response_delta = int(classifier_payload.get("npu_busy_delta_us") or 0)
|
||||
classifier_response_sysfs_delta = int(classifier_payload.get("sysfs_npu_busy_delta_us") or 0)
|
||||
classifier_proven = classifier_response_delta > 0 and classifier_response_sysfs_delta > 0 and (classifier_sysfs_observed is None or classifier_sysfs_observed > 0)
|
||||
|
||||
action_worthy, atlas_gate, next_gate = decide_gate(transcript, labels, whisper_proven, classifier_proven)
|
||||
|
||||
output: dict[str, Any] = {
|
||||
"ok": True,
|
||||
"id": item_id,
|
||||
"source": args.source,
|
||||
"transcript_chars": len(transcript),
|
||||
"action_worthy": action_worthy,
|
||||
"atlas_gate": atlas_gate,
|
||||
"next_gate": next_gate,
|
||||
"whisper_npu_delta_us": whisper_response_delta,
|
||||
"whisper_sysfs_delta_us": whisper_sysfs_delta,
|
||||
"classifier_npu_delta_us": classifier_response_delta,
|
||||
"classifier_sysfs_delta_us": classifier_response_sysfs_delta,
|
||||
"classifier_observed_sysfs_delta_us": classifier_sysfs_observed,
|
||||
"labels": labels,
|
||||
"external_sends": 0,
|
||||
"writes": 0,
|
||||
}
|
||||
if args.include_transcript:
|
||||
output["transcript"] = transcript
|
||||
if args.include_transcript_preview_chars > 0:
|
||||
output["transcript_preview"] = transcript[: args.include_transcript_preview_chars]
|
||||
if args.include_raw:
|
||||
output["raw"] = {"whisper": whisper_payload, "classifier": classifier_payload}
|
||||
return output
|
||||
|
||||
|
||||
def build_parser() -> argparse.ArgumentParser:
|
||||
parser = argparse.ArgumentParser(description="Run local-file audio through NPU Whisper and NPU classifier in dry-run advisory mode.")
|
||||
parser.add_argument("--audio", required=True, help="Absolute path to a local audio file; no URL/platform fetching is performed.")
|
||||
parser.add_argument("--id", default="", help="Optional stable item id for classifier correlation.")
|
||||
parser.add_argument("--source", default="local_file", choices=["local_file", "manual_smoke", "local_voice_memo", "meeting_snippet", "staged_telegram", "staged_discord"], help="Local/staged source label only.")
|
||||
parser.add_argument("--title", default="", help="Optional short local title for classifier context.")
|
||||
parser.add_argument("--language", default="en")
|
||||
parser.add_argument("--whisper-url", default=DEFAULT_WHISPER_URL)
|
||||
parser.add_argument("--classifier-url", default=DEFAULT_CLASSIFIER_URL)
|
||||
parser.add_argument("--dry-run", dest="dry_run", action="store_true", default=True, help="Keep classifier in dry-run advisory mode (default).")
|
||||
parser.add_argument("--no-dry-run", dest="dry_run", action="store_false", help="Send dry_run=false to classifier; this script still performs no side effects.")
|
||||
parser.add_argument("--json", action="store_true", help="Emit compact JSON; default is JSON for machine-safe handoff.")
|
||||
parser.add_argument("--include-transcript", action="store_true", help="Include full transcript in output; off by default.")
|
||||
parser.add_argument("--include-transcript-preview-chars", type=int, default=0, help="Include a bounded transcript preview; default 0.")
|
||||
parser.add_argument("--include-raw", action="store_true", help="Include raw service responses for one-off local debugging; off by default.")
|
||||
parser.add_argument("--max-bytes", type=int, default=25 * 1024 * 1024)
|
||||
parser.add_argument("--max-audio-seconds", type=float, default=300.0, help="Enforced for WAV inputs; other codecs remain size-capped.")
|
||||
parser.add_argument("--max-transcript-chars", type=int, default=6000)
|
||||
parser.add_argument("--timeout", type=int, default=300)
|
||||
return parser
|
||||
|
||||
|
||||
def main(argv: list[str] | None = None) -> int:
|
||||
parser = build_parser()
|
||||
args = parser.parse_args(argv)
|
||||
try:
|
||||
result = run_pipeline(args)
|
||||
print(json.dumps(result, ensure_ascii=False, sort_keys=True))
|
||||
return 0
|
||||
except PipelineError as exc:
|
||||
result = {"ok": False, "error": str(exc), "external_sends": 0, "writes": 0, **exc.details}
|
||||
print(json.dumps(result, ensure_ascii=False, sort_keys=True), file=sys.stderr)
|
||||
return exc.status
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
|
||||
+122
-134
@@ -3,7 +3,7 @@ type: runbook
|
||||
system: openvino-npu-services
|
||||
status: draft
|
||||
created: 2026-06-04
|
||||
updated: 2026-06-04
|
||||
updated: 2026-06-05
|
||||
tags:
|
||||
- runbook
|
||||
- openvino
|
||||
@@ -18,33 +18,92 @@ related:
|
||||
|
||||
# OpenVINO NPU Services Runbook
|
||||
|
||||
This runbook is the integrated operations view for Will's local Intel NPU/OpenVINO services from the `npu-capability-expansion` board.
|
||||
This runbook is the integrated operations view for Will's local Intel NPU/OpenVINO services after the first approved `npu-maximization` lanes. It treats the NPU as a local reflex layer: classify, embed, rerank, transcribe, triage, and draft compact advisory output while Atlas/Hermes keeps final authority unless a separate approval changes that.
|
||||
|
||||
Safety posture:
|
||||
- Do not restart the live Atlas/Hermes gateway from this runbook.
|
||||
- Do not change primary Atlas/Hermes routing without explicit Will approval.
|
||||
- Do not delete, overwrite, or in-place reindex existing Chroma/vector collections.
|
||||
- Treat HTTP 200 as necessary but not sufficient for NPU-backed services; verify `/sys/class/accel/accel0/device/npu_busy_time_us` before/after an inference.
|
||||
- Keep endpoints local-only unless Will explicitly approves broader exposure.
|
||||
- Keep raw prompts, private documents, OCR text, and secrets out of logs and durable handoffs.
|
||||
- Treat HTTP 200 as necessary but not sufficient for NPU-backed services; verify `/sys/class/accel/accel0/device/npu_busy_time_us` before/after a real inference.
|
||||
- Keep endpoints local-only or on the approved Docker bridge only; do not add wildcard binds.
|
||||
- Keep raw prompts, private documents, OCR text, transcripts, and secrets out of logs and durable handoffs.
|
||||
- Keep operational outputs compact: booleans, counts, paths, deltas, and gates rather than raw JSON dumps.
|
||||
|
||||
## Current service map
|
||||
## Reflex-layer topology
|
||||
|
||||
| Capability | Port | Runtime / service | Path | State | Health endpoint | NPU proof |
|
||||
| --- | ---: | --- | --- | --- | --- | --- |
|
||||
| Obsidian/RAG endpoint | 18810 | `obsidian-reindex-endpoint.service` / local Python endpoint | `~/lab/swarm/scripts/` | live baseline; uses collection `obsidian_bge_npu` | `http://127.0.0.1:18810/healthz` | indirect via embeddings `:18817`; do not mutate existing collection |
|
||||
| RAG/embedding health wrapper | 18814 | `rag-embedding-health.service` | `~/lab/swarm/swarm-common/rag-embedding-health.service` | live baseline | `http://127.0.0.1:18814/healthz` | should exercise embeddings path when configured |
|
||||
| Whisper transcription, OpenVINO NPU | 18816 | Docker Compose service/container `whisper-server-npu` | `~/lab/swarm/whisper-openvino-npu/` | live baseline | `http://127.0.0.1:18816/health` | transcription response includes `npu_busy_delta_us`; sysfs delta must increase |
|
||||
| OpenVINO embeddings | 18817 | user systemd `openvino-embeddings.service` | `~/lab/swarm/scripts/openvino-embeddings-server.py`; unit in `~/lab/swarm/swarm-common/openvino-embeddings.service` | live baseline, enabled | `http://127.0.0.1:18817/healthz` | embedding response and sysfs delta must be positive |
|
||||
| NPU reranker prototype | 18818 | optional user systemd `openvino-reranker.service` | `~/lab/swarm/openvino-reranker-npu/` | approved prototype; not installed/enabled | `http://127.0.0.1:18818/readyz` | `/readyz` reports `device=NPU`; `/v1/rerank` response and sysfs delta must be positive |
|
||||
| NPU router/classifier prototype | 18819 | optional user systemd `openvino-router-classifier.service` | `~/lab/swarm/openvino-classifier-npu/` | approved prototype; not installed/enabled | `http://127.0.0.1:18819/healthz` | `/v1/classify` response has positive `npu_busy_delta_us` and `sysfs_npu_busy_delta_us` |
|
||||
| Small OpenVINO GenAI NPU worker | 18820 | optional user systemd `openvino-genai-npu-worker.service` | `~/lab/swarm/openvino-genai-npu-worker/` | approved prototype; not installed/enabled | `http://127.0.0.1:18820/healthz`; `GET /models` | generation response includes positive `npu_busy_delta_us` |
|
||||
| Document/image triage prototype | optional 18829 for review only; 18828 was an earlier smoke alternate | CLI-first; foreground local-only server if needed; no persistent unit yet | `~/lab/swarm/openvino-doc-image-triage-npu/` | approved prototype; not installed/enabled | `http://127.0.0.1:18829/healthz`; `GET /models` | v1 NPU stage is semantic embedding through `:18817`; image classification/OCR remain CPU/local |
|
||||
```text
|
||||
event / audio / doc / query / task
|
||||
-> local OpenVINO/NPU specialists
|
||||
embeddings :18817, rerank :18818, whisper :18816,
|
||||
classifier :18819, genai worker :18820, doc/image triage :18829,
|
||||
advisory gateway 172.19.0.1:18830
|
||||
-> explicit policy and authority gates
|
||||
-> Atlas/Hermes or human only when approved/useful
|
||||
```
|
||||
|
||||
Authority split:
|
||||
- NPU services may advise, label, score, transcribe, embed, rerank, triage explicit roots/files, and draft bounded summaries.
|
||||
- NPU services must not route Atlas/Hermes, write memory, send outbound messages, restart services, execute tools, mutate Kanban, or mutate vector DBs without separate approval.
|
||||
|
||||
## Live baseline services
|
||||
|
||||
These are part of the current live local baseline. Use read-only checks unless Will explicitly asks for remediation.
|
||||
|
||||
| Capability | Port / bind | Runtime / service | State | Health / proof | Notes |
|
||||
| --- | ---: | --- | --- | --- | --- |
|
||||
| Obsidian/RAG endpoint | `18810` | `obsidian-reindex-endpoint.service` / local Python endpoint | live baseline | `http://127.0.0.1:18810/healthz`; NPU proof is indirect through embeddings/rerank | Uses collection `obsidian_bge_npu`; do not mutate/reindex in place. Discovery observed `RAG_RERANK_ENABLED=true` and `RAG_RERANK_REQUIRE_NPU_PROOF=true`; do not change from this runbook. |
|
||||
| RAG/embedding health wrapper | `18814` | `rag-embedding-health.service` | live baseline | `http://127.0.0.1:18814/healthz` | Health wrapper only; use compact summaries. |
|
||||
| Whisper transcription | `18816` | Docker Compose service/container `whisper-server-npu` | live baseline | `http://127.0.0.1:18816/health`; transcription response plus sysfs busy delta must increase | Use small non-private WAV fixtures for proof. Do not restart from docs. |
|
||||
| OpenVINO embeddings | `18817` | user systemd `openvino-embeddings.service` | live baseline, enabled | `http://127.0.0.1:18817/healthz`; embedding response and sysfs delta must be positive | Model `bge-base-en-v1.5-int8-ov`, dim 768. Existing bind is broader than new-service guidance; do not broaden anything else. |
|
||||
|
||||
## Live local-only advisory specialists
|
||||
|
||||
These services are available locally for advisory/reflex work, not for authority. Some were originally prototypes but discovery/review found them active/enabled; do not reinstall or enable again blindly.
|
||||
|
||||
| Capability | Port / bind | Runtime / service | State | Health / proof | Authority boundary |
|
||||
| --- | ---: | --- | --- | --- | --- |
|
||||
| NPU reranker | `18818` localhost | `openvino-reranker.service` / `openvino-reranker-npu/` | live local specialist | `/readyz`; `/rerank` response and positive sysfs delta | Rerank only; no vector mutation. |
|
||||
| NPU router/classifier | `18819` localhost | `openvino-router-classifier.service` / `openvino-classifier-npu/` | live local specialist, dry-run/advisory | `/healthz`; `/v1/classify` response and positive sysfs delta | Labels/recommendations only; no routing, sends, memory writes, restarts, or tool execution. |
|
||||
| Small OpenVINO GenAI worker | `18820` localhost | `openvino-genai-npu-worker.service` / `openvino-genai-npu-worker/` | live local specialist; may report `loaded=false` until used | `/healthz`, `/models`; generation proof requires positive sysfs delta | Bounded draft/title/summary jobs only; not primary Atlas chat. Avoid cold-load generation unless the task requires it. |
|
||||
| Document/image triage | `18829` localhost | `openvino-doc-image-triage-npu/` | live local specialist with explicit roots | `/healthz`, `/models`; v1 NPU proof is semantic embedding through `:18817` | Request roots may narrow configured roots, never broaden. OCR/image classification are CPU/local fallbacks. |
|
||||
| Advisory gateway | `172.19.0.1:18830` approved bridge | `openvino-advisory-gateway.service` / `openvino-advisory-gateway/` | live bridge-facing advisory wrapper | `/healthz`; classify/generate/triage responses include NPU proof | For `n8n-agent` and host cron. POSTs can write metadata events, so use health-only unless classification/draft is in scope. No wildcard bind. |
|
||||
|
||||
Port notes:
|
||||
- `18818`, `18819`, and `18820` are reserved prototype ports from the program plan; check listeners before binding.
|
||||
- `18820` is reserved for the GenAI worker prototype. Use optional `18829` for document/image triage foreground review until Will approves a final persistent port. `18828` was used in earlier review smoke only and should not be treated as the preferred documented port.
|
||||
- Existing `:18817` is currently bound on `0.0.0.0` by the user service; prototype services should still default to `127.0.0.1`.
|
||||
- Prefer localhost for host-only sidecars. The advisory gateway bridge bind is intentionally for Docker bridge consumers such as `n8n-agent`.
|
||||
- `18828` was an earlier review alternate for doc/image triage and should not be treated as the preferred documented port.
|
||||
- Check listeners before foreground smokes: `ss -ltnp | grep -E ':(18810|18814|18816|18817|18818|18819|18820|18829|18830)\b'`.
|
||||
|
||||
## Dry-run examples and approved lane artifacts
|
||||
|
||||
The first-slice lanes below are approved as dry-run/local advisory examples. They may be merged into the repo by the integration lane, but they do not grant authority to mutate live Atlas/Hermes behavior.
|
||||
|
||||
| Lane | Approved branch / commit | Artifact paths | Safe use |
|
||||
| --- | --- | --- | --- |
|
||||
| Observability/utilization digest | `feature/npu-max-observability` @ `d661dc299` | `docs/npu-utilization-digest.md`, `scripts/npu-utilization-digest.py` | Read-only compact digest; can write JSONL under `~/.local/state/npu-utilization/digests` unless `--no-write`. Reviewer verified services_ok=9/9, proof_ok=5/5 on live smoke. |
|
||||
| Context-gate advisory CLI | `feature/npu-max-context-gate` @ `b4ef90aff` | `openvino_context_gate/`, `scripts/context-gate-advisory.py` | Plans typed context bundle sources; no retrieval, routing, memory write, or private content. Classifier URL is loopback-only and redirects fail closed. |
|
||||
| Cron/n8n advisory classifier | `feature/npu-max-cron-n8n` @ `54d3bcb7` | `openvino-advisory-gateway/docs/cron-n8n-advisory-classifier.md`, `examples/cron-advisory-dry-run.sh`, `examples/n8n-advisory-dry-run-fragment.json` | Dry-run event classification: duplicate/stale/no-op/action-required -> suppress/log/summarize/escalate recommendation, then human/Atlas gate before side effects. |
|
||||
| Explicit-root batch doc/image/audio triage | `feature/npu-max-doc-audio-triage` @ `bfa62cddb` | `docs/npu-batch-triage-dry-run.md`, `scripts/npu-batch-triage-dry-run.py`, `config/triage-roots*.yaml` | Reads only approved/narrow staging roots; reports compact counts/proof; no file moves, Obsidian/RAG writes, sends, or vector mutation. Whisper endpoint override is loopback `:18816` only. |
|
||||
| Voice/audio local-file pipeline | `feature/npu-max-voice` @ `534816249` | `docs/npu-voice-audio-pipeline.md`, `scripts/npu_voice_audio_pipeline.py` | Local audio file -> Whisper NPU -> classifier NPU -> advisory gate. No platform fetching, sends, writes, memory writes, or routing changes. |
|
||||
| Kanban/task hygiene advisory | `feature/npu-max-kanban-hygiene` @ `575a3cef6` | `scripts/kanban-hygiene-advisory.py` | Reads compact board summaries and suggests labels/next gates only. Does not call Kanban tools or mutate the board. NPU proof failures dominate generic review-required gates. |
|
||||
|
||||
Dry-run command patterns:
|
||||
|
||||
```bash
|
||||
# Compact service/proof digest; no artifact write during review.
|
||||
scripts/npu-utilization-digest.py --no-write --include-genai-smoke false
|
||||
|
||||
# Local-only context-gate planning; does not retrieve private content.
|
||||
python scripts/context-gate-advisory.py --query "How do I check NPU reranker proof?" --format compact
|
||||
|
||||
# Cron/n8n event advisory wrapper; dry-run only, one compact decision line.
|
||||
openvino-advisory-gateway/examples/cron-advisory-dry-run.sh npu-service-health warning health_check "openvino-reranker timeout twice" "service:openvino-reranker:timeout"
|
||||
|
||||
# Explicit-root triage; manifest root may be narrowed by --root, never broadened.
|
||||
python scripts/npu-batch-triage-dry-run.py --manifest config/triage-roots.test.yaml --lane receipts --root openvino-doc-image-triage-npu/samples --limit 5 --dry-run --json
|
||||
|
||||
# Local-file audio advisory; transcript omitted unless explicitly requested.
|
||||
/home/will/.venvs/npu/bin/python scripts/npu_voice_audio_pipeline.py --audio /tmp/npu-voice-smoke.wav --title "synthetic smoke" --source manual_smoke --json
|
||||
```
|
||||
|
||||
## Read-only unified health check
|
||||
|
||||
@@ -55,15 +114,15 @@ cd ~/lab/swarm
|
||||
./scripts/npu-service-health.sh
|
||||
```
|
||||
|
||||
The script is read-only. It checks listeners for `18810`, `18816`, `18817`, `18818`, `18819`, `18820`, `18829` plus the existing `18814` wrapper and `18828` review alternate, user service state, Docker Compose state for `whisper-server-npu`, JSON health endpoints, and performs a non-private embeddings request while measuring `/sys/class/accel/accel0/device/npu_busy_time_us` before and after. A positive sysfs delta is required for the embeddings proof.
|
||||
The script is read-only. It checks listeners for the live baseline and local specialists, user service state, Docker Compose state for `whisper-server-npu`, JSON health endpoints, and a non-private embeddings request while measuring `/sys/class/accel/accel0/device/npu_busy_time_us` before and after. A positive sysfs delta is required for the embeddings proof.
|
||||
|
||||
Manual minimal checks:
|
||||
|
||||
```bash
|
||||
BUSY=/sys/class/accel/accel0/device/npu_busy_time_us
|
||||
cat "$BUSY"
|
||||
ss -ltnp | grep -E ':(18810|18816|18817|18818|18819|18820|18829)\b' || true
|
||||
systemctl --user is-active openvino-embeddings.service rag-embedding-health.service
|
||||
ss -ltnp | grep -E ':(18810|18814|18816|18817|18818|18819|18820|18829|18830)\b' || true
|
||||
systemctl --user is-active openvino-embeddings.service rag-embedding-health.service openvino-reranker.service openvino-router-classifier.service openvino-genai-npu-worker.service openvino-doc-image-triage.service openvino-advisory-gateway.service
|
||||
cd ~/lab/swarm && docker compose ps whisper-server-npu
|
||||
curl -fsS http://127.0.0.1:18817/healthz | jq .
|
||||
```
|
||||
@@ -87,23 +146,7 @@ A healthy NPU path has:
|
||||
|
||||
## Service-specific smoke checks
|
||||
|
||||
For any foreground prototype server below, run it in a terminal you control or capture its PID and stop it at the end of the smoke. Do not use `systemctl --user enable`, Docker Compose `up -d`, `nohup`, or shell disowning for these review smokes unless Will explicitly approved persistent service enablement.
|
||||
|
||||
Safe foreground-server pattern:
|
||||
|
||||
```bash
|
||||
server_pid=""
|
||||
cleanup() {
|
||||
if [[ -n "$server_pid" ]] && kill -0 "$server_pid" 2>/dev/null; then
|
||||
kill "$server_pid"
|
||||
wait "$server_pid" 2>/dev/null || true
|
||||
fi
|
||||
}
|
||||
trap cleanup EXIT
|
||||
# start prototype server with --host 127.0.0.1 --port <port> &
|
||||
# server_pid=$!
|
||||
# run curl/smoke commands, then let trap stop it
|
||||
```
|
||||
For any foreground prototype/server smoke, run it in a terminal you control or capture its PID and stop it at the end. Do not use `systemctl --user enable`, Docker Compose `up -d`, `nohup`, or shell disowning unless Will explicitly approved persistent service enablement. Several specialists are already live; do not start duplicate listeners.
|
||||
|
||||
### Whisper NPU (`:18816`)
|
||||
|
||||
@@ -115,7 +158,6 @@ curl -fsS http://127.0.0.1:18816/health | jq .
|
||||
|
||||
Operational notes:
|
||||
- Managed as Docker Compose service/container `whisper-server-npu` in `~/lab/swarm`.
|
||||
- Consistent with existing swarm service patterns because it is a containerized service with Compose health.
|
||||
- Do not restart it from this runbook unless Will asked for remediation.
|
||||
|
||||
### OpenVINO embeddings (`:18817`)
|
||||
@@ -127,26 +169,10 @@ curl -fsS http://127.0.0.1:18817/healthz | jq .
|
||||
|
||||
Operational notes:
|
||||
- User systemd unit: `openvino-embeddings.service`.
|
||||
- Model: `bge-base-en-v1.5-int8-ov`.
|
||||
- Model directory: `/home/will/.cache/openvino-models/bge-base-en-v1.5-int8-ov`.
|
||||
- Live RAG `:18810` uses Chroma collection `obsidian_bge_npu` through this service. Do not reindex or replace this collection in place.
|
||||
|
||||
### Reranker prototype (`:18818`)
|
||||
|
||||
Foreground review start only, after confirming port is free:
|
||||
|
||||
```bash
|
||||
ss -ltnp | grep ':18818\b' || true
|
||||
cd ~/lab/swarm/openvino-reranker-npu
|
||||
source /home/will/.venvs/openvino-reranker/bin/activate
|
||||
OPENVINO_RERANKER_HOST=127.0.0.1 \
|
||||
OPENVINO_RERANKER_PORT=18818 \
|
||||
OPENVINO_RERANKER_DEVICE=NPU \
|
||||
OPENVINO_RERANKER_MODEL_DIR=/home/will/.cache/openvino-models/rerankers/ms-marco-MiniLM-L6-v2-int8-ov \
|
||||
python server.py
|
||||
```
|
||||
|
||||
From another shell:
|
||||
### Reranker (`:18818`)
|
||||
|
||||
```bash
|
||||
curl -fsS http://127.0.0.1:18818/readyz | jq .
|
||||
@@ -154,107 +180,78 @@ python ~/lab/swarm/openvino-reranker-npu/smoke.py --url http://127.0.0.1:18818
|
||||
```
|
||||
|
||||
Approval gate:
|
||||
- May be installed as `openvino-reranker.service` only after foreground smoke and Will approval.
|
||||
- May be integrated into RAG only behind disabled-by-default knobs such as `RAG_RERANK_ENABLED=false`; request-time reranking must not mutate Chroma.
|
||||
- Rerank may score candidate passages only. Any change to RAG answer selection, rerank policy, or vector DB behavior requires separate approval and rollback notes.
|
||||
|
||||
### Router/classifier prototype (`:18819`)
|
||||
|
||||
Foreground review start only, after confirming port is free:
|
||||
|
||||
```bash
|
||||
ss -ltnp | grep ':18819\b' || true
|
||||
cd ~/lab/swarm/openvino-classifier-npu
|
||||
/home/will/.venvs/npu/bin/python router_classifier.py --host 127.0.0.1 --port 18819
|
||||
```
|
||||
|
||||
Smoke:
|
||||
### Router/classifier (`:18819`)
|
||||
|
||||
```bash
|
||||
curl -fsS http://127.0.0.1:18819/healthz | jq .
|
||||
curl -fsS http://127.0.0.1:18819/v1/classify \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"id":"smoke","text":"Urgent: check whether port 18817 is listening and inspect systemd logs.","options":{"include_evidence":true,"dry_run":true}}' | jq .
|
||||
-d '{"id":"smoke","text":"Urgent: check whether port 18817 is listening and inspect systemd logs.","options":{"include_evidence":false,"dry_run":true}}' | jq '{id, labels, npu_busy_delta_us, sysfs_npu_busy_delta_us}'
|
||||
```
|
||||
|
||||
Approval gate:
|
||||
- May be installed as `openvino-router-classifier.service` only after Will approves live service enablement.
|
||||
- Must remain dry-run and must not alter Hermes/Atlas routing, memory writes, safety confirmation flow, or outbound messages without a separate explicit approval.
|
||||
- Must remain dry-run/advisory and must not alter Hermes/Atlas routing, memory writes, safety confirmation flow, or outbound messages without a separate explicit approval.
|
||||
|
||||
### Small GenAI NPU worker (`:18820`)
|
||||
|
||||
Foreground review start only, after confirming port is free:
|
||||
|
||||
```bash
|
||||
ss -ltnp | grep ':18820\b' || true
|
||||
cd ~/lab/swarm/openvino-genai-npu-worker
|
||||
/home/will/.venvs/npu/bin/python worker.py --host 127.0.0.1 --port 18820
|
||||
```
|
||||
|
||||
Smoke:
|
||||
|
||||
```bash
|
||||
curl -fsS http://127.0.0.1:18820/healthz | jq .
|
||||
curl -fsS http://127.0.0.1:18820/models | jq .
|
||||
curl -fsS http://127.0.0.1:18820/v1/worker/condense-notification \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"input":"Non-private smoke notification for local NPU worker.","max_new_tokens":64}' | jq .
|
||||
```
|
||||
|
||||
Approval gate:
|
||||
- May be installed as `openvino-genai-npu-worker.service` only after Will approves persistent service enablement.
|
||||
- Must not become primary Atlas/Hermes model routing. Use only for bounded background jobs such as title, summary, notification condensation, and memory-candidate drafting.
|
||||
- Must not become primary Atlas/Hermes model routing. Use only for bounded local jobs such as title, summary, notification condensation, and memory-candidate drafting after the relevant job is approved.
|
||||
- Avoid generation smokes that cold-load the model unless the task explicitly calls for it.
|
||||
|
||||
### Document/image triage prototype (`:18829` optional review port)
|
||||
|
||||
Foreground review start only, after confirming the port is free:
|
||||
|
||||
```bash
|
||||
ss -ltnp | grep ':18829\b' || true
|
||||
cd ~/lab/swarm/openvino-doc-image-triage-npu
|
||||
/home/will/.venvs/npu/bin/python server.py --host 127.0.0.1 --port 18829 --allowed-root "$PWD"
|
||||
```
|
||||
|
||||
Smoke:
|
||||
### Document/image triage (`:18829`)
|
||||
|
||||
```bash
|
||||
curl -fsS http://127.0.0.1:18829/healthz | jq .
|
||||
curl -fsS http://127.0.0.1:18829/models | jq .
|
||||
/home/will/.venvs/npu/bin/python tests/smoke_test.py
|
||||
```
|
||||
|
||||
Approval gate:
|
||||
- Do not point it at arbitrary directories; allowed roots must be equal to or under configured roots.
|
||||
- Do not include raw OCR text or full source paths unless Will explicitly asks for a one-off response.
|
||||
- Do not include raw OCR text or full source paths unless Will explicitly asks for one-off debugging.
|
||||
- v1 only uses the NPU through `:18817` embeddings for needs-attention; image category classification and OCR are CPU/local fallbacks.
|
||||
|
||||
## Systemd and Compose recommendations
|
||||
### Advisory gateway (`172.19.0.1:18830`)
|
||||
|
||||
Recommended management split:
|
||||
- Keep containerized services in Docker Compose when they already have Docker build/runtime shape and Compose health (`whisper-server-npu`).
|
||||
- Keep host-side OpenVINO Python prototypes as user systemd services when they depend on local venvs, sysfs NPU access, model caches, and localhost-only APIs (`openvino-embeddings`, optional reranker/classifier/GenAI worker).
|
||||
- Do not add the prototypes to the live gateway or primary routing during installation. Installation and routing are separate approval gates.
|
||||
```bash
|
||||
curl -fsS http://172.19.0.1:18830/healthz | jq .
|
||||
docker exec n8n-agent wget -qO- -T 8 http://172.19.0.1:18830/healthz
|
||||
```
|
||||
|
||||
User-systemd unit expectations for optional prototypes:
|
||||
- `WorkingDirectory` points at the service directory under `~/lab/swarm/`.
|
||||
- `ExecStart` uses the existing venv path documented by the prototype.
|
||||
- `Environment` pins host to `127.0.0.1`, port, model path, device `NPU`, and any upstream endpoint.
|
||||
- `Restart=on-failure`, not aggressive restart loops.
|
||||
- Logs go to user journal; do not log raw request bodies.
|
||||
- Start manually for smoke; enable on boot only after Will approval.
|
||||
Approval gate:
|
||||
- Classification/generation/triage POSTs are advisory only and may write metadata counters. Do not wire outputs to sends, restarts, memory writes, tool execution, or Atlas/Hermes routing without a separate reviewed approval.
|
||||
|
||||
Compose expectations for existing swarm services:
|
||||
- Prefer `cd ~/lab/swarm && make ps`, `make status`, and targeted `docker compose ps <service>` for read-only checks.
|
||||
- Do not run `docker compose up -d`, restart containers, pull images, or prune volumes from this runbook without approval.
|
||||
## Approval-gated / not-live integrations
|
||||
|
||||
The following remain closed even though dry-run examples and local specialists exist:
|
||||
|
||||
| Integration | Current gate |
|
||||
| --- | --- |
|
||||
| Primary Atlas/Hermes routing changes | closed; no live routing authority changes from this program slice |
|
||||
| Memory writes from NPU classifier/GenAI/advisory gateway | closed |
|
||||
| Telegram/Discord/email/outbound sends from cron/n8n/voice/advisory output | closed |
|
||||
| Service restarts or tool execution triggered by classifier/gateway output | closed |
|
||||
| Automatic Kanban task mutation, assignment, block/unblock, completion, or task creation | closed |
|
||||
| Broad private document/image/audio root processing | closed; only explicit approved/narrow roots |
|
||||
| Vector DB mutation/reindex or Chroma collection replacement | closed |
|
||||
| Wildcard binds or broader exposure for new services | closed |
|
||||
| GenAI worker as primary chat model | closed; bounded local drafts only |
|
||||
| Diffusion/image generation on the NPU | rejected/parked for this program |
|
||||
|
||||
## Monitoring and logging notes
|
||||
|
||||
Minimum recurring monitoring should include:
|
||||
- Listener presence for `18816`, `18817`, and any approved optional prototype ports.
|
||||
- User service state for `openvino-embeddings.service` and any approved optional prototype unit.
|
||||
- Docker Compose health for `whisper-server-npu`.
|
||||
- Listener presence for live baseline and any approved specialist ports.
|
||||
- User service state for OpenVINO services and Docker Compose health for `whisper-server-npu`.
|
||||
- HTTP health endpoint success.
|
||||
- Positive sysfs NPU busy-time delta on at least one non-private inference probe, preferably embeddings `:18817` because it is already live and central.
|
||||
- Journal/container logs only at summary level. Avoid raw prompts, raw OCR text, private document names, credentials, and API keys.
|
||||
- Compact counts/deltas/gates only. Avoid raw prompts, transcripts, OCR text, private document names, credentials, and API keys.
|
||||
|
||||
Useful log commands:
|
||||
|
||||
@@ -264,23 +261,14 @@ journalctl --user -u rag-embedding-health.service -n 100 --no-pager
|
||||
journalctl --user -u openvino-reranker.service -n 100 --no-pager
|
||||
journalctl --user -u openvino-router-classifier.service -n 100 --no-pager
|
||||
journalctl --user -u openvino-genai-npu-worker.service -n 100 --no-pager
|
||||
journalctl --user -u openvino-advisory-gateway.service -n 100 --no-pager
|
||||
cd ~/lab/swarm && docker compose logs --tail 100 whisper-server-npu
|
||||
```
|
||||
|
||||
## Approval gates
|
||||
## Approved/parked outcomes
|
||||
|
||||
Requires explicit Will approval before proceeding:
|
||||
- Installing, enabling, or autostarting `openvino-reranker.service`, `openvino-router-classifier.service`, or `openvino-genai-npu-worker.service`.
|
||||
- Assigning a final persistent port to document/image triage or enabling it as a persistent service.
|
||||
- Enabling live RAG reranking or any request path that changes Atlas/RAG answers.
|
||||
- Changing primary Atlas/Hermes routing or connecting router/classifier outputs to live decisions.
|
||||
- Connecting the GenAI worker to primary Atlas chat, gateway routing, memory writes, or outbound notifications.
|
||||
- Restarting the live Atlas/Hermes gateway.
|
||||
- Deleting, overwriting, or in-place reindexing existing vector collections.
|
||||
- Broadening bind addresses or exposure beyond local-only defaults.
|
||||
|
||||
Approved/parked outcomes:
|
||||
- Built/approved prototypes: reranker (`:18818`), router/classifier (`:18819`), small GenAI worker (`:18820`), document/image triage (review ports `:18828`/`:18829`).
|
||||
- Live baseline retained: Whisper NPU (`:18816`), OpenVINO embeddings (`:18817`), RAG endpoint (`:18810`) using `obsidian_bge_npu`.
|
||||
- Live baseline retained: RAG endpoint (`:18810`), RAG health wrapper (`:18814`), Whisper NPU (`:18816`), OpenVINO embeddings (`:18817`).
|
||||
- Live local-only advisory/reflex specialists: reranker (`:18818`), router/classifier (`:18819`), GenAI worker (`:18820`), doc/image triage (`:18829`), advisory gateway bridge (`172.19.0.1:18830`).
|
||||
- Approved dry-run examples: utilization digest, context gate plan, cron/n8n advisory classifier, explicit-root batch triage, local-file voice/audio pipeline, Kanban hygiene advisory.
|
||||
- Parked: always-on wake-word/audio and conventional vision detection until Will wants a concrete use case.
|
||||
- Rejected for this NPU program: diffusion/image generation.
|
||||
|
||||
@@ -0,0 +1,200 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import socket
|
||||
import subprocess
|
||||
import sys
|
||||
import threading
|
||||
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
REPO_ROOT = Path(__file__).resolve().parents[1]
|
||||
if str(REPO_ROOT) not in sys.path:
|
||||
sys.path.insert(0, str(REPO_ROOT))
|
||||
|
||||
from openvino_context_gate.context_gate import ( # noqa: E402
|
||||
AUTHORITY,
|
||||
ClassifierResult,
|
||||
ContextGateError,
|
||||
build_plan,
|
||||
classify_live,
|
||||
compact_json,
|
||||
compact_line,
|
||||
)
|
||||
|
||||
|
||||
def fake_classifier(
|
||||
labels: dict,
|
||||
*,
|
||||
endpoint_delta: int | None = 120,
|
||||
sysfs_delta: int | None = 120,
|
||||
outer_delta: int | None = 80,
|
||||
) -> ClassifierResult:
|
||||
return ClassifierResult(
|
||||
labels=labels,
|
||||
npu_busy_delta_us=endpoint_delta,
|
||||
sysfs_npu_busy_delta_us=sysfs_delta,
|
||||
outer_sysfs_delta_us=outer_delta,
|
||||
live=True,
|
||||
)
|
||||
|
||||
|
||||
def labels(category: str, *, tool: bool = False, safety: bool = False, memory: str = "none") -> dict:
|
||||
return {
|
||||
"tool_needed": {"value": tool, "confidence": 0.8 if tool else 0.4},
|
||||
"memory_candidate": {"value": memory, "confidence": 0.8 if memory != "none" else 0.3},
|
||||
"urgency": {"value": "normal", "confidence": 0.6},
|
||||
"workflow_category": {"value": category, "confidence": 0.86},
|
||||
"safety_confirmation_required": {"value": safety, "confidence": 0.9 if safety else 0.1},
|
||||
}
|
||||
|
||||
|
||||
def test_current_npu_debug_query_selects_ops_live_and_repo_sources() -> None:
|
||||
plan = build_plan(
|
||||
"How do I check whether the RAG reranker is using the NPU?",
|
||||
context={"platform": "cli", "repo_path": "/home/will/lab/swarm"},
|
||||
classifier=fake_classifier(labels("devops", tool=True)),
|
||||
)
|
||||
assert plan["schema"] == "atlas_context_gate_plan_v1"
|
||||
assert plan["bundle_plan"]["bundle_name"] == "OpsDebugBundle"
|
||||
assert [s["source"] for s in plan["source_plan"]][:2] == ["live_system", "repo_files"]
|
||||
assert plan["npu_proof"]["verified"] is True
|
||||
assert plan["authority"] == AUTHORITY
|
||||
assert all(value.startswith("closed_") for value in plan["gates"].values())
|
||||
|
||||
|
||||
def test_prior_plan_query_uses_session_or_rag_and_coding_for_kanban() -> None:
|
||||
plan = build_plan(
|
||||
"Where did we leave the NPU context gate implementation plan?",
|
||||
context={"platform": "kanban", "task_id": "t_example", "repo_path": "/home/will/lab/swarm"},
|
||||
classifier=fake_classifier(labels("coding", tool=True)),
|
||||
)
|
||||
sources = [s["source"] for s in plan["source_plan"]]
|
||||
assert plan["bundle_plan"]["bundle_name"] == "CodingTaskBundle"
|
||||
assert "repo_files" in sources
|
||||
assert "session_search" in sources
|
||||
assert "rag_search" in sources
|
||||
|
||||
|
||||
def test_simple_creative_query_no_retrieval_offline_no_npu_claim() -> None:
|
||||
plan = build_plan("Write a haiku about Seattle rain.")
|
||||
assert plan["bundle_plan"]["bundle_name"] == "SimpleResponseBundle"
|
||||
assert [s["source"] for s in plan["source_plan"]] == ["no_retrieval"]
|
||||
assert plan["npu_proof"]["verified"] is False
|
||||
assert "npu_proof_inconclusive" in plan["warnings"]
|
||||
assert "offline_heuristic_classifier_no_npu_claim" in plan["warnings"]
|
||||
|
||||
|
||||
def test_unsafe_live_routing_request_keeps_authority_closed_and_blocks_side_effect() -> None:
|
||||
plan = build_plan(
|
||||
"Change Hermes live routing to use the classifier automatically.",
|
||||
context={"repo_path": "/home/will/lab/swarm"},
|
||||
classifier=fake_classifier(labels("coding", tool=True, safety=True)),
|
||||
)
|
||||
assert plan["authority"] == AUTHORITY
|
||||
assert plan["authority"]["may_route"] is False
|
||||
assert any(field["field"] == "authority_side_effect" for field in plan["bundle_plan"]["blocked_fields"])
|
||||
assert plan["gates"]["live_routing_change"] == "closed_requires_explicit_approval"
|
||||
|
||||
|
||||
def test_rejects_non_dry_run_and_private_text_options() -> None:
|
||||
with pytest.raises(ContextGateError, match="dry_run_must_remain_true"):
|
||||
build_plan("hello", options={"dry_run": False})
|
||||
with pytest.raises(ContextGateError, match="include_private_text"):
|
||||
build_plan("hello", options={"include_private_text": True})
|
||||
|
||||
|
||||
def test_compact_outputs_are_small_and_parseable() -> None:
|
||||
plan = build_plan("How do I check whether port 18819 is healthy?")
|
||||
line = compact_line(plan)
|
||||
assert "schema=atlas_context_gate_plan_v1" in line
|
||||
assert "gates=closed:" in line
|
||||
parsed = json.loads(compact_json(plan))
|
||||
assert parsed["schema"] == "atlas_context_gate_plan_v1"
|
||||
assert isinstance(parsed["sources"], list)
|
||||
assert "authority" in parsed
|
||||
|
||||
|
||||
def test_cli_offline_compact_json_smoke() -> None:
|
||||
script = REPO_ROOT / "scripts" / "context-gate-advisory.py"
|
||||
result = subprocess.run(
|
||||
[sys.executable, str(script), "--offline", "--query", "Write a haiku about Seattle rain.", "--format", "compact-json"],
|
||||
check=True,
|
||||
text=True,
|
||||
capture_output=True,
|
||||
cwd=REPO_ROOT,
|
||||
)
|
||||
parsed = json.loads(result.stdout)
|
||||
assert parsed["ok"] is True
|
||||
assert parsed["bundle_name"] == "SimpleResponseBundle"
|
||||
assert parsed["sources"] == ["no_retrieval"]
|
||||
assert parsed["npu_proof"]["verified"] is False
|
||||
|
||||
|
||||
def test_npu_proof_requires_positive_sysfs_delta() -> None:
|
||||
classifier = fake_classifier(labels("devops", tool=True), endpoint_delta=120, sysfs_delta=0, outer_delta=None)
|
||||
plan = build_plan("How do I check whether the RAG reranker is using the NPU?", classifier=classifier)
|
||||
assert plan["npu_proof"]["verified"] is False
|
||||
assert "npu_proof_inconclusive" in plan["warnings"]
|
||||
|
||||
endpoint_sysfs_plan = build_plan(
|
||||
"How do I check whether the RAG reranker is using the NPU?",
|
||||
classifier=fake_classifier(labels("devops", tool=True), endpoint_delta=120, sysfs_delta=1, outer_delta=None),
|
||||
)
|
||||
assert endpoint_sysfs_plan["npu_proof"]["verified"] is True
|
||||
|
||||
outer_sysfs_plan = build_plan(
|
||||
"How do I check whether the RAG reranker is using the NPU?",
|
||||
classifier=fake_classifier(labels("devops", tool=True), endpoint_delta=120, sysfs_delta=0, outer_delta=1),
|
||||
)
|
||||
assert outer_sysfs_plan["npu_proof"]["verified"] is True
|
||||
|
||||
|
||||
def test_classifier_url_must_be_loopback_or_localhost() -> None:
|
||||
for url in [
|
||||
"http://example.com/v1/classify",
|
||||
"https://10.0.0.5/v1/classify",
|
||||
"http://0.0.0.0:18819/v1/classify",
|
||||
"ftp://127.0.0.1/v1/classify",
|
||||
]:
|
||||
with pytest.raises(ContextGateError, match="invalid_classifier_url"):
|
||||
classify_live("hello", classifier_url=url, timeout=0.01)
|
||||
|
||||
|
||||
def test_classifier_url_redirect_to_non_loopback_is_not_followed(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
requests: list[str] = []
|
||||
|
||||
class RedirectHandler(BaseHTTPRequestHandler):
|
||||
def do_POST(self) -> None: # noqa: N802 - stdlib callback name
|
||||
requests.append(self.path)
|
||||
self.send_response(302)
|
||||
self.send_header("Location", "http://example.com/v1/classify")
|
||||
self.end_headers()
|
||||
|
||||
def log_message(self, format: str, *args: object) -> None:
|
||||
return
|
||||
|
||||
original_create_connection = socket.create_connection
|
||||
|
||||
def guarded_create_connection(address, *args, **kwargs): # type: ignore[no-untyped-def]
|
||||
host = address[0]
|
||||
if host not in {"127.0.0.1", "localhost"}:
|
||||
raise AssertionError(f"attempted non-loopback redirect connection to {host}")
|
||||
return original_create_connection(address, *args, **kwargs)
|
||||
|
||||
server = ThreadingHTTPServer(("127.0.0.1", 0), RedirectHandler)
|
||||
thread = threading.Thread(target=server.serve_forever, daemon=True)
|
||||
thread.start()
|
||||
monkeypatch.setattr(socket, "create_connection", guarded_create_connection)
|
||||
try:
|
||||
url = f"http://127.0.0.1:{server.server_port}/v1/classify"
|
||||
with pytest.raises(ContextGateError, match="classifier_unavailable"):
|
||||
classify_live("hello", classifier_url=url, timeout=1.0)
|
||||
finally:
|
||||
server.shutdown()
|
||||
server.server_close()
|
||||
thread.join(timeout=2)
|
||||
|
||||
assert requests == ["/v1/classify"]
|
||||
@@ -0,0 +1,198 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import importlib.util
|
||||
import json
|
||||
import subprocess
|
||||
import sys
|
||||
import tempfile
|
||||
import unittest
|
||||
from pathlib import Path
|
||||
from typing import cast
|
||||
|
||||
ROOT = Path(__file__).resolve().parents[1]
|
||||
MODULE_PATH = ROOT / "scripts" / "kanban-hygiene-advisory.py"
|
||||
|
||||
|
||||
def load_module():
|
||||
spec = importlib.util.spec_from_file_location("kanban_hygiene_advisory", MODULE_PATH)
|
||||
assert spec and spec.loader
|
||||
module = importlib.util.module_from_spec(spec)
|
||||
sys.modules[spec.name] = module
|
||||
spec.loader.exec_module(module)
|
||||
return module
|
||||
|
||||
|
||||
def task(task_id: str, title: str, status: str = "ready", **extra):
|
||||
row = {
|
||||
"id": task_id,
|
||||
"title": title,
|
||||
"status": status,
|
||||
"assignee": "engineer",
|
||||
"created_at": 1_780_000_000,
|
||||
"updated_at": 1_780_000_100,
|
||||
}
|
||||
row.update(extra)
|
||||
return row
|
||||
|
||||
|
||||
class KanbanHygieneAdvisoryTests(unittest.TestCase):
|
||||
def setUp(self):
|
||||
self.mod = load_module()
|
||||
|
||||
def advisory(self, tasks, now=1_780_003_600):
|
||||
return self.mod.advisory(
|
||||
tasks,
|
||||
board="npu-maximization",
|
||||
now=now,
|
||||
input_metadata={},
|
||||
include_evidence=False,
|
||||
)
|
||||
|
||||
def test_output_contract_and_authority_flags_are_all_false(self):
|
||||
output = self.advisory([
|
||||
task("t_spec", "spec: Kanban/task hygiene classifier", body_excerpt="Define dry-run labels and next gate.")
|
||||
])
|
||||
self.assertEqual(output["schema"], "kanban_hygiene_advisory_v1")
|
||||
self.assertTrue(output["dry_run"])
|
||||
self.assertEqual(output["counts"]["tasks"], 1)
|
||||
self.assertTrue(output["npu_proof"]["required_for_npu_claims"])
|
||||
self.assertFalse(output["npu_proof"]["attempted"])
|
||||
self.assertTrue(output["authority"])
|
||||
self.assertTrue(all(value is False for value in output["authority"].values()))
|
||||
|
||||
def test_required_labels_and_kanban_lane_gate(self):
|
||||
output = self.advisory([
|
||||
task("t1", "spec: Kanban/task hygiene classifier", body_excerpt="Read board summaries and suggest review-needed next gate labels.")
|
||||
])
|
||||
item = output["items"][0]
|
||||
for key in ["task_type", "project", "lane", "blocker", "staleness", "duplicate", "review_needed", "next_gate"]:
|
||||
self.assertIn(key, item)
|
||||
self.assertEqual(item["task_type"]["value"], "spec")
|
||||
self.assertEqual(item["project"]["value"], "npu-maximization")
|
||||
self.assertEqual(item["lane"]["value"], "kanban_hygiene")
|
||||
self.assertEqual(item["next_gate"]["value"], "ready_for_implementation")
|
||||
|
||||
def test_lifecycle_chain_is_not_duplicate_even_with_same_normalized_title(self):
|
||||
rows = [
|
||||
task("t_spec", "spec: Kanban hygiene advisory", children=["t_impl"]),
|
||||
task("t_impl", "implement: Kanban hygiene advisory", parents=["t_spec"], children=["t_review"]),
|
||||
task("t_review", "review: Kanban hygiene advisory", parents=["t_impl"]),
|
||||
]
|
||||
output = self.advisory(rows)
|
||||
self.assertEqual(output["counts"]["duplicates"], 0)
|
||||
self.assertTrue(all(not item["duplicate"]["is_duplicate"] for item in output["items"]))
|
||||
|
||||
def test_duplicate_same_type_lane_and_normalized_title_is_flagged(self):
|
||||
rows = [
|
||||
task("t_a", "implement: dry-run Kanban hygiene advisory", body_excerpt="Kanban board summaries"),
|
||||
task("t_b", "implement: dry run kanban hygiene advisory", body_excerpt="Kanban board summaries"),
|
||||
]
|
||||
output = self.advisory(rows)
|
||||
self.assertEqual(output["counts"]["duplicates"], 1)
|
||||
dupes = [item for item in output["items"] if item["duplicate"]["is_duplicate"]]
|
||||
self.assertEqual(len(dupes), 1)
|
||||
self.assertEqual(dupes[0]["next_gate"]["value"], "dedupe_review")
|
||||
|
||||
def test_staleness_is_deterministic_with_now(self):
|
||||
output = self.advisory([
|
||||
task("t_run", "implement: NPU service", status="running", updated_at=1_780_000_000, heartbeat_at=1_780_000_000)
|
||||
], now=1_780_007_201)
|
||||
item = output["items"][0]
|
||||
self.assertEqual(item["staleness"]["value"], "stale_lock")
|
||||
self.assertEqual(item["next_gate"]["value"], "investigate_stale_lock")
|
||||
self.assertEqual(output["counts"]["stale"], 1)
|
||||
|
||||
def test_review_required_marker_sets_ready_for_review(self):
|
||||
output = self.advisory([
|
||||
task(
|
||||
"t_impl",
|
||||
"implement: dry-run Kanban hygiene advisory",
|
||||
status="blocked",
|
||||
body_excerpt="review-required: code change needs review",
|
||||
changed_files=["scripts/kanban-hygiene-advisory.py"],
|
||||
tests_run=8,
|
||||
)
|
||||
])
|
||||
item = output["items"][0]
|
||||
self.assertTrue(item["review_needed"]["value"])
|
||||
self.assertEqual(item["review_needed"]["kind"], "code_change")
|
||||
self.assertEqual(item["next_gate"]["value"], "ready_for_review")
|
||||
|
||||
def test_missing_parent_waits_without_marking_blocked(self):
|
||||
output = self.advisory([
|
||||
task("t_child", "implement: context gate", status="todo", parents=["t_parent"], body_excerpt="RAG context gate")
|
||||
])
|
||||
item = output["items"][0]
|
||||
self.assertEqual(item["blocker"]["value"], "missing_parent")
|
||||
self.assertFalse(item["blocker"]["blocked"])
|
||||
self.assertEqual(item["next_gate"]["value"], "wait_for_parents")
|
||||
|
||||
def test_npu_claim_without_busy_delta_routes_to_proof_gate(self):
|
||||
for excerpt in [
|
||||
"NPU classifier returned HTTP 200 but missing busy delta evidence",
|
||||
"NPU reranker reported npu_busy_delta_us=0",
|
||||
"NPU reranker reported npu_busy_delta_us=-5",
|
||||
"NPU reranker reported npu_busy_delta_us=-0.1",
|
||||
]:
|
||||
with self.subTest(excerpt=excerpt):
|
||||
output = self.advisory([task("t_npu", "test: NPU classifier smoke", body_excerpt=excerpt)])
|
||||
item = output["items"][0]
|
||||
self.assertTrue(item["review_needed"]["value"])
|
||||
self.assertEqual(item["review_needed"]["kind"], "npu_proof_gate")
|
||||
self.assertEqual(item["next_gate"]["value"], "needs_npu_proof")
|
||||
|
||||
def test_npu_proof_gate_dominates_review_required_marker(self):
|
||||
for excerpt in [
|
||||
"review-required: NPU reranker reported npu_busy_delta_us=0 after smoke",
|
||||
"review-required: NPU classifier returned HTTP 200 but missing busy delta evidence",
|
||||
]:
|
||||
with self.subTest(excerpt=excerpt):
|
||||
output = self.advisory([
|
||||
task(
|
||||
"t_npu_review",
|
||||
"implement: NPU classifier smoke",
|
||||
status="blocked",
|
||||
body_excerpt=excerpt,
|
||||
changed_files=["scripts/npu-classifier.py"],
|
||||
tests_run=1,
|
||||
)
|
||||
])
|
||||
item = output["items"][0]
|
||||
self.assertTrue(item["review_needed"]["value"])
|
||||
self.assertEqual(item["review_needed"]["kind"], "npu_proof_gate")
|
||||
self.assertEqual(item["next_gate"]["value"], "needs_npu_proof")
|
||||
|
||||
def test_cli_accepts_jsonl_auto_format_and_invalid_schema_exits_nonzero(self):
|
||||
good_rows = [
|
||||
json.dumps(task("t1", "docs: service map update", body_excerpt="runbook README")),
|
||||
json.dumps(task("t2", "ops: utilization digest", body_excerpt="health metrics digest")),
|
||||
]
|
||||
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as handle:
|
||||
handle.write("\n".join(good_rows))
|
||||
good_path = handle.name
|
||||
try:
|
||||
result = subprocess.run(
|
||||
[sys.executable, str(MODULE_PATH), "--input", good_path, "--board", "npu-maximization", "--now", "1780003600"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
finally:
|
||||
Path(good_path).unlink(missing_ok=True)
|
||||
self.assertEqual(result.returncode, 0, result.stderr)
|
||||
parsed = json.loads(result.stdout)
|
||||
self.assertEqual(parsed["counts"]["tasks"], 2)
|
||||
|
||||
bad = subprocess.run(
|
||||
[sys.executable, str(MODULE_PATH)],
|
||||
input=json.dumps({"tasks": [{"id": "missing-fields"}]}),
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=False,
|
||||
)
|
||||
self.assertNotEqual(bad.returncode, 0)
|
||||
self.assertIn("missing required fields", bad.stderr)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
unittest.main()
|
||||
@@ -0,0 +1,202 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import importlib.util
|
||||
import json
|
||||
import os
|
||||
import subprocess
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
import yaml
|
||||
|
||||
ROOT = Path(__file__).resolve().parents[1]
|
||||
SCRIPT = ROOT / "scripts" / "npu-batch-triage-dry-run.py"
|
||||
|
||||
|
||||
def load_script_module():
|
||||
spec = importlib.util.spec_from_file_location("npu_batch_triage_dry_run", SCRIPT)
|
||||
assert spec and spec.loader
|
||||
module = importlib.util.module_from_spec(spec)
|
||||
spec.loader.exec_module(module)
|
||||
return module
|
||||
|
||||
|
||||
def run_cli(*args: str) -> subprocess.CompletedProcess[str]:
|
||||
return subprocess.run(
|
||||
[sys.executable, str(SCRIPT), *args],
|
||||
cwd=ROOT,
|
||||
text=True,
|
||||
stdout=subprocess.PIPE,
|
||||
stderr=subprocess.PIPE,
|
||||
check=False,
|
||||
)
|
||||
|
||||
|
||||
def manifest(tmp_path: Path, lane: str = "screenshots", approved: bool = True, root: Path | None = None, exts: list[str] | None = None) -> Path:
|
||||
root = root or tmp_path / "approved"
|
||||
root.mkdir(parents=True, exist_ok=True)
|
||||
data = {
|
||||
"version": 1,
|
||||
"policy": {
|
||||
"default_mode": "dry_run",
|
||||
"require_explicit_root": True,
|
||||
"allow_external_uploads": False,
|
||||
"allow_mutations": False,
|
||||
"log_raw_text": False,
|
||||
},
|
||||
"roots": {
|
||||
lane: {
|
||||
"approved": approved,
|
||||
"root": str(root),
|
||||
"allowed_extensions": exts or [".png", ".txt", ".wav"],
|
||||
"max_files": 10,
|
||||
"max_file_mb": 5,
|
||||
}
|
||||
},
|
||||
}
|
||||
path = tmp_path / "triage-roots.yaml"
|
||||
path.write_text(yaml.safe_dump(data))
|
||||
return path
|
||||
|
||||
|
||||
def test_missing_manifest_fails_closed(tmp_path: Path) -> None:
|
||||
result = run_cli("--manifest", str(tmp_path / "missing.yaml"), "--lane", "screenshots", "--dry-run", "--no-npu", "--json")
|
||||
out = json.loads(result.stdout)
|
||||
assert result.returncode == 2
|
||||
assert out["ok"] is False
|
||||
assert out["error"] == "fail_closed"
|
||||
assert "manifest_missing" in out["reason"]
|
||||
assert all(v is False for v in out["mutations"].values())
|
||||
|
||||
|
||||
def test_unapproved_lane_fails_closed(tmp_path: Path) -> None:
|
||||
man = manifest(tmp_path, approved=False)
|
||||
result = run_cli("--manifest", str(man), "--lane", "screenshots", "--dry-run", "--no-npu", "--json")
|
||||
out = json.loads(result.stdout)
|
||||
assert result.returncode == 2
|
||||
assert out["ok"] is False
|
||||
assert "lane_unapproved:screenshots" in out["reason"]
|
||||
|
||||
|
||||
def test_request_root_cannot_broaden_manifest_root(tmp_path: Path) -> None:
|
||||
approved = tmp_path / "approved" / "narrow"
|
||||
man = manifest(tmp_path, root=approved)
|
||||
broad = tmp_path / "approved"
|
||||
result = run_cli("--manifest", str(man), "--lane", "screenshots", "--root", str(broad), "--dry-run", "--no-npu", "--json")
|
||||
out = json.loads(result.stdout)
|
||||
assert result.returncode == 2
|
||||
assert out["ok"] is False
|
||||
assert "request_root_broadens_approval:screenshots" in out["reason"]
|
||||
|
||||
|
||||
def test_symlink_escape_is_skipped_and_output_redacted(tmp_path: Path) -> None:
|
||||
approved = tmp_path / "approved"
|
||||
approved.mkdir()
|
||||
outside = tmp_path / "outside"
|
||||
outside.mkdir()
|
||||
(approved / "note.png").write_bytes(b"fake image")
|
||||
(approved / "note.png.txt").write_text("Invoice payment due 2026-06-10 $42.00")
|
||||
(outside / "secret.png").write_bytes(b"secret")
|
||||
os.symlink(outside / "secret.png", approved / "escape.png")
|
||||
man = manifest(tmp_path, root=approved, exts=[".png"])
|
||||
|
||||
result = run_cli("--manifest", str(man), "--lane", "screenshots", "--dry-run", "--no-npu", "--json")
|
||||
out = json.loads(result.stdout)
|
||||
assert result.returncode == 0
|
||||
assert out["ok"] is True
|
||||
assert out["files_processed"] == 1
|
||||
assert out["skipped"]["symlink_escape"] == 1
|
||||
item = out["items"][0]
|
||||
assert item["basename"] == "note.png"
|
||||
assert item["raw_text_redacted"] is True
|
||||
assert item["full_path_included"] is False
|
||||
assert "full_path" not in item
|
||||
assert "Invoice" not in json.dumps(out)
|
||||
assert out["npu"]["claimed"] is False
|
||||
assert all(v is False for v in out["mutations"].values())
|
||||
|
||||
|
||||
def test_committed_sample_manifest_cpu_smoke() -> None:
|
||||
result = run_cli(
|
||||
"--manifest",
|
||||
"config/triage-roots.test.yaml",
|
||||
"--lane",
|
||||
"receipts",
|
||||
"--root",
|
||||
"openvino-doc-image-triage-npu/samples",
|
||||
"--limit",
|
||||
"2",
|
||||
"--dry-run",
|
||||
"--no-npu",
|
||||
"--json",
|
||||
)
|
||||
out = json.loads(result.stdout)
|
||||
assert result.returncode == 0
|
||||
assert out["ok"] is True
|
||||
assert out["lane"] == "receipts"
|
||||
assert out["dry_run"] is True
|
||||
assert out["files_processed"] == 2
|
||||
assert out["npu"] == {"busy_delta_us": None, "claimed": False, "proof_ok": False} or out["npu"]["claimed"] is False
|
||||
assert all(v is False for v in out["mutations"].values())
|
||||
assert all(item.get("raw_text_redacted", True) for item in out["items"])
|
||||
|
||||
|
||||
def test_audio_lane_no_npu_does_not_transcribe_or_claim(tmp_path: Path) -> None:
|
||||
approved = tmp_path / "voice"
|
||||
approved.mkdir()
|
||||
(approved / "memo.wav").write_bytes(b"not really wav; no-npu mode must not decode")
|
||||
man = manifest(tmp_path, lane="voice_memos", root=approved, exts=[".wav"])
|
||||
result = run_cli("--manifest", str(man), "--lane", "voice_memos", "--dry-run", "--no-npu", "--json")
|
||||
out = json.loads(result.stdout)
|
||||
assert result.returncode == 0
|
||||
item = out["items"][0]
|
||||
assert item["transcribed"] is False
|
||||
assert item["raw_transcript_logged"] is False
|
||||
assert out["npu"]["claimed"] is False
|
||||
assert all(v is False for v in out["mutations"].values())
|
||||
|
||||
|
||||
def test_external_whisper_url_fails_closed_before_audio_read(tmp_path: Path) -> None:
|
||||
module = load_script_module()
|
||||
with pytest.raises(module.FailClosed, match="whisper_url_not_loopback"):
|
||||
module.multipart_transcribe(tmp_path / "missing.wav", "http://example.com:18816/v1/audio/transcriptions", 0.01)
|
||||
|
||||
|
||||
def test_audio_lane_rejects_external_whisper_url(tmp_path: Path) -> None:
|
||||
approved = tmp_path / "voice"
|
||||
approved.mkdir()
|
||||
(approved / "memo.wav").write_bytes(b"synthetic audio bytes")
|
||||
man = manifest(tmp_path, lane="voice_memos", root=approved, exts=[".wav"])
|
||||
result = run_cli(
|
||||
"--manifest",
|
||||
str(man),
|
||||
"--lane",
|
||||
"voice_memos",
|
||||
"--dry-run",
|
||||
"--whisper-url",
|
||||
"https://example.com/v1/audio/transcriptions",
|
||||
"--json",
|
||||
)
|
||||
out = json.loads(result.stdout)
|
||||
assert result.returncode == 2
|
||||
assert out["ok"] is False
|
||||
assert out["error"] == "fail_closed"
|
||||
assert out["reason"] == "whisper_url_scheme_not_http"
|
||||
assert out["mutations"] == {
|
||||
"obsidian": False,
|
||||
"rag": False,
|
||||
"vector_db": False,
|
||||
"sends": False,
|
||||
"file_moves": False,
|
||||
"routing": False,
|
||||
"memory": False,
|
||||
"service_restarts": False,
|
||||
}
|
||||
|
||||
|
||||
def test_localhost_whisper_url_is_allowed() -> None:
|
||||
module = load_script_module()
|
||||
assert module.validate_local_whisper_url("http://localhost:18816/v1/audio/transcriptions")
|
||||
assert module.validate_local_whisper_url("http://127.0.0.1:18816/v1/audio/transcriptions")
|
||||
assert module.validate_local_whisper_url("http://[::1]:18816/v1/audio/transcriptions")
|
||||
@@ -0,0 +1,162 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import importlib.util
|
||||
import json
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
SCRIPT = Path(__file__).resolve().parents[1] / "scripts" / "npu-utilization-digest.py"
|
||||
spec = importlib.util.spec_from_file_location("npu_utilization_digest", SCRIPT)
|
||||
assert spec and spec.loader
|
||||
digest = importlib.util.module_from_spec(spec)
|
||||
sys.modules[spec.name] = digest
|
||||
spec.loader.exec_module(digest)
|
||||
|
||||
|
||||
def test_positive_sysfs_delta_required_for_proof(tmp_path, monkeypatch):
|
||||
busy = tmp_path / "busy"
|
||||
busy.write_text("100")
|
||||
|
||||
def fake_health(service, url, timeout, gate="none", mode="health_only"):
|
||||
return digest.ServiceRow(service=service, reachable=True, mode=mode, gate=gate), {"ok": True}
|
||||
|
||||
def fake_post(url, payload, timeout):
|
||||
busy.write_text("100")
|
||||
return 200, {"data": [{"embedding": [0.1, 0.2]}], "embedding_dim": 2, "npu_busy_delta_us": 7}
|
||||
|
||||
monkeypatch.setattr(digest, "health_row", fake_health)
|
||||
row = digest.probe_embeddings(1, busy_path=busy, post_json=fake_post)
|
||||
assert row.probe_ran is True
|
||||
assert row.proof_ok is False
|
||||
assert "no_positive_sysfs_delta" in row.warnings
|
||||
|
||||
|
||||
def test_embeddings_row_redacts_vectors(tmp_path):
|
||||
row = digest.ServiceRow(
|
||||
service="embeddings",
|
||||
reachable=True,
|
||||
probe_ran=True,
|
||||
proof_ok=True,
|
||||
calls=1,
|
||||
items=1,
|
||||
avg_ms=12.3,
|
||||
npu_delta_us=5,
|
||||
embedding_count=1,
|
||||
embedding_dim=3,
|
||||
mode="NPU",
|
||||
)
|
||||
summary = digest.build_summary([row], None, 5, "2026-06-05T14:20:00-07:00")
|
||||
text = digest.render_text(summary, [row])
|
||||
assert "embedding_count" not in text # counts are intentionally terse in text
|
||||
assert "0.1" not in text
|
||||
out = digest.write_jsonl(summary, [row], tmp_path)
|
||||
body = out.read_text()
|
||||
assert "embedding" in body # compact metadata key is okay
|
||||
assert "[0.1" not in body
|
||||
assert "embedding_dim" in body
|
||||
|
||||
|
||||
def test_classifier_dry_run_payload(tmp_path, monkeypatch):
|
||||
busy = tmp_path / "busy"
|
||||
busy.write_text("10")
|
||||
seen = {}
|
||||
|
||||
def fake_health(service, url, timeout, gate="none", mode="health_only"):
|
||||
return digest.ServiceRow(service=service, reachable=True, mode=mode, gate=gate), {"ok": True}
|
||||
|
||||
def fake_post(url, payload, timeout):
|
||||
seen.update(payload)
|
||||
busy.write_text("35")
|
||||
return 200, {"labels": {"tool_needed": True, "duplicate": False}, "npu_busy_delta_us": 25}
|
||||
|
||||
monkeypatch.setattr(digest, "health_row", fake_health)
|
||||
row = digest.probe_classifier(1, busy_path=busy, post_json=fake_post)
|
||||
assert seen["options"]["dry_run"] is True
|
||||
assert seen["options"]["include_evidence"] is False
|
||||
assert row.escalate == 1
|
||||
assert row.suppress == 0
|
||||
assert row.proof_ok is True
|
||||
|
||||
|
||||
def test_doc_triage_allowed_root_compact(tmp_path):
|
||||
row = digest.ServiceRow(
|
||||
service="doc_triage",
|
||||
reachable=True,
|
||||
probe_ran=True,
|
||||
proof_ok=True,
|
||||
files=1,
|
||||
avg_ms=9,
|
||||
npu_delta_us=11,
|
||||
allowed_roots_count=1,
|
||||
mode="NPU-via-embedding-service",
|
||||
gate="closed:private-root",
|
||||
)
|
||||
summary = digest.build_summary([row], None, 11, "2026-06-05T14:20:00-07:00")
|
||||
text = digest.render_text(summary, [row])
|
||||
assert "allowed_roots=1" in text
|
||||
assert str(tmp_path) not in text
|
||||
|
||||
|
||||
def test_advisory_gateway_health_only(monkeypatch):
|
||||
calls = []
|
||||
|
||||
def fake_get(url, timeout):
|
||||
return 200, {"ok": True}
|
||||
|
||||
def fake_post(url, payload, timeout):
|
||||
calls.append((url, payload))
|
||||
return 200, {}
|
||||
|
||||
monkeypatch.setattr(digest, "http_get_json", fake_get)
|
||||
monkeypatch.setattr(digest, "http_post_json", fake_post)
|
||||
row, _ = digest.health_row("advisory_gateway", digest.ADVISORY_HEALTH_URL, 1, gate="closed:advisory-post")
|
||||
assert row.reachable is True
|
||||
assert row.probe_ran is False
|
||||
assert row.mode == "health_only"
|
||||
assert calls == []
|
||||
|
||||
|
||||
def test_genai_loaded_false_skips_default_smoke(monkeypatch):
|
||||
def fake_health(service, url, timeout, gate="none", mode="health_only"):
|
||||
return digest.ServiceRow(service=service, reachable=True, mode=mode), {"ok": True, "loaded": False}
|
||||
|
||||
monkeypatch.setattr(digest, "health_row", fake_health)
|
||||
row = digest.probe_genai(1, include_smoke=False)
|
||||
assert row.probe_ran is False
|
||||
assert row.loaded is False
|
||||
assert row.reason == "skipped_cold_load"
|
||||
assert row.fallbacks == 1
|
||||
assert "skipped_cold_load" in row.warnings
|
||||
|
||||
|
||||
def test_disabled_proof_smokes_count_as_fallbacks(monkeypatch):
|
||||
def fake_health(service, url, timeout, gate="none", mode="health_only"):
|
||||
return digest.ServiceRow(service=service, reachable=True, mode=mode, gate=gate), {"ok": True, "loaded": True}
|
||||
|
||||
monkeypatch.setattr(digest, "health_row", fake_health)
|
||||
rows = [
|
||||
digest.probe_whisper(1, include_smoke=False),
|
||||
digest.probe_genai(1, include_smoke=False),
|
||||
digest.probe_doc_triage(1, include_smoke=False),
|
||||
]
|
||||
summary = digest.build_summary(rows, None, 0, "2026-06-05T14:20:00-07:00")
|
||||
assert summary["fallbacks"] == 3
|
||||
assert summary["warnings"] == {"skipped": 2, "smoke_disabled": 1}
|
||||
|
||||
|
||||
def test_jsonl_shape(tmp_path):
|
||||
rows = [digest.ServiceRow(service="embeddings", reachable=True, probe_ran=True, proof_ok=True, npu_delta_us=1)]
|
||||
summary = digest.build_summary(rows, None, 1, "2026-06-05T14:20:00-07:00")
|
||||
path = digest.write_jsonl(summary, rows, tmp_path)
|
||||
lines = [json.loads(line) for line in path.read_text().splitlines()]
|
||||
assert lines[0]["type"] == "summary"
|
||||
assert lines[1]["type"] == "service"
|
||||
assert lines[1]["service"] == "embeddings"
|
||||
|
||||
|
||||
def test_exit_codes(monkeypatch):
|
||||
rows = [digest.ServiceRow(service="embeddings", reachable=True, probe_ran=True, proof_ok=False, warnings=["no_positive_sysfs_delta"])]
|
||||
summary = digest.build_summary(rows, None, 0, "2026-06-05T14:20:00-07:00")
|
||||
monkeypatch.setattr(digest, "run", lambda args: (summary, rows))
|
||||
assert digest.main(["--no-write"]) == 0
|
||||
assert digest.main(["--no-write", "--strict-proof"]) == 2
|
||||
@@ -0,0 +1,170 @@
|
||||
import importlib.util
|
||||
import json
|
||||
import sys
|
||||
import types
|
||||
import unittest
|
||||
from argparse import Namespace
|
||||
from pathlib import Path
|
||||
from tempfile import TemporaryDirectory
|
||||
from typing import cast
|
||||
from unittest import mock
|
||||
|
||||
MODULE_PATH = Path(__file__).resolve().parents[1] / "scripts" / "npu_voice_audio_pipeline.py"
|
||||
|
||||
|
||||
def load_module():
|
||||
spec = importlib.util.spec_from_file_location("npu_voice_audio_pipeline", MODULE_PATH)
|
||||
assert spec is not None and spec.loader is not None
|
||||
module = importlib.util.module_from_spec(spec)
|
||||
sys.modules[spec.name] = module
|
||||
spec.loader.exec_module(module)
|
||||
return cast(types.ModuleType, module)
|
||||
|
||||
|
||||
class NpuVoiceAudioPipelineTests(unittest.TestCase):
|
||||
def setUp(self):
|
||||
self.pipeline = load_module()
|
||||
|
||||
def test_rejects_relative_audio_path(self):
|
||||
with self.assertRaisesRegex(self.pipeline.PipelineError, "audio_path_must_be_absolute"):
|
||||
self.pipeline.validate_audio_path("memo.wav", max_bytes=1024, max_audio_seconds=300)
|
||||
|
||||
def test_rejects_symlink_audio_path(self):
|
||||
with TemporaryDirectory() as tmp:
|
||||
root = Path(tmp)
|
||||
target = root / "memo.wav"
|
||||
target.write_bytes(b"RIFFfake")
|
||||
link = root / "link.wav"
|
||||
link.symlink_to(target)
|
||||
with self.assertRaisesRegex(self.pipeline.PipelineError, "audio_path_must_not_be_symlink"):
|
||||
self.pipeline.validate_audio_path(str(link), max_bytes=1024, max_audio_seconds=None)
|
||||
|
||||
def test_compact_labels_unwraps_classifier_label_values(self):
|
||||
labels = self.pipeline.compact_labels(
|
||||
{
|
||||
"labels": {
|
||||
"workflow_category": {"value": "media"},
|
||||
"tool_needed": {"value": True},
|
||||
"urgency": {"value": "high"},
|
||||
"safety_confirmation_required": {"value": False},
|
||||
}
|
||||
}
|
||||
)
|
||||
self.assertEqual(labels["workflow_category"], "media")
|
||||
self.assertTrue(labels["tool_needed"])
|
||||
self.assertEqual(labels["urgency"], "high")
|
||||
self.assertFalse(labels["safety_confirmation_required"])
|
||||
|
||||
def test_gate_blocks_missing_npu_proof(self):
|
||||
action_worthy, atlas_gate, next_gate = self.pipeline.decide_gate(
|
||||
"remind me to review logs",
|
||||
{"tool_needed": True, "urgency": "normal", "safety_confirmation_required": False},
|
||||
whisper_proven=False,
|
||||
classifier_proven=True,
|
||||
)
|
||||
self.assertTrue(action_worthy)
|
||||
self.assertEqual(atlas_gate, "blocked_missing_npu_proof")
|
||||
self.assertEqual(next_gate, "npu_proof_required")
|
||||
|
||||
def test_loopback_endpoint_policy_accepts_local_urls(self):
|
||||
allowed = [
|
||||
"http://localhost:18816/v1/audio/transcriptions",
|
||||
"https://localhost:18816/v1/audio/transcriptions",
|
||||
"http://127.0.0.1:18816/v1/audio/transcriptions",
|
||||
"http://127.42.0.9:18816/v1/audio/transcriptions",
|
||||
"http://[::1]:18816/v1/audio/transcriptions",
|
||||
]
|
||||
for url in allowed:
|
||||
with self.subTest(url=url):
|
||||
self.assertEqual(self.pipeline.validate_loopback_endpoint(url, label="whisper"), url)
|
||||
|
||||
def test_loopback_endpoint_policy_rejects_remote_urls(self):
|
||||
rejected = [
|
||||
"http://example.com:18816/v1/audio/transcriptions",
|
||||
"https://10.0.0.5:18816/v1/audio/transcriptions",
|
||||
"http://192.168.1.10:18816/v1/audio/transcriptions",
|
||||
"http://[2001:db8::1]:18816/v1/audio/transcriptions",
|
||||
"file:///tmp/audio.wav",
|
||||
]
|
||||
for url in rejected:
|
||||
with self.subTest(url=url):
|
||||
with self.assertRaisesRegex(self.pipeline.PipelineError, "whisper_url_.*not_.*|whisper_url_scheme_not_allowed"):
|
||||
self.pipeline.validate_loopback_endpoint(url, label="whisper")
|
||||
|
||||
def test_run_pipeline_rejects_remote_url_before_audio_read(self):
|
||||
args = Namespace(
|
||||
audio="/tmp/does-not-exist-remote-rejection-smoke.ogg",
|
||||
id="voice-smoke",
|
||||
source="local_file",
|
||||
title="synthetic smoke",
|
||||
language="en",
|
||||
whisper_url="http://example.com:18816/v1/audio/transcriptions",
|
||||
classifier_url="http://127.0.0.1:18819/v1/classify",
|
||||
dry_run=True,
|
||||
include_transcript=False,
|
||||
include_transcript_preview_chars=0,
|
||||
include_raw=False,
|
||||
max_bytes=1024 * 1024,
|
||||
max_audio_seconds=300,
|
||||
max_transcript_chars=6000,
|
||||
timeout=1,
|
||||
)
|
||||
with self.assertRaisesRegex(self.pipeline.PipelineError, "whisper_url_host_not_loopback"):
|
||||
self.pipeline.run_pipeline(args)
|
||||
|
||||
def test_run_pipeline_compact_success_with_mocked_services(self):
|
||||
with TemporaryDirectory() as tmp:
|
||||
audio = Path(tmp) / "memo.ogg"
|
||||
audio.write_bytes(b"not-real-audio-but-services-are-mocked")
|
||||
args = Namespace(
|
||||
audio=str(audio),
|
||||
id="voice-smoke",
|
||||
source="local_file",
|
||||
title="synthetic smoke",
|
||||
language="en",
|
||||
whisper_url="http://127.0.0.1:18816/v1/audio/transcriptions",
|
||||
classifier_url="http://127.0.0.1:18819/v1/classify",
|
||||
dry_run=True,
|
||||
include_transcript=False,
|
||||
include_transcript_preview_chars=0,
|
||||
include_raw=False,
|
||||
max_bytes=1024 * 1024,
|
||||
max_audio_seconds=300,
|
||||
max_transcript_chars=6000,
|
||||
timeout=1,
|
||||
)
|
||||
busy_values = iter([100, 150, 150, 225])
|
||||
with mock.patch.object(self.pipeline, "read_npu_busy_us", side_effect=lambda: next(busy_values)):
|
||||
with mock.patch.object(
|
||||
self.pipeline,
|
||||
"post_whisper",
|
||||
return_value={"text": "remind me to check npu logs", "npu_busy_delta_us": 50},
|
||||
):
|
||||
with mock.patch.object(
|
||||
self.pipeline,
|
||||
"post_json",
|
||||
return_value={
|
||||
"dry_run": True,
|
||||
"labels": {
|
||||
"workflow_category": {"value": "media"},
|
||||
"tool_needed": {"value": True},
|
||||
"urgency": {"value": "normal"},
|
||||
"safety_confirmation_required": {"value": False},
|
||||
},
|
||||
"npu_busy_delta_us": 75,
|
||||
"sysfs_npu_busy_delta_us": 75,
|
||||
},
|
||||
):
|
||||
result = self.pipeline.run_pipeline(args)
|
||||
self.assertTrue(result["ok"])
|
||||
self.assertEqual(result["external_sends"], 0)
|
||||
self.assertEqual(result["writes"], 0)
|
||||
self.assertEqual(result["whisper_sysfs_delta_us"], 50)
|
||||
self.assertEqual(result["classifier_observed_sysfs_delta_us"], 75)
|
||||
self.assertEqual(result["atlas_gate"], "advisory_only_not_sent")
|
||||
self.assertNotIn("transcript", result)
|
||||
json.dumps(result)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
unittest.main()
|
||||
Reference in New Issue
Block a user