feat(npu): add local context gate advisory
@@ -0,0 +1,89 @@
# OpenVINO Context Gate

Local-only Atlas/Hermes context-gate advisory prototype.

This first slice is CLI-only and dry-run by design. It takes a non-private query,
optionally asks the localhost classifier on `127.0.0.1:18819` for advisory labels,
and emits a compact typed context bundle plan. It does not retrieve private
content or change live Atlas/Hermes behavior.

## Safety invariants

Closed in v1:

- live Atlas/Hermes routing changes
- memory writes
- outbound sends
- tool execution by the sidecar
- service restarts
- vector DB mutation or reindexing
- private root broadening
- live config changes

The CLI only plans which source classes an authoritative Atlas/Hermes agent might
use later: `durable_memory`, `session_search`, `rag_search`, `repo_files`,
`live_system`, `web`, or `no_retrieval`.
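As a rough illustration of the planning-only contract, the source classes listed above can be treated as a closed set that any emitted plan is checked against. This is a sketch, not the script's actual code: `SOURCE_CLASSES` and `validate_source_plan` are hypothetical names introduced here.

```python
# Hypothetical sketch; these names are illustrative and not taken from the script.
SOURCE_CLASSES = frozenset({
    "durable_memory", "session_search", "rag_search", "repo_files",
    "live_system", "web", "no_retrieval",
})


def validate_source_plan(sources):
    """Return the plan as a list if every entry is a known source class."""
    unknown = [s for s in sources if s not in SOURCE_CLASSES]
    if unknown:
        # Reject anything outside the closed set instead of passing it through.
        raise ValueError(f"unknown source classes: {unknown}")
    return list(sources)
```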

NPU proof is strict: `npu_verified=true` is only emitted when a live classifier
request reports a positive endpoint NPU delta and a positive sysfs/endpoint sysfs
busy delta. HTTP 200 alone is never treated as proof. Offline and fallback modes
set `npu_verified=false` and include a warning.
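The strict rule above can be sketched as a small pure function. This is a minimal sketch under stated assumptions: the function name and the `live_request` parameter are hypothetical, and mapping the two deltas onto the `classifier_delta_us` and `outer_sysfs_delta_us` fields shown in the compact output is a guess about how the script names them.

```python
from typing import Optional


def npu_verified(live_request: bool,
                 classifier_delta_us: Optional[int],
                 outer_sysfs_delta_us: Optional[int]) -> bool:
    """Hypothetical check: True only when both busy deltas are positive."""
    if not live_request:
        # Offline and fallback modes never claim NPU proof.
        return False
    if classifier_delta_us is None or outer_sysfs_delta_us is None:
        # A missing measurement is inconclusive, so no proof.
        return False
    return classifier_delta_us > 0 and outer_sysfs_delta_us > 0
```

The HTTP status is deliberately not an input here, so a bare 200 response cannot influence the result.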

## Usage

Live classifier path, with compact terminal output:

```bash
python scripts/context-gate-advisory.py \
  --query "How do I check whether the RAG reranker is using the NPU?" \
  --format compact
```

Deterministic offline smoke, safe for unit-test hosts without NPU services:

```bash
python scripts/context-gate-advisory.py \
  --offline \
  --query "Write a haiku about Seattle rain." \
  --format compact-json
```

Fallback plan if the classifier is down:

```bash
python scripts/context-gate-advisory.py \
  --allow-offline-fallback \
  --query "Where did we leave the NPU context gate implementation plan?" \
  --context platform=kanban \
  --context repo_path=/home/will/lab/swarm \
  --format compact-json
```
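For callers that drive the CLI from a test harness, the invocations above could be assembled programmatically. The helper below is a hypothetical sketch: `build_advisory_command` is not part of this commit, and it uses only the flags that appear in the examples above. Whether `--offline` and `--allow-offline-fallback` may be combined is not stated, so the sketch does not enforce anything either way.

```python
def build_advisory_command(query, fmt="compact", offline=False,
                           allow_offline_fallback=False, context=None):
    """Hypothetical helper: build the argv list for context-gate-advisory.py."""
    cmd = ["python", "scripts/context-gate-advisory.py"]
    if offline:
        cmd.append("--offline")
    if allow_offline_fallback:
        cmd.append("--allow-offline-fallback")
    cmd += ["--query", query]
    # Each context entry becomes its own repeated --context key=value pair.
    for key, value in (context or {}).items():
        cmd += ["--context", f"{key}={value}"]
    cmd += ["--format", fmt]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Building a list rather than a shell string keeps the query text from being interpreted by a shell.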

## Output shape

Full JSON includes:

- `schema=atlas_context_gate_plan_v1`
- `dry_run=true`
- `query_class`
- `source_plan`
- `bundle_plan`
- `npu_proof`
- closed `authority`
- closed approval `gates`
- compact `warnings`

Compact output intentionally avoids raw private snippets and raw JSON dumps:

```text
ok=true schema=atlas_context_gate_plan_v1 bundle=OpsDebugBundle sources=live_system,repo_files,rag_search source_count=3 npu_verified=false classifier_delta_us=None outer_sysfs_delta_us=None gates=closed:route,memory,send,tools,restart,vector,private_roots,config warnings=offline_heuristic_classifier_no_npu_claim,npu_proof_inconclusive
```
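A test or log scraper might want to read that compact line back into fields. The parser below is a minimal sketch based only on the sample line above: `parse_compact_line` is a hypothetical name, and it assumes that values never contain spaces and that `gates` always has the `state:name,name` shape shown in the sample.

```python
def parse_compact_line(line):
    """Hypothetical parser for the space-separated key=value compact output."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # Skip anything that is not a key=value token.
        fields[key] = value
    # Comma-separated fields become lists; an empty value becomes an empty list.
    for key in ("sources", "warnings"):
        if key in fields:
            fields[key] = fields[key].split(",") if fields[key] else []
    if "gates" in fields:
        state, _, names = fields["gates"].partition(":")
        fields["gates"] = {"state": state, "names": names.split(",") if names else []}
    return fields
```

Scalar values such as `npu_verified` and `classifier_delta_us` are left as strings (`"false"`, `"None"`), since the sample alone does not pin down their types.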

## Notes for reviewers

- No HTTP service or systemd unit is added in this slice.
- The prototype does not call RAG, memory, session search, web, filesystem tools,
  or the advisory gateway. It only emits a plan.
- Unit tests use fake/offline classifier results and do not require live NPU.
- Optional live smoke may call only the local classifier endpoint and read
  `/sys/class/accel/accel0/device/npu_busy_time_us` for positive delta proof.
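The sysfs side of the positive-delta proof might look roughly like the sketch below: sample the busy counter before and after the classifier request, then subtract. The helper names are hypothetical, and the sketch assumes the sysfs file holds a single integer number of microseconds, which this README does not state explicitly.

```python
from pathlib import Path

# Path taken from the reviewer notes above.
NPU_BUSY_PATH = Path("/sys/class/accel/accel0/device/npu_busy_time_us")


def read_busy_us(path=NPU_BUSY_PATH):
    """Return the busy counter as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing device or unexpected content: report "unknown", never zero.
        return None


def busy_delta_us(before, after):
    """Return after - before, or None when either sample is missing."""
    if before is None or after is None:
        return None
    return after - before
```

A caller would take `before = read_busy_us()`, issue the classifier request, take `after = read_busy_us()`, and treat only a strictly positive `busy_delta_us(before, after)` as evidence. Returning `None` rather than `0` on read failure keeps "no measurement" distinct from "no NPU activity".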