# OpenVINO Context Gate

Local-only Atlas/Hermes context-gate advisory prototype.

This first slice is CLI-only and dry-run by design. It takes a non-private query,
optionally asks the localhost classifier on `127.0.0.1:18819` for advisory labels,
and emits a compact typed context bundle plan. It does not retrieve private
content or change live Atlas/Hermes behavior.

## Safety invariants

Closed in v1:

- live Atlas/Hermes routing changes
- memory writes
- outbound sends
- tool execution by the sidecar
- service restarts
- vector DB mutation or reindexing
- private root broadening
- live config changes

The CLI only plans which source classes an authoritative Atlas/Hermes agent might
use later: `durable_memory`, `session_search`, `rag_search`, `repo_files`,
`live_system`, `web`, or `no_retrieval`.

NPU proof is strict: `npu_verified=true` is only emitted when a live classifier
request reports a positive endpoint NPU delta and a positive sysfs/endpoint sysfs
busy delta. HTTP 200 alone is never treated as proof. Offline and fallback modes
set `npu_verified=false` and include a warning.

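
The strict proof rule above can be sketched as a small predicate. This is an illustrative reading, not the prototype's actual code: it assumes the rule reduces to two busy-time deltas that must both be present and positive, and the function and parameter names (`npu_verified`, `endpoint_delta_us`, `sysfs_delta_us`) are hypothetical.

```python
from typing import Optional


def npu_verified(endpoint_delta_us: Optional[int],
                 sysfs_delta_us: Optional[int]) -> bool:
    """Return True only when both busy-time deltas are present and positive.

    Both parameter names are hypothetical. A missing (None) delta is treated
    as inconclusive, so a successful HTTP response alone never counts as proof.
    """
    return (
        endpoint_delta_us is not None and endpoint_delta_us > 0
        and sysfs_delta_us is not None and sysfs_delta_us > 0
    )
```

Under this reading, offline and fallback runs would pass `None` for both deltas and therefore always yield `False`.
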
## Usage

Live classifier path, with compact terminal output:

```bash
python scripts/context-gate-advisory.py \
  --query "How do I check whether the RAG reranker is using the NPU?" \
  --format compact
```

Deterministic offline smoke test, safe for unit-test hosts without NPU services:

```bash
python scripts/context-gate-advisory.py \
  --offline \
  --query "Write a haiku about Seattle rain." \
  --format compact-json
```

Fallback plan if the classifier is down:

```bash
python scripts/context-gate-advisory.py \
  --allow-offline-fallback \
  --query "Where did we leave the NPU context gate implementation plan?" \
  --context platform=kanban \
  --context repo_path=/home/will/lab/swarm \
  --format compact-json
```

## Output shape

Full JSON includes:

- `schema=atlas_context_gate_plan_v1`
- `dry_run=true`
- `query_class`
- `source_plan`
- `bundle_plan`
- `npu_proof`
- closed `authority`
- closed approval `gates`
- compact `warnings`

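
A reviewer could sanity-check a full JSON plan against the fields listed above with something like the following. It is a minimal sketch: it assumes every listed field is a top-level key and checks only the two documented values (`schema` and `dry_run`); `check_plan` and `EXPECTED_KEYS` are hypothetical names, and the value types of the other fields are not specified here.

```python
import json
from typing import List

# Field names taken from the list above; assumed to be top-level keys.
EXPECTED_KEYS = {
    "schema", "dry_run", "query_class", "source_plan", "bundle_plan",
    "npu_proof", "authority", "gates", "warnings",
}


def check_plan(raw: str) -> List[str]:
    """Return a list of problems; an empty list means the plan looks as documented."""
    plan = json.loads(raw)
    problems = [f"missing key: {key}" for key in sorted(EXPECTED_KEYS - set(plan))]
    if plan.get("schema") != "atlas_context_gate_plan_v1":
        problems.append("unexpected schema")
    if plan.get("dry_run") is not True:
        problems.append("dry_run is not true")
    return problems
```
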
Compact output intentionally avoids raw private snippets and raw JSON dumps:

```text
ok=true schema=atlas_context_gate_plan_v1 bundle=OpsDebugBundle sources=live_system,repo_files,rag_search source_count=3 npu_verified=false classifier_delta_us=None outer_sysfs_delta_us=None gates=closed:route,memory,send,tools,restart,vector,private_roots,config warnings=offline_heuristic_classifier_no_npu_claim,npu_proof_inconclusive
```

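
For scripting against the compact line, a small parser might look like the sketch below. It assumes the format is whitespace-separated `key=value` tokens whose values contain no spaces, which holds for the sample above but is not a documented guarantee; `parse_compact` is a hypothetical helper, not part of the CLI.

```python
from typing import Dict


def parse_compact(line: str) -> Dict[str, str]:
    """Split a compact output line into a dict of raw string values.

    Assumes whitespace-separated key=value tokens with no spaces in values.
    """
    fields: Dict[str, str] = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"token without '=': {token!r}")
        fields[key] = value
    return fields


# Shortened sample built from fields shown in the example line above.
sample = (
    "ok=true schema=atlas_context_gate_plan_v1 bundle=OpsDebugBundle "
    "sources=live_system,repo_files,rag_search source_count=3 npu_verified=false"
)
fields = parse_compact(sample)
sources = fields["sources"].split(",")  # comma-separated values split on demand
```

Values are kept as strings on purpose: `npu_verified=false` and `classifier_delta_us=None` are printed forms, and converting them to booleans or integers is left to the caller.
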
## Notes for reviewers

- No HTTP service or systemd unit is added in this slice.
- The prototype does not call RAG, memory, session search, web, filesystem tools,
  or the advisory gateway. It only emits a plan.
- Unit tests use fake/offline classifier results and do not require a live NPU.
- An optional live smoke test may call only the local classifier endpoint and read
  `/sys/class/accel/accel0/device/npu_busy_time_us` for positive delta proof.

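
The sysfs side of the positive delta check could be measured roughly as follows. This is a sketch under assumptions: it treats the counter file as holding a single integer number of microseconds, and `read_busy_us` and `busy_delta_us` are hypothetical helpers rather than functions from the prototype.

```python
from pathlib import Path
from typing import Callable, Optional

# Path taken from the note above; the single-integer file format is an assumption.
NPU_BUSY_PATH = Path("/sys/class/accel/accel0/device/npu_busy_time_us")


def read_busy_us(path: Path = NPU_BUSY_PATH) -> Optional[int]:
    """Read the busy-time counter, or return None if it is missing or unparsable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def busy_delta_us(work: Callable[[], object],
                  path: Path = NPU_BUSY_PATH) -> Optional[int]:
    """Run work() and return the counter change, or None if either read failed."""
    before = read_busy_us(path)
    work()
    after = read_busy_us(path)
    if before is None or after is None:
        return None
    return after - before
```

On a host without the NPU driver, both helpers return `None`, which a caller would report as inconclusive rather than as proof.
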