# Cron and n8n advisory classifier contract

Status: dry-run specification and integration examples
Scope: cron and n8n alert/event classification through the OpenVINO advisory gateway
Gateway: `http://172.19.0.1:18830` from `n8n-agent` and host-local cron on the current bridge-bound service. Override `NPU_ADVISORY_GATEWAY_URL=http://127.0.0.1:18830` only if a localhost-bound instance is explicitly running.

## Authority boundary

This contract is advisory only. It may recommend one of `suppress`, `log`, `summarize`, or `escalate`, but it must not perform the action itself.

Every integration must preserve these authority flags:

```json
{
  "may_route": false,
  "may_write_memory": false,
  "may_send_external": false,
  "may_process_private_dirs": false,
  "may_execute_tools": false,
  "may_restart_services": false
}
```

Allowed side effects in dry-run mode:

- read an explicit cron/n8n event payload;
- call the advisory gateway classifier/generator;
- write compact local stdout or n8n execution logs;
- store metadata-only advisory counters if an existing log sink already does so.

Forbidden without separate explicit approval:

- outbound sends/pages/Discord/Telegram/email;
- service restarts, command execution, or tool calls;
- Hermes/Atlas routing changes;
- memory writes;
- broad private-directory processing;
- vector database mutation or reindexing.

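
The authority flags above lend themselves to a mechanical check. The following is a minimal Python sketch, not part of the gateway: the helper name `authority_violations` and the violation strings are hypothetical, and it assumes the authority object has already been parsed from JSON into a dict.

```python
# Hypothetical guard; the helper name and violation strings are assumptions.
REQUIRED_AUTHORITY_FLAGS = (
    "may_route",
    "may_write_memory",
    "may_send_external",
    "may_process_private_dirs",
    "may_execute_tools",
    "may_restart_services",
)


def authority_violations(authority: dict) -> list:
    """Return a list of problems; an empty list means every flag is present and false."""
    violations = []
    for flag in REQUIRED_AUTHORITY_FLAGS:
        if flag not in authority:
            violations.append(f"{flag}:missing")
        elif authority[flag] is not False:
            # Reject true, null, strings, and anything else that is not literal false.
            violations.append(f"{flag}:not_false")
    return violations
```

A wrapper could refuse to continue, and log only the violation list, whenever the result is non-empty. Treating a missing flag as a violation keeps the check fail-closed.
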
## Input event envelope

Cron and n8n producers should normalize events before classification. Keep this input small and avoid raw private payloads.

```json
{
  "schema": "cron_n8n_event_v1",
  "trace_id": "cron:service-health:2026-06-05T14:30:00Z",
  "source": "cron",
  "workflow": "npu-service-health",
  "event_kind": "health_check",
  "severity": "warning",
  "subject": "openvino-reranker health check repeated warning",
  "summary": "Two consecutive health probes reported timeout, no restart attempted.",
  "dedupe_key": "service:openvino-reranker:timeout",
  "observed_at": "2026-06-05T14:30:00Z",
  "stale_after_s": 900,
  "action_requested": false,
  "dry_run": true
}
```

Field rules:

- `source`: `cron` or `n8n`.
- `workflow`: compact job/workflow name, not a private URL.
- `subject` + `summary`: the only free text sent to the classifier.
- `dedupe_key`: stable non-secret key for duplicate detection by the caller.
- `stale_after_s`: caller-side freshness gate; stale events should not page.
- `action_requested`: true only when an upstream job is asking a human/Atlas to consider action.
- `dry_run`: must remain true for this phase.

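
One way to turn a normalized envelope into classifier text is sketched below. It is an illustration under assumptions: the function name `build_classifier_text` is hypothetical, and the `key=value` layout simply mirrors the sample request text used in this document's curl examples, which the gateway may or may not require.

```python
ALLOWED_SOURCES = {"cron", "n8n"}


def build_classifier_text(event: dict) -> str:
    """Flatten a cron_n8n_event_v1 envelope into one line of classifier text."""
    if event.get("schema") != "cron_n8n_event_v1":
        raise ValueError("unexpected schema")
    if event.get("source") not in ALLOWED_SOURCES:
        raise ValueError("source must be 'cron' or 'n8n'")
    if event.get("dry_run") is not True:
        raise ValueError("dry_run must remain true for this phase")
    # subject and summary are the only free-text fields that are forwarded.
    summary = str(event["summary"]).strip().rstrip(".")
    return (
        f"source={event['source']} workflow={event['workflow']} "
        f"severity={event['severity']} kind={event['event_kind']} "
        f"subject={event['subject']} summary={summary}; dry_run=true"
    )
```

Fields such as `dedupe_key`, `observed_at`, and `stale_after_s` are deliberately left out of the text because the field rules assign them to caller-side gates.
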
## Gateway classifier call

The current gateway endpoint `/v1/advisory/classify` accepts explicit text and wraps the classifier response in an `openvino_advisory_v1` envelope with NPU proof and authority fields.

Host cron example for the current bridge-bound service:

```bash
# POST one flattened event to the classifier and print only compact metadata.
curl -fsS http://172.19.0.1:18830/v1/advisory/classify \
  -H 'Content-Type: application/json' \
  -d '{
    "trace_id":"cron:service-health:sample",
    "text":"source=cron workflow=npu-service-health severity=warning kind=health_check subject=openvino-reranker repeated timeout summary=Two consecutive health probes reported timeout; no restart attempted; dry_run=true"
  }' | jq '{schema, mode, trace_id, npu_ok: .npu_proof.ok, npu_delta: .npu_proof.npu_busy_delta_us, authority, labels: .result.labels}'
```

n8n Docker-bridge example:

```bash
# Same endpoint from the n8n container; confirm the outbound-send flag stays false.
curl -fsS http://172.19.0.1:18830/v1/advisory/classify \
  -H 'Content-Type: application/json' \
  -d '{"trace_id":"n8n:swarm-health:sample","text":"source=n8n workflow=swarm-health-watchdog severity=critical kind=health_check subject=multiple services unhealthy summary=Health probe failed for three services; dry_run=true"}' \
  | jq '{mode, npu_ok: .npu_proof.ok, npu_delta: .npu_proof.npu_busy_delta_us, may_send_external: .authority.may_send_external}'
```

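
For callers that are not shell scripts, the same request can be built with the Python standard library. This is a sketch under assumptions: the helper names are hypothetical, and reading `NPU_ADVISORY_GATEWAY_URL` from the environment mirrors the override mentioned at the top of this document rather than any confirmed client behavior.

```python
import json
import os
import urllib.request

DEFAULT_GATEWAY = "http://172.19.0.1:18830"  # bridge-bound address from this document


def gateway_url() -> str:
    """Use the override only when it is explicitly set in the environment."""
    return os.environ.get("NPU_ADVISORY_GATEWAY_URL", DEFAULT_GATEWAY).rstrip("/")


def build_classify_request(trace_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) the POST for /v1/advisory/classify."""
    body = json.dumps({"trace_id": trace_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{gateway_url()}/v1/advisory/classify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(trace_id: str, text: str, timeout_s: float = 10.0) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_classify_request(trace_id, text)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

Separating request construction from sending keeps the network call in one place, so the payload can be inspected or logged in dry-run mode without contacting the gateway.
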
NPU proof gate: an HTTP 200 is not enough. Treat the classifier result as NPU-backed only when the response has both `.npu_proof.ok == true` and `.npu_proof.npu_busy_delta_us > 0`; a positive busy delta is the evidence that real inference ran.

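
The proof gate can be expressed as a small predicate. This is a minimal sketch that assumes the response has already been parsed into a dict; the function name `is_npu_backed` is hypothetical.

```python
def is_npu_backed(response: dict) -> bool:
    """True only when npu_proof.ok is literally true and the busy delta is positive."""
    proof = response.get("npu_proof")
    if not isinstance(proof, dict):
        return False
    delta = proof.get("npu_busy_delta_us")
    # bool is a subclass of int in Python, so exclude it explicitly.
    delta_is_number = isinstance(delta, (int, float)) and not isinstance(delta, bool)
    return proof.get("ok") is True and delta_is_number and delta > 0
```

A missing or malformed `npu_proof` object counts as a failed proof, so the caller never claims NPU-backed advice by default.
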
## Advisory decision envelope

Cron/n8n wrappers should map the gateway response plus caller-side freshness/deduplication state into this compact decision envelope:

```json
{
  "schema": "cron_n8n_advisory_decision_v1",
  "trace_id": "cron:service-health:2026-06-05T14:30:00Z",
  "source": "cron",
  "workflow": "npu-service-health",
  "dry_run": true,
  "recommendation": "summarize",
  "classification": "action_required",
  "confidence": 0.84,
  "reason_codes": ["warning_or_high_urgency", "fresh_event", "not_duplicate"],
  "npu_proof": {"required": true, "ok": true, "npu_busy_delta_us": 1234},
  "authority": {
    "may_route": false,
    "may_write_memory": false,
    "may_send_external": false,
    "may_process_private_dirs": false,
    "may_execute_tools": false,
    "may_restart_services": false
  },
  "next_gate": "human_or_atlas_review_required_before_any_side_effect"
}
```

Decision fields:

- `recommendation`: `suppress`, `log`, `summarize`, or `escalate`.
- `classification`: `duplicate`, `stale`, `no_op`, or `action_required` for v1 examples.
- `confidence`: use classifier urgency/category confidence when available; otherwise use a conservative wrapper score.
- `reason_codes`: compact machine-readable rationale, not raw payload text.
- `next_gate`: always a review/approval gate before side effects; the examples in this document use `none_in_dry_run` when the recommendation implies no side effect.

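
Assembling the envelope can be sketched as follows. The function name `build_decision` and its argument order are hypothetical, and the choice of `none_in_dry_run` for `suppress`/`log` is an assumption taken from the example decisions in this document.

```python
AUTHORITY_FLAGS = (
    "may_route", "may_write_memory", "may_send_external",
    "may_process_private_dirs", "may_execute_tools", "may_restart_services",
)
RECOMMENDATIONS = {"suppress", "log", "summarize", "escalate"}
CLASSIFICATIONS = {"duplicate", "stale", "no_op", "action_required"}
REVIEW_GATE = "human_or_atlas_review_required_before_any_side_effect"


def build_decision(event, classification, recommendation, confidence, reason_codes, npu_proof):
    """Assemble a cron_n8n_advisory_decision_v1 dict from already-computed parts."""
    if classification not in CLASSIFICATIONS:
        raise ValueError(f"unknown classification: {classification}")
    if recommendation not in RECOMMENDATIONS:
        raise ValueError(f"unknown recommendation: {recommendation}")
    needs_review = recommendation in {"summarize", "escalate"}
    return {
        "schema": "cron_n8n_advisory_decision_v1",
        "trace_id": event["trace_id"],
        "source": event["source"],
        "workflow": event["workflow"],
        "dry_run": True,  # fixed for this phase
        "recommendation": recommendation,
        "classification": classification,
        "confidence": round(float(confidence), 2),
        "reason_codes": list(reason_codes),
        "npu_proof": {
            "required": True,
            "ok": npu_proof.get("ok") is True,
            "npu_busy_delta_us": npu_proof.get("npu_busy_delta_us", 0),
        },
        # Authority flags are hard-coded false; nothing in the inputs can raise them.
        "authority": {flag: False for flag in AUTHORITY_FLAGS},
        "next_gate": REVIEW_GATE if needs_review else "none_in_dry_run",
    }
```

Only identifiers, enumerated values, numbers, and reason codes are copied into the result, which keeps raw payload text out of the decision log.
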
## Recommendation mapping

This is the v1 dry-run mapping. It is intentionally conservative and caller-side; the NPU classifier advises, the wrapper chooses a recommendation, and humans/Atlas retain authority.

| Caller/classifier signal | Classification | Recommendation | Dry-run behavior |
|---|---|---|---|
| Same `dedupe_key` observed inside caller cooldown | `duplicate` | `suppress` | Log compact duplicate count only. Do not send. |
| `observed_at + stale_after_s` is older than now | `stale` | `log` | Log stale event and age. Do not summarize/page. |
| Severity low/normal, no action requested, classifier urgency low/normal | `no_op` | `log` | Keep normal execution log only. |
| Warning/high urgency or action requested, NPU proof ok | `action_required` | `summarize` | Draft a local summary for review; no send/restart. |
| Critical severity or repeated failures and NPU proof ok | `action_required` | `escalate` | Recommend escalation to Atlas/human; wrapper still must not send/restart. |
| NPU proof missing or false | `action_required` or caller-specific | `log` | Log `npu_proof_failed`; do not claim NPU-backed advice. |

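
The table can be read as an ordered rule list. The sketch below is one possible interpretation, not a confirmed implementation: the function name `recommend`, the rule order (duplicate, then stale, then severity, then NPU proof), the severity and urgency vocabularies, and the `repeated_failures` reason code are all assumptions.

```python
def recommend(*, is_duplicate, is_stale, severity, action_requested=False,
              urgency="normal", repeated_failures=False, npu_ok=False):
    """Return (classification, recommendation, reason_codes) for one event."""
    # Caller-side gates come first and never depend on the classifier.
    if is_duplicate:
        return "duplicate", "suppress", ["dedupe_key_in_cooldown"]
    if is_stale:
        return "stale", "log", ["event_stale"]

    critical = severity == "critical" or repeated_failures
    elevated = severity in {"warning", "high"} or urgency == "high" or action_requested
    if not (critical or elevated):
        return "no_op", "log", ["normal_severity", "no_action_requested"]

    # Elevated events are downgraded to a log entry when NPU proof is missing.
    if not npu_ok:
        return "action_required", "log", ["npu_proof_failed"]

    reasons = []
    if severity == "critical":
        reasons.append("critical_severity")
    if repeated_failures:
        reasons.append("repeated_failures")
    if not critical:
        reasons.append("warning_or_high_urgency")
    if action_requested:
        reasons.append("action_requested")
    reasons.append("fresh_event")
    return "action_required", ("escalate" if critical else "summarize"), reasons
```

The function returns labels only. Acting on `summarize` or `escalate` remains behind the review gate described in the authority boundary.
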
## Required examples

### Duplicate -> suppress

Input summary:

```json
{"source":"cron","workflow":"npu-service-health","severity":"warning","dedupe_key":"service:reranker:timeout","summary":"Same timeout as prior run inside cooldown.","dry_run":true}
```

Decision:

```json
{"classification":"duplicate","recommendation":"suppress","reason_codes":["dedupe_key_in_cooldown"],"next_gate":"none_in_dry_run"}
```

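
Caller-side duplicate detection could look like the sketch below. Everything here is an assumption for illustration: the class name `CooldownTracker`, the in-memory storage, and the choice that duplicates do not extend the cooldown window. State held in memory only survives within one long-running process, so a cron job that starts a fresh process on each run would need some other source of prior observations, which this sketch does not cover.

```python
import time
from typing import Optional


class CooldownTracker:
    """Track dedupe keys and count repeats seen inside a fixed cooldown window."""

    def __init__(self, cooldown_s: float):
        self.cooldown_s = cooldown_s
        self._window_start = {}     # dedupe_key -> time the current window opened
        self.duplicate_counts = {}  # dedupe_key -> repeats inside the current window

    def is_duplicate(self, dedupe_key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        started = self._window_start.get(dedupe_key)
        if started is not None and now - started < self.cooldown_s:
            # Metadata-only counter; the event payload is never stored.
            self.duplicate_counts[dedupe_key] = self.duplicate_counts.get(dedupe_key, 0) + 1
            return True
        # First sighting, or the previous window expired: open a new window.
        self._window_start[dedupe_key] = now
        self.duplicate_counts[dedupe_key] = 0
        return False
```

Because the window is anchored at the first sighting, a steady stream of duplicates cannot suppress an alert forever; the first event after the cooldown expires is classified normally again.
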
### Stale -> log

Input summary:

```json
{"source":"n8n","workflow":"swarm-health-watchdog","severity":"warning","observed_at":"older_than_stale_after","stale_after_s":900,"summary":"Delayed webhook replay for an old probe.","dry_run":true}
```

Decision:

```json
{"classification":"stale","recommendation":"log","reason_codes":["event_stale"],"next_gate":"none_in_dry_run"}
```

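
The freshness gate compares `observed_at + stale_after_s` with the current time. The sketch below assumes `observed_at` is an ISO 8601 timestamp like the one in the input event envelope, and treats a timestamp without an offset as UTC; the function name `is_stale` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(observed_at: str, stale_after_s: int, now: Optional[datetime] = None) -> bool:
    """True when observed_at + stale_after_s lies strictly before now."""
    # Older Python versions do not accept a trailing "Z", so normalize it first.
    observed = datetime.fromisoformat(observed_at.replace("Z", "+00:00"))
    if observed.tzinfo is None:
        observed = observed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    current = now if now is not None else datetime.now(timezone.utc)
    return current > observed + timedelta(seconds=stale_after_s)
```

Passing `now` explicitly makes the check deterministic, which helps when replaying delayed webhook events in a dry run.
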
### No-op -> log

Input summary:

```json
{"source":"cron","workflow":"backup-check","severity":"normal","action_requested":false,"summary":"Backup completed and all expected files are present.","dry_run":true}
```

Decision:

```json
{"classification":"no_op","recommendation":"log","reason_codes":["normal_severity","no_action_requested"],"next_gate":"none_in_dry_run"}
```

### Action required -> summarize/escalate

Input summary:

```json
{"source":"n8n","workflow":"swarm-health-watchdog","severity":"critical","action_requested":true,"summary":"RAG and embeddings health failed repeatedly; no restart attempted.","dry_run":true}
```

Decision:

```json
{"classification":"action_required","recommendation":"escalate","reason_codes":["critical_severity","action_requested","fresh_event"],"next_gate":"human_or_atlas_review_required_before_any_side_effect"}
```

## Optional local summary draft

If the decision is `summarize` or `escalate`, a wrapper may request a bounded draft from `/v1/advisory/generate`:

```bash
# Request a short, token-bounded draft; the output stays local and non-authoritative.
curl -fsS http://172.19.0.1:18830/v1/advisory/generate \
  -H 'Content-Type: application/json' \
  -d '{"trace_id":"cron:service-health:sample","job":"summary","input":"Health check warning: openvino-reranker timed out twice; no restart attempted.","max_new_tokens":48}' \
  | jq '{mode, trace_id, npu_ok: .npu_proof.ok, authority, draft: .result.draft_text, final_authority: .result.final_authority}'
```

The draft remains non-authoritative. It must not be automatically sent externally or written to memory.

## n8n integration pattern

Recommended node chain for dry-run workflows:

```text
Schedule/Webhook/Failure Trigger
  -> Set normalized event envelope
  -> HTTP Request POST /v1/advisory/classify
  -> Code node maps decision envelope
  -> IF node on recommendation
       suppress/log: execution log only
       summarize/escalate: optional local summary draft, then execution log only
```

The IF node must not connect to outbound messaging, service restart, memory write, or Hermes routing nodes until a separate approval changes the authority boundary.

See `../examples/n8n-advisory-dry-run-fragment.json` for a sanitized node fragment.

## Cron integration pattern

Cron jobs should call a wrapper script that prints one compact line and exits successfully unless the wrapper itself fails. The wrapper should not page or restart.

Example crontab shape:

```text
*/15 * * * * /home/will/lab/swarm/openvino-advisory-gateway/examples/cron-advisory-dry-run.sh npu-service-health warning health_check "openvino-reranker timeout twice" "service:openvino-reranker:timeout" >> /home/will/.local/state/npu-advisory/cron.log 2>&1
```

See `../examples/cron-advisory-dry-run.sh`.

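
The output contract of such a wrapper (one compact line, exit code 0 unless the wrapper itself fails) can be illustrated in Python. This sketch is not the referenced shell script; the function names and the selection of logged fields are assumptions.

```python
import json
import sys

LOGGED_FIELDS = ("trace_id", "workflow", "classification", "recommendation", "reason_codes")


def compact_line(decision: dict) -> str:
    """Render a decision as a single JSON line holding metadata only."""
    fields = {name: decision.get(name) for name in LOGGED_FIELDS}
    fields["npu_ok"] = bool((decision.get("npu_proof") or {}).get("ok"))
    return json.dumps(fields, separators=(",", ":"), sort_keys=True)


def main(decision) -> int:
    """Print one line and return the process exit code."""
    try:
        print(compact_line(decision))
    except Exception as exc:
        # Only a wrapper failure yields a non-zero exit; the advisory outcome never does.
        print(f"wrapper_error={type(exc).__name__}", file=sys.stderr)
        return 1
    return 0
```

A script entry point would end with `sys.exit(main(decision))`, so that an `escalate` recommendation still exits with 0 and cron does not treat advice as a job failure.
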
## Verification checklist

- Gateway health is reachable on the intended interface.
- Classifier response includes `schema=openvino_advisory_v1`.
- All `.authority.*` side-effect flags are false.
- `.npu_proof.ok` is true and `.npu_proof.npu_busy_delta_us > 0` before claiming NPU-backed advice.
- Decision envelope is compact and contains only booleans/counts/paths/deltas/gates.
- Duplicate/stale/no-op/action-required examples remain dry-run only.
- No n8n workflow activation, outbound send, service restart, memory write, routing change, private-dir broadening, or vector DB mutation occurred.