swarm-master/openvino-advisory-gateway/docs/cron-n8n-advisory-classifier.md
2026-06-05 15:52:43 -07:00

Cron and n8n advisory classifier contract

Status: dry-run specification and integration examples
Scope: cron and n8n alert/event classification through the OpenVINO advisory gateway
Gateway: http://172.19.0.1:18830 from n8n-agent and host-local cron on the current bridge-bound service. Override NPU_ADVISORY_GATEWAY_URL=http://127.0.0.1:18830 only if a localhost-bound instance is explicitly running.

Authority boundary

This contract is advisory only. It may recommend one of suppress, log, summarize, or escalate, but it must not perform the action itself.

Every integration must preserve these authority flags:

{
  "may_route": false,
  "may_write_memory": false,
  "may_send_external": false,
  "may_process_private_dirs": false,
  "may_execute_tools": false,
  "may_restart_services": false
}
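A wrapper can check this boundary mechanically before it uses any response. The following is a minimal sketch, assuming the flags arrive as a parsed JSON object shaped like the one above; the helper name `authority_violations` is hypothetical and not part of the gateway.

```python
# Hypothetical helper for illustration; not part of the gateway API.
REQUIRED_FALSE_FLAGS = (
    "may_route",
    "may_write_memory",
    "may_send_external",
    "may_process_private_dirs",
    "may_execute_tools",
    "may_restart_services",
)


def authority_violations(authority: dict) -> list[str]:
    """Return every flag that is missing or not exactly False."""
    return [
        flag for flag in REQUIRED_FALSE_FLAGS
        if authority.get(flag) is not False
    ]
```

An empty list means the boundary is intact. Treating a missing flag as a violation is a deliberately strict choice for dry-run mode.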

Allowed side effects in dry-run mode:

  • read an explicit cron/n8n event payload;
  • call the advisory gateway classifier/generator;
  • write compact local stdout or n8n execution logs;
  • store metadata-only advisory counters if an existing log sink already does so.

Forbidden without separate explicit approval:

  • outbound sends/pages/Discord/Telegram/email;
  • service restarts, command execution, or tool calls;
  • Hermes/Atlas routing changes;
  • memory writes;
  • broad private-directory processing;
  • vector database mutation or reindexing.

Input event envelope

Cron and n8n producers should normalize events before classification. Keep this input small and avoid raw private payloads.

{
  "schema": "cron_n8n_event_v1",
  "trace_id": "cron:service-health:2026-06-05T14:30:00Z",
  "source": "cron",
  "workflow": "npu-service-health",
  "event_kind": "health_check",
  "severity": "warning",
  "subject": "openvino-reranker health check repeated warning",
  "summary": "Two consecutive health probes reported timeout, no restart attempted.",
  "dedupe_key": "service:openvino-reranker:timeout",
  "observed_at": "2026-06-05T14:30:00Z",
  "stale_after_s": 900,
  "action_requested": false,
  "dry_run": true
}

Field rules:

  • source: cron or n8n.
  • workflow: compact job/workflow name, not a private URL.
  • subject + summary: the only free text sent to the classifier; other fields contribute short key=value labels at most.
  • dedupe_key: stable non-secret key for duplicate detection by the caller.
  • stale_after_s: caller-side freshness gate; stale events should not page.
  • action_requested: true only when an upstream job is asking a human/Atlas to consider action.
  • dry_run: must remain true for this phase.
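The field rules above can be checked before classification. This is a minimal sketch, assuming the envelope is a parsed `cron_n8n_event_v1` dict and that `observed_at` is an ISO 8601 UTC timestamp as in the example; the function names and problem codes are hypothetical.

```python
from datetime import datetime, timedelta, timezone

ALLOWED_SOURCES = {"cron", "n8n"}


def envelope_problems(event: dict) -> list[str]:
    """Return rule violations for an event envelope (empty list if none)."""
    problems = []
    if event.get("source") not in ALLOWED_SOURCES:
        problems.append("source_must_be_cron_or_n8n")
    # A workflow name should be a compact label, never a URL.
    if "://" in str(event.get("workflow", "")):
        problems.append("workflow_must_not_be_url")
    if event.get("dry_run") is not True:
        problems.append("dry_run_must_be_true")
    return problems


def is_stale(event: dict, now: datetime) -> bool:
    """Caller-side freshness gate: True when observed_at + stale_after_s < now."""
    # fromisoformat() before Python 3.11 rejects a trailing "Z", so normalize it.
    observed = datetime.fromisoformat(event["observed_at"].replace("Z", "+00:00"))
    return observed + timedelta(seconds=event["stale_after_s"]) < now
```

Pass a timezone-aware `now`, for example `datetime.now(timezone.utc)`, so the comparison with `observed_at` is well defined.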

Gateway classifier call

The gateway's current /v1/advisory/classify endpoint accepts explicit text and wraps the classifier response in an openvino_advisory_v1 envelope that carries NPU proof and authority fields.

Host cron example for the current bridge-bound service:

curl -fsS http://172.19.0.1:18830/v1/advisory/classify \
  -H 'Content-Type: application/json' \
  -d '{
    "trace_id":"cron:service-health:sample",
    "text":"source=cron workflow=npu-service-health severity=warning kind=health_check subject=openvino-reranker repeated timeout summary=Two consecutive health probes reported timeout; no restart attempted; dry_run=true"
  }' | jq '{schema, mode, trace_id, npu_ok: .npu_proof.ok, npu_delta: .npu_proof.npu_busy_delta_us, authority, labels: .result.labels}'

n8n Docker-bridge example:

curl -fsS http://172.19.0.1:18830/v1/advisory/classify \
  -H 'Content-Type: application/json' \
  -d '{"trace_id":"n8n:swarm-health:sample","text":"source=n8n workflow=swarm-health-watchdog severity=critical kind=health_check subject=multiple services unhealthy summary=Health probe failed for three services; dry_run=true"}' \
  | jq '{mode, npu_ok: .npu_proof.ok, npu_delta: .npu_proof.npu_busy_delta_us, may_send_external: .authority.may_send_external}'
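For wrappers written in Python rather than shell, the same call can be sketched with the standard library. This is an illustration only: the `classifier_text` flattening copies the key=value shape of the curl examples above, and the function names and the 10 second timeout are assumptions, not gateway requirements.

```python
import json
import os
import urllib.request

# Default is the bridge-bound address used in this document; the override
# variable is the NPU_ADVISORY_GATEWAY_URL named in the header.
GATEWAY_URL = os.environ.get("NPU_ADVISORY_GATEWAY_URL", "http://172.19.0.1:18830")


def classifier_text(event: dict) -> str:
    """Flatten an envelope into the key=value text shape of the curl examples."""
    return (
        f"source={event['source']} workflow={event['workflow']} "
        f"severity={event['severity']} kind={event['event_kind']} "
        f"subject={event['subject']} summary={event['summary']}; "
        f"dry_run={str(event['dry_run']).lower()}"
    )


def build_classify_request(event: dict, base_url: str = GATEWAY_URL) -> urllib.request.Request:
    """Build (but do not send) the POST to /v1/advisory/classify."""
    body = json.dumps({"trace_id": event["trace_id"], "text": classifier_text(event)})
    return urllib.request.Request(
        f"{base_url}/v1/advisory/classify",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(event: dict, timeout_s: float = 10.0) -> dict:
    """Send the request and decode the JSON response; raises on HTTP errors."""
    with urllib.request.urlopen(build_classify_request(event), timeout=timeout_s) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the payload inspectable in tests without touching the network.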

NPU proof gate: an HTTP 200 alone is not enough. Treat the classification as NPU-backed only when both .npu_proof.ok == true and .npu_proof.npu_busy_delta_us > 0, that is, when the response reports NPU busy time for the inference.
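The gate reduces to a small predicate over the parsed response. A minimal sketch, assuming the response is the decoded JSON body with the `npu_proof` fields shown in the jq filters above; the function name `npu_backed` is hypothetical.

```python
def npu_backed(response: dict) -> bool:
    """True only when npu_proof.ok is true and npu_busy_delta_us is positive."""
    proof = response.get("npu_proof") or {}
    delta = proof.get("npu_busy_delta_us")
    return (
        proof.get("ok") is True
        and isinstance(delta, (int, float))
        and not isinstance(delta, bool)  # bool is an int subclass; reject it
        and delta > 0
    )
```

A missing `npu_proof` object, a non-boolean `ok`, or a zero delta all fail the gate, so the wrapper never claims NPU-backed advice by default.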

Advisory decision envelope

Cron/n8n wrappers should map the gateway response plus caller-side freshness/deduplication state into this compact decision envelope:

{
  "schema": "cron_n8n_advisory_decision_v1",
  "trace_id": "cron:service-health:2026-06-05T14:30:00Z",
  "source": "cron",
  "workflow": "npu-service-health",
  "dry_run": true,
  "recommendation": "summarize",
  "classification": "action_required",
  "confidence": 0.84,
  "reason_codes": ["warning_or_high_urgency", "fresh_event", "not_duplicate"],
  "npu_proof": {"required": true, "ok": true, "npu_busy_delta_us": 1234},
  "authority": {
    "may_route": false,
    "may_write_memory": false,
    "may_send_external": false,
    "may_process_private_dirs": false,
    "may_execute_tools": false,
    "may_restart_services": false
  },
  "next_gate": "human_or_atlas_review_required_before_any_side_effect"
}

Decision fields:

  • recommendation: suppress, log, summarize, or escalate.
  • classification: duplicate, stale, no_op, or action_required for v1 examples.
  • confidence: use classifier urgency/category confidence when available; otherwise use a conservative wrapper score.
  • reason_codes: compact machine-readable rationale, not raw payload text.
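  • next_gate: a review/approval gate before any side effect; the suppress and log examples in this document use none_in_dry_run because they involve no side effect.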
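Assembling the envelope in one place helps keep the authority flags and gate consistent. The sketch below is one possible builder, not the reference implementation: `build_decision` is a hypothetical name, and its choice of `next_gate` follows the worked examples in this document (review gate for summarize/escalate, `none_in_dry_run` for suppress/log).

```python
ALLOWED_RECOMMENDATIONS = {"suppress", "log", "summarize", "escalate"}
ALLOWED_CLASSIFICATIONS = {"duplicate", "stale", "no_op", "action_required"}
REVIEW_GATE = "human_or_atlas_review_required_before_any_side_effect"
AUTHORITY_FLAGS = (
    "may_route", "may_write_memory", "may_send_external",
    "may_process_private_dirs", "may_execute_tools", "may_restart_services",
)


def build_decision(event: dict, classification: str, recommendation: str,
                   confidence: float, reason_codes: list, npu_proof: dict) -> dict:
    """Build a cron_n8n_advisory_decision_v1 envelope with fixed authority flags."""
    if recommendation not in ALLOWED_RECOMMENDATIONS:
        raise ValueError(f"unknown recommendation: {recommendation!r}")
    if classification not in ALLOWED_CLASSIFICATIONS:
        raise ValueError(f"unknown classification: {classification!r}")
    needs_review = recommendation in {"summarize", "escalate"}
    return {
        "schema": "cron_n8n_advisory_decision_v1",
        "trace_id": event["trace_id"],
        "source": event["source"],
        "workflow": event["workflow"],
        "dry_run": True,  # fixed for this phase
        "recommendation": recommendation,
        "classification": classification,
        "confidence": round(float(confidence), 2),
        "reason_codes": list(reason_codes),
        "npu_proof": {
            "required": True,
            "ok": bool(npu_proof.get("ok")),
            "npu_busy_delta_us": npu_proof.get("npu_busy_delta_us", 0),
        },
        # Authority is never copied from input; it is always all-false here.
        "authority": {flag: False for flag in AUTHORITY_FLAGS},
        "next_gate": REVIEW_GATE if needs_review else "none_in_dry_run",
    }
```

Hard-coding `dry_run` and the authority flags, rather than passing them through, means a malformed upstream payload cannot widen the boundary.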

Recommendation mapping

This is the v1 dry-run mapping. It is intentionally conservative and caller-side; the NPU classifier advises, the wrapper chooses a recommendation, and humans/Atlas retain authority.

| Caller/classifier signal | Classification | Recommendation | Dry-run behavior |
|---|---|---|---|
| Same dedupe_key observed inside caller cooldown | duplicate | suppress | Log compact duplicate count only. Do not send. |
| observed_at + stale_after_s is older than now | stale | log | Log stale event and age. Do not summarize/page. |
| Severity low/normal, no action requested, classifier urgency low/normal | no_op | log | Keep normal execution log only. |
| Warning/high urgency or action requested, NPU proof ok | action_required | summarize | Draft a local summary for review; no send/restart. |
| Critical severity or repeated failures and NPU proof ok | action_required | escalate | Recommend escalation to Atlas/human; wrapper still must not send/restart. |
| NPU proof missing or false | action_required or caller-specific | log | Log npu_proof_failed; do not claim NPU-backed advice. |
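The mapping can be expressed as a single pure function. The sketch below is one reading of the table, with assumptions called out: rows are evaluated top to bottom (duplicate, then stale, then no-op, then NPU proof, then escalate before summarize), the urgency label values are guesses at what the classifier returns, and the function name `recommend` is hypothetical.

```python
# Assumption: severities/urgencies at or above "warning" count as elevated.
ELEVATED = {"warning", "high", "critical"}


def recommend(*, duplicate: bool, stale: bool, severity: str,
              urgency: str = "normal", action_requested: bool = False,
              repeated_failures: bool = False,
              npu_ok: bool = False) -> tuple[str, str, list[str]]:
    """Return (classification, recommendation, reason_codes) for one event."""
    if duplicate:
        return "duplicate", "suppress", ["dedupe_key_in_cooldown"]
    if stale:
        return "stale", "log", ["event_stale"]
    elevated = severity in ELEVATED or urgency in ELEVATED
    if not (elevated or action_requested or repeated_failures):
        return "no_op", "log", [f"{severity}_severity", "no_action_requested"]
    if not npu_ok:
        # Never claim NPU-backed advice without proof; log only.
        return "action_required", "log", ["npu_proof_failed"]
    if severity == "critical" or repeated_failures:
        codes = ["critical_severity"] if severity == "critical" else ["repeated_failures"]
        if action_requested:
            codes.append("action_requested")
        return "action_required", "escalate", codes + ["fresh_event"]
    codes = ["warning_or_high_urgency"] if elevated else []
    if action_requested:
        codes.append("action_requested")
    return "action_required", "summarize", codes + ["fresh_event", "not_duplicate"]
```

The function only returns labels. Acting on `escalate` or `summarize` remains outside the wrapper, behind the review gate.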

Required examples

Duplicate -> suppress

Input summary:

{"source":"cron","workflow":"npu-service-health","severity":"warning","dedupe_key":"service:reranker:timeout","summary":"Same timeout as prior run inside cooldown.","dry_run":true}

Decision:

{"classification":"duplicate","recommendation":"suppress","reason_codes":["dedupe_key_in_cooldown"],"next_gate":"none_in_dry_run"}

Stale -> log

Input summary:

{"source":"n8n","workflow":"swarm-health-watchdog","severity":"warning","observed_at":"older_than_stale_after","stale_after_s":900,"summary":"Delayed webhook replay for an old probe.","dry_run":true}

Decision:

{"classification":"stale","recommendation":"log","reason_codes":["event_stale"],"next_gate":"none_in_dry_run"}

No-op -> log

Input summary:

{"source":"cron","workflow":"backup-check","severity":"normal","action_requested":false,"summary":"Backup completed and all expected files are present.","dry_run":true}

Decision:

{"classification":"no_op","recommendation":"log","reason_codes":["normal_severity","no_action_requested"],"next_gate":"none_in_dry_run"}

Action required -> summarize/escalate

Input summary:

{"source":"n8n","workflow":"swarm-health-watchdog","severity":"critical","action_requested":true,"summary":"RAG and embeddings health failed repeatedly; no restart attempted.","dry_run":true}

Decision:

{"classification":"action_required","recommendation":"escalate","reason_codes":["critical_severity","action_requested","fresh_event"],"next_gate":"human_or_atlas_review_required_before_any_side_effect"}

Optional local summary draft

If the decision is summarize or escalate, a wrapper may request a bounded draft from /v1/advisory/generate:

curl -fsS http://172.19.0.1:18830/v1/advisory/generate \
  -H 'Content-Type: application/json' \
  -d '{"trace_id":"cron:service-health:sample","job":"summary","input":"Health check warning: openvino-reranker timed out twice; no restart attempted.","max_new_tokens":48}' \
  | jq '{mode, trace_id, npu_ok: .npu_proof.ok, authority, draft: .result.draft_text, final_authority: .result.final_authority}'

The draft remains non-authoritative. It must not be automatically sent externally or written to memory.
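The "summarize or escalate only" condition can be enforced where the request body is built. A minimal sketch, assuming a decision dict shaped like the advisory decision envelope; `draft_request_body` is a hypothetical helper, and the body fields mirror the curl example above.

```python
import json
from typing import Optional

DRAFT_RECOMMENDATIONS = {"summarize", "escalate"}


def draft_request_body(decision: dict, summary_text: str,
                       max_new_tokens: int = 48) -> Optional[str]:
    """Return the JSON body for /v1/advisory/generate, or None if no draft applies."""
    if decision.get("recommendation") not in DRAFT_RECOMMENDATIONS:
        return None
    return json.dumps({
        "trace_id": decision["trace_id"],
        "job": "summary",
        "input": summary_text,
        "max_new_tokens": max_new_tokens,
    })
```

Returning `None` for suppress/log decisions lets the caller skip the generate call entirely instead of discarding a draft afterwards.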

n8n integration pattern

Recommended node chain for dry-run workflows:

Schedule/Webhook/Failure Trigger
  -> Set normalized event envelope
  -> HTTP Request POST /v1/advisory/classify
  -> Code node maps decision envelope
  -> IF node on recommendation
      suppress/log: execution log only
      summarize/escalate: optional local summary draft, then execution log only

The IF node must not connect to outbound messaging, service restart, memory write, or Hermes routing nodes until a separate approval changes the authority boundary.

See ../examples/n8n-advisory-dry-run-fragment.json for a sanitized node fragment.

Cron integration pattern

Cron jobs should call a wrapper script that prints one compact line and exits successfully unless the wrapper itself fails. The wrapper should not page or restart.

Example crontab shape:

*/15 * * * * /home/will/lab/swarm/openvino-advisory-gateway/examples/cron-advisory-dry-run.sh npu-service-health warning health_check "openvino-reranker timeout twice" "service:openvino-reranker:timeout" >> /home/will/.local/state/npu-advisory/cron.log 2>&1

See ../examples/cron-advisory-dry-run.sh.
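The "one compact line" behavior can be illustrated independently of the shell script referenced above. This sketch does not reproduce that script; it assumes a decision dict shaped like the advisory decision envelope, and the names `compact_line` and `emit` are hypothetical.

```python
import json

# Only metadata keys are kept; anything else in the dict is dropped.
KEEP_KEYS = ("trace_id", "workflow", "classification",
             "recommendation", "reason_codes", "next_gate")


def compact_line(decision: dict) -> str:
    """Serialize a decision to one compact JSON line with metadata only."""
    slim = {key: decision[key] for key in KEEP_KEYS if key in decision}
    slim["npu_ok"] = bool(decision.get("npu_proof", {}).get("ok"))
    return json.dumps(slim, separators=(",", ":"), sort_keys=True)


def emit(decision: dict) -> int:
    """Print the line and return exit code 0; no paging, no restart."""
    print(compact_line(decision))
    return 0
```

Because the line is valid JSON, the appended cron log can later be filtered per line with standard JSON tooling.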

Verification checklist

  • Gateway health is reachable on the intended interface.
  • Classifier response includes schema=openvino_advisory_v1.
  • All .authority.* side-effect flags are false.
  • .npu_proof.ok is true and npu_busy_delta_us > 0 before claiming NPU-backed advice.
  • Decision envelope is compact and contains only metadata (labels, booleans, counts, paths, deltas, gates), never raw payload text.
  • Duplicate/stale/no-op/action-required examples remain dry-run only.
  • No n8n workflow activation, outbound send, service restart, memory write, routing change, private-dir broadening, or vector DB mutation occurred.