feat(rag): add OpenVINO NPU embeddings service
@@ -89,7 +89,7 @@
<g><rect x="965" y="775" width="210" height="60" rx="9" fill="#0f172a"/><rect x="965" y="775" width="210" height="60" rx="9" fill="rgba(76,29,149,.4)" stroke="#a78bfa" stroke-width="1.6"/><text x="1070" y="802" text-anchor="middle" class="title">Obsidian / RAG</text><text x="1070" y="822" text-anchor="middle" class="port">:27123/:27124 + ChromaDB</text></g>
<!-- host local ai box -->
- <g><rect x="280" y="675" width="190" height="100" rx="10" fill="#0f172a"/><rect x="280" y="675" width="190" height="100" rx="10" fill="rgba(76,29,149,.4)" stroke="#a78bfa" stroke-width="1.8"/><text x="375" y="706" text-anchor="middle" class="title">host local AI</text><text x="375" y="730" text-anchor="middle" class="tiny">llama.cpp :18806</text><text x="375" y="752" text-anchor="middle" class="tiny">Ollama embed :18807</text></g>
+ <g><rect x="280" y="675" width="210" height="120" rx="10" fill="#0f172a"/><rect x="280" y="675" width="210" height="120" rx="10" fill="rgba(76,29,149,.4)" stroke="#a78bfa" stroke-width="1.8"/><text x="385" y="706" text-anchor="middle" class="title">host local AI</text><text x="385" y="730" text-anchor="middle" class="tiny">llama.cpp :18806</text><text x="385" y="752" text-anchor="middle" class="tiny">Ollama fallback :18807</text><text x="385" y="774" text-anchor="middle" class="tiny">OpenVINO NPU embed :18817</text></g>
<!-- legend -->
<g transform="translate(40,820)">
@@ -104,7 +104,7 @@
</div>
<div class="cards">
<div class="info"><h3>Monitoring model</h3><ul><li>• n8n direct probes critical ports</li><li>• agentmon aggregates Docker/OpenClaw snapshots</li><li>• n8n polls agentmon for stale/degraded state</li></ul></div>
- <div class="info"><h3>Operational endpoints</h3><ul><li>• n8n: 127.0.0.1:18808</li><li>• agentmon query/UI: 8081 / 8082</li><li>• local LLM/embed: 18806 / 18807</li></ul></div>
+ <div class="info"><h3>Operational endpoints</h3><ul><li>• n8n: 127.0.0.1:18808</li><li>• agentmon query/UI: 8081 / 8082</li><li>• local LLM/embed: 18806 / 18817</li><li>• Ollama fallback: 18807</li></ul></div>
<div class="info"><h3>Source paths</h3><ul><li>• Swarm repo: ~/lab/swarm</li><li>• Agentmon repo: ~/lab/agentmon</li><li>• Workflows: swarm-common/n8n-workflows</li></ul></div>
</div>
<div class="footer">Generated as repo documentation. Open locally in a browser; no JavaScript, all SVG inline.</div>
@@ -32,7 +32,8 @@ local AI/search/voice services
+--> SearXNG :18803
+--> Brave MCP :18802
+--> llama.cpp :18806
- +--> Ollama embeddings :18807
+ +--> Ollama embeddings :18807 (legacy/CPU fallback)
+ +--> OpenVINO NPU embeddings :18817
+--> Kokoro TTS :18805
+--> Whisper NPU :18816
```
@@ -121,7 +122,8 @@ Docker services:
Host/user services:
- `llama-server.service` — `:18806`, local llama.cpp OpenAI-compatible LLM
-- `ollama.service` — `:18807`, embeddings API
+- `ollama.service` — `:18807`, legacy/CPU embeddings API fallback
+- `openvino-embeddings.service` — `:18817`, OpenVINO NPU embeddings API (`/v1/embeddings`, `/api/embed`, `/api/embeddings`)
- `docker-health-endpoint.service` — `:18809`, read-only container health for n8n
- `obsidian-reindex-endpoint.service` — `:18810`, Obsidian/RAG reindex trigger
- `url-content-extractor.service` — `:18812`, YouTube/PDF/web extraction
@@ -143,7 +145,8 @@ RAG/vector store:
- ChromaDB path: `~/.hermes/data/rag-search/chroma/`
- Reindex state/progress: `~/.hermes/data/rag-search/obsidian_index_state.json` and `obsidian_reindex_progress.json`
-- Embeddings backend: Ollama on `:18807`, normally `nomic-embed-text`
+- RAG query/reindex embedding backend: still Ollama on `:18807` with `nomic-embed-text` until a deliberate full Chroma rebuild/migration is run.
+- RAG/embedding health probe backend: OpenVINO NPU embeddings service on `:18817`, currently `bge-base-en-v1.5-int8-ov`.
- Reindex endpoint: `POST :18810/reindex` for incremental updates, `POST :18810/reindex?full=true` for full semantic rebuilds, `GET :18810/semantic-health` to verify vectors plus a search smoke test.
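
The endpoints listed above can be exercised with a few request builders. The following is a minimal sketch, not the project's actual client: it assumes the services listen on `127.0.0.1`, and that the OpenVINO service's `/v1/embeddings` route accepts an OpenAI-style JSON body with `model` and `input` fields. The function names (`embeddings_request`, `reindex_request`, `semantic_health_request`) are hypothetical helpers introduced only for illustration.

```python
import json
from urllib.request import Request

# Assumption: all services are reachable on loopback.
HOST = "127.0.0.1"
OPENVINO_PORT = 18817   # OpenVINO NPU embeddings service
REINDEX_PORT = 18810    # Obsidian/RAG reindex endpoint
HEALTH_PROBE_MODEL = "bge-base-en-v1.5-int8-ov"


def embeddings_request(texts, model=HEALTH_PROBE_MODEL, port=OPENVINO_PORT):
    """Build (but do not send) a POST to /v1/embeddings.

    Assumption: the body follows the OpenAI embeddings convention
    ({"model": ..., "input": [...]}).
    """
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return Request(
        f"http://{HOST}:{port}/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reindex_request(full=False):
    """Build a POST to /reindex; full=True requests a full semantic rebuild."""
    query = "?full=true" if full else ""
    return Request(f"http://{HOST}:{REINDEX_PORT}/reindex{query}", method="POST")


def semantic_health_request():
    """Build a GET to /semantic-health (vector check plus search smoke test)."""
    return Request(f"http://{HOST}:{REINDEX_PORT}/semantic-health", method="GET")


if __name__ == "__main__":
    # Print the requests instead of sending them, so this runs offline.
    for req in (
        embeddings_request(["health probe"]),
        reindex_request(),
        reindex_request(full=True),
        semantic_health_request(),
    ):
        print(req.get_method(), req.full_url)
```

Any of these objects can be passed to `urllib.request.urlopen` to actually send the request. Building them separately keeps the port and path wiring easy to check without touching a live service.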
## Monitoring model