feat(rag): add OpenVINO NPU embeddings service
@@ -32,7 +32,8 @@ local AI/search/voice services
  +--> SearXNG :18803
  +--> Brave MCP :18802
  +--> llama.cpp :18806
- +--> Ollama embeddings :18807
+ +--> Ollama embeddings :18807 (legacy/CPU fallback)
+ +--> OpenVINO NPU embeddings :18817
  +--> Kokoro TTS :18805
  +--> Whisper NPU :18816
  ```
@@ -121,7 +122,8 @@ Docker services:
  Host/user services:

  - `llama-server.service` — `:18806`, local llama.cpp OpenAI-compatible LLM
- - `ollama.service` — `:18807`, embeddings API
+ - `ollama.service` — `:18807`, legacy/CPU embeddings API fallback
+ - `openvino-embeddings.service` — `:18817`, OpenVINO NPU embeddings API (`/v1/embeddings`, `/api/embed`, `/api/embeddings`)
  - `docker-health-endpoint.service` — `:18809`, read-only container health for n8n
  - `obsidian-reindex-endpoint.service` — `:18810`, Obsidian/RAG reindex trigger
  - `url-content-extractor.service` — `:18812`, YouTube/PDF/web extraction
@@ -143,7 +145,8 @@ RAG/vector store:

  - ChromaDB path: `~/.hermes/data/rag-search/chroma/`
  - Reindex state/progress: `~/.hermes/data/rag-search/obsidian_index_state.json` and `obsidian_reindex_progress.json`
- - Embeddings backend: Ollama on `:18807`, normally `nomic-embed-text`
+ - RAG query/reindex embedding backend: still Ollama on `:18807` with `nomic-embed-text` until a deliberate full Chroma rebuild/migration is run.
+ - RAG/embedding health probe backend: OpenVINO NPU embeddings service on `:18817`, currently `bge-base-en-v1.5-int8-ov`.
  - Reindex endpoint: `POST :18810/reindex` for incremental updates, `POST :18810/reindex?full=true` for full semantic rebuilds, `GET :18810/semantic-health` to verify vectors plus a search smoke test.

  ## Monitoring model
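The endpoints named in this diff can be exercised from a short script. What follows is a minimal Python sketch and not part of the commit. It assumes the services listen on `127.0.0.1`, that `/v1/embeddings` accepts an OpenAI-style JSON body (`model`, `input`), and that responses are JSON; none of these is confirmed by the diff. The helper names (`embeddings_request`, `reindex_request`, `semantic_health_request`, `send`) are hypothetical.

```python
import json
import urllib.request

# Assumption: all services are reachable on localhost.
HOST = "http://127.0.0.1"


def embeddings_request(texts, model="bge-base-en-v1.5-int8-ov", port=18817):
    """Build (but do not send) a POST to the OpenVINO embeddings service.

    Assumption: /v1/embeddings takes an OpenAI-style body with
    "model" and "input" keys.
    """
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{HOST}:{port}/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reindex_request(full=False, port=18810):
    """Build a POST to /reindex; full=True appends ?full=true."""
    url = f"{HOST}:{port}/reindex" + ("?full=true" if full else "")
    return urllib.request.Request(url, data=b"", method="POST")


def semantic_health_request(port=18810):
    """Build a GET to /semantic-health."""
    return urllib.request.Request(f"{HOST}:{port}/semantic-health", method="GET")


def send(req, timeout=10):
    """Send a prepared request and decode the body as JSON (assumed format)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Building the requests separately from sending them keeps the URLs and payloads inspectable without touching the running services. With the services up, `send(semantic_health_request())` would perform the actual health call, and `send(reindex_request(full=True))` would trigger the full rebuild, so the latter should be run deliberately.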