docs: document NPU RAG embedding path
+22
-7
@@ -225,14 +225,25 @@ Implemented 2026-05-13.
 ### Architecture

-- **Vector store**: Hermes rag-search ChromaDB embedded at `~/.hermes/data/rag-search/chroma/` in the `obsidian` collection.
-- **Embeddings**: Ollama `nomic-embed-text` on port `18807` (768-dim vectors).
-- **Indexer**: `~/.hermes/skills/note-taking/rag-search/scripts/index_obsidian.py`
+- **Vector store**: Hermes rag-search ChromaDB embedded at `~/.hermes/data/rag-search/chroma/`; live Obsidian semantic endpoint uses collection `obsidian_bge_npu`.
+- **Embeddings**: OpenVINO Intel NPU service on `18817` using `bge-base-en-v1.5-int8-ov` (768-dim vectors). Legacy Ollama `nomic-embed-text` on `18807` remains available as rollback/comparison data.
+- **Indexer**: `~/.hermes/skills/note-taking/rag-search/scripts/reindex_obsidian.py`
 - **Chunking**: Markdown files are split by heading sections; long sections get sliding-window chunks (max 2000 chars, 200 char overlap). YAML frontmatter is extracted and stored as metadata.
-- **Search**: `~/.hermes/skills/note-taking/rag-search/scripts/search.py --index obsidian "query"`
-- **Cross-collection search**: `search.py "query"` now searches all three collections (`personal`, `docs`, `obsidian`) using the appropriate embedding backend per collection.
+- **Search**: `~/.hermes/skills/note-taking/rag-search/scripts/search.py --index obsidian "query"` or Hermes native `rag_search` tool.
+- **Cross-collection search**: `search.py "query"` searches `personal`, `docs`, and `obsidian` using the appropriate embedding backend per collection.

-### Index stats (2026-05-13)
+### Live BGE/NPU state (2026-06-03)
+
+- Collection: `obsidian_bge_npu`
+- Notes indexed: 194
+- Vector count: 466
+- Embedding backend: `http://127.0.0.1:18817`
+- Embedding model: `bge-base-en-v1.5-int8-ov`
+- OpenVINO device: Intel NPU via `openvino-embeddings.service`
+- Semantic health: `curl -fsS http://127.0.0.1:18810/semantic-health`
+- Embedding health: `curl -fsS http://127.0.0.1:18817/health`
+
+### Legacy index stats (2026-05-13)

 - 36 markdown files indexed
 - 231 chunks
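The chunking rule described in the Architecture list (split by heading, then sliding windows of at most 2000 characters with a 200-character overlap) could look roughly like the sketch below. This is an illustration, not the code in `reindex_obsidian.py`: the function names are hypothetical, frontmatter extraction is left out, and a `#` at the start of a line inside a fenced code block would be mistaken for a heading.

```python
import re

MAX_CHARS = 2000  # maximum chunk size stated in the notes
OVERLAP = 200     # characters shared by consecutive windows

# Assumption: a heading is any line starting with 1-6 '#' followed by a space.
HEADING = re.compile(r"^#{1,6} ", re.MULTILINE)


def split_sections(markdown: str) -> list[str]:
    """Split a Markdown body into sections that each start at a heading."""
    starts = [m.start() for m in HEADING.finditer(markdown)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text that precedes the first heading
    bounds = zip(starts, starts[1:] + [len(markdown)])
    return [markdown[a:b].strip() for a, b in bounds if markdown[a:b].strip()]


def window(text: str, size: int = MAX_CHARS, overlap: int = OVERLAP) -> list[str]:
    """Return the text unchanged if it fits, else overlapping fixed-size windows."""
    if len(text) <= size:
        return [text]
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the section
    return chunks


def chunk_markdown(markdown: str) -> list[str]:
    """Heading sections first, sliding windows only for the long ones."""
    return [c for section in split_sections(markdown) for c in window(section)]
```

With a 2000-character window and 200-character overlap, each window starts 1800 characters after the previous one, so text that straddles a window boundary appears whole in at least one chunk as long as it is shorter than the overlap.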
@@ -242,7 +253,7 @@ Implemented 2026-05-13.
 ### Incremental updates

-- File content SHA-256 hashes tracked in `~/.hermes/data/rag-search/obsidian_index_state.json`.
+- File content SHA-256 hashes are tracked in the collection-specific state file, e.g. `~/.hermes/data/rag-search/obsidian_bge_npu_index_state.json` for the live BGE/NPU collection.
 - Only changed files are re-indexed on subsequent runs.
 - Deleted files have their chunks removed from ChromaDB.
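The incremental update rule above can be sketched as a comparison between current SHA-256 hashes and a saved state file. The state file layout is an assumption here (a flat JSON object mapping vault-relative paths to hex digests); the real `*_index_state.json` format is not shown in this commit, and `plan_update` is a hypothetical name.

```python
import hashlib
import json
from pathlib import Path


def file_hash(path: Path) -> str:
    """SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_update(vault: Path, state_file: Path) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare current hashes with the saved state.

    Returns (changed, deleted, new_state), with paths relative to the vault.
    Assumed state format: {"relative/path.md": "<sha256 hex>"}.
    """
    old = json.loads(state_file.read_text()) if state_file.exists() else {}
    new = {str(p.relative_to(vault)): file_hash(p) for p in sorted(vault.rglob("*.md"))}
    # New files have no old hash, so they count as changed too.
    changed = [rel for rel, digest in new.items() if old.get(rel) != digest]
    deleted = sorted(set(old) - set(new))
    return changed, deleted, new
```

A caller would re-embed the `changed` files, remove the stored chunks for the `deleted` ones, and only then write `new_state` back to the state file, so an interrupted run is retried rather than silently skipped.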
@@ -267,6 +278,10 @@ curl -X POST http://127.0.0.1:18810/reindex | python3 -m json.tool
 ~/.hermes/skills/note-taking/rag-search/venv/bin/python \
 ~/.hermes/skills/note-taking/rag-search/scripts/search.py --index obsidian "health monitoring"

+# Check live semantic and NPU embedding health
+curl -fsS http://127.0.0.1:18810/semantic-health | python3 -m json.tool
+curl -fsS http://127.0.0.1:18817/health | python3 -m json.tool
+
 # Check ChromaDB data
 du -sh ~/.hermes/data/rag-search/chroma/
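The two `curl -fsS` health checks added in this hunk could also be run from Python, for example inside a monitoring script. The sketch below uses only the two URLs that appear in the commit; the function names and the rule "healthy means HTTP 200" are assumptions, since the response bodies of these endpoints are not documented here.

```python
import urllib.error
import urllib.request

# URLs taken from the health check commands in this commit.
ENDPOINTS = {
    "semantic": "http://127.0.0.1:18810/semantic-health",
    "embeddings": "http://127.0.0.1:18817/health",
}


def is_healthy(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True only when the endpoint answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


def check_all(opener=urllib.request.urlopen) -> dict[str, bool]:
    """Probe every known endpoint and report a name -> healthy mapping."""
    return {name: is_healthy(url, opener=opener) for name, url in ENDPOINTS.items()}
```

The `opener` parameter exists only so the probe can be exercised without a running service; in normal use `check_all()` is called with no arguments.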
+8
-1
@@ -64,9 +64,16 @@ Most service containers run on Will's laptop/host network and publish local/LAN
 ### Ollama
 - **Port:** `18807`
-- **Role:** embeddings runtime for OpenClaw memory search
+- **Role:** legacy/rollback embeddings runtime for memory/RAG search
 - **Model:** `nomic-embed-text`

+### OpenVINO embeddings
+- **Port:** `18817`
+- **Unit:** `openvino-embeddings.service`
+- **Role:** default embeddings service for live Obsidian RAG via Intel NPU
+- **Model:** `bge-base-en-v1.5-int8-ov`
+- **Health:** `http://127.0.0.1:18817/health`
+
 ## Adjacent storage / infra

 ### MinIO
@@ -1,7 +1,7 @@
 ---
 type: service-catalog
 created: 2026-05-14T14:50:46-07:00
-updated: 2026-05-27T12:12:06-07:00
+updated: 2026-06-03T21:31:01-07:00
 tags:
 - service-catalog
 - swarm
@@ -13,7 +13,7 @@ tags:
 Canonical index of local services, automation tools, Hermes capabilities, and where to find their operational docs.

-> Generated by Atlas from live system inventory on `2026-05-14T14:50:46-07:00`; high-risk local AI/service rows refreshed on `2026-05-27T12:12:06-07:00`. Secrets are intentionally omitted.
+> Generated by Atlas from live system inventory on `2026-05-14T14:50:46-07:00`; high-risk local AI/service rows refreshed on `2026-05-27T12:12:06-07:00`; Obsidian/RAG embedding path refreshed on `2026-06-03T21:31:01-07:00`. Secrets are intentionally omitted.

 ## Quick links
@@ -53,7 +53,8 @@ Canonical index of local services, automation tools, Hermes capabilities, and wh
 | Whisper CPU | 18811 | OK 200 | Whisper.cpp CPU STT fallback | `http://127.0.0.1:18811/` |
 | URL extractor | 18812 | OK 200 | URL/PDF/YouTube content extractor | `http://127.0.0.1:18812/healthz` |
 | Voice memo processor | 18813 | OK 200 | Voice memo processor | `http://127.0.0.1:18813/healthz` |
-| RAG/embedding health | 18814 | OK 200 | RAG/Ollama/Obsidian health wrapper | `http://127.0.0.1:18814/healthz` |
+| RAG/embedding health | 18814 | OK 200 | RAG/OpenVINO/Obsidian health wrapper | `http://127.0.0.1:18814/healthz` |
+| OpenVINO embeddings | 18817 | OK 200 | Intel NPU embeddings service for live Obsidian RAG | `http://127.0.0.1:18817/health` |
 | Obsidian REST HTTP | 27123 | OK 200 | Obsidian Local REST API HTTP | `http://127.0.0.1:27123/` |

 ## Docker services
@@ -90,7 +91,8 @@ Important known services:
 | `obsidian-reindex-endpoint.service` | Obsidian/RAG reindex endpoint on 18810 |
 | `url-content-extractor.service` | URL/PDF/YouTube extraction on 18812 |
 | `voice-memo-processor.service` | Voice memo processing on 18813 |
-| `rag-embedding-health.service` | RAG/Ollama/Obsidian health check wrapper on 18814 |
+| `rag-embedding-health.service` | RAG/OpenVINO/Obsidian health check wrapper on 18814 |
+| `openvino-embeddings.service` | Intel NPU BGE embedding service on 18817 |

 Useful checks: