docs: document NPU RAG embedding path
+22
-7
@@ -225,14 +225,25 @@ Implemented 2026-05-13.
 
 ### Architecture
 
-- **Vector store**: Hermes rag-search ChromaDB embedded at `~/.hermes/data/rag-search/chroma/` in the `obsidian` collection.
-- **Embeddings**: Ollama `nomic-embed-text` on port `18807` (768-dim vectors).
-- **Indexer**: `~/.hermes/skills/note-taking/rag-search/scripts/index_obsidian.py`
+- **Vector store**: Hermes rag-search ChromaDB embedded at `~/.hermes/data/rag-search/chroma/`; live Obsidian semantic endpoint uses collection `obsidian_bge_npu`.
+- **Embeddings**: OpenVINO Intel NPU service on `18817` using `bge-base-en-v1.5-int8-ov` (768-dim vectors). Legacy Ollama `nomic-embed-text` on `18807` remains available as rollback/comparison data.
+- **Indexer**: `~/.hermes/skills/note-taking/rag-search/scripts/reindex_obsidian.py`
 - **Chunking**: Markdown files are split by heading sections; long sections get sliding-window chunks (max 2000 chars, 200 char overlap). YAML frontmatter is extracted and stored as metadata.
-- **Search**: `~/.hermes/skills/note-taking/rag-search/scripts/search.py --index obsidian "query"`
-- **Cross-collection search**: `search.py "query"` now searches all three collections (`personal`, `docs`, `obsidian`) using the appropriate embedding backend per collection.
+- **Search**: `~/.hermes/skills/note-taking/rag-search/scripts/search.py --index obsidian "query"` or Hermes native `rag_search` tool.
+- **Cross-collection search**: `search.py "query"` searches `personal`, `docs`, and `obsidian` using the appropriate embedding backend per collection.
 
-### Index stats (2026-05-13)
+### Live BGE/NPU state (2026-06-03)
 
+- Collection: `obsidian_bge_npu`
+- Notes indexed: 194
+- Vector count: 466
+- Embedding backend: `http://127.0.0.1:18817`
+- Embedding model: `bge-base-en-v1.5-int8-ov`
+- OpenVINO device: Intel NPU via `openvino-embeddings.service`
+- Semantic health: `curl -fsS http://127.0.0.1:18810/semantic-health`
+- Embedding health: `curl -fsS http://127.0.0.1:18817/health`
+
+### Legacy index stats (2026-05-13)
+
 - 36 markdown files indexed
 - 231 chunks
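
The chunking rule in the hunk above (split by heading, then sliding windows of at most 2000 characters with 200 characters of overlap) can be sketched roughly as follows. This is an illustration only, not the code in `reindex_obsidian.py`: the function names `split_by_headings`, `sliding_window`, and `chunk_markdown` are hypothetical, and YAML frontmatter extraction is left out.

```python
import re

MAX_CHARS = 2000  # maximum chunk size stated in the docs
OVERLAP = 200     # overlap between consecutive windows stated in the docs


def split_by_headings(markdown: str) -> list[str]:
    """Split Markdown into sections, each starting at a heading line.

    Assumption: a heading is any line starting with 1-6 '#' and a space.
    Lines that look like headings inside fenced code are not special-cased.
    """
    parts = re.split(r"(?m)^(?=#{1,6} )", markdown)
    return [p.strip() for p in parts if p.strip()]


def sliding_window(text: str, max_chars: int = MAX_CHARS,
                   overlap: int = OVERLAP) -> list[str]:
    """Return overlapping windows covering `text`; short text is one chunk."""
    if len(text) <= max_chars:
        return [text]
    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reaches the end of the section
    return chunks


def chunk_markdown(markdown: str) -> list[str]:
    """Heading sections first, then windows for any section that is too long."""
    return [chunk
            for section in split_by_headings(markdown)
            for chunk in sliding_window(section)]
```

With these parameters each window advances by 1800 characters, so consecutive chunks of a long section share exactly 200 characters.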
@@ -242,7 +253,7 @@ Implemented 2026-05-13.
 
 ### Incremental updates
 
-- File content SHA-256 hashes tracked in `~/.hermes/data/rag-search/obsidian_index_state.json`.
+- File content SHA-256 hashes are tracked in the collection-specific state file, e.g. `~/.hermes/data/rag-search/obsidian_bge_npu_index_state.json` for the live BGE/NPU collection.
 - Only changed files are re-indexed on subsequent runs.
 - Deleted files have their chunks removed from ChromaDB.
 
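
The incremental-update behaviour in the hunk above can be sketched as a hash comparison against a saved state file. This is a minimal sketch under assumptions: the real layout of the state JSON is not shown in the diff, so a flat mapping of relative path to SHA-256 hex digest is assumed here, and `file_sha256` and `plan_incremental_update` are hypothetical names.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_incremental_update(vault: Path, state_file: Path):
    """Compare current note hashes with the saved state.

    Returns (changed, deleted, new_state):
      changed   - notes that are new or whose content hash differs
      deleted   - notes in the old state that no longer exist on disk
      new_state - mapping to write back after a successful reindex
    Assumed state format: {"relative/path.md": "<sha256 hex>", ...}
    """
    old_state = json.loads(state_file.read_text()) if state_file.exists() else {}
    new_state = {
        p.relative_to(vault).as_posix(): file_sha256(p)
        for p in sorted(vault.rglob("*.md"))
    }
    changed = [name for name, digest in new_state.items()
               if old_state.get(name) != digest]
    deleted = [name for name in old_state if name not in new_state]
    return changed, deleted, new_state
```

Only the `changed` notes would need re-embedding, and chunks belonging to `deleted` notes would be removed from the collection before `new_state` is saved.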
@@ -267,6 +278,10 @@ curl -X POST http://127.0.0.1:18810/reindex | python3 -m json.tool
 ~/.hermes/skills/note-taking/rag-search/venv/bin/python \
 ~/.hermes/skills/note-taking/rag-search/scripts/search.py --index obsidian "health monitoring"
 
+# Check live semantic and NPU embedding health
+curl -fsS http://127.0.0.1:18810/semantic-health | python3 -m json.tool
+curl -fsS http://127.0.0.1:18817/health | python3 -m json.tool
+
 # Check ChromaDB data
 du -sh ~/.hermes/data/rag-search/chroma/
 
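
For scripted monitoring, the two `curl` health checks added above could be approximated in Python with the standard library alone. This is a sketch, not part of the commit: the URLs are taken from the diff, the response bodies are assumed to be JSON (the documented commands pipe them into `python3 -m json.tool`), and `check` and `ENDPOINTS` are hypothetical names.

```python
import json
import urllib.request

# Endpoints as documented in the diff above.
ENDPOINTS = {
    "semantic": "http://127.0.0.1:18810/semantic-health",
    "embedding": "http://127.0.0.1:18817/health",
}


def check(url: str, timeout: float = 5.0) -> dict:
    """Fetch a health endpoint and return its parsed JSON body.

    Connection failures and HTTP error statuses (urllib raises for 4xx/5xx,
    much as `curl -f` fails on them) and non-JSON bodies are reported as
    {"error": "..."} instead of raising.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError) as exc:
        return {"error": str(exc)}


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        print(name, json.dumps(check(url), indent=2))
```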
+8
-1
@@ -64,9 +64,16 @@ Most service containers run on Will's laptop/host network and publish local/LAN
 
 ### Ollama
 - **Port:** `18807`
-- **Role:** embeddings runtime for OpenClaw memory search
+- **Role:** legacy/rollback embeddings runtime for memory/RAG search
 - **Model:** `nomic-embed-text`
 
+### OpenVINO embeddings
+- **Port:** `18817`
+- **Unit:** `openvino-embeddings.service`
+- **Role:** default embeddings service for live Obsidian RAG via Intel NPU
+- **Model:** `bge-base-en-v1.5-int8-ov`
+- **Health:** `http://127.0.0.1:18817/health`
+
 ## Adjacent storage / infra
 
 ### MinIO
@@ -1,7 +1,7 @@
 ---
 type: service-catalog
 created: 2026-05-14T14:50:46-07:00
-updated: 2026-05-27T12:12:06-07:00
+updated: 2026-06-03T21:31:01-07:00
 tags:
 - service-catalog
 - swarm
@@ -13,7 +13,7 @@ tags:
 
 Canonical index of local services, automation tools, Hermes capabilities, and where to find their operational docs.
 
-> Generated by Atlas from live system inventory on `2026-05-14T14:50:46-07:00`; high-risk local AI/service rows refreshed on `2026-05-27T12:12:06-07:00`. Secrets are intentionally omitted.
+> Generated by Atlas from live system inventory on `2026-05-14T14:50:46-07:00`; high-risk local AI/service rows refreshed on `2026-05-27T12:12:06-07:00`; Obsidian/RAG embedding path refreshed on `2026-06-03T21:31:01-07:00`. Secrets are intentionally omitted.
 
 ## Quick links
 
@@ -53,7 +53,8 @@ Canonical index of local services, automation tools, Hermes capabilities, and wh
 | Whisper CPU | 18811 | OK 200 | Whisper.cpp CPU STT fallback | `http://127.0.0.1:18811/` |
 | URL extractor | 18812 | OK 200 | URL/PDF/YouTube content extractor | `http://127.0.0.1:18812/healthz` |
 | Voice memo processor | 18813 | OK 200 | Voice memo processor | `http://127.0.0.1:18813/healthz` |
-| RAG/embedding health | 18814 | OK 200 | RAG/Ollama/Obsidian health wrapper | `http://127.0.0.1:18814/healthz` |
+| RAG/embedding health | 18814 | OK 200 | RAG/OpenVINO/Obsidian health wrapper | `http://127.0.0.1:18814/healthz` |
+| OpenVINO embeddings | 18817 | OK 200 | Intel NPU embeddings service for live Obsidian RAG | `http://127.0.0.1:18817/health` |
 | Obsidian REST HTTP | 27123 | OK 200 | Obsidian Local REST API HTTP | `http://127.0.0.1:27123/` |
 
 ## Docker services
@@ -90,7 +91,8 @@ Important known services:
 | `obsidian-reindex-endpoint.service` | Obsidian/RAG reindex endpoint on 18810 |
 | `url-content-extractor.service` | URL/PDF/YouTube extraction on 18812 |
 | `voice-memo-processor.service` | Voice memo processing on 18813 |
-| `rag-embedding-health.service` | RAG/Ollama/Obsidian health check wrapper on 18814 |
+| `rag-embedding-health.service` | RAG/OpenVINO/Obsidian health check wrapper on 18814 |
+| `openvino-embeddings.service` | Intel NPU BGE embedding service on 18817 |
 
 Useful checks:
 