Implement rag-search skill for semantic search

Add new skill for semantic search across personal state files and
external documentation using ChromaDB and sentence-transformers.

Components:
- search.py: Main search interface (--index, --top-k flags)
- index_personal.py: Index ~/.claude/state files
- index_docs.py: Index external docs (git repos)
- add_doc_source.py: Manage doc sources
- test_rag.py: Test suite (5/5 passing)

Features:
- Two indexes: personal (116 chunks) and docs (k0s: 846 chunks)
- all-MiniLM-L6-v2 embeddings (384 dimensions)
- ChromaDB persistent storage
- JSON output with ranked results and metadata

Documentation:
- Added to component-registry.json with triggers
- Added /rag command alias
- Updated skills/README.md
- Resolved fc-013 (vector database for agent memory)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

123
skills/rag-search/SKILL.md
Normal file
@@ -0,0 +1,123 @@
---
name: rag-search
description: Semantic search across personal state files and external documentation
triggers: [search, find, lookup, what did, how did, when did, past decisions, previous, documentation, docs]
---

# RAG Search Skill

Semantic search across two indexes:
- **personal**: Your state files, memory, decisions, preferences
- **docs**: External documentation (k0s, ArgoCD, etc.)

## When to Use

- "What decisions did I make about X?"
- "How did I configure Y?"
- "What does the k0s documentation say about Z?"
- "Find my past notes on..."
- Cross-referencing personal context with official docs

## Scripts

All scripts use the venv at `~/.claude/skills/rag-search/venv/`.

### Search (Primary Interface)

```bash
# Search both indexes
~/.claude/skills/rag-search/venv/bin/python \
    ~/.claude/skills/rag-search/scripts/search.py "query"

# Search specific index
~/.claude/skills/rag-search/scripts/search.py --index personal "query"
~/.claude/skills/rag-search/scripts/search.py --index docs "query"

# Control result count
~/.claude/skills/rag-search/scripts/search.py --top-k 10 "query"
```
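
When the search needs to be driven from another script rather than typed by hand, the invocations above could be wrapped roughly as follows. This is a minimal sketch, not part of the skill: `build_search_command` and `run_search` are hypothetical helper names, and it assumes `search.py` prints a single JSON document to stdout.

```python
import json
import subprocess
from pathlib import Path

# Paths taken from the commands shown in this document.
SKILL_DIR = Path.home() / ".claude" / "skills" / "rag-search"
VENV_PYTHON = SKILL_DIR / "venv" / "bin" / "python"
SEARCH_SCRIPT = SKILL_DIR / "scripts" / "search.py"


def build_search_command(query, index=None, top_k=None):
    """Assemble the argv list for search.py (hypothetical helper)."""
    cmd = [str(VENV_PYTHON), str(SEARCH_SCRIPT)]
    if index is not None:
        cmd += ["--index", index]        # "personal" or "docs"
    if top_k is not None:
        cmd += ["--top-k", str(top_k)]   # number of results to return
    cmd.append(query)
    return cmd


def run_search(query, index=None, top_k=None):
    """Run search.py via the venv interpreter and parse its output.

    Assumption: the script writes one JSON document to stdout.
    """
    proc = subprocess.run(
        build_search_command(query, index, top_k),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems when the query contains spaces or quotes.
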

### Index Management

```bash
# Reindex personal state files
~/.claude/skills/rag-search/venv/bin/python \
    ~/.claude/skills/rag-search/scripts/index_personal.py

# Index all doc sources
~/.claude/skills/rag-search/venv/bin/python \
    ~/.claude/skills/rag-search/scripts/index_docs.py --all

# Index specific doc source
~/.claude/skills/rag-search/scripts/index_docs.py --source k0s
```

### Adding Doc Sources

```bash
# Add a git-based doc source
~/.claude/skills/rag-search/venv/bin/python \
    ~/.claude/skills/rag-search/scripts/add_doc_source.py \
    --id "argocd" \
    --name "ArgoCD Documentation" \
    --type git \
    --url "https://github.com/argoproj/argo-cd.git" \
    --path "docs/" \
    --glob "**/*.md"

# List configured sources
~/.claude/skills/rag-search/scripts/add_doc_source.py --list
```

## Output Format

Search returns JSON:

```json
{
  "query": "your search query",
  "results": [
    {
      "rank": 1,
      "score": 0.847,
      "source": "personal",
      "file": "memory/decisions.json",
      "chunk": "Relevant text content...",
      "metadata": {"date": "2025-01-15"}
    }
  ],
  "searched_collections": ["personal", "docs"],
  "total_chunks_searched": 1847
}
```
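
To illustrate how a caller might consume this structure, the sketch below parses the sample payload and keeps only results above a score threshold. The `top_hits` helper and the `min_score` cutoff are hypothetical, and the sketch assumes a higher `score` means a closer match, which the format above does not state explicitly.

```python
import json

# The sample payload from this section, verbatim.
SAMPLE = """
{
  "query": "your search query",
  "results": [
    {
      "rank": 1,
      "score": 0.847,
      "source": "personal",
      "file": "memory/decisions.json",
      "chunk": "Relevant text content...",
      "metadata": {"date": "2025-01-15"}
    }
  ],
  "searched_collections": ["personal", "docs"],
  "total_chunks_searched": 1847
}
"""


def top_hits(payload, min_score=0.5):
    """Return (source, file, score) tuples for results at or above min_score.

    Assumption: a higher score indicates a more relevant chunk.
    """
    return [
        (r["source"], r["file"], r["score"])
        for r in payload["results"]
        if r["score"] >= min_score
    ]


payload = json.loads(SAMPLE)
for source, file, score in top_hits(payload):
    print(f"[{source}] {file} (score {score})")
```
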

## Search Strategy

1. **Start broad** - Use general terms first
2. **Refine if needed** - Add specific keywords if results aren't relevant
3. **Cross-reference** - When both personal and docs results appear, synthesize them
4. **Cite sources** - Include file paths and dates in your answers

## Example Workflow

User asks: "How should I configure ArgoCD sync?"

1. Search both indexes:
   ```bash
   search.py "ArgoCD sync configuration"
   ```

2. If personal results exist, prioritize those (user's past decisions)

3. Supplement with docs results for official guidance

4. Synthesize answer:
   > Based on your previous decision (decisions.json, 2025-01-15), you configured ArgoCD with auto-sync enabled but self-heal disabled. The ArgoCD docs recommend this for production environments where you want automatic deployment but manual intervention for drift correction.

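
Steps 2 to 4 of the workflow (personal results first, then docs, with citations) could be sketched as below. Both helpers are hypothetical and operate on result dictionaries shaped like the Output Format example. The sample data, including the `docs/example.md` path, is made up for illustration.

```python
def prioritize_personal(results):
    """Order results so personal hits come first, keeping rank order within each group."""
    # False sorts before True, so source == "personal" lands at the front.
    return sorted(results, key=lambda r: (r["source"] != "personal", r["rank"]))


def cite(result):
    """Format a citation as 'file, date', or just 'file' when no date is recorded."""
    date = result.get("metadata", {}).get("date")
    return f"{result['file']}, {date}" if date else result["file"]


# Made-up results for illustration only.
results = [
    {"rank": 1, "source": "docs", "file": "docs/example.md", "metadata": {}},
    {"rank": 2, "source": "personal", "file": "memory/decisions.json",
     "metadata": {"date": "2025-01-15"}},
]

for r in prioritize_personal(results):
    print(f"{r['source']}: {cite(r)}")
```

Sorting on a tuple keeps the original relevance ranking as the tie-breaker, so the reordering only moves personal results ahead of docs results and changes nothing else.
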
## Maintenance

Indexes should be refreshed periodically:
- Personal: After significant state changes
- Docs: After tool version upgrades

A systemd timer can automate this (see design doc for setup).
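
Whatever scheduler is used, it needs a single entry point to call. The sketch below is one possible refresh script, not the setup described in the design doc: `reindex_commands` and `refresh_all` are hypothetical names, and the commands are the ones listed under Index Management.

```python
import subprocess
from pathlib import Path

# Paths taken from the Index Management commands in this document.
SKILL_DIR = Path.home() / ".claude" / "skills" / "rag-search"
VENV_PYTHON = str(SKILL_DIR / "venv" / "bin" / "python")
SCRIPTS = SKILL_DIR / "scripts"


def reindex_commands():
    """Return the argv lists for a full refresh of both indexes."""
    return [
        [VENV_PYTHON, str(SCRIPTS / "index_personal.py")],
        [VENV_PYTHON, str(SCRIPTS / "index_docs.py"), "--all"],
    ]


def refresh_all():
    """Run each indexer in turn, stopping at the first failure."""
    for cmd in reindex_commands():
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Calling `refresh_all()` from a small script gives a timer or cron job one command to run, and the non-zero exit on failure makes a broken reindex visible in the scheduler's logs.
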