Commit fe8feca

madeinoz67 and claude committed
docs: add RAG advanced search guide and CLI commands
New docs/rag/advanced-search.md:
- Reranker (cross-encoder for +30-40% accuracy)
- Hybrid Search (BM25 + dense for +20% recall)
- HyDE (hypothetical document embeddings)
- Multi-Query (variant generation with RRF)
- Query Classifier (adaptive retrieval routing)
- Configuration reference for all features

Updated docs/reference/cli.md:
- Added RAG Operations section with rag-cli commands
- search, ingest, get-chunk, list, health
- Image search commands
- Updated AI-friendly summary

Navigation: Added RAG Advanced to LKAP menu

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 61f4a20 commit fe8feca

3 files changed

+399
-4
lines changed
docs/rag/advanced-search.md

Lines changed: 279 additions & 0 deletions
@@ -0,0 +1,279 @@
---
title: "Advanced RAG Search"
description: "Advanced retrieval techniques including reranking, hybrid search, HyDE, and multi-query"
---

# Advanced RAG Search

The RAG system includes several advanced retrieval techniques that improve search accuracy and recall. These are enabled by default and can be configured via environment variables.

## Overview

| Feature | Purpose | Improvement |
|---------|---------|-------------|
| **Reranker** | Cross-encoder relevance scoring | +30-40% accuracy |
| **Hybrid Search** | BM25 + dense vector fusion | +20% recall |
| **HyDE** | Hypothetical document expansion | Better for short queries |
| **Multi-Query** | Query variant generation | Covers more interpretations |
| **Query Classifier** | Adaptive retrieval routing | Right strategy per query |

## Search Pipeline

```
Query → Query Classifier → [HyDE?] → [Multi-Query?] → Hybrid Search → Reranker → Results
                ↓             ↓             ↓               ↓             ↓
          Type: factual    Expand      3 variants      Dense+BM25   Cross-encoder
                conceptual  short        merged            RRF         top-k
```

---

## Reranker

Cross-encoder reranking improves retrieval accuracy by 30-40% by rescoring the initial candidates with a more precise (but slower) model.

### How It Works

1. **Initial retrieval**: A bi-encoder (fast, approximates relevance) returns the top 20 candidates
2. **Reranking**: A cross-encoder (slow, precise) scores each candidate against the query
3. **Final results**: The top 10 candidates by cross-encoder score are returned
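
The three steps above can be sketched as follows. This is a minimal illustration, not the project's implementation: `retrieve_then_rerank` and `toy_cross_score` are hypothetical names, and the toy scorer only stands in for a real cross-encoder such as the configured reranker model.

```python
from typing import Callable, List, Tuple


def retrieve_then_rerank(
    query: str,
    candidates: List[Tuple[str, float]],      # (document, bi-encoder score)
    cross_score: Callable[[str, str], float],  # hypothetical cross-encoder scorer
    top_k: int = 20,
    final_k: int = 10,
) -> List[str]:
    # Step 1: keep the top_k candidates by the fast bi-encoder score.
    shortlist = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]
    # Step 2: rescore each shortlisted document against the query.
    rescored = [(doc, cross_score(query, doc)) for doc, _ in shortlist]
    # Step 3: return the final_k documents with the highest rerank score.
    rescored.sort(key=lambda c: c[1], reverse=True)
    return [doc for doc, _ in rescored[:final_k]]


def toy_cross_score(query: str, doc: str) -> float:
    """Stand-in scorer (assumption): fraction of query words found in the doc."""
    words = query.lower().split()
    return sum(w in doc.lower() for w in words) / len(words)
```

The defaults mirror the documented `RERANKER_TOP_K=20` and `RERANKER_FINAL_K=10`; the second stage only ever sees the shortlist, which is what keeps the slower model affordable.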
40+
41+
### When Reranking Helps
42+
43+
- **High-precision requirements** - Need the most relevant results
44+
- **Complex queries** - Multiple concepts or ambiguous terms
45+
- **Production systems** - "Never skip reranking for production RAG"
46+
47+
### Configuration
48+
49+
```bash
50+
# Enable/disable reranking (default: true)
51+
MADEINOZ_KNOWLEDGE_RERANKER_ENABLED=true
52+
53+
# Cross-encoder model
54+
MADEINOZ_KNOWLEDGE_RERANKER_MODEL=BAAI/bge-reranker-base
55+
56+
# Candidates to rerank (default: 20)
57+
MADEINOZ_KNOWLEDGE_RERANKER_TOP_K=20
58+
59+
# Final results after reranking (default: 10)
60+
MADEINOZ_KNOWLEDGE_RERANKER_FINAL_K=10
61+
62+
# Provider: local, openrouter, cohere (default: local)
63+
MADEINOZ_KNOWLEDGE_RERANKER_PROVIDER=local
64+
```
65+
66+
---

## Hybrid Search

Combines vector similarity (dense) with keyword matching (sparse/BM25) for better recall, especially for:

- **Acronyms and proper nouns** - "GPT-4" vs "GPT" and "4"
- **Exact phrase matching** - "machine learning" as a phrase
- **Rare terms** - Not well-represented in embedding space

### How It Works

1. **Dense retrieval**: Vector similarity via Qdrant
2. **Sparse retrieval**: BM25 keyword matching via Qdrant text index
3. **Fusion**: Reciprocal Rank Fusion (RRF) combines results

### RRF Formula

```
score(d) = sum over result lists i of 1 / (k + rank_i(d))
k = 60 (dampens rank impact)
```
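
As a worked illustration of the formula, the sketch below fuses two ranked lists of document IDs. The function name `rrf_fuse` and the sample IDs are hypothetical, and ranks are assumed to start at 1; a document missing from a list simply contributes nothing for that list.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(result_lists: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked lists with Reciprocal Rank Fusion (1-based ranks assumed)."""
    scores: Dict[str, float] = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


dense = ["a", "b", "c"]   # e.g. vector-similarity order
sparse = ["b", "d", "a"]  # e.g. BM25 order
print(rrf_fuse([dense, sparse]))  # ['b', 'a', 'd', 'c']
```

Here `b` wins because it ranks near the top of both lists (1/62 + 1/61), ahead of `a` (1/61 + 1/63), while `c` and `d` each appear in only one list.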

### Configuration

```bash
# Enable hybrid search (default: true)
MADEINOZ_KNOWLEDGE_HYBRID_ENABLED=true

# Weight for dense vs sparse (default: 0.7 = favor dense)
MADEINOZ_KNOWLEDGE_HYBRID_ALPHA=0.7

# RRF constant k (default: 60)
MADEINOZ_KNOWLEDGE_HYBRID_RRF_K=60
```

---

## HyDE (Hypothetical Document Embeddings)

Generates a hypothetical answer to the query, then retrieves documents similar to that hypothetical document. Best for short, ambiguous queries.

### How It Works

1. **Query**: "login issues"
2. **Hypothetical doc**: "Authentication errors may occur due to invalid credentials, expired sessions, or password reset requirements..."
3. **Retrieve**: Find documents similar to the hypothetical document
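
A rough sketch of that flow, under stated assumptions: `generate` stands in for an LLM call and `embed` for an embedding model, both supplied by the caller, and `hyde_search` is a hypothetical name rather than a function from this project. The index is a plain in-memory list instead of Qdrant.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def hyde_search(
    query: str,
    generate: Callable[[str], str],   # hypothetical: LLM that drafts an answer
    embed: Callable[[str], Vector],   # hypothetical: embedding model
    index: List[Tuple[str, Vector]],  # (document text, document vector)
    top_k: int = 3,
) -> List[str]:
    # Step 2: draft a hypothetical answer instead of embedding the raw query.
    hypothetical_doc = generate(query)
    # Step 3: embed the draft and rank stored documents by similarity to it.
    query_vector = embed(hypothetical_doc)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]
```

The only difference from plain dense retrieval is which text gets embedded: the drafted answer shares vocabulary and length with real documents, which is why it can help with terse queries.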

### When HyDE Helps

- Short, ambiguous queries ("login issues")
- Queries with little domain terminology
- Documents are more verbose than queries

### When to Skip HyDE

- Long, specific queries (already contain good keywords)
- Latency is critical (requires LLM call)
- LLM might hallucinate domain terminology

### Configuration

```bash
# Enable HyDE expansion (default: true)
MADEINOZ_KNOWLEDGE_HYDE_ENABLED=true

# Min query tokens to trigger HyDE (default: 10)
MADEINOZ_KNOWLEDGE_HYDE_MIN_QUERY_TOKENS=10

# Max tokens in hypothetical document (default: 200)
MADEINOZ_KNOWLEDGE_HYDE_MAX_HYPOTHETICAL_TOKENS=200
```

### Cost

- Adds LLM call per query (~$0.002 per query at 200 tokens)
- For 10K queries/day: ~$600/month
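
The monthly figure follows from the per-query estimate. A small sketch of the arithmetic, assuming a 30-day month (the helper name `monthly_hyde_cost` is hypothetical):

```python
COST_PER_QUERY_USD = 0.002  # per-query estimate quoted above
QUERIES_PER_DAY = 10_000
DAYS_PER_MONTH = 30         # assumption: 30-day month


def monthly_hyde_cost(cost_per_query: float, queries_per_day: int,
                      days: int = DAYS_PER_MONTH) -> float:
    """Estimated monthly LLM spend in USD, rounded to cents."""
    return round(cost_per_query * queries_per_day * days, 2)


print(monthly_hyde_cost(COST_PER_QUERY_USD, QUERIES_PER_DAY))  # 600.0
```

That is $20 per day at 10K queries, so the cost scales linearly with traffic and with the per-call price of whichever LLM generates the hypothetical document.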

---

## Multi-Query Variants

Generates multiple query variants/rephrasings, retrieves for each, then merges results using RRF.

### How It Works

1. **Original query**: "How do I configure SPI?"
2. **Variants**:
   - "SPI configuration settings"
   - "Set up Serial Peripheral Interface"
   - "SPI master/slave setup guide"
3. **Retrieve** for each variant
4. **Merge** with RRF
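
The flow above might look like the sketch below. `make_variants` stands in for the LLM that rephrases the query and `retrieve` for a single-query search; both are caller-supplied, and `multi_query_search` is a hypothetical name. Whether the original query is searched alongside its variants is an assumption made here.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def multi_query_search(
    query: str,
    make_variants: Callable[[str], List[str]],  # hypothetical: LLM rephraser
    retrieve: Callable[[str], List[str]],       # hypothetical: ranked doc IDs per query
    k: int = 60,
    top_n: int = 10,
) -> List[str]:
    # Assumption: search the original query too, skipping duplicate variants.
    queries = [query] + [v for v in make_variants(query) if v != query]
    scores: Dict[str, float] = defaultdict(float)
    for q in queries:
        # Each query contributes 1 / (k + rank) to every document it returns.
        for rank, doc_id in enumerate(retrieve(q), start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)[:top_n]
```

Documents that several variants agree on accumulate score from each list, so they rise above documents that only one phrasing happens to match.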

### When Multi-Query Helps

- Complex queries with multiple interpretations
- Queries that might match different terminology
- Single query retrieval yields poor results

### When to Skip

- Simple, well-defined queries
- Exact match queries (keywords, IDs)
- Latency is critical

### Configuration

```bash
# Enable multi-query (default: true)
MADEINOZ_KNOWLEDGE_MULTI_QUERY_ENABLED=true

# Min query length to trigger (default: 10)
MADEINOZ_KNOWLEDGE_MULTI_QUERY_MIN_LENGTH=10

# Number of variants to generate (default: 3)
MADEINOZ_KNOWLEDGE_MULTI_QUERY_NUM_VARIANTS=3
```

---

## Query Classifier

Classifies the query type and routes the query to the appropriate retrieval strategy.

### Query Types

| Type | Description | Best Strategy |
|------|-------------|---------------|
| **factual** | Specific facts, exact matches | Keyword retriever |
| **procedural** | How-to, step-by-step | Hierarchical retriever |
| **conceptual** | Explanations, understanding | Vector retriever |
| **comparative** | Comparing options | Multi-document |
| **temporal** | Time-sensitive | Time-filtered |
| **ambiguous** | Needs clarification | Ask clarification |

### Classification Methods

- **Rule-based** (fast, no LLM cost) - Default
- **LLM-based** (more accurate, higher cost)
- **Hybrid** (rules first, LLM for ambiguous)
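
To make the rule-based option concrete, here is a minimal sketch. The patterns, their order, and the one-word cutoff for `ambiguous` are illustrative assumptions, not the rules the system ships with; only the type names come from the table above.

```python
import re

# Illustrative patterns (assumptions); first match wins.
RULES = [
    ("comparative", re.compile(r"\b(vs\.?|versus|compare|difference between)\b", re.I)),
    ("procedural", re.compile(r"\b(how (do|to|can)|steps?|set ?up|configure|install)\b", re.I)),
    ("temporal", re.compile(r"\b(latest|recent|today|yesterday|since|before|after|\d{4})\b", re.I)),
    ("conceptual", re.compile(r"\b(why|explain|what is|what are|overview)\b", re.I)),
]


def classify(query: str) -> str:
    """Return one of the query types from the table, using regex rules only."""
    text = query.strip()
    if len(text.split()) < 2:
        return "ambiguous"  # assumption: one-word queries need clarification
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return "factual"        # assumption: fall back to keyword-style lookup


print(classify("How do I configure SPI?"))  # procedural
```

A hybrid classifier could keep these rules as the first pass and hand only the `ambiguous` results to an LLM, which limits the added cost to the queries the rules cannot place.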

### Configuration

```bash
# Enable classification (default: true)
MADEINOZ_KNOWLEDGE_QUERY_CLASSIFIER_ENABLED=true

# Use LLM for classification (default: false = rule-based)
MADEINOZ_KNOWLEDGE_QUERY_CLASSIFIER_USE_LLM=false
```

---

## Complete Configuration Reference

Add these to your `.env` file:

```bash
# Reranker
MADEINOZ_KNOWLEDGE_RERANKER_ENABLED=true
MADEINOZ_KNOWLEDGE_RERANKER_MODEL=BAAI/bge-reranker-base
MADEINOZ_KNOWLEDGE_RERANKER_TOP_K=20
MADEINOZ_KNOWLEDGE_RERANKER_FINAL_K=10
MADEINOZ_KNOWLEDGE_RERANKER_PROVIDER=local

# Hybrid Search
MADEINOZ_KNOWLEDGE_HYBRID_ENABLED=true
MADEINOZ_KNOWLEDGE_HYBRID_ALPHA=0.7
MADEINOZ_KNOWLEDGE_HYBRID_RRF_K=60

# HyDE
MADEINOZ_KNOWLEDGE_HYDE_ENABLED=true
MADEINOZ_KNOWLEDGE_HYDE_MIN_QUERY_TOKENS=10
MADEINOZ_KNOWLEDGE_HYDE_MAX_HYPOTHETICAL_TOKENS=200

# Multi-Query
MADEINOZ_KNOWLEDGE_MULTI_QUERY_ENABLED=true
MADEINOZ_KNOWLEDGE_MULTI_QUERY_MIN_LENGTH=10
MADEINOZ_KNOWLEDGE_MULTI_QUERY_NUM_VARIANTS=3

# Query Classifier
MADEINOZ_KNOWLEDGE_QUERY_CLASSIFIER_ENABLED=true
MADEINOZ_KNOWLEDGE_QUERY_CLASSIFIER_USE_LLM=false
```
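
For readers wiring these variables into their own tooling, the sketch below shows one way to read a subset of them with the documented defaults. `SearchConfig` and `load_config` are hypothetical names, the accepted boolean spellings are an assumption, and the 0 to 1 range check on `HYBRID_ALPHA` is inferred from the "0.7 = favor dense" comment rather than stated by the source.

```python
import os
from dataclasses import dataclass
from typing import Mapping

PREFIX = "MADEINOZ_KNOWLEDGE_"


def _get(env: Mapping[str, str], name: str, default: str) -> str:
    return env.get(PREFIX + name, default).strip()


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    # Assumption: "1", "true", and "yes" (any case) count as enabled.
    return _get(env, name, str(default)).lower() in {"1", "true", "yes"}


@dataclass(frozen=True)
class SearchConfig:  # hypothetical container, covers only part of the reference
    reranker_enabled: bool
    reranker_top_k: int
    reranker_final_k: int
    hybrid_enabled: bool
    hybrid_alpha: float
    hybrid_rrf_k: int


def load_config(env: Mapping[str, str] = os.environ) -> SearchConfig:
    cfg = SearchConfig(
        reranker_enabled=_flag(env, "RERANKER_ENABLED", True),
        reranker_top_k=int(_get(env, "RERANKER_TOP_K", "20")),
        reranker_final_k=int(_get(env, "RERANKER_FINAL_K", "10")),
        hybrid_enabled=_flag(env, "HYBRID_ENABLED", True),
        hybrid_alpha=float(_get(env, "HYBRID_ALPHA", "0.7")),
        hybrid_rrf_k=int(_get(env, "HYBRID_RRF_K", "60")),
    )
    if not 0.0 <= cfg.hybrid_alpha <= 1.0:  # assumed valid range
        raise ValueError("HYBRID_ALPHA must be between 0 and 1")
    if cfg.reranker_final_k > cfg.reranker_top_k:
        raise ValueError("RERANKER_FINAL_K cannot exceed RERANKER_TOP_K")
    return cfg
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests, for example `load_config({})` to check the defaults.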

---

## Performance vs Accuracy Trade-offs

| Configuration | Latency | Accuracy | Cost | Use Case |
|--------------|---------|----------|------|----------|
| All enabled | High | Best | High | Production, critical |
| Reranker only | Medium | Good | Low | Balanced |
| Hybrid only | Low | Good | Free | Speed priority |
| All disabled | Fastest | Baseline | Free | Development |

## Disabling Features

For faster development iteration, disable advanced features:

```bash
# Fast development mode
MADEINOZ_KNOWLEDGE_RERANKER_ENABLED=false
MADEINOZ_KNOWLEDGE_HYBRID_ENABLED=false
MADEINOZ_KNOWLEDGE_HYDE_ENABLED=false
MADEINOZ_KNOWLEDGE_MULTI_QUERY_ENABLED=false
```

## Related Documentation

- [RAG Quickstart](quickstart.md) - Basic search usage
- [RAG Configuration](configuration.md) - Qdrant and Ollama setup
- [RAG Troubleshooting](troubleshooting.md) - Common issues