@@ -19,7 +19,18 @@ For learning the fundamentals interactively, this notebook covers:
 - Tuning BM25 parameters (k1, b)
 - Understanding RRF fusion
 
-## 2. `chat_with_hybrid_kb.py`
+## 2. `hybrid_rerank_demo.py`
+
+**NEW!** Demonstrates three-stage retrieval with cross-encoder reranking:
+- Initial retrieval: fast RRF fusion (dense + sparse)
+- Reranking: accurate cross-encoder scoring of top candidates
+- Comparing all methods: dense, sparse, hybrid, and reranked
+- Shows how reranking reorders RRF results for better accuracy
+
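The retrieve-then-rerank pattern above can be sketched in plain Python. Everything here is illustrative: the cheap word-overlap scorer stands in for RRF fusion over dense + sparse results, and `slow_score` stands in for a cross-encoder forward pass.

```python
# Illustrative retrieve-then-rerank sketch; both scorers are stand-ins,
# not the real RRF fusion or cross-encoder.

def initial_retrieval(query, corpus):
    # Stage 1: cheap scoring over the whole corpus (stand-in for RRF fusion)
    words = query.split()
    return sorted(corpus, key=lambda doc: sum(w in doc.split() for w in words),
                  reverse=True)

def rerank(query, candidates, top_k):
    # Stage 2: expensive scoring applied only to the top_k candidates;
    # the tail keeps its cheap ordering
    def slow_score(doc):  # stand-in for a cross-encoder forward pass
        return sum(doc.split().count(w) for w in query.split()) / len(doc.split())
    return sorted(candidates[:top_k], key=slow_score, reverse=True) + candidates[top_k:]

corpus = ['machine learning basics', 'learning to cook', 'machine learning']
candidates = initial_retrieval('machine learning', corpus)
results = rerank('machine learning', candidates, top_k=2)
# The reranker promotes the tighter match within the top-2 candidates
```

The point of the split is cost: the expensive scorer runs `top_k` times, not once per corpus document.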
+Install additional dependency: `uv pip install "rerankers[transformers]"`
+
+## 3. `chat_with_hybrid_kb.py`
+
 Cut-and-paste-ready application patterns: an advanced conversational AI demo with:
 - Chat history tracking (MessageDB)
 - Knowledge base with hybrid search (DataDB + BM25)
@@ -182,6 +193,65 @@ results = hybrid.execute(
 
 Hybrid Search is best for general-purpose retrieval; it's the best of both worlds.
 
+### Reranked Hybrid Search (RRF + Cross-Encoder)
+
+```python
+from ogbujipt.retrieval import RerankedHybridSearch, BM25Search
+from rerankers import Reranker
+
+# Initialize cross-encoder reranker
+reranker = Reranker(model_name='BAAI/bge-reranker-base')
+
+# Combine hybrid search with reranking
+reranked = RerankedHybridSearch(
+    strategies=[dense_search, BM25Search()],
+    reranker=reranker,
+    rerank_top_k=20,  # Rerank top 20 from initial retrieval
+    k=60  # RRF constant
+)
+
+results = reranked.execute(
+    query='machine learning algorithms',
+    backends=[knowledge_db],
+    limit=5
+)
+```
+
+**Why reranking?**
+- **Speed + Accuracy**: Fast initial retrieval (RRF), slow but accurate final ranking (cross-encoder)
+- **Better than RRF alone**: Cross-encoders see query-document interactions that embeddings miss
+- **Efficient**: Only rerank top-K candidates (e.g., top 20), not the entire corpus
+
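For context, the RRF stage itself is simple enough to sketch in a few lines of plain Python (illustrative, not the library's implementation): each document's fused score is the sum of `1 / (k + rank)` over every ranked list it appears in, with the same constant `k = 60` used above.

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    # Reciprocal Rank Fusion: score(doc) = sum over lists of 1 / (k + rank),
    # where ranks start at 1 in each list
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense_order = ['d1', 'd2', 'd3']   # e.g. embedding-similarity ranking
sparse_order = ['d3', 'd1', 'd4']  # e.g. BM25 ranking
fused = rrf_fuse([dense_order, sparse_order])
# 'd1' wins: ranked 1st and 2nd, so 1/61 + 1/62 beats 'd3' at 1/63 + 1/61
```

Because only ranks matter, not raw scores, RRF needs no score normalization across dense and sparse retrievers.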
+**Popular reranker models:**
+- `BAAI/bge-reranker-base`: Good balance of speed and quality (recommended)
+- `BAAI/bge-reranker-large`: Higher quality, slower
+- `cross-encoder/ms-marco-MiniLM-L-12-v2`: Faster, decent quality
+- `zeroentropy/zerank-2`: Instruction-following, multilingual
+  - Requires `trust_remote_code=True` and `batch_size=1` (padding token issue)
+  - Example:
+    ```python
+    reranker = Reranker(
+        model_name='zeroentropy/zerank-2',
+        model_kwargs={'trust_remote_code': True},
+        batch_size=1  # Required: model lacks padding token
+    )
+    ```
+
+**Installation:**
+```bash
+uv pip install "rerankers[transformers]"
+```
+
+**Troubleshooting:**
+
+*Error: "Cannot handle batch sizes > 1 if no padding token is defined"*
+- Some models (e.g., zerank-2) don't have a padding token configured
+- Solution: set `batch_size=1` when creating the Reranker
+- Trade-off: slower processing (one document at a time), but it works
+- See `hybrid_rerank_demo.py` for model-specific configurations
+
+See `hybrid_rerank_demo.py` for a complete example comparing all methods.
+
 ## Sparse Vector Storage (Advanced)
 
 For storing sparse vectors directly (e.g., precomputed BM25 vectors):
@@ -317,8 +387,8 @@ uv pip install -U . # From OgbujiPT root directory
 
 - Read the [Phase 2 Architecture Notes](../../ARCHITECTURE.md) (if available)
 - Explore combining with graph RAG using [Onya](https://github.com/OoriData/Onya)
-- Implement reranking with cross-encoders
 - Add query expansion or pseudo-relevance feedback
+- Experiment with different reranker models for your domain
 
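For the query-expansion item in the list above, here is a minimal pseudo-relevance-feedback sketch (illustrative only: it assumes the top-ranked documents are relevant and appends their most frequent novel terms to the query):

```python
from collections import Counter

def expand_query(query, top_docs, n_terms=2):
    # Pseudo-relevance feedback: treat the top-ranked docs as relevant and
    # add their most frequent terms not already in the query
    seen = set(query.split())
    counts = Counter(w for doc in top_docs for w in doc.split() if w not in seen)
    return query + ' ' + ' '.join(term for term, _ in counts.most_common(n_terms))

expanded = expand_query(
    'machine learning',
    ['machine learning with neural networks', 'neural networks for learning'],
)
# The expanded query picks up 'neural' and 'networks' from the feedback docs
```

The expanded query is then re-run through the same retrieval pipeline; a real implementation would also weight the added terms and filter stopwords.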
 # References
 