Commit cd5398b

Add reranker model support in hybrid search
1 parent ace030a commit cd5398b

5 files changed: 594 additions, 7 deletions
.github/workflows/main.yml

Lines changed: 3 additions & 4 deletions
@@ -29,11 +29,10 @@ jobs:
     - name: Install dependencies
       run: |
         python -m pip install --upgrade pip
-        pip install ruff pytest pytest-mock pytest-asyncio respx
-        pip install pgvector asyncpg
+        # pip install ruff pytest pytest-mock pytest-asyncio respx
         if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
-        # Install OgbujiPT itself
-        pip install -U .
+        # Install OgbujiPT itself, plus dependencies needed to run the full CI, with test suite
+        pip install -U ".[testall]"

     - name: Lint with ruff
       run: |

demo/pg-hybrid/README.md

Lines changed: 72 additions & 2 deletions
@@ -19,7 +19,18 @@ For learning the fundamentals interactively, this notebook covers:
 - Tuning BM25 parameters (k1, b)
 - Understanding RRF fusion

-## 2. `chat_with_hybrid_kb.py`
+## 2. `hybrid_rerank_demo.py`
+
+**NEW!** Demonstrates three-stage retrieval with cross-encoder reranking:
+- Initial retrieval: Fast RRF fusion (dense + sparse)
+- Reranking: Accurate cross-encoder scoring of top candidates
+- Comparing all methods: dense, sparse, hybrid, and reranked
+- Shows how reranking reorders RRF results for better accuracy
+
+Install additional dependency: `uv pip install "rerankers[transformers]"`
+
+## 3. `chat_with_hybrid_kb.py`
+
 For cut & paste ready, application patterns, an advanced conversational AI demo with:
 - Chat history tracking (MessageDB)
 - Knowledge base with hybrid search (DataDB + BM25)
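
Note on the hunk above: the "RRF fusion" it refers to can be illustrated with a few lines of plain Python. This is a generic sketch of reciprocal rank fusion, not OgbujiPT's implementation; the function name `rrf_fuse` and the document IDs are hypothetical, and `k=60` is simply the constant used in the README example later in this diff.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of doc IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. (Hypothetical helper, for illustration only.)
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID so output is deterministic
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up rankings standing in for dense and sparse (BM25) retrieval results
dense = ['d1', 'd2', 'd3']
sparse = ['d3', 'd1', 'd4']
fused = rrf_fuse([dense, sparse])
for doc_id, score in fused:
    print(f'{doc_id}: {score:.5f}')
```

Documents that appear in both lists (`d1`, `d3`) accumulate two terms and rise above documents found by only one strategy, which is the behavior the README bullets describe for the initial retrieval stage.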
@@ -182,6 +193,65 @@ results = hybrid.execute(

 Hybrid Search is best for general-purpose search; it's the best of both worlds.

+### Reranked Hybrid Search (RRF + Cross-Encoder)
+
+```python
+from ogbujipt.retrieval import RerankedHybridSearch, BM25Search
+from rerankers import Reranker
+
+# Initialize cross-encoder reranker
+reranker = Reranker(model_name='BAAI/bge-reranker-base')
+
+# Combine hybrid search with reranking
+reranked = RerankedHybridSearch(
+    strategies=[dense_search, BM25Search()],
+    reranker=reranker,
+    rerank_top_k=20, # Rerank top 20 from initial retrieval
+    k=60 # RRF constant
+)
+
+results = reranked.execute(
+    query='machine learning algorithms',
+    backends=[knowledge_db],
+    limit=5
+)
+```
+
+**Why reranking?**
+- **Speed + Accuracy**: Fast initial retrieval (RRF), slow but accurate final ranking (cross-encoder)
+- **Better than RRF alone**: Cross-encoders see query-document interactions that embeddings miss
+- **Efficient**: Only rerank top-K candidates (e.g., top 20), not entire corpus
+
+**Popular reranker models:**
+- `BAAI/bge-reranker-base`: Good balance of speed and quality (recommended)
+- `BAAI/bge-reranker-large`: Higher quality, slower
+- `cross-encoder/ms-marco-MiniLM-L-12-v2`: Faster, decent quality
+- `zeroentropy/zerank-2`: Instruction-following, multilingual
+  - Requires `trust_remote_code=True` and `batch_size=1` (padding token issue)
+  - Example:
+    ```python
+    reranker = Reranker(
+        model_name='zeroentropy/zerank-2',
+        model_kwargs={'trust_remote_code': True},
+        batch_size=1 # Required: model lacks padding token
+    )
+    ```
+
+**Installation:**
+```bash
+uv pip install "rerankers[transformers]"
+```
+
+**Troubleshooting:**
+
+*Error: "Cannot handle batch sizes > 1 if no padding token is defined"*
+- Some models (e.g., zerank-2) don't have a padding token configured
+- Solution: Set `batch_size=1` when creating the Reranker
+- Trade-off: Slower processing (one document at a time) but will work
+- See `hybrid_rerank_demo.py` for model-specific configurations
+
+See `hybrid_rerank_demo.py` for a complete example comparing all methods.
+
 ## Sparse Vector Storage (Advanced)

 For storing sparse vectors directly (e.g., precomputed BM25 vectors):
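
Note on the hunk above: the "only rerank top-K candidates" idea can be sketched without any model at all. The code below is an assumption-laden illustration, not the `RerankedHybridSearch` implementation: `overlap_score` is a toy token-overlap function standing in for a real cross-encoder, `rerank` is a hypothetical helper, and the candidate texts are made up. The parameter names `rerank_top_k` and `limit` only mirror those in the README example.

```python
def overlap_score(query, text):
    """Toy stand-in for a cross-encoder: fraction of query tokens present in the text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def rerank(query, candidates, score_fn, rerank_top_k=20, limit=5):
    """Rescore only the first rerank_top_k candidates and return the best `limit`.

    `candidates` is assumed to be already ordered by the first-stage retrieval.
    """
    head = candidates[:rerank_top_k]
    scored = [(score_fn(query, text), text) for text in head]
    # list.sort is stable, so equal scores keep their first-stage order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:limit]]


# Made-up first-stage results, best first
candidates = [
    'deep learning for vision',
    'machine learning algorithms overview',
    'sorting algorithms in practice',
    'cooking with cast iron',
]
top = rerank('machine learning algorithms', candidates, overlap_score,
             rerank_top_k=3, limit=2)
print(top)
```

The expensive scorer runs on at most `rerank_top_k` texts per query regardless of corpus size, which is the efficiency argument the README makes. Anything the first stage ranks below that cutoff can never be recovered by the reranker, so the cutoff trades recall for latency.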
@@ -317,8 +387,8 @@ uv pip install -U . # From OgbujiPT root directory

 - Read the [Phase 2 Architecture Notes](../../ARCHITECTURE.md) (if available)
 - Explore combining with graph RAG using [Onya](https://github.com/OoriData/Onya)
-- Implement reranking with cross-encoders
 - Add query expansion or pseudo-relevance feedback
+- Experiment with different reranker models for your domain

 # References
