| | Deepseek | Deepseek + Contextual RAG |
|---|---|---|
| Accuracy* | | |
| Relevance* | | |
| Latency | | |
| Query classification -> | Retrieval -> | Reranking -> | Repacking -> | Summarisation |
|---|---|---|---|---|
| Determining whether retrieval is necessary for a given input query | Efficiently obtaining relevant documents for the query | Refining the order of retrieved documents based on their relevance to the query | Organizing the retrieved documents into a structured format for better generation | Extracting key information from the repacked documents and eliminating redundancies |
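The five stages above can be sketched as a chain of small functions passing a shared state object. Everything here is illustrative (toy heuristics, hypothetical names), not the actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class PipelineState:
    query: str
    needs_retrieval: bool = True
    documents: list = field(default_factory=list)
    answer: str = ""

def classify_query(state):
    # Stage 1: decide whether retrieval is necessary at all (toy heuristic).
    state.needs_retrieval = len(state.query.split()) > 2
    return state

def retrieve(state, corpus):
    # Stage 2: fetch documents that share terms with the query.
    if state.needs_retrieval:
        terms = set(state.query.lower().split())
        state.documents = [d for d in corpus if terms & set(d.lower().split())]
    return state

def rerank(state):
    # Stage 3: reorder retrieved documents by term overlap with the query.
    terms = set(state.query.lower().split())
    state.documents.sort(key=lambda d: len(terms & set(d.lower().split())),
                         reverse=True)
    return state

def repack(state):
    # Stage 4: merge the ranked documents into one structured context string.
    state.answer = "\n---\n".join(state.documents)
    return state

def summarise(state, max_chars=200):
    # Stage 5: trim the repacked context (stand-in for real summarisation).
    state.answer = state.answer[:max_chars]
    return state

corpus = ["legal text on contracts", "tax law overview", "recipe for soup"]
state = PipelineState(query="contracts in legal text")
for stage in (classify_query, lambda s: retrieve(s, corpus),
              rerank, repack, summarise):
    state = stage(state)
```

In the real pipeline each stage is backed by a model or index (query classifier, retriever, reranker, summariser); the value of this shape is that stages can be swapped or ablated independently.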
Implementation based on Anthropic's *Introducing Contextual Retrieval* (2024).
The project is built mostly on LlamaIndex for retrieval. This was a first use of LlamaIndex; integration was straightforward and the documentation is extensive.
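The core idea of contextual retrieval is to prepend a short, chunk-specific context to each chunk before it is embedded and BM25-indexed. A minimal sketch, assuming a template stands in for the LLM call that generates the context in Anthropic's approach (all names here are hypothetical):

```python
def contextualize_chunk(chunk: str, document_title: str, section: str) -> str:
    # In the real pipeline an LLM writes this context from the full document;
    # this template is only a stand-in for that call.
    context = f"This chunk is from section '{section}' of '{document_title}'."
    return f"{context}\n\n{chunk}"

contextual = contextualize_chunk(
    chunk="The lessee shall give notice within 30 days.",
    document_title="Commercial Lease Agreement",
    section="Termination",
)
```

The contextualized string, not the raw chunk, is what gets embedded and indexed, so retrieval can match on context ("Termination", the document title) that the bare chunk never mentions.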
| Pipeline component | Specification | Motivation |
|---|---|---|
| Chunking strategy | semantic-level chunking with chunk boundaries at document sections; chunk overlap ? | preserves document structure |
| Embedding method | E5 | great for long and dense legal text |
| Choice of vector db | | |
| Hybrid retrieval | BM25 (based on TF-IDF) + embeddings | precise word/phrase matching + sentence-level semantic matching |
| Number of chunks (k) | 20 | recommended by Anthropic (not empirically tested here) |
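The hybrid retrieval row can be made concrete with a small score-fusion sketch: BM25 lexical scores and embedding cosine similarities are min-max normalised and blended with a weight `alpha`. The bag-of-words "embeddings" and the in-line BM25 are stand-ins for E5 vectors and a real index; names and parameters are illustrative:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    # Minimal BM25 over whitespace-tokenized documents.
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = [0.0] * n
    for term in query.lower().split():
        df = sum(1 for t in tokenized if term in t)
        if df == 0:
            continue
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
        for i, toks in enumerate(tokenized):
            tf = toks.count(term)
            denom = tf + k1 * (1 - b + b * len(toks) / avgdl)
            scores[i] += idf * tf * (k1 + 1) / denom
    return scores

def cosine(a: Counter, b: Counter) -> float:
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def hybrid_search(query, docs, alpha=0.5, k=2):
    lexical = bm25_scores(query, docs)
    qvec = Counter(query.lower().split())
    # Bag-of-words cosine stands in for E5 embedding similarity.
    semantic = [cosine(qvec, Counter(d.lower().split())) for d in docs]

    def norm(xs):
        # Min-max normalise so the two score scales are comparable.
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]

    fused = [alpha * l + (1 - alpha) * s
             for l, s in zip(norm(lexical), norm(semantic))]
    ranked = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return [docs[i] for i in ranked[:k]]

docs = ["breach of contract remedies",
        "contract termination clause",
        "cooking pasta at home"]
top = hybrid_search("contract breach", docs, k=2)
```

In the actual project the fusion is handled by LlamaIndex's retriever layer rather than hand-rolled; the sketch only shows why the two signals complement each other (exact term matches from BM25, paraphrase tolerance from embeddings).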
Deployment notes:
- ! Note: FAISS is unstable on Python 3.11; use Python 3.10.
- Manual evaluation metrics from Deriu et al., ....