|  | Deepseek | Deepseek + Contextual RAG |
| --- | --- | --- |
| Accuracy* |  |  |
| Relevance* |  |  |
| Latency |  |  |
Query classification -> Retrieval -> Reranking -> Repacking -> Summarisation

| Stage | Purpose |
| --- | --- |
| Query classification | Determining whether retrieval is necessary for a given input query |
| Retrieval | Efficiently obtaining relevant documents for the query |
| Reranking | Refining the order of retrieved documents based on their relevance to the query |
| Repacking | Organizing the retrieved documents into a structured format for better generation |
| Summarisation | Extracting key information from the repacked document and eliminating redundancies |
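The five stages above can be sketched end to end in plain Python. Everything here is a toy stand-in (word-overlap scoring, a no-op reranker, top-line "summarisation") to show how the stages compose, not the models actually used in the project.

```python
def needs_retrieval(query: str) -> bool:
    """Query classification: only retrieve for multi-word questions (toy heuristic)."""
    return len(query.split()) > 2


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Retrieval: rank documents by word overlap with the query (toy scorer)."""
    q = set(query.lower().split())
    return sorted(corpus, key=lambda d: -len(q & set(d.lower().split())))[:k]


def rerank(query: str, docs: list[str]) -> list[str]:
    """Reranking: placeholder; a cross-encoder reranker would slot in here."""
    return docs


def repack(docs: list[str]) -> str:
    """Repacking: merge retrieved documents into one numbered context block."""
    return "\n".join(f"[{i}] {d}" for i, d in enumerate(docs, 1))


def summarise(packed: str) -> str:
    """Summarisation: keep the top-ranked line (stand-in for an LLM summariser)."""
    return packed.splitlines()[0]


corpus = [
    "the court ruled on the contract dispute",
    "the weather was sunny yesterday",
]
query = "what did the court rule"
if needs_retrieval(query):
    context = summarise(repack(rerank(query, retrieve(query, corpus))))
```

In the real pipeline each placeholder is replaced by a model (classifier, dense retriever, reranker, LLM); the composition order stays the same.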
The project is mostly based on LlamaIndex for retrieval. This was my first time using LlamaIndex: integration was easy, and the documentation is extensive.
| Pipeline component | Specification | Motivation |
| --- | --- | --- |
| Chunking strategy | Semantic-level chunking with chunk boundaries at document sections; chunk overlap ? | Document structure |
| Embedding method | E5 | Works well for long and dense legal text |
| Choice of vector db |  |  |
| Hybrid retrieval | BM25 (based on TF-IDF) + embeddings | Precise word/phrase matching + sentence-level embeddings |
| Number of chunks (k) | 20 | Recommended by Anthropic (not empirically tested here) |
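A minimal sketch of the hybrid idea: a small Okapi BM25 scorer (the TF-IDF family mentioned above) whose ranking is merged with an embedding-based ranking. The notes do not specify a fusion method, so reciprocal rank fusion is used here as one common choice, and the dense ranking in the example is hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 over whitespace tokens; returns one score per document."""
    toks = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in toks) / len(toks)
    df = Counter(w for t in toks for w in set(t))  # document frequency per term
    n = len(docs)
    scores = []
    for t in toks:
        tf = Counter(t)
        s = 0.0
        for w in query.lower().split():
            if tf[w] == 0:
                continue
            idf = math.log((n - df[w] + 0.5) / (df[w] + 0.5) + 1)
            s += idf * tf[w] * (k1 + 1) / (tf[w] + k1 * (1 - b + b * len(t) / avgdl))
        scores.append(s)
    return scores


def rrf_fuse(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Reciprocal rank fusion: combine doc-id rankings from several retrievers."""
    fused: Counter = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            fused[doc_id] += 1.0 / (k + rank + 1)
    return [doc_id for doc_id, _ in fused.most_common()]


docs = [
    "breach of contract penalty clause in commercial agreements",
    "penalty clause enforcement",
    "unrelated weather report",
]
scores = bm25_scores("penalty clause", docs)
bm25_ranking = sorted(range(len(docs)), key=lambda i: -scores[i])
dense_ranking = [0, 2, 1]  # hypothetical ranking from an embedding model (e.g. E5)
fused = rrf_fuse([bm25_ranking, dense_ranking])
```

In the project, the dense ranking would come from E5 embeddings stored in the vector db; only the fusion step is shown faithfully here.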
Deployment notes:
- ! Note: FAISS is unstable on Python 3.11; use Python 3.10.
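One way to enforce this constraint is a startup guard that fails fast with a clear message instead of a crash deep inside FAISS. The function name and version cutoff below are illustrative, not part of the project's code.

```python
import sys

# The notes report FAISS instability on Python 3.11; 3.10 is the last known-good version.
FAISS_MAX_PYTHON = (3, 10)


def faiss_python_ok(version=None) -> bool:
    """Return True when the interpreter is Python 3.10 or older."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) <= FAISS_MAX_PYTHON


if not faiss_python_ok():
    print(f"warning: FAISS may be unstable on Python {sys.version_info[0]}.{sys.version_info[1]}")
```

Pinning `python = ">=3.9,<3.11"` in the project's dependency file would catch the same problem at install time rather than at runtime.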