
siim2mary/RAG-Projects

Retrieval-Augmented Generation (RAG) System for Legal Document Q&A

This project demonstrates how to build a Retrieval-Augmented Generation (RAG) pipeline that can answer questions about legal documents and retrieve relevant clauses using open-source tools like LangChain, FAISS, and Hugging Face Transformers.

🔍 Use case: Automating legal document understanding, including contract clause retrieval, regulatory Q&A, and compliance insights.

🚀 Features

✅ Load and process legal contract datasets.

📄 Intelligent document chunking using RecursiveCharacterTextSplitter.

🔍 Semantic search via dense embeddings and FAISS vector store.

🤖 Query answering using LLMs like Mistral-7B or other Hugging Face-hosted models.

📚 End-to-end RAG pipeline using LangChain.

⚙️ Modular and extendable: plug in your own datasets, models, or prompts.
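The chunking step can be illustrated without any dependencies. The sketch below is a simplified, pure-Python approximation of recursive character splitting, not LangChain's RecursiveCharacterTextSplitter itself: it omits chunk overlap, and the function name `recursive_split`, the separator list, and the sample contract text are all assumptions made for illustration.

```python
from typing import List, Tuple


def recursive_split(
    text: str,
    chunk_size: int = 200,
    separators: Tuple[str, ...] = ("\n\n", "\n", ". ", " "),
) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Coarse separators (paragraph breaks) are tried first; finer ones
    (lines, sentences, words) are used only when a piece is still too long.
    """
    text = text.strip()
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # Nothing left to split on: fall back to fixed-width cuts.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    pieces = text.split(sep)
    if len(pieces) == 1:
        # This separator does not occur; try the next, finer one.
        return recursive_split(text, chunk_size, finer)

    chunks: List[str] = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            # Keep merging neighbouring pieces while they still fit.
            current = candidate
        else:
            if current:
                chunks.extend(recursive_split(current, chunk_size, finer))
            current = piece
    if current:
        chunks.extend(recursive_split(current, chunk_size, finer))
    return chunks


# Hypothetical sample text, not taken from the project's dataset.
contract = (
    "1. Term. This Agreement begins on the Effective Date and continues for two years.\n\n"
    "2. Termination. Either party may terminate this Agreement with thirty days written notice.\n\n"
    "3. Governing Law. This Agreement is governed by the laws of the State of New York."
)
chunks = recursive_split(contract, chunk_size=120)
for chunk in chunks:
    print(len(chunk), chunk)
```

Splitting on paragraph breaks first tends to keep each clause in one chunk, which matters for clause retrieval: a chunk that cuts a clause in half is harder to match and harder for the LLM to interpret.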

🧠 Technologies Used

LangChain – for chaining retrieval and LLM generation.

SentenceTransformers – for generating text embeddings (MiniLM-L6-v2).

FAISS – for fast similarity search over document chunks.

Transformers (Hugging Face) – to load and run LLMs.

BitsAndBytes – for 4-bit quantized LLM loading.

Google Colab – for development and GPU experimentation.
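How retrieval and generation fit together can be sketched in plain Python. This is a toy illustration under stated assumptions, not the notebook's code: a normalised bag-of-words vector stands in for the SentenceTransformers embedding, a brute-force cosine scan stands in for the FAISS index, and the names `embed`, `cosine`, `top_k`, and `build_prompt`, the sample clauses, and the prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy stand-in for a dense embedding: an L2-normalised bag of words."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {tok: v / norm for tok, v in counts.items()} if norm else {}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(weight * b.get(tok, 0.0) for tok, weight in a.items())


def top_k(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Brute-force nearest-neighbour search over all chunks."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def build_prompt(question: str, retrieved: List[Tuple[float, str]]) -> str:
    """Place the retrieved chunks in front of the question for the LLM."""
    context = "\n".join(f"- {text}" for _, text in retrieved)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical clauses, not taken from the project's dataset.
clauses = [
    "Either party may terminate this Agreement with thirty days written notice.",
    "This Agreement is governed by the laws of the State of New York.",
    "Each party shall keep the other party's confidential information secret.",
]
question = "Can either party terminate with written notice?"
hits = top_k(question, clauses, k=2)
prompt = build_prompt(question, hits)
print(prompt)
```

In the actual pipeline the prompt would be passed to the LLM; the sketch stops at prompt construction so that it runs without a GPU or model download.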

Folder structure:

.
├── RAG_project3.ipynb   # Jupyter Notebook with full pipeline
├── data/                # (optional) directory for PDF or text contracts
├── README.md            # You are here!
