- In this end-to-end project I built a RAG app using the ObjectBox vector database and LangChain. RAG techniques let us augment a language model's knowledge at query time, so your AI can access and reason over your own data and the latest information. With ObjectBox, the vector store lives on the device, so the indexed data never needs to leave it.
- You can check the project live here
- This project showcases the implementation of an advanced RAG system that uses the ObjectBox vector database and Groq's Llama 3 model as the LLM to retrieve information from different PDF documents.
Steps I followed:
- I used `PyPDFDirectoryLoader` from the `langchain_community` document loaders to load the PDF documents from the `us-census-data` directory.
- Split the loaded text into chunks with a chunk size of `1000` using `RecursiveCharacterTextSplitter`, imported from `langchain.text_splitter`.
- Stored the vector embeddings, created with `HuggingFaceBgeEmbeddings`, in the `ObjectBox` vector store.
- Set up the LLM `ChatGroq` with the model name `Llama3-8b-8192`.
- Set up `ChatPromptTemplate`.
- Set up a `vector_embedding` function to embed the documents and store them in the `ObjectBox` vector store.
- Finally, created the `document_chain` and `retrieval_chain`, which chain the LLM to the prompt and the `retriever` to the `document_chain`, respectively.
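To make the flow of these steps concrete, here is a minimal, standard-library-only sketch of the same idea: split, embed, store, retrieve, and fill a prompt. It is not the project's code and does not use LangChain, ObjectBox, or Groq. The bag-of-words `embed` function stands in for `HuggingFaceBgeEmbeddings`, the hypothetical `ToyVectorStore` stands in for the `ObjectBox` vector store, the overlap of 200 characters is an assumption (the steps above only state a chunk size of 1000), and the LLM call itself is left out.

```python
import math
import re
from collections import Counter

CHUNK_SIZE = 1000    # chunk size named in the steps above
CHUNK_OVERLAP = 200  # assumption: the overlap is not stated in the README


def split_text(text, chunk_size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    """Cut text into fixed-size character windows that overlap.

    A simplification: the real recursive splitter prefers to break on
    paragraph and sentence separators before falling back to characters.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (chunk, vector) pairs in a list."""

    def __init__(self):
        self.entries = []

    def add_texts(self, chunks):
        for chunk in chunks:
            self.entries.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


PROMPT = (
    "Answer the question using only the provided context.\n"
    "<context>\n{context}\n</context>\n"
    "Question: {input}"
)


def build_prompt(store, question, k=2):
    """Retrieve context for the question and fill the prompt template."""
    context = "\n\n".join(store.search(question, k))
    return PROMPT.format(context=context, input=question)
```

In the real app, the filled prompt is what the LLM receives: the retriever supplies the context and the document chain combines it with the question. Swapping the toy pieces for a real embedding model and vector store changes the quality of the matches, not the shape of the pipeline.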
- langchain==0.1.20
- langchain-community==0.0.38
- langchain-core==0.1.52
- langchain-groq==0.1.3
- langchain-objectbox
- python-dotenv==1.0.1
- pypdf==4.2.0
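Because most of these libraries are pinned to exact versions, it can help to confirm that the active environment matches before running the app. The following is a small sketch, assuming the pins above are copied verbatim into a string; `parse_pins` and `check_environment` are hypothetical helper names, not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version

# The pins listed above, copied as-is (langchain-objectbox has no pin).
PINS = """\
langchain==0.1.20
langchain-community==0.0.38
langchain-core==0.1.52
langchain-groq==0.1.3
langchain-objectbox
python-dotenv==1.0.1
pypdf==4.2.0
"""


def parse_pins(text):
    """Return {package: pinned_version or None} from requirements-style lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, pinned = line.partition("==")
        pins[name.strip()] = pinned.strip() or None
    return pins


def check_environment(pins):
    """Return a list of problems; an empty list means every pin matches."""
    problems = []
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if pinned is not None and installed != pinned:
            problems.append(f"{name}: installed {installed}, expected {pinned}")
    return problems


if __name__ == "__main__":
    for problem in check_environment(parse_pins(PINS)):
        print(problem)
```

Running the script inside the project's virtual environment prints one line per missing or mismatched package and prints nothing when everything matches.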
- Prerequisites:
  - Git
  - Command line familiarity
- Clone the repository: `git clone https://github.com/NebeyouMusie/End-to-End-RAG-Project-using-ObjectBox-and-LangChain.git`
- Create and activate a virtual environment (recommended): `python -m venv venv`, then `source venv/bin/activate`
- Navigate to the project's directory using your terminal: `cd ./End-to-End-RAG-Project-using-ObjectBox-and-LangChain`
- Install the libraries: `pip install -r requirements.txt`
- Navigate to the app directory using your terminal: `cd ./app`
- Run `streamlit run app.py`
- Open the link displayed in the terminal in your preferred browser
- As I have already embedded the documents, you don't need to click the `Embedd Documents` button. If it's not working, click the `Embedd Documents` button and wait until the documents are processed.
- Enter your question about the PDFs found in the `us-census-data` directory
- Collaborations are welcome ❤️
- I would like to thank Krish Naik
- LinkedIn: Nebeyou Musie
- Gmail: [email protected]
- Telegram: Nebeyou Musie
