Project completed by:
- Ansh Harjai (ah7163)
- Ritvik Vasantha Kumar (rv2459)
This repository contains all the necessary files to run the Retrieval-Augmented Generation (RAG) project.
In this project, we fine-tuned the Llama-3-8b-instruct model on Google Colab using a custom dataset stored in data.json. The fine-tuned model was saved to Hugging Face for potential later use (fine-tuned model: https://huggingface.co/Data-harjai/ai_project_fine_tuned_llama). However, due to performance constraints when running the fine-tuned model locally, we implemented the RAG pipeline using the Llama-3-70b-chat-hf model hosted on Together.ai.
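For reference, a call to the hosted model through Together's Python SDK looks roughly like the following. This is a minimal sketch rather than the project's exact code; the prompt is illustrative, and `TOGETHER_API_KEY` is assumed to be set in the environment.

```python
from together import Together  # pip install together

# Minimal sketch of querying the hosted model; assumes TOGETHER_API_KEY is
# set in the environment. The prompt below is illustrative only.
client = Together()
response = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[
        {"role": "system", "content": "You are a helpful assistant for ROS 2 questions."},
        {"role": "user", "content": "What is a ROS 2 node?"},
    ],
)
print(response.choices[0].message.content)
```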
A video demonstration of the Gradio app is included in this repository, showcasing interactions with the Large Language Model (LLM): messages and contextual information are sent, and the model responds accordingly.
- The source data is extracted from the ROS 2 Documentation GitHub repository.
- The repository's text files are cleaned and stored in MongoDB.
- Code files are also stored in MongoDB without any preprocessing; all other non-text files in the repository are ignored.
- Each file is saved in MongoDB along with metadata such as (see the sketch after this list):
  - file_name
  - url
  - repo_name
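As an illustration, the load step via pymongo might look like the sketch below; the database and collection names, and the field values, are assumptions rather than the project's exact choices.

```python
from pymongo import MongoClient  # pip install pymongo

# Minimal sketch of the load step; database/collection names are assumptions.
mongo_client = MongoClient("mongodb://localhost:27017")
documents = mongo_client["rag_project"]["documents"]

documents.insert_one({
    "file_name": "Installation.rst",  # illustrative metadata values
    "url": "https://github.com/ros2/ros2_documentation/blob/rolling/source/Installation.rst",
    "repo_name": "ros2_documentation",
    "content": "ROS 2 installation instructions ...",  # cleaned file text
})
```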
- The extracted data is divided into smaller chunks.
- Each chunk is converted into a 300-dimensional embedding vector.
- These embeddings, along with their associated payloads, are stored in the Qdrant Vector Database (see the sketch below).
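The featurization steps above could be sketched as follows. The README does not name the embedding model, so spaCy's en_core_web_md (whose document vectors happen to be 300-dimensional) stands in for it here; the chunking scheme and the `ros2_docs` collection name are likewise assumptions.

```python
import spacy  # pip install spacy && python -m spacy download en_core_web_md
from qdrant_client import QdrantClient  # pip install qdrant-client
from qdrant_client.models import Distance, PointStruct, VectorParams

# Stand-in embedder: en_core_web_md yields 300-dimensional document vectors.
nlp = spacy.load("en_core_web_md")

def embed(text: str) -> list[float]:
    return nlp(text).vector.tolist()

def chunk(text: str, size: int = 1000) -> list[str]:
    # Naive fixed-size character chunking, for illustration only.
    return [text[i:i + size] for i in range(0, len(text), size)]

qdrant = QdrantClient("http://localhost:6333")
qdrant.recreate_collection(
    collection_name="ros2_docs",  # assumed collection name
    vectors_config=VectorParams(size=300, distance=Distance.COSINE),
)

document_text = "ROS 2 is a set of software libraries and tools ..."  # e.g. one file from MongoDB
points = [
    PointStruct(id=i, vector=embed(c), payload={"text": c})
    for i, c in enumerate(chunk(document_text))
]
qdrant.upsert(collection_name="ros2_docs", points=points)
```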
- When a user submits a query, the following steps are performed (see the sketch after this list):
  - Retrieve the 5 most similar embeddings from Qdrant based on the query.
  - Create a prompt consisting of:
    - A system message
    - The user query
    - The retrieved context from Qdrant
  - The constructed prompt is sent to the LLM.
  - The LLM processes the prompt and generates a response.
  - The response is displayed to the user via the Gradio interface.
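Putting the inference steps together, retrieval and prompt construction might look like the sketch below. It reuses the assumed `embed()` helper, `qdrant` client, and `ros2_docs` collection from the featurization sketch and the Together `client` from the model sketch above; the system message is illustrative.

```python
# Reuses qdrant, embed(), and the Together client from the sketches above.
def answer(query: str) -> str:
    # Retrieve the 5 most similar chunks from Qdrant.
    hits = qdrant.search(
        collection_name="ros2_docs",
        query_vector=embed(query),
        limit=5,
    )
    context = "\n\n".join(hit.payload["text"] for hit in hits)

    # System message + user query + retrieved context, sent to the LLM.
    response = client.chat.completions.create(
        model="meta-llama/Llama-3-70b-chat-hf",
        messages=[
            {"role": "system", "content": "Answer using only the provided ROS 2 context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return response.choices[0].message.content
```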
- `data.json`: Contains the custom dataset used for fine-tuning the model.
- `fine_tuning_notebook.ipynb`: The Python notebook used for fine-tuning the Llama-3-8b-instruct model.
- `Embedding_demo.ipynb`: The Python notebook used to demonstrate the embedding process.
- `Extraction_demo.ipynb`: The Python notebook used to demonstrate the data extraction process.
- `llm_connection.ipynb`: The Python notebook used to demonstrate retrieval and response from the LLM.
- Gradio App Video: A video showcasing the working of the Gradio interface for user interactions with the RAG system.
- Screenshots: A folder containing screenshots of the work done to complete this project.
- Docker Files: Docker files to run the project.
- Clone the repository.
- Build the app image: `docker build -t ai_demo_run_1 .`
- Run the Docker Compose file, or set up the containers manually as follows:
- Create a Docker network: `docker network create my_network`
- Run the MongoDB container on this network: `docker run -d --name mongodb --network my_network -p 27017:27017 mongo:5.0`
- Run the Qdrant container on this network: `docker run -d --name qdrant --network my_network -p 6333:6333 qdrant/qdrant:v1.3.0`
- Run the app container with your ClearML credentials: `docker run -p 8501:8501 -e CLEARML_API_ACCESS_KEY=YOUR_API_KEY -e CLEARML_API_SECRET_KEY=YOUR_API_SECRET -e CLEARML_API_HOST=YOUR_SERVER_URL ai_demo_run_1`
- Open the Gradio app in your browser and run the initialize database command to start the ETL pipeline and initialize the Qdrant vector database.
- Finally, ask the LLM your questions.
This repository covers the full RAG workflow, demonstrating the use of modern tools to efficiently generate context-aware responses with LLMs.