hostingLLM: Multi-modal RAG Pipeline

A basic end-to-end Retrieval-Augmented Generation (RAG) pipeline that combines image and text retrieval with generative AI. It uses Google Gemini for multi-modal generation, OpenAI CLIP for image and text embeddings, and ChromaDB for vector search.

Features

Multi-modal (image + text) semantic search and extraction
Google Gemini LLM integration for document analysis
OpenAI CLIP for image and text embeddings
ChromaDB for fast vector search
Docker support for easy deployment

Setup

Clone the repo:
```
git clone <your-repo-url>
cd hostingLLM
```

Install dependencies:

```
pip install -r requirements.txt
```

Or use Docker:

```
docker build -t hostingllm .
docker run --env-file .env hostingllm
```

Set up your .env file:
```
GOOGLE_API_KEY=your_google_api_key_here
```
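
The scripts need to see `GOOGLE_API_KEY` at runtime. As a minimal sketch (the helper name `load_api_key` is hypothetical and not taken from this repository), a script could read the key from the environment and fail early with a clear message if it is missing:

```python
import os


def load_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key from the environment, or raise if it is unset.

    `load_api_key` is an illustrative helper, not part of this repository.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file or export it")
    return key
```

With Docker, `--env-file .env` passes the variable into the container. Whether the scripts load `.env` themselves when run locally is not stated here; if they do not, export the variable in your shell first.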

Usage

1. Index your images

```
python index_image.py
```

This will embed all images in the docs/ folder and store them in ChromaDB.
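
The indexing step can be pictured roughly as follows. This is a sketch under assumptions, not the contents of index_image.py: `embed_image` is a hypothetical callable that wraps a CLIP image encoder and returns a list of floats, the set of file suffixes is assumed, and `collection` stands for any object exposing ChromaDB's `Collection.add(ids=..., embeddings=..., metadatas=...)`.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}  # assumed set of image types


def index_images(folder, embed_image, collection):
    """Embed every image in `folder` and add it to a ChromaDB-style collection.

    `embed_image` is a hypothetical callable (path -> sequence of floats).
    Returns the number of images that were indexed.
    """
    paths = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not paths:
        return 0
    collection.add(
        ids=[p.name for p in paths],                       # file name used as the id
        embeddings=[list(embed_image(p)) for p in paths],  # one vector per image
        metadatas=[{"path": str(p)} for p in paths],       # keep the path for retrieval
    )
    return len(paths)
```

Storing the file path as metadata means a later query can return the matching image files directly, without a separate lookup table.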

2. Retrieve and generate

Edit retrieve_and_generate.py to set your query and prompt, then run:

```
python retrieve_and_generate.py
```
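
As a rough illustration of what "retrieve and generate" involves, the sketch below looks up the nearest images for a query embedding and assembles a prompt. It is not the code in retrieve_and_generate.py: the function names and the prompt template are hypothetical, and it assumes each indexed item carries a `"path"` metadata entry. The `collection` argument is any object exposing ChromaDB's `Collection.query(query_embeddings=..., n_results=...)`.

```python
def retrieve_images(query_embedding, collection, k=3):
    """Return (path, distance) pairs for the k nearest indexed images.

    ChromaDB's `Collection.query` returns one result list per query
    embedding; this sketch sends a single query and reads index 0.
    """
    result = collection.query(query_embeddings=[list(query_embedding)], n_results=k)
    metadatas = result["metadatas"][0]
    distances = result["distances"][0]
    return [(meta["path"], dist) for meta, dist in zip(metadatas, distances)]


def build_prompt(question, image_paths):
    """Combine the question with the retrieved file names (hypothetical template)."""
    listing = "\n".join(f"- {path}" for path in image_paths)
    return f"Answer using the attached images.\nImages:\n{listing}\nQuestion: {question}"
```

The retrieved images and the prompt would then be passed to Gemini together; that call is omitted here because it depends on the client library version in use.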

3. Gemini single-image demo

```
python geminivllm.py
```

This runs a simple Gemini demo on a single image and prompt.

4. Main demo (optional, text-only RAG)

```
python main.py
```

This runs a text-only RAG pipeline using SentenceTransformers, FAISS, and vLLM.
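
The retrieval half of a text-only RAG pipeline comes down to nearest-neighbour search over embedding vectors. The sketch below shows the idea with a brute-force cosine-similarity search in plain Python. It is a stand-in for illustration, not the FAISS code in main.py, and the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, vectors, k=2):
    """Return the indices of the k vectors most similar to `query`, best first."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine(query, vectors[i]),
        reverse=True,
    )
    return ranked[:k]
```

A vector index such as FAISS serves the same purpose but is built to stay fast on large collections, where scanning every vector becomes too slow.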

Environment Variables

GOOGLE_API_KEY: Your Google Gemini API key (required)

Folder Structure

index_image.py — Indexes images into ChromaDB
retrieve_and_generate.py — Retrieves relevant images and runs Gemini
geminivllm.py — Gemini single-image demo
main.py — (Optional) Text-only RAG demo
docs/ — Sample images and test files

License

MIT
