Gemini-Powered RAG Chatbot using LangChain

This project demonstrates a Retrieval-Augmented Generation (RAG) pipeline using:

Google's Gemini 1.5 Flash (for chat) and Gemini embeddings (embedding-001),
FAISS (for vector search),
LangChain (for orchestration),
and LangSmith (for tracing/debugging).

Project Overview

The RAG pipeline has two main components:

1. Indexing Pipeline

Used to preprocess and store documents in a searchable format.

Steps:


Load --> Split --> Embed --> Store

Load: Load documents with a LangChain document loader
Split: (optional) Use text splitters for large texts
Embed: Use Gemini embeddings (embedding-001)
Store: Store vectors in FAISS for fast retrieval
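The four indexing steps can be sketched without any external services. The example below is a minimal illustration, not this project's code: `split_text`, `toy_embed`, and `build_index` are hypothetical names, `toy_embed` is a hashed bag-of-words stand-in for the Gemini embedding-001 model, and a plain Python list stands in for the FAISS store.

```python
import hashlib
import math


def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split: cut text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Embed: hashed bag-of-words vector (assumption: stand-in for a real model)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(docs: list[str], chunk_size: int = 200, overlap: int = 50):
    """Store: keep (chunk, vector) pairs in a list (assumption: stand-in for FAISS)."""
    index = []
    for doc in docs:  # Load: here the documents are already in-memory strings
        for chunk in split_text(doc, chunk_size, overlap):
            index.append((chunk, toy_embed(chunk)))
    return index
```

Overlapping windows keep a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.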

2. Retrieval + Generation

Used at inference time to fetch relevant content and generate answers.

Flow:


Question --> Retrieve --> Prompt --> LLM --> Answer

Retrieve: Pull top-k relevant chunks from FAISS
Prompt: Combine question + retrieved content
LLM: Use Gemini 1.5 Flash to generate the answer
Answer: Return final response to user
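The query-time flow can be sketched in the same spirit. This is an illustrative outline under stated assumptions: `retrieve`, `build_prompt`, and `answer` are hypothetical names, the index is a list of (chunk, vector) pairs rather than a FAISS store, and `embed` and `llm` are caller-supplied callables standing in for the Gemini embedding model and Gemini 1.5 Flash. The prompt wording is also an assumption.

```python
import math
from typing import Callable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec: Sequence[float], index, k: int = 2) -> list[str]:
    """Retrieve: return the k chunks whose vectors are most similar to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Prompt: combine the question with numbered context chunks."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(
    question: str,
    embed: Callable[[str], Sequence[float]],
    index,
    llm: Callable[[str], str],
    k: int = 2,
):
    """Question -> Retrieve -> Prompt -> LLM -> Answer, returning sources too."""
    chunks = retrieve(embed(question), index, k)
    return llm(build_prompt(question, chunks)), chunks
```

Returning the retrieved chunks alongside the generated text makes it possible to show sources next to the answer.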

How to Run

Prerequisites

Install dependencies:

pip install -r requirements.txt

.env File

Create a .env file in the root directory:

GOOGLE_API_KEY=your_google_api_key
LANGCHAIN_API_KEY=your_langsmith_api_key
LANGCHAIN_PROJECT=your_project_name   # optional
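A small startup check can catch a missing key before the first API call fails. The sketch below is an assumption about how such a check might look, not code from this project: `missing_keys` is a hypothetical helper, and it treats only GOOGLE_API_KEY as required because LangSmith support is described as optional. It would typically run after python-dotenv's `load_dotenv()` has populated the environment.

```python
import os
from typing import Mapping, Optional

# Assumption: only the Gemini key is mandatory; the LangSmith keys are optional.
REQUIRED_KEYS = ("GOOGLE_API_KEY",)


def missing_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required keys that are absent or blank in the environment."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

For example, calling `missing_keys()` at the top of the script and exiting with a clear message when the list is non-empty avoids a less readable authentication error later.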

Run the App

python rag_app.py

You'll be prompted to ask a question, and the system will return a generated answer along with the retrieved source documents.

FAISS Index

The FAISS vector index is saved locally as:

my_faiss_index/

You can reuse it later without re-indexing documents.
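Reusing the index usually comes down to a load-or-build decision at startup. The sketch below shows that pattern only: `load_or_build` is a hypothetical helper, and the `load` and `build` callables are placeholders. In a LangChain setup they would likely wrap `FAISS.load_local` and `FAISS.from_documents` followed by `save_local`, but that wiring is an assumption about this project.

```python
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(index_dir: str, load: Callable[[str], T], build: Callable[[], T]) -> T:
    """Load the saved index if its directory exists, otherwise build a new one."""
    if Path(index_dir).is_dir():
        return load(index_dir)
    return build()
```

Note that an index built with one embedding model should be rebuilt if the embedding model changes, since stored vectors and new query vectors would no longer be comparable.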

LangSmith Integration

The code includes optional LangSmith support:

Enables tracing and monitoring of your LLM pipeline.
Helps debug prompt flow and understand model behavior.

📌 Set up a LangSmith account to get the LANGCHAIN_API_KEY.
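One way to keep tracing optional is to switch it on only when a LangSmith key is present. The sketch below is an assumption about how that toggle might be written, not this project's code: `configure_tracing` is a hypothetical helper that sets the `LANGCHAIN_TRACING_V2` environment variable, which LangChain reads to decide whether to send traces.

```python
import os
from typing import MutableMapping, Optional


def configure_tracing(env: Optional[MutableMapping[str, str]] = None) -> bool:
    """Enable LangSmith tracing only if LANGCHAIN_API_KEY is set and non-blank."""
    env = os.environ if env is None else env
    enabled = bool(env.get("LANGCHAIN_API_KEY", "").strip())
    env["LANGCHAIN_TRACING_V2"] = "true" if enabled else "false"
    return enabled
```

Calling this before any chain is constructed lets the app run unchanged for users who have no LangSmith account.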

Google Gemini API Billing

Be cautious about API usage:

Google offers a free tier via AI Studio.
Use budget alerts in your Google Cloud Console.
Monitor usage and avoid unnecessary charges.

🔗 Gemini Pricing

Tech Stack

Tool	Purpose
Gemini Flash	Fast LLM for generation
FAISS	Vector store for retrieval
LangChain	Orchestrates RAG pipeline
LangSmith	Logs/traces LLM executions
dotenv	Loads API keys from `.env`

Future Improvements

Add PDF/Text file loaders
Integrate a UI (e.g., Streamlit or Gradio)
Add evaluation metrics with LangSmith
Implement text chunking and metadata

Example Query

Ask a question: What is RAG?

Answer: RAG stands for Retrieval-Augmented Generation...

Sources:
[1] RAG stands for Retrieval-Augmented Generation.
[2] LangChain is a framework...
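The output layout above can be produced by a small formatting helper. This is a sketch under assumptions: `format_response` and its `max_len` truncation limit are hypothetical and not taken from the project.

```python
def format_response(answer: str, sources: list[str], max_len: int = 80) -> str:
    """Render the answer followed by a numbered list of source snippets."""
    lines = [f"Answer: {answer}", "", "Sources:"]
    for i, text in enumerate(sources, start=1):
        # Truncate long chunks so each source fits on one line.
        snippet = text if len(text) <= max_len else text[:max_len].rstrip() + "..."
        lines.append(f"[{i}] {snippet}")
    return "\n".join(lines)
```

Numbering the sources from 1 matches the `[1]`, `[2]` labels shown in the example.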

References

https://python.langchain.com/docs/tutorials/rag/
