This project demonstrates a Retrieval-Augmented Generation (RAG) pipeline using:
- Google's Gemini 1.5 Flash (for chat) and `embedding-001` (for embeddings),
- FAISS (for vector search),
- LangChain (for orchestration),
- and LangSmith (for tracing/debugging).
The RAG pipeline has two main components:

### 1. Indexing

Used to preprocess and store documents in a searchable format.
Steps:
Load --> Split --> Embed --> Store
- Load: Load documents using a `DocumentLoader`
- Split: (optional) Use text splitters for large texts
- Embed: Use Gemini embeddings (`embedding-001`)
- Store: Store vectors in FAISS for fast retrieval
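The Split step above can be sketched as a simple fixed-size chunker with overlap (a minimal stand-in for LangChain's text splitters; the sizes and function name here are illustrative, not from this project):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows so facts aren't cut at chunk boundaries."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

chunks = chunk_text("word " * 100, size=50, overlap=10)
# Consecutive chunks share 10 characters, preserving context across boundaries.
```

Overlap matters for retrieval quality: without it, a sentence split across two chunks may never be retrieved whole.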
### 2. Retrieval and Generation

Used at inference time to fetch relevant content and generate answers.
Flow:
Question --> Retrieve --> Prompt --> LLM --> Answer
- Retrieve: Pull top-k relevant chunks from FAISS
- Prompt: Combine question + retrieved content
- LLM: Use Gemini 1.5 Flash to generate the answer
- Answer: Return final response to user
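The Retrieve and Prompt steps can be sketched in plain Python (a toy cosine-similarity top-k search standing in for FAISS; the prompt template and names are illustrative, not the project's actual code):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_top_k(query_vec, index, k=2):
    """index: list of (chunk_text, embedding) pairs; returns the k best chunks."""
    ranked = sorted(index, key=lambda pair: cosine(query_vec, pair[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question: str, chunks: list[str]) -> str:
    """Combine the question with retrieved context for the LLM."""
    context = "\n\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

index = [("RAG combines retrieval with generation.", [1.0, 0.0]),
         ("FAISS searches vectors quickly.", [0.0, 1.0])]
top = retrieve_top_k([0.9, 0.1], index, k=1)
# top == ["RAG combines retrieval with generation."]
```

FAISS does the same ranking at scale with approximate nearest-neighbor indexes instead of a full sort.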
Install dependencies:

```bash
pip install -r requirements.txt
```

Create a `.env` file in the root directory:

```env
GOOGLE_API_KEY=your_google_api_key
LANGCHAIN_API_KEY=your_langsmith_api_key
LANGCHAIN_PROJECT=your_project_name  # optional
```

Run the app:

```bash
python rag_app.py
```

You'll be prompted to ask a question, and the system will return a generated answer along with the retrieved source documents.
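The `.env` file is loaded at startup by `python-dotenv` (listed in the tools table). As a rough illustration of what that loading does, here is a minimal parser (a stand-in sketch, not the library itself; real `.env` values containing `#` need the actual library):

```python
import os

def load_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, ignoring blanks, comment lines, and trailing comments."""
    env = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments (naive: breaks values with '#')
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env

keys = load_env("GOOGLE_API_KEY=abc123\nLANGCHAIN_PROJECT=my-rag  # optional\n")
os.environ.update(keys)  # make the keys visible to the app and its libraries
# keys == {"GOOGLE_API_KEY": "abc123", "LANGCHAIN_PROJECT": "my-rag"}
```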
The FAISS vector index is saved locally as `my_faiss_index/`.
You can reuse it later without re-indexing documents.
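Reloading the saved index with LangChain looks roughly like this (a sketch assuming the `langchain-community` and `langchain-google-genai` packages; recent LangChain versions require the `allow_dangerous_deserialization` flag because FAISS indexes are pickled):

```python
from langchain_community.vectorstores import FAISS
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# The same embedding model used at indexing time must be supplied at load time.
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
store = FAISS.load_local(
    "my_faiss_index", embeddings,
    allow_dangerous_deserialization=True,  # this index was produced locally, so it's trusted
)
retriever = store.as_retriever(search_kwargs={"k": 3})
```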
The code includes optional LangSmith support:
- Enables tracing and monitoring your LLM pipeline.
- Helps debug prompt flow and understand model behavior.
📌 Set up a LangSmith account to get the `LANGCHAIN_API_KEY`.
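Tracing is typically switched on through environment variables in the same `.env` file (variable names per LangSmith's documentation; the project name is illustrative):

```env
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_langsmith_api_key
LANGCHAIN_PROJECT=my-rag-project
```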
Be cautious about API usage:
- Google offers a free tier via AI Studio.
- Use budget alerts in your Google Cloud Console.
- Monitor usage and avoid unnecessary charges.
| Tool | Purpose |
|---|---|
| Gemini 1.5 Flash | Fast LLM for generation |
| FAISS | Vector store for retrieval |
| LangChain | Orchestrates RAG pipeline |
| LangSmith | Logs/traces LLM executions |
| dotenv | Loads API keys from .env |
- Add PDF/Text file loaders
- Integrate a UI (e.g., Streamlit or Gradio)
- Add evaluation metrics with LangSmith
- Implement text chunking and metadata
Example session:

```
Ask a question: What is RAG?
Answer: RAG stands for Retrieval-Augmented Generation...
Sources:
[1] RAG stands for Retrieval-Augmented Generation.
[2] LangChain is a framework...
```