RAGS is a modular, production-ready pipeline for Retrieval-Augmented Generation (RAG) built with Python, HuggingFace Transformers, FAISS, and PEFT (LoRA). It supports document embedding, retrieval, LLM fine-tuning, and automated evaluation, with a focus on clarity, reproducibility, and extensibility. The codebase is minimal and easy to extend.
- Document Embedding: Converts `.txt` or `.jsonl` files into FAISS vector indexes using Sentence Transformers. Auto-chunks text and supports flexible input formats.
- Retrieval Pipeline: Uses FAISS for fast similarity search over embedded documents.
- LLM Fine-Tuning: Fine-tunes language models (e.g., GPT-2) on custom datasets using HuggingFace and PEFT (LoRA). Supports incremental retraining: just set `model_name` to your previous model folder.
- Evaluation: Automated similarity-based evaluation and accuracy reporting. Results are written back to the same CSV for easy tracking.
- Interactive Chat: Terminal chat interface for live Q&A with your RAG pipeline.
- CLI Entry Point: All major tasks can be run from `app.py` or individual scripts.
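The auto-chunking mentioned in the embedding feature can be pictured as a fixed-size splitter with overlap. A hedged sketch (`chunk_text`, the 500-character size, and the 50-character overlap are illustrative, not the pipeline's actual settings):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks, overlapping so
    sentences cut at a boundary still appear whole in one chunk."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# 1200 characters -> 3 overlapping chunks (0-500, 450-950, 900-1200)
print(len(chunk_text("a" * 1200)))  # 3
```

Each chunk is then embedded and indexed independently, so retrieval can return the most relevant passage rather than a whole document.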
```
├── app.py                  # CLI entry point (optional)
├── requirements.txt        # Python dependencies
├── README.md               # Project documentation
├── data/                   # Raw and processed data
│   ├── raw/                # Source text files
│   └── ...
├── embeddings/             # FAISS indexes (per file, with datetime)
├── models/                 # Fine-tuned LLMs (with datetime)
├── reports/                # Evaluation CSVs
├── src/
│   ├── embedder.py         # Embedding pipeline
│   ├── corpus_jsonl_gen.py # Text-to-JSONL converter
│   ├── evaluation.py       # Evaluation utilities
│   ├── train_llm.py        # LLM fine-tuning (LoRA)
│   ├── pipeline.py         # RAG pipeline (retrieval + generation)
│   ├── interface.py        # Terminal chat & batch QA
│   └── ...
├── run_finetune.py         # Script to launch fine-tuning
└── testing.py              # Evaluation script
```
- Install dependencies:

  ```
  pip install -r requirements.txt
  ```

- Prepare data: Place your `.txt` or `.jsonl` files in `data/raw/`. For `.txt` files, the pipeline will auto-convert them to `.jsonl`.
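The auto-conversion is handled by `src/corpus_jsonl_gen.py`; conceptually it looks something like this sketch (the `text` field name and the one-record-per-line splitting are assumptions about the real converter):

```python
import json

def txt_to_jsonl(txt_path, jsonl_path):
    """Write each non-empty line of a .txt file as a one-field JSON record."""
    with open(txt_path, encoding="utf-8") as src, \
         open(jsonl_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.strip()
            if line:
                dst.write(json.dumps({"text": line}) + "\n")
```

The resulting `.jsonl` file can then be fed to the embedding step like any hand-prepared corpus.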
Create FAISS indexes from your data:

```python
from src.embedder import vector_from_jsonl

vector_from_jsonl('data/raw/yourfile.txt')  # or .jsonl
```

Output: `embeddings/yourfile_<datetime>/index.faiss`
Edit `src/train_llm.py`:

- First time: set `model_name = "gpt2-medium"` (or any HuggingFace model).
- Retraining: set `model_name = "models/fine_tuned_rag_model_<datetime>"` to continue training from your previous checkpoint.
Or use the script:

```
python run_finetune.py
```

Data: expects `data/fine_tune_dataset.jsonl` with `prompt` and `completion` fields.
Output: `models/fine_tuned_rag_model_<datetime>/`
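For reference, each line of `data/fine_tune_dataset.jsonl` is one JSON object with those two fields; the contents below are purely illustrative:

```json
{"prompt": "What does FAISS do?", "completion": "FAISS performs fast similarity search over dense vectors."}
{"prompt": "What is LoRA?", "completion": "LoRA fine-tunes a model by training small low-rank adapter matrices."}
```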
Start a terminal chat session with your RAG pipeline:

```python
from src.interface import terminalchat
from src.pipeline import pipelinefn

qa_chain = pipelinefn()
terminalchat(qa_chain)
```

Type your questions; enter 'exit' to quit. Only the best answer is shown.
Run batch QA and evaluation:

```python
from src.interface import improve
from src.evaluation import evaluate_and_save, report_eval

improve(qa_chain, your_dataframe, csv_path='reports/improveX.csv')
evaluate_and_save(X)
report_eval(X)
```

Results are written back to `reports/improveX.csv` (adding `Similarity` and `Correct` columns).

Or use the CLI/testing script:

```
python testing.py
```
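Conceptually, the similarity-based evaluation scores each answer against a reference and thresholds the score into a `Correct` flag. A minimal sketch, using `difflib` string similarity as a stand-in (the real `evaluate_and_save` may well use embedding similarity, and the 0.8 threshold is an assumption):

```python
from difflib import SequenceMatcher

def score_answer(answer, reference, threshold=0.8):
    """Return (similarity, correct) for one answer/reference pair."""
    similarity = SequenceMatcher(None, answer.lower(), reference.lower()).ratio()
    return similarity, similarity >= threshold

sim, correct = score_answer("Paris is the capital.", "Paris is the capital.")
print(sim, correct)  # 1.0 True
```

Applied row-by-row to the QA CSV, this yields exactly the two columns (`Similarity`, `Correct`) that the report step aggregates into an accuracy figure.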
- All evaluation and improvement results are stored in the same CSV for simplicity.
- The embedding pipeline only creates standalone indexes (no global index).
- Fine-tuned models and embeddings are versioned by datetime for reproducibility.
- The codebase is modular; extend or swap components as needed.
- Python 3.8+
- HuggingFace Transformers
- sentence-transformers
- peft
- datasets
- pandas
- faiss-cpu
MIT License
- HuggingFace Transformers & Datasets
- PEFT (LoRA)
- FAISS
- Sentence Transformers