This example demonstrates training a Retrieval-Augmented Generation (RAG) agent using Agent-Lightning with retrieval capabilities. The agent answers multi-hop questions from a tiny MuSiQue dataset by retrieving and reasoning over Wikipedia passages.
This example can run on a single GPU for demonstration purposes.
Step 1: Set up the environment. We recommend setting up with uv and then activating the virtual environment:

```bash
uv sync --frozen --extra apo --group agents --group torch-gpu-stable --extra verl --group rag
source .venv/bin/activate
```

Step 2: Prepare the tiny dataset.
```bash
pip install gdown
cd examples/rag
# tiny training dataset
gdown --fuzzy "https://drive.google.com/file/d/1Pq4Ag8zVoN8gUtLu0LcBfY35Dm5zL0hq/view?usp=drive_link" \
  -O dataset_tiny.parquet
# chunks_candidate_tiny.pkl
gdown --fuzzy "https://drive.google.com/file/d/1REXCpRLbeZu1KfWWKhIGEQe_WNHUOBkS/view?usp=drive_link" \
  -O chunks_candidate_tiny.pkl
# index_hnsw_faiss_n32e40_tiny.index
gdown --fuzzy "https://drive.google.com/file/d/1f6P-h_8KSRhe5pqDHWbRQWvUhTygfZ-c/view?usp=drive_link" \
  -O index_hnsw_faiss_n32e40_tiny.index
```

Step 3: Start the MCP server. Open a terminal and run:
```bash
python wiki_retriever_mcp.py
```

Step 4: Start training. Open another terminal and run:
```bash
python train_rag.py
```

| File/Directory | Description |
|---|---|
| `rag_agent.py` | RAG agent example using the OpenAI Agents SDK, with debugging utils |
| `train_rag.py` | Initiates the GRPO training process |
| `metric_utils.py` | Scoring utilities for exact match, F1 score, and response parsing |
| `wiki_retriever_mcp.py` | MCP server for Wikipedia retrieval |
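To make the scoring terms in the table concrete, here is a minimal sketch of exact match and token-level F1 in the common SQuAD-style formulation. It is an illustration only and is not taken from `metric_utils.py`; the function names and the normalization rules (lowercasing, dropping punctuation and English articles) are assumptions, and the file in this example may score answers differently.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Exact match rewards only fully correct answers, while F1 gives partial credit when the prediction contains the reference plus extra tokens, which is why the two are often reported together.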
To enable semantic retrieval with this MCP server, you need two files:
- FAISS index file (`.index`)
- Chunk list file (`.pkl`)
These two files work together: the FAISS index stores the vector embeddings and their mapping to integer IDs, while the pickle file stores the actual text chunks. The integer IDs in the index correspond exactly to the positions in the chunk list.
First, you need a collection of text passages (chunks). For example, you can download a Wikipedia-based dataset such as wiki18_100w.zip from the FlashRAG_dataset or use other pre-split corpora.
- Use a sentence embedding model (e.g., `BAAI/bge-large-en-v1.5`) to encode each chunk into a vector.
- Build a FAISS index from these vectors.
  - In this example, we use an HNSW index (Hierarchical Navigable Small World graph), which supports efficient approximate nearest-neighbor search.
  - The index stores only embeddings and integer IDs (no raw text).
- Store the raw text chunks in a Python list.
- Save this list with `pickle`.
- The index ID returned by FAISS corresponds to the list index in this file. For example, if FAISS search returns `I[0][i] = 12345`, then the corresponding text chunk is `chunks[12345]`.
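The ID-to-chunk lookup described above can be sketched as follows. This is a minimal illustration, not the retriever's actual code: the helper `ids_to_chunks`, the file name `toy_list.pkl`, and the hand-written ID matrix `I` are hypothetical, and `I` only stands in for the ID matrix a FAISS search would return. The sketch skips negative IDs because FAISS fills a result row with `-1` when it finds fewer neighbors than requested.

```python
import pickle
import tempfile
from pathlib import Path


def ids_to_chunks(id_row, chunks):
    """Map one row of result IDs to their text chunks.

    Negative IDs (FAISS uses -1 for "no result") and out-of-range IDs are
    skipped instead of being used as Python list indices.
    """
    return [chunks[i] for i in id_row if 0 <= i < len(chunks)]


# Round-trip a toy chunk list through pickle, as a retriever would load it.
toy_chunks = [
    "The Eiffel Tower is located in Paris, France.",
    "Albert Einstein developed the theory of relativity.",
    "Python is a popular programming language.",
]
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "toy_list.pkl"  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(toy_chunks, f)
    with open(path, "rb") as f:
        chunks = pickle.load(f)

# Stand-in for the ID matrix of a search: one row per query, k columns.
I = [[2, 0, -1]]
top_chunks = ids_to_chunks(I[0], chunks)
print(top_chunks)
```

Without the range check, a `-1` would silently select the last chunk in the list, since Python treats negative indices as offsets from the end.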
- `nq_hnsw_faiss_n32e40.index`
  - Type: FAISS HNSW index
  - Contains:
    - Vector embeddings
    - Graph structure for fast search
    - Integer IDs mapping to chunk positions
- `nq_list.pkl`
  - Type: Pickled Python list
  - Element type: string (or dict with text + metadata, depending on preprocessing)
  - Example: `["The Eiffel Tower is located in Paris, France.", "Albert Einstein developed the theory of relativity.", ...]`
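Because list elements may be plain strings or dicts, code that reads the chunk list may need to handle both shapes. The helper below is a small sketch of one way to do that; `chunk_text` is a hypothetical name, and the `"text"` key is an assumption, since the actual key depends on how the corpus was preprocessed.

```python
def chunk_text(item, text_key="text"):
    """Return the passage text for a chunk stored as a str or as a dict."""
    if isinstance(item, str):
        return item
    if isinstance(item, dict):
        # Raises KeyError if the assumed key is not present in the dict.
        return item[text_key]
    raise TypeError(f"Unsupported chunk type: {type(item).__name__}")


# A toy list mixing both element shapes; the "title" field is illustrative.
mixed = [
    "The Eiffel Tower is located in Paris, France.",
    {
        "text": "Albert Einstein developed the theory of relativity.",
        "title": "Albert Einstein",
    },
]
texts = [chunk_text(item) for item in mixed]
```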
Warning: The following example demonstrates a small-scale workflow only. For large datasets, encode the text in batches and add each batch of embeddings to the index incrementally, rather than holding all embeddings in memory at once.
```python
import faiss
import pickle
from sentence_transformers import SentenceTransformer

# 1. Prepare your text chunks (list of strings)
chunk_texts = [
    "The Eiffel Tower is located in Paris, France.",
    "Albert Einstein developed the theory of relativity.",
    "Python is a popular programming language.",
    # ... more chunks
]

# 2. Load embedding model
model = SentenceTransformer("BAAI/bge-large-en-v1.5")

# 3. Encode text chunks into unit-length embeddings
embeddings = model.encode(chunk_texts, normalize_embeddings=True)

# 4. Build FAISS HNSW index (L2 metric; on unit-length vectors the
#    L2 ranking matches the cosine-similarity ranking)
dim = embeddings.shape[1]
index = faiss.IndexHNSWFlat(dim, 32)  # M = 32 graph neighbors per node
index.hnsw.efConstruction = 40  # candidate list size used while building
index.add(embeddings)  # vector i gets ID i, matching chunk_texts[i]

# 5. Save FAISS index
faiss.write_index(index, "nq_hnsw_faiss_n32e40.index")

# 6. Save chunk list
with open("nq_list.pkl", "wb") as f:
    pickle.dump(chunk_texts, f)

print("Index and chunk list saved successfully.")
```
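For the large-dataset case mentioned in the warning, the batching pattern might look like the sketch below. It is library-agnostic so that it runs without FAISS or a model download: `batched`, `add_in_batches`, `ListIndex`, and `toy_encode` are all hypothetical names introduced here, with `ListIndex` and `toy_encode` standing in for a real index and a real embedding model.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def add_in_batches(index, chunks: Sequence[str], encode: Callable,
                   batch_size: int = 256) -> int:
    """Encode chunks batch by batch and append each batch to the index.

    Batches are added in list order, so vector i still belongs to chunks[i].
    Returns the number of chunks added.
    """
    added = 0
    for batch in batched(chunks, batch_size):
        index.add(encode(batch))
        added += len(batch)
    return added


class ListIndex:
    """Toy stand-in for a vector index; it only records added vectors."""

    def __init__(self) -> None:
        self.vectors: List[List[float]] = []

    def add(self, vectors) -> None:
        self.vectors.extend(vectors)

    @property
    def ntotal(self) -> int:
        return len(self.vectors)


def toy_encode(batch: Sequence[str]) -> List[List[float]]:
    """Hypothetical encoder: one 2-d vector per chunk (length, word count)."""
    return [[float(len(text)), float(len(text.split()))] for text in batch]


toy_index = ListIndex()
toy_chunks = [f"chunk number {n}" for n in range(10)]
total = add_in_batches(toy_index, toy_chunks, toy_encode, batch_size=4)
```

With the real libraries, the same helper should work by passing the FAISS index from the example above as `index` and something like `lambda batch: model.encode(batch, normalize_embeddings=True)` as `encode`. Adding batches strictly in list order is what keeps the index IDs aligned with positions in the pickled chunk list.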