This example demonstrates how to train a text-to-SQL agent on the Spider dataset using Agent-Lightning with reinforcement learning. It is compatible with Agent-Lightning v0.2 or later.
This example depends on LangChain v0.x and several SQL-related libraries. Install the required dependencies with:

    pip install "langgraph<1.0" "langchain[openai]<1.0" "langchain-community" "langchain-text-splitters<1.0" "sqlparse" "nltk"

Additionally, follow the installation guide to install Agent-Lightning and VERL-related dependencies.
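To confirm that the installed packages respect the version bounds in the pip command above, a small check along the following lines may help. It is a minimal sketch and not part of this example's code: the `REQUIREMENTS` table and the `check_requirements` helper are hypothetical names, and the check assumes plain numeric version strings such as `0.3.27`. It also does not inspect the `openai` extra of `langchain`.

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical table: distribution name -> exclusive upper bound on the major
# version (None means no bound), mirroring the pip command above.
REQUIREMENTS = {
    "langgraph": 1,
    "langchain": 1,
    "langchain-community": None,
    "langchain-text-splitters": 1,
    "sqlparse": None,
    "nltk": None,
}


def check_requirements(requirements=REQUIREMENTS, get_version=version):
    """Return a list of readable problems; an empty list means nothing was found."""
    problems = []
    for name, max_major in requirements.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        # Assumes the version starts with a plain integer major component.
        major = int(installed.split(".")[0])
        if max_major is not None and major >= max_major:
            problems.append(f"{name} {installed} is installed, expected <{max_major}.0")
    return problems
```

Calling `check_requirements()` with no arguments inspects the current environment. The `get_version` parameter exists only so the check can be exercised without touching the real environment.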
Detailed dataset preparation instructions are available in the How to Train a SQL Agent guide.

| File/Directory | Description |
|---|---|
| `train_sql_agent.py` | Training script for SQL agents with support for multiple model configurations (Qwen, LLaMA, fast mode for CI) |
| `sql_agent.py` | SQL agent implementation using LangGraph and LangChain, with debugging capabilities |
| `data/` | Directory containing the Spider dataset files |
| `spider_eval/` | Evaluation utilities for assessing SQL agent performance |

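For intuition about what evaluating a text-to-SQL agent can involve, one common idea is to run the predicted query and a reference query against the same database and compare the returned rows. The sketch below illustrates that idea only; it does not reproduce the code in `spider_eval/`. The `execution_match` helper and the toy `singer` table are hypothetical and exist purely for illustration.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries return the same rows, ignoring row order.

    Ignoring order is a simplification: it would be too lenient for
    reference queries whose ORDER BY clause matters.
    """
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to execute counts as a miss
    gold = conn.execute(gold_sql).fetchall()
    return sorted(predicted, key=repr) == sorted(gold, key=repr)


# Toy in-memory database standing in for a real Spider database file.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Ben', 45), (3, 'Cy', 27);
    """
)

# Two differently written queries that select the same rows.
same = execution_match(
    conn,
    "SELECT name FROM singer WHERE age > 30",
    "SELECT name FROM singer WHERE age >= 31",
)
```

A boolean like `same` could, under these assumptions, be turned into a 0/1 reward signal for reinforcement learning.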
Train a SQL agent using the Qwen2.5-Coder-1.5B-Instruct model with the following command. This requires a single node with at least one 40 GB GPU:

    python train_sql_agent.py qwen

If you want to use an NPU for training, please refer to the Launch Training with NPUS section in How to Train a SQL Agent.
To test and debug the SQL agent interactively:

    python sql_agent.py

This command requires an OpenAI-compatible API service. Configure your service endpoint and credentials using the `OPENAI_API_BASE` and `OPENAI_API_KEY` environment variables.
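Before launching the interactive run, it can be useful to verify that both environment variables are actually set. The following is a minimal sketch, not code from `sql_agent.py`; the `missing_env_vars` helper is a hypothetical name introduced here.

```python
import os

# The two variables named above.
REQUIRED_VARS = ("OPENAI_API_BASE", "OPENAI_API_KEY")


def missing_env_vars(environ=os.environ, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current process environment; an empty list means both variables are present.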