This example demonstrates training a visual reasoning agent on the ChartQA dataset using Agent-Lightning with the VERL algorithm and LangGraph framework. The agent answers questions about charts through a multi-step workflow with self-refinement.
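The refinement loop described above can be pictured as a simple state pipeline. The sketch below is purely illustrative: the actual agent in `chartqa_agent.py` builds this flow as a LangGraph graph with vision-language model calls, while every node here is stubbed with placeholder logic.

```python
# Illustrative sketch of the agent's control flow; node bodies are stubs,
# not the real model-backed implementations in chartqa_agent.py.

def observe(state):
    # Describe the chart image (axes, legend, chart type).
    state["observation"] = f"chart for question: {state['question']}"
    return state

def extract(state):
    # Pull the data points relevant to the question (stubbed values).
    state["data"] = [1, 2, 3]
    return state

def calculate(state):
    # Compute a candidate answer from the extracted data.
    state["answer"] = sum(state["data"])
    return state

def check(state):
    # Self-check: decide whether another refinement pass is needed.
    state["needs_refine"] = state.get("refined", 0) == 0
    return state

def refine(state):
    # One refinement pass, then loop back through calculate/check.
    state["refined"] = state.get("refined", 0) + 1
    return state

def run_agent(question, max_refines=1):
    state = {"question": question}
    for step in (observe, extract, calculate, check):
        state = step(state)
    while state["needs_refine"] and state.get("refined", 0) < max_refines:
        state = refine(state)
        state = calculate(state)
        state = check(state)
    return state["answer"]
```

The key design point is the conditional edge after `check`: the agent loops back through `refine` at most a bounded number of times before emitting its final answer.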
This example requires a single node with at least one 40GB GPU. Install dependencies with:
```bash
uv sync --frozen \
    --group dev \
    --group experiment \
    --group image \
    --group langchain \
    --group vllm-0-10-2 \
    --group torch-gpu-stable
```

Note: vLLM 0.10.2 is currently the only tested version. With other versions you may see errors such as `cu_seqlens_q must be on CUDA` or flash-attn installation failures (see vllm-project/vllm#27340).
Download the ChartQA dataset and prepare it for training:
```bash
cd examples/chartqa
python prepare_data.py
```

This downloads the ChartQA dataset from HuggingFace (`HuggingFaceM4/ChartQA`), saves images locally, and creates parquet files for training and testing. No HuggingFace token is required, as the dataset is public.
Dataset Statistics:
- Training: ~18,000 chart question-answer pairs
- Test: ~2,500 pairs
- Chart types: Bar, line, pie, scatter, etc.
| File/Directory | Description |
|---|---|
| `chartqa_agent.py` | Chart reasoning agent using LangGraph with a multi-step workflow (observe → extract → calculate → check → refine) |
| `train_chartqa_agent.py` | Training script using the VERL algorithm with configurable hyperparameters (`debug`, `qwen`) |
| `debug_chartqa_agent.py` | Debugging script to test the agent with cloud APIs or a local vLLM proxy |
| `prepare_data.py` | Script to download the ChartQA dataset from HuggingFace and prepare parquet files |
| `prompts.py` | Prompt templates for the agent workflow |
| `multimodal_utils.py` | Utility functions for encoding images to base64 |
| `env_var.py` | Environment variables and configuration |
| `data/` | Directory containing images and parquet files after download |
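Because the agent talks to the model over an OpenAI-compatible API, chart images have to be embedded in messages as base64 data URLs, which is the job of `multimodal_utils.py`. The helper below is an illustrative sketch of that idea; the function name and signature are assumptions, not necessarily what the module actually exports.

```python
import base64

def encode_image_base64(path: str) -> str:
    # Hypothetical helper: read an image file and wrap it as a data URL,
    # suitable for an OpenAI-style image_url content part.
    with open(path, "rb") as f:
        payload = base64.b64encode(f.read()).decode("ascii")
    return f"data:image/png;base64,{payload}"
```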
For quick testing with OpenAI or other cloud APIs (no local GPU required):
```bash
export OPENAI_API_KEY=<your-api-key>
python debug_chartqa_agent.py
```

For other providers (Azure, etc.), set `OPENAI_API_BASE`:

```bash
export OPENAI_API_BASE=https://your-resource.openai.azure.com/v1
export OPENAI_MODEL=gpt-4o
python debug_chartqa_agent.py
```

To test the agent with a local vLLM server and LLMProxy:
```bash
# Start a vLLM server (specify image path for VLM)
export CHARTQA_DATA_DIR=<path to chartqa data>
vllm serve Qwen/Qwen2-VL-2B-Instruct \
    --gpu-memory-utilization 0.6 \
    --max-model-len 4096 \
    --allowed-local-media-path $CHARTQA_DATA_DIR \
    --enable-prefix-caching \
    --port 8088
```
```bash
# Run the agent with LLMProxy
USE_LLM_PROXY=1 \
OPENAI_API_BASE=http://localhost:8088/v1 \
OPENAI_MODEL=Qwen/Qwen2-VL-2B-Instruct \
python debug_chartqa_agent.py
```

To run training in debug mode with a small number of runners:

```bash
python train_chartqa_agent.py debug --n-runners 2
```

You can also use an external store server (recommended for distributed setups). First, start the store:
```bash
agl store --port 4747
```

Then run the training script with the external store address:

```bash
AGL_MANAGED_STORE=0 python train_chartqa_agent.py qwen --external-store-address http://localhost:4747
```

If you want to track experiments with Weights & Biases, set the `WANDB_API_KEY` environment variable before training.
The script automatically launches agent workers and the training server. The agent workers execute chart reasoning rollouts using the vision-language model, while the training server applies the VERL algorithm (GRPO) to improve the model based on answer accuracy rewards.
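The accuracy reward mentioned above is commonly computed for ChartQA with "relaxed accuracy": numeric answers count as correct within a small relative tolerance, other answers must match exactly. The function below is a sketch of that metric under these assumptions; the training script's actual reward implementation may differ.

```python
def relaxed_accuracy(prediction: str, target: str, tolerance: float = 0.05) -> float:
    # Hypothetical sketch of a relaxed-accuracy reward for ChartQA-style answers:
    # numeric answers pass within a relative tolerance, strings need an exact
    # (case-insensitive) match.
    try:
        pred = float(prediction.strip().rstrip("%"))
        gold = float(target.strip().rstrip("%"))
        if gold == 0:
            return float(pred == gold)
        return float(abs(pred - gold) / abs(gold) <= tolerance)
    except ValueError:
        return float(prediction.strip().lower() == target.strip().lower())
```

A binary reward like this is what GRPO then uses to compare rollouts within a group and update the policy.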