Official repository for the paper Beyond Token Length: Step Pruner for Efficient and Accurate Reasoning in Large Language Models.

Thanks to ShorterBetter; this work modifies and extends their code base.
Models:

- SP-1.5B: ModelScope Model
- SP-7B: ModelScope Model
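A minimal sketch for pulling a released checkpoint with the ModelScope CLI; the model ID below is a placeholder, since the exact repository IDs are not linked here:

```bash
pip install modelscope
# Placeholder ID; substitute the actual SP-1.5B / SP-7B ModelScope repo
modelscope download --model your-org/SP-7B --local_dir ./checkpoints/SP-7B
```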
Setup:

- Clone the repository with the VERL submodule:

  ```bash
  git clone --recursive https://github.com/your-username/StepPruner.git
  cd StepPruner
  ```
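  If the clone was made without `--recursive`, the submodule can still be fetched afterwards:

  ```bash
  # Fetch the VERL submodule into an already-cloned checkout
  git submodule update --init --recursive
  ```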
- Install VERL dependencies. Follow the official VERL installation guide for detailed instructions; the basic installation involves:

  ```bash
  # Install VERL from source
  cd verl
  pip install -e .

  # Install additional dependencies for training backends
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
  pip install flash-attn --no-build-isolation
  pip install "vllm>=0.8.0"  # For rollout generation
  ```
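  As an optional sanity check (assuming all three packages built successfully), the core imports can be verified:

  ```bash
  # Should print "ok" if verl, vllm, and flash-attn are importable
  python -c "import verl, vllm, flash_attn; print('ok')"
  ```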
- Install additional dependencies:

  ```bash
  cd ..  # Back to the repository root
  pip install -r requirements.txt
  ```
Training datasets are prepared and located in `/deepscaler/data/`.
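To confirm the datasets are in place (assuming the path is relative to the repository root):

```bash
# List the prepared training data files
ls -lh deepscaler/data/
```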
We use lighteval to evaluate our models.
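A minimal sketch of an evaluation run, assuming lighteval's vLLM backend and the MATH-500 task; the checkpoint path is a placeholder, and task names and argument syntax vary across lighteval versions:

```bash
# Illustrative invocation only; not the paper's exact evaluation command
lighteval vllm \
    "pretrained=path/to/SP-7B,dtype=bfloat16" \
    "lighteval|math_500|0|0"
```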
The training scripts are located in `scripts/train/` and include:

- `sb_7b.sh` - training script for 7B parameter models
- `sb_1.5B.sh` - training script for 1.5B parameter models
To train:

- Configure your environment variables, for example:
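  A minimal sketch; the variable names below are assumptions, so check the training scripts for the exact variables they read:

  ```bash
  export CUDA_VISIBLE_DEVICES=0,1,2,3   # GPUs visible to the trainer (assumed)
  export WANDB_API_KEY=your_wandb_key   # only if the scripts log to Weights & Biases (assumed)
  ```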
- Customize the reward function (optional): edit `/StepPruner/verl/verl/workers/reward_manager/naive.py`.
- Run training:

  ```bash
  # For the 7B model
  bash scripts/train/sb_7b.sh

  # For the 1.5B model
  bash scripts/train/sb_1.5B.sh
  ```