Fine-tuning LLMs on Custom Datasets

Example project for fine-tuning, running inference, and evaluating LLMs using TorchTune on UW's Tillicum.

Getting Started · Usage · Additional Resources

Getting Started

This project uses TorchTune to demonstrate how to fine-tune LLMs with LoRA on Tillicum's multi-GPU nodes.

Prerequisites

Installation

  1. SSH into Tillicum and request an interactive GPU allocation:

    salloc --gpus=1
  2. Clone the repository:

    git clone https://github.com/josecols/ft-llms-tillicum.git ft-llms
    cd ft-llms
  3. Set up the project path environment variable:

    Add the following line to your ~/.bashrc file (replace the path with your actual project location):

    export FT_LLMS_ROOT=/gpfs/scrubbed/<netid>/projects/ft-llms

    Then reload your bash configuration:

    source ~/.bashrc
  4. Enable the conda module:

    module load conda
  5. Create and activate conda environment:

    conda create -n ft-llms python=3.12
    conda activate ft-llms
  6. Install TorchTune dependencies:

    pip install torch torchvision torchao
  7. Install TorchTune:

    pip install torchtune
  8. Install Weights & Biases (wandb) to track fine-tuning jobs:

    pip install wandb

    Note: The first time you run a job, you will need to authenticate with your Weights & Biases account (for example, by running wandb login). Additionally, update the metric_logger section of the configuration file configs/llama3_2_3B_cuda.yaml with your Weights & Biases project details.

  9. Install the Hugging Face libraries needed to run inference tasks:

    pip install transformers peft accelerate
  10. Install the ROUGE score package to run evaluation tasks:

    pip install rouge-score
  11. Download the NLTK data (required by the ROUGE package):

    python -c "import nltk; nltk.download('punkt_tab')"
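Step 8 above mentions updating the metric_logger section of configs/llama3_2_3B_cuda.yaml. As a rough sketch (the component path follows TorchTune's WandBLogger, and the project/entity values are placeholders, so verify both against your TorchTune version and account):

```yaml
metric_logger:
  _component_: torchtune.training.metric_logging.WandBLogger
  project: ft-llms-tillicum          # your W&B project name
  entity: <your-wandb-username>      # your W&B username or team
```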

Usage

Prepare the Dataset

This workshop demo uses the PLOS dataset from BioLaySumm 2025 to fine-tune a model for lay summarization of biomedical articles.

To download and prepare the dataset for training, run the following command:

python scripts/prepare_dataset.py --use-abstracts

Note

You can omit the --use-abstracts flag if you prefer to train with the full article texts as input. However, you might need to adjust the training configuration to prevent out-of-memory errors.
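The exact schema produced by prepare_dataset.py is not shown here, but instruct-style fine-tuning datasets are commonly stored as JSON Lines records pairing an input with a target output. A minimal stdlib sketch of what one lay-summarization training example might look like (the field names and text are illustrative, not the script's actual schema):

```python
import json

# Hypothetical training record: the model learns to map an abstract
# (or full article text, if --use-abstracts is omitted) to a lay summary.
record = {
    "input": "BACKGROUND: We examined protein folding dynamics in vitro ...",
    "output": "Scientists studied how proteins take on their shapes ...",
}

# One JSON object per line (JSON Lines) is a common on-disk format
# for fine-tuning datasets.
line = json.dumps(record)
print(json.loads(line)["output"])
```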

Download the Model

The demo uses the Llama 3.2 3B model, but you can choose a different model if you prefer. To see all the supported model configurations, run tune ls.

tune download meta-llama/Llama-3.2-3B-Instruct --output-dir models/Llama-3.2-3B-Instruct --ignore-patterns "original/consolidated.00.pth" --hf-token <your-token>

Note

Make sure to replace <your-token> with your Hugging Face access token. You can also provide it via the HF_TOKEN environment variable.

Fine-Tuning

Run single-node fine-tuning on 8 GPUs:

sbatch tasks/train_8_gpus.slurm

There is also a multi-node example script (tasks/train_16_gpus.slurm) that you can adapt for various distributed setups.

To check the job's progress, use the squeue -u <netid> command.
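The bundled SLURM scripts are not reproduced here, but a single-node 8-GPU TorchTune launch typically pairs SLURM directives with tune run. A rough sketch, assuming the recipe name and the config file used elsewhere in this repo (check tasks/train_8_gpus.slurm for the actual script):

```
#!/bin/bash
#SBATCH --job-name=ft-llms
#SBATCH --nodes=1
#SBATCH --gpus=8

module load conda
conda activate ft-llms
cd "$FT_LLMS_ROOT"

# One worker process per GPU; recipe and config names are assumptions.
tune run --nproc_per_node 8 lora_finetune_distributed \
    --config configs/llama3_2_3B_cuda.yaml
```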

Inference

Run the following command to generate the summaries with the fine-tuned model:

sbatch tasks/inference.slurm

Evaluation

Run the following command to evaluate the model summaries against the gold-standard references:

sbatch tasks/eval.slurm
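The evaluation script relies on the rouge-score package installed earlier. As an intuition for what it measures, ROUGE-1 is unigram overlap between a generated summary and the reference. A simplified, stdlib-only sketch (the real package adds stemming, proper tokenization, and ROUGE-2/ROUGE-L variants):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each shared word counts at most min(pred, ref) times.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1(
    "proteins fold into complex shapes",
    "scientists found proteins fold into shapes",
)
print(f"ROUGE-1 F1: {score:.3f}")  # -> ROUGE-1 F1: 0.727
```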

Additional Resources
