Example project for fine-tuning, running inference, and evaluating LLMs using TorchTune on UW's Tillicum.
Getting Started · Usage · Additional Resources
This project uses TorchTune to demonstrate how to fine-tune LLMs with LoRA on Tillicum's multi-GPU nodes.
## Getting Started

Prerequisites:

- Hugging Face account with Llama 3.2 access.
- SSH access to Tillicum.
- SSH into Tillicum and connect to a GPU instance:

  ```bash
  salloc --gpus=1
  ```

- Clone the repository:

  ```bash
  git clone https://github.com/josecols/ft-llms-tillicum.git ft-llms
  cd ft-llms
  ```

- Set up the project path environment variable. Add the following line to your `~/.bashrc` file (replace the path with your actual project location):

  ```bash
  export FT_LLMS_ROOT=/gpfs/scrubbed/<netid>/projects/ft-llms
  ```

  Then reload your bash configuration:

  ```bash
  source ~/.bashrc
  ```
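Before submitting jobs, it can be worth confirming that the variable resolves in your shell. A minimal check, using an assumed example path (substitute your own location):

```shell
# Assumed example path; replace with your actual project location.
export FT_LLMS_ROOT=/gpfs/scrubbed/example-netid/projects/ft-llms

# Confirm the variable resolves before launching jobs that rely on it.
echo "Project root: ${FT_LLMS_ROOT}"
```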
- Enable the conda module:

  ```bash
  module load conda
  ```

- Create and activate the conda environment:

  ```bash
  conda create -n ft-llms python=3.12
  conda activate ft-llms
  ```

- Install TorchTune dependencies:

  ```bash
  pip install torch torchvision torchao
  ```

- Install TorchTune:

  ```bash
  pip install torchtune
  ```

- Install W&B to track fine-tuning jobs:

  ```bash
  pip install wandb
  ```
  Note: You will need to authenticate your W&B account the first time you use it. Additionally, modify the `metric_logger` section of the configuration file `configs/llama3_2_3B_cuda.yaml` with your W&B project details.

- Install the Hugging Face libraries to run inference tasks:

  ```bash
  pip install transformers peft accelerate
  ```

- Install the ROUGE score package to run evaluation tasks:

  ```bash
  pip install rouge-score
  ```

- Download the NLTK data (required by the ROUGE package):

  ```bash
  python -c "import nltk; nltk.download('punkt_tab')"
  ```
## Usage

### Dataset

This workshop demo uses the PLOS dataset from BioLaySumm 2025 to fine-tune a model for lay summarization of biomedical articles.

To download and prepare the dataset for training, run the following command:
```bash
python scripts/prepare_dataset.py --use-abstracts
```

> **Note**
> You can omit the `--use-abstracts` flag if you prefer to train with the full article texts as input. However, you might need to adjust the training configuration to prevent out-of-memory errors.
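To make the expected data shape concrete, here is a hedged sketch of what one prepared training record might look like. The `input`/`output` field names and the JSONL layout are assumptions for illustration, not the documented output of `scripts/prepare_dataset.py`:

```python
import json

# Hypothetical example of one prepared training record; the actual field
# names and file layout produced by scripts/prepare_dataset.py may differ.
record = {
    "input": "Abstract: Tumor-suppressor gene TP53 regulates the cell cycle ...",
    "output": "Scientists studied a gene that helps stop cells from growing "
              "out of control ...",
}

# Instruct-style fine-tuning datasets are commonly stored as JSONL, one
# prompt/completion record per line.
line = json.dumps(record)
print(line)
```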
### Model

The demo uses the Llama 3.2 3B model, but you can choose a different model if you prefer. To see all the supported model configurations, run `tune ls`.

```bash
tune download meta-llama/Llama-3.2-3B-Instruct --output-dir models/Llama-3.2-3B-Instruct --ignore-patterns "original/consolidated.00.pth" --hf-token <your-token>
```

> **Note**
> Make sure to replace `<your-token>` with your Hugging Face token. You can also set it via the `HF_TOKEN` environment variable.
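If you prefer the environment-variable route, a minimal sketch (the token value shown is a placeholder, not a real credential):

```shell
# Placeholder value; substitute the token from your Hugging Face account settings.
export HF_TOKEN="hf_placeholder_token"

# With HF_TOKEN exported, the --hf-token flag can be omitted from tune download.
echo "HF_TOKEN is set: ${HF_TOKEN:+yes}"
```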
### Fine-tuning

Run single-node fine-tuning on 8 GPUs:

```bash
sbatch tasks/train_8_gpus.slurm
```

There is also a multi-node example script (`tasks/train_16_gpus.slurm`) that you can adapt for various distributed setups.

To check the job's progress, use the `squeue -u <netid>` command.
### Inference and evaluation

Run the following command to generate the summaries with the fine-tuned model:

```bash
sbatch tasks/inference.slurm
```

Run the following command to evaluate the model summaries against the gold-standard references:

```bash
sbatch tasks/eval.slurm
```
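To give a feel for what the evaluation step measures, here is a deliberately simplified ROUGE-1 computation on toy strings. The real `rouge-score` package additionally applies stemming and (for ROUGE-L variants) sentence tokenization, so its numbers will differ:

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap, whitespace tokens, no stemming."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each candidate token counts at most as often
    # as it appears in the reference.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

gold = "the gene slows tumor growth in mice"
summary = "the gene slows tumor growth"
print(f"ROUGE-1 F1: {rouge1_f1(gold, summary):.3f}")  # -> ROUGE-1 F1: 0.833
```

The candidate here matches 5 of the 7 reference unigrams (recall 5/7) with perfect precision, giving F1 = 10/12 ≈ 0.833.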