Commit f341a12

clean up ml tutorial
1 parent 680521c commit f341a12

File tree

5 files changed, +210 -119 lines changed
File renamed without changes.

docs/guides/mlp_tutorials/llm-finetuning.md

Lines changed: 56 additions & 30 deletions
@@ -2,45 +2,54 @@
 
 # LLM Finetuning Tutorial
 
-This tutorial will take the model from the [LLM Inference][ref-mlp-llm-inference-tutorial] tutorial and show you how to perform finetuning. This means that we take the model and train it on some new custom data to change its behavior.
+This tutorial will take the model from the [LLM Inference][ref-mlp-llm-inference-tutorial] tutorial and show you how to perform finetuning.
+This means that we take the model and train it on some new custom data to change its behavior.
 
-To complete the tutorial, we set up some extra libraries that will help us to update the state of the machine learning model. We also write a script that will allow us to unlock more of the performance offered by the cluster, by running our fine-tuning task on two or more nodes.
+To complete the tutorial, we set up some extra libraries that will help us to update the state of the machine learning model.
+We also write a script that will allow us to unlock more of the performance offered by the cluster, by running our fine-tuning task on two or more nodes.
 
 ### Prerequisites
 
-This tutorial assumes you've already successfully completed the [LLM Inference][ref-mlp-llm-inference-tutorial] tutorial. For fine-tuning Gemma, we will rely on the NGC PyTorch container and the libraries we've already installed in the Python environment used previously.
+This tutorial assumes you've already successfully completed the [LLM Inference][ref-mlp-llm-inference-tutorial] tutorial.
+For fine-tuning Gemma, we will rely on the NGC PyTorch container and the libraries we've already installed in the Python environment used previously.
 
 ### Set up TRL
 
-We will use HuggingFace TRL to fine-tune Gemma-7B on the [OpenAssistant dataset](https://huggingface.co/datasets/OpenAssistant/oasst_top1_2023-08-25). First, we need to update our Python environment with some extra libraries to support TRL. To do this, we can launch an interactive shell in the PyTorch container, just like we did in the previous tutorial. Then, we install `peft`:
+We will use HuggingFace TRL to fine-tune Gemma-7B on the [OpenAssistant dataset](https://huggingface.co/datasets/OpenAssistant/oasst_top1_2023-08-25).
+First, we need to update our Python environment with some extra libraries to support TRL.
+To do this, we can launch an interactive shell in the PyTorch container, just like we did in the previous tutorial.
+Then, we install `peft`:
 
-```bash
-cd $SCRATCH/gemma-inference
-srun --environment=gemma-pytorch --container-workdir=$PWD --pty bash
-source ./gemma-venv/bin/activate
-python -m pip install peft==0.11.1
+```console
+$ cd $SCRATCH/gemma-inference
+$ srun --environment=gemma-pytorch --container-workdir=$PWD --pty bash
+$ source ./gemma-venv/bin/activate
+$ python -m pip install peft==0.11.1
 ```
 
-Next, we also need to clone and install the `trl` Git repository so that we have access to the fine-tuning scripts in it. For this purpose, we will install the package in editable mode in the virtual environment. This makes it available in python scripts independent of the current working directory and without creating a redundant copy of the files.
+Next, we also need to clone and install the `trl` Git repository so that we have access to the fine-tuning scripts in it.
+For this purpose, we will install the package in editable mode in the virtual environment.
+This makes it available in Python scripts independent of the current working directory and without creating a redundant copy of the files.
 
-```
-[cluster][user@cluster-ln001 ~]$ git clone https://github.com/huggingface/trl -b v0.7.11
-[cluster][user@cluster-ln001 ~]$ pip install -e ./trl # install in editable mode
+```console
+$ git clone https://github.com/huggingface/trl -b v0.7.11
+$ pip install -e ./trl # install in editable mode
 ```
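
As a quick check (an illustrative session, not part of the original tutorial; the version string assumes the `v0.7.11` branch checked out above), the editable install should now import from any directory while `gemma-venv` is active:

```console
$ cd $SCRATCH
$ python -c "import trl; print(trl.__version__)"
0.7.11
```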
 
 When this step is complete, you can exit the shell by typing `exit`.
 
 ### Finetune Gemma-7B
 
-At this point, we can set up a fine-tuning script and start training Gemma-7B. Use your favorite text editor to create the file `fine-tune-gemma.sh` just outside the trl and gemma-venv directories:
+At this point, we can set up a fine-tuning script and start training Gemma-7B.
+Use your favorite text editor to create the file `fine-tune-gemma.sh` just outside the trl and gemma-venv directories:
 
 ```bash title="fine-tune-gemma.sh"
 #!/bin/bash
 
 source ./gemma-venv/bin/activate
 
 set -x
-
+
 export HF_HOME=$SCRATCH/huggingface
 export TRANSFORMERS_VERBOSITY=info
 
@@ -60,16 +69,27 @@ accelerate launch --config_file trl/examples/accelerate_configs/multi_gpu.yaml \
 --gradient_accumulation_steps 1 \
 --learning_rate 2e-4 \
 --save_steps 200 \
---max_steps 400 \
+--max_steps 400 \
 --use_peft \
 --lora_r 16 --lora_alpha 32 \
 --lora_target_modules q_proj k_proj v_proj o_proj \
 --output_dir gemma-finetuned-openassistant
 ```
 
-This script has quite a bit more content to unpack. We use HuggingFace accelerate to launch the fine-tuning process, so we need to make sure that accelerate understands which hardware is available and where. Setting this up will be useful in the long run because it means we can tell SLURM how much hardware to reserve, and this script will setup all the details for us.
+This script has quite a bit more content to unpack.
+We use HuggingFace accelerate to launch the fine-tuning process, so we need to make sure that accelerate understands which hardware is available and where.
+Setting this up will be useful in the long run because it means we can tell SLURM how much hardware to reserve, and this script will set up all the details for us.
 
-The cluster has four GH200 chips per compute node. We can make them accessible to scripts run through srun/sbatch via the option `--gpus-per-node=4`. Then, we calculate how many processes accelerate should launch. We want to map each GPU to a separate process, this should be four processes per node. We multiply this by the number of nodes to obtain the total number of processes. Next, we use some bash magic to extract the name of the head node from SLURM environment variables. Accelerate expects one main node and launches tasks on the other nodes from this main node. Having sourced our python environment at the top of the script, we can then launch Gemma fine-tuning. The first four lines of the launch line are used to configure accelerate. Everything after that configures the `trl/examples/scripts/sft.py` Python script, which we use to train Gemma.
+The cluster has four GH200 chips per compute node.
+We can make them accessible to scripts run through srun/sbatch via the option `--gpus-per-node=4`.
+Then, we calculate how many processes accelerate should launch.
+We want to map each GPU to a separate process, so this should be four processes per node.
+We multiply this by the number of nodes to obtain the total number of processes.
+Next, we use some bash magic to extract the name of the head node from SLURM environment variables.
+Accelerate expects one main node and launches tasks on the other nodes from this main node.
+Having sourced our Python environment at the top of the script, we can then launch Gemma fine-tuning.
+The first four lines of the launch line are used to configure accelerate.
+Everything after that configures the `trl/examples/scripts/sft.py` Python script, which we use to train Gemma.
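
To make the "bash magic" concrete, the sketch below shows one way such a preamble can compute these values; the variable names are illustrative rather than copied from the tutorial's script, and the flags shown are standard `accelerate launch` options:

```bash
#!/bin/bash
# Illustrative sketch, not the tutorial's exact script: derive the total
# process count and the main node from SLURM's environment variables.
GPUS_PER_NODE=4                                   # four GH200 GPUs per node
NUM_PROCESSES=$(( SLURM_NNODES * GPUS_PER_NODE )) # one process per GPU
MAIN_ADDR=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)

accelerate launch --config_file trl/examples/accelerate_configs/multi_gpu.yaml \
    --num_machines="$SLURM_NNODES" \
    --num_processes="$NUM_PROCESSES" \
    --machine_rank="$SLURM_NODEID" \
    --main_process_ip="$MAIN_ADDR" \
    trl/examples/scripts/sft.py  # ...followed by the training options shown above
```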
 
 Next, we also need to create a short SLURM batch script to launch our fine-tuning script:
 
@@ -87,26 +107,31 @@ set -x
 srun -ul --environment=gemma-pytorch --container-workdir=$PWD bash fine-tune-gemma.sh
 ```
 
-We set a few Slurm parameters like we already did in the previous tutorial. Note that we leave the number of nodes unspecified. This way, we can decide the number of nodes we want to use when we launch the batch job using Slurm.
+We set a few Slurm parameters like we already did in the previous tutorial.
+Note that we leave the number of nodes unspecified.
+This way, we can decide the number of nodes we want to use when we launch the batch job using Slurm.
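
For example (illustrative invocations; recall each node has four GPUs), the same batch script can be submitted at different scales:

```console
$ sbatch --nodes=2 fine-tune-sft.sbatch   # 2 nodes, 8 GPUs
$ sbatch --nodes=4 fine-tune-sft.sbatch   # 4 nodes, 16 GPUs
```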
 
-Now that we've setup a fine-tuning script and a Slurm batch script, we can launch our fine-tuning job. We'll start out by launching it on two nodes. It should take about 10-15 minutes to fine-tune Gemma:
+Now that we've set up a fine-tuning script and a Slurm batch script, we can launch our fine-tuning job.
+We'll start out by launching it on two nodes.
+It should take about 10-15 minutes to fine-tune Gemma:
 
-```
-[cluster][user@cluster-ln001 ~]$ sbatch --nodes=1 fine-tune-sft.sbatch
+```console
+$ sbatch --nodes=2 fine-tune-sft.sbatch
 ```
 
 ### Compare finetuned Gemma against default Gemma
 
-We can reuse our python script from the first tutorial to do inference on the Gemma model that we just fine-tuned. Let's try out a different prompt in `gemma-inference.py`:
+We can reuse our Python script from the first tutorial to do inference on the Gemma model that we just fine-tuned.
+Let's try out a different prompt in `gemma-inference.py`:
 
-```
+```python
 input_text = "What are the 5 tallest mountains in the Swiss Alps?"
 ```
 
 We can run inference using our batch script from the previous tutorial:
 
-```
-[cluster][user@cluster-ln001 ~]$ sbatch ./gemma-inference.sbatch
+```console
+$ sbatch ./gemma-inference.sbatch
 ```
 
 Inspecting the output should yield something like this:
@@ -126,7 +151,7 @@ the 5 tallest mountains in the Swiss Alps:
 
 Next, we can update the model line in our Python inference script to use the model that we just fine-tuned:
 
-```
+```python
 model = AutoModelForCausalLM.from_pretrained("gemma-finetuned-openassistant/checkpoint-400", device_map="auto")
 ```
 
@@ -157,13 +182,14 @@ n canton of Switzerland, and it is a popular destination for mountaineers and hi
 These mountains are all located in the Swiss Alps, and they are a popular destination for mountaineers and hikers. If you are planning a trip to the Swiss Alps, be sure to check out these mountains and plan your itinerary accordingly.
 ```
 
-Your output may look different after fine-tuning, but in general you will see that the fine-tuned model generates more verbose output. Double-checking the output reveals that the list of mountains produced by Gemma is not actually correct. The following table lists the 5 tallest Swiss peaks, according to Wikipedia.
-
+Your output may look different after fine-tuning, but in general you will see that the fine-tuned model generates more verbose output.
+Double-checking the output reveals that the list of mountains produced by Gemma is not actually correct.
+These are the 5 tallest Swiss peaks according to Wikipedia:
 
 1. Dufourspitze 4,634m
 2. Nordend 4,609m
 3. Zumsteinspitze 4,563m
 4. Signalkuppe 4,554m
 5. Dom 4,545m
 
-This is an important reminder that machine-learning models like Gemma need extra checks to confirm any generated outputs.
+This is an important reminder that machine-learning models like Gemma need extra checks to confirm any generated outputs.
