Commit b764c66

Merge pull request #160 from NYU-RTS/ml_ai
first pass on llm example
2 parents 10400bd + cb21e9f commit b764c66

2 files changed

+179
-3
lines changed

docs/hpc/08_ml_ai_hpc/03_llm_on_hpc.md

Lines changed: 0 additions & 3 deletions
This file was deleted.
Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
1+
# Run a Hugging Face model
2+
3+
Here we provide an example of how to run a Hugging Face large language model (LLM) on the NYU Greene cluster.
4+
5+
## Prepare environment
6+
### Create project directory
7+
8+
After [logging on to a Greene login node](../02_connecting_to_hpc/01_connecting_to_hpc.mdx), make a directory for this project:
9+
```bash
10+
[NetID@log-1 ~]$ mkdir -p /scratch/NetID/llm_example
11+
[NetID@log-1 ~]$ cd /scratch/NetID/llm_example
12+
```
13+
:::note
14+
You'll need to replace `NetID` above with your own NetID.
15+
:::
16+
17+
### Move to a compute node
18+
Some of the following steps can require significant resources, so we'll move to a compute node. This way we won't overload the login node we're on.
19+
```bash
20+
[NetID@log-1 llm_example]$ srun --cpus-per-task=2 --mem=10GB --time=04:00:00 --pty /bin/bash
21+
```
22+
23+
### Copy appropriate overlay file to the project directory
24+
```bash
25+
[NetID@cm001 llm_example]$ cp -rp /scratch/work/public/overlay-fs-ext3/overlay-50G-10M.ext3.gz .
26+
[NetID@cm001 llm_example]$ gunzip overlay-50G-10M.ext3.gz
27+
```
28+
29+
### Launch Singularity container in read/write mode
30+
```bash
31+
[NetID@cm001 llm_example]$ singularity exec --overlay overlay-50G-10M.ext3:rw /scratch/work/public/singularity/cuda12.1.1-cudnn8.9.0-devel-ubuntu22.04.2.sif /bin/bash
32+
```
33+
34+
### Install Miniforge in the container
35+
```bash
36+
Singularity> wget --no-check-certificate https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
37+
Singularity> bash Miniforge3-Linux-x86_64.sh -b -p /ext3/miniforge3
38+
```
39+
40+
### Create environment script
41+
Use an editor like nano or vim to create the file `/ext3/env.sh`. The contents should be:
42+
```bash
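#!/bin/bash

unset -f which

# Make conda and the Miniforge binaries available in this shell
source /ext3/miniforge3/etc/profile.d/conda.sh
export PATH=/ext3/miniforge3/bin:$PATH
# Prepend to any existing PYTHONPATH (no trailing colon when it is unset)
export PYTHONPATH=/ext3/miniforge3/bin${PYTHONPATH:+:$PYTHONPATH}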
```
51+
52+
### Activate the environment
53+
```bash
54+
Singularity> source /ext3/env.sh
55+
```
56+
57+
### Install packages in environment
58+
```bash
59+
Singularity> conda config --remove channels defaults
60+
Singularity> conda update -n base conda -y
61+
Singularity> conda clean --all --yes
62+
Singularity> conda install pip -y
63+
Singularity> pip install torch numpy transformers
64+
```
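Before leaving the container, it can be worth confirming that the packages are visible to the interpreter from `/ext3/miniforge3`. The following is a minimal sketch of such a check, not part of the original instructions; the helper name `missing_packages` and the file name are hypothetical. It only asks whether each package can be found, so it does not load `torch` itself.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the names from `names` that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The three packages installed with pip in the step above.
    required = ["torch", "numpy", "transformers"]
    # With the environment activated, this path should be under /ext3/miniforge3.
    print("Interpreter:", sys.executable)
    missing = missing_packages(required)
    print("Missing packages:", missing if missing else "none")
```

Saved under a name such as `check_env.py` (an assumed name), it could be run with `python check_env.py` inside the container after sourcing `/ext3/env.sh`.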
65+
66+
### Exit from Singularity and the compute node
67+
```bash
68+
Singularity> exit
69+
[NetID@cm001 llm_example]$ exit
70+
```
71+
72+
:::tip
73+
You can find more information about using Singularity and Conda on our HPC systems in our documentation [Singularity with Conda](https://sites.google.com/nyu.edu/nyu-hpc/hpc-systems/greene/software/singularity-with-miniconda).
74+
:::
75+
76+
## Prepare script
77+
Create a Python script from the code in steps 1-9 below and save it in a file called `huggingface.py`:
78+
79+
1. Import necessary modules:
80+
```python
81+
import torch
82+
import numpy as np
83+
from transformers import AutoTokenizer, AutoModel
84+
```
85+
86+
1. Create a list of input texts (referred to as reviews in the steps below):
87+
```python
88+
texts = ["How do I get a replacement Medicare card?",
89+
"What is the monthly premium for Medicare Part B?",
90+
"How do I terminate my Medicare Part B (medical insurance)?",
91+
"How do I sign up for Medicare?",
92+
"Can I sign up for Medicare Part B if I am working and have health insurance through an employer?",
93+
"How do I sign up for Medicare Part B if I already have Part A?"]
94+
```
95+
96+
1. Choose a model name from the Hugging Face model hub and instantiate the model and tokenizer objects for that model. We set `output_hidden_states=True` so that the model output also includes the hidden states (embeddings) of every layer.
97+
```python
98+
model_name = 'cardiffnlp/twitter-roberta-base-sentiment'
99+
tokenizer = AutoTokenizer.from_pretrained(model_name)
100+
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
101+
```
102+
103+
1. Use the tokenizer object to create the token ids that are fed to the model. We set `return_tensors="pt"` so that the ids are returned as PyTorch tensors, and `padding=True` so that all texts are padded to the same length:
104+
```python
105+
ids = tokenizer(texts, padding=True, return_tensors="pt")
106+
```
107+
108+
1. Select the device (`cuda` when a GPU is available, otherwise `cpu`) and move both the model and the tokenized inputs to it. Since we are only extracting embeddings, a forward pass is all we need, so we put the model in evaluation mode with `eval()`:
109+
```python
110+
device = 'cuda' if torch.cuda.is_available() else 'cpu'
111+
model.to(device)
112+
ids = ids.to(device)
113+
model.eval()
114+
```
115+
116+
1. Perform the forward pass with gradient tracking disabled and store the model output in `out`:
117+
```python
118+
with torch.no_grad():
119+
out = model(**ids)
120+
```
121+
122+
1. Extract the embeddings of each review from the last layer:
123+
```python
124+
last_hidden_states = out.last_hidden_state
125+
```
126+
127+
1. For classification purposes, we extract the embedding of the CLS token (`<s>` in RoBERTa), which is the first embedding in the sequence for each review:
128+
```python
129+
sentence_embedding = last_hidden_states[:, 0, :]
130+
```
131+
132+
1. We can check the shape of the final sentence embeddings for all the reviews. The output should look like `torch.Size([6, 768])`, where 6 is the batch size, since we passed in the 6 reviews from step 2, and 768 is the embedding size of the RoBERTa model used.
133+
```python
134+
print("Shape of the batch embedding: {}".format(sentence_embedding.shape))
135+
```
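To make the shapes in the steps above concrete, here is a toy, pure-Python sketch of two of the ideas: padding a batch to a common length (what `padding=True` requests) and selecting the first token's vector for each text (what `last_hidden_states[:, 0, :]` does). It is an illustration only and should not be added to `huggingface.py`. The helper names, token ids, and hidden values are made up, and right-side padding is an assumption of this sketch rather than a statement about every tokenizer.

```python
def pad_batch(batch, pad_id):
    """Right-pad every id list to the longest length; return (ids, attention_mask)."""
    max_len = max(len(seq) for seq in batch)
    ids = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    # 1 marks a real token, 0 marks padding the model should ignore.
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return ids, mask


def first_token_vectors(hidden):
    """Pure-Python analogue of hidden[:, 0, :] for a nested list
    shaped [batch][sequence][hidden_size]."""
    return [sequence[0] for sequence in hidden]


# Two made-up token id sequences of different lengths.
toy_ids, toy_mask = pad_batch([[0, 11, 12, 2], [0, 21, 2]], pad_id=1)
print(toy_ids)   # [[0, 11, 12, 2], [0, 21, 2, 1]]
print(toy_mask)  # [[1, 1, 1, 1], [1, 1, 1, 0]]

# Fake hidden states: batch of 2, sequence length 4, hidden size 3.
toy_hidden = [
    [[float(b * 100 + t * 10 + h) for h in range(3)] for t in range(4)]
    for b in range(2)
]
toy_cls = first_token_vectors(toy_hidden)
print(len(toy_cls), len(toy_cls[0]))  # 2 3
```

In the real script the same selection turns a tensor of shape `[6, sequence_length, 768]` into one of shape `[6, 768]`.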
136+
137+
## Prepare Sbatch file
138+
After saving the above code in a script called `huggingface.py`, create a file called `run.SBATCH` with the following code:
139+
140+
```bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:10:00
#SBATCH --mem=64GB
#SBATCH --gres=gpu
#SBATCH --job-name=huggingface
#SBATCH --output=huggingface.out

module purge

# Pass --nv to Singularity only when a GPU device is present on the node
if [ -e /dev/nvidia0 ]; then nv="--nv"; fi

# Use the same container image that the environment was installed in
singularity exec $nv \
--overlay /scratch/NetID/llm_example/overlay-50G-10M.ext3:rw \
/scratch/work/public/singularity/cuda12.1.1-cudnn8.9.0-devel-ubuntu22.04.2.sif \
/bin/bash -c "source /ext3/env.sh; python /scratch/NetID/llm_example/huggingface.py"
```
160+
:::note
161+
You'll need to change `NetID` in the script above to your NetID.
162+
If you're using a different directory name and/or path you'll also need to update that in the script above.
163+
:::
164+
165+
## Run the run.SBATCH file
166+
```bash
167+
[NetID@log-1 llm_example]$ sbatch run.SBATCH
168+
```
169+
The output is written to `huggingface.out`. It should look something like:
171+
```
172+
Some weights of RobertaModel were not initialized from the model checkpoint at cardiffnlp/twitter-roberta-base-sentiment and are
173+
newly initialized: ['pooler.dense.bias', 'pooler.dense.weight']
174+
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
175+
Shape of the batch embedding: torch.Size([6, 768])
176+
```
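If you want to check the result programmatically rather than by eye, the shape line can be parsed from the job output. This is a minimal sketch that assumes the output contains the exact line printed by step 9; the function name `parse_embedding_shape` is hypothetical.

```python
import re

# Matches the line printed by step 9 of huggingface.py.
SHAPE_PATTERN = re.compile(
    r"Shape of the batch embedding: torch\.Size\(\[(\d+), (\d+)\]\)"
)


def parse_embedding_shape(log_text):
    """Return (batch_size, embedding_size) from the job output, or None if absent."""
    match = SHAPE_PATTERN.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "some warning text\nShape of the batch embedding: torch.Size([6, 768])\n"
print(parse_embedding_shape(sample))  # (6, 768)
```

To apply it to a finished job, read the file first, for example `parse_embedding_shape(open("huggingface.out").read())`, and compare the result with `(6, 768)`.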
177+
178+
## Acknowledgements
179+
These instructions were developed and provided by [Laiba Mehnaz](https://www.linkedin.com/in/laiba-mehnaz-a81455158/), a member of [AIfSR](https://www.linkedin.com/company/ai-for-scientific-research).
