# Run a Hugging Face model

Here we provide an example of how to run a Hugging Face large language model (LLM) on the NYU Greene cluster.

## Prepare environment
### Create project directory

After [logging on to a Greene login node](../02_connecting_to_hpc/01_connecting_to_hpc.mdx), make a directory for this project:
```bash
[NetID@log-1 ~]$ mkdir -p /scratch/NetID/llm_example
[NetID@log-1 ~]$ cd /scratch/NetID/llm_example
```
:::note
You'll need to replace `NetID` above with your own NetID.
:::

### Move to a compute node
Some of the following steps require significant resources, so we'll move to a compute node to avoid overloading the login node:
```bash
[NetID@log-1 llm_example]$ srun --cpus-per-task=2 --mem=10GB --time=04:00:00 --pty /bin/bash
```
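
You can confirm that you're now on a compute node, since the prompt changes from `log-1` to a compute node name such as `cm001`. A quick optional check:
```bash
[NetID@cm001 llm_example]$ hostname
```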

### Copy appropriate overlay file to the project directory
```bash
[NetID@cm001 llm_example]$ cp -rp /scratch/work/public/overlay-fs-ext3/overlay-50G-10M.ext3.gz .
[NetID@cm001 llm_example]$ gunzip overlay-50G-10M.ext3.gz
```

### Launch Singularity container in read/write mode
```bash
[NetID@cm001 llm_example]$ singularity exec --overlay overlay-50G-10M.ext3:rw /scratch/work/public/singularity/cuda12.1.1-cudnn8.9.0-devel-ubuntu22.04.2.sif /bin/bash
```
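
Once inside the container, the prompt changes to `Singularity>`. Since the overlay is mounted read/write (`:rw`), an optional sanity check is to confirm that `/ext3` is writable:
```bash
Singularity> touch /ext3/test && rm /ext3/test
```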

### Install Miniforge in the container
```bash
Singularity> wget --no-check-certificate https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
Singularity> bash Miniforge3-Linux-x86_64.sh -b -p /ext3/miniforge3
```
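
The installer script is no longer needed after this step, so you can optionally remove it to keep the project directory tidy:
```bash
Singularity> rm Miniforge3-Linux-x86_64.sh
```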

### Create environment script
Use an editor like nano or vim to create the file `/ext3/env.sh`. The contents should be:
```bash
#!/bin/bash

# Unset any shell function named `which` so the real binary is used
unset -f which

# Make conda available in this shell and put the Miniforge
# binaries first on the search path
source /ext3/miniforge3/etc/profile.d/conda.sh
export PATH=/ext3/miniforge3/bin:$PATH
export PYTHONPATH=/ext3/miniforge3/bin:$PYTHONPATH
```

### Activate the environment
```bash
Singularity> source /ext3/env.sh
```
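
To verify that the environment is active, check that `python` now resolves to the copy inside the overlay:
```bash
Singularity> which python
/ext3/miniforge3/bin/python
```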

### Install packages in environment
```bash
Singularity> conda config --remove channels defaults
Singularity> conda update -n base conda -y
Singularity> conda clean --all --yes
Singularity> conda install pip -y
Singularity> pip install torch numpy transformers
```
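
A quick way to confirm the installation succeeded is to import the packages and print their versions:
```bash
Singularity> python -c "import torch, transformers; print(torch.__version__, transformers.__version__)"
```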

### Exit from Singularity and the compute node
```bash
Singularity> exit
[NetID@cm001 llm_example]$ exit
```

:::tip
You can find more information about using Singularity and Conda on our HPC systems in our documentation [Singularity with Conda](https://sites.google.com/nyu.edu/nyu-hpc/hpc-systems/greene/software/singularity-with-miniconda).
:::

## Prepare script
Create a Python script using the code in steps 1-9 below and save it in a file called `huggingface.py`:

1. Import the necessary modules:
   ```python
   import torch
   import numpy as np
   from transformers import AutoTokenizer, AutoModel
   ```

1. Create a list of input texts (here, questions about Medicare):
   ```python
   texts = ["How do I get a replacement Medicare card?",
            "What is the monthly premium for Medicare Part B?",
            "How do I terminate my Medicare Part B (medical insurance)?",
            "How do I sign up for Medicare?",
            "Can I sign up for Medicare Part B if I am working and have health insurance through an employer?",
            "How do I sign up for Medicare Part B if I already have Part A?"]
   ```

1. Choose a model from the Hugging Face model hub and instantiate the tokenizer and model objects for it. We set `output_hidden_states=True` so that the model output includes the hidden states (embeddings) for the input sentences, not just the final output:
   ```python
   model_name = 'cardiffnlp/twitter-roberta-base-sentiment'
   tokenizer = AutoTokenizer.from_pretrained(model_name)
   model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
   ```
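
   As an optional check (not part of the original script), you can inspect the model configuration to see where the embedding size in step 9 comes from:
   ```python
   print(model.config.hidden_size)  # 768 for this RoBERTa base model
   ```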

1. Tokenize the texts. We set `padding=True` so that all sequences in the batch are padded to the same length, and `return_tensors="pt"` so that the tokenizer returns PyTorch tensors:
   ```python
   ids = tokenizer(texts, padding=True, return_tensors="pt")
   ```
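
   The tokenizer returns a dictionary-like object. A quick optional check of its contents (the names and shapes in the comments are what this tokenizer typically returns):
   ```python
   print(ids.keys())              # dict_keys(['input_ids', 'attention_mask'])
   print(ids["input_ids"].shape)  # torch.Size([6, longest sequence in the batch])
   ```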

1. Set the device to `cuda` if a GPU is available, and move the model and the tokenized inputs to that device. Since we are only extracting embeddings, we only need a forward pass, so we put the model in evaluation mode with `eval()`:
   ```python
   device = 'cuda' if torch.cuda.is_available() else 'cpu'
   model.to(device)
   ids = ids.to(device)
   model.eval()
   ```

1. Perform the forward pass and store the output in `out`. We wrap it in `torch.no_grad()` because no gradients are needed for inference:
   ```python
   with torch.no_grad():
       out = model(**ids)
   ```

1. Extract the embeddings of each text from the last layer:
   ```python
   last_hidden_states = out.last_hidden_state
   ```

1. For classification purposes, we extract the CLS token, which is the first embedding in the embedding list for each text:
   ```python
   sentence_embedding = last_hidden_states[:, 0, :]
   ```

1. We can check the shape of the final sentence embeddings for all the texts. The output should be `torch.Size([6, 768])`, where 6 is the batch size (we passed in 6 texts in step 2) and 768 is the embedding size of the RoBERTa model used:
   ```python
   print("Shape of the batch embedding: {}".format(sentence_embedding.shape))
   ```
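
As an optional extension (not part of the script that the job below runs), these sentence embeddings can be compared directly, for example with pairwise cosine similarity. A minimal sketch, assuming the `sentence_embedding` tensor from step 8:
```python
import torch.nn.functional as F

# Normalize each embedding to unit length, then take dot products;
# the result is a [6, 6] matrix of cosine similarities between texts
normalized = F.normalize(sentence_embedding, dim=1)
similarity = normalized @ normalized.T
print(similarity)
```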

## Prepare sbatch file
After saving the above code in a script called `huggingface.py`, create a file called `run.SBATCH` with the following code:

```bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:10:00
#SBATCH --mem=64GB
#SBATCH --gres=gpu:1
#SBATCH --job-name=huggingface
#SBATCH --output=huggingface.out

module purge

# Pass --nv to Singularity only if a GPU is present on the node
if [ -e /dev/nvidia0 ]; then nv="--nv"; fi

# Mount the overlay read-only (:ro) so that several jobs can share it,
# and use the same container image the environment was built in
singularity exec $nv \
    --overlay /scratch/NetID/llm_example/overlay-50G-10M.ext3:ro \
    /scratch/work/public/singularity/cuda12.1.1-cudnn8.9.0-devel-ubuntu22.04.2.sif \
    /bin/bash -c "source /ext3/env.sh; python /scratch/NetID/llm_example/huggingface.py"
```
:::note
You'll need to change `NetID` in the script above to your NetID.
If you're using a different directory name and/or path, you'll also need to update that in the script above.
:::

| 165 | +## Run the run.SBATCH file |
| 166 | +```batch |
| 167 | +[NetID@log-1 llm_example]$ sbatch run.SBATCH |
| 168 | +``` |
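
You can monitor the job while it is queued and running; for example:
```bash
[NetID@log-1 llm_example]$ squeue -u $USER
```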

Once the job finishes, the output can be found in `huggingface.out`. It should look something like:
```
Some weights of RobertaModel were not initialized from the model checkpoint at cardiffnlp/twitter-roberta-base-sentiment and are newly initialized: ['pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Shape of the batch embedding: torch.Size([6, 768])
```

## Acknowledgements
These instructions were developed and provided by [Laiba Mehnaz](https://www.linkedin.com/in/laiba-mehnaz-a81455158/), a member of [AIfSR](https://www.linkedin.com/company/ai-for-scientific-research).