
Commit 37ecdd6

Pbinder/esm2 document (#846)
### Description

Profiling for LoRA additions to ESM2.

Signed-off-by: Polina Binder <pbinder@nvidia.com>
1 parent 192e537 commit 37ecdd6

3 files changed (+16, -0 lines)
docs/docs/models/ESM-2/index.md

Lines changed: 16 additions & 0 deletions
@@ -141,3 +141,19 @@ nodes. <sup>*</sup>*Note:* 15B model variants were trained on 64 GPUs with the B
Training ESM-3B on 256 NVIDIA A100s on 32 nodes achieved 96.85% of the theoretical linear throughput expected from
extrapolating single-node (8 GPU) performance, representing a model flops utilization of 60.6% at 256 devices.

### LoRA Fine-tuning Performance

Fine-tuning ESM-3B and ESM-650M with LoRA improves GPU utilization and training time compared with fine-tuning a full ESM2 model. In models with LoRA, the encoder and embedding layers are replaced with LoRA modules.
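The efficiency gain comes from the structure of a LoRA module: the large base weight is frozen, and only a pair of low-rank factors is trained. The following is a minimal NumPy sketch of that idea, not the BioNeMo/ESM-2 implementation; the class and parameter names are illustrative.

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA linear layer: y = x @ W + (alpha / r) * x @ A @ B.

    The base weight W (d_in x d_out) stays frozen; only the low-rank
    factors A (d_in x r) and B (r x d_out) are trained, shrinking the
    trainable parameter count from d_in * d_out to r * (d_in + d_out).
    """

    def __init__(self, d_in, d_out, r=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(d_in, d_out))      # frozen base weight
        self.A = rng.normal(size=(d_in, r)) * 0.01   # trainable down-projection
        self.B = np.zeros((r, d_out))                # trainable up-projection, zero-initialized
        self.scale = alpha / r

    def __call__(self, x):
        # Base path plus scaled low-rank update.
        return x @ self.W + self.scale * (x @ self.A @ self.B)

# ESM-650M-like hidden size of 1280, illustrative rank of 8.
layer = LoRALinear(d_in=1280, d_out=1280, r=8)
x = np.ones((2, 1280))
y = layer(x)

# Because B starts at zero, the LoRA update is a no-op before training,
# so the adapted layer initially matches the frozen layer exactly.
assert np.allclose(y, x @ layer.W)

full = 1280 * 1280            # trainable params in full fine-tuning
lora = 8 * (1280 + 1280)      # trainable params with LoRA at rank 8
print(f"trainable params: {lora} vs {full}")
```

At rank 8 on a 1280-wide layer this is an 80x reduction in trainable parameters for that layer, which is where the memory and throughput savings reported below come from: optimizer states and gradients are only kept for the small factors.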
#### LoRA GPU Memory Usage

GPU memory usage decreases by a factor of 2.5-4 in a model fine-tuned with LoRA.

![ESM2 Memory Usage](../../assets/images/esm2/esm2_peft_memory_usage.png)
#### LoRA Scaling

The number of tokens processed per second increases by 25-80% relative to full fine-tuning.

![ESM2 Training Throughput](../../assets/images/esm2/esm2_peft_time.png)
