Commit e79fdd7

DOC: Tip on how to merge with DeepSpeed ZeRO-3 (#2446)
--------- Co-authored-by: Kashif Rasul <[email protected]>
1 parent 5b60154 commit e79fdd7

1 file changed: +18 -0 lines changed

docs/source/accelerate/deepspeed.md

Lines changed: 18 additions & 0 deletions
@@ -438,3 +438,21 @@ dataset['train'][label_column][:10]=['no complaint', 'no complaint', 'complaint'
1. Merging when using PEFT and DeepSpeed is currently unsupported and will raise an error.
2. When using CPU offloading, the major gains from using PEFT (shrinking the optimizer states and gradients to those of the adapter weights) are realized in CPU RAM, and there are no savings in GPU memory.
3. DeepSpeed Stage 3 and QLoRA, when used with CPU offloading, lead to more GPU memory usage than when CPU offloading is disabled.

<Tip>

💡 When you have code that requires merging (and unmerging) of weights, try to manually gather the parameters with DeepSpeed ZeRO-3 beforehand:

```python
import deepspeed

is_ds_zero_3 = ...  # check if ZeRO-3

# Gather the sharded parameters so that the full weights are available for merging
with deepspeed.zero.GatheredParameters(list(model.parameters()), enabled=is_ds_zero_3):
    model.merge_adapter()
    # do whatever is needed, then unmerge in the same context if unmerging is required
    ...
    model.unmerge_adapter()
```

</Tip>
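
The tip above leaves the `is_ds_zero_3` check open. As one possible sketch, assuming the DeepSpeed configuration is available as a plain dictionary (for example, loaded from the JSON config file), the stage could be read from the `zero_optimization.stage` key. The helper name `is_zero3_config` is hypothetical and not part of PEFT or DeepSpeed:

```python
from typing import Any, Mapping


def is_zero3_config(ds_config: Mapping[str, Any]) -> bool:
    """Hypothetical helper: True if a DeepSpeed config dict selects ZeRO stage 3."""
    # "zero_optimization" may be absent when ZeRO is not configured at all
    zero_cfg = ds_config.get("zero_optimization") or {}
    return zero_cfg.get("stage") == 3


# Example config fragments (illustrative values only)
zero3 = {"zero_optimization": {"stage": 3}}
zero2 = {"zero_optimization": {"stage": 2}}

print(is_zero3_config(zero3), is_zero3_config(zero2), is_zero3_config({}))
```

The result could then be passed as `enabled=` to `deepspeed.zero.GatheredParameters`. If the stage is set through a launcher or a framework integration rather than a config dictionary, that framework's own check may be the better source.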
