Replies: 1 comment
-
The optimizer step and gradient zero-ing are handled internally by the DeepSpeed engine, I think (not 100% sure), so it is okay to skip them. I believe that even if they were executed, these calls would be no-ops. To reduce confusion we could still call them, but I don't have the bandwidth to test this on a complete run. Would you be able to verify? cc @yiyixuxu
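A quick way to check this claim in a DeepSpeed-enabled run is to inspect the optimizer that comes back from `accelerator.prepare()`. A minimal sketch, assuming `optimizer` has already been prepared (this is not code from the script itself):

```python
from accelerate.utils import DeepSpeedOptimizerWrapper

# When a DeepSpeed plugin is active, accelerator.prepare() wraps the optimizer.
# As I understand it, the wrapper's step() and zero_grad() are implemented as
# no-ops, because engine.step() inside accelerator.backward() already applies
# the parameter update and clears the gradients.
if isinstance(optimizer, DeepSpeedOptimizerWrapper):
    print("optimizer.step()/zero_grad() are managed by the DeepSpeed engine")
```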
-
At line 1448 of this script, when DeepSpeed is enabled, optimizer.step() and optimizer.zero_grad() are no longer executed. I checked the related Accelerate documentation and found no mention of this behavior. Is this correct?
diffusers/examples/cogvideo/train_cogvideox_lora.py
Lines 1448 to 1450 in d9029f2
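The embedded snippet does not render here. Based on the question text, those lines guard the optimizer calls on whether DeepSpeed is active, roughly like the sketch below (a paraphrase using standard Accelerate names such as `accelerator`, `loss`, `optimizer`, and `lr_scheduler`; the exact condition in commit d9029f2 may differ):

```python
from accelerate.utils import DistributedType

# Under DeepSpeed, accelerator.backward(loss) already drives engine.backward()
# and engine.step(), which applies the update and clears gradients internally.
accelerator.backward(loss)

if accelerator.distributed_type != DistributedType.DEEPSPEED:
    # Only step/zero the optimizer manually when DeepSpeed is not managing it.
    optimizer.step()
    optimizer.zero_grad()

lr_scheduler.step()
```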