Replies: 1 comment
-
The optimizer step and gradient zero-ing are handled internally by the DeepSpeed engine, I think (not 100% sure), so it is okay to skip them. I believe that even if they were executed, these calls would be no-ops. To reduce confusion we could still call them, but I don't have the bandwidth to test this on a complete run. Would you be able to verify? cc @yiyixuxu
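A quick way to check this claim in a DeepSpeed-enabled run is to inspect the optimizer that comes back from `accelerator.prepare()`. A minimal sketch, assuming `optimizer` has already been prepared (this is not code from the script itself):

```python
from accelerate.utils import DeepSpeedOptimizerWrapper

# When a DeepSpeed plugin is active, accelerator.prepare() wraps the optimizer.
# As I understand it, the wrapper's step() and zero_grad() are implemented as
# no-ops, because engine.step() inside accelerator.backward() already applies
# the parameter update and clears the gradients.
if isinstance(optimizer, DeepSpeedOptimizerWrapper):
    print("optimizer.step()/zero_grad() are managed by the DeepSpeed engine")
```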
-
At line 1448 of this script, when DeepSpeed is enabled, optimizer.step() and optimizer.zero_grad() are no longer executed. I checked the related Accelerate documentation and found no mention of this behavior. Is this correct?
diffusers/examples/cogvideo/train_cogvideox_lora.py
Lines 1448 to 1450 in d9029f2
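The embedded snippet does not render here. Based on the question text, those lines guard the optimizer calls on whether DeepSpeed is active, roughly like the sketch below (a paraphrase using standard Accelerate names such as `accelerator`, `loss`, `optimizer`, and `lr_scheduler`; the exact condition in commit d9029f2 may differ):

```python
from accelerate.utils import DistributedType

# Under DeepSpeed, accelerator.backward(loss) already drives engine.backward()
# and engine.step(), which applies the update and clears gradients internally.
accelerator.backward(loss)

if accelerator.distributed_type != DistributedType.DEEPSPEED:
    # Only step/zero the optimizer manually when DeepSpeed is not managing it.
    optimizer.step()
    optimizer.zero_grad()

lr_scheduler.step()
```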