Nowadays, the number of parameters in video generation models keeps growing, and so does the video length. When training video models, it is difficult to fit a complete video sequence (~200k tokens) on a single GPU. Sequence parallel training can solve this problem, and some frameworks such as fastvideo already support it, but that framework's rough edges make it hard to use. Could the diffusers framework support sequence parallel training?
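
For context, here is a minimal sketch of what sequence parallelism for attention could look like (a DeepSpeed-Ulysses-style all-to-all scheme built on plain `torch.distributed`; the helpers `all_to_all_4d` and `seq_parallel_attention` are hypothetical illustrations, not existing diffusers APIs):

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F


def all_to_all_4d(x: torch.Tensor, scatter_dim: int, gather_dim: int, group=None) -> torch.Tensor:
    """Redistribute a tensor across ranks: split along scatter_dim, exchange, concat along gather_dim.
    Assumes the scattered dimension is divisible by the world size."""
    world_size = dist.get_world_size(group)
    inputs = [t.contiguous() for t in x.chunk(world_size, dim=scatter_dim)]
    outputs = [torch.empty_like(inputs[0]) for _ in range(world_size)]
    dist.all_to_all(outputs, inputs, group=group)
    return torch.cat(outputs, dim=gather_dim)


def seq_parallel_attention(q, k, v, group=None):
    """q, k, v: [batch, local_seq, heads, head_dim], with the sequence sharded across ranks."""
    # Gather the full sequence, scatter heads: -> [batch, full_seq, heads / P, head_dim]
    q, k, v = (all_to_all_4d(t, scatter_dim=2, gather_dim=1, group=group) for t in (q, k, v))
    # Each rank now attends over the full sequence for its subset of heads.
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)
    ).transpose(1, 2)
    # Re-shard the sequence, gather heads back: -> [batch, local_seq, heads, head_dim]
    return all_to_all_4d(out, scatter_dim=1, gather_dim=2, group=group)
```

With something like this wired into the attention processors, each GPU would only ever hold `seq_len / world_size` tokens of activations outside the attention call, which is what makes long video sequences trainable.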