[core] LTX Video #10021
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
elif qk_norm == "rms_norm_across_heads":
    # LTX applies qk norm across all heads
    self.norm_q = RMSNorm(dim_head * heads, eps=eps)
    self.norm_k = RMSNorm(dim_head * kv_heads, eps=eps)
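For context on this diff: with the standard `qk_norm="rms_norm"` option, each attention head's `dim_head` features are normalized independently, whereas `rms_norm_across_heads` normalizes the concatenated `dim_head * heads` projection at once. A minimal sketch of the difference (the sizes are illustrative, and the `RMSNorm` import path is assumed from the current diffusers layout):

```python
import torch
from diffusers.models.normalization import RMSNorm  # assumed import path

# Illustrative sizes only; not taken from this PR.
batch, seq_len, heads, dim_head = 1, 16, 8, 64
q = torch.randn(batch, seq_len, heads * dim_head)

# Standard "rms_norm": normalize each head's dim_head features independently.
per_head_norm = RMSNorm(dim_head, eps=1e-6)
q_per_head = per_head_norm(q.view(batch, seq_len, heads, dim_head))

# "rms_norm_across_heads" (LTX): normalize across the full projection width.
across_heads_norm = RMSNorm(dim_head * heads, eps=1e-6)
q_across_heads = across_heads_norm(q)
```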
@DN6 Should I follow your approach with Mochi and create a separate attention class for LTX?
Ok, but we want to be more careful; ideally, we do that as part of a carefully planned-out refactor.
But maybe it would be safe to just inherit from Attention for now? E.g., we wrote code like this with the assumption in mind that we only have one attention class:
https://github.com/huggingface/diffusers/blob/e47cc1fc1a89a5375c322d296cd122fe71ab859f/src/diffusers/pipelines/pag/pag_utils.py#L57C39-L57C48
cc @DN6 here too
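The linked PAG utility relies on `isinstance(module, Attention)` checks, which is why a standalone class could be missed by existing code paths, while a subclass keeps those checks working. A minimal sketch of the "inherit from Attention" option being discussed (the class name and default are hypothetical, for illustration only):

```python
from diffusers.models.attention_processor import Attention


class LTXVideoAttention(Attention):
    """Hypothetical thin subclass: keeps isinstance(module, Attention) checks
    working while defaulting to LTX's across-heads qk normalization."""

    def __init__(self, *args, **kwargs):
        kwargs.setdefault("qk_norm", "rms_norm_across_heads")
        super().__init__(*args, **kwargs)
```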
I guess Attention stays here for now
Thanks for adding!
Thanks for the reviews! Waiting for confirmation of where we should host the weights so I can update the documentation accordingly. Currently they are under my account, but once we move them, we should be good to merge.
I'm wondering if it might be easy to incorporate STG (Spatiotemporal Skip Guidance) into the LTX-Video pipeline. The improvements in video quality look significant, and the examples are impressive. Here are the links: STG Project, STGuidance GitHub, ComfyUI-LTXTricks. It looks like it can also be applied to Mochi, SVD, and Open-Sora. Could be that missing ingredient... It adds the parameters stg_mode, stg_scale, stg_block_idx, do_rescaling, and rescaling_scale.
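For illustration only, the proposal above would amount to extra arguments on the pipeline call, roughly like the hypothetical sketch below. None of these parameters exist in the diffusers pipeline, and as the reply that follows notes, guidance is planned to live in a separate component instead:

```python
# Hypothetical sketch of the proposed STG parameters; not a real diffusers API.
# All values are placeholders.
video = pipe(
    prompt="A serene lake at sunrise, gentle ripples on the water",
    stg_mode="stg-a",      # proposed: which STG variant to apply
    stg_scale=1.0,         # proposed: strength of the spatiotemporal guidance
    stg_block_idx=[19],    # proposed: transformer block(s) to perturb
    do_rescaling=True,     # proposed: rescale the guided prediction
    rescaling_scale=0.7,   # proposed: rescaling factor
).frames[0]
```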
Hey, thanks for the suggestion! We do plan to incorporate STG and other guidance methods by isolating them out into a separate component. I do not like the idea of adding more parameters to the pipeline.
* from original file mixin for ltx
* undo config mapping fn changes
* update
One request about naming: LTX -> LTXV.
Testing, I get this error:
@tin2tin I think it should be
@a-r-r-o-w I just used the test code in the first post, which doesn't specify that name. So I guess something internally is calling a wrongly named operator?
Just to confirm, you have installed diffusers from the main branch, yes?
Also, the config files on the LTX repo were updated recently. Could you ensure you're using the latest commit of those configs so that the right model names are being pointed to?
Yes, I'm on the latest main branch. I downloaded the LTX repo yesterday, and it seems the name change was 4 days ago. Checking dependencies...
@a-r-r-o-w minor typo here,
* transformer
* make style & make fix-copies
* transformer
* add transformer tests
* 80% vae
* make style
* make fix-copies
* fix
* undo cogvideox changes
* update
* update
* match vae
* add docs
* t2v pipeline working; scheduler needs to be checked
* docs
* add pipeline test
* update
* update
* make fix-copies
* Apply suggestions from code review (Co-authored-by: Steven Liu <[email protected]>)
* update
* copy t2v to i2v pipeline
* update
* apply review suggestions
* update
* make style
* remove framewise encoding/decoding
* pack/unpack latents
* image2video
* update
* make fix-copies
* update
* update
* rope scale fix
* debug layerwise code
* remove debug
* Apply suggestions from code review (Co-authored-by: YiYi Xu <[email protected]>)
* propagate precision changes to i2v pipeline
* remove downcast
* address review comments
* fix comment
* address review comments
* [Single File] LTX support for loading original weights (#10135)
* from original file mixin for ltx
* undo config mapping fn changes
* update
* add single file to pipelines
* update docs
* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py
* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py
* rename classes based on ltx review
* point to original repository for inference
* make style
* resolve conflicts correctly

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: YiYi Xu <[email protected]>
T2V:
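A minimal text-to-video sketch for exercising this PR, assuming the merged `LTXPipeline` class and weights hosted at `Lightricks/LTX-Video`; the prompt and settings are placeholders:

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Placeholder prompt and settings; adjust resolution and frame count as needed.
video = pipe(
    prompt="A woman with long brown hair walks through a sunlit forest",
    negative_prompt="worst quality, blurry, jittery, distorted",
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "ltx_t2v.mp4", fps=24)
```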
I2V:
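And an image-to-video counterpart, again a hedged sketch assuming the merged `LTXImageToVideoPipeline` class; the conditioning image path is a placeholder:

```python
import torch
from diffusers import LTXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

pipe = LTXImageToVideoPipeline.from_pretrained(
    "Lightricks/LTX-Video", torch_dtype=torch.bfloat16
)
pipe.to("cuda")

image = load_image("path/to/first_frame.png")  # placeholder path
video = pipe(
    image=image,
    prompt="The scene slowly comes to life as the camera pans to the right",
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "ltx_i2v.mp4", fps=24)
```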