Commit c10bdd9
Add LTX 2.0 Video Pipelines (#12915)
* Initial LTX 2.0 transformer implementation
* Add tests for LTX 2 transformer model
* Get LTX 2 transformer tests working
* Rename LTX 2 compile test class to have LTX2
* Remove RoPE debug print statements
* Get LTX 2 transformer compile tests passing
* Fix LTX 2 transformer shape errors
* Initial script to convert LTX 2 transformer to diffusers
* Add more LTX 2 transformer audio arguments
* Allow LTX 2 transformer to be loaded from local path for conversion
* Improve dummy inputs and add test for LTX 2 transformer consistency
* Fix LTX 2 transformer bugs so consistency test passes
* Initial implementation of LTX 2.0 video VAE
* Explicitly specify temporal and spatial VAE scale factors when converting
* Add initial LTX 2.0 video VAE tests
* Add initial LTX 2.0 video VAE tests (part 2)
* Get diffusers implementation on par with official LTX 2.0 video VAE implementation
* Initial LTX 2.0 vocoder implementation
* Use RMSNorm implementation closer to original for LTX 2.0 video VAE
* start audio decoder.
* init registration.
* up
* simplify and clean up
* up
* Initial LTX 2.0 text encoder implementation
* Rough initial LTX 2.0 pipeline implementation
* up
* up
* up
* up
* Add imports for LTX 2.0 Audio VAE
* Conversion script for LTX 2.0 Audio VAE Decoder
* Add Audio VAE logic to T2V pipeline
* Duplicate scheduler for audio latents
* Support num_videos_per_prompt for prompt embeddings
* LTX 2.0 scheduler and full pipeline conversion
* Add script to test full LTX2Pipeline T2V inference
* Fix pipeline return bugs
* Add LTX 2 text encoder and vocoder to ltx2 subdirectory __init__
* Fix more bugs in LTX2Pipeline.__call__
* Improve CPU offload support
* Fix pipeline audio VAE decoding dtype bug
* Fix video shape error in full pipeline test script
* Get LTX 2 T2V pipeline to produce reasonable outputs
* Make LTX 2.0 scheduler more consistent with original code
* Fix typo when applying scheduler fix in T2V inference script
* Refactor Audio VAE to be simpler and remove helpers (#7)
* remove resolve causality axes stuff.
* remove a bunch of helpers.
* remove adjust output shape helper.
* remove the use of audiolatentshape.
* move normalization and patchify out of pipeline.
* fix
* up
* up
* Remove unpatchify and patchify ops before audio latents denormalization (#9)
---------
Co-authored-by: dg845 <[email protected]>
* Add support for I2V (#8)
* start i2v.
* up
* up
* up
* up
* up
* remove uniform strategy code.
* remove unneeded code.
* Denormalize audio latents in I2V pipeline (analogous to T2V change) (#11)
* test i2v.
* Move Video and Audio Text Encoder Connectors to Transformer (#12)
* Denormalize audio latents in I2V pipeline (analogous to T2V change)
* Initial refactor to put video and audio text encoder connectors in transformer
* Get LTX 2 transformer tests working after connector refactor
* precompute run_connectors,.
* fixes
* Address review comments
* Calculate RoPE double precisions freqs using torch instead of np
* Further simplify LTX 2 RoPE freq calc
* Make connectors a separate module (#18)
* remove text_encoder.py
* address yiyi's comments.
* up
* up
* up
* up
---------
Co-authored-by: sayakpaul <[email protected]>
* up (#19)
* address initial feedback from lightricks team (#16)
* cross_attn_timestep_scale_multiplier to 1000
* implement split rope type.
* up
* propagate rope_type to rope embed classes as well.
* up
* When using split RoPE, make sure that the output dtype is same as input dtype
* Fix apply split RoPE shape error when reshaping x to 4D
* Add export_utils file for exporting LTX 2.0 videos with audio
* Tests for T2V and I2V (#6)
* add ltx2 pipeline tests.
* up
* up
* up
* up
* remove content
* style
* Denormalize audio latents in I2V pipeline (analogous to T2V change)
* Initial refactor to put video and audio text encoder connectors in transformer
* Get LTX 2 transformer tests working after connector refactor
* up
* up
* i2v tests.
* up
* Address review comments
* Calculate RoPE double precisions freqs using torch instead of np
* Further simplify LTX 2 RoPE freq calc
* revert unneded changes.
* up
* up
* update to split style rope.
* up
---------
Co-authored-by: Daniel Gu <[email protected]>
* up
* use export util funcs.
* Point original checkpoint to LTX 2.0 official checkpoint
* Allow the I2V pipeline to accept image URLs
* make style and make quality
* remove function map.
* remove args.
* update docs.
* update doc entries.
* disable ltx2_consistency test
* Simplify LTX 2 RoPE forward by removing coords is None logic
* make style and make quality
* Support LTX 2.0 audio VAE encoder
* Apply suggestions from code review
Co-authored-by: Sayak Paul <[email protected]>
* Remove print statement in audio VAE
* up
* Fix bug when calculating audio RoPE coords
* Ltx 2 latent upsample pipeline (#12922)
* Initial implementation of LTX 2.0 latent upsampling pipeline
* Add new LTX 2.0 spatial latent upsampler logic
* Add test script for LTX 2.0 latent upsampling
* Add option to enable VAE tiling in upsampling test script
* Get latent upsampler working with video latents
* Fix typo in BlurDownsample
* Add latent upsample pipeline docstring and example
* Remove deprecated pipeline VAE slicing/tiling methods
* make style and make quality
* When returning latents, return unpacked and denormalized latents for T2V and I2V
* Add model_cpu_offload_seq for latent upsampling pipeline
---------
Co-authored-by: Daniel Gu <[email protected]>
* Fix latent upsampler filename in LTX 2 conversion script
* Add latent upsample pipeline to LTX 2 docs
* Add dummy objects for LTX 2 latent upsample pipeline
* Set default FPS to official LTX 2 ckpt default of 24.0
* Set default CFG scale to official LTX 2 ckpt default of 4.0
* Update LTX 2 pipeline example docstrings
* make style and make quality
* Remove LTX 2 test scripts
* Fix LTX 2 upsample pipeline example docstring
* Add logic to convert and save a LTX 2 upsampling pipeline
* Document LTX2VideoTransformer3DModel forward pass
---------
Co-authored-by: sayakpaul <[email protected]>1 parent dab000e commit c10bdd9
File tree
33 files changed
+9513
-0
lines changed- docs/source/en
- api
- models
- pipelines
- scripts
- src/diffusers
- models
- autoencoders
- transformers
- pipelines
- ltx2
- utils
- tests
- models
- autoencoders
- transformers
- pipelines/ltx2
33 files changed
+9513
-0
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
367 | 367 | | |
368 | 368 | | |
369 | 369 | | |
| 370 | + | |
| 371 | + | |
370 | 372 | | |
371 | 373 | | |
372 | 374 | | |
| |||
443 | 445 | | |
444 | 446 | | |
445 | 447 | | |
| 448 | + | |
| 449 | + | |
| 450 | + | |
| 451 | + | |
446 | 452 | | |
447 | 453 | | |
448 | 454 | | |
| |||
678 | 684 | | |
679 | 685 | | |
680 | 686 | | |
| 687 | + | |
| 688 | + | |
681 | 689 | | |
682 | 690 | | |
683 | 691 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
0 commit comments