Commit 98479a9

yaoqih and yiyixuxu authored
LTX Video 0.9.8 long multi prompt (#12614)
* LTX Video 0.9.8 long multi prompt
* Further align comfyui
  - Added the “LTXEulerAncestralRFScheduler” scheduler, aligned with [sample_euler_ancestral_RF](https://github.com/comfyanonymous/ComfyUI/blob/7d6103325e1c97aa54f963253e3e7f1d6da6947f/comfy/k_diffusion/sampling.py#L234)
  - Updated the LTXI2VLongMultiPromptPipeline.from_pretrained() method:
    - Now uses LTXEulerAncestralRFScheduler by default, for better compatibility with the ComfyUI LTXV workflow.
  - Changed the default value of cond_strength from 1.0 to 0.5, aligning with ComfyUI’s default.
  - Optimized cross-window overlap blending: moved the latent-space guidance injection to before the UNet and after each step, aligned with [KSamplerX0Inpaint](https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/samplers.py#L391)
  - Adjusted the default value of skip_steps_sigma_threshold to 1.
* align with diffusers contribute rule
* Add new pipelines and update imports
* Enhance LTXI2VLongMultiPromptPipeline with noise rescaling
  Refactor LTXI2VLongMultiPromptPipeline to improve documentation and add noise rescaling functionality.
* Clean up comments in scheduling_ltx_euler_ancestral_rf.py
  Removed design notes and limitations from the implementation.
* Enhance video generation example with scheduler
  Updated LTXI2VLongMultiPromptPipeline example to include LTXEulerAncestralRFScheduler for ComfyUI parity.
* clean up
* style
* copies
* import ltx scheduler
* copies
* fix
* fix more
* up up
* up up up
* up upup
* Apply suggestions from code review
* Update docs/source/en/api/pipelines/ltx_video.md
* Update docs/source/en/api/pipelines/ltx_video.md

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
1 parent ade1059 commit 98479a9
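
The commit message says the new scheduler is aligned with ComfyUI's `sample_euler_ancestral_RF`. The scheduler file itself is not shown on this page, so the following is only a rough NumPy sketch of what a single ancestral Euler step for a rectified-flow model can look like. The function name `euler_ancestral_rf_step`, the parameters `eta` and `s_noise`, and the assumed interpolation `x = (1 - sigma) * x0 + sigma * noise` are illustrative assumptions, not the API added by this commit.

```python
import numpy as np


def euler_ancestral_rf_step(x, denoised, sigma, sigma_next, noise, eta=1.0, s_noise=1.0):
    """One ancestral Euler step for a rectified-flow model (illustrative sketch).

    x          : current noisy sample at noise level `sigma`
    denoised   : the model's estimate of the clean sample
    noise      : standard normal noise with the same shape as `x`
    eta        : 0 gives a deterministic Euler step, 1 gives full ancestral noise
    """
    if sigma_next == 0:
        # Final step: return the clean estimate directly.
        return denoised

    # Overshoot to a noise level below sigma_next (equal to it when eta == 0).
    downstep_ratio = 1.0 + (sigma_next / sigma - 1.0) * eta
    sigma_down = sigma_next * downstep_ratio

    # Deterministic Euler move from sigma to sigma_down.
    ratio = sigma_down / sigma
    x = ratio * x + (1.0 - ratio) * denoised

    if eta > 0:
        # Rescale the signal part and add fresh noise so the total
        # noise level ends up at sigma_next.
        alpha_next = 1.0 - sigma_next
        alpha_down = 1.0 - sigma_down
        renoise = np.sqrt(sigma_next**2 - sigma_down**2 * alpha_next**2 / alpha_down**2)
        x = (alpha_next / alpha_down) * x + noise * s_noise * renoise
    return x
```

With `eta=0` the step reduces to a plain Euler update; with `eta>0` part of the noise is removed and then re-injected, which is what makes the sampler "ancestral".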

9 files changed, +1848 -3 lines changed

docs/source/en/api/pipelines/ltx_video.md

Lines changed: 8 additions & 2 deletions
@@ -136,7 +136,7 @@ export_to_video(video, "output.mp4", fps=24)
 - The recommended dtype for the transformer, VAE, and text encoder is `torch.bfloat16`. The VAE and text encoder can also be `torch.float32` or `torch.float16`.
 - For guidance-distilled variants of LTX-Video, set `guidance_scale` to `1.0`. The `guidance_scale` for any other model should be set higher, like `5.0`, for good generation quality.
 - For timestep-aware VAE variants (LTX-Video 0.9.1 and above), set `decode_timestep` to `0.05` and `image_cond_noise_scale` to `0.025`.
-- For variants that support interpolation between multiple conditioning images and videos (LTX-Video 0.9.5 and above), use similar images and videos for the best results. Divergence from the conditioning inputs may lead to abrupt transitionts in the generated video.
+- For variants that support interpolation between multiple conditioning images and videos (LTX-Video 0.9.5 and above), use similar images and videos for the best results. Divergence from the conditioning inputs may lead to abrupt transitions in the generated video.
 
 - LTX-Video 0.9.7 includes a spatial latent upscaler and a 13B parameter transformer. During inference, a low resolution video is quickly generated first and then upscaled and refined.

@@ -329,7 +329,7 @@ export_to_video(video, "output.mp4", fps=24)
 
 <details>
 <summary>Show example code</summary>
-
+
 ```python
 import torch
 from diffusers import LTXConditionPipeline, LTXLatentUpsamplePipeline
@@ -474,6 +474,12 @@ export_to_video(video, "output.mp4", fps=24)
 
 </details>
 
+## LTXI2VLongMultiPromptPipeline
+
+[[autodoc]] LTXI2VLongMultiPromptPipeline
+- all
+- __call__
+
 ## LTXPipeline
 
 [[autodoc]] LTXPipeline
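
The usage notes that appear as context in the first hunk of this file give concrete values for `guidance_scale`, `decode_timestep`, and `image_cond_noise_scale`. A minimal sketch of how those recommendations could be collected into call arguments is shown here. The helper `recommended_ltx_call_kwargs` and its two boolean flags are hypothetical and not part of diffusers; only the argument names and values come from the notes, and not every LTX pipeline necessarily accepts all three arguments, so check the signature of the pipeline you call.

```python
def recommended_ltx_call_kwargs(guidance_distilled: bool, timestep_aware_vae: bool) -> dict:
    """Collect the call arguments suggested by the usage notes (hypothetical helper)."""
    kwargs = {
        # Guidance-distilled variants use 1.0; the notes suggest a higher
        # value such as 5.0 for any other model.
        "guidance_scale": 1.0 if guidance_distilled else 5.0,
    }
    if timestep_aware_vae:
        # Values the notes give for LTX-Video 0.9.1 and above.
        kwargs["decode_timestep"] = 0.05
        kwargs["image_cond_noise_scale"] = 0.025
    return kwargs
```

The resulting dictionary could then be unpacked into a pipeline call, for example `pipe(prompt=..., **kwargs)`, provided the chosen pipeline supports those arguments.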

src/diffusers/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -353,6 +353,7 @@
 "KDPM2AncestralDiscreteScheduler",
 "KDPM2DiscreteScheduler",
 "LCMScheduler",
+"LTXEulerAncestralRFScheduler",
 "PNDMScheduler",
 "RePaintScheduler",
 "SASolverScheduler",
@@ -538,6 +539,7 @@
 "LongCatImageEditPipeline",
 "LongCatImagePipeline",
 "LTXConditionPipeline",
+"LTXI2VLongMultiPromptPipeline",
 "LTXImageToVideoPipeline",
 "LTXLatentUpsamplePipeline",
 "LTXPipeline",
@@ -1088,6 +1090,7 @@
 KDPM2AncestralDiscreteScheduler,
 KDPM2DiscreteScheduler,
 LCMScheduler,
+LTXEulerAncestralRFScheduler,
 PNDMScheduler,
 RePaintScheduler,
 SASolverScheduler,
@@ -1252,6 +1255,7 @@
 LongCatImageEditPipeline,
 LongCatImagePipeline,
 LTXConditionPipeline,
+LTXI2VLongMultiPromptPipeline,
 LTXImageToVideoPipeline,
 LTXLatentUpsamplePipeline,
 LTXPipeline,

src/diffusers/pipelines/__init__.py

Lines changed: 8 additions & 1 deletion
@@ -288,6 +288,7 @@
 "LTXImageToVideoPipeline",
 "LTXConditionPipeline",
 "LTXLatentUpsamplePipeline",
+"LTXI2VLongMultiPromptPipeline",
 ]
 _import_structure["lumina"] = ["LuminaPipeline", "LuminaText2ImgPipeline"]
 _import_structure["lumina2"] = ["Lumina2Pipeline", "Lumina2Text2ImgPipeline"]
@@ -729,7 +730,13 @@
 LEditsPPPipelineStableDiffusionXL,
 )
 from .longcat_image import LongCatImageEditPipeline, LongCatImagePipeline
-from .ltx import LTXConditionPipeline, LTXImageToVideoPipeline, LTXLatentUpsamplePipeline, LTXPipeline
+from .ltx import (
+LTXConditionPipeline,
+LTXI2VLongMultiPromptPipeline,
+LTXImageToVideoPipeline,
+LTXLatentUpsamplePipeline,
+LTXPipeline,
+)
 from .lucy import LucyEditPipeline
 from .lumina import LuminaPipeline, LuminaText2ImgPipeline
 from .lumina2 import Lumina2Pipeline, Lumina2Text2ImgPipeline

src/diffusers/pipelines/ltx/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -25,6 +25,7 @@
 _import_structure["modeling_latent_upsampler"] = ["LTXLatentUpsamplerModel"]
 _import_structure["pipeline_ltx"] = ["LTXPipeline"]
 _import_structure["pipeline_ltx_condition"] = ["LTXConditionPipeline"]
+_import_structure["pipeline_ltx_i2v_long_multi_prompt"] = ["LTXI2VLongMultiPromptPipeline"]
 _import_structure["pipeline_ltx_image2video"] = ["LTXImageToVideoPipeline"]
 _import_structure["pipeline_ltx_latent_upsample"] = ["LTXLatentUpsamplePipeline"]
 
@@ -39,6 +40,7 @@
 from .modeling_latent_upsampler import LTXLatentUpsamplerModel
 from .pipeline_ltx import LTXPipeline
 from .pipeline_ltx_condition import LTXConditionPipeline
+from .pipeline_ltx_i2v_long_multi_prompt import LTXI2VLongMultiPromptPipeline
 from .pipeline_ltx_image2video import LTXImageToVideoPipeline
 from .pipeline_ltx_latent_upsample import LTXLatentUpsamplePipeline
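
The hunks above register `LTXI2VLongMultiPromptPipeline` and `LTXEulerAncestralRFScheduler` in the package's import structure. One way to check whether an installed build exposes these names is sketched below. The helper `missing_exports` is a hypothetical utility written for this illustration, not something shipped with diffusers.

```python
import importlib


def missing_exports(module_name, names):
    """Return the entries of `names` that `module_name` does not expose."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module not installed: every requested name is missing.
        return list(names)
    return [name for name in names if not hasattr(module, name)]


# Names added to the top-level package by this commit.
NEW_NAMES = ["LTXI2VLongMultiPromptPipeline", "LTXEulerAncestralRFScheduler"]
```

Calling `missing_exports("diffusers", NEW_NAMES)` on an installation that includes this commit would be expected to return an empty list, while an older installation would list both names.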
