Available models:

| Model name | Recommended dtype |
|:-------------:|:-----------------:|
|[`LTX Video 2B 0.9.0`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.safetensors)|`torch.bfloat16`|
|[`LTX Video 2B 0.9.1`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.1.safetensors)|`torch.bfloat16`|
|[`LTX Video 2B 0.9.5`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.5.safetensors)|`torch.bfloat16`|
|[`LTX Video 13B 0.9.7`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltxv-13b-0.9.7-dev.safetensors)|`torch.bfloat16`|
|[`LTX Video Spatial Upscaler 0.9.7`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltxv-spatial-upscaler-0.9.7.safetensors)|`torch.bfloat16`|

Note: The recommended dtype applies to the transformer component. The VAE and text encoders can be `torch.float32`, `torch.bfloat16`, or `torch.float16`, but the recommended dtype is `torch.bfloat16`, as used in the original repository.

## Using LTX Video 13B 0.9.7

LTX Video 0.9.7 comes with a spatial latent upscaler and a 13B-parameter transformer. Inference involves first generating a low-resolution video, which is very fast, and then upscaling and refining the generated video.
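
This two-stage flow amounts to picking a smaller first-pass resolution and letting the spatial upscaler double it. The following is a minimal sketch of that resolution arithmetic; the `round_down_to_multiple` helper, the 2/3 downscale factor, the divisible-by-32 constraint, and the 2x upsampling factor are assumptions modeled on the Lightricks reference code, not APIs defined in this document:

```python
# Sketch (hypothetical helper): compute a smaller first-pass resolution whose
# sides stay divisible by 32, then double it for the upscaled refinement pass.
def round_down_to_multiple(x, multiple=32):
    return x - (x % multiple)

expected_width, expected_height = 1152, 768  # final target resolution
downscale_factor = 2 / 3                     # generate the first pass smaller

downscaled_width = round_down_to_multiple(int(expected_width * downscale_factor))
downscaled_height = round_down_to_multiple(int(expected_height * downscale_factor))

# The spatial latent upsampler doubles the resolution.
upscaled_width, upscaled_height = downscaled_width * 2, downscaled_height * 2

print(downscaled_width, downscaled_height)  # 768 512
print(upscaled_width, upscaled_height)      # 1536 1024
```

Under these assumptions, the `downscaled_*` values would drive the fast first generation pass, and the `upscaled_*` values the refinement pass shown in the example.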

```python
import torch
from diffusers import LTXConditionPipeline, LTXLatentUpsamplePipeline
from diffusers.pipelines.ltx.pipeline_ltx_condition import LTXVideoCondition
from diffusers.utils import export_to_video, load_video

prompt = "The video depicts a winding mountain road covered in snow, with a single vehicle traveling along it. The road is flanked by steep, rocky cliffs and sparse vegetation. The landscape is characterized by rugged terrain and a river visible in the distance. The scene captures the solitude and beauty of a winter drive through a mountainous region."

# ... (Parts 1 and 2, which generate the low-resolution video and upscale
# its latents, are not shown in this excerpt) ...

# Part 3. Denoise the upscaled video with few steps to improve texture (optional, but recommended)
# No extra conditioning is passed, so this is effectively a low-step refinement of the upscaled video
video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=upscaled_width,
    height=upscaled_height,
    num_frames=num_frames,
    denoise_strength=0.4,  # Effectively, 4 inference steps out of 10
    num_inference_steps=10,
    latents=upscaled_latents,
    decode_timestep=0.05,
    image_cond_noise_scale=0.025,
    generator=torch.Generator().manual_seed(0),
    output_type="pil",
).frames[0]

# Part 4. Downscale the video to the expected resolution
video = [frame.resize((expected_width, expected_height)) for frame in video]

export_to_video(video, "output.mp4", fps=24)
```
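
The `denoise_strength=0.4` setting above determines how many of the scheduled steps actually run during refinement. A quick sketch of that arithmetic, assuming the usual proportional strength-to-steps mapping used by img2img-style pipelines:

```python
num_inference_steps = 10
denoise_strength = 0.4

# A strength-style parameter runs only the tail of the noise schedule:
refinement_steps = int(num_inference_steps * denoise_strength)
skipped_steps = num_inference_steps - refinement_steps

print(refinement_steps, skipped_steps)  # 4 6
```

That is why the comment in the example describes 0.4 as "4 inference steps out of 10": the first 6 steps of the schedule are skipped and only the final 4 denoise the upscaled latents.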

## Loading Single Files

Loading the original LTX Video checkpoints is also possible with [`~ModelMixin.from_single_file`]. We recommend using `from_single_file` for the Lightricks series of models, as they plan to release multiple future models in the single-file format.
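
For example, a sketch of loading just the transformer from one of the single-file checkpoints listed in the table above (the class/checkpoint pairing here is an assumption; match the model class to the checkpoint you actually download):

```python
import torch
from diffusers import LTXVideoTransformer3DModel

# Load the 2B transformer weights directly from the single-file checkpoint;
# the resulting module can then be passed to LTXPipeline.from_pretrained(...)
# via its `transformer` argument.
transformer = LTXVideoTransformer3DModel.from_single_file(
    "https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.safetensors",
    torch_dtype=torch.bfloat16,
)
```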