
Commit 11fd809

Merge branch 'main' into more-flux-lora-tests

2 parents e03084a + ec9bfa9


45 files changed: +7590 −379 lines

.github/workflows/nightly_tests.yml
Lines changed: 4 additions & 3 deletions

```diff
@@ -238,12 +238,13 @@ jobs:
 
   run_flax_tpu_tests:
     name: Nightly Flax TPU Tests
-    runs-on: docker-tpu
+    runs-on:
+      group: gcp-ct5lp-hightpu-8t
     if: github.event_name == 'schedule'
 
     container:
       image: diffusers/diffusers-flax-tpu
-      options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --privileged
+      options: --shm-size "16gb" --ipc host --privileged ${{ vars.V5_LITEPOD_8_ENV}} -v /mnt/hf_cache:/mnt/hf_cache
     defaults:
       run:
         shell: bash
@@ -519,4 +520,4 @@ jobs:
   #   if: always()
   #   run: |
   #     pip install slack_sdk tabulate
-  #     python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
+  #     python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
```

.github/workflows/push_tests.yml
Lines changed: 3 additions & 3 deletions

```diff
@@ -161,11 +161,11 @@ jobs:
 
   flax_tpu_tests:
     name: Flax TPU Tests
-    runs-on: docker-tpu
+    runs-on:
+      group: gcp-ct5lp-hightpu-8t
     container:
       image: diffusers/diffusers-flax-tpu
-      options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ --privileged
-    defaults:
+      options: --shm-size "16gb" --ipc host --privileged ${{ vars.V5_LITEPOD_8_ENV}} -v /mnt/hf_cache:/mnt/hf_cache defaults:
       run:
         shell: bash
     steps:
```

docs/source/en/_toctree.yml
Lines changed: 6 additions & 0 deletions

```diff
@@ -274,6 +274,8 @@
       title: LatteTransformer3DModel
     - local: api/models/lumina_nextdit2d
       title: LuminaNextDiT2DModel
+    - local: api/models/ltx_video_transformer3d
+      title: LTXVideoTransformer3DModel
     - local: api/models/mochi_transformer3d
       title: MochiTransformer3DModel
     - local: api/models/pixart_transformer2d
@@ -312,6 +314,8 @@
       title: AutoencoderKLAllegro
     - local: api/models/autoencoderkl_cogvideox
       title: AutoencoderKLCogVideoX
+    - local: api/models/autoencoderkl_ltx_video
+      title: AutoencoderKLLTXVideo
     - local: api/models/autoencoderkl_mochi
       title: AutoencoderKLMochi
     - local: api/models/asymmetricautoencoderkl
@@ -408,6 +412,8 @@
       title: Latte
     - local: api/pipelines/ledits_pp
       title: LEDITS++
+    - local: api/pipelines/ltx_video
+      title: LTX
     - local: api/pipelines/lumina
       title: Lumina-T2X
     - local: api/pipelines/marigold
```
docs/source/en/api/models/autoencoderkl_ltx_video.md (new file)
Lines changed: 37 additions & 0 deletions
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# AutoencoderKLLTXVideo

The 3D variational autoencoder (VAE) model with KL loss used in [LTX](https://huggingface.co/Lightricks/LTX-Video) was introduced by Lightricks.

The model can be loaded with the following code snippet.

```python
import torch
from diffusers import AutoencoderKLLTXVideo

vae = AutoencoderKLLTXVideo.from_pretrained("TODO/TODO", subfolder="vae", torch_dtype=torch.float32).to("cuda")
```
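Since the class exposes `encode` and `decode` (documented below), a minimal round-trip sketch could look like the following. This is an illustration, not part of the committed docs: the repo id is still the TODO placeholder from above, and the input shape assumes the usual (batch, channels, frames, height, width) video layout with dimensions compatible with the VAE's spatial and temporal compression.

```python
import torch
from diffusers import AutoencoderKLLTXVideo

# Placeholder repo id, as in the snippet above.
vae = AutoencoderKLLTXVideo.from_pretrained("TODO/TODO", subfolder="vae", torch_dtype=torch.float32).to("cuda")

# Dummy video batch: (batch, channels, frames, height, width); the shape is an assumption.
video = torch.randn(1, 3, 9, 256, 384, device="cuda")

with torch.no_grad():
    # encode() returns an AutoencoderKLOutput whose latent_dist can be sampled.
    latents = vae.encode(video).latent_dist.sample()
    # decode() returns a DecoderOutput; .sample holds the reconstructed video.
    reconstruction = vae.decode(latents).sample
```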
## AutoencoderKLLTXVideo

[[autodoc]] AutoencoderKLLTXVideo
  - decode
  - encode
  - all

## AutoencoderKLOutput

[[autodoc]] models.autoencoders.autoencoder_kl.AutoencoderKLOutput

## DecoderOutput

[[autodoc]] models.autoencoders.vae.DecoderOutput
docs/source/en/api/models/ltx_video_transformer3d.md (new file)
Lines changed: 30 additions & 0 deletions
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# LTXVideoTransformer3DModel

A Diffusion Transformer model for 3D data from [LTX](https://huggingface.co/Lightricks/LTX-Video) was introduced by Lightricks.

The model can be loaded with the following code snippet.

```python
import torch
from diffusers import LTXVideoTransformer3DModel

transformer = LTXVideoTransformer3DModel.from_pretrained("TODO/TODO", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
```
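A transformer loaded standalone like this is typically passed into a pipeline, mirroring the component-override pattern used in the LTX pipeline doc added in this same commit. A hedged sketch (the hub repo id is the one referenced elsewhere in this commit):

```python
import torch
from diffusers import LTXPipeline, LTXVideoTransformer3DModel

# Load the transformer separately, then hand it to the pipeline constructor.
transformer = LTXVideoTransformer3DModel.from_pretrained("Lightricks/LTX-Video", subfolder="transformer", torch_dtype=torch.bfloat16)
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", transformer=transformer, torch_dtype=torch.bfloat16)
```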
## LTXVideoTransformer3DModel

[[autodoc]] LTXVideoTransformer3DModel

## Transformer2DModelOutput

[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
docs/source/en/api/pipelines/ltx_video.md (new file)
Lines changed: 68 additions & 0 deletions
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License. -->

# LTX

[LTX Video](https://huggingface.co/Lightricks/LTX-Video) is the first DiT-based video generation model capable of generating high-quality videos in real time. It produces 24 FPS videos at 768x512 resolution faster than they can be watched. Trained on a large-scale dataset of diverse videos, the model generates high-resolution videos with realistic and varied content. We provide a model for both text-to-video and image + text-to-video use cases.

<Tip>

Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers.md) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading.md#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.

</Tip>
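For orientation before the loading examples below, a basic text-to-video call might look like the following sketch. The prompt and the generation parameters (resolution, frame count, step count) are illustrative choices, not values prescribed by this page:

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16).to("cuda")

# Illustrative prompt and settings; 768x512 at 24 FPS matches the description above.
video = pipe(
    prompt="A woman with long brown hair walks along a beach at sunset",
    width=768,
    height=512,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=24)
```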
## Loading Single Files

Loading the original LTX Video checkpoints is also possible with [`~ModelMixin.from_single_file`].

```python
import torch
from diffusers import AutoencoderKLLTXVideo, LTXImageToVideoPipeline, LTXVideoTransformer3DModel

single_file_url = "https://huggingface.co/Lightricks/LTX-Video/ltx-video-2b-v0.9.safetensors"
transformer = LTXVideoTransformer3DModel.from_single_file(single_file_url, torch_dtype=torch.bfloat16)
vae = AutoencoderKLLTXVideo.from_single_file(single_file_url, torch_dtype=torch.bfloat16)
pipe = LTXImageToVideoPipeline.from_pretrained("Lightricks/LTX-Video", transformer=transformer, vae=vae, torch_dtype=torch.bfloat16)

# ... inference code ...
```
40+
Alternatively, the pipeline can be used to load the weights with [`~FromSingleFileMixin.from_single_file`].

```python
import torch
from diffusers import LTXImageToVideoPipeline
from transformers import T5EncoderModel, T5Tokenizer

single_file_url = "https://huggingface.co/Lightricks/LTX-Video/ltx-video-2b-v0.9.safetensors"
text_encoder = T5EncoderModel.from_pretrained("Lightricks/LTX-Video", subfolder="text_encoder", torch_dtype=torch.bfloat16)
tokenizer = T5Tokenizer.from_pretrained("Lightricks/LTX-Video", subfolder="tokenizer", torch_dtype=torch.bfloat16)
pipe = LTXImageToVideoPipeline.from_single_file(single_file_url, text_encoder=text_encoder, tokenizer=tokenizer, torch_dtype=torch.bfloat16)
```
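Both single-file snippets stop at building `pipe`. A hedged sketch of the image + text-to-video call itself, continuing from either one, could look like this (the conditioning image URL, prompt, and settings are illustrative):

```python
from diffusers.utils import export_to_video, load_image

# Continuing from the `pipe` built above; the image URL is a placeholder.
image = load_image("https://example.com/first_frame.png")
video = pipe(
    image=image,
    prompt="A penguin waddles across an icy shore at dawn",
    width=768,
    height=512,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=24)
```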
53+
## LTXPipeline

[[autodoc]] LTXPipeline
  - all
  - __call__

## LTXImageToVideoPipeline

[[autodoc]] LTXImageToVideoPipeline
  - all
  - __call__

## LTXPipelineOutput

[[autodoc]] pipelines.ltx.pipeline_output.LTXPipelineOutput
