
Conversation

@a-r-r-o-w
Contributor

@a-r-r-o-w a-r-r-o-w commented Nov 26, 2024

T2V:

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained("a-r-r-o-w/LTX-Video-diffusers", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "A woman with long brown hair and light skin smiles at another woman with long blonde hair. The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage"
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=24)
```

I2V:

```python
import torch
from diffusers import LTXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

pipe = LTXImageToVideoPipeline.from_pretrained("a-r-r-o-w/LTX-Video-diffusers", torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = load_image(
    "https://huggingface.co/datasets/a-r-r-o-w/tiny-meme-dataset-captioned/resolve/main/images/8.png"
)
prompt = "A young girl stands calmly in the foreground, looking directly at the camera, as a house fire rages in the background. Flames engulf the structure, with smoke billowing into the air. Firefighters in protective gear rush to the scene, a fire truck labeled '38' visible behind them. The girl's neutral expression contrasts sharply with the chaos of the fire, creating a poignant and emotionally charged scene."
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

video = pipe(
    image=image,
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=24)
```

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@a-r-r-o-w a-r-r-o-w marked this pull request as ready for review November 27, 2024 14:22
@a-r-r-o-w a-r-r-o-w requested review from DN6, stevhliu and yiyixuxu and removed request for yiyixuxu November 27, 2024 14:22
Comment on lines +199 to +202
```python
elif qk_norm == "rms_norm_across_heads":
    # LTX applies qk norm across all heads
    self.norm_q = RMSNorm(dim_head * heads, eps=eps)
    self.norm_k = RMSNorm(dim_head * kv_heads, eps=eps)
```
Contributor Author

@DN6 Should I follow your approach with Mochi and create a separate attention class for LTX?
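
For context, here is a minimal sketch (pure PyTorch, shapes hypothetical) of the difference between per-head qk norm and the across-heads variant LTX uses, where the RMSNorm is applied over the full `heads * dim_head` vector rather than each head's `dim_head` slice:

```python
import torch

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: rescale by the root-mean-square over the last dimension
    var = x.pow(2).mean(dim=-1, keepdim=True)
    return x * torch.rsqrt(var + eps) * weight

heads, dim_head, seq = 2, 4, 3
q = torch.randn(1, seq, heads * dim_head)  # (batch, seq, heads * dim_head)

# per-head: normalize each head's dim_head-sized vector independently
q_per_head = rms_norm(q.view(1, seq, heads, dim_head), torch.ones(dim_head)).view(1, seq, -1)

# across heads: one norm over the concatenated heads * dim_head vector
q_across = rms_norm(q, torch.ones(heads * dim_head))
```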

Collaborator

ok, but we want to be more careful; ideally we do that as part of a carefully planned-out refactor.
Maybe it would be safe to just inherit from Attention for now? e.g. we wrote code like this with the assumption in mind that we only have one attention class:
https://github.com/huggingface/diffusers/blob/e47cc1fc1a89a5375c322d296cd122fe71ab859f/src/diffusers/pipelines/pag/pag_utils.py#L57C39-L57C48

cc @DN6 here too

Collaborator

I guess Attention stays here for now

Member

@stevhliu stevhliu left a comment

Thanks for adding!

@a-r-r-o-w
Contributor Author

a-r-r-o-w commented Dec 4, 2024

Thanks for the reviews! Waiting for confirmation of where we should host the weights so I can update the documentation accordingly. Currently they are under my account, but once we move them, we should be good to merge.

@Skquark

Skquark commented Dec 6, 2024

I'm wondering if it might be easy to incorporate STG (Spatiotemporal Skip Guidance) into the LTX-Video pipeline. The improvements in video quality look significant, and the examples are impressive. Here are the links: STG Project, STGuidance GitHub, ComfyUI-LTXTricks. It looks like it can also be applied to Mochi, SVD, and Open-Sora. Could be that missing ingredient... It adds the params stg_mode, stg_scale, stg_block_idx, do_rescaling & rescaling_scale.

@a-r-r-o-w
Contributor Author

Hey, thanks for the suggestion!

We do plan to incorporate STG and other guidance methods by isolating them into a separate component. I don't like the idea of adding more parameters to the pipeline __call__ because it starts to become very confusing and bloated, so the design for the integration is still a WIP on my end, but I plan to open a PR in the coming week, or whenever I can get it working with all our pipelines.
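
As a rough, hypothetical sketch of how such a skip-guidance term typically composes with classifier-free guidance (this is not the diffusers API, and the exact STG formula may differ): a third "perturbed" prediction is made with some transformer blocks skipped, and its difference from the conditional prediction is added as an extra guidance term.

```python
import torch

def combine_guidance(noise_uncond, noise_cond, noise_perturbed, cfg_scale, stg_scale):
    # standard classifier-free guidance between unconditional and conditional predictions
    cfg = noise_uncond + cfg_scale * (noise_cond - noise_uncond)
    # extra term pushing away from the block-skipped ("perturbed") prediction
    return cfg + stg_scale * (noise_cond - noise_perturbed)

# toy values: with stg_scale=0 this reduces to plain CFG
eps_u, eps_c, eps_p = torch.zeros(4), torch.ones(4), 0.5 * torch.ones(4)
out = combine_guidance(eps_u, eps_c, eps_p, cfg_scale=3.0, stg_scale=1.0)
```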

@yoavhacohen yoavhacohen left a comment

One request about naming: LTX -> LTXV

@a-r-r-o-w a-r-r-o-w merged commit 96c376a into main Dec 12, 2024
15 checks passed
@a-r-r-o-w a-r-r-o-w deleted the ltx-integration branch December 12, 2024 10:51
@tin2tin

tin2tin commented Dec 15, 2024

@a-r-r-o-w

Testing, I get this error:
AttributeError: module diffusers has no attribute LTXTransformer3DModel. Did you mean: 'LatteTransformer3DModel'

@a-r-r-o-w
Contributor Author

@tin2tin I think it should be LTXVideoTransformer3DModel

@tin2tin

tin2tin commented Dec 15, 2024

@a-r-r-o-w I just used the test code in the first post, which doesn't specify that name. So I guess something internally is referencing a wrongly named class?

@a-r-r-o-w
Contributor Author

Just to confirm, you have installed diffusers from the main branch, yes?

@a-r-r-o-w
Contributor Author

Also, the config files on the LTX repo were updated recently. Could you ensure you're using the latest commit of those configs so that the right model names are being pointed to?

https://huggingface.co/Lightricks/LTX-Video/commits/main
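
For reference, the failing lookup is essentially a `getattr` on the class name stored in the repo's config (`model_index.json`); a minimal stdlib-only stand-in (names illustrative) shows why an outdated config triggers exactly this AttributeError:

```python
import types

# stand-in for the diffusers module namespace after the class rename
library = types.SimpleNamespace(LTXVideoTransformer3DModel=object)

def get_class_obj(library, class_name):
    # pipeline loading looks the class up by the name stored in model_index.json,
    # so a config still pointing at "LTXTransformer3DModel" raises AttributeError here
    return getattr(library, class_name)

cls = get_class_obj(library, "LTXVideoTransformer3DModel")  # the updated name resolves
```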

@tin2tin

tin2tin commented Dec 16, 2024

Yes, I'm on the latest main branch, and I downloaded the LTX repo yesterday, and it seems like the name change was 4 days ago.

```
Error: Python: Traceback (most recent call last):
  File "C:\Users\xxx\Documents\Blender Projekter\LTX_video.blend\Text", line 5, in <module>
  File "C:\Users\xxx\AppData\Roaming\Python\Python311\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Roaming\Python\Python311\site-packages\diffusers\pipelines\pipeline_utils.py", line 902, in from_pretrained
    loaded_sub_model = load_sub_model(
                       ^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Roaming\Python\Python311\site-packages\diffusers\pipelines\pipeline_loading_utils.py", line 635, in load_sub_model
    class_obj, class_candidates = get_class_obj_and_candidates(
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Roaming\Python\Python311\site-packages\diffusers\pipelines\pipeline_loading_utils.py", line 319, in get_class_obj_and_candidates
    class_obj = getattr(library, class_name)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Roaming\Python\Python311\site-packages\diffusers\utils\import_utils.py", line 861, in __getattr__
    raise AttributeError(f"module {self.__name__} has no attribute {name}")
AttributeError: module diffusers has no attribute LTXTransformer3DModel. Did you mean: 'LatteTransformer3DModel'?
```

Checking dependencies...

@Abhinay1997
Contributor

@a-r-r-o-w minor typo here: cond_mask should use mask_shape rather than shape. https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/ltx/pipeline_ltx_image2video.py#L493
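
To illustrate the distinction (shapes and values hypothetical, not the pipeline's actual code): the latents carry a channel dimension, while the conditioning mask needs one value per latent position, hence the separate mask_shape.

```python
import torch

shape = (1, 128, 8, 16, 16)     # latents: (batch, channels, frames, height, width)
mask_shape = (1, 1, 8, 16, 16)  # one mask value per latent position, single channel

cond_mask = torch.zeros(mask_shape)
cond_mask[:, :, 0] = 1.0        # e.g. condition only on the first latent frame
```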

sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
* transformer

* make style & make fix-copies

* transformer

* add transformer tests

* 80% vae

* make style

* make fix-copies

* fix

* undo cogvideox changes

* update

* update

* match vae

* add docs

* t2v pipeline working; scheduler needs to be checked

* docs

* add pipeline test

* update

* update

* make fix-copies

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>

* update

* copy t2v to i2v pipeline

* update

* apply review suggestions

* update

* make style

* remove framewise encoding/decoding

* pack/unpack latents

* image2video

* update

* make fix-copies

* update

* update

* rope scale fix

* debug layerwise code

* remove debug

* Apply suggestions from code review

Co-authored-by: YiYi Xu <[email protected]>

* propagate precision changes to i2v pipeline

* remove downcast

* address review comments

* fix comment

* address review comments

* [Single File] LTX support for loading original weights (#10135)

* from original file mixin for ltx

* undo config mapping fn changes

* update

* add single file to pipelines

* update docs

* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py

* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py

* rename classes based on ltx review

* point to original repository for inference

* make style

* resolve conflicts correctly

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: YiYi Xu <[email protected]>