Skip to content

Conversation

@a-r-r-o-w
Copy link
Contributor

@a-r-r-o-w a-r-r-o-w commented Dec 5, 2024

With Diffusers weights:

import torch
from diffusers import LTXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

pipe = LTXImageToVideoPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16, revision="refs/pr/32")
pipe.to("cuda")

image = load_image("https://huggingface.co/datasets/a-r-r-o-w/tiny-meme-dataset-captioned/resolve/main/images/8.png")
prompt = "A young girl stands calmly in the foreground, looking directly at the camera, as a house fire rages in the background. Flames engulf the structure, with smoke billowing into the air. Firefighters in protective gear rush to the scene, a fire truck labeled '38' visible behind them. The girl's neutral expression contrasts sharply with the chaos of the fire, creating a poignant and emotionally charged scene."
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

video = pipe(
    image=image,
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=24)

With original weights:

import torch
from diffusers import AutoencoderKLLTX, LTXImageToVideoPipeline, LTXTransformer3DModel
from diffusers.utils import export_to_video, load_image

single_file_url = "https://huggingface.co/Lightricks/LTX-Video/ltx-video-2b-v0.9.safetensors"
transformer = LTXTransformer3DModel.from_single_file(single_file_url, torch_dtype=torch.bfloat16, revision="refs/pr/32")
vae = AutoencoderKLLTX.from_single_file(single_file_url, torch_dtype=torch.bfloat16, revision="refs/pr/32")
pipe = LTXImageToVideoPipeline.from_pretrained("Lightricks/LTX-Video", transformer=transformer, vae=vae, torch_dtype=torch.bfloat16, revision="refs/pr/32")
pipe.to("cuda")

image = load_image("https://huggingface.co/datasets/a-r-r-o-w/tiny-meme-dataset-captioned/resolve/main/images/8.png")
prompt = "A young girl stands calmly in the foreground, looking directly at the camera, as a house fire rages in the background. Flames engulf the structure, with smoke billowing into the air. Firefighters in protective gear rush to the scene, a fire truck labeled '38' visible behind them. The girl's neutral expression contrasts sharply with the chaos of the fire, creating a poignant and emotionally charged scene."
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

video = pipe(
    image=image,
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=24)

Discussion: https://huggingface.slack.com/archives/C08275HSG8J/p1733324207024939

cc @yiyixuxu

@a-r-r-o-w a-r-r-o-w requested review from DN6 and yiyixuxu December 5, 2024 22:05
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@a-r-r-o-w a-r-r-o-w merged commit 9ba6a06 into ltx-integration Dec 10, 2024
2 checks passed
@a-r-r-o-w a-r-r-o-w deleted the ltx-single-file branch December 10, 2024 08:42
a-r-r-o-w added a commit that referenced this pull request Dec 12, 2024
* transformer

* make style & make fix-copies

* transformer

* add transformer tests

* 80% vae

* make style

* make fix-copies

* fix

* undo cogvideox changes

* update

* update

* match vae

* add docs

* t2v pipeline working; scheduler needs to be checked

* docs

* add pipeline test

* update

* update

* make fix-copies

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>

* update

* copy t2v to i2v pipeline

* update

* apply review suggestions

* update

* make style

* remove framewise encoding/decoding

* pack/unpack latents

* image2video

* update

* make fix-copies

* update

* update

* rope scale fix

* debug layerwise code

* remove debug

* Apply suggestions from code review

Co-authored-by: YiYi Xu <[email protected]>

* propagate precision changes to i2v pipeline

* remove downcast

* address review comments

* fix comment

* address review comments

* [Single File] LTX support for loading original weights (#10135)

* from original file mixin for ltx

* undo config mapping fn changes

* update

* add single file to pipelines

* update docs

* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py

* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py

* rename classes based on ltx review

* point to original repository for inference

* make style

* resolve conflicts correctly

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: YiYi Xu <[email protected]>
sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
* transformer

* make style & make fix-copies

* transformer

* add transformer tests

* 80% vae

* make style

* make fix-copies

* fix

* undo cogvideox changes

* update

* update

* match vae

* add docs

* t2v pipeline working; scheduler needs to be checked

* docs

* add pipeline test

* update

* update

* make fix-copies

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>

* update

* copy t2v to i2v pipeline

* update

* apply review suggestions

* update

* make style

* remove framewise encoding/decoding

* pack/unpack latents

* image2video

* update

* make fix-copies

* update

* update

* rope scale fix

* debug layerwise code

* remove debug

* Apply suggestions from code review

Co-authored-by: YiYi Xu <[email protected]>

* propagate precision changes to i2v pipeline

* remove downcast

* address review comments

* fix comment

* address review comments

* [Single File] LTX support for loading original weights (#10135)

* from original file mixin for ltx

* undo config mapping fn changes

* update

* add single file to pipelines

* update docs

* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py

* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py

* rename classes based on ltx review

* point to original repository for inference

* make style

* resolve conflicts correctly

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: YiYi Xu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants