Closed
Labels
bug: Something isn't working
Description
Describe the bug
(venv) C:\ai1\LTX-Video>python inference.py
Traceback (most recent call last):
  File "C:\ai1\LTX-Video\inference.py", line 23, in <module>
    text_encoder = T5EncoderModel.from_pretrained(
  File "C:\ai1\LTX-Video\venv\lib\site-packages\transformers\modeling_utils.py", line 3779, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory Lightricks/LTX-Video.
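One thing worth ruling out: the directory listing under Logs shows four `model-0000x-of-00004.safetensors` shards in `text_encoder` but no index file next to them, whereas `transformer` does have a `diffusion_pytorch_model.safetensors.index.json`. The sketch below is only a diagnostic under stated assumptions, not a confirmed fix. It checks a local folder for the file names listed in the error message, plus the shard index names `model.safetensors.index.json` and `pytorch_model.bin.index.json`, which are an assumption about what transformers looks for and are not stated anywhere in this report. The helper names are hypothetical.

```python
from pathlib import Path

# File names copied from the OSError message above.
SINGLE_FILE_NAMES = (
    "pytorch_model.bin",
    "model.safetensors",
    "tf_model.h5",
    "model.ckpt.index",
    "flax_model.msgpack",
)
# Assumption: index files a loader uses to locate sharded weights.
SHARD_INDEX_NAMES = (
    "model.safetensors.index.json",
    "pytorch_model.bin.index.json",
)


def find_weight_entry_points(model_dir):
    """Return the candidate file names that actually exist in model_dir."""
    directory = Path(model_dir)
    candidates = SINGLE_FILE_NAMES + SHARD_INDEX_NAMES
    return [name for name in candidates if (directory / name).is_file()]


def has_orphan_shards(model_dir):
    """True if safetensors shards are present but their index file is not."""
    directory = Path(model_dir)
    shards = list(directory.glob("model-*-of-*.safetensors"))
    index_present = (directory / "model.safetensors.index.json").is_file()
    return bool(shards) and not index_present


if __name__ == "__main__":
    # Path relative to the working directory used in this report.
    folder = Path("Lightricks/LTX-Video") / "text_encoder"
    print("entry points found:", find_weight_entry_points(folder))
    print("shards without index:", has_orphan_shards(folder))
```

If the second line prints `True`, re-downloading the `text_encoder` folder so that the index file sits beside the shards would be a reasonable next step to try.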
Reproduction
Install diffusers from source and use the code mentioned here
https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video
Logs
C:\ai1\LTX-Video\Lightricks>tree /F
Folder PATH listing for volume Windows-SSD
Volume serial number is CE9F-A6AE
C:.
└───LTX-Video
    │   ltx-video-2b-v0.9.1.safetensors
    │   model_index.json
    │
    ├───text_encoder
    │       config.json
    │       model-00001-of-00004.safetensors
    │       model-00002-of-00004.safetensors
    │       model-00003-of-00004.safetensors
    │       model-00004-of-00004.safetensors
    │
    ├───tokenizer
    │       added_tokens.json
    │       special_tokens_map.json
    │       spiece.model
    │       tokenizer_config.json
    │
    ├───transformer
    │       config.json
    │       diffusion_pytorch_model-00001-of-00002.safetensors
    │       diffusion_pytorch_model-00002-of-00002.safetensors
    │       diffusion_pytorch_model.safetensors.index.json
    │
    └───vae
            config.json
            diffusion_pytorch_model.safetensors
System Info
Windows 11 / Python 3.10.11
(venv) C:\ai1\LTX-Video>pip list
Package            Version
------------------ ------------
accelerate         1.2.1
certifi            2024.12.14
charset-normalizer 3.4.0
colorama           0.4.6
diffusers          0.32.0.dev0
einops             0.8.0
filelock           3.16.1
fsspec             2024.12.0
gguf               0.13.0
huggingface-hub    0.25.2
idna               3.10
importlib_metadata 8.5.0
Jinja2             3.1.4
MarkupSafe         3.0.2
mpmath             1.3.0
networkx           3.4.2
numpy              2.2.0
packaging          24.2
pillow             11.0.0
pip                23.0.1
psutil             6.1.1
PyYAML             6.0.2
regex              2024.11.6
requests           2.32.3
safetensors        0.4.5
sentencepiece      0.2.0
setuptools         65.5.0
sympy              1.13.1
tokenizers         0.21.0
torch              2.5.1+cu124
torchvision        0.20.1+cu124
tqdm               4.67.1
transformers       4.47.1
typing_extensions  4.12.2
urllib3            2.2.3
wheel              0.45.1
zipp               3.21.0
Who can help?
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video  # needed for the final export call
from transformers import T5EncoderModel, T5Tokenizer
single_file_url = "Lightricks/LTX-Video/ltx-video-2b-v0.9.1.safetensors"
text_encoder = T5EncoderModel.from_pretrained(
  "Lightricks/LTX-Video", subfolder="text_encoder", torch_dtype=torch.bfloat16
)
tokenizer = T5Tokenizer.from_pretrained(
  "Lightricks/LTX-Video", subfolder="tokenizer", torch_dtype=torch.bfloat16
)
pipe = LTXPipeline.from_single_file(
  single_file_url, text_encoder=text_encoder, tokenizer=tokenizer, torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()
prompt = "A woman with long brown hair and light skin smiles at another woman with long blonde hair. The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage"
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"
video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output_ltx.mp4", fps=24)