Closed
Labels
bug: Something isn't working
Description
Describe the bug
Loading the LTX-Video 0.9.5 checkpoint with from_single_file, following the documentation, fails with a shape-mismatch error.
Reproduction
Combining the 0.9.5 weights with the from_single_file code from the documentation fails: https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video
import torch
from diffusers import AutoencoderKLLTXVideo, LTXPipeline, LTXVideoTransformer3DModel
from diffusers.utils import export_to_video
single_file_url = "https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.5.safetensors"
transformer = LTXVideoTransformer3DModel.from_single_file(
    single_file_url, torch_dtype=torch.bfloat16
)
vae = AutoencoderKLLTXVideo.from_single_file(single_file_url, torch_dtype=torch.bfloat16)
pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video", transformer=transformer, vae=vae, torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()
prompt = "A woman with long brown hair and light skin smiles at another woman with long blonde hair. The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage"
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"
video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output_gguf_ltx.mp4", fps=24)
Logs
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 4/4 [00:09<00:00,  2.26s/it]
special_tokens_map.json: 100%|████████████████████████████████████████████████████████████| 2.54k/2.54k [00:00<?, ?B/s]
added_tokens.json: 100%|██████████████████████████████████████████████████████████████████| 2.59k/2.59k [00:00<?, ?B/s]
tokenizer_config.json: 100%|██████████████████████████████████████████████████████| 20.6k/20.6k [00:00<00:00, 20.6MB/s]
scheduler_config.json: 100%|██████████████████████████████████████████████████████████████████| 419/419 [00:00<?, ?B/s]
model.safetensors.index.json: 100%|███████████████████████████████████████████████| 19.9k/19.9k [00:00<00:00, 4.97MB/s]
config.json: 100%|████████████████████████████████████████████████████████████████████████████| 786/786 [00:00<?, ?B/s]
model_index.json: 100%|███████████████████████████████████████████████████████████████████████| 412/412 [00:00<?, ?B/s]
Fetching 9 files: 100%|██████████████████████████████████████████████████████████████████| 9/9 [00:00<00:00, 10.84it/s]
Loading pipeline components...:  80%|██████████████████████████████████████████          | 4/5 [00:00<00:00, 13.73it/s]
Error: Python: Traceback (most recent call last):
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\loaders\single_file.py", line 509, in from_single_file
    loaded_sub_model = load_single_file_sub_model(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\loaders\single_file.py", line 104, in load_single_file_sub_model
    loaded_sub_model = load_method(
                       ^^^^^^^^^^^^
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\loaders\single_file_model.py", line 399, in from_single_file
    load_model_dict_into_meta(
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\models\model_loading_utils.py", line 288, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load  because encoder.conv_out.conv.weight expected shape torch.Size([129, 512, 3, 3, 3]), but got torch.Size([129, 2048, 3, 3, 3]). If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example.
Error: Python: Traceback (most recent call last):
  File "C:\Users\peter\Documents\Blender Projekter\ltx.blend\Text", line 7, in <module>
NameError: name 'LTXVideoTransformer3DModel' is not defined
Error: Python: Traceback (most recent call last):
  File "C:\Users\peter\Documents\Blender Projekter\ltx.blend\Text", line 3, in <module>
ImportError: cannot import name 'LTXVideoTransformer3DModel' from 'transformers' (C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\transformers\__init__.py)
Error: Python: Traceback (most recent call last):
  File "C:\Users\peter\Documents\Blender Projekter\ltx.blend\Text", line 9, in <module>
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\loaders\single_file_model.py", line 399, in from_single_file
    load_model_dict_into_meta(
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\models\model_loading_utils.py", line 288, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load  because encoder.conv_out.conv.weight expected shape torch.Size([129, 512, 3, 3, 3]), but got torch.Size([129, 2048, 3, 3, 3]). If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example.
Error: Python: Traceback (most recent call last):
  File "C:\Users\peter\Documents\Blender Projekter\ltx.blend\Text", line 9, in <module>
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\loaders\single_file_model.py", line 399, in from_single_file
    load_model_dict_into_meta(
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\models\model_loading_utils.py", line 288, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load  because encoder.conv_out.conv.weight expected shape torch.Size([129, 512, 3, 3, 3]), but got torch.Size([129, 2048, 3, 3, 3]). If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example.
System Info
Win 11, Diffusers: 0.33.0
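To help narrow down the shape mismatch in the logs above, the tensor shapes stored in the checkpoint can be read directly from the safetensors header without loading any weights. The sketch below uses only the standard library and assumes the checkpoint has already been downloaded locally. The helper names `read_safetensors_shapes` and `find_shapes` are hypothetical, and the key names in the original checkpoint may differ from the diffusers names shown in the error, since from_single_file remaps keys during conversion.

```python
import json
import struct
from typing import BinaryIO, Dict, List


def read_safetensors_shapes(stream: BinaryIO) -> Dict[str, List[int]]:
    """Return {tensor_name: shape} from a safetensors stream (hypothetical helper)."""
    # A safetensors file starts with an unsigned 64-bit little-endian integer
    # giving the header length, followed by that many bytes of UTF-8 JSON.
    (header_len,) = struct.unpack("<Q", stream.read(8))
    header = json.loads(stream.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return {
        name: entry["shape"]
        for name, entry in header.items()
        if name != "__metadata__"
    }


def find_shapes(shapes: Dict[str, List[int]], fragment: str) -> Dict[str, List[int]]:
    """Keep only the tensors whose name contains `fragment` (hypothetical helper)."""
    return {name: shape for name, shape in shapes.items() if fragment in name}


# Example usage (the local path is an assumption):
#
#     with open("ltx-video-2b-v0.9.5.safetensors", "rb") as f:
#         shapes = read_safetensors_shapes(f)
#     print(find_shapes(shapes, "conv_out"))
```

If the `conv_out` weight in the file shows 2048 input channels, the checkpoint itself matches the "got" shape in the error, which would suggest that the VAE config used to build the model, rather than the file, is what expects 512.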
Who can help?