LTX 0.95 Single file

### Describe the bug

Following the documentation, loading the LTX-Video 0.9.5 checkpoint with `from_single_file` fails with a shape-mismatch `ValueError`.

### Reproduction

Combining the 0.9.5 weights with the `from_single_file` example from the documentation (https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video) fails:

```python
import torch
from diffusers import AutoencoderKLLTXVideo, LTXPipeline, LTXVideoTransformer3DModel
from diffusers.utils import export_to_video

single_file_url = "https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.5.safetensors"

# Load the transformer and the VAE from the same single-file checkpoint.
transformer = LTXVideoTransformer3DModel.from_single_file(
    single_file_url, torch_dtype=torch.bfloat16
)
vae = AutoencoderKLLTXVideo.from_single_file(single_file_url, torch_dtype=torch.bfloat16)

# The remaining components (text encoder, tokenizer, scheduler) come from the Hub repo.
pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video", transformer=transformer, vae=vae, torch_dtype=torch.bfloat16
)

pipe.enable_model_cpu_offload()

prompt = "A woman with long brown hair and light skin smiles at another woman with long blonde hair. The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage"
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output_gguf_ltx.mp4", fps=24)
```

### Logs

```shell
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 4/4 [00:09<00:00,  2.26s/it]
special_tokens_map.json: 100%|████████████████████████████████████████████████████████████| 2.54k/2.54k [00:00<?, ?B/s]
added_tokens.json: 100%|██████████████████████████████████████████████████████████████████| 2.59k/2.59k [00:00<?, ?B/s]
tokenizer_config.json: 100%|██████████████████████████████████████████████████████| 20.6k/20.6k [00:00<00:00, 20.6MB/s]
scheduler_config.json: 100%|██████████████████████████████████████████████████████████████████| 419/419 [00:00<?, ?B/s]
model.safetensors.index.json: 100%|███████████████████████████████████████████████| 19.9k/19.9k [00:00<00:00, 4.97MB/s]
config.json: 100%|████████████████████████████████████████████████████████████████████████████| 786/786 [00:00<?, ?B/s]
model_index.json: 100%|███████████████████████████████████████████████████████████████████████| 412/412 [00:00<?, ?B/s]
Fetching 9 files: 100%|██████████████████████████████████████████████████████████████████| 9/9 [00:00<00:00, 10.84it/s]
Loading pipeline components...:  80%|█████████████████████████████████████████▌          | 4/5 [00:00<00:00, 13.73it/s]
Error: Python: Traceback (most recent call last):
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\loaders\single_file.py", line 509, in from_single_file
    loaded_sub_model = load_single_file_sub_model(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\loaders\single_file.py", line 104, in load_single_file_sub_model
    loaded_sub_model = load_method(
                       ^^^^^^^^^^^^
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\loaders\single_file_model.py", line 399, in from_single_file
    load_model_dict_into_meta(
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\models\model_loading_utils.py", line 288, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load  because encoder.conv_out.conv.weight expected shape torch.Size([129, 512, 3, 3, 3]), but got torch.Size([129, 2048, 3, 3, 3]). If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example.
Error: Python: Traceback (most recent call last):
  File "C:\Users\peter\Documents\Blender Projekter\ltx.blend\Text", line 7, in <module>
NameError: name 'LTXVideoTransformer3DModel' is not defined
Error: Python: Traceback (most recent call last):
  File "C:\Users\peter\Documents\Blender Projekter\ltx.blend\Text", line 3, in <module>
ImportError: cannot import name 'LTXVideoTransformer3DModel' from 'transformers' (C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\transformers\__init__.py)
Error: Python: Traceback (most recent call last):
  File "C:\Users\peter\Documents\Blender Projekter\ltx.blend\Text", line 9, in <module>
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\loaders\single_file_model.py", line 399, in from_single_file
    load_model_dict_into_meta(
  File "C:\Users\peter\Documents\blender-4.4.1\blender-4.4.1\4.4\python\Lib\site-packages\diffusers\models\model_loading_utils.py", line 288, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load  because encoder.conv_out.conv.weight expected shape torch.Size([129, 512, 3, 3, 3]), but got torch.Size([129, 2048, 3, 3, 3]). If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example.
```
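
To confirm which shape the checkpoint actually stores for the tensor named in the `ValueError`, the safetensors header can be read directly without loading any weights. The following is a minimal sketch, not part of diffusers: `read_safetensors_shapes` and `find_shapes` are hypothetical helper names, and it assumes the checkpoint has been downloaded locally. Tensor names in the original checkpoint may carry a prefix, so the lookup matches on a name suffix rather than an exact key.

```python
import json
import struct


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors file without loading weights."""
    with open(path, "rb") as f:
        # A safetensors file starts with an unsigned 64-bit little-endian
        # header length, followed by that many bytes of UTF-8 JSON.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    return {
        name: tuple(info["shape"])
        for name, info in header.items()
        if name != "__metadata__"  # optional free-form metadata entry, not a tensor
    }


def find_shapes(shapes, suffix):
    """Select tensors whose name ends with `suffix` (prefixes may differ per checkpoint)."""
    return {name: shape for name, shape in shapes.items() if name.endswith(suffix)}
```

Calling `find_shapes(read_safetensors_shapes(local_path), "encoder.conv_out.conv.weight")` on a local copy of the checkpoint (`local_path` is a placeholder) should show whether the stored tensor has 512 or 2048 input channels, which is the mismatch the error reports.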

### System Info

Win 11, Diffusers: 0.33.0

### Who can help?

@DN6 @a-r-r-o-w
