Cannot load SD3.5M with from_single_file, mismatched shape for pos_embed.pos_embed in empty state dict #10016

### Describe the bug

Loading the Stable Diffusion 3.5 Medium checkpoint from Stability AI fails with the following command:

`pipe = diffusers.StableDiffusion3Pipeline.from_single_file('...path/sd3.5_medium.safetensors', text_encoder=None, text_encoder_2=None, text_encoder_3=None)`

The following error is produced: "Cannot load  because pos_embed.pos_embed expected shape torch.Size([1, 36864, 1536]), but got torch.Size([1, 147456, 1536])"
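The two sizes in the message differ by exactly a factor of four. Assuming the second dimension of `pos_embed.pos_embed` is a flattened square grid of positions (an assumption about the layout, not something the error states), each count maps to a grid side. The sketch below checks that arithmetic; `grid_side` is a hypothetical helper, not part of diffusers.

```python
import math


def grid_side(num_positions: int) -> int:
    """Return the side length of a square grid holding `num_positions` cells."""
    side = math.isqrt(num_positions)
    if side * side != num_positions:
        raise ValueError(f"{num_positions} is not a perfect square")
    return side


# Second dimension of the two shapes reported in the error message.
expected_side = grid_side(36864)   # shape the empty model was built with
loaded_side = grid_side(147456)    # shape stored in the checkpoint

print(expected_side, loaded_side, loaded_side // expected_side)  # prints: 192 384 2
```

If that assumption holds, the empty model is being created with a positional grid of side 192 while the checkpoint stores one of side 384, which would point at the configuration chosen for the empty model rather than at the weights themselves.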

Please note that I slightly altered model_loading_utils.py to print this error, because the original line formats `empty_state_dict[param_name]` itself rather than its `.shape`:

`f"Cannot load {model_name_or_path_str} because {param_name} expected shape {empty_state_dict[param_name]}, but got {param.shape}"`

I added `.shape` to `empty_state_dict[param_name]`, as I think that was the intention.
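A minimal sketch of the formatting that change aims for, assuming both sides of the comparison should be reported as shapes. `shape_mismatch_message` is a hypothetical helper written for illustration, and `SimpleNamespace` objects stand in for tensors so the example runs without torch; this is not the diffusers source.

```python
from types import SimpleNamespace


def shape_mismatch_message(model_name, param_name, expected_param, loaded_param):
    """Build the mismatch message from the two shapes, not the objects themselves."""
    return (
        f"Cannot load {model_name} because {param_name} expected shape "
        f"{expected_param.shape}, but got {loaded_param.shape}"
    )


# Stand-ins for tensors: only the `.shape` attribute matters here.
expected = SimpleNamespace(shape=(1, 36864, 1536))
loaded = SimpleNamespace(shape=(1, 147456, 1536))

message = shape_mismatch_message("", "pos_embed.pos_embed", expected, loaded)
print(message)
```

Formatting the shape on both sides keeps the message short and directly comparable, instead of embedding the representation of a whole parameter in the error text.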

The checkpoint I am using is from around the first day of release, so it may have been changed since, though it does not appear to have been updated in the Hugging Face repo.

### Reproduction

`pipe = diffusers.StableDiffusion3Pipeline.from_single_file('...path/sd3.5_medium.safetensors', text_encoder=None, text_encoder_2=None, text_encoder_3=None)`

### Logs

```shell
Traceback (most recent call last):
  File "...path\sd_trainer.py", line 360, in <module>
    pipe = diffusers.StableDiffusion3Pipeline.from_single_file( os.path.join(config.models_dir, config.init_model), text_encoder=None, text_encoder_2=None, text_encoder_3=None )
  File "...path\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "...path\venv\src\diffusers\src\diffusers\loaders\single_file.py", line 495, in from_single_file
    loaded_sub_model = load_single_file_sub_model(
  File "...path\venv\src\diffusers\src\diffusers\loaders\single_file.py", line 102, in load_single_file_sub_model
    loaded_sub_model = load_method(
  File "...path\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "...path\venv\src\diffusers\src\diffusers\loaders\single_file_model.py", line 299, in from_single_file
    unexpected_keys = load_model_dict_into_meta(model, diffusers_format_checkpoint, dtype=torch_dtype)
  File "...path\venv\src\diffusers\src\diffusers\models\model_loading_utils.py", line 223, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load  because pos_embed.pos_embed expected shape torch.Size([1, 36864, 1536]), but got torch.Size([1, 147456, 1536]). If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example.
```


### System Info

Windows 10, with the following diffusers version, which is the latest as of this post:

-e git+https://github.com/huggingface/diffusers.git@074e123#egg=diffusers
transformers==4.46.3

### Who can help?

@yiyixuxu @sayakpaul @DN6 @asomoza
