Sana 4k with use_resolution_binning not supported due to sample_size 128

### Describe the bug

Using the new 4K model fails with the default values, specifically with `use_resolution_binning=True`, which is the default.

```
Traceback (most recent call last):
  File "/home/rockerboo/code/others/sana-diffusers/main.py", line 28, in <module>
    image = pipe(
            ^^^^^
  File "/home/rockerboo/code/others/sana-diffusers/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/rockerboo/code/others/sana-diffusers/.venv/lib/python3.11/site-packages/diffusers/pipelines/pag/pipeline_pag_sana.py", line 736, in __call__
    raise ValueError("Invalid sample size")
ValueError: Invalid sample size
```

Specifically, https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/pag/pipeline_pag_sana.py#L728-L736 limits the binning to a fixed set of sample sizes, which doesn't include the one used by the 4K model.

In https://huggingface.co/Efficient-Large-Model/Sana_1600M_4Kpx_BF16_diffusers/blob/main/transformer/config.json#L20 the sample size is 128.

It should just be a matter of adding the binning information for 4K.
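To illustrate the failure mode, here is a minimal sketch of a sample-size lookup that raises the same error. It is not the actual diffusers code: the table labels, `BIN_TABLES`, `TABLES_WITH_4K`, and `select_bin_table` are hypothetical, and I am assuming the existing branches cover sample sizes 16, 32 and 64.

```python
# Hypothetical stand-ins for the real aspect-ratio bin tables.
# Assumption: the current lookup only knows sample sizes 16, 32 and 64.
BIN_TABLES = {
    16: "512px bins",
    32: "1024px bins",
    64: "2048px bins",
}


def select_bin_table(sample_size, tables=BIN_TABLES):
    """Return the bin table registered for a transformer sample size."""
    if sample_size not in tables:
        # Mirrors the error message seen in the traceback.
        raise ValueError("Invalid sample size")
    return tables[sample_size]


# Assumed shape of the fix: register a 4K table under sample size 128.
TABLES_WITH_4K = {**BIN_TABLES, 128: "4096px bins"}
```

With the default table, `select_bin_table(128)` raises `ValueError("Invalid sample size")`; with `TABLES_WITH_4K` it returns the 4K entry.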

### Reproduction

Follow either the PAG or the non-PAG instructions at https://huggingface.co/Efficient-Large-Model/Sana_1600M_4Kpx_BF16_diffusers#1-how-to-use-sanapipeline-with-%F0%9F%A7%A8diffusers.

```python
# run `pip install git+https://github.com/huggingface/diffusers` before using Sana in diffusers
import torch
from diffusers import SanaPipeline

pipe = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_4Kpx_BF16_diffusers",
    variant="bf16",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

pipe.vae.to(torch.bfloat16)
pipe.text_encoder.to(torch.bfloat16)

# work around the OOM issue for 4096x4096 image generation
if pipe.transformer.config.sample_size == 128:
    from patch_conv import convert_model
    pipe.vae = convert_model(pipe.vae, splits=32)

prompt = 'A cute 🐼 eating 🎋, ink drawing style'
image = pipe(
    prompt=prompt,
    height=4096,
    width=4096,
    guidance_scale=5.0,
    num_inference_steps=20,
    generator=torch.Generator(device="cuda").manual_seed(42),
)[0]

image[0].save("sana.png")
```
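A possible workaround until 4K bins exist is to turn binning off for this model. The sketch below only builds the call arguments; `build_call_kwargs` and `binned_sample_sizes` are hypothetical names, the set of supported sample sizes is an assumption, and I have not verified the output quality with binning disabled.

```python
def build_call_kwargs(sample_size, binned_sample_sizes=(16, 32, 64)):
    """Build pipeline call arguments, disabling binning for unknown sample sizes."""
    kwargs = {
        "height": 4096,
        "width": 4096,
        "guidance_scale": 5.0,
        "num_inference_steps": 20,
    }
    # Assumption: only these sample sizes have bin tables, so skip binning otherwise.
    if sample_size not in binned_sample_sizes:
        kwargs["use_resolution_binning"] = False
    return kwargs
```

It would be used as `pipe(prompt=prompt, **build_call_kwargs(pipe.transformer.config.sample_size))`.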

### Logs

```shell
A mixture of bf16 and non-bf16 filenames will be loaded.
Loaded bf16 filenames:
[vae/diffusion_pytorch_model.bf16.safetensors, transformer/diffusion_pytorch_model.bf16.safetensors, text_encoder/model.bf16-00002-of-00002.safetensors, text_encoder/model.bf16-00001-of-00002.safetensors]
Loaded non-bf16 filenames:
[transformer/diffusion_pytorch_model-00001-of-00002.safetensors, transformer/diffusion_pytorch_model-00002-of-00002.safetensors
If this behavior is not expected, please check your folder structure.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████| 2/2 [00:01<00:00,  1.97it/s]
Loading pipeline components...: 100%|█████████████████████████████████████████████████| 5/5 [00:03<00:00,  1.65it/s]
Traceback (most recent call last):
  File "/home/rockerboo/code/others/sana-diffusers/main.py", line 28, in <module>
    image = pipe(
            ^^^^^
  File "/home/rockerboo/code/others/sana-diffusers/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/rockerboo/code/others/sana-diffusers/.venv/lib/python3.11/site-packages/diffusers/pipelines/pag/pipeline_pag_sana.py", line 736, in __call__
    raise ValueError("Invalid sample size")
ValueError: Invalid sample size
```


### System Info

- 🤗 Diffusers version: 0.33.0.dev0
- Platform: Linux-6.12.6-arch1-1-x86_64-with-glibc2.40
- Running on Google Colab?: No
- Python version: 3.11.10
- PyTorch version (GPU?): 2.4.0+cu121 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.27.1
- Transformers version: 4.47.1
- Accelerate version: 1.2.1
- PEFT version: not installed
- Bitsandbytes version: not installed
- Safetensors version: 0.5.2
- xFormers version: not installed
- Accelerator: NVIDIA GeForce RTX 2080, 8192 MiB
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

### Who can help?

@yiyixuxu @DN6
