Describe the bug
As explained in the documentation, I am trying to use enable_sequential_cpu_offload to save memory:
https://github.com/huggingface/diffusers/blob/main/docs/source/en/optimization/memory.md#cpu-offloading
I understand that enable_sequential_cpu_offload currently does not work with bitsandbytes int4, for which a bug is already logged: bitsandbytes-foundation/bitsandbytes#1525
But, as shown in the example in the docs, it should work for pipelines without quantization.
I can confirm it works for kandinsky3 / AutoPipelineForText2Image (rough sketch below).
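For reference, this is roughly how I exercised sequential offload with kandinsky3; the checkpoint id "kandinsky-community/kandinsky-3" and the prompt are written from memory here, so treat this as an approximation rather than the exact script I ran:
import torch
from diffusers import AutoPipelineForText2Image

# Same offloading call as in the Lumina2 script below; this one completes without errors.
pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-3", torch_dtype=torch.bfloat16
)
pipe.enable_sequential_cpu_offload()
image = pipe("a photo of a cat", num_inference_steps=25).images[0]
image.save("kandinsky3.png")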
Reproduction
import torch
from diffusers import Lumina2Text2ImgPipeline
pipe = Lumina2Text2ImgPipeline.from_pretrained(
    "Alpha-VLLM/Lumina-Image-2.0", torch_dtype=torch.bfloat16
)
# pipe.enable_model_cpu_offload()
pipe.enable_sequential_cpu_offload()
prompt = "Hitoshi Ashinano style. A young girl with vibrant green hair and large purple eyes peeks out from behind a white wooden door. She is wearing a white shirt and have a curious expression on her face. The background shows a blue sky with a few clouds, and there's a white fence visible. Green leaves hang down from the top left corner, and a small white circle can be seen in the sky. The scene captures a moment of innocent curiosity and wonder."
image = pipe(
    prompt, 
    negative_prompt="blurry, ugly, bad, deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, cropped, out of frame, worst quality, low quality, jpeg artifacts, fused fingers, morbid, mutilated, extra fingers, mutated hands, bad anatomy, bad proportion, extra limbs", 
    guidance_scale=6,
    num_inference_steps=35, 
    generator=torch.manual_seed(10)
).images[0]
image.save("lumina2.png")

Logs
(venv) C:\aiOWN\diffuser_webui>python lumina2_lora.py
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:07<00:00,  3.54s/it]
Loading checkpoint shards: 100%|████████████████████████████████████| 3/3 [00:09<00:00,  3.17s/it]
Loading pipeline components...: 100%|███████████████████████████████| 5/5 [00:17<00:00,  3.59s/it]
The 'batch_size' argument of HybridCache is deprecated and will be removed in v4.49. Use the more precisely named 'max_batch_size' argument instead.
The 'batch_size' attribute of HybridCache is deprecated and will be removed in v4.49. Use the more precisely named 'self.max_batch_size' attribute instead.
Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\lumina2_lora.py", line 13, in <module>
    image = pipe(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\pipelines\lumina2\pipeline_lumina2.py", line 648, in __call__
    ) = self.encode_prompt(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\pipelines\lumina2\pipeline_lumina2.py", line 293, in encode_prompt
    prompt_embeds, prompt_attention_mask = self._get_gemma_prompt_embeds(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\pipelines\lumina2\pipeline_lumina2.py", line 221, in _get_gemma_prompt_embeds
    prompt_embeds = self.text_encoder(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\accelerate\hooks.py", line 176, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\transformers\models\gemma2\modeling_gemma2.py", line 575, in forward
    past_key_values = HybridCache(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\transformers\cache_utils.py", line 1657, in __init__
    cache_shape = global_cache_shape if not self.is_sliding[i] else sliding_cache_shape
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\_meta_registrations.py", line 6471, in meta_local_scalar_dense
    raise RuntimeError("Tensor.item() cannot be called on meta tensors")
RuntimeError: Tensor.item() cannot be called on meta tensors
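From the traceback, the failure appears to come from transformers' HybridCache evaluating self.is_sliding[i] while the text encoder's tensors are still on the meta device after sequential offload; that is my reading, not a confirmed root cause. A minimal sketch of the underlying PyTorch behavior, independent of diffusers:
import torch

# A tensor placed on the "meta" device has no storage, so .item() cannot
# materialize a Python scalar and raises the same error as in the traceback.
t = torch.zeros(1, dtype=torch.bool, device="meta")
t.item()  # RuntimeError: Tensor.item() cannot be called on meta tensors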
System Info
(venv) C:\aiOWN\diffuser_webui>diffusers-cli env
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.
- 🤗 Diffusers version: 0.33.0.dev0
 - Platform: Windows-10-10.0.26100-SP0
 - Running on Google Colab?: No
 - Python version: 3.10.11
 - PyTorch version (GPU?): 2.5.1+cu124 (True)
 - Flax version (CPU?/GPU?/TPU?): not installed (NA)
 - Jax version: not installed
 - JaxLib version: not installed
 - Huggingface_hub version: 0.27.1
 - Transformers version: 4.48.1
 - Accelerate version: 1.4.0.dev0
 - PEFT version: 0.14.0
 - Bitsandbytes version: 0.45.3.dev0
 - Safetensors version: 0.5.2
 - xFormers version: not installed
 - Accelerator: NVIDIA GeForce RTX 4060 Laptop GPU, 8188 MiB
 - Using GPU in script?:
 - Using distributed or parallel set-up in script?:
 
Who can help?