
CogView4 pipeline not accepting prompt embeds due to shape issues #10962

@Vargol

Describe the bug

I've been trying to run CogView4 using separate pipelines to encode the text and generate the image, in order to save memory (I'm on Unified Memory, so I can't use offloading), with the aim of running multiple prompts,

e.g.

te_pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B",
                                           transformer=None,
                                           vae=None,
                                           torch_dtype=torch.bfloat16).to("mps")

with torch.no_grad():
    prompt_embeds, negative_prompt_embeds = te_pipe.encode_prompt(
        prompt,
        negative_prompt,
        num_images_per_prompt=num_images_per_prompt,
    )

del te_pipe

pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", text_encoder=None, tokenizer=None, torch_dtype=torch.bfloat16).to("mps")

and I get a failure with the following error

ValueError: `prompt_embeds` and `negative_prompt_embeds` must have the same shape when passed directly, but got: `prompt_embeds` torch.Size([1, 144, 4096]) != `negative_prompt_embeds` torch.Size([1, 48, 4096]).

I'm using the same encode_prompt function that the pipeline uses internally, so I can't see why the embeds would be any different from the ones generated inside the pipeline. I'm also not sure why the shape check is needed in the first place, so I'm assuming either the check is a bug or the shape of the embeddings that encode_prompt generates is a bug.
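
For reference, this is roughly what the failing check looks like (my paraphrase from the traceback and error message, not the actual library source):

# Paraphrase of the shape check in check_inputs (pipeline_cogview4.py),
# reconstructed from the error message; not the real source.
if prompt_embeds is not None and negative_prompt_embeds is not None:
    if prompt_embeds.shape != negative_prompt_embeds.shape:
        raise ValueError(
            "`prompt_embeds` and `negative_prompt_embeds` must have the same shape "
            f"when passed directly, but got: `prompt_embeds` {prompt_embeds.shape} != "
            f"`negative_prompt_embeds` {negative_prompt_embeds.shape}."
        )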

If I try to skip passing the negative embeds, the pipeline tries to generate negative prompt embeddings itself, which fails because the new pipe doesn't have a text encoder.
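
As a stop-gap I can get past the check by padding the shorter embedding tensor along the sequence dimension before calling the image pipeline. This is only a sketch: it assumes zero-padding is acceptable, whereas the pipeline pads the token ids before embedding, so the padded positions may not embed identically:

import torch.nn.functional as F

# Workaround sketch (assumption: zero-padding the sequence dimension is
# close enough to the pad-token embeddings the pipeline would produce).
seq_diff = prompt_embeds.shape[1] - negative_prompt_embeds.shape[1]
if seq_diff > 0:
    negative_prompt_embeds = F.pad(negative_prompt_embeds, (0, 0, 0, seq_diff))
elif seq_diff < 0:
    prompt_embeds = F.pad(prompt_embeds, (0, 0, 0, -seq_diff))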

Reproduction

from diffusers import CogView4Pipeline
import torch
import gc

# Text-encoder-only pipeline: skip loading the transformer and VAE
te_pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B",
                                           transformer=None,
                                           vae=None,
                                           torch_dtype=torch.bfloat16).to("mps")


prompt = "A vibrant cherry red sports car sits proudly under the gleaming sun, its polished exterior smooth and flawless, casting a mirror-like reflection. The car features a low, aerodynamic body, angular headlights that gaze forward like predatory eyes, and a set of black, high-gloss racing rims that contrast starkly with the red. A subtle hint of chrome embellishes the grille and exhaust, while the tinted windows suggest a luxurious and private interior. The scene conveys a sense of speed and elegance, the car appearing as if it's about to burst into a sprint along a coastal road, with the ocean's azure waves crashing in the background."

negative_prompt = "anime, cartoon, graphic, text, painting, crayon, graphite, abstract, glitch, deformed, mutated, ugly, disfigured, jpeg artefacts"

num_images_per_prompt=1

with torch.no_grad():
    prompt_embeds, negative_prompt_embeds = te_pipe.encode_prompt(
        prompt,
        negative_prompt,
        num_images_per_prompt=num_images_per_prompt,
    )

def flush():
    gc.collect()
    torch.mps.empty_cache()
    gc.collect()
    torch.mps.empty_cache()

# Free the text encoder before loading the full image pipeline
del te_pipe.text_encoder
del te_pipe
flush()

# Image pipeline: skip loading the text encoder and tokenizer
pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", text_encoder=None, tokenizer=None, torch_dtype=torch.bfloat16).to("mps")


# Enable VAE slicing and tiling to reduce GPU memory usage
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()

image = pipe(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
    guidance_scale=3.5,
    num_images_per_prompt=num_images_per_prompt,
    num_inference_steps=50,
    width=1024,
    height=1024,
).images[0]

image.save("cogview4.png")
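
For reference, the mismatch is already visible right after encode_prompt, before the second pipeline is even loaded; the shapes below are the ones reported in the error:

print(prompt_embeds.shape)           # torch.Size([1, 144, 4096])
print(negative_prompt_embeds.shape)  # torch.Size([1, 48, 4096])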

Logs

$ python cogview4_split.py 
Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:00<00:00, 12.11it/s]
Loading pipeline components...: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3/3 [00:00<00:00,  5.17it/s]
Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3/3 [00:00<00:00, 29.83it/s]
Loading pipeline components...: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3/3 [00:00<00:00, 18.05it/s]
Traceback (most recent call last):
  File "/Volumes/SSD2TB/AI/Diffusers/cogview4_split.py", line 41, in <module>
    image = pipe(
            ^^^^^
  File "/Volumes/SSD2TB/AI/Diffusers/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Volumes/SSD2TB/AI/Diffusers/lib/python3.11/site-packages/diffusers/pipelines/cogview4/pipeline_cogview4.py", line 515, in __call__
    self.check_inputs(
  File "/Volumes/SSD2TB/AI/Diffusers/lib/python3.11/site-packages/diffusers/pipelines/cogview4/pipeline_cogview4.py", line 366, in check_inputs
    raise ValueError(
ValueError: `prompt_embeds` and `negative_prompt_embeds` must have the same shape when passed directly, but got: `prompt_embeds` torch.Size([1, 144, 4096]) != `negative_prompt_embeds` torch.Size([1, 48, 4096]).

System Info

  • πŸ€— Diffusers version: 0.33.0.dev0
  • Platform: macOS-15.3.1-arm64-arm-64bit
  • Running on Google Colab?: No
  • Python version: 3.11.10
  • PyTorch version (GPU?): 2.6.0 (False)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.27.1
  • Transformers version: 4.49.0
  • Accelerate version: 0.34.2
  • PEFT version: not installed
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.5
  • xFormers version: not installed
  • Accelerator: Apple M3
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Who can help?

No response
