@friedrich friedrich commented Sep 9, 2025

What does this PR do?

When more than one prompt is passed and num_images_per_prompt > 1, the pooled prompt embeddings are repeated incorrectly. As a result, pooled prompt embeddings get assigned to the wrong images.

While CLIP and T5 prompt embeddings are repeated along the sequence dimension, pooled CLIP embeddings are currently repeated along the batch dimension. This PR fixes this by repeating over the embedding dimension.
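
To illustrate the ordering problem, here is a minimal sketch using toy tensors. The exact calls in the pipeline are not shown in this conversation, so the two `repeat` patterns below are an assumption about what "repeating along the batch dimension" versus "repeating along the embedding dimension" looks like in PyTorch; the variable names and sizes are hypothetical.

```python
import torch

# Hypothetical sizes chosen for illustration only.
batch_size, num_images_per_prompt, dim = 2, 3, 4

# Toy pooled embeddings of shape [batch_size, dim]: row i is filled with
# the value i, so every row identifies the prompt it came from.
pooled = (
    torch.arange(batch_size, dtype=torch.float32)
    .unsqueeze(1)
    .expand(batch_size, dim)
    .contiguous()
)

# Assumed "before" pattern: a 3-argument repeat on a 2-D tensor prepends a
# dimension and tiles along the batch axis, giving [1, batch * n, dim].
buggy = pooled.repeat(1, num_images_per_prompt, 1)
buggy = buggy.view(batch_size * num_images_per_prompt, -1)

# Assumed "after" pattern: tile along the embedding axis, giving
# [batch, n * dim], then reshape so copies of one prompt stay adjacent.
fixed = pooled.repeat(1, num_images_per_prompt)
fixed = fixed.view(batch_size * num_images_per_prompt, -1)

# Read back which prompt each output row belongs to.
buggy_order = buggy[:, 0].long().tolist()
fixed_order = fixed[:, 0].long().tolist()
print(buggy_order)  # [0, 1, 0, 1, 0, 1]
print(fixed_order)  # [0, 0, 0, 1, 1, 1]

# The fixed layout matches repeating each row n times in place.
expected = torch.repeat_interleave(pooled, num_images_per_prompt, dim=0)
assert torch.equal(fixed, expected)
```

With a single prompt both patterns produce the same rows, which would be consistent with the mismatch only showing up when several prompts are combined with num_images_per_prompt > 1.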

Corresponding fix for Flux.1: #9280

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@yiyixuxu @sayakpaul

@yiyixuxu yiyixuxu left a comment

thanks!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu yiyixuxu merged commit 051c8a1 into huggingface:main Oct 31, 2025
9 of 11 checks passed