
Conversation

@bghira
Contributor

@bghira bghira commented May 12, 2024

What does this PR do?

Fixes #7365

Before submitting

Who can review?

@sayakpaul @yiyixuxu

@bghira bghira force-pushed the issue/7365b branch 2 times, most recently from 4ce77f5 to 992d2df Compare May 12, 2024 16:42
@sayakpaul sayakpaul requested a review from yiyixuxu May 13, 2024 09:53
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu
Collaborator

why are all the tests failing?

@yiyixuxu
Collaborator

can we look into the remaining failing tests?

@bghira
Contributor Author

bghira commented May 20, 2024

Wow, I hadn't expected all of those, and hadn't really looked into them until now. I'm not sure why this simple check made those tests fail. I'm wondering if there's something I need to check before assigning the prompt embeds 🤔 like whether we have any at all? Should I try running the unit tests locally again?

@bghira
Contributor Author

bghira commented May 26, 2024

@yiyixuxu I was trying again today and `make test` doesn't work here; I guess I don't have the right version of pytest installed. I'm running `pip install diffusers[dev]` now, but is there a way to run this in a Docker container?

edit: after `pip install diffusers[dev]`, `make test` now works.

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Sep 14, 2024
Member

@sayakpaul sayakpaul left a comment


Thanks, @bghira! Apologies for the delay on my part. Would it be possible to add a test for this to https://github.com/huggingface/diffusers/blob/main/tests/pipelines/stable_diffusion_xl/test_stable_diffusion_xl.py?

@bghira
Contributor Author

bghira commented Nov 23, 2024

cc @a-r-r-o-w for test

@sayakpaul
Member

Yeah let's ensure to add at least one test before merging.

@bghira
Contributor Author

bghira commented Nov 23, 2024

I was having trouble getting the test suite running when I pushed this, and my new system doesn't have it working yet either, but a regression test would be great to keep this working.

@github-actions github-actions bot removed the stale Issues that haven't received updates label Nov 23, 2024
@sayakpaul
Member

Seems like the core tests are failing with the changes.

@a-r-r-o-w
Contributor

I see, will take a look at the tests.

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Dec 19, 2024
@bghira
Contributor Author

bghira commented Dec 19, 2024

nice try bot, but it's not stale

@github-actions github-actions bot removed the stale Issues that haven't received updates label Dec 20, 2024
@a-r-r-o-w a-r-r-o-w requested a review from yiyixuxu December 29, 2024 19:51
@a-r-r-o-w
Contributor

Nice try indeed, bot. Sorry about the delay, but this should be good to merge now.

The previous fix did not really work as expected unless one always passed `pooled_prompt_embeds`, which is what caused all the test errors. `prompt_embeds[0]` should only be assigned when encoding `prompt_2` via `text_encoder_2`. The `ndim` check suffices for now, but I'm open to suggestions.
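The guarded assignment described above can be sketched as follows. This is an illustrative stand-in, not the pipeline's actual code: `select_pooled` is a hypothetical helper, and the shapes assume `text_encoder_2` returns its pooled output as the first element of its output tuple.

```python
import numpy as np

def select_pooled(encoder_output, pooled_prompt_embeds):
    # Only derive the pooled embedding from the encoder output when the
    # caller did not supply their own, and only when the first element
    # really is a pooled (ndim == 2) tensor.
    if pooled_prompt_embeds is None and encoder_output[0].ndim == 2:
        pooled_prompt_embeds = encoder_output[0]
    return pooled_prompt_embeds

pooled = np.zeros((1, 1280))      # pooled output of text_encoder_2 (illustrative shape)
hidden = np.zeros((1, 77, 1280))  # last_hidden_state (illustrative shape)
custom = np.ones((1, 1280))       # user-provided pooled embedding

assert select_pooled((pooled, hidden), None) is pooled    # derived when absent
assert select_pooled((pooled, hidden), custom) is custom  # user value preserved
```

With this guard, a user-supplied `pooled_prompt_embeds` survives prompt encoding instead of being silently replaced.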

@a-r-r-o-w
Contributor

@yiyixuxu Our `make fix-copies` does not really work for things in the `examples/` folder. I wonder if that's by choice or simply overlooked. I don't think it would be too difficult to support: I got it working, but it creates too many changes in the existing community pipelines, so it could maybe be taken up in another PR. It's important to note that every change made to the `# Copied from` stuff in the community folder was done via find and replace; using the updated fix-copies implementation simply breaks a lot of pipelines.

@a-r-r-o-w
Contributor

Gentle ping @yiyixuxu for a final review


```diff
  # We are only ALWAYS interested in the pooled output of the final text encoder
- pooled_prompt_embeds = prompt_embeds[0]
+ if pooled_prompt_embeds is None and prompt_embeds[0].ndim == 2:
+     pooled_prompt_embeds = prompt_embeds[0]
```
Collaborator


Ohh, so this is for when users pass a pre-generated `pooled_prompt_embeds` but not `prompt_embeds`? Can you explain why they need to do that? (They need to run the text encoder to get the `prompt_embeds`, and will get the `pooled_prompt_embeds` anyway.)

Contributor


So, when someone wants to use their own provided `pooled_prompt_embeds` but a different prompt (they don't pass `prompt_embeds` here), we will encode the prompt and overwrite the value they passed. This PR only assigns the value if a custom `pooled_prompt_embeds` was not passed, because you should be able to use different prompts for different text encoders, and by that reasoning, different precomputed embeddings too (@linoytsaban did some nice threads/blogs testing this with Flux).

Collaborator


So is the custom `pooled_prompt_embeds` not specific to the prompt? Because if it is computed from the text encoder output, I would imagine the user also has `prompt_embeds` at hand and can pass it along with the `pooled_prompt_embeds`, so that we don't need to run the text encoder again.

Either way it is OK to merge, because I don't think it will cause any problems; I'm just trying to understand the use case.
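The overwrite that the PR fixes can be contrasted in a small sketch. These helpers are hypothetical stand-ins for the before/after behavior, not the actual pipeline internals, and the shapes are illustrative:

```python
import numpy as np

encoder_pooled = np.zeros((1, 1280))  # pooled output from text_encoder_2

def old_behavior(user_pooled):
    # Previous code: unconditional assignment, clobbering the user's value.
    return encoder_pooled

def new_behavior(user_pooled):
    # Fixed code: keep the user's pooled embedding when one was provided.
    if user_pooled is None and encoder_pooled.ndim == 2:
        return encoder_pooled
    return user_pooled

custom_pooled = np.ones((1, 1280))
assert old_behavior(custom_pooled) is encoder_pooled  # user's value was lost
assert new_behavior(custom_pooled) is custom_pooled   # user's value is kept
assert new_behavior(None) is encoder_pooled           # default path unchanged
```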

Contributor


I think this is a really old PR at this point, and changes may have been made over time such that it was relevant then but may not be now, so I'm not 100% sure either. I saw it was stale, so either we help move it to completion or we can close it if not needed. Maybe @bghira can elaborate further on his use case.

But I also don't see a way of getting the `pooled_prompt_embeds` just from having access to `prompt_embeds`, as you mention. When we get the `last_hidden_state` from the text encoders, it is an `ndim=3` tensor of shape `[1, 77, 768]`, and we concatenate the embeddings across the channel dim. But pooled embeddings are `ndim=2` tensors of shape `[1, 768]`. Am I missing something trivial here?
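The shape argument above can be checked with a small sketch. The dimensions follow the comment (names and exact channel counts are illustrative, loosely modeled on SDXL's two text encoders):

```python
import numpy as np

hidden_1 = np.zeros((1, 77, 768))   # last_hidden_state of text_encoder
hidden_2 = np.zeros((1, 77, 1280))  # last_hidden_state of text_encoder_2
prompt_embeds = np.concatenate([hidden_1, hidden_2], axis=-1)
pooled_prompt_embeds = np.zeros((1, 768))  # pooled output, produced by the encoder

assert prompt_embeds.shape == (1, 77, 2048) and prompt_embeds.ndim == 3
assert pooled_prompt_embeds.ndim == 2
# No slice of the ndim=3 prompt_embeds yields the ndim=2 pooled tensor:
# pooling happens inside the text encoder, so the pooled embedding cannot
# be reconstructed from prompt_embeds alone.
```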

Contributor

@a-r-r-o-w a-r-r-o-w Jan 7, 2025


And yes, `pooled_prompt_embeds` can be completely custom here, unrelated to what `prompt` (if passed) or `prompt_embeds` describes.

See these X threads by Linoy (although this is in the context of Flux):

@yiyixuxu yiyixuxu merged commit a0acbdc into huggingface:main Jan 8, 2025
12 checks passed
Successfully merging this pull request may close these issues.

Provided pooled_prompt_embeds is overwritten via prompt_embeds[0]