fix for #7365, prevent pipelines from overriding provided prompt embeds #7926
Conversation
4ce77f5 to 992d2df
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

why are all the tests failing?

can we look into the remaining failing tests?

wow, i hadn't expected all of those, and hadn't really looked into them until now. i'm not sure why this simple check made those tests fail. i'm wondering if there's something i need to check first before assigning the prompt embeds 🤔 like whether we have any. should i try and run the unit tests locally again?

@yiyixuxu i was trying again today and edit: after

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Thanks, @bghira! Apologies for the delay on my part. Would it be possible to add a test for this to https://github.com/huggingface/diffusers/blob/main/tests/pipelines/stable_diffusion_xl/test_stable_diffusion_xl.py?
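For illustration only, a regression test could pin down the intended behavior. The guard is stubbed out here with a hypothetical helper (`apply_pooled_guard`, not a diffusers API) so the sketch is self-contained; the real test would exercise the pipeline's `encode_prompt`:

```python
import numpy as np

def apply_pooled_guard(prompt_embeds, pooled_prompt_embeds):
    # Hypothetical stand-in for the guarded assignment in encode_prompt:
    # only derive pooled_prompt_embeds from the text-encoder output when
    # the caller did not supply their own.
    if pooled_prompt_embeds is None and prompt_embeds[0].ndim == 2:
        pooled_prompt_embeds = prompt_embeds[0]
    return pooled_prompt_embeds

# Simulated text-encoder output tuple: (pooled, hidden_states)
encoder_pooled = np.zeros((1, 1280))
custom_pooled = np.ones((1, 1280))

# 1) A user-supplied pooled_prompt_embeds must survive untouched.
assert apply_pooled_guard((encoder_pooled,), custom_pooled) is custom_pooled

# 2) Without user input, the encoder's pooled output is used.
assert apply_pooled_guard((encoder_pooled,), None) is encoder_pooled
```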
cc @a-r-r-o-w for test

Yeah, let's make sure to add at least one test before merging.

was having trouble getting the test suite running when i pushed this, and my new system doesn't have it working yet either - but a regression test would be great to keep it working

Seems like the core tests are failing with the changes.

i see, will take a look at the tests

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.

nice try bot, but it's not stale

Nice try indeed, bot. Sorry about the delay, but it should be good to merge now. The previous fix did not really work as expected unless one always passed

@yiyixuxu Our

Gentle ping @yiyixuxu for a final review

|
```python
# We are only ALWAYS interested in the pooled output of the final text encoder
if pooled_prompt_embeds is None and prompt_embeds[0].ndim == 2:
    pooled_prompt_embeds = prompt_embeds[0]
```
ohh so this is for when users pass a pre-generated pooled_prompt_embeds but not prompt_embeds? can you explain why they need to do that? (they need to run the text encoder to get the prompt_embeds and will get the pooled_prompt_embeds anyway)
So, when someone wants to use their own provided pooled_prompt_embeds but a different prompt (they don't pass prompt_embeds here), we will encode the prompt and overwrite the value that they passed. This PR only assigns the value if a custom pooled_prompt_embeds was not passed, because you should be able to use different prompts for different text encoders, and by that reasoning different precomputed embeddings too (@linoytsaban did some nice threads/blogs testing this with flux).
so is the custom pooled_prompt_embeds not specific to the prompt? because if it is computed from the text_encoder output, I would imagine the user would also have prompt_embeds at hand and could pass it along with the pooled_prompt_embeds so that we don't need to run the text encoder again
either way it is ok to merge because I don't think it will cause any problems, just trying to understand the use case
I think this is a really old PR at this point and changes may have been made over time such that it was relevant then but may not be now, so I'm not 100% sure either - I saw it was stale so either we help move it to completion or we can close if not needed. Maybe @bghira can elaborate further on his use case.
But I also don't see a way of getting the pooled_prompt_embeds just from having access to prompt_embeds like you mention. When we get the last_hidden_state from the text encoders, it is an ndim=3 tensor of shape [1, 77, 768], and we concatenate the embeddings across the channel dim. But pooled embeddings are ndim=2 tensors of shape [1, 768]. Am I missing something trivial here?
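A quick numpy check of the shapes in question (numbers follow the SDXL layout described above; this is a sketch, not diffusers code):

```python
import numpy as np

# Per-encoder last_hidden_state for batch=1, 77 tokens:
hidden_1 = np.zeros((1, 77, 768))    # text_encoder
hidden_2 = np.zeros((1, 77, 1280))   # text_encoder_2
pooled_2 = np.zeros((1, 1280))       # pooled output of text_encoder_2

# prompt_embeds concatenates hidden states on the channel dim -> ndim=3
prompt_embeds = np.concatenate([hidden_1, hidden_2], axis=-1)
assert prompt_embeds.shape == (1, 77, 2048) and prompt_embeds.ndim == 3

# Pooled embeds are ndim=2 and come from a separate projection of the
# encoder output, so they cannot be sliced or reshaped out of prompt_embeds.
assert pooled_2.ndim == 2
```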
And yes, pooled_prompt_embeds can be custom here: completely different from what prompt (if passed) or prompt_embeds is about.
See these X threads by Linoy (although this is in the context of Flux):
What does this PR do?
Fixes #7365
Before submitting
- documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@sayakpaul @yiyixuxu