Skip to content

Conversation

@ppbrown
Copy link
Contributor

@ppbrown ppbrown commented May 23, 2025

Background:

I am working on an SDXL varient that exclusively has only (long)CLIP-L.
cf: https://huggingface.co/opendiffusionai/sdxl-longcliponly

Problem that this PR fixes

I would LIKE to be able to just use StableDiffusionXLPipeline.
The text_encoder_2 component is supposedly already marked as optional. However...

Previously, the StableDiffusionXLPipeline did not correctly handle cases where tokenizer_2 and text_encoder_2 were None, despite them being listed in _optional_components. This could lead to errors during prompt encoding.

This commit addresses the issue by:

  1. Modifying encode_prompt to dynamically build lists of tokenizers and text encoders, only including those that are not None.
  2. Adjusting the logic for handling prompt_2 and negative_prompt_2 to only use them if the corresponding tokenizer_2 and text_encoder_2 are present.
  3. Ensuring that pooled embeddings are correctly sourced from the last available text encoder.
  4. Updating the determination of text_encoder_projection_dim in the __call__ method to correctly use available text encoder configs or fallback to pooled embedding dimensions if text_encoder_2 is not present.

Unit tests have been added to verify the pipeline's behavior when:

  • Only the primary tokenizer and text_encoder are present.
  • Only tokenizer_2 and text_encoder_2 are present.
  • All tokenizer and text encoder components are present.

What does this PR do?

Fixes # (issue)

Before submitting

Core library:

…usionXLPipeline

Previously, the StableDiffusionXLPipeline did not correctly handle cases
where `tokenizer_2` and `text_encoder_2` were None, despite them being
listed in `_optional_components`. This could lead to errors during prompt
encoding.

This commit addresses the issue by:
1. Modifying `encode_prompt` to dynamically build lists of tokenizers and
   text encoders, only including those that are not None.
2. Adjusting the logic for handling `prompt_2` and `negative_prompt_2` to
   only use them if the corresponding `tokenizer_2` and `text_encoder_2`
   are present.
3. Ensuring that pooled embeddings are correctly sourced from the last
   available text encoder.
4. Updating the determination of `text_encoder_projection_dim` in the
   `__call__` method to correctly use available text encoder configs or
   fallback to pooled embedding dimensions if `text_encoder_2` is not present.

Unit tests have been added to verify the pipeline's behavior when:
- Only the primary `tokenizer` and `text_encoder` are present.
- Only `tokenizer_2` and `text_encoder_2` are present.
- All tokenizer and text encoder components are present.
…nts in StableDiffusionXLPipeline

This commit addresses a NameError in `encode_prompt` and ensures that
`tokenizer_2` and `text_encoder_2` are truly optional in the
StableDiffusionXLPipeline.

Changes:
- Corrected a `NameError` in `encode_prompt` where `prompt_embeds_output`
  was used before assignment. The logic now correctly uses the output variable
  from the text encoder loop (e.g., `current_encoder_output`).
- Modified `encode_prompt` to dynamically build lists of tokenizers and
  text encoders, only including those that are not None.
- Adjusted logic for handling `prompt_2` and `negative_prompt_2` to
  only use them if the corresponding `tokenizer_2` and `text_encoder_2`
  are present.
- Ensured that pooled embeddings are correctly sourced from the last
  available text encoder and that `text_encoder_projection_dim` in `__call__`
  is determined robustly.

Unit tests focusing on optional component configurations have been reviewed
and confirmed to cover the corrected code paths.
@ppbrown
Copy link
Contributor Author

ppbrown commented May 23, 2025

bah. found bugs.
cancelling.

@ppbrown ppbrown closed this May 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant