
@alirezafarashah alirezafarashah commented Oct 22, 2025

What does this PR do?

This PR fixes a small inconsistency in the output dimension of the _get_t5_prompt_embeds function in the Stable Diffusion 3 pipeline.

Previously, when self.text_encoder_3 was None, the function returned a tensor (torch.zeros) with a sequence length of self.tokenizer_max_length (77), which corresponds to the CLIP encoder. However, the T5 text encoder used in SD3 has a different maximum sequence length (256).

As a result, when text_encoder_3 was available, the prompt embeddings had a sequence length of 333 (256 from T5 + 77 from CLIP), but when it was not available, the combined tensor had a sequence length of only 154 (77 + 77). The output shape of encode_prompt therefore depended on whether the T5 encoder was loaded.
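The arithmetic can be checked with a small sketch. It models only sequence lengths, using the values quoted in this description (77 for CLIP, 256 for T5). The helper name `joined_seq_len` is hypothetical and not part of the pipeline.

```python
# Sequence lengths as quoted in the PR description.
CLIP_SEQ_LEN = 77   # self.tokenizer_max_length
T5_SEQ_LEN = 256    # T5 maximum sequence length


def joined_seq_len(clip_len: int, t5_len: int) -> int:
    """Length after joining CLIP and T5 embeddings along the sequence axis."""
    return clip_len + t5_len


# text_encoder_3 available: real T5 embeddings are joined with CLIP's.
with_t5 = joined_seq_len(CLIP_SEQ_LEN, T5_SEQ_LEN)

# text_encoder_3 is None (old behaviour): the zero tensor has CLIP's length.
without_t5_old = joined_seq_len(CLIP_SEQ_LEN, CLIP_SEQ_LEN)

print(with_t5, without_t5_old)  # 333 154
```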

Motivation and Context

This change ensures consistent tensor shapes across different encoder availability conditions in the SD3 pipeline.
It prevents dimension mismatches and potential runtime errors when text_encoder_3 is None.

Previously, the zeros tensor used self.tokenizer_max_length, which corresponds to CLIP, instead of T5’s longer sequence length.
This mismatch led to inconsistent embedding dimensions when combining outputs from CLIP and T5 in encode_prompt.

Changes Made

  • Replaced self.tokenizer_max_length with max_sequence_length when returning the zero tensor in _get_t5_prompt_embeds, ensuring consistent output dimensions whether text_encoder_3 is None or available.
    The same max_sequence_length parameter is already used in the tokenization step of the same function:
    text_inputs = self.tokenizer_3(
        prompt,
        padding="max_length",
        max_length=max_sequence_length,
        truncation=True,
        add_special_tokens=True,
        return_tensors="pt",
    )
  • No changes to functionality, inputs, or outputs beyond dimension consistency.
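As a rough illustration of the change, the sketch below computes only the shape of the fallback tensor rather than building a real one. `fallback_shape` and the `hidden_dim` value are hypothetical stand-ins, and the (batch, sequence, hidden) layout is an assumption. Only the switch from `tokenizer_max_length` to `max_sequence_length` comes from this PR.

```python
def fallback_shape(batch_size: int, seq_len: int, hidden_dim: int) -> tuple:
    """Shape of the zero tensor returned when text_encoder_3 is None (assumed layout)."""
    return (batch_size, seq_len, hidden_dim)


tokenizer_max_length = 77    # CLIP length, per the description
max_sequence_length = 256    # T5 length, per the description
hidden_dim = 8               # placeholder; the real value comes from the model config

before = fallback_shape(1, tokenizer_max_length, hidden_dim)
after = fallback_shape(1, max_sequence_length, hidden_dim)

print(before)  # (1, 77, 8)
print(after)   # (1, 256, 8)
```

With the new fallback, the sequence axis matches what the T5 tokenizer call above produces with `max_length=max_sequence_length`, so joining it with the 77-token CLIP embeddings gives 333 in both cases.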

@sayakpaul sayakpaul requested a review from yiyixuxu October 24, 2025 19:20
Collaborator

@yiyixuxu yiyixuxu left a comment

thanks!

@yiyixuxu
Collaborator

@bot /style

@github-actions
Contributor

github-actions bot commented Oct 24, 2025

Style fix runs successfully without any file modified.

@yiyixuxu
Collaborator

hey @alirezafarashah
can you run `make fix-copies`?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@alirezafarashah
Author

> hey @alirezafarashah can you run `make fix-copies`?

Hey @yiyixuxu
I did and pushed it.


3 participants