Sd35 controlnet #10020
Conversation
My comments are quite minor and not merge-blockers.
I guess we could add docs and tests after merging.
        for i in range(num_layers)
    ]
)
if joint_attention_dim is not None:
I think this is a good enough condition for now, because based on `joint_attention_dim` we initialize both the `context_embedder` and the `transformer_blocks` (which have the `JointTransformerBlock` type). I am okay with it.
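
For context, a rough sketch of the initialization pattern being discussed, assuming the `JointTransformerBlock`/`SD3SingleTransformerBlock` import paths and constructor arguments from this PR (the defaults below are illustrative, not the PR's exact code):

```python
import torch.nn as nn
from diffusers.models.attention import JointTransformerBlock, SD3SingleTransformerBlock


class InitSketch(nn.Module):
    """Toy module mirroring the `joint_attention_dim` branch discussed above."""

    def __init__(self, num_layers=2, num_attention_heads=4, attention_head_dim=8, joint_attention_dim=None):
        super().__init__()
        inner_dim = num_attention_heads * attention_head_dim
        if joint_attention_dim is not None:
            # SD3 controlnets: embed the text context and use joint (dual-stream) blocks
            self.context_embedder = nn.Linear(joint_attention_dim, inner_dim)
            block_cls = JointTransformerBlock
        else:
            # SD3.5 8b controlnet: no context embedder, single-stream blocks
            self.context_embedder = None
            block_cls = SD3SingleTransformerBlock
        self.transformer_blocks = nn.ModuleList(
            [
                block_cls(
                    dim=inner_dim,
                    num_attention_heads=num_attention_heads,
                    attention_head_dim=attention_head_dim,
                )
                for _ in range(num_layers)
            ]
        )
```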
# SD3.5 8b controlnet does not have a `pos_embed`,
# it uses the `pos_embed` from the transformer to process input before passing to controlnet
elif self.pos_embed is None and hidden_states.ndim != 3:
    raise ValueError("hidden_states must be 3D when pos_embed is not used")

if self.context_embedder is not None and encoder_hidden_states is None:
    raise ValueError("encoder_hidden_states must be provided when context_embedder is used")
# SD3.5 8b controlnet does not have a `context_embedder`, it does not use `encoder_hidden_states`
elif self.context_embedder is None and encoder_hidden_states is not None:
    raise ValueError("encoder_hidden_states should not be provided when context_embedder is not used")
Very useful!
@maybe_allow_in_graph
class SD3SingleTransformerBlock(nn.Module):
    r"""
    A Single Transformer block as part of the MMDiT architecture, used in Stable Diffusion 3 ControlNet.
Perhaps we could make this more explicit by:
- Specifying it's needed for the SD3.5 ControlNet.
- Providing a reference https://stability.ai/news/sd3-5-large-controlnets
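
Concretely, the docstring could read something like this (a suggestion sketch, not the merged wording; the decorator import path is the current diffusers one):

```python
import torch.nn as nn
from diffusers.utils.torch_utils import maybe_allow_in_graph


@maybe_allow_in_graph
class SD3SingleTransformerBlock(nn.Module):
    r"""
    A single transformer block as part of the MMDiT architecture, used in the
    SD3.5 Large ControlNets (https://stability.ai/news/sd3-5-large-controlnets).
    """
```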
controlnet_config = (
    self.controlnet.config
    if isinstance(self.controlnet, SD3ControlNetModel)
    else self.controlnet.nets[0].config
)
I guess this is okay for now, but could there be a case where we end up with incompatible configs when len(self.controlnet.nets) > 1? I guess we will know when we know.
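
If that ever bites, a defensive variant could look like this (hypothetical check, not in the PR; only `joint_attention_dim` is compared here for brevity):

```python
from diffusers.models import SD3ControlNetModel, SD3MultiControlNetModel


def pick_controlnet_config(controlnet):
    """Pick a representative config; the compatibility check is an assumption, not in the PR."""
    if isinstance(controlnet, SD3ControlNetModel):
        return controlnet.config
    assert isinstance(controlnet, SD3MultiControlNetModel)
    configs = [net.config for net in controlnet.nets]
    # The PR simply takes nets[0].config; this guards the multi-net case raised above.
    if any(c.joint_attention_dim != configs[0].joint_attention_dim for c in configs[1:]):
        raise ValueError("controlnets in a multi-controlnet setup have incompatible configs")
    return configs[0]
```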
I'm not completely familiar with SD3 so all changes seem correct to me so far and inference works well 😅
LGTM!
I will add tests once the weights PR is merged.
this pr adds support for using …

when adding new support, it should be in ORIGINAL format published by authors first, then internally-converted stuff.
we try to support single-file format as best as we can but will not prioritize single-file format over diffusers format.
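
For reference, the diffusers-format flow looks like this (the model ids below are illustrative assumptions; check the Hub for the actual repos):

```python
import torch
from diffusers import SD3ControlNetModel, StableDiffusion3ControlNetPipeline

# Hypothetical repo ids, shown only to illustrate loading in diffusers format
controlnet = SD3ControlNetModel.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large", controlnet=controlnet, torch_dtype=torch.float16
)
```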
it also fails in its current form for any type of offloading, which is kind of a necessity given the model size.
@vladmandic ahh thanks! indeed
@vladmandic ohh, actually, controlnet was never able to be offloaded, because it is used at the same time as the transformer
somehow, i don't have such problems with …
oh, makes sense, we will look into this
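
For reference, the offloading under discussion is the standard pipeline API, continuing the hypothetical `pipe` from the loading sketch above:

```python
# Standard diffusers offloading entry points; whether the controlnet path
# plays well with them is exactly what the comments above are about.
pipe.enable_model_cpu_offload()         # offload whole sub-models between uses
# pipe.enable_sequential_cpu_offload()  # finer-grained, lower memory, slower
```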
* add model/pipeline
  Co-authored-by: Sayak Paul <[email protected]>
* test canny
* test depth
* test blur