Commit 8d21311

clarify the mapping between Transformer2DModel and finegrained variants.
1 parent 7298bdd commit 8d21311

3 files changed: 29 additions, 2 deletions

src/diffusers/pipelines/dit/pipeline_dit.py

Lines changed: 4 additions & 1 deletion
@@ -46,7 +46,10 @@ class DiTPipeline(DiffusionPipeline):

     Parameters:
         transformer ([`DiTTransformer2DModel`]):
-            A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents.
+            A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents. It was initially
+            published as `Transformer2DModel` which is why [the
+            configuration](https://huggingface.co/facebook/DiT-XL-2-256/blob/main/transformer/config.json#L2) still
+            shows the class name as `Transformer2DModel`. This mismatch can be safely ignored.
         vae ([`AutoencoderKL`]):
             Variational Auto-Encoder (VAE) model to encode and decode images to and from latent representations.
         scheduler ([`DDIMScheduler`]):

src/diffusers/pipelines/pixart_alpha/pipeline_pixart_alpha.py

Lines changed: 4 additions & 1 deletion
@@ -256,7 +256,10 @@ class PixArtAlphaPipeline(DiffusionPipeline):
             Tokenizer of class
             [T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
         transformer ([`PixArtTransformer2DModel`]):
-            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents.
+            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. It was initially
+            published as `Transformer2DModel` which is why [the configuration
+            still](https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS/blob/main/transformer/config.json#L2) shows
+            the class name as `Transformer2DModel`. This mismatch can be safely ignored.
         scheduler ([`SchedulerMixin`]):
             A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
     """

src/diffusers/pipelines/pixart_alpha/pipeline_pixart_sigma.py

Lines changed: 21 additions & 0 deletions
@@ -185,6 +185,27 @@ def retrieve_timesteps(
 class PixArtSigmaPipeline(DiffusionPipeline):
     r"""
     Pipeline for text-to-image generation using PixArt-Sigma.
+
+    This model inherits from [`DiffusionPipeline`]. Check the superclass documentation for the generic methods the
+    library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.)
+
+    Args:
+        vae ([`AutoencoderKL`]):
+            Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations.
+        text_encoder ([`T5EncoderModel`]):
+            Frozen text-encoder. PixArt-Alpha uses
+            [T5](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5EncoderModel), specifically the
+            [t5-v1_1-xxl](https://huggingface.co/PixArt-alpha/PixArt-alpha/tree/main/t5-v1_1-xxl) variant.
+        tokenizer (`T5Tokenizer`):
+            Tokenizer of class
+            [T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
+        transformer ([`PixArtTransformer2DModel`]):
+            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. It was initially
+            published as `Transformer2DModel` which is why [the configuration
+            still](https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS/blob/main/transformer/config.json#L2)
+            shows the class name as `Transformer2DModel`. This mismatch can be safely ignored.
+        scheduler ([`SchedulerMixin`]):
+            A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
     """

     bad_punct_regex = re.compile(
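
The docstring notes in this commit say that the `_class_name` stored in `transformer/config.json` can be the older `Transformer2DModel` even though the pipeline uses a fine-grained class. A minimal sketch of how such a legacy name could be resolved is given here. The `LEGACY_CLASS_REMAP` table, the `resolve_class_name` helper, the use of `norm_type` as the lookup key, and the trimmed config strings are all assumptions made for illustration; this is not presented as the library's actual loading code.

```python
import json

# Hypothetical lookup table (an assumption for illustration): maps a legacy
# class name to a fine-grained variant, keyed on the config's `norm_type`.
LEGACY_CLASS_REMAP = {
    "Transformer2DModel": {
        "ada_norm_zero": "DiTTransformer2DModel",
        "ada_norm_single": "PixArtTransformer2DModel",
    }
}


def resolve_class_name(config: dict) -> str:
    """Return the class name to use for a config, remapping legacy names.

    Falls back to the saved `_class_name` when no remapping applies.
    """
    saved_name = config["_class_name"]
    variants = LEGACY_CLASS_REMAP.get(saved_name, {})
    return variants.get(config.get("norm_type"), saved_name)


# Trimmed stand-ins for a transformer/config.json file (assumed contents).
dit_config = json.loads(
    '{"_class_name": "Transformer2DModel", "norm_type": "ada_norm_zero"}'
)
pixart_config = json.loads(
    '{"_class_name": "Transformer2DModel", "norm_type": "ada_norm_single"}'
)

print(resolve_class_name(dit_config))
print(resolve_class_name(pixart_config))
```

Under these assumptions the saved name is only a starting point for the lookup, which is consistent with the docstrings' statement that the mismatch can be safely ignored.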
