3 files changed: +26 -2 lines changed

@@ -46,7 +46,9 @@ class DiTPipeline(DiffusionPipeline):
 
     Parameters:
         transformer ([`DiTTransformer2DModel`]):
-            A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents.
+            A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents. Initially published as
+            [`Transformer2DModel`](https://huggingface.co/facebook/DiT-XL-2-256/blob/main/transformer/config.json#L2)
+            in the config, but the mismatch can be ignored.
         vae ([`AutoencoderKL`]):
             Variational Auto-Encoder (VAE) model to encode and decode images to and from latent representations.
         scheduler ([`DDIMScheduler`]):
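For reference, the note matters when loading the public DiT checkpoint linked in the new docstring text. A minimal loading sketch (model ID taken from the config URL above; dtype, device, and the label passed to `get_label_ids` are illustrative):

import torch
from diffusers import DiTPipeline

# The transformer config of this checkpoint still says `Transformer2DModel`;
# it loads as `DiTTransformer2DModel`, so the mismatch can be ignored.
pipe = DiTPipeline.from_pretrained("facebook/DiT-XL-2-256", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# DiT is class-conditioned: map ImageNet label names to class ids, then sample.
class_ids = pipe.get_label_ids(["white shark"])
image = pipe(class_labels=class_ids, num_inference_steps=25).images[0]
image.save("dit_sample.png")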
@@ -256,7 +256,9 @@ class PixArtAlphaPipeline(DiffusionPipeline):
             Tokenizer of class
             [T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
         transformer ([`PixArtTransformer2DModel`]):
-            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents.
+            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. Initially published as
+            [`Transformer2DModel`](https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS/blob/main/transformer/config.json#L2)
+            in the config, but the mismatch can be ignored.
         scheduler ([`SchedulerMixin`]):
             A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
     """
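The same note applies when loading the PixArt-Alpha checkpoint referenced above. A minimal text-to-image sketch (model ID from the config URL; prompt, dtype, and device are illustrative):

import torch
from diffusers import PixArtAlphaPipeline

# The checkpoint's transformer config still carries the old `Transformer2DModel`
# name; it resolves to `PixArtTransformer2DModel`, so the mismatch can be ignored.
pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS", torch_dtype=torch.float16
).to("cuda")

image = pipe("A small cactus with a happy face in the Sahara desert").images[0]
image.save("pixart_alpha_sample.png")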
@@ -185,6 +185,26 @@ def retrieve_timesteps(
 class PixArtSigmaPipeline(DiffusionPipeline):
     r"""
     Pipeline for text-to-image generation using PixArt-Sigma.
+
+    This model inherits from [`DiffusionPipeline`]. Check the superclass documentation for the generic methods the
+    library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.)
+
+    Args:
+        vae ([`AutoencoderKL`]):
+            Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations.
+        text_encoder ([`T5EncoderModel`]):
+            Frozen text-encoder. PixArt-Alpha uses
+            [T5](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5EncoderModel), specifically the
+            [t5-v1_1-xxl](https://huggingface.co/PixArt-alpha/PixArt-alpha/tree/main/t5-v1_1-xxl) variant.
+        tokenizer (`T5Tokenizer`):
+            Tokenizer of class
+            [T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
+        transformer ([`PixArtTransformer2DModel`]):
+            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. Initially published as
+            [`Transformer2DModel`](https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS/blob/main/transformer/config.json#L2)
+            in the config, but the mismatch can be ignored.
+        scheduler ([`SchedulerMixin`]):
+            A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
     """
 
     bad_punct_regex = re.compile(
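And likewise for PixArt-Sigma, whose transformer config at the linked URL also predates the class rename. A minimal sketch (model ID from the config URL; prompt, dtype, device, and step count are illustrative):

import torch
from diffusers import PixArtSigmaPipeline

# The `Transformer2DModel` name in this checkpoint's config loads as
# `PixArtTransformer2DModel`, so the mismatch can be ignored.
pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS", torch_dtype=torch.float16
).to("cuda")

image = pipe("A cinematic photo of a lighthouse at dusk", num_inference_steps=20).images[0]
image.save("pixart_sigma_sample.png")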