Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion src/diffusers/pipelines/dit/pipeline_dit.py
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,9 @@ class DiTPipeline(DiffusionPipeline):

Parameters:
transformer ([`DiTTransformer2DModel`]):
A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents.
A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents. Initially published as
[`Transformer2DModel`](https://huggingface.co/facebook/DiT-XL-2-256/blob/main/transformer/config.json#L2)
in the config, but the mismatch can be ignored.
vae ([`AutoencoderKL`]):
Variational Auto-Encoder (VAE) model to encode and decode images to and from latent representations.
scheduler ([`DDIMScheduler`]):
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -256,7 +256,9 @@ class PixArtAlphaPipeline(DiffusionPipeline):
Tokenizer of class
[T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
transformer ([`PixArtTransformer2DModel`]):
A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents.
A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. Initially published as
[`Transformer2DModel`](https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS/blob/main/transformer/config.json#L2)
in the config, but the mismatch can be ignored.
scheduler ([`SchedulerMixin`]):
A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
"""
Expand Down
20 changes: 20 additions & 0 deletions src/diffusers/pipelines/pixart_alpha/pipeline_pixart_sigma.py
Original file line number Diff line number Diff line change
Expand Up @@ -185,6 +185,26 @@ def retrieve_timesteps(
class PixArtSigmaPipeline(DiffusionPipeline):
r"""
Pipeline for text-to-image generation using PixArt-Sigma.

This model inherits from [`DiffusionPipeline`]. Check the superclass documentation for the generic methods the
library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.)

Args:
vae ([`AutoencoderKL`]):
Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations.
text_encoder ([`T5EncoderModel`]):
Frozen text-encoder. PixArt-Alpha uses
[T5](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5EncoderModel), specifically the
[t5-v1_1-xxl](https://huggingface.co/PixArt-alpha/PixArt-alpha/tree/main/t5-v1_1-xxl) variant.
tokenizer (`T5Tokenizer`):
Tokenizer of class
[T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
transformer ([`PixArtTransformer2DModel`]):
A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. Initially published as
[`Transformer2DModel`](https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS/blob/main/transformer/config.json#L2)
in the config, but the mismatch can be ignored.
scheduler ([`SchedulerMixin`]):
A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
"""

bad_punct_regex = re.compile(
Expand Down
Loading