huggingface
diff --git a/.github/workflows/nightly_tests.yml b/.github/workflows/nightly_tests.yml
Lines changed: 2 additions & 0 deletions
diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
Lines changed: 1 addition & 1 deletion
diff --git a/docs/source/en/api/models/allegro_transformer3d.md b/docs/source/en/api/models/allegro_transformer3d.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/source/en/api/models/cogvideox_transformer3d.md b/docs/source/en/api/models/cogvideox_transformer3d.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/source/en/api/models/cogview3plus_transformer2d.md b/docs/source/en/api/models/cogview3plus_transformer2d.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/source/en/api/models/mochi_transformer3d.md b/docs/source/en/api/models/mochi_transformer3d.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/source/en/api/pipelines/allegro.md b/docs/source/en/api/pipelines/allegro.md
Lines changed: 46 additions & 1 deletion
diff --git a/docs/source/en/api/pipelines/animatediff.md b/docs/source/en/api/pipelines/animatediff.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/source/en/api/pipelines/attend_and_excite.md b/docs/source/en/api/pipelines/attend_and_excite.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/source/en/api/pipelines/audioldm.md b/docs/source/en/api/pipelines/audioldm.md
Lines changed: 1 addition & 1 deletion
@@ -359,6 +359,8 @@ jobs:
             test_location: "bnb"
           - backend: "gguf"
             test_location: "gguf"
+          - backend: "torchao"
+            test_location: "torchao"
     runs-on:
       group: aws-g6e-xlarge-plus
     container:
 
@@ -48,7 +48,7 @@
   - local: using-diffusers/inpaint
     title: Inpainting
   - local: using-diffusers/text-img2vid
-    title: Text or image-to-video
+    title: Video generation
   - local: using-diffusers/depth2img
     title: Depth-to-image
   title: Generative tasks
 
@@ -18,7 +18,7 @@ The model can be loaded with the following code snippet.
 ```python
 from diffusers import AllegroTransformer3DModel
 
-vae = AllegroTransformer3DModel.from_pretrained("rhymes-ai/Allegro", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
+transformer = AllegroTransformer3DModel.from_pretrained("rhymes-ai/Allegro", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
 ```
 
 ## AllegroTransformer3DModel
 
@@ -18,7 +18,7 @@ The model can be loaded with the following code snippet.
 ```python
 from diffusers import CogVideoXTransformer3DModel
 
-vae = CogVideoXTransformer3DModel.from_pretrained("THUDM/CogVideoX-2b", subfolder="transformer", torch_dtype=torch.float16).to("cuda")
+transformer = CogVideoXTransformer3DModel.from_pretrained("THUDM/CogVideoX-2b", subfolder="transformer", torch_dtype=torch.float16).to("cuda")
 ```
 
 ## CogVideoXTransformer3DModel
 
@@ -18,7 +18,7 @@ The model can be loaded with the following code snippet.
 ```python
 from diffusers import CogView3PlusTransformer2DModel
 
-vae = CogView3PlusTransformer2DModel.from_pretrained("THUDM/CogView3Plus-3b", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
+transformer = CogView3PlusTransformer2DModel.from_pretrained("THUDM/CogView3Plus-3b", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
 ```
 
 ## CogView3PlusTransformer2DModel
 
@@ -18,7 +18,7 @@ The model can be loaded with the following code snippet.
 ```python
 from diffusers import MochiTransformer3DModel
 
-vae = MochiTransformer3DModel.from_pretrained("genmo/mochi-1-preview", subfolder="transformer", torch_dtype=torch.float16).to("cuda")
+transformer = MochiTransformer3DModel.from_pretrained("genmo/mochi-1-preview", subfolder="transformer", torch_dtype=torch.float16).to("cuda")
 ```
 
 ## MochiTransformer3DModel
 
@@ -19,10 +19,55 @@ The abstract from the paper is:
 
 <Tip>
 
-Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers.md) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading.md#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
+Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
 
 </Tip>
 
+## Quantization
+
+Quantization helps reduce the memory requirements of very large models by storing model weights in a lower-precision data type. However, its impact on video quality varies from one video model to another.
+
+Refer to the [Quantization](../../quantization/overview) overview to learn more about supported quantization backends and selecting a quantization backend that supports your use case. The example below demonstrates how to load a quantized [`AllegroPipeline`] for inference with bitsandbytes.
+
+```py
+import torch
+from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig, AllegroTransformer3DModel, AllegroPipeline
+from diffusers.utils import export_to_video
+from transformers import BitsAndBytesConfig, T5EncoderModel
+
+quant_config = BitsAndBytesConfig(load_in_8bit=True)
+text_encoder_8bit = T5EncoderModel.from_pretrained(
+    "rhymes-ai/Allegro",
+    subfolder="text_encoder",
+    quantization_config=quant_config,
+    torch_dtype=torch.float16,
+)
+
+quant_config = DiffusersBitsAndBytesConfig(load_in_8bit=True)
+transformer_8bit = AllegroTransformer3DModel.from_pretrained(
+    "rhymes-ai/Allegro",
+    subfolder="transformer",
+    quantization_config=quant_config,
+    torch_dtype=torch.float16,
+)
+
+pipeline = AllegroPipeline.from_pretrained(
+    "rhymes-ai/Allegro",
+    text_encoder=text_encoder_8bit,
+    transformer=transformer_8bit,
+    torch_dtype=torch.float16,
+    device_map="balanced",
+)
+
+prompt = (
+    "A seaside harbor with bright sunlight and sparkling seawater, with many boats in the water. From an aerial view, "
+    "the boats vary in size and color, some moving and some stationary. Fishing boats in the water suggest that this "
+    "location might be a popular spot for docking fishing boats."
+)
+video = pipeline(prompt, guidance_scale=7.5, max_sequence_length=512).frames[0]
+export_to_video(video, "harbor.mp4", fps=15)
+```
+
 ## AllegroPipeline
 
 [[autodoc]] AllegroPipeline
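
The Quantization section added in this hunk says that storing weights in a lower-precision data type reduces memory requirements. As a rough illustration of that arithmetic, the sketch below estimates weight memory from a parameter count and a bit width. The function name `weight_memory_gib` and the parameter count are hypothetical and chosen only for illustration; the estimate ignores activations, other pipeline components, and any per-block overhead a quantization backend adds.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 1024**3


# Hypothetical parameter count, NOT the actual size of any Allegro component.
HYPOTHETICAL_PARAMS = 3_000_000_000

for label, bits in [("float16", 16), ("int8", 8), ("4-bit", 4)]:
    gib = weight_memory_gib(HYPOTHETICAL_PARAMS, bits)
    print(f"{label}: {gib:.2f} GiB")
```

Halving the bit width halves the weight footprint, which is why loading the text encoder and transformer in 8-bit, as the example in the hunk does, roughly halves their weight memory relative to `torch.float16`.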
 
@@ -803,7 +803,7 @@ FreeInit is not really free - the improved quality comes at the cost of extra co
 
 <Tip>
 
-Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
+Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
 
 </Tip>
 
 
@@ -22,7 +22,7 @@ You can find additional information about Attend-and-Excite on the [project page
 
 <Tip>
 
-Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
+Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
 
 </Tip>
 
 
@@ -37,7 +37,7 @@ During inference:
 
 <Tip>
 
-Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
+Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
 
 </Tip>
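
The Tip hunks above repeat the same two link fixes across several files: dropping the `.md` suffix from relative links and pointing the `loading` link at the `#reuse-a-pipeline` anchor. A minimal sketch of how the same rewrite could be applied to a string of Markdown is shown below. The helper `fix_links` and the regular expression are assumptions for illustration, not part of this PR, and the pattern only handles relative links that start with `../`.

```python
import re

STALE_ANCHOR = "loading#reuse-components-across-pipelines"
NEW_ANCHOR = "loading#reuse-a-pipeline"

# Relative Markdown link target ending in ".md", optionally followed by "#anchor".
MD_SUFFIX = re.compile(r"(\]\(\.\./[^)\s#]*?)\.md(#[^)\s]*)?\)")


def fix_links(text: str) -> str:
    """Drop the .md suffix from relative links, then update the stale anchor."""
    text = MD_SUFFIX.sub(lambda m: f"{m.group(1)}{m.group(2) or ''})", text)
    return text.replace(STALE_ANCHOR, NEW_ANCHOR)


old = (
    "see [guide](../../using-diffusers/schedulers.md) and "
    "[reuse](../../using-diffusers/loading#reuse-components-across-pipelines)"
)
print(fix_links(old))
```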