# Mochi

> [!TIP]
> Only a research preview of the model weights is available at the moment.

[Mochi 1](https://huggingface.co/genmo/mochi-1-preview) is a video generation model by Genmo with a strong focus on prompt adherence and motion quality. The model features a 10B parameter Asymmetric Diffusion Transformer (AsymmDiT) architecture, and uses non-square QKV and output projection layers to reduce inference memory requirements. A single T5-XXL model is used to encode prompts.

*Mochi 1 preview is an open state-of-the-art video generation model with high-fidelity motion and strong prompt adherence in preliminary evaluation. This model dramatically closes the gap between closed and open video generation systems. The model is released under a permissive Apache 2.0 license.*

> [!TIP]
> Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
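The minimal sketch below shows plain text-to-video generation with [`MochiPipeline`] before any quantization is applied. The `num_frames` and `fps` values are illustrative assumptions rather than settings taken from the model card, and model CPU offloading plus VAE tiling are optional memory-saving helpers.

```py
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video

# load the pipeline in bfloat16 to keep memory usage manageable
pipeline = MochiPipeline.from_pretrained(
    "genmo/mochi-1-preview", torch_dtype=torch.bfloat16
)

# optional memory savers: offload idle submodules to CPU and tile the VAE
pipeline.enable_model_cpu_offload()
pipeline.enable_vae_tiling()

prompt = "Close-up of a cats eye, with the galaxy reflected in the cats eye. Ultra high resolution 4k."
# num_frames and fps are example values, not official recommendations
frames = pipeline(prompt, num_frames=84).frames[0]
export_to_video(frames, "mochi.mp4", fps=30)
```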
## Quantization

Quantization helps reduce the memory requirements of very large models by storing model weights in a lower precision data type. Refer to the [Quantization](../../quantization/overview) overview to learn more about supported quantization backends and selecting a quantization backend that supports your use case.

The example below demonstrates how to load a quantized [`MochiPipeline`] for inference with bitsandbytes.

```py
import torch
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig, MochiTransformer3DModel, MochiPipeline
from diffusers.utils import export_to_video
from transformers import BitsAndBytesConfig as BitsAndBytesConfig, T5EncoderModel

# quantize the T5-XXL text encoder to 8-bit with the transformers bitsandbytes config
quant_config = BitsAndBytesConfig(load_in_8bit=True)
text_encoder_8bit = T5EncoderModel.from_pretrained(
    "genmo/mochi-1-preview",
    subfolder="text_encoder",
    quantization_config=quant_config,
    torch_dtype=torch.float16,
)

# quantize the Mochi transformer to 8-bit with the diffusers bitsandbytes config
quant_config = DiffusersBitsAndBytesConfig(load_in_8bit=True)
transformer_8bit = MochiTransformer3DModel.from_pretrained(
    "genmo/mochi-1-preview",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.float16,
)

# assemble the pipeline from the quantized components
pipeline = MochiPipeline.from_pretrained(
    "genmo/mochi-1-preview",
    text_encoder=text_encoder_8bit,
    transformer=transformer_8bit,
    torch_dtype=torch.float16,
    device_map="balanced",
)

frames = pipeline(
    "Close-up of a cats eye, with the galaxy reflected in the cats eye. Ultra high resolution 4k.",
    num_inference_steps=28,
    guidance_scale=3.5,
).frames[0]
export_to_video(frames, "cat.mp4")
```

## MochiPipeline