RobuQ: Pushing DiTs to W1.58A2 via Robust Activation Quantization

Kaicheng Yang, Xun Zhang, Haotong Qin, Yucheng Lin, Kaisen Yang, Xianglong Yan, and Yulun Zhang. [arXiv] [supplementary material]

🔥🔥🔥 News

  • 2025-09-26: Initial repository release.

Abstract: Diffusion Transformers (DiTs) have recently emerged as a powerful backbone for image generation, demonstrating superior scalability and performance over U-Net architectures. However, their practical deployment is hindered by substantial computational and memory costs. While Quantization-Aware Training (QAT) has shown promise for U-Nets, its application to DiTs faces unique challenges, primarily due to the sensitivity and distributional complexity of activations. In this work, we identify activation quantization as the primary bottleneck for pushing DiTs to extremely low-bit settings. To address this, we propose a systematic QAT framework for DiTs, named RobuQ. We start by establishing a strong ternary-weight (W1.58A4) DiT baseline. Building upon this, we propose RobustQuantizer to achieve robust activation quantization. Our theoretical analyses show that the Hadamard transform can convert unknown per-token distributions into per-token normal distributions, providing a strong foundation for this method. Furthermore, we propose AMPN, the first Activation-only Mixed-Precision Network pipeline for DiTs. This method applies ternary weights across the entire network while allocating different activation precisions to each layer to eliminate information bottlenecks. Through extensive experiments on unconditional and conditional image generation, our RobuQ framework achieves state-of-the-art performance for DiT quantization in sub-4-bit configurations. To the best of our knowledge, RobuQ is the first method to achieve stable and competitive image generation on large datasets such as ImageNet-1K with activations quantized to an average of 2 bits.
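Since the code is not yet released (see TODO below), here is a minimal PyTorch sketch of the two core ideas summarized in the abstract: absmean ternary ("W1.58") weight quantization and per-token activation quantization applied after a Hadamard rotation. The function names (`ternary_quantize`, `hadamard_matrix`, `quantize_activations`) and the exact scaling rules are illustrative assumptions, not the RobuQ API.

```python
import torch

def ternary_quantize(w: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Map weights to {-1, 0, +1} times an absmean scale (assumed scheme)."""
    scale = w.abs().mean().clamp(min=eps)
    return torch.round(w / scale).clamp(-1, 1) * scale

def hadamard_matrix(n: int, device=None) -> torch.Tensor:
    """Normalized Hadamard matrix of size n (n must be a power of two)."""
    h = torch.ones(1, 1, device=device)
    while h.shape[0] < n:
        h = torch.cat([torch.cat([h, h], dim=1),
                       torch.cat([h, -h], dim=1)], dim=0)
    return h / h.shape[0] ** 0.5

def quantize_activations(x: torch.Tensor, bits: int) -> torch.Tensor:
    """Per-token uniform quantization after a Hadamard rotation.

    The rotation mixes channels so each token's values look roughly
    Gaussian, which makes a simple per-token scale far more robust.
    """
    h = hadamard_matrix(x.shape[-1], device=x.device)
    x_rot = x @ h                                   # rotate each token
    qmax = 2 ** (bits - 1) - 1
    scale = x_rot.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / qmax
    x_q = torch.round(x_rot / scale).clamp(-qmax - 1, qmax) * scale
    return x_q @ h.T                                # rotate back

# Usage: quantize a linear layer's weight to ternary and its input to 2 bits.
layer = torch.nn.Linear(64, 64, bias=False)
x = torch.randn(4, 16, 64)                          # (batch, tokens, channels)
w_q = ternary_quantize(layer.weight.data)
y = quantize_activations(x, bits=2) @ w_q.T
```

In the full AMPN pipeline described above, the `bits` argument would differ per layer (averaging 2 bits) rather than being fixed network-wide; this sketch uses a single value for brevity.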


Visualization

Fig 1: RobuQ-W1.58A3 256×256 images from quantized DiT-XL/2 on ImageNet-1K

Fig 1: RobuQ-W1.58A2 256×256 images from quantized DiT-XL/2 on ImageNet-1K

Comparison

Fig 2: RobuQ consistently outperforms all previous methods

🔖 TODO

  • Release checkpoints, training, and inference code
  • Release inference engine
  • Release more quantized DiTs

💡 Acknowledgements

This code is built on Diffusion Transformer.
