Big shoutout to the incredible progress in Visual Autoregressive (VAR) models!
We present Scaled Spatial Guidance (SSG) [ICLR 2026], training-free, inference-time guidance that steers VAR generation toward its intended coarse-to-fine hierarchy while maintaining global coherence. With SSG, VAR and VAR-structured models, including HART and Infinity, gain in fidelity and diversity while preserving low latency over all scales.
Although our experiments focus on VAR, HART, and Infinity, SSG is designed as a versatile framework. It readily adapts to other VAR-structured models, operating independently of specific tokenization designs or conditioning (class or text) modalities.
Experimental results and detailed implementation are available in the paper and GitHub.
Paper: https://arxiv.org/abs/2602.05534
Github: https://github.com/Youngwoo-git/SSG