Update autoencoder_kl_wan.py

franciszzj · web-flow · commit c0015fcfa79b · 2025-09-16T10:49:02.000+01:00
When using the Wan2.2 VAE, the spatial compression ratio computed here as `2 ** len(self.temperal_downsample)` is incorrect: it evaluates to 8, but it should be 16. Read it directly from the config value `scale_factor_spatial` so that it is correct.
diff --git a/src/diffusers/models/autoencoders/autoencoder_kl_wan.py b/src/diffusers/models/autoencoders/autoencoder_kl_wan.py
@@ -1052,7 +1052,7 @@ def __init__(
             is_residual=is_residual,
         )
 
-        self.spatial_compression_ratio = 2 ** len(self.temperal_downsample)
+        self.spatial_compression_ratio = scale_factor_spatial
 
         # When decoding a batch of video latents at a time, one can save memory by slicing across the batch dimension
         # to perform decoding of a single video latent at a time.
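
The arithmetic behind this change can be sketched in isolation. This is a minimal illustration, not the library's code: the helper names `derived_spatial_ratio` and `latent_hw` are hypothetical, the three-entry `temperal_downsample` list is assumed only because the commit message reports that the old formula yields 8, and the 480x832 frame size is an arbitrary example.

```python
# Illustrative sketch only; the helpers and example values below are assumptions,
# not part of autoencoder_kl_wan.py.

def derived_spatial_ratio(temperal_downsample):
    """Old behaviour: infer the ratio from the number of downsample stages."""
    return 2 ** len(temperal_downsample)


def latent_hw(height, width, ratio):
    """Spatial size of the latent for a frame of the given pixel size."""
    return height // ratio, width // ratio


# Assumed: three downsample stages, which is what makes the old formula give 8.
temperal_downsample = [False, True, True]
# Value the commit message says the Wan2.2 VAE needs, supplied via the config.
scale_factor_spatial = 16

old_ratio = derived_spatial_ratio(temperal_downsample)
new_ratio = scale_factor_spatial

print("old ratio:", old_ratio, "latent size:", latent_hw(480, 832, old_ratio))
print("new ratio:", new_ratio, "latent size:", latent_hw(480, 832, new_ratio))
```

With the derived ratio, any code that sizes latents or tiles from `spatial_compression_ratio` would work with a grid twice as large in each spatial dimension as the one the Wan2.2 VAE produces, which is why the value is read from the config instead.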
