First, I would like to thank the author for their excellent work. I have some questions about the VAE's ability to handle images at different resolutions.
- The VAE was trained on ImageNet-256, i.e., on 256×256 images. Is my understanding correct?
- I used ldm-imagenet256-f16d16-50ep.ckpt and vavae-imagenet256-f16f32-dinov2-50ep.ckpt directly to reconstruct 1024×1024 images. To my surprise, the reconstructions were still quite good, with PSNR values of 33.62 and 30.11, respectively, and I observed no obvious tiling artifacts or degradation. Could you explain why the VAE generalizes well to higher resolutions?
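
For reference, here is a minimal sketch of how PSNR figures like the ones above can be computed. It is not the repository's evaluation code. It assumes 8-bit images (peak value 255) loaded as NumPy arrays, and it assumes the `f16` in the checkpoint names denotes a 16× spatial downsampling factor. The helper names `psnr` and `latent_grid` are hypothetical.

```python
import numpy as np


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    Assumes pixel values lie in [0, max_value] (8-bit images by default).
    """
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    if original.shape != reconstructed.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((original - reconstructed) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


def latent_grid(height, width, downsample=16):
    """Spatial size of the latent grid, assuming a fixed downsampling factor.

    The default of 16 is an assumption based on the `f16` in the checkpoint names.
    """
    if height % downsample or width % downsample:
        raise ValueError("image size must be divisible by the downsampling factor")
    return height // downsample, width // downsample
```

Under these assumptions, a 256×256 training image maps to a 16×16 latent grid and a 1024×1024 input maps to a 64×64 grid, so `latent_grid(1024, 1024)` gives the latent size the checkpoints would be operating on in the experiment described above.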