Create latent downscaler for vanilla SDXL compatibility #1

@wendlerc

I started working on this here, and preliminary results are promising: https://github.com/wendlerc/latent_downscaling
First checkpoints: https://huggingface.co/wendlerc/latent-downscale-madebyollin-sdxl-vae-fp16-fix

Currently we see an improvement of roughly 10 points on SDXL-generated ImageNet over naive downscaling with image interpolation (+9.2 for b-4 and +11.9 for b-8 in the results below).

Nearest interpolation
b-4: 66.4 (88.6)
b-8: 60.3 (84.4)

Latent downscaler
b-4: 75.6 (94.1) [+9.2]
b-8: 72.2 (92.4) [+11.9]

VAE decode, image resize, VAE encode (~upper bound)
b-4: 84.2 (97.0)
b-8: 81.5 (96.3)

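So there are still roughly another 10 points on the table: the decode-resize-encode reference is ahead of the latent downscaler by 8.6 (b-4) and 9.3 (b-8). It would be nice to close this gap.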
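For readers who want to reproduce the two non-learned reference points, here is a minimal sketch of what "nearest interpolation" on latents and "VAE decode, image resize, VAE encode" could look like. It is not the code from the linked repository. The function names, the `(B, C, H, W)` array layout, the integer `factor`, and the box-average image resize are all assumptions, and `decode` / `encode` are hypothetical callables standing in for the real VAE.

```python
import numpy as np


def nearest_downscale(x: np.ndarray, factor: int = 2) -> np.ndarray:
    """Nearest-neighbour downscale of a (B, C, H, W) array by an integer factor.

    Keeps every `factor`-th row and column; no averaging is done.
    """
    if x.ndim != 4:
        raise ValueError("expected an array of shape (B, C, H, W)")
    return x[:, :, ::factor, ::factor]


def area_downscale(x: np.ndarray, factor: int = 2) -> np.ndarray:
    """Box-average downscale of a (B, C, H, W) array by an integer factor.

    Assumed stand-in for the "image resize" step; the issue does not say
    which resampling filter was used.
    """
    b, c, h, w = x.shape
    if h % factor or w % factor:
        raise ValueError("H and W must be divisible by factor")
    blocks = x.reshape(b, c, h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(3, 5))


def roundtrip_downscale(latents, decode, encode, factor: int = 2):
    """Decode latents to pixels, resize in pixel space, encode again.

    `decode` and `encode` are hypothetical callables (e.g. thin wrappers
    around a VAE); they are not defined here.
    """
    images = decode(latents)
    small = area_downscale(images, factor)
    return encode(small)


if __name__ == "__main__":
    # Toy "VAE" for illustration only: 8x pixel repetition and 8x box averaging.
    def toy_decode(z):
        return z.repeat(8, axis=2).repeat(8, axis=3)

    def toy_encode(img):
        return area_downscale(img, 8)

    z = np.random.default_rng(0).normal(size=(1, 4, 16, 16)).astype(np.float32)
    print(nearest_downscale(z).shape)
    print(roundtrip_downscale(z, toy_decode, toy_encode).shape)
```

The round trip costs a full VAE decode and encode per sample, which is presumably why a direct latent-to-latent downscaler that approaches its quality is attractive.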

Labels: enhancement, help wanted
