leejet (Owner) commented Oct 14, 2025

With sdxl_vae-fp16-fix.safetensors

.\bin\Release\sd.exe -m ..\..\stable-diffusion-webui\models\Stable-diffusion\sd_xl_base_1.0.safetensors -p "a lovely cat" -v --vae ..\..\stable-diffusion-webui\models\VAE\sdxl_vae-fp16-fix.safetensors -H 1024 -W 1024
[image: output]

Without sdxl_vae-fp16-fix.safetensors

.\bin\Release\sd.exe -m ..\..\stable-diffusion-webui\models\Stable-diffusion\sd_xl_base_1.0.safetensors -p "a lovely cat" -v -H 1024 -W 1024
[image: output]

wbruna (Contributor) commented Oct 14, 2025

Works well with a few problematic checkpoints here. Thanks!

And after removing the check for external files, it's working even for a few alternative VAEs that always gave me black outputs. Any reason for not enabling the scaling factor for SDXL every time?

leejet (Owner, Author) commented Oct 15, 2025

The VAE conv scaling may slightly affect image quality and generation speed in some cases, so I prefer to avoid using it whenever possible.
I’ve added a --force-sdxl-vae-conv-scale option to force-enable conv scaling for the SDXL VAE, which should meet your needs.
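
For readers unfamiliar with the idea, here is a minimal sketch of why scaling around a convolution can help with fp16 overflow. It is an illustration under assumptions, not the code from this PR: it assumes conv scaling means multiplying the input by a small factor before the accumulation and dividing the result by the same factor afterwards, and it models fp16 only by its overflow threshold. The names `to_fp16_range` and `dot_fp16` and the factor 1/32 are hypothetical.

```python
import math

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16_range(x):
    """Mimic fp16 overflow only: values beyond the finite range become inf."""
    if abs(x) > FP16_MAX:
        return math.copysign(math.inf, x)
    return x


def dot_fp16(weights, inputs, scale=1.0):
    """Dot product whose accumulator overflows like fp16 (assumed model).

    The inputs are multiplied by `scale` before accumulation, and the
    result is divided by `scale` afterwards, outside the fp16 accumulator.
    """
    acc = 0.0
    for w, x in zip(weights, inputs):
        acc = to_fp16_range(acc + w * (x * scale))
    return acc / scale


# Hypothetical activations large enough to overflow the accumulator.
weights = [1.0] * 64
inputs = [2000.0] * 64

print(dot_fp16(weights, inputs))                # inf (accumulator overflowed)
print(dot_fp16(weights, inputs, scale=1 / 32))  # 128000.0
```

In real fp16 arithmetic, scaling down also pushes small values toward underflow, and the extra multiply and divide add work, which would be consistent with the quality and speed trade-off described above.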

leejet merged commit 40a6a87 into master Oct 15, 2025 (9 checks passed)

wbruna (Contributor) commented Oct 15, 2025

> And after removing the check for external files, it's working even for a few alternative VAEs that always gave me black outputs.

FWIW, --vae-conv-direct also seems to be enough to avoid the issue for the problematic VAEs (although it's far slower on ROCm).

leejet deleted the sdxl_vae_precision_fix branch October 28, 2025 15:28

Green-Sky (Contributor) commented Nov 6, 2025

This PR increases VRAM usage for the VAE in all --vae-conv-direct(?) cases.

SD1.5 768x1024

vae compute buffer size: 1920.19 MB -> vae compute buffer size: 2112.19 MB

$ result/bin/sd --vae-conv-direct -v --diffusion-fa -m models/CyberRealistic_V9_FP16.safetensors --sampling-method dpm++2m --scheduler karras -W 768 -H 1024 --cfg-scale 5 --steps 38 -n "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry" -p "a lovely cat"

SD2 512x512

vae compute buffer size: 640.06 MB -> vae compute buffer size: 704.06 MB

$ result/bin/sd -m models/sd_turbo-f16-q8_0.gguf --cfg-scale 1 --steps 8 -p "a lovely cat" --vae-conv-direct -v
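
For reference, the two measurements above can be compared with a quick calculation. This is only a sketch of the arithmetic on the logged numbers; the helper name `increase` is hypothetical.

```python
# Buffer sizes in MB, copied from the two logs above: (before, after).
measurements = {
    "SD1.5 768x1024": (1920.19, 2112.19),
    "SD2 512x512": (640.06, 704.06),
}


def increase(before, after):
    """Return (absolute increase in MB, relative increase in percent)."""
    delta = after - before
    return round(delta, 2), round(100.0 * delta / before, 1)


for name, (before, after) in measurements.items():
    delta_mb, pct = increase(before, after)
    print(f"{name}: +{delta_mb} MB ({pct}%)")
# SD1.5 768x1024: +192.0 MB (10.0%)
# SD2 512x512: +64.0 MB (10.0%)
```

Both cases show roughly the same 10% growth in the VAE compute buffer.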
