wbruna (Contributor) commented Oct 15, 2025

I'm not sure this is really OK: the resulting images differ slightly in the details, and the memory savings look surprisingly high. These numbers are for a 1024x1024 image, without vae-conv-direct:

before:

[INFO ] stable-diffusion.cpp:2386 - decoding 1 latents
[DEBUG] ggml_extend.hpp:1579 - vae compute buffer size: 7680.25 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1677 - computing vae decode graph completed, taking 5.98s
[INFO ] stable-diffusion.cpp:2396 - latent 1 decoded, taking 5.98s
[INFO ] stable-diffusion.cpp:2400 - decode_first_stage completed, taking 5.98s

after:

[INFO ] stable-diffusion.cpp:2386 - decoding 1 latents
[DEBUG] ggml_extend.hpp:1579 - vae compute buffer size: 6656.25 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1677 - computing vae decode graph completed, taking 5.99s
[INFO ] stable-diffusion.cpp:2396 - latent 1 decoded, taking 6.00s
[INFO ] stable-diffusion.cpp:2400 - decode_first_stage completed, taking 6.00s

leejet (Owner) commented Oct 17, 2025

Performing an in-place scale on the input x of ggml_nn_conv_2d is not a good idea, because x might also be used as the input for other operations. This is also the reason why the image changes after applying in-place scaling.
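
The aliasing problem described above can be illustrated without ggml. The following is only a minimal sketch in plain C++: `Tensor`, `scaled_op_copy`, `scaled_op_inplace`, `residual_add`, and the two `block_*` functions are hypothetical names invented for this illustration, not functions from ggml or stable-diffusion.cpp. It assumes `x` has a second consumer (here a skip-connection-style add), which is the situation the comment warns about.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Tensor = std::vector<float>;  // hypothetical stand-in for a tensor buffer

// Out-of-place: scales a copy, so the caller's x is left untouched.
Tensor scaled_op_copy(const Tensor& x, float s) {
    Tensor y(x);
    for (float& v : y) v *= s;
    return y;
}

// In-place: scales the caller's buffer directly. This avoids the extra
// buffer, but every later reader of x now sees the scaled values.
Tensor scaled_op_inplace(Tensor& x, float s) {
    for (float& v : x) v *= s;
    return x;  // returns a copy of the (already modified) buffer
}

// A second consumer of x, e.g. a skip connection: out = x + branch.
Tensor residual_add(const Tensor& x, const Tensor& branch) {
    assert(x.size() == branch.size());
    Tensor out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) out[i] = x[i] + branch[i];
    return out;
}

// Correct: the residual add reads the original x.
Tensor block_copy(Tensor x, float s) {
    Tensor branch = scaled_op_copy(x, s);
    return residual_add(x, branch);
}

// Incorrect: the residual add reads x after it was scaled in place.
Tensor block_inplace(Tensor x, float s) {
    Tensor branch = scaled_op_inplace(x, s);
    return residual_add(x, branch);
}
```

With `x = {1, 2}` and `s = 0.5`, `block_copy` yields `{1.5, 3}` while `block_inplace` yields `{1, 2}`: the in-place variant silently changes the value seen by the second consumer, which matches the observation that the image changes.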

wbruna (Contributor, Author) commented Oct 17, 2025

> Performing an in-place scale on the input x of ggml_nn_conv_2d is not a good idea, because x might also be used as the input for other operations. This is also the reason why the image changes after applying in-place scaling.

Oh, I see. But that wouldn't be an issue for the result of the conv_2d call, right?

I'll test it again with this change.

wbruna marked this pull request as draft October 17, 2025 14:14

leejet (Owner) commented Oct 17, 2025

Yes, but it will affect other operations using x, and ultimately it will impact the generated image.

wbruna (Contributor, Author) commented Oct 18, 2025

I tested changing only the second ggml_scale. The generated image no longer changed, but I didn't notice any memory improvement either.
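
Why the image stays the same in this case can be sketched in plain C++. This assumes the "second ggml_scale" is the one applied to the result of the convolution, as discussed earlier in the thread. `Tensor`, `fake_conv`, `scale_inplace`, and the two `conv_block_*` functions are hypothetical names used only for illustration and are not part of ggml or stable-diffusion.cpp. The sketch covers correctness only and makes no claim about memory usage in ggml.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Tensor = std::vector<float>;  // hypothetical stand-in for a tensor buffer

// Hypothetical stand-in for the convolution: reads x, writes a fresh buffer.
Tensor fake_conv(const Tensor& x, float w) {
    Tensor y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] * w;
    return y;
}

// Scales a buffer in place.
void scale_inplace(Tensor& t, float s) {
    for (float& v : t) v *= s;
}

// Reference version: the output scale writes into a separate buffer.
Tensor conv_block_copy(const Tensor& x, float w, float s) {
    Tensor y = fake_conv(x, w);
    Tensor scaled(y);          // extra buffer for the scaled result
    scale_inplace(scaled, s);
    Tensor out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) out[i] = x[i] + scaled[i];
    return out;
}

// In-place version: y was just produced by the conv and has no other
// reader, so overwriting it cannot affect any other operation.
Tensor conv_block_output_inplace(const Tensor& x, float w, float s) {
    Tensor y = fake_conv(x, w);
    scale_inplace(y, s);       // safe: y is private to this block
    Tensor out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) out[i] = x[i] + y[i];
    return out;
}
```

Both functions take `x` by const reference, so the input cannot be modified, and both return the same values for the same arguments. That is consistent with the image no longer changing once only the output scale is done in place.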

wbruna closed this Oct 18, 2025