WAN 2.2 converting to GGUF #788
Thanks for pointing me to the wan branch. I tried switching to that branch and building it with cmake per the instructions (clone recursively, then switch to the wan branch), but it errored out with the following messages:

D:\tools\stable-diffusion.cpp\model.cpp(1061,17): error C2027: use of undefined type 'gguf_tensor_shape' [D:\tools\stab
D:\tools\stable-diffusion.cpp\model.cpp(1072,17): error C3861: 'gguf_init_from_file_ext': identifier not found [D:\tool
D:\tools\stable-diffusion.cpp\ggml_extend.hpp(883,18): error C3861: 'ggml_conv_3d': identifier not found [D:\tools\stab
D:\tools\stable-diffusion.cpp\wan.hpp(76,17): error C3861: 'ggml_pad_ext': identifier not found [D:\tools\stable-diffus
D:\tools\stable-diffusion.cpp\ggml_extend.hpp(883,18): error C3861: 'ggml_conv_3d': identifier not found [D:\tools\stab

Not sure if I did something wrong during the cloning process.
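The 'identifier not found' errors on ggml_conv_3d / ggml_pad_ext usually mean the bundled ggml submodule does not match what the branch expects. A minimal clean checkout and build would look roughly like this (a sketch only, assuming the upstream leejet/stable-diffusion.cpp repo and a branch named wan):

```sh
# Sketch: clean checkout and build of the wan branch (repo URL and branch name assumed).
# The key step for the 'identifier not found' errors is syncing the ggml submodule
# to the commit the branch pins.
git clone --recursive https://github.com/leejet/stable-diffusion.cpp
cd stable-diffusion.cpp
git checkout wan
git submodule update --init --recursive   # sync ggml to the branch's pinned commit
cmake -B build
cmake --build build --config Release
```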
After updating the submodule, it started to convert, but it always stops / crashes at the same spot of a WAN 2.2 model. I ran with the -v flag (assuming it's verbose?), but no explicit error message pops up. Any suggestions?
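For reference, the conversion I am running is along these lines (a sketch based on the convert mode described in the sd.cpp README; the input and output file names are placeholders, and -v does enable verbose logging):

```sh
# Sketch of a WAN 2.2 -> GGUF conversion with the sd.cpp CLI (paths are placeholders).
# -M convert selects convert mode, --type picks the output quantization, -v is verbose.
./bin/sd -M convert \
  -m ./models/wan2.2_t2v_high_noise_14B_fp16.safetensors \
  -o ./models/wan2.2_t2v_high_noise_14B-q8_0.gguf \
  --type q8_0 -v
```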
For the lightning lora to work we are limited by the supported quants seen here: I have updated lora.md in the docs to point that out. That leaves us with just Q8_0 supported when using the Wan2.2 GGUF found at: It would be nice to have support for somewhat lighter quants like Q5_K_S, which I have been able to generate images with on my AMD 7800 XT 16GB in ComfyUI. See the usage sketch below.
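Applying the lightning lora on top of the Q8_0 GGUF would look roughly like the following (a sketch only: --lora-model-dir and the <lora:name:strength> prompt syntax come from sd.cpp's lora.md, but the extra flags the Wan 2.2 pipeline needs, such as the VAE and text encoder, are omitted and may differ on the wan branch; all file names are placeholders):

```sh
# Rough sketch: run the Q8_0 Wan 2.2 GGUF with a lightning lora applied via the prompt tag.
# File names are placeholders; WAN-specific flags (VAE, text encoder, etc.) are omitted here.
./bin/sd --diffusion-model ./models/wan2.2-t2v-q8_0.gguf \
  --lora-model-dir ./models/loras \
  -p "a cinematic shot of a red fox in the snow <lora:wan2.2_lightning_4step:1>" \
  --steps 4 --cfg-scale 1.0
```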
I saw the author's pull request over at llama.cpp which incorporated handling of WAN. While the pull request was merged, the quantize binary (exe file) does not recognize the wan format. So I cloned the latest sd.cpp, but the convert function also failed with <model.cpp:1177 - invalid tensor 'model.diffusion_model.patch_embedding.weight'>.
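For reference, the quantize attempt looked roughly like this (a sketch only; llama-quantize is llama.cpp's quantization tool, the file names are placeholders, and it presumably rejects the file because it only accepts GGUFs whose architecture llama.cpp knows about):

```sh
# Sketch of the llama.cpp quantize attempt (file names are placeholders).
# llama-quantize expects a GGUF with an architecture llama.cpp recognizes,
# so a WAN diffusion GGUF is reported as an unrecognized model.
./llama-quantize ./models/wan2.2-t2v-f16.gguf ./models/wan2.2-t2v-q8_0.gguf Q8_0
```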
So my questions are:
Note, I already tried the ComfyUI GGUF approach and that also failed.