
Conversation

leejet (Owner) commented Aug 29, 2025

Feature:

  • Wan2.1 T2V 1.3B
  • Wan2.1 T2V 14B
  • Wan2.1 I2V 14B
  • Wan2.2 T2V A14B
  • Wan2.2 I2V A14B
  • Wan2.2 TI2V 5B
  • Wan2.1 FLF2V 14B
  • Wan2.2 FLF2V 14B

TODO:

  • VACE
  • Fun control
  • Reduce the memory usage of WAN VAE

Warning: Currently, only the CUDA and CPU backends support WAN VAE. If you are using another backend, try using --vae-on-cpu to run the WAN VAE on the CPU, although this will be very slow.
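For example, the 1.3B T2V example below could be run on such a backend like this (a sketch reusing the same placeholder model paths as the examples):

.\bin\Release\sd.exe -M vid_gen --diffusion-model ..\..\ComfyUI\models\diffusion_models\wan2.1_t2v_1.3B_fp16.safetensors --vae ..\..\ComfyUI\models\vae\wan_2.1_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -W 832 -H 480 --video-frames 33 --vae-on-cpu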

Examples

Since GitHub does not support AVI files, the files I uploaded were converted from AVI to MP4.

Wan2.1 T2V 1.3B

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\wan2.1_t2v_1.3B_fp16.safetensors --vae ..\..\ComfyUI\models\vae\wan_2.1_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf  -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部, 畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 832 -H 480 --diffusion-fa --video-frames 33
Wan2.1_1.3B_t2v.mp4
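(For reference: -M vid_gen selects the video generation mode, --diffusion-fa enables flash attention in the diffusion model, and --video-frames 33 sets the clip length. The long -n argument is the commonly used Chinese Wan negative prompt; roughly: gaudy colors, overexposed, static, blurry details, subtitles, style, artwork, painting, still frame, overall gray, worst quality, low quality, JPEG artifacts, ugly, mutilated, extra fingers, badly drawn hands, badly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, crowded background, walking backwards.)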

Wan2.1 T2V 14B

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\wan2.1-t2v-14b-Q8_0.gguf --vae ..\..\ComfyUI\models\vae\wan_2.1_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf  -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 832 -H 480 --diffusion-fa  --offload-to-cpu --video-frames 33
Wan2.1_14B_t2v.mp4

Wan2.1 I2V 14B

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\wan2.1-i2v-14b-480p-Q8_0.gguf --vae ..\..\ComfyUI\models\vae\wan_2.1_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf --clip_vision ..\..\ComfyUI\models\clip_vision\clip_vision_h.safetensors -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 480 -H 832 --diffusion-fa --video-frames 33 --offload-to-cpu -i ..\assets\cat_with_sd_cpp_42.png
Wan2.1_14B_i2v.mp4

Wan2.2 T2V A14B

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf --high-noise-diffusion-model  ..\..\ComfyUI\models\diffusion_models\Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf --vae ..\..\ComfyUI\models\vae\wan_2.1_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf  -p "a lovely cat" --cfg-scale 3.5 --sampling-method euler --steps 10 --high-noise-cfg-scale 3.5 --high-noise-sampling-method euler --high-noise-steps 8 -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 832 -H 480 --diffusion-fa --offload-to-cpu --video-frames 33
Wan2.2_14B_t2v.mp4
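Note the two-expert setup here: Wan2.2 A14B splits denoising across two models, so sampling starts on the model passed via --high-noise-diffusion-model for the early, high-noise part of the schedule and then hands off to the low-noise model passed via --diffusion-model, which is why each stage has its own cfg-scale, sampler, and step count.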

Wan2.2 I2V A14B

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf --high-noise-diffusion-model  ..\..\ComfyUI\models\diffusion_models\Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf --vae ..\..\ComfyUI\models\vae\wan_2.1_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf  -p "a lovely cat" --cfg-scale 3.5 --sampling-method euler --steps 10 --high-noise-cfg-scale 3.5 --high-noise-sampling-method euler --high-noise-steps 8 -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 832 -H 480 --diffusion-fa --offload-to-cpu --video-frames 33 -i ..\assets\cat_with_sd_cpp_42.png
Wan2.2_14B_i2v.mp4

Wan2.2 T2V A14B T2I

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf --high-noise-diffusion-model  ..\..\ComfyUI\models\diffusion_models\Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf --vae ..\..\ComfyUI\models\vae\wan_2.1_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf  -p "a lovely cat" --cfg-scale 3.5 --sampling-method euler --steps 10 --high-noise-cfg-scale 3.5 --high-noise-sampling-method euler --high-noise-steps 8 -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 832 -H 480 --diffusion-fa --offload-to-cpu
Wan2.2_14B_t2i

Wan2.2 T2V A14B with LoRA

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf --high-noise-diffusion-model  ..\..\ComfyUI\models\diffusion_models\Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf --vae ..\..\ComfyUI\models\vae\wan_2.1_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf  -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise:1><lora:|high_noise|wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise:1>" --cfg-scale 3.5 --sampling-method euler --steps 4 --high-noise-cfg-scale 3.5 --high-noise-sampling-method euler --high-noise-steps 4 -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 832 -H 480 --diffusion-fa --offload-to-cpu --lora-model-dir ..\..\ComfyUI\models\loras --video-frames 33
Wan2.2_14B_t2v_lora.mp4
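As far as I can tell, the |high_noise| prefix inside the second LoRA tag routes that LoRA to the high-noise expert, while the unprefixed tag applies to the low-noise model; both files are resolved under --lora-model-dir.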

Wan2.2 TI2V 5B

T2V

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\wan2.2_ti2v_5B_fp16.safetensors --vae ..\..\ComfyUI\models\vae\wan2.2_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf  -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 480 -H 832 --diffusion-fa --offload-to-cpu --video-frames 33
Wan2.2_5B_t2v.mp4

I2V

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\wan2.2_ti2v_5B_fp16.safetensors --vae ..\..\ComfyUI\models\vae\wan2.2_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf  -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 480 -H 832 --diffusion-fa --offload-to-cpu --video-frames 33 -i ..\assets\cat_with_sd_cpp_42.png
Wan2.2_5B_i2v.mp4

Wan2.1 FLF2V 14B

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\wan2.1-flf2v-14b-720p-Q8_0.gguf --vae ..\..\ComfyUI\models\vae\wan_2.1_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf --clip_vision ..\..\ComfyUI\models\clip_vision\clip_vision_h.safetensors -p "glass flower blossom" --cfg-scale 6.0 --sampling-method euler -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 480 -H 832 --diffusion-fa --video-frames 33 --offload-to-cpu --init-img ..\..\ComfyUI\input\start_image.png --end-img ..\..\ComfyUI\input\end_image.png
Wan2.1_14B_flf2v.mp4

Wan2.2 FLF2V 14B

.\bin\Release\sd.exe -M vid_gen --diffusion-model  ..\..\ComfyUI\models\diffusion_models\Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf --high-noise-diffusion-model  ..\..\ComfyUI\models\diffusion_models\Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf --vae ..\..\ComfyUI\models\vae\wan_2.1_vae.safetensors --t5xxl ..\..\ComfyUI\models\text_encoders\umt5-xxl-encoder-Q8_0.gguf --cfg-scale 3.5 --sampling-method euler --steps 10 --high-noise-cfg-scale 3.5 --high-noise-sampling-method euler --high-noise-steps 8 -v -p "glass flower blossom" -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 480 -H 832 --diffusion-fa --video-frames 33 --offload-to-cpu --init-img ..\..\ComfyUI\input\start_image.png --end-img ..\..\ComfyUI\input\end_image.png
Wan2.2_14B_flf2v.mp4
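Note that this example reuses the Wan2.2 I2V A14B models rather than a dedicated FLF2V checkpoint; the first and last frames are supplied via --init-img and --end-img.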

leejet (Owner, Author) commented Aug 29, 2025

Finally, support for Wan has been added. This took me a long time. Once this PR is merged, I will try to add support for Qwen Image.

Green-Sky (Contributor) commented

Great job @leejet, very nice. I can't wait to try it later.

I see you went for MJPEG+AVI; is there also an option to output a PNG image sequence?

leejet (Owner, Author) commented Aug 29, 2025

> Great job @leejet, very nice. I can't wait to try it later.
>
> I see you went for MJPEG+AVI; is there also an option to output a PNG image sequence?

I will add command-line parameters to control it, but the priority is not very high.

chaserhkj commented

@Green-Sky It might not be a problem with the implementation but with the smaller model itself. I think that smaller model's distillation is not done very well; I had a lot of trouble getting consistent results in ComfyUI using that smaller model as well. I had far better chances using a quantized version of the full model.

Green-Sky (Contributor) commented Sep 2, 2025

> @Green-Sky It might not be a problem with the implementation but with the smaller model itself. I think that smaller model's distillation is not done very well; I had a lot of trouble getting consistent results in ComfyUI using that smaller model as well. I had far better chances using a quantized version of the full model.

Hmm. I don't think you can call Wan2.2 TI2V 5B a distilled model. It has its own VAE, which has far more compression than the other VAE.

> Wan2.2 open-sources a 5B model built with our advanced Wan2.2-VAE that achieves a compression ratio of 16×16×4.
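As a quick sanity check of that ratio: a 480×832 clip with 33 frames becomes a 30×52 latent with (33 − 1)/4 + 1 = 9 temporal slices under the 16×16×4 Wan2.2-VAE, whereas the 8×8×4 Wan2.1 VAE would give a 60×104 spatial latent for the same input.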

Also, the same model behaves just fine with text-only input.

stduhpf (Contributor) commented Sep 2, 2025

> Wan2.2 TI2V 5B with image input still seems to be somewhat broken.

I think Wan2.2 TI2V kind of sucks in I2V mode in ComfyUI too.

Edit: I tried to match the settings as well as I could in ComfyUI; it's definitely not as bad, but maybe it's just a lucky seed.
(attached image: ComfyUI_01039_)

Edit 2: No, something's definitely wrong with this PR's implementation; the cat keeps sneezing no matter the seed, and this doesn't happen at all in ComfyUI.

Edit 3: I was using --sed instead of --seed (thank god #767 is merged in master now)

(attached outputs: seed 42 and seed 0)

chaserhkj commented Sep 2, 2025 via email

tyllmoritz commented

When I tried this with the Vulkan backend, I had problems with im2col_3d.

For a quick and dirty test, I just reverted the commit "cuda/cpu: add im2col_3d support" in https://github.com/leejet/ggml/tree/wan.

There are two better solutions (already implemented by @leejet and @jeffbolznv, thanks for your work):

leejet (Owner, Author) commented Sep 6, 2025

Since ggml-org/ggml has already synchronized the PR I made to add the WAN-related operations, I have decided to merge this PR first; it already contains too many changes. Support for VACE and Fun control will come in a separate pull request.

leejet merged commit cb1d975 into master on Sep 6, 2025 (8 checks passed).
Green-Sky (Contributor) left a review:

Sorry for the late review.

Green-Sky (Contributor) commented

--offload-to-cpu is missing from the help output as well.

leejet (Owner, Author) commented Sep 6, 2025

All of these have been fixed. Thank you for your review comments.

Amin456789 commented

@LostRuins please add this to your GUI if possible. It would be great if you could add support for LoRA too.

Thank you all for making this, thanks leejet and everyone else.

LostRuins (Contributor) commented Sep 28, 2025

Hello @leejet, I noticed that sd_vid_gen_params_t doesn't contain any parameters for toggling VAE tiling. Does VAE tiling currently work for WAN videos, and is it possible to enable it? Thanks!

Edit: The reason I ask is that without VAE tiling, it currently tries to allocate a massive buffer on Vulkan and goes OOM.

LostRuins (Contributor) commented

Also, can someone help me understand how flow shift works? Is that what's causing these abrupt transitions, and how can I avoid it?
(attached video: cat)

LostRuins (Contributor) commented Sep 30, 2025

wtf

Still getting really weird results in most generations.

@wbruna any ideas?

Final edit: All resolved by switching to wan2.2-rapid-mega-aio-v3

leejet (Owner, Author) commented Oct 11, 2025

> Hello @leejet, I noticed that sd_vid_gen_params_t doesn't contain any parameters for toggling VAE tiling. Does VAE tiling currently work for WAN videos, and is it possible to enable it? Thanks!
>
> Edit: The reason I ask is that without VAE tiling, it currently tries to allocate a massive buffer on Vulkan and goes OOM.

Currently, WAN VAE does not support video tiling, and I haven’t tested the feasibility of video tiling yet.

leejet (Owner, Author) commented Oct 11, 2025

> Also, can someone help me understand how flow shift works? Is that what's causing these abrupt transitions, and how can I avoid it?

Try lower shift values (2.0 to 5.0) for lower-resolution videos and higher shift values (7.0 to 12.0) for higher-resolution videos. https://huggingface.co/docs/diffusers/en/api/pipelines/wan#notes
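For intuition, here is a minimal sketch of the usual flow-matching timestep shift used by SD3/Flux-style flow models (my own illustration of the standard formula, not code from this repo). A larger shift keeps more of the step budget at the high-noise end of the schedule, which is why higher resolutions tend to want higher shift values.

#include <cstdio>

// Flow-matching timestep shift: sigma' = shift * sigma / (1 + (shift - 1) * sigma).
// shift = 1.0 leaves the schedule unchanged; larger values push intermediate
// steps toward high noise.
double shift_sigma(double sigma, double shift) {
    return shift * sigma / (1.0 + (shift - 1.0) * sigma);
}

int main() {
    const double shift = 5.0;  // example value for a ~480p video
    const int steps    = 10;
    for (int i = 0; i <= steps; i++) {
        double sigma = 1.0 - (double)i / steps;  // plain linear schedule from 1 to 0
        printf("sigma %.2f -> %.3f\n", sigma, shift_sigma(sigma, shift));
    }
    return 0;
}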

LostRuins (Contributor) commented

> Currently, WAN VAE does not support video tiling, and I haven't tested the feasibility of video tiling yet.

Would it be possible to simply do the VAE per-frame (the entire frame at once)? I confess I don't know how it works, but the memory usage for a single-frame image is perfectly OK. The problem only comes when doing longer videos with many frames.

leejet (Owner, Author) commented Oct 12, 2025

> > Currently, WAN VAE does not support video tiling, and I haven't tested the feasibility of video tiling yet.
>
> Would it be possible to simply do the VAE per-frame (the entire frame at once)? I confess I don't know how it works, but the memory usage for a single-frame image is perfectly OK. The problem only comes when doing longer videos with many frames.

        struct ggml_tensor* decode(struct ggml_context* ctx,
                                   struct ggml_tensor* z,
                                   int64_t b = 1) {
            // z: [b*c, t, h, w]
            GGML_ASSERT(b == 1);

            clear_cache();

            auto decoder = std::dynamic_pointer_cast<Decoder3d>(blocks["decoder"]);
            auto conv2   = std::dynamic_pointer_cast<CausalConv3d>(blocks["conv2"]);

            int64_t iter_ = z->ne[2];
            auto x        = conv2->forward(ctx, z);
            struct ggml_tensor* out;
            // Decode one temporal latent slice per iteration; the causal conv
            // feature cache (_feat_map) carries context over from the previous slice.
            for (int64_t i = 0; i < iter_; i++) {
                _conv_idx = 0;
                if (i == 0) {
                    auto in = ggml_slice(ctx, x, 2, i, i + 1);  // [b*c, 1, h, w]
                    out     = decoder->forward(ctx, in, b, _feat_map, _conv_idx, i);
                } else {
                    auto in   = ggml_slice(ctx, x, 2, i, i + 1);  // [b*c, 1, h, w]
                    auto out_ = decoder->forward(ctx, in, b, _feat_map, _conv_idx, i);
                    out       = ggml_concat(ctx, out, out_, 2);  // append along the time axis
                }
            }
            if (wan2_2) {
                // The Wan2.2 VAE packs 2x2 pixel patches into channels; undo that here.
                out = unpatchify(ctx, out, 2, b);
            }
            clear_cache();
            return out;
        }

Currently, decoding is done frame by frame, and the compute buffer size used is the same for both 33 frames and 81 frames.

LostRuins (Contributor) commented

Oh, then why is it smaller for something like 1 frame or 5 frames?

leejet (Owner, Author) commented Oct 12, 2025

Starting from chunk 1, each chunk depends on data from the previous chunk, so the computation graph is different, causing the compute buffer to grow. In theory, after chunk 1, the compute buffer shouldn't grow anymore, but in practice, it actually stops growing after chunk 2. I tried creating a separate computation graph for each chunk, and indeed, the buffer no longer grows after chunk 1. However, the results for chunk 1 were a bit odd, so I disabled the related code; you can check the code around build_graph_partial.

By the way, for Wan VAE, the decoding rule for chunks is: chunk 0 corresponds to 1 frame, and starting from chunk 1, each chunk corresponds to 4 frames.
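So the frame count maps directly onto the number of decode chunks; here is a small sketch of that bookkeeping (my own illustration of the rule above, not repo code):

#include <cstdio>

// Wan VAE temporal rule: chunk 0 decodes 1 pixel frame, every later chunk decodes 4.
// A clip with `frames` pixel frames therefore uses 1 + (frames - 1) / 4 chunks,
// which matches the temporal size of the latent (valid when frames % 4 == 1).
int num_chunks(int frames) {
    return 1 + (frames - 1) / 4;
}

int main() {
    const int frame_counts[] = {1, 5, 33, 81};
    for (int frames : frame_counts) {
        printf("%2d frames -> %2d decode chunks / latent frames\n",
               frames, num_chunks(frames));
    }
    return 0;
}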

henk717 commented Oct 12, 2025

As a data point: for 80 frames (on the KoboldCpp side), I am measuring 75 GB of VRAM used during generation with the 14B 2.2 model. If I use only 10 frames, it fits on my 3090 fine. So something is ballooning the VRAM usage at higher frame counts.

LostRuins (Contributor) commented Oct 13, 2025

What resolution were you generating at?

Also, if this logic is correct, then 10 frames should take the same amount of memory as 80 frames, but it seems higher.

leejet (Owner, Author) commented Oct 13, 2025

Have you used --diffusion-fa? This option can significantly reduce the VRAM usage.
