[GGUF] feat: support loading diffusers format gguf checkpoints. #11684

sayakpaul · 2025-06-10T03:59:55Z

What does this PR do?

Refer to ngxson/diffusion-to-gguf#1 to know how to obtain the checkpoint.

After the checkpoint is obtained, run the following code for inference:

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

ckpt_path = "model-Q4_0.gguf"
transformer = FluxTransformer2DModel.from_single_file(
    ckpt_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
    config="black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")
prompt = "A cat holding a sign that says GGUF"
image = pipe(prompt, generator=torch.manual_seed(0)).images[0]
image.save("flux-gguf.png")

Currently, the entrypoint for a diffusers-formatted GGUF checkpoint is from_single_file(). It remains to be seen whether, after https://github.com/ngxson/flux-to-gguf, we want to support them through from_pretrained() as well.

Sample diffusers-format GGUF file: https://huggingface.co/sayakpaul/flux-diffusers-gguf

@DN6 please feel free to make any changes or even change the direction of the PR as you see fit.

HuggingFaceDocBuilderDev · 2025-06-10T04:07:08Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

nitinmukesh · 2025-06-10T16:48:15Z

pip install git+https://github.com/huggingface/diffusers.git@refs/pull/11684/head

from typing import List
import torch
import PIL.Image
from diffusers import AutoencoderKLWan, WanVACEPipeline, WanVACETransformer3DModel
from diffusers.schedulers.scheduling_unipc_multistep import UniPCMultistepScheduler
from diffusers.utils import export_to_video, load_image, load_video
from diffusers import GGUFQuantizationConfig

model_id = "a-r-r-o-w/Wan-VACE-1.3B-diffusers"
transformer_path = "https://huggingface.co/newgenai79/Wan-VACE-1.3B-diffusers-gguf/blob/main/Wan-VACE-1.3B-diffusers-Q8_0.gguf"
transformer_gguf = WanVACETransformer3DModel.from_single_file(
    transformer_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
    config=model_id,
    subfolder="transformer",
)
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanVACEPipeline.from_pretrained(
    model_id,
    transformer=transformer_gguf,
    vae=vae, 
    torch_dtype=torch.bfloat16
)
flow_shift = 3.0  # 5.0 for 720P, 3.0 for 480P
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=flow_shift)
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()


prompt = "A sleek, humanoid robot stands in a vast warehouse filled with neatly stacked cardboard boxes on industrial shelves. The robot's metallic body gleams under the bright, even lighting, highlighting its futuristic design and intricate joints. A glowing blue light emanates from its chest, adding a touch of advanced technology. The background is dominated by rows of boxes, suggesting a highly organized storage system. The floor is lined with wooden pallets, enhancing the industrial setting. The camera remains static, capturing the robot's poised stance amidst the orderly environment, with a shallow depth of field that keeps the focus on the robot while subtly blurring the background for a cinematic effect."
negative_prompt = "Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards"

output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=480,
    num_frames=81,
    num_inference_steps=30,
    guidance_scale=5.0,
    conditioning_scale=0.0,
    generator=torch.Generator().manual_seed(0),
).frames[0]
export_to_video(output, "output.mp4", fps=16)


(sddw-dev) C:\aiOWN\diffuser_webui>python WanVace_GGUF.py
W0610 22:13:36.639000 21016 site-packages\torch\distributed\elastic\multiprocessing\redirects.py:29] NOTE: Redirects are currently not supported in Windows or MacOs.
Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\WanVace_GGUF.py", line 11, in <module>
    transformer_gguf = WanVACETransformer3DModel.from_single_file(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nitin\miniconda3\envs\sddw-dev\Lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nitin\miniconda3\envs\sddw-dev\Lib\site-packages\diffusers\loaders\single_file_model.py", line 235, in from_single_file
    raise ValueError(
ValueError: FromOriginalModelMixin is currently only compatible with StableCascadeUNet, UNet2DConditionModel, AutoencoderKL, ControlNetModel, SD3Transformer2DModel, MotionAdapter, SparseControlNetModel, FluxTransformer2DModel, LTXVideoTransformer3DModel, AutoencoderKLLTXVideo, AutoencoderDC, MochiTransformer3DModel, HunyuanVideoTransformer3DModel, AuraFlowTransformer2DModel, Lumina2Transformer2DModel, SanaTransformer2DModel, WanTransformer3DModel, AutoencoderKLWan, HiDreamImageTransformer2DModel
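
For anyone hitting the same error: the message itself lists the classes that from_single_file() accepted in that build, so the class can be checked against it before wiring up a pipeline. The sketch below is a hypothetical helper (supported_classes_from_error is not part of diffusers) that parses the list out of the message text. The message used here is a shortened copy of the one in the traceback above.

```python
def supported_classes_from_error(message: str) -> list[str]:
    """Extract the class names listed after 'only compatible with' in the error text."""
    marker = "only compatible with "
    _, found, tail = message.partition(marker)
    if not found:
        # The message does not have the expected shape; report nothing.
        return []
    return [name.strip() for name in tail.split(",") if name.strip()]


# Shortened copy of the ValueError text from the traceback above.
message = (
    "FromOriginalModelMixin is currently only compatible with "
    "FluxTransformer2DModel, WanTransformer3DModel, AutoencoderKLWan"
)
supported = supported_classes_from_error(message)

print("WanVACETransformer3DModel" in supported)  # False: not in the list
print("WanTransformer3DModel" in supported)      # True: plain Wan is listed
```

WanVACETransformer3DModel is absent from the list in the traceback, which is consistent with the failure reported here.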

nitinmukesh · 2025-06-19T19:05:06Z

@sayakpaul

Any suggestions on the above issue, please?

sayakpaul · 2025-06-20T01:36:35Z

I am not sure this PR supports Wan yet.

DN6 · 2025-06-27T10:19:57Z

Would be better to add a utility function

def _should_convert_state_dict_to_diffusers(model_state_dict, checkpoint_state_dict):
    # Convert only when the checkpoint is missing keys that the model expects.
    return not set(model_state_dict.keys()).issubset(set(checkpoint_state_dict.keys()))

to single_file_model.py.

If the condition passes, convert the checkpoint with this line:

diffusers/src/diffusers/loaders/single_file_model.py

Line 375 in d7dd924

diffusers_format_checkpoint = checkpoint_mapping_fn(

If not, set diffusers_format_checkpoint to the current checkpoint.

sayakpaul · 2025-06-27T10:24:28Z

@DN6 I think we discussed that in the conversion we will embed metadata to the GGUF file. This is now supported: ngxson/diffusion-to-gguf#3. Would you be able to make changes to this PR to see if that works?

sayakpaul · 2025-07-16T07:20:47Z

@DN6 if you have time to ^.

DN6 · 2025-08-06T15:45:21Z

@sayakpaul This should be good to merge. Would work well for QwenImage (need to add single file mixin to the model) if you want to give it a try.

sayakpaul · 2025-08-06T15:49:21Z

Let's give this a try and merge :)

sayakpaul · 2025-08-07T06:23:01Z

@DN6 I have pushed a small change to support Qwen. Could you check that?

It seems to be working with a Q4_0 GGUF checkpoint I obtained.

Code:

from diffusers import QwenImageTransformer2DModel, GGUFQuantizationConfig, DiffusionPipeline
import torch 

ckpt_id = "Qwen/Qwen-Image"
transformer = QwenImageTransformer2DModel.from_single_file(
    "qwen-Q4_0.gguf", 
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
    config=ckpt_id,
    subfolder="transformer",
)
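# Pass the GGUF-quantized transformer in; otherwise the pipeline loads its default one.
pipe = DiffusionPipeline.from_pretrained(
    ckpt_id, transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")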
prompt = "stock photo of two people, a man and a woman, wearing lab coats writing on a white board with markers, the white board has text that reads 'The Diffusers library by Hugging Face makes it easy for developers to run image generation and inference using state-of-the-art diffusion models with just a few lines of code' with sloppy writing and traces clearly made by a human. The photo is taken from the side and has depth of field so some parts of the board looks blurred giving it a more professional look"

image = pipe(
    prompt=prompt,
    negative_prompt="negative_prompt",
    width=1024,
    height=1024,
    num_inference_steps=25,
    true_cfg_scale=4.0,
    generator=torch.manual_seed(0),
).images[0]
image.save("gguf_qwen.png")

Result:

feat: support loading diffusers format gguf checkpoints.

e5ca3a6

sayakpaul requested a review from DN6 June 10, 2025 03:59

sayakpaul mentioned this pull request Jun 10, 2025

feat: support diffusers ckpt. ngxson/diffusion-to-gguf#1

Merged

DN6 added 3 commits August 5, 2025 22:48

update

4ced879

Merge branch 'sf-do-not-convert' into support-diffusers-ckpt-gguf

bf1ac4a

update

3f67ed0

Merge branch 'main' into support-diffusers-ckpt-gguf

8963162

sayakpaul added 4 commits August 7, 2025 08:06

Merge branch 'main' into support-diffusers-ckpt-gguf

ace1d4a

Merge branch 'main' into support-diffusers-ckpt-gguf

0dd7817

qwen

a85f597

Merge branch 'main' into support-diffusers-ckpt-gguf

410ea44

Merge branch 'main' into support-diffusers-ckpt-gguf

1c788a9

sayakpaul mentioned this pull request Aug 7, 2025

[docs] diffusers gguf checkpoints #12092

Merged

Merge branch 'main' into support-diffusers-ckpt-gguf

d10207e

DN6 approved these changes Aug 8, 2025

DN6 merged commit f20aba3 into main Aug 8, 2025
14 of 15 checks passed

DN6 mentioned this pull request Aug 8, 2025

Qwen image transformers doesn't currently support from_single_file (i.e. GGUFs) #12098

Closed


5 participants
