
Conversation

@calcuis
Contributor

@calcuis calcuis commented Aug 3, 2025

Not perfect, but it works; adds support for more GGUF quant types.

engine:
https://github.com/calcuis/gguf-connector/blob/main/src/gguf_connector/quant2c.py

inference example(s):
https://github.com/calcuis/gguf-connector/blob/main/src/gguf_connector/k6.py https://github.com/calcuis/gguf-connector/blob/main/src/gguf_connector/k5.py

gguf file sample(s):
https://huggingface.co/calcuis/kontext-gguf/tree/main https://huggingface.co/calcuis/krea-gguf/tree/main
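For context on what dequantizing one of these extra quant types involves: IQ4_NL stores weights in blocks of 32 with a shared per-block scale and a fixed non-linear 4-bit codebook. The following is a minimal NumPy sketch, not the PR's actual code (see quant2c.py above for that); the codebook values are the ones used in llama.cpp's ggml-quants implementation.

```python
import numpy as np

# Fixed non-linear 4-bit codebook used by IQ4_NL (values as in llama.cpp's
# ggml-quants.c). This sketch is illustrative, not the engine's exact code.
KVALUES_IQ4NL = np.array(
    [-127, -104, -83, -65, -49, -35, -22, -10,
     1, 13, 25, 38, 53, 69, 89, 113], dtype=np.float32)

BLOCK_SIZE = 32  # weights per IQ4_NL block


def dequantize_iq4_nl(scales: np.ndarray, qs: np.ndarray) -> np.ndarray:
    """Dequantize IQ4_NL blocks.

    scales: (n_blocks,) per-block scale `d` (fp16 in the file, fp32 here)
    qs:     (n_blocks, 16) uint8, two 4-bit codebook indices packed per byte
    """
    lo = qs & 0x0F   # low nibbles -> weights 0..15 of each block
    hi = qs >> 4     # high nibbles -> weights 16..31 of each block
    idx = np.concatenate([lo, hi], axis=1)       # (n_blocks, 32) indices
    return scales[:, None] * KVALUES_IQ4NL[idx]  # codebook lookup, then rescale


# Tiny demo: one block, scale 0.5, every nibble pointing at codebook entry 8 (+1)
demo = dequantize_iq4_nl(np.array([0.5], np.float32),
                         np.full((1, 16), 0x88, np.uint8))
```

IQ4_XS follows the same idea but groups blocks into 256-weight super-blocks with packed sub-block scales, so its unpacking logic is a bit more involved.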

[Screenshots: 2025-08-02 224103, 2025-08-02 224124]

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you read our philosophy doc (important for complex PRs)?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.

calcuis/gguf-connector#3

You can simply test it with the inference example(s) above or the code below:

import torch
from transformers import T5EncoderModel
from diffusers import FluxPipeline, GGUFQuantizationConfig, FluxTransformer2DModel

model_path = "https://huggingface.co/calcuis/krea-gguf/blob/main/flux1-krea-dev-iq4_nl.gguf"
transformer = FluxTransformer2DModel.from_single_file(
    model_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
    config="callgg/krea-decoder",
    subfolder="transformer"
)

text_encoder = T5EncoderModel.from_pretrained(
    "chatpig/t5-v1_1-xxl-encoder-fp32-gguf",
    gguf_file="t5xxl-encoder-fp32-q2_k.gguf",
    torch_dtype=torch.bfloat16
)

pipe = FluxPipeline.from_pretrained(
    "callgg/krea-decoder",
    transformer=transformer,
    text_encoder_2=text_encoder,
    torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # or pipe.to("cuda") if you have enough GPU memory

prompt = "a pig holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=2.5,
).images[0]
image.save("output.png")

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@calcuis calcuis closed this Aug 5, 2025
@calcuis calcuis deleted the patch-1 branch August 5, 2025 20:00
@a-r-r-o-w
Contributor

Hi @calcuis, any reason for closing this? I think it might be super cool to support (although that's for @DN6 to decide as gguf codeowner)

@calcuis
Contributor Author

calcuis commented Aug 5, 2025

Hi @calcuis, any reason for closing this? I think it might be super cool to support (although that's for @DN6 to decide as gguf codeowner)

hi @a-r-r-o-w, oh, sorry; reopened it. Still working on the other parts; iq4_nl and iq4_xs should be robust enough for your team to test now.

@calcuis
Contributor Author

calcuis commented Aug 5, 2025

thanks
