
Conversation

@calcuis
Contributor

@calcuis calcuis commented Aug 3, 2025

Not perfect, but it works; adds support for more GGUF quant types.

engine:
https://github.com/calcuis/gguf-connector/blob/main/src/gguf_connector/quant2c.py

inference example(s):
https://github.com/calcuis/gguf-connector/blob/main/src/gguf_connector/k6.py https://github.com/calcuis/gguf-connector/blob/main/src/gguf_connector/k5.py

gguf file sample(s):
https://huggingface.co/calcuis/kontext-gguf/tree/main https://huggingface.co/calcuis/krea-gguf/tree/main
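For context on what dequantizing one of these extra quant types involves: IQ4_NL stores weights in blocks of 32 with a shared per-block scale and a fixed non-linear 4-bit codebook. The following is a minimal NumPy sketch, not the PR's actual code (see quant2c.py above for that); the codebook values are the ones used in llama.cpp's ggml-quants implementation.

```python
import numpy as np

# Fixed non-linear 4-bit codebook used by IQ4_NL (values as in llama.cpp's
# ggml-quants.c). This sketch is illustrative, not the engine's exact code.
KVALUES_IQ4NL = np.array(
    [-127, -104, -83, -65, -49, -35, -22, -10,
     1, 13, 25, 38, 53, 69, 89, 113], dtype=np.float32)

BLOCK_SIZE = 32  # weights per IQ4_NL block


def dequantize_iq4_nl(scales: np.ndarray, qs: np.ndarray) -> np.ndarray:
    """Dequantize IQ4_NL blocks.

    scales: (n_blocks,) per-block scale `d` (fp16 in the file, fp32 here)
    qs:     (n_blocks, 16) uint8, two 4-bit codebook indices packed per byte
    """
    lo = qs & 0x0F   # low nibbles -> weights 0..15 of each block
    hi = qs >> 4     # high nibbles -> weights 16..31 of each block
    idx = np.concatenate([lo, hi], axis=1)       # (n_blocks, 32) indices
    return scales[:, None] * KVALUES_IQ4NL[idx]  # codebook lookup, then rescale


# Tiny demo: one block, scale 0.5, every nibble pointing at codebook entry 8 (+1)
demo = dequantize_iq4_nl(np.array([0.5], np.float32),
                         np.full((1, 16), 0x88, np.uint8))
```

IQ4_XS follows the same idea but groups blocks into 256-weight super-blocks with packed sub-block scales, so its unpacking logic is a bit more involved.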

[Screenshots: 2025-08-02 224103, 2025-08-02 224124]

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you read our philosophy doc (important for complex PRs)?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.

calcuis/gguf-connector#3

You can simply test it with the inference example(s) above or the code below:

import torch
from transformers import T5EncoderModel
from diffusers import FluxPipeline, GGUFQuantizationConfig, FluxTransformer2DModel

model_path = "https://huggingface.co/calcuis/krea-gguf/blob/main/flux1-krea-dev-iq4_nl.gguf"
transformer = FluxTransformer2DModel.from_single_file(
    model_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
    config="callgg/krea-decoder",
    subfolder="transformer"
)

text_encoder = T5EncoderModel.from_pretrained(
    "chatpig/t5-v1_1-xxl-encoder-fp32-gguf",
    gguf_file="t5xxl-encoder-fp32-q2_k.gguf",
    torch_dtype=torch.bfloat16
)

pipe = FluxPipeline.from_pretrained(
    "callgg/krea-decoder",
    transformer=transformer,
    text_encoder_2=text_encoder,
    torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # or pipe.to("cuda") if you have enough GPU memory

prompt = "a pig holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=2.5,
).images[0]
image.save("output.png")

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@calcuis calcuis closed this Aug 5, 2025
@calcuis calcuis deleted the patch-1 branch August 5, 2025 20:00
@a-r-r-o-w
Contributor

Hi @calcuis, any reason for closing this? I think it might be super cool to support (although that's for @DN6 to decide as gguf codeowner)

@calcuis
Contributor Author

calcuis commented Aug 5, 2025

Hi @calcuis, any reason for closing this? I think it might be super cool to support (although that's for @DN6 to decide as gguf codeowner)

hi @a-r-r-o-w, oh, sorry; reopened it. Still working on the other parts; iq4_nl and iq4_xs should be robust enough for your team to test now.

@calcuis
Contributor Author

calcuis commented Aug 5, 2025

thanks
