Support q8_0 quantization for image model loading #1692
Conversation
Force-pushed from 48fe5b4 to bc6a440
Force-pushed from bc6a440 to 431851b
Force-pushed from 431851b to 3be29ab
Please unmark from draft when ready for review.
Also tested other quants. q6_K could be nice to have: loading time seems more or less the same as q4_0, and it makes a lot of difference for SDXL. With the current code, it's just a matter of adding the types and saving values to the lists; but I'm not sure if it wouldn't clutter the interface.
It would definitely be kind of clutter. For power users who want a specific quant, it's far better for them to do the pre-quantization themselves and then load the GGUF. This option is more for those who want to keep safetensors and have a convenient option. Additionally, k-quants present some challenges in that they have a block size requirement (128) that's larger than a classic quant (32) and can fail on some architectures. So if you had to put 3 quants, I'd rather they be q4_0, q5_1 and q8_0. I think 3 quant types are the max we should have for this dynamic quant option.
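As an illustration of that block-size constraint, here is a minimal sketch of how a requested quant type could be gated on tensor width; the function name and exact block sizes are illustrative assumptions, not the project's actual code:

```python
# Hypothetical sketch: gate a requested quant type on the tensor's row width and
# fall back to a safer type when the rows don't divide evenly into blocks.
# Block sizes follow the discussion above (32 for classic quants, larger for
# k-quants); the exact values used by the backend may differ.

BLOCK_SIZE = {
    "q4_0": 32,
    "q5_1": 32,
    "q8_0": 32,
    "q6_K": 128,  # k-quants need a larger block, per the comment above
}

def pick_quant_type(requested, row_width, fallback="q8_0"):
    """Return `requested` if every row divides evenly into its blocks, else `fallback`."""
    block = BLOCK_SIZE.get(requested)
    if block is not None and row_width % block == 0:
        return requested
    return fallback

print(pick_quant_type("q6_K", 320))   # 320 % 128 != 0 -> falls back to q8_0
print(pick_quant_type("q6_K", 1280))  # 1280 % 128 == 0 -> q6_K is allowed
```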
I agree; I do that myself, especially to be able to apply different quants to different parts of the model. It's just q8_0 that's really convenient to have.
Ah, that's unfortunate.
I don't know if it's just my CPU showing its age, but the _1 conversions seem too slow to be used on-demand.
So maybe it would be better to leave it with just q8_0 and q4_0 for now, and see if other users have a need for an additional option.
Force-pushed from 2a4fd75 to c0e4404
Just squashed the help and cleanup changes, and rebased. Should be good to go, unless you prefer adding that extra quant.
Just q8_0 should be fine. I'll review later.
Yeah, the optional parameter seems sensible. I'll give it a try. I also see a benefit in keeping the parameter different from a quantization name: being able to change its inner workings without changing its purpose, in the same way we don't care, say, how exactly a
But since the '1' in current .kcpps files means 'q4' right now, it may be challenging to switch it to mean 'q8' without some other change to differentiate new and old files. I only managed to keep its meaning in the new code by never writing that value back to the new files.
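A rough sketch of the backward-compatibility idea described above; the key names (`sd_quant`, `sd_quant_mode`) are hypothetical stand-ins for the real .kcpps fields, and the mapping only illustrates the "old value keeps its old meaning, new files use a new field" approach:

```python
# Hypothetical sketch: resolve the image-model quant setting from a loaded
# .kcpps-style dict without breaking old save files. Key names are made up here.

def resolve_sd_quant(cfg):
    """Map a config dict to a quant name, honoring files written by older versions."""
    if "sd_quant_mode" in cfg:                    # hypothetical new key: 0=off, 1=q8_0, 2=q4_0
        return {0: "off", 1: "q8_0", 2: "q4_0"}.get(cfg["sd_quant_mode"], "off")
    if cfg.get("sd_quant"):                       # legacy key: truthy value always meant q4_0
        return "q4_0"
    return "off"

print(resolve_sd_quant({"sd_quant": 1}))        # old file -> q4_0 (meaning preserved)
print(resolve_sd_quant({"sd_quant_mode": 1}))   # new file -> q8_0
```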
No, I didn't test this, but it might work. I don't really need a fully functional version, just an example.
Force-pushed from c0e4404 to 9450531
Done; please take a look. It seemed a bit unhelpful to add a "0=off,1=q8, 2=q4" to the tooltip, so I abused the dropdown strings to add that information. But I'm sure there's a less hackish way to do that... |
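A minimal sketch of the "embed the values in the dropdown strings" idea; the actual labels and widget code in the project differ, this only shows how the numeric value can be recovered from the label:

```python
# Hypothetical labels that carry the numeric value passed to the backend,
# so the tooltip doesn't need to spell out the 0/1/2 mapping separately.
SD_QUANT_CHOICES = ["0 - Off", "1 - Quantize to q8_0", "2 - Quantize to q4_0"]

def label_to_value(label):
    """Recover the numeric setting from the selected dropdown label."""
    return int(label.split(" ", 1)[0])

assert [label_to_value(s) for s in SD_QUANT_CHOICES] == [0, 1, 2]
```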
Minor adjustments for sdquant: allow the backend to do the translation for the type more defensively, and adjust the UI dropdown for clarity.
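A hedged sketch of what "translating the type more defensively" might look like on the backend side; the real code uses its own types and defaults, this only illustrates degrading unknown values to "off" instead of erroring:

```python
# Hypothetical sketch: translate the UI/CLI value into a quant name,
# treating anything unexpected as "no quantization" rather than failing.

def translate_sd_quant(value):
    try:
        value = int(value)
    except (TypeError, ValueError):
        return None                            # not a number -> quantization off
    return {1: "q8_0", 2: "q4_0"}.get(value)   # 0 or anything unknown -> off (None)

print(translate_sd_quant("1"))   # -> q8_0
print(translate_sd_quant(99))    # -> None (off)
print(translate_sd_quant(None))  # -> None (off)
```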
Looks good to me. IMHO, it'd be a little less confusing if the values in the interface matched the values on the command line, but no strong feelings about it :-) |


q4_0 may degrade quality significantly, especially for smaller models like SD 1.5 and SDXL. q8_0 provides a middle-ground, giving half the memory savings of q4_0 but with less loading time and quality loss.
Depends on #1678 for the combobox auxiliary function.