Request for 5.0bpw and 6.0bpw quantizations of GLM-4.5V-exl3: vision model performance is sensitive to quantization #109

https://huggingface.co/turboderp/GLM-4.5V-exl3
I compared the available quantization levels one by one and found that this vision model is more sensitive to quantization than the non-vision GLM4.5 Air. Even at 4.0bpw, the text-based coding quality of GLM4.5V is much lower than that of GLM4.5 Air. I suspect that adding the vision capability reduced its text output performance. However, the highest available quantization is 4.0bpw. @turboderp, could you provide 5.0bpw and 6.0bpw exl3 versions so that I can find the "breakeven" point at which GLM4.5V becomes comparable to GLM4.5 Air? Thanks in advance.

