
Request for 5.0bpw and 6.0bpw quantization of GLM-4.5V-exl3: vision model performance is sensitive to quantization #109

@blackcat1402

Description


https://huggingface.co/turboderp/GLM-4.5V-exl3
I compared different quantization levels one by one and found that this vision model is more sensitive to quantization than the non-vision GLM-4.5 Air. Even at 4.0bpw, GLM-4.5V's text-based coding quality is much lower than that of GLM-4.5 Air. I suspect the addition of vision capability reduced its text output performance. However, the highest bitrate currently available is 4.0bpw. @turboderp, could you provide 5.0bpw and 6.0bpw exl3 versions so that I can find the "breakeven" point at which GLM-4.5V becomes comparable to GLM-4.5 Air? Thanks in advance.
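In case it helps while waiting, here is a minimal sketch of how one might produce the higher-bpw quants locally with exllamav3's conversion script. The paths are hypothetical, and the flag names (-i, -o, -w, -b) are assumed to follow the convert.py interface described in the exllamav3 README; please check the repo for the exact arguments before running.

```python
# Sketch: locally quantizing GLM-4.5V to the requested bitrates with exllamav3.
# All paths below are hypothetical; the convert.py flags are assumed from the
# exllamav3 README and should be verified against the current repo.
import subprocess
from pathlib import Path

SOURCE = Path("/models/GLM-4.5V")      # hypothetical path to the original BF16 weights
WORK = Path("/tmp/exl3-work")          # hypothetical scratch directory for conversion
TARGET_BPW = [5.0, 6.0]                # the bitrates requested in this issue

for bpw in TARGET_BPW:
    out_dir = Path(f"/models/GLM-4.5V-exl3-{bpw}bpw")  # hypothetical output location
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumed invocation of exllamav3's convert.py; confirm flag names in the README.
    subprocess.run(
        [
            "python", "convert.py",
            "-i", str(SOURCE),
            "-o", str(out_dir),
            "-w", str(WORK),
            "-b", str(bpw),
        ],
        check=True,
    )
```

Running the same text-only coding prompts against each resulting quant (and against GLM-4.5 Air at the same bitrate) would then give a rough breakeven comparison.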
