https://huggingface.co/turboderp/GLM-4.5V-exl3
I compared the different quantization levels one by one and found that this vision model is more sensitive to quantization than the non-vision GLM4.5 Air. Even at 4.0 bpw, GLM4.5V's text-based coding quality is much lower than that of GLM4.5 Air. I suspect that adding the vision capability reduced its text output performance. However, the highest bitrate currently available is 4.0 bpw. @turboderp, could you provide 5.0 bpw and 6.0 bpw exl3 versions so that I can find the "breakeven" point at which GLM4.5V becomes comparable to GLM4.5 Air? Thanks in advance.