Support quantized models, for example https://huggingface.co/THUDM/chatglm2-6b-int4 and https://huggingface.co/Qwen/Qwen1.5-72B-Chat-GPTQ-Int4