Running Qwen1.5-72B-Chat-GPTQ-Int4 through the transformers package is much slower than running Qwen1.5-72B-Chat. The quantized model needs to be loaded with auto_gptq instead (see the sketch below). Inference performance numbers: https://github.com/QwenLM/Qwen/blob/main/README_CN.md#%E6%8E%A8%E7%90%86%E6%80%A7%E8%83%BD
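
For reference, a minimal sketch of loading the GPTQ checkpoint through auto_gptq rather than plain transformers. The model ID comes from the report; the device placement and generation settings are illustrative assumptions, not a confirmed fix.

```python
# Sketch: load the GPTQ-quantized model with auto_gptq's from_quantized,
# which uses the GPTQ kernels instead of transformers' default loading path.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "Qwen/Qwen1.5-72B-Chat-GPTQ-Int4"  # model from the report

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    device="cuda:0",        # assumption: single-GPU placement
    use_safetensors=True,
)

# Illustrative generation call to verify the loaded model works.
prompt = "Hello, who are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```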