Commit 6a65d69

Update README.md (#2665)

* Update README.md
* Update version.py

1 parent: f4608ab

File tree: 2 files changed (+3, -3 lines)

README.md (2 additions, 2 deletions)

@@ -20,7 +20,7 @@
 </p>

 ## Latest News
-* 04/03/2026 [6.0.2](https://github.com/ModelCloud/GPTQModel/releases/tag/v6.0.2): 🎉 New quantization methods: `ParoQuant`, `GGUF`, `FP8`, `EXL3`, and `FOEM: First-Order Error Matters`. Added PrismML/Bonsai 1bit model quantization (inference only), faster ParoQuant/AWQ kernels, ParoQuant `optimization scope` control: `module` (Paro Lite) or `layer` (Paro reference), plus `Gemma4`, `MiniCPM-O`, `MiniCPM-V`, and `GLM4 MOE lite` model support.
+* 04/03/2026 [6.0.3](https://github.com/ModelCloud/GPTQModel/releases/tag/v6.0.3): 🎉 New quantization methods: `ParoQuant`, `GGUF`, `FP8`, `EXL3`, and `FOEM: First-Order Error Matters`. Added PrismML/Bonsai 1bit model quantization (inference only), faster ParoQuant/AWQ kernels, ParoQuant `optimization scope` control: `module` (Paro Lite) or `layer` (Paro reference), plus `Gemma4`, `MiniCPM-O`, `MiniCPM-V`, and `GLM4 MOE lite` model support.
 * 03/19/2026 [5.8.0](https://github.com/ModelCloud/GPTQModel/releases/tag/v5.8.0): ✨HF Transformers 5.3.0 support with auto-defusing of `fused` models via pypi pkg: [Defuser](https://github.com/ModelCloud/Defuser). Qwen 3.5 family support added. New fast HF `cpu` kernels for GPTQ/AWQ added. Experimental INT8 `cpu` kernel added for GPTQ.
 * 03/09/2026 [main]: ✨Qwen 3.5 MoE model support added. New HF Kernel support added for AWQ.
 HF Kernel for both gptq/awq are now used by default for cpu devices for best performance. New INT8 kernel ported from Intel for gptq.

@@ -254,7 +254,7 @@ Selected public references where teams or companies explicitly mention `GPTQMode
 | Apertus || EXAONE 3/4 || Dots1 || Mistral3 || Qwen 2/3 (Next/MoE) ||
 | Baichuan || Falcon (H1) || InternLM 1/2.5 || Mixtral || Qwen 2/2.5/3 VL ||
 | Bloom || FastVLM || Kimi K2 || MobileLLM || Qwen 2.5/3 Omni ||
-| ChatGLM || Gemma 1/2/3 || Klear || MOSS || RefinedWeb ||
+| ChatGLM || Gemma 1-4 || Klear || MOSS || RefinedWeb ||
 | CodeGen || GPTBigCod || LING/RING || MPT || StableLM ||
 | Cohere 1-2 || GPTQ-Neo(X) || Llama 1-3.3 || Nemotron H || StarCoder2 ||
 | DBRX Converted || GPT-2 || Llama 3.2 VL || Nemotron Ultra || TeleChat2 ||

gptqmodel/version.py (1 addition, 1 deletion)

@@ -7,4 +7,4 @@
 # even minor versions are release
 # 5.2.0 => release, 5.1.0 => devel
 # micro version (5.2.x) denotes patch fix, i.e. 5.2.1 is a patch fix release
-__version__ = "6.0.2"
+__version__ = "6.0.3"
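The comments in `version.py` describe GPTQModel's release convention: an even minor version is a release, an odd minor version is a devel build, and a nonzero micro version marks a patch-fix release. A minimal sketch of that convention follows; the `channel` helper is hypothetical, written here for illustration, and is not part of the `gptqmodel` package:

```python
def channel(version: str) -> str:
    """Classify a version string by GPTQModel's even/odd-minor convention.

    even minor  -> release (micro > 0 means a patch-fix release)
    odd minor   -> devel
    """
    major, minor, micro = (int(part) for part in version.split("."))
    if minor % 2 != 0:
        return "devel"
    return "patch release" if micro > 0 else "release"


# The version set by this commit is a patch fix on the 6.0 release line.
print(channel("6.0.3"))  # patch release
print(channel("5.2.0"))  # release
print(channel("5.1.0"))  # devel
```

Under this reading, bumping `__version__` from "6.0.2" to "6.0.3" stays within the same release line and only signals a patch fix.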

0 commit comments