Commit 6a65d69

Update README.md (#2665)

* Update README.md
* Update version.py

1 parent: f4608ab

File tree: 2 files changed (+3, -3 lines)

README.md (2 additions, 2 deletions)

@@ -20,7 +20,7 @@
 </p>

 ## Latest News
-* 04/03/2026 [6.0.2](https://github.com/ModelCloud/GPTQModel/releases/tag/v6.0.2): 🎉 New quantization methods: `ParoQuant`, `GGUF`, `FP8`, `EXL3`, and `FOEM: First-Order Error Matters`. Added PrismML/Bonsai 1bit model quantization (inference only), faster ParoQuant/AWQ kernels, ParoQuant `optimization scope` control: `module` (Paro Lite) or `layer` (Paro reference), plus `Gemma4`, `MiniCPM-O`, `MiniCPM-V`, and `GLM4 MOE lite` model support.
+* 04/03/2026 [6.0.3](https://github.com/ModelCloud/GPTQModel/releases/tag/v6.0.3): 🎉 New quantization methods: `ParoQuant`, `GGUF`, `FP8`, `EXL3`, and `FOEM: First-Order Error Matters`. Added PrismML/Bonsai 1bit model quantization (inference only), faster ParoQuant/AWQ kernels, ParoQuant `optimization scope` control: `module` (Paro Lite) or `layer` (Paro reference), plus `Gemma4`, `MiniCPM-O`, `MiniCPM-V`, and `GLM4 MOE lite` model support.
 * 03/19/2026 [5.8.0](https://github.com/ModelCloud/GPTQModel/releases/tag/v5.8.0): ✨HF Transformers 5.3.0 support with auto-defusing of `fused` models via pypi pkg: [Defuser](https://github.com/ModelCloud/Defuser). Qwen 3.5 family support added. New fast HF `cpu` kernels for GPTQ/AWQ added. Experimental INT8 `cpu` kernel added for GPTQ.
 * 03/09/2026 [main]: ✨Qwen 3.5 MoE model support added. New HF Kernel support added for AWQ.
 HF Kernel for both gptq/awq are now used by default for cpu devices for best performance. New INT8 kernel ported from Intel for gptq.

@@ -254,7 +254,7 @@ Selected public references where teams or companies explicitly mention `GPTQMode
 | Apertus || EXAONE 3/4 || Dots1 || Mistral3 || Qwen 2/3 (Next/MoE) ||
 | Baichuan || Falcon (H1) || InternLM 1/2.5 || Mixtral || Qwen 2/2.5/3 VL ||
 | Bloom || FastVLM || Kimi K2 || MobileLLM || Qwen 2.5/3 Omni ||
-| ChatGLM || Gemma 1/2/3 || Klear || MOSS || RefinedWeb ||
+| ChatGLM || Gemma 1-4 || Klear || MOSS || RefinedWeb ||
 | CodeGen || GPTBigCod || LING/RING || MPT || StableLM ||
 | Cohere 1-2 || GPTQ-Neo(X) || Llama 1-3.3 || Nemotron H || StarCoder2 ||
 | DBRX Converted || GPT-2 || Llama 3.2 VL || Nemotron Ultra || TeleChat2 ||

gptqmodel/version.py (1 addition, 1 deletion)

@@ -7,4 +7,4 @@
 # even minor versions are release
 # 5.2.0 => release, 5.1.0 => devel
 # micro version (5.2.x) denotes patch fix, i.e. 5.2.1 is a patch fix release
-__version__ = "6.0.2"
+__version__ = "6.0.3"
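The comments in `version.py` describe GPTQModel's release convention: an even minor version is a release, an odd minor version is a devel build, and a nonzero micro version marks a patch-fix release. A minimal sketch of that convention follows; the `channel` helper is hypothetical, written here for illustration, and is not part of the `gptqmodel` package:

```python
def channel(version: str) -> str:
    """Classify a version string by GPTQModel's even/odd-minor convention.

    even minor  -> release (micro > 0 means a patch-fix release)
    odd minor   -> devel
    """
    major, minor, micro = (int(part) for part in version.split("."))
    if minor % 2 != 0:
        return "devel"
    return "patch release" if micro > 0 else "release"


# The version set by this commit is a patch fix on the 6.0 release line.
print(channel("6.0.3"))  # patch release
print(channel("5.2.0"))  # release
print(channel("5.1.0"))  # devel
```

Under this reading, bumping `__version__` from "6.0.2" to "6.0.3" stays within the same release line and only signals a patch fix.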

0 commit comments