diff --git a/README.md b/README.md
index 5be32b5..9ead8c8 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,7 @@ This repo collects papers, documents, and codes about model quantization for any
 - [Survey\_of\_Binarization](#survey_of_binarization)
 - [Survey\_of\_Quantization](#survey_of_quantization)
 - [Papers](#papers)
+  - [2025](#2025)
   - [2024](#2024)
   - [2023](#2023)
   - [2022](#2022)
@@ -109,6 +110,10 @@ Amir Gholami\* , Sehoon Kim\* , Zhen Dong\* , Zhewei Yao\* , Michael W. Mahoney,
 
 ----
 
+### 2025
+
+- [[ICML](https://arxiv.org/abs/2505.07004)] GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance [[code](https://github.com/snu-mllab/GuidedQuant)]![GitHub Repo stars](https://img.shields.io/github/stars/snu-mllab/GuidedQuant)
+
 ### 2024
 
 - [[TMLR](https://openreview.net/pdf?id=IEKtMMSblm)] PLUM: Improving Inference Efficiency By Leveraging Repetition-Sparsity Trade-Off [[code](https://github.com/sachitkuhar/PLUM)][[webpage](https://github.com/sachitkuhar/PLUM)][[video](https://www.youtube.com/watch?v=nE_CYDWqQ_I)][**`bnn`**] [**`inference`**]