Commit 7994773 (1 parent: 2e608c2)

🔥[ABQ-LLM] Arbitrary-Bit Quantized Inference Acceleration for Large Language Models (#37)

File tree: 1 file changed (+1 / −0 lines)


README.md

Lines changed: 1 addition & 0 deletions
@@ -165,6 +165,7 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
 |2024.05|🔥[I-LLM] I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models(@Houmo AI)|[[pdf]](https://arxiv.org/pdf/2405.17849)|⚠️|⭐️ |
 |2024.06|🔥[OutlierTune] OutlierTune: Efficient Channel-Wise Quantization for Large Language Models(@Beijing University)|[[pdf]](https://arxiv.org/pdf/2406.18832)|⚠️|⭐️ |
 |2024.06|🔥[GPTQT] GPTQT: Quantize Large Language Models Twice to Push the Efficiency(@zju)|[[pdf]](https://arxiv.org/pdf/2407.02891)|⚠️|⭐️ |
+|2024.08|🔥[ABQ-LLM] ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models(@ByteDance)|[[pdf]](https://arxiv.org/pdf/2408.08554)|⚠️|⭐️ |
 
 
 ### 📖IO/FLOPs-Aware/Sparse Attention ([©️back👆🏻](#paperlist))
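The added row points to ABQ-LLM, which targets inference with weights quantized to an arbitrary bit-width. As a rough, hypothetical illustration of what b-bit quantization means in general (a generic symmetric round-to-nearest sketch, not the scheme proposed in the ABQ-LLM paper):

```python
def quantize_dequantize(weights, bits):
    """Toy symmetric round-to-nearest quantization to `bits` bits.

    A generic sketch of low-bit weight quantization; the actual
    ABQ-LLM method is more sophisticated than this.
    """
    qmax = 2 ** (bits - 1) - 1              # e.g. 7 for signed 4-bit
    scale = max(abs(w) for w in weights) / qmax   # per-tensor scale
    # Round each weight to the nearest representable integer level,
    # clamping to the signed b-bit range.
    quantized = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return [q * scale for q in quantized]   # dequantize back to float

weights = [0.9, -0.45, 0.1, -0.02]
for b in (2, 4, 8):
    approx = quantize_dequantize(weights, b)
    err = max(abs(a - w) for a, w in zip(approx, weights))
    print(f"{b}-bit max error: {err:.4f}")
```

As expected, the reconstruction error shrinks as the bit-width grows, which is the trade-off arbitrary-bit methods let you tune per deployment.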
