1 parent 6ec5080 commit 025f894
docs/index.html
@@ -272,6 +272,14 @@ <h2 class="title is-4">Insight: Information Disparity in Pre-training vs. Fine-t
       This addresses the <b>serving challenge</b>.
     </p>
 
+    <p style="margin-bottom: 20px;">
+      Past work (GPT-Zip, DeltaZip) has also explored quantization of the weight delta, achieving
+      quantization levels as low as 2 bits by applying methods introduced by GPTQ. We find that
+      the weight delta is extremely compressible and are able to achieve <b>1-bit quantization</b>
+      with minimal performance degradation using a simpler methodology.
+    </p>
+
 
     <h2 class="title is-4">BitDelta Overview</h2>
     <h2 class="title is-5">1-bit quantization</h2>
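The added paragraph claims the weight delta tolerates 1-bit quantization with a simpler method than GPTQ-style approaches, but the commit itself does not show that method. A minimal sketch of sign-based delta quantization, assuming a single per-matrix scale set to the mean absolute delta (the L2-optimal scale for a sign approximation), could look like the following; binarize_delta and reconstruct are illustrative names, not functions from this repo.

    import torch

    def binarize_delta(w_fine: torch.Tensor, w_base: torch.Tensor):
        # Weight delta between a fine-tuned matrix and its base counterpart.
        delta = w_fine - w_base
        # Closed-form scale minimizing ||delta - alpha * sign(delta)||_F:
        # alpha = mean(|delta|), since sign(delta) * delta = |delta|.
        alpha = delta.abs().mean()
        sign = torch.sign(delta)  # values in {-1, 0, +1}; bit-packed in practice
        return alpha, sign

    def reconstruct(w_base: torch.Tensor, alpha: torch.Tensor, sign: torch.Tensor):
        # Approximate the fine-tuned weights from the shared base + 1-bit delta.
        return w_base + alpha * sign

    # Toy usage: a small perturbation standing in for a fine-tune.
    w_base = torch.randn(1024, 1024)
    w_fine = w_base + 0.01 * torch.randn(1024, 1024)
    alpha, sign = binarize_delta(w_fine, w_base)
    w_hat = reconstruct(w_base, alpha, sign)

With the signs stored at 1 bit per parameter, each fine-tune in this scheme would cost roughly 1/16 of its fp16 size on top of the shared base weights, which is what makes serving many fine-tunes from one base model attractive.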