LUT-based compressed data type #3496
Conversation
nikita-savelyevv
left a comment
LGTM!
ranks = [advanced_parameters.lora_adapter_rank, advanced_parameters.lora_correction_params.adapter_rank]

if advanced_parameters.codebook_params.codebook is not None:
    codebook = Tensor(advanced_parameters.codebook_params.codebook).as_numpy_tensor().data
Why is Tensor(advanced_parameters.codebook_params.codebook).as_numpy_tensor().data needed here?
To make sure that
if (codebook[:-1] >= codebook[1:]).any():
works correctly for all data types.
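The sortedness check discussed above can be sketched as follows. This is an illustrative example, not the NNCF implementation: the function name validate_codebook is hypothetical, and the point is that normalizing the user-supplied codebook to a plain numpy array first makes the elementwise comparison behave the same for lists, tuples, and framework tensors.

```python
import numpy as np

def validate_codebook(codebook):
    # Normalize to a plain numpy array so slicing and comparison
    # work uniformly regardless of the original container type.
    codebook = np.asarray(codebook)
    # Adjacent-pair comparison: any non-increasing pair means the
    # codebook is not strictly sorted.
    if (codebook[:-1] >= codebook[1:]).any():
        raise ValueError("Codebook values must be strictly increasing")
    return codebook

validate_codebook([-1.0, 0.0, 0.5, 1.0])      # strictly increasing: OK
try:
    validate_codebook([0.0, 0.5, 0.25, 1.0])  # out of order: raises
except ValueError as e:
    print(e)
```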
return WeightCompressionConfig(
    mode=self._mode,
    group_size=self._group_size,
    codebook_values=get_cb4_quantiles()
    if self._mode == CompressWeightsMode.CB4_F8E4M3
    else Tensor(self._advanced_parameters.codebook_params.codebook),
)
Suggested change: extract the conditional into a variable for readability.

codebook_values = get_cb4_quantiles() if self._mode == CompressWeightsMode.CB4_F8E4M3 else Tensor(self._advanced_parameters.codebook_params.codebook)
return WeightCompressionConfig(
    mode=self._mode,
    group_size=self._group_size,
    codebook_values=codebook_values,
)
Done.
scale = fns.max(fns.abs(weight), axis=reduction_axes, keepdims=True)
if config.mode in [CompressWeightsMode.E2M1, CompressWeightsMode.CODEBOOK, CompressWeightsMode.CB4_F8E4M3]:
    max_val = 6.0 if config.mode == CompressWeightsMode.E2M1 else max(np.abs(config.get_numpy_codebook()))
Use nncf.Tensor in common code
Done.
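The scale computation in the diff above can be sketched in plain numpy. This is a simplified illustration (the function name compute_scale is an assumption, and nncf's Tensor/fns abstractions are replaced with numpy calls): the per-group absolute maximum of the weights is divided by the largest representable codebook magnitude, so that after scaling the weights fall inside the codebook's range.

```python
import numpy as np

def compute_scale(weight, codebook, reduction_axes=(-1,)):
    # Per-group absolute maximum of the weights.
    w_max = np.max(np.abs(weight), axis=reduction_axes, keepdims=True)
    # Largest representable magnitude in the codebook
    # (for E2M1 this is the constant 6.0).
    max_val = np.max(np.abs(codebook))
    # Scale maps the weight range onto the codebook range.
    return w_max / max_val

w = np.array([[0.5, -2.0, 1.5], [3.0, -0.25, 0.75]])
cb = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
scale = compute_scale(w, cb)  # -> [[2.0], [3.0]]
```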
if center_of_quantiles is None:
    quantiles = np.array(quantiles)
Please don't combine operations with backend specific types and nncf.Tensor in common code.
Done.
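For context on the center_of_quantiles fallback discussed above, here is an illustrative sketch (not the NNCF implementation; the function name quantize_to_codebook is hypothetical) of how decision boundaries between adjacent codebook values can be derived when no centers are supplied: take the midpoints between adjacent quantiles, then assign each value to the nearest codebook entry with a sorted search.

```python
import numpy as np

def quantize_to_codebook(values, quantiles):
    quantiles = np.asarray(quantiles)
    # Midpoints between adjacent codebook values act as
    # decision boundaries for nearest-entry assignment.
    centers = (quantiles[:-1] + quantiles[1:]) / 2.0
    # searchsorted returns, for each value, the index of the
    # codebook entry whose cell the value falls into.
    indices = np.searchsorted(centers, values)
    return indices, quantiles[indices]

vals = np.array([-0.9, -0.1, 0.3, 0.8])
cb = np.array([-1.0, 0.0, 0.5, 1.0])
idx, deq = quantize_to_codebook(vals, cb)
# idx -> [0, 1, 2, 3]; deq -> [-1.0, 0.0, 0.5, 1.0]
```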
2) Changed the custom codebook in the codebook example to a smaller one.
Changes
Implementation of compression to fixed codebook (LUT) values.
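The general idea behind LUT (codebook) weight compression can be sketched end to end. This is a minimal, simplified illustration under assumed conventions (per-row scaling, nearest-value lookup; the function names compress_with_lut and decompress are hypothetical, not NNCF API): each weight is scaled into the codebook's range and replaced by the index of its nearest codebook entry, so only small integer indices plus a scale per row need to be stored.

```python
import numpy as np

def compress_with_lut(weight, codebook):
    codebook = np.sort(np.asarray(codebook))
    # Per-row scale so normalized weights land inside the codebook range.
    scale = np.max(np.abs(weight), axis=-1, keepdims=True) / np.max(np.abs(codebook))
    normalized = weight / scale
    # Index of the nearest codebook entry for every normalized weight.
    indices = np.abs(normalized[..., None] - codebook).argmin(axis=-1)
    return indices.astype(np.uint8), scale

def decompress(indices, scale, codebook):
    # Look the values back up in the table and undo the scaling.
    codebook = np.sort(np.asarray(codebook))
    return codebook[indices] * scale

w = np.random.default_rng(0).standard_normal((4, 8)).astype(np.float32)
cb = [-1.0, -0.5, -0.25, 0.0, 0.25, 0.5, 1.0]
idx, s = compress_with_lut(w, cb)
w_hat = decompress(idx, s, cb)  # lossy reconstruction of w
```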
Reason for changes
CVS-167084
Related tickets
CVS-167084
Tests
tests/openvino/native/quantization/test_weights_compression.py
https://github.com/openvinotoolkit/nncf/actions/runs/16024264575