Commit 98515cb

Update documentation (#3703)
### Changes

Update the documentation to reflect the current status of ONNX backend support.
1 parent 7c54fc3 commit 98515cb

4 files changed: 13 additions and 7 deletions


README.md

Lines changed: 9 additions & 3 deletions
@@ -44,7 +44,7 @@ learning frameworks.
 | Compression algorithm | OpenVINO | PyTorch | TorchFX | TensorFlow | ONNX |
 | :------------------------------------------------------------------------------------------------------- | :-------: | :-------: | :-----------: | :-----------: | :-----------: |
 | [Post-Training Quantization](./docs/usage/post_training_compression/post_training_quantization/Usage.md) | Supported | Supported | Experimental | Supported | Supported |
-| [Weights Compression](./docs/usage/post_training_compression/weights_compression/Usage.md) | Supported | Supported | Experimental | Not supported | Not supported |
+| [Weights Compression](./docs/usage/post_training_compression/weights_compression/Usage.md) | Supported | Supported | Experimental | Not supported | Supported |
 | [Activation Sparsity](./src/nncf/experimental/torch/sparsify_activations/ActivationSparsity.md) | Not supported | Experimental | Not supported| Not supported| Not supported |

 ### Training-Time Compression Algorithms
@@ -409,9 +409,9 @@ A list of notebooks demonstrating OpenVINO conversion and inference together wit
 | [LLM Instruction Following](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-question-answering) | Weight Compression | OpenVINO | NLP, Instruction Following |
 | [LLM Chat Bots](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-chatbot) | Weight Compression | OpenVINO | NLP, Chat Bot |

-### Post-Training Quantization Examples
+### Post-Training Quantization and Weight Compression Examples

-Compact scripts demonstrating quantization and corresponding inference speed boost:
+Compact scripts demonstrating quantization/weight compression and corresponding inference speed boost:

 | Example Name | Compression Algorithm | Backend | Domain |
 |:-----------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------:|:----------:|:----------------------:|
@@ -424,6 +424,12 @@ Compact scripts demonstrating quantization and corresponding inference speed boo
 | [TorchFX Resnet18](./examples/post_training_quantization/torch_fx/resnet18/README.md) | Post-Training Quantization | TorchFX | Image Classification |
 | [TensorFlow MobileNetV2](./examples/post_training_quantization/tensorflow/mobilenet_v2/README.md) | Post-Training Quantization | TensorFlow | Image Classification |
 | [ONNX MobileNetV2](./examples/post_training_quantization/onnx/mobilenet_v2/README.md) | Post-Training Quantization | ONNX | Image Classification |
+| [ONNX YOLOv8 QwAС](./examples/post_training_quantization/onnx/yolov8_quantize_with_accuracy_control/README.md) | Post-Training Quantization with Accuracy Control | ONNX | Object Detection |
+| [ONNX TinyLlama WC](./examples/llm_compression/onnx/tiny_llama/README.md) | Weight Compression | ONNX | LLM |
+| [TorchFX TinyLlama WC](./examples/llm_compression/torch_fx/tiny_llama/README.md) | Weight Compression | TorchFX | LLM |
+| [OpenVINO TinyLlama WC](./examples/llm_compression/openvino/tiny_llama/README.md) | Weight Compression | OpenVINO | LLM |
+| [OpenVINO TinyLlama WC with HS](./examples/llm_compression/openvino/tiny_llama_find_hyperparams/README.md) | Weight Compression with Hyperparameters Search | OpenVINO | LLM |
+| [ONNX TinyLlama WC with SE](./examples/llm_compression/onnx/tiny_llama_scale_estimation/README.md) | Weight Compression with Scale Estimation | ONNX | LLM |

 ### Quantization-Aware Training Examples

docs/Algorithms.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@
 - Signed and unsigned
 - Per tensor/per channel
 - Each backend support export to the OpenVINO format
-- [Weights compression](./usage/post_training_compression/weights_compression/Usage.md) (OpenVINO, PyTorch, TorchFX)
+- [Weights compression](./usage/post_training_compression/weights_compression/Usage.md) (OpenVINO, PyTorch, TorchFX, ONNX)
 - Symmetric 8 bit compression mode
 - Symmetric and asymmetric 4 bit compression mode
 - NF4 compression mode
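
For readers unfamiliar with the "Symmetric 8 bit compression mode" listed in the hunk above, the idea can be sketched in a few lines of plain Python. This is an illustration only and not NNCF's implementation: the function names are hypothetical, and the per-channel scale and the clipping range of [-127, 127] are assumptions made for the sketch.

```python
from typing import List, Tuple


def quantize_symmetric_int8(row: List[float]) -> Tuple[List[int], float]:
    """Quantize one weight channel to signed 8-bit integers sharing one scale.

    Hypothetical helper for illustration; assumes a [-127, 127] integer range.
    """
    max_abs = max(abs(w) for w in row)
    # Guard against an all-zero channel so the scale is never zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate floating-point weights from integers and the scale."""
    return [v * scale for v in q]


weights = [[0.6, -1.0, 0.25], [0.03, 0.01, -0.04]]
for row in weights:
    q, scale = quantize_symmetric_int8(row)
    restored = dequantize(q, scale)
    # With round-to-nearest, the error per weight is at most half a step.
    max_err = max(abs(a - b) for a, b in zip(row, restored))
    print(q, scale, max_err <= scale / 2 + 1e-12)
```

"Symmetric" here means there is no zero-point: zero maps to zero and only a scale is stored per channel, which is what distinguishes it from the asymmetric modes in the same list.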

docs/PyPiPublishing.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,7 @@ For more information about NNCF, see:
 | Compression algorithm | OpenVINO | PyTorch | TorchFX | TensorFlow | ONNX |
 | :------------------------------------------------------------------------------------------------------- | :-------: | :-------: | :-----------: | :-----------: | :-----------: |
 | [Post-Training Quantization](https://github.com/openvinotoolkit/nncf/blob/develop/docs/usage/post_training_compression/post_training_quantization/Usage.md) | Supported | Supported | Experimental | Supported | Supported |
-| [Weights Compression](https://github.com/openvinotoolkit/nncf/blob/develop/docs/usage/post_training_compression/weights_compression/Usage.md) | Supported | Supported | Experimental | Not supported | Not supported |
+| [Weights Compression](https://github.com/openvinotoolkit/nncf/blob/develop/docs/usage/post_training_compression/weights_compression/Usage.md) | Supported | Supported | Experimental | Not supported | Supported |
 | [Activation Sparsity](https://github.com/openvinotoolkit/nncf/blob/develop/src/nncf/experimental/torch/sparsify_activations/ActivationSparsity.md) | Not supported | Experimental | Not supported| Not supported| Not supported |

 ### Training-Time Compression Algorithms

docs/usage/post_training_compression/weights_compression/Usage.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@
1717

 The Weights Compression algorithm is aimed at compressing the weights of the models and can be used to optimize the model footprint and performance of large models where the size of weights is relatively larger than the size of activations, for example, Large Language Models (LLM). The algorithm compresses weights for Linear, Convolution and Embedding layers.

-[OpenVINO](https://github.com/openvinotoolkit/openvino) is the preferred backend to run Weights Compression with. PyTorch and Torch FX are also supported.
+[OpenVINO](https://github.com/openvinotoolkit/openvino) is the preferred backend to run Weights Compression with. PyTorch, ONNX and Torch FX are also supported.

 ### Supported modes

@@ -696,7 +696,7 @@ Accuracy/footprint trade-off for `microsoft/Phi-3-mini-4k-instruct`:
 ### Limitations

-- The algorithm is supported for OpenVINO, PyTorch and Torch FX models.
+- The algorithm is supported for OpenVINO, PyTorch, ONNX and Torch FX models.
 - The compression applies in-place.
 - The compressed model is not trainable.
 - INT4_SYM, INT4_ASYM, NF4 and E2M1 modes, grouped quantization and mixed precision selection is available for OpenVINO backend only.
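
The Usage.md text touched by this commit describes weights compression as a way to reduce model footprint, and its limitations mention grouped quantization. A back-of-the-envelope estimate of that footprint can be sketched as follows. The helper name, the group size of 128, and the 16-bit scale per group are assumptions chosen for illustration; they are not values taken from the commit or from NNCF.

```python
def weight_footprint_bytes(
    num_weights: int, bits: int, group_size: int = 0, scale_bits: int = 16
) -> float:
    """Approximate storage for weights stored at `bits` bits each.

    Hypothetical helper: when group_size > 0, one scale of `scale_bits`
    bits is added per group of weights; zero-points are ignored.
    """
    payload = num_weights * bits / 8
    if group_size <= 0:
        return payload
    num_groups = -(-num_weights // group_size)  # ceiling division
    return payload + num_groups * scale_bits / 8


n = 1_000_000_000  # an illustrative one-billion-parameter model
fp16 = weight_footprint_bytes(n, 16)
int8 = weight_footprint_bytes(n, 8)
int4_grouped = weight_footprint_bytes(n, 4, group_size=128)

for name, size in [("FP16", fp16), ("INT8", int8), ("INT4, group 128", int4_grouped)]:
    print(f"{name}: {size / 1e9:.3f} GB, {fp16 / size:.2f}x smaller than FP16")
```

The per-group scales are why a grouped 4-bit layout compresses slightly less than the nominal 4x relative to FP16 under these assumptions.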
