Commit 44ea48b

Merge branch 'openvinotoolkit:develop' into support-transpose
2 parents: 1c86647 + ae7a584

File tree

6 files changed: +6 −72 lines


docs/usage/training_time_compression/Quantization.md renamed to docs/usage/Quantization.md

File renamed without changes.

docs/usage/post_training_compression/post_training_quantization/Usage.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -2,7 +2,7 @@
 
 Post-Training Quantization is a quantization algorithm that doesn't demand retraining of a quantized model.
 It utilizes a small subset of the initial dataset to calibrate quantization constants.
-Please refer to this [document](/docs/usage/training_time_compression/Quantization.md) for details of the implementation.
+Please refer to this [document](/docs/usage/Quantization.md) for details of the implementation.
 
 NNCF provides an advanced Post-Training Quantization algorithm, which consists of the following techniques:
 
```
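As an aside on the unchanged text above (an editor's illustration, not part of this commit): the calibration step it describes can be sketched in a few lines of plain Python. The function names and the asymmetric unsigned 8-bit scheme are assumptions for illustration, not NNCF's actual implementation.

```python
def calibrate_asymmetric_uint8(batches):
    """Derive asymmetric 8-bit quantization constants from calibration data.

    `batches` is a small iterable of iterables of floats, standing in for
    the "small subset of the initial dataset" mentioned above.
    Returns (scale, zero_point) for an unsigned 8-bit grid [0, 255].
    """
    lo = min(min(batch) for batch in batches)
    hi = max(max(batch) for batch in batches)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # range must include real 0.0
    scale = (hi - lo) / 255.0             # real-value step per integer level
    zero_point = round(-lo / scale)       # integer level representing 0.0
    return scale, zero_point


def quantize(x, scale, zero_point):
    # Snap a real value to the nearest representable level and clamp.
    return max(0, min(255, round(x / scale) + zero_point))
```

With calibration data spanning, say, [-1.0, 2.0], the observed range is split into 255 steps and real zero maps exactly onto an integer level, which is the property asymmetric quantization is chosen for.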
docs/usage/post_training_compression/weights_compression/Usage.md

Lines changed: 4 additions & 4 deletions
```diff
@@ -27,8 +27,8 @@ By default, the algorithm applies asymmetric 8-bit integer quantization (INT8_AS
 
 | Compression Mode | Element type | Scale type | Granularity | Description |
 |------------------|--------------|------------|--------------------------|----------------------------|
-| INT8_ASYM | INT8 | FP16 | Per-channel | [Asymmetric quantization](/docs/usage/training_time_compression/Quantization.md#asymmetric-quantization) |
-| INT8_SYM | INT8 | FP16 | Per-channel | [Symmetric quantization](/docs/usage/training_time_compression/Quantization.md#symmetric-quantization) |
+| INT8_ASYM | INT8 | FP16 | Per-channel | [Asymmetric quantization](/docs/usage/Quantization.md#asymmetric-quantization) |
+| INT8_SYM | INT8 | FP16 | Per-channel | [Symmetric quantization](/docs/usage/Quantization.md#symmetric-quantization) |
 
 #### Mixed precision modes
 
@@ -40,8 +40,8 @@ NNCF can automatically distribute precision assignments based on quantization se
 
 | Compression Mode | Element type | Scale type | Granularity | Description |
 |------------------|--------------|------------|--------------------------|-------------|
-| INT4_SYM | INT4 | FP16 | Per-channel / Group-wise | [Symmetric quantization](/docs/usage/training_time_compression/Quantization.md#symmetric-quantization) |
-| INT4_ASYM | INT4 | FP16 | Per-channel / Group-wise | [Asymmetric quantization](/docs/usage/training_time_compression/Quantization.md#asymmetric-quantization) |
+| INT4_SYM | INT4 | FP16 | Per-channel / Group-wise | [Symmetric quantization](/docs/usage/Quantization.md#symmetric-quantization) |
+| INT4_ASYM | INT4 | FP16 | Per-channel / Group-wise | [Asymmetric quantization](/docs/usage/Quantization.md#asymmetric-quantization) |
 | NF4 | FP32 | FP16 | Per-channel / Group-wise | [NormalFloat-4](https://arxiv.org/pdf/2305.14314v1.pdf) lookup table with 16 FP32 values |
 | CODEBOOK | Any | FP16 | Per-channel / Group-wise | Arbitrary lookup table (codebook) |
 | CB4 | E4M3 | FP16 | Per-channel / Group-wise | A fixed lookup table with 16 E4M3 values based on NF4 values |
```
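A note on the INT4 rows above (an illustration added by the editor, not NNCF code): group-wise symmetric quantization gives each group of weights its own scale, chosen so the largest magnitude in the group lands on the edge of the signed grid [-8, 7]. A minimal plain-Python sketch under those assumptions:

```python
def quantize_int4_symmetric(weights, group_size):
    """Group-wise symmetric INT4: one floating-point scale per `group_size` weights.

    Symmetric quantization uses the signed grid [-8, 7] and no zero-point;
    a weight is reconstructed as level * scale.
    Returns (scales, levels).
    """
    scales, levels = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = max(abs(w) for w in group) / 7.0   # max magnitude -> level 7
        scales.append(scale)
        for w in group:
            level = round(w / scale) if scale else 0
            levels.append(max(-8, min(7, level)))
    return scales, levels
```

Per-channel granularity is the special case where each group holds one output channel's weights; smaller groups cost more stored scales but track local magnitude variation better.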

docs/usage/training_time_compression/quantization_aware_training/Usage.md

Lines changed: 1 addition & 22 deletions
````diff
@@ -3,7 +3,7 @@
 This is a step-by-step tutorial on how to integrate the NNCF package into the existing PyTorch projects.
 The use case implies that the user already has a training pipeline that reproduces training of the model in the floating point precision and pretrained model.
 The task is to prepare this model for accelerated inference by simulating the compression at train time.
-Please refer to this [document](/docs/usage/training_time_compression/Quantization.md) for details of the implementation.
+Please refer to this [document](/docs/usage/Quantization.md) for details of the implementation.
 
 ## Basic usage
 
@@ -70,27 +70,6 @@ quantized_model.load_state_dict(state_dict)
 
 You can save the `compressed_model` object `torch.save` as usual: via `state_dict` and `load_state_dict` methods.
 
-## Advanced usage
-
-### Compression of custom modules
-
-With no target model code modifications, NNCF only supports native PyTorch modules with respect to trainable parameter (weight) compressed, such as `torch.nn.Conv2d`.
-If your model contains a custom, non-PyTorch standard module with trainable weights that should be compressed, you can register it using the `@nncf.register_module` decorator:
-
-```python
-import nncf
-
-@nncf.register_module(ignored_algorithms=[...])
-class MyModule(torch.nn.Module):
-    def __init__(self, ...):
-        self.weight = torch.nn.Parameter(...)
-    # ...
-```
-
-If registered module should be ignored by specific algorithms use `ignored_algorithms` parameter of decorator.
-
-In the example above, the NNCF-compressed models that contain instances of `MyModule` will have the corresponding modules extended with functionality that will allow NNCF to quantize the `weight` parameter of `MyModule` before it takes part in `MyModule`'s `forward` calculation.
-
 ## Examples
 
 - See a PyTorch [example](/examples/quantization_aware_training/torch/resnet18/README.md) for **Quantization** Compression scenario on Tiny ImageNet-200 dataset.
````
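Stepping back from the diff: the tutorial it touches is about simulating compression at train time, i.e. the forward pass snaps values to the quantization grid while everything stays in floating point. The following toy function (an editor's illustration with an assumed symmetric 8-bit grid, not NNCF's API) shows that "fake quantization" round trip:

```python
def fake_quantize(x, scale, num_bits=8):
    """Quantize-dequantize a float, as done in QAT forward passes.

    The value never leaves floating point; it is merely snapped to one
    of the 2**num_bits representable levels, so training sees the same
    rounding error the deployed integer model will produce.
    """
    qmax = 2 ** (num_bits - 1) - 1                 # 127 for 8 bits
    level = max(-qmax - 1, min(qmax, round(x / scale)))
    return level * scale                           # back to a float
```

During training the loss is computed on these snapped values, so gradients steer the float weights toward points that survive rounding; out-of-range values saturate at the grid edges rather than overflow.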

tests/cross_fw/sdl/__init__.py

Lines changed: 0 additions & 10 deletions
This file was deleted.

tests/cross_fw/sdl/symlinks.py

Lines changed: 0 additions & 35 deletions
This file was deleted.
