Commit 1451632
Optimized compression for FP8 modes (#3748)
### Changes
Added optimized OpenVINO weights compression for fp8e4m3 data type.
`optimum-cli export openvino` time:
| Model | Memory Before (MiB) | Memory After (MiB) | Time Before (sec) | Time After (sec) |
|---|---|---|---|---|
| Llama-3.2-1B | 2328.03 | 2394.14 (+2.84%) | 63.52 | 14.03 (-77.92%) |
| Phi-4-mini | 5608.48 | 5197.70 (-7.33%) | 187.34 | 28.05 (-85.03%) |
| Llama-3.1-8B | 9918.52 | 8443.87 (-14.86%) | 399.14 | 48.48 (-87.86%) |
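The relative changes in the table above can be rechecked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the commit: the names `ROWS`, `percent_change`, `speedup`, and `summarize` are hypothetical, and the figures are copied from the table as printed. Because those figures are already rounded to two decimals, the recomputed percentages may differ from the reported ones in the last digit.

```python
# Figures copied from the benchmark table: (before, after, reported % change).
# All names here are hypothetical helpers for illustration, not NNCF APIs.
ROWS = {
    "Llama-3.2-1B": {
        "memory_mib": (2328.03, 2394.14, 2.84),
        "time_sec": (63.52, 14.03, -77.92),
    },
    "Phi-4-mini": {
        "memory_mib": (5608.48, 5197.70, -7.33),
        "time_sec": (187.34, 28.05, -85.03),
    },
    "Llama-3.1-8B": {
        "memory_mib": (9918.52, 8443.87, -14.86),
        "time_sec": (399.14, 48.48, -87.86),
    },
}


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def speedup(before: float, after: float) -> float:
    """How many times faster the 'after' run is than the 'before' run."""
    return before / after


def summarize(rows=ROWS):
    """Recompute the changes and pair them with the reported values."""
    summary = {}
    for model, metrics in rows.items():
        mem_before, mem_after, mem_reported = metrics["memory_mib"]
        t_before, t_after, t_reported = metrics["time_sec"]
        summary[model] = {
            "memory_change_pct": percent_change(mem_before, mem_after),
            "memory_reported_pct": mem_reported,
            "time_change_pct": percent_change(t_before, t_after),
            "time_reported_pct": t_reported,
            "speedup": speedup(t_before, t_after),
        }
    return summary


if __name__ == "__main__":
    for model, s in summarize().items():
        print(
            f"{model}: memory {s['memory_change_pct']:+.2f}% "
            f"(reported {s['memory_reported_pct']:+.2f}%), "
            f"time {s['time_change_pct']:+.2f}% "
            f"(reported {s['time_reported_pct']:+.2f}%), "
            f"speedup {s['speedup']:.2f}x"
        )
```

Expressed as a ratio, the time reductions correspond to roughly 4.5x, 6.7x, and 8.2x faster export for the three models, so the gain grows with model size in this sample.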
### Reason for changes
UX improvement: FP8 weight compression during model export runs substantially faster (see the timings above).
### Tests
Extended existing tests.
https://github.com/openvinotoolkit/nncf/actions/runs/19767009608
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

1 parent 9065776 commit 1451632
File tree
16 files changed, +558 -118 lines changed
- src/nncf
  - openvino/optimized_functions
  - quantization/algorithms/weight_compression
  - tensor
    - functions
- tests
  - cross_fw/test_templates
  - openvino
    - native/quantization
    - optimized_functions