Commit 09246b2
Support 3D Weights in SE Algorithm (#3706)
### Changes
<!--- What was changed (briefly), how to reproduce (if applicable), what
the reviewers should focus on -->
### Reason for changes
<!--- Why should the change be applied -->
### Related tickets
175212
### Tests
PR Performance Job: post_training_weight_compression_performance - 57
Develop Branch Performance Job: post_training_weight_compression_performance - 58
WC Conformance Test:
https://github.com/openvinotoolkit/nncf/actions/runs/19161281090: Pass
Model: Qwen/Qwen3-30B-A3B
NNCF Backend: OpenVINO
Higher is better.
Task: gsm8k
Limit: 100
Max New Tokens: 10000
OpenVINO version: 2026.0.0.dev20251111 (with WA for 176465)
n-shots: 5 (default)
Precision Type | Filter | Value | Stderr
-- | -- | -- | --
INT4 SYM Per-Channel (with Scale estimation) Calibrated on wikitext with 128 samples | flexible-extract | 0.66 | 0.0476
| strict-match | 0.38 | 0.0488
INT4 SYM Per-Channel | flexible-extract | 0.77 | 0.0423
| strict-match | 0.28 | 0.0451
FP16 | flexible-extract | 0.91 | 0.0288
| strict-match | 0.86 | 0.0349
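The Stderr column can be sanity-checked from the Value column alone. The sketch below assumes each score is the mean of 100 pass/fail outcomes (matching `Limit: 100`) and that the standard error uses the sample (n - 1) variance; the helper name `bernoulli_stderr` is hypothetical and not part of any evaluation tool.

```python
import math


def bernoulli_stderr(accuracy: float, n: int) -> float:
    """Standard error of the mean of n 0/1 outcomes, using the sample (n - 1) variance."""
    return math.sqrt(accuracy * (1.0 - accuracy) / (n - 1))


# (value, reported stderr) pairs copied from the table above.
reported = {
    "INT4 SYM + SE, flexible-extract": (0.66, 0.0476),
    "INT4 SYM + SE, strict-match": (0.38, 0.0488),
    "INT4 SYM, flexible-extract": (0.77, 0.0423),
    "INT4 SYM, strict-match": (0.28, 0.0451),
    "FP16, flexible-extract": (0.91, 0.0288),
    "FP16, strict-match": (0.86, 0.0349),
}

for name, (value, stderr) in reported.items():
    recomputed = bernoulli_stderr(value, n=100)
    print(f"{name}: reported={stderr:.4f} recomputed={recomputed:.4f}")
```

Under these assumptions the recomputed values agree with the reported ones to four decimal places. With only 100 samples each standard error is roughly 0.03 to 0.05, so gaps of a few points between configurations are within the noise.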
WWB Results with Reasoning Disabled:
INT4 Sym Per-Channel: 0.826173 (vs FP16)
INT4 Sym Per-Channel with SE: 0.938537 (vs FP16)
Model: openai/gpt-oss-20b
NNCF Backend: Torch
Higher is better.
```
time \
accelerate launch -m lm_eval --model hf \
--model_args "{\"pretrained\":\"${MODEL_DIR}\",\"enable_thinking\":false}" \
--tasks gsm8k_cot_llama \
--fewshot_as_multiturn \
--apply_chat_template=True \
--device cuda \
--limit 100 \
--output_path "${EXP_DIR}" \
--gen_kwargs max_new_tokens=1024,temperature=0.6,top_p=0.95,top_k=20
```
gpt-oss-20b | strict-match | flexible-extract |
-- | -- | -- |
bf16 | 0.96 | 0.96 |
int4_sym_gs32_experts_int8_the_rest | 0.96 | 0.94 |
int4_sym_gs32_int8_the_rest | 0.64 | 0.96 |
int4_sym_gs32_int8_the_rest_SE_32tokens_128samples_64se_samples | 0.79 | 0.96 |
int4_sym_gs32_int8_the_rest_SE_128tokens_128samples_64se_samples | 0.81 | 0.96 |
int4_sym_gs32_int8_the_rest_SE_half_128tokens_128samples_64se_samples | 0.83 | 0.97 |
int4_sym_gs32_int8_the_rest_SE_half_256tokens_256samples_64se_samples | 0.79 | 0.95 |
int4_sym_gs32_int8_the_rest_SE_256tokens_256samples_64se_samples | 0.90 | 0.96 |
int4_sym_gs32_int8_the_rest_SE_256tokens_256samples_256se_samples | 0.95 | 0.94 |
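To make the configuration names concrete, here is a minimal sketch of group-wise symmetric INT4 quantization applied to a 3D weight, reading `gs32` as "group size 32" (an assumption based on the names above). It is not NNCF's implementation: the function names, the `[expert][out_channel][in_channel]` layout, the max-abs scale, and the `[-8, 7]` code range are all illustrative choices, and scale estimation itself is not implemented here.

```python
from typing import List, Tuple

Weight3D = List[List[List[float]]]  # indexed as [expert][out_channel][in_channel]
Codes3D = List[List[List[int]]]


def quantize_int4_sym(weight: Weight3D, group_size: int) -> Tuple[Codes3D, Weight3D]:
    """Return (codes, scales) with one max-abs scale per group of input channels."""
    codes, scales = [], []
    for expert in weight:
        expert_codes, expert_scales = [], []
        for row in expert:
            if len(row) % group_size != 0:
                raise ValueError("in_channels must be divisible by group_size")
            row_codes, row_scales = [], []
            for start in range(0, len(row), group_size):
                group = row[start:start + group_size]
                max_abs = max(abs(v) for v in group)
                # Illustrative choice: map the largest magnitude to code 7.
                scale = max_abs / 7.0 if max_abs > 0.0 else 1.0
                row_scales.append(scale)
                row_codes.extend(max(-8, min(7, round(v / scale))) for v in group)
            expert_codes.append(row_codes)
            expert_scales.append(row_scales)
        codes.append(expert_codes)
        scales.append(expert_scales)
    return codes, scales


def dequantize(codes: Codes3D, scales: Weight3D, group_size: int) -> Weight3D:
    """Multiply every code by the scale of the group it belongs to."""
    return [
        [
            [code * row_scales[i // group_size] for i, code in enumerate(row_codes)]
            for row_codes, row_scales in zip(expert_codes, expert_scales)
        ]
        for expert_codes, expert_scales in zip(codes, scales)
    ]


# Toy weight: 2 experts, 2 output channels, 4 input channels.
weight = [
    [[0.7, -0.35, 0.1, 0.05], [1.4, 0.0, -0.2, 0.2]],
    [[-2.1, 0.3, 0.0, 0.0], [0.07, 0.14, -0.8, 0.4]],
]
codes, scales = quantize_int4_sym(weight, group_size=2)
restored = dequantize(codes, scales, group_size=2)
print(scales)
print(restored)
```

Setting `group_size` to the full input-channel count gives the per-channel case, with a single scale per expert and output channel. The extra leading expert dimension is what makes the weight 3D: every expert carries its own set of scales.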
---------
Co-authored-by: Daniil Lyakhov <[email protected]>
1 parent 6d28a5d commit 09246b2
File tree
12 files changed, +428 -48 lines changed
- src/nncf
- onnx/graph
- quantization
- algorithms/weight_compression
- tests
- cross_fw/test_templates
- onnx/quantization
- openvino/native
- quantization
- torch2
- function_hook/quantization
- fx
Lines changed: 26 additions & 9 deletions
Lines changed: 53 additions & 9 deletions
Lines changed: 3 additions & 0 deletions
Lines changed: 28 additions & 22 deletions
Lines changed: 33 additions & 5 deletions
0 commit comments