Commit 88445b3
Support 3D Weights in AWQ Algorithm (openvinotoolkit#3728)
### Changes
The core idea of this change is to first unsqueeze every weight so that
it is 3D, including weights that are originally 2D. The rest of the
algorithm implementation is then changed to expect a 3D weight shape.
Previously, we traversed each group in a weight individually. Now, since
we want to find scales per channel as well as per expert, we traverse by
group index and by batch/expert index (this dimension is just 1 for 2D
weights, so the behavior is the same as before).
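
The traversal described above can be illustrated with a minimal NumPy sketch. This is not the NNCF implementation: the function names, the assumed `[experts, out_channels, in_channels]` layout with the reduction axis last, and the mean-magnitude statistic (a stand-in for the real AWQ scale search) are all assumptions made for illustration.

```python
import numpy as np


def to_3d(weight):
    """Give 2D weights a leading batch/expert axis of size 1; pass 3D weights through."""
    if weight.ndim == 2:
        return weight[np.newaxis, ...]
    if weight.ndim == 3:
        return weight
    raise ValueError(f"expected a 2D or 3D weight, got {weight.ndim}D")


def iter_expert_groups(weight, group_size):
    """Yield (expert_idx, group_idx, block) for every group of every expert.

    Assumed layout (hypothetical): [experts, out_channels, in_channels].
    A group_size <= 0 is treated as one group spanning the whole last axis.
    """
    w3d = to_3d(weight)
    num_experts, _, in_features = w3d.shape
    size = in_features if group_size <= 0 else group_size
    for expert_idx in range(num_experts):
        for group_idx, start in enumerate(range(0, in_features, size)):
            yield expert_idx, group_idx, w3d[expert_idx, :, start:start + size]


def per_group_magnitude(weight, group_size):
    """Placeholder statistic: mean |w| per (expert, group). Not the AWQ scale search."""
    w3d = to_3d(weight)
    size = w3d.shape[-1] if group_size <= 0 else group_size
    num_groups = -(-w3d.shape[-1] // size)  # ceiling division
    stats = np.zeros((w3d.shape[0], num_groups), dtype=np.float64)
    for expert_idx, group_idx, block in iter_expert_groups(weight, group_size):
        stats[expert_idx, group_idx] = np.abs(block).mean()
    return stats


dense = np.arange(8, dtype=np.float32).reshape(2, 4)     # ordinary 2D weight
moe = np.arange(24, dtype=np.float32).reshape(3, 2, 4)   # 3 experts
dense_stats = per_group_magnitude(dense, group_size=2)
moe_stats = per_group_magnitude(moe, group_size=2)
print(dense_stats.shape, moe_stats.shape)
```

For the 2D input the expert loop runs exactly once, so the set of visited groups matches a plain per-group traversal, which is the "same as before" property the description relies on.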
### Reason for changes
Support AWQ for models with 3D weights such as MoE models.
### Related tickets
175789 & 175212
### Tests
Current AWQ tests were extended to include the AWQ test models with 3D
weights.
**Accuracy Evaluation Results:**
Model: Qwen/Qwen3-30B-A3B
NNCF Backend: OpenVINO
Higher is better.
Task: gsm8k
Limit: 100
Max New Tokens: 10000
OpenVINO version: 2026.0.0.dev20260102
n-shots: 5 (default)
| Model | Filter | Score (exact_match) | Stderr |
| -- | -- | -- | -- |
| FP16 | flexible-extract | 0.92 | 0.0273 |
| | strict-match | 0.82 | 0.0386 |
| INT4 SYM Per-Channel (no AWQ) | flexible-extract | 0.83 | 0.0378 |
| | strict-match | 0.27 | 0.0446 |
| INT4 SYM Per-Channel (AWQ data-free) | flexible-extract | 0.83 | 0.0378 |
| | strict-match | 0.22 | 0.0416 |
| INT4 Per-Channel (AWQ data-aware) | flexible-extract | 0.83 | 0.0378 |
| | strict-match | 0.35 | 0.0479 |
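
As a reading aid for the Stderr column: the reported values appear consistent with the sample standard error of a 0/1 metric averaged over the 100 evaluated samples (Limit: 100). The short check below assumes that interpretation; the helper name `sample_stderr` is hypothetical and this is not the evaluation harness's code.

```python
import math


def sample_stderr(score, n):
    """Standard error of the mean of n binary outcomes with mean `score`.

    Uses the sample (n - 1) variance; this is an assumption about how the
    Stderr column was computed, not something stated in the description.
    """
    return math.sqrt(score * (1.0 - score) / (n - 1))


# (score, reported stderr) pairs copied from the table above.
reported = [
    (0.92, 0.0273),
    (0.82, 0.0386),
    (0.83, 0.0378),
    (0.27, 0.0446),
    (0.22, 0.0416),
    (0.35, 0.0479),
]

for score, stderr in reported:
    print(f"score={score:.2f} computed={sample_stderr(score, 100):.4f} reported={stderr:.4f}")
```

With only 100 samples the standard errors are large relative to some of the gaps between rows, so small differences between configurations should be read with that in mind.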
Comparison of accuracy with `meta-llama/Llama-3.2-1B-Instruct` on
develop and this branch:
Variant | bits_per_byte | byte_perplexity | word_perplexity
-- | -- | -- | --
This Branch (Data Aware) | 0.7774 | 1.7141 | 17.8427
This Branch (Data Free) | 0.7774 | 1.7141 | 17.8427
develop (Data Aware) | 0.7774 | 1.7141 | 17.8427
develop (Data Free) | 0.7774 | 1.7141 | 17.8427
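
The first two columns of this table carry the same information, assuming the usual definition in which bits per byte is the base-2 logarithm of byte perplexity. A quick check under that assumption:

```python
import math

bits_per_byte = 0.7774         # value from the table above
reported_byte_ppl = 1.7141     # value from the table above

# Assumed relation: byte_perplexity = 2 ** bits_per_byte
computed_byte_ppl = 2.0 ** bits_per_byte
print(f"computed={computed_byte_ppl:.4f} reported={reported_byte_ppl:.4f}")

# The inverse direction recovers bits per byte from the reported perplexity.
recovered_bits = math.log2(reported_byte_ppl)
```

The small difference in the last digit comes from rounding in the reported values.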
WC Conformance test:
https://github.com/openvinotoolkit/nncf/actions/runs/20883502496 - Pass
WC Example Test:
https://github.com/openvinotoolkit/nncf/actions/runs/20883506117 - Pass
1 parent 2c068c5 commit 88445b3
File tree
8 files changed (+732, -245 lines changed)
- src/nncf/quantization/algorithms/weight_compression
- tests
- cross_fw/test_templates
- onnx/quantization
- openvino/native
- quantization
- torch
- function_hook/quantization
- fx
Per-file diffs (diff bodies and file names not captured):
- Lines changed: 71 additions & 32 deletions
- Lines changed: 8 additions & 3 deletions
- Lines changed: 39 additions & 22 deletions