Commit 0bc916e
[AWQ][DDP] adding DDP functionality to AWQ (#2457)
This PR adds DDP support to AWQ.
As with [GPTQ DDP](#2333), I noticed a situation involving compounding
floating-point errors. With GPTQ this issue made the non-DDP evaluation
performance better; this time it made the DDP evaluation performance
worse. After correcting the compounding error, DDP and non-DDP
evaluation performance are more closely aligned with one another, and
both are slightly better than or equal to (to 2 decimal places) the
previous results. See results below:
```
Script                                             Model                                            Time (min)  GPU (GB)  Flex    Strict  Flex (before)  Strict (before)  Change
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
examples/awq/llama_example_ddp.py                  ./Meta-Llama-3-8B-Instruct-awq-asym-DDP4         2.40        4.99      0.7142  0.7149  0.6983         0.6990           ++
examples/awq/llama_example.py                      ./Meta-Llama-3-8B-Instruct-awq-asym              7.02        10.20     0.7081  0.7074  0.7058         0.7058           ++
examples/awq/llama_example_with_masking_ddp.py     ./Meta-Llama-3-8B-Instruct-awq-asym-masked-DDP4  2.67        4.98      0.7119  0.7119
examples/awq/llama_example_with_masking.py         ./Meta-Llama-3-8B-Instruct-awq-asym-masked       8.13        10.14     0.7058  0.7074
examples/awq/qwen3_vl_30b_example_ddp.py           ./Qwen3-VL-30B-A3B-Instruct-AWQ-W4A16-g32-DDP4   143.10      3.38      0.8764  0.8529  0.8696         0.8453           ++
examples/awq/qwen3-vl-30b-a3b-Instruct-example.py  ./Qwen3-VL-30B-A3B-Instruct-AWQ-W4A16-mse-seq    446.68      3.93      0.8643  0.8491  0.8613         0.8499           +-
examples/awq/qwen3_moe_example_ddp.py              ./Qwen3-30B-A3B-awq-sym-DDP4                     143.90      3.36      0.8802  0.8832  0.8848         0.8802           -+
examples/awq/qwen3_moe_example.py                  ./Qwen3-30B-A3B-awq-sym                          459.65      4.13      0.8825  0.8863  0.8878         0.8840           -+
```
## Changes:
- Added distributed functionality
- Accumulate activation sums instead of means to avoid floating-point
errors
- Make everything broadcastable by changing to tensors
- Added a helper for all-reducing with the sum op
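
To illustrate the "sums instead of means" change, here is a minimal sketch in plain Python. It is not the code from this PR: `ChannelStats`, `simulated_all_reduce_sum`, and `global_mean` are hypothetical names, and the all-reduce is simulated with lists rather than a real process group. The idea it shows is that each rank keeps a per-channel sum and a row count, a single sum reduction combines them, and the division happens once at the end.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChannelStats:
    """Per-rank accumulator (hypothetical name): running sum and row count."""
    total: List[float]
    count: int = 0

    def update(self, batch: List[List[float]]) -> None:
        # Add each row into the per-channel sum; no division happens here.
        for row in batch:
            for i, value in enumerate(row):
                self.total[i] += value
            self.count += 1


def simulated_all_reduce_sum(per_rank: List[List[float]]) -> List[float]:
    """Stand-in for a sum all-reduce: elementwise sum across ranks."""
    return [sum(column) for column in zip(*per_rank)]


def global_mean(ranks: List[ChannelStats]) -> List[float]:
    total = simulated_all_reduce_sum([r.total for r in ranks])
    count = sum(r.count for r in ranks)
    return [t / count for t in total]  # divide exactly once, at the end


# Two simulated ranks that saw different numbers of calibration rows.
rank0 = ChannelStats(total=[0.0, 0.0])
rank0.update([[1.0, 2.0], [3.0, 4.0]])
rank1 = ChannelStats(total=[0.0, 0.0])
rank1.update([[5.0, 6.0]])

mean = global_mean([rank0, rank1])
print(mean)  # [3.0, 4.0]

# Averaging the per-rank means instead would weight the ranks incorrectly.
naive = [
    (a / rank0.count + b / rank1.count) / 2
    for a, b in zip(rank0.total, rank1.total)
]
print(naive)  # [3.5, 4.5]
```

The sketch only demonstrates that sums and counts compose across ranks under a sum reduction; it does not reproduce the rounding drift of repeatedly updating a running mean, which is the error this PR describes.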
Test Plan:
See the penultimate commit for the test scripts and evaluation framework.
---------
Signed-off-by: HDCharles <charlesdavidhernandez@gmail.com>

1 parent 2ab0244 commit 0bc916e
File tree
3 files changed, +273 -31 lines changed
- examples/awq
- src/llmcompressor/modifiers/awq