Commit aaad9b6
Add SCALE_DTYPE and ZP_DTYPE support for quantization shaders (#13225)
Summary:
Pull Request resolved: #13225
This change adds support for parameterized SCALE_DTYPE and ZP_DTYPE to the quantization and dequantization shaders. This is necessary because exporting llama with "8da4w" can produce different affine calls with various scale and zero point dtypes. I've also added functionality to automatically populate optional parameters.
NOTE: Fusion for linear_qta8a_qga4w is disabled for now, while the bug that prevents it from working when exporting llama is being resolved.
**Key Changes:**
(1) **YAML Configuration Updates:**
- Added SCALE_DTYPE and ZP_DTYPE parameters to quantize_texture.yaml and dequantize_texture.yaml
- Added generate_variant_forall entries for SCALE_DTYPE (float) and ZP_DTYPE (int8, int32, float)
- This enables shader variants for different scale and zero_point data types
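To illustrate what a "forall"-style variant expansion implies, here is a minimal Python sketch that takes the two value lists named above and enumerates their combinations. The dictionary layout, the `expand_variants` helper, and the name-suffix scheme are assumptions made for illustration; they are not taken from the actual YAML files or the shader codegen.

```python
from itertools import product

# Value lists come from the commit message; the dictionary layout and the
# naming scheme below are hypothetical.
VARIANT_AXES = {
    "SCALE_DTYPE": ["float"],
    "ZP_DTYPE": ["int8", "int32", "float"],
}


def expand_variants(base_name, axes):
    """Return one (name, params) pair per combination of axis values."""
    keys = list(axes)
    variants = []
    for values in product(*(axes[k] for k in keys)):
        params = dict(zip(keys, values))
        # Illustrative naming only: append each chosen value to the base name.
        name = "_".join([base_name, *values])
        variants.append((name, params))
    return variants


variants = expand_variants("quantize_texture", VARIANT_AXES)
for name, params in variants:
    print(name, params)
```

With one scale type and three zero point types, the cross product yields three variants per base shader in this sketch.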
(2) **GLSL Shader Updates:**
- Added SCALE_T and ZP_T type definitions using the new parameters
- Updated tensor declarations to use parameterized types instead of hardcoded "float" and "int"
- Added proper type casting (float() and int()) for all scale and zero_point accesses
- Added required extensions for SCALE_DTYPE and ZP_DTYPE
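The reason for the explicit casts can be sketched with the standard affine quantization formulas. The Python below is not the shader code; it is a simplified model, with hypothetical function names and an assumed int8 output range, showing that casting scale to float and zero_point to int makes the arithmetic independent of how those parameters are stored.

```python
def quantize_affine(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to an integer code, clamped to [qmin, qmax]."""
    # Cast once so the arithmetic does not depend on the storage type of
    # the parameters (this mirrors the float()/int() casts in spirit only).
    s = float(scale)
    zp = int(zero_point)
    q = round(x / s) + zp  # note: Python's round() rounds halves to even
    return max(qmin, min(qmax, q))


def dequantize_affine(q, scale, zero_point):
    """Map an integer code back to an approximate real value."""
    return (int(q) - int(zero_point)) * float(scale)


# The same zero point stored as an int or as a float gives the same code.
print(quantize_affine(1.0, 0.5, 3))
print(quantize_affine(1.0, 0.5, 3.0))
print(dequantize_affine(5, 0.5, 3.0))
```

The rounding mode and clamping range used by the real shaders may differ from this model.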
(3) **C++ Implementation Updates:**
- Added dtype suffixes for scale and zero_point in all quantize/dequantize node functions
- Added comprehensive data type validation in all implementation functions:
  - Scale tensors: fp32 only (for now)
  - Zero point tensors: int32, int8, fp32
- Updated Quantize.cpp, Dequantize.cpp, and ChooseQParams.cpp with consistent validation
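The validation rule listed above can be modeled in a few lines. This is a Python sketch of the rule only, not the C++ code in Quantize.cpp, Dequantize.cpp, or ChooseQParams.cpp; the function name, the string dtype labels, and the exception type are assumptions.

```python
# Allowed sets as stated in the commit message; string labels are hypothetical.
ALLOWED_SCALE_DTYPES = {"fp32"}
ALLOWED_ZP_DTYPES = {"int32", "int8", "fp32"}


def check_qparam_dtypes(scale_dtype, zp_dtype):
    """Raise ValueError if either dtype is outside the supported set."""
    if scale_dtype not in ALLOWED_SCALE_DTYPES:
        raise ValueError(f"unsupported scale dtype: {scale_dtype}")
    if zp_dtype not in ALLOWED_ZP_DTYPES:
        raise ValueError(f"unsupported zero_point dtype: {zp_dtype}")


check_qparam_dtypes("fp32", "int8")  # accepted, returns None
```

Rejecting unsupported combinations up front means a missing shader variant surfaces as a clear error rather than as a failed shader lookup later on.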
This change resolves shader compilation errors and enables more flexible quantization strategies by supporting multiple data types for quantization parameters.
Reviewed By: SS-JIA
Differential Revision: D79835267
1 parent f7ddbde
File tree: 18 files changed (+595, -118 lines)
- backends/vulkan
- _passes
- runtime/graph/ops
- glsl
- impl
- utils
- test