Commit 8ae1b54
[GPU] Use padding in IGEMM pipeline to support unaligned to intrinsic shapes (#19484)
This PR does two things:
1. Allow all GEMM shapes to use the padded TileAndFuse Matmul configuration.
This path is still gated behind the
`iree-codegen-llvmgpu-test-tile-and-fuse-matmul=false` flag by default,
so the default behavior does not change. However, the following PRs, which
landed in the past month, make it possible to relax the guards we
originally had on this:
#19196
#19307
llvm/llvm-project#117340
2. Allow fused producers to use the padded TileAndFuse Matmul
configuration. The following PRs make this possible now:
#19399
llvm/llvm-project#119039
Together, these changes let the IGEMM path, which we use by default, pad
shapes that are unaligned to the intrinsic and still use intrinsics.
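As a rough illustration of what "unaligned to intrinsic" means here, the sketch below maps a convolution to implicit-GEMM sizes and rounds each size up to the next multiple of an intrinsic tile. The helper names, the example convolution, and the 16x16x16 intrinsic shape are all assumptions made for illustration; this is not the logic in the compiler.

```python
def conv_to_igemm_dims(batch, out_h, out_w, out_c, k_h, k_w, in_c):
    """Map an NHWC convolution to implicit-GEMM (M, N, K) sizes."""
    m = batch * out_h * out_w   # one GEMM row per output pixel
    n = out_c                   # one GEMM column per output channel
    k = k_h * k_w * in_c        # reduction over the filter window and input channels
    return m, n, k


def pad_to_multiple(size, multiple):
    """Round size up to the next multiple (integer ceiling division)."""
    return -(-size // multiple) * multiple


def padded_gemm_dims(dims, intrinsic):
    """Pad each GEMM dimension to the matching intrinsic dimension."""
    return tuple(pad_to_multiple(d, i) for d, i in zip(dims, intrinsic))


# Hypothetical conv: batch 2, 17x17 output, 40 output channels,
# 3x3 filter, 10 input channels.
dims = conv_to_igemm_dims(2, 17, 17, 40, 3, 3, 10)

# Hypothetical 16x16x16 (M, N, K) intrinsic tile.
padded = padded_gemm_dims(dims, (16, 16, 16))

print(dims)    # (578, 40, 90)
print(padded)  # (592, 48, 96)
```

None of the three sizes in this example is a multiple of 16, so without padding such a shape could not be tiled exactly by the assumed intrinsic.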
[Here](https://docs.google.com/spreadsheets/d/1O-SdUZCn5pHsxx7JTGjIIdH6PWCFnvlfe4XBbjEBaIM/edit?gid=0#gid=0)
is the performance difference observed in conv cases in
iree-kernel-benchmark-module that utilize this change. A median speedup
of 2.26x was observed.
The numeric changes I observed when enabling this path were the same as
those seen for any aligned shape when comparing intrinsic against
non-intrinsic use. Some differences generally show up for narrow types
like f16, but they stay within a relative error of 0.001. Since our tests
use absolute errors, we may have to change some test values to account
for this change.
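To make the relative versus absolute error point concrete, here is a small sketch. The values and the absolute tolerance are hypothetical, chosen only to show how a result can be inside a relative error of 0.001 and still fail an absolute check; they are not taken from the test suite.

```python
import math


def within_relative(actual, expected, rel_tol=1e-3):
    """True if actual matches expected to within a relative tolerance."""
    return math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=0.0)


def within_absolute(actual, expected, abs_tol=1e-3):
    """True if actual matches expected to within an absolute tolerance."""
    return abs(actual - expected) <= abs_tol


# Hypothetical large accumulated value with a relative deviation of 5e-4.
expected = 2048.0
actual = expected * (1 + 5e-4)

print(within_relative(actual, expected))  # True: deviation is below 0.001 relative
print(within_absolute(actual, expected))  # False: deviation is about 1.024 absolute
```

The larger the expected magnitude, the more an absolute tolerance tightens in relative terms, which is why test values may need adjusting.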
The perf difference in CI seems to be within the noise margin compared to
main:
https://github.com/iree-org/iree/actions/runs/12323399269/attempts/1#summary-34399247902
---------
Signed-off-by: Nirvedh <[email protected]>
1 parent 78ea0ad commit 8ae1b54
File tree
4 files changed, +112 -18 lines changed
- compiler/src/iree/compiler/Codegen
  - Dialect/GPU/TargetUtils
  - LLVMGPU
  - test/ROCDL
Lines changed: 4 additions & 12 deletions
Lines changed: 26 additions & 6 deletions
Lines changed: 80 additions & 0 deletions