Commit ad7f009
[Codegen] Bubble up Transpose attention V and try fuse with others before attention (iree-org#19250)
Flash Attention transpose_V variant is significantly faster than the
non-transpose_V variant. This is due to many matmul intrinsics being
mmtb by default. Hence, doing FA transpose_V will allow for better/more
contiguous reads from shared memory to register, improving the attention
performance quite a bit.
This PR exposes the attention_transposeV form by generating a
linalg.transpose on the V during bubbling up of transpose S.T we can
give the graph some opportunities to fuse the transpose-V to it's
producer. I have also confirmed that if we do not find any producer, the
transpose will indeed fuse back with the attenionOp. Hence worse case,
we will get same perf as before this PR.
Additionally, we modify elementwise op fusion to try fuse transpose with
other ops before letting it get fused back into attention.
---------
Signed-off-by: Stanley Winata <[email protected]>1 parent f6e52d8 commit ad7f009
File tree
8 files changed
+314
-17
lines changed- .github/workflows
- build_tools/pkgci/external_test_suite
- compiler/src/iree/compiler
- Dialect/LinalgExt/Transforms
- DispatchCreation
- test
- GlobalOptimization
- test
8 files changed
+314
-17
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
220 | 220 | | |
221 | 221 | | |
222 | 222 | | |
223 | | - | |
| 223 | + | |
224 | 224 | | |
225 | 225 | | |
226 | 226 | | |
| |||
238 | 238 | | |
239 | 239 | | |
240 | 240 | | |
241 | | - | |
242 | | - | |
| 241 | + | |
| 242 | + | |
243 | 243 | | |
244 | 244 | | |
245 | | - | |
| 245 | + | |
246 | 246 | | |
247 | 247 | | |
248 | 248 | | |
249 | 249 | | |
250 | 250 | | |
251 | | - | |
252 | | - | |
| 251 | + | |
| 252 | + | |
253 | 253 | | |
254 | | - | |
255 | | - | |
| 254 | + | |
| 255 | + | |
256 | 256 | | |
257 | 257 | | |
258 | 258 | | |
| |||
Lines changed: 71 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
208 | 208 | | |
209 | 209 | | |
210 | 210 | | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
211 | 246 | | |
212 | 247 | | |
213 | 248 | | |
| |||
239 | 274 | | |
240 | 275 | | |
241 | 276 | | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
| 294 | + | |
| 295 | + | |
| 296 | + | |
| 297 | + | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
| 306 | + | |
| 307 | + | |
| 308 | + | |
242 | 309 | | |
243 | 310 | | |
244 | 311 | | |
| |||
302 | 369 | | |
303 | 370 | | |
304 | 371 | | |
| 372 | + | |
| 373 | + | |
| 374 | + | |
| 375 | + | |
305 | 376 | | |
306 | 377 | | |
307 | 378 | | |
| |||
Lines changed: 6 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
19 | 19 | | |
20 | 20 | | |
21 | 21 | | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
22 | 28 | | |
23 | 29 | | |
24 | 30 | | |
| |||
Lines changed: 105 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
7 | 7 | | |
8 | 8 | | |
9 | 9 | | |
| 10 | + | |
10 | 11 | | |
11 | 12 | | |
12 | 13 | | |
| |||
101 | 102 | | |
102 | 103 | | |
103 | 104 | | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
104 | 202 | | |
105 | 203 | | |
106 | 204 | | |
| |||
110 | 208 | | |
111 | 209 | | |
112 | 210 | | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
113 | 218 | | |
Lines changed: 19 additions & 9 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
104 | 104 | | |
105 | 105 | | |
106 | 106 | | |
107 | | - | |
108 | 107 | | |
109 | 108 | | |
110 | 109 | | |
| |||
135 | 134 | | |
136 | 135 | | |
137 | 136 | | |
138 | | - | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
139 | 140 | | |
140 | 141 | | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
141 | 152 | | |
142 | 153 | | |
143 | 154 | | |
144 | 155 | | |
145 | 156 | | |
146 | 157 | | |
| 158 | + | |
147 | 159 | | |
148 | | - | |
149 | | - | |
150 | | - | |
151 | | - | |
152 | | - | |
| 160 | + | |
| 161 | + | |
153 | 162 | | |
154 | | - | |
155 | | - | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
156 | 166 | | |
157 | 167 | | |
158 | 168 | | |
| |||
Lines changed: 51 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
207 | 207 | | |
208 | 208 | | |
209 | 209 | | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
0 commit comments