Commit 933f798
[DT] Fuse encoding ops more aggressively for multi-use, gather, and slices ops. (iree-org#21830)
The multi-use dispatch fusion constraint is required only by the
SetEncoding pass, because that pass has to move consumer dispatches
around. Encoding fusion does not need it, because it only moves a
SetEncoding op into its producer dispatch.

The revision also allows fusion when the dispatch region contains
tensor.extract_slice or iree_linalg_ext.gather ops. In the llama fp8
model, this reduces the number of dispatches from 741 to 644, the same
as without data tiling, and lowers the latency by 25ms, from 378ms to
353ms.

| | No Data Tiling | Data Tiling w/o the revision | Data Tiling w/ the revision |
| ------------- | ------------- | ------------- | ------------- |
| Benchmark latency | 243ms | 378ms | 353ms |
| Memory usage (HIP unpooled) | 15.9GB | 31.14GB | 31.11GB |
| Number of dispatches | 644 | 741 | 644 |

| Dispatch | No Data Tiling (ms) | Data Tiling w/o the revision (ms) | Data Tiling w/ the revision (ms) |
| ------------- | ------------- | ------------- | ------------- |
| dispatch_15_attention_4x8x4xDx128xf8 | 62.29 | 55.35 | 59.21 |
| dispatch_20_matmul_like_Dx14336x4096_f8xf8xf32 | 40.13 | 89.14 | 93.72 |
| dispatch_19_matmul_like_Dx14336x4096_f8xf8xf32 | 28.01 | 44.78 | 44.59 |
| dispatch_21_matmul_like_Dx4096x14336_f8xf8xf32 | 27.25 | 40.18 | 39.99 |
| dispatch_643_matmul_like_Dx128256x4096_f16xf16xf32 | 17.1 | 29.76 | 29.21 |
| dispatch_16_matmul_like_Dx4096x4096_f8xf8xf32 | 8.83 | 17.92 | 17.91 |
| dispatch_23_matmul_like_Dx4096x4096_f8xf8xf32 | 9.27 | 16.69 | 16.59 |
| encoding_10_encode_Dx4096xf8_to_Dx4096xf8 | - | 32.15 | - |
| encoding_6_encode_Dx14336xf32_to_Dx14336xf32 | - | 0.318 | - |

---------
Signed-off-by: hanhanW <hanhan0912@gmail.com>

1 parent b4da7b2 commit 933f798
5 files changed, +101 -26 lines changed
File tree:
- compiler/src/iree/compiler/DispatchCreation
- test
Lines changed: 10 additions & 8 deletions
Lines changed: 1 addition & 6 deletions
Lines changed: 2 additions & 2 deletions
Lines changed: 18 additions & 10 deletions
Lines changed: 70 additions & 0 deletions