Commit a09be42
[Dispatch] Bubble extract_slice through all parallel generics (#20161)
Fixes the llama fp8 perf regression introduced by #20106. That PR stopped
the `linalg.generic` from getting hoisted, which caused a broadcast to get
fused and `tensor<1x1x131072x131072xi1>` to be recomputed on each prefill
call.
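The rewrite named in the title relies on a general identity: for an op whose iterators are all parallel (each output element depends only on the matching input elements), slicing the result equals applying the op to the sliced input. The toy sketch below illustrates only that identity in plain Python. It is not IREE code, and `parallel_generic`, `extract_slice`, and `counted_negate` are hypothetical stand-ins, not the compiler's APIs.

```python
from typing import Callable, List

Matrix = List[List[int]]


def parallel_generic(fn: Callable[[int], int], src: Matrix) -> Matrix:
    """Apply fn to every element independently (toy all-parallel op)."""
    return [[fn(v) for v in row] for row in src]


def extract_slice(src: Matrix, row_start: int, row_end: int,
                  col_start: int, col_end: int) -> Matrix:
    """Return the rectangular sub-matrix [row_start:row_end, col_start:col_end]."""
    return [row[col_start:col_end] for row in src[row_start:row_end]]


# Count how many element computations each ordering performs.
calls = {"n": 0}


def counted_negate(v: int) -> int:
    calls["n"] += 1
    return -v


source = [[r * 4 + c for c in range(4)] for r in range(4)]

# Ordering 1: compute the full result, then slice it.
calls["n"] = 0
slice_last = extract_slice(parallel_generic(counted_negate, source), 0, 2, 0, 2)
calls_slice_last = calls["n"]

# Ordering 2: slice the input first, then compute only on the slice.
calls["n"] = 0
slice_first = parallel_generic(counted_negate, extract_slice(source, 0, 2, 0, 2))
calls_slice_first = calls["n"]

print(slice_last == slice_first, calls_slice_last, calls_slice_first)
```

Both orderings produce the same values, but the second touches only the elements that are kept. Whether this is worthwhile in a real compiler depends on what else consumes the full result, which this sketch does not model.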
---------
Signed-off-by: Ian Wood <[email protected]>
1 parent ec128bf commit a09be42
File tree
2 files changed: +23 -5 lines changed
- compiler/src/iree/compiler/DispatchCreation
- test

Lines changed: 0 additions & 5 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 58-60 | 58-60 | |
| 61-65 | | - |
| 66-68 | 61-63 | |
Lines changed: 23 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 141-143 | 141-143 | |
| | 144-166 | + |