Commit f800d64
[AMDGPU] Optimize out s_barrier_signal/_wait
Extend the optimization that converts s_barrier to wave_barrier (nop)
when the number of work items is not larger than wave size.
This handles the "split barrier" form of s_barrier where the barrier
is represented by separate intrinsics (s_barrier_signal/s_barrier_wait).
Note: the version where s_barrier is used on gfx12 (and split later)
already has the optimization, but some front ends may prefer to use the
split intrinsics directly, which is the case this patch addresses.

1 parent 3955c2b commit f800d64
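The decision described in the commit message can be sketched as a small standalone C++ model. This is not the code from the patch: `BarrierOp`, `lowerBarrier`, and its parameters are hypothetical names introduced for illustration. It also assumes that, in the split form, the signal half is dropped and the wait half becomes the wave barrier; the commit message does not spell out that detail.

```cpp
#include <optional>

// Hypothetical model of the barrier operations named in the commit message.
enum class BarrierOp { SBarrier, SBarrierSignal, SBarrierWait, WaveBarrier };

// Returns the operation to emit, or std::nullopt when nothing is emitted.
// maxWorkGroupSize and waveSize are illustrative parameters, not LLVM APIs.
std::optional<BarrierOp> lowerBarrier(BarrierOp op, unsigned maxWorkGroupSize,
                                      unsigned waveSize) {
  // The workgroup spans more than one wave: a real barrier is still required.
  if (maxWorkGroupSize > waveSize)
    return op;

  switch (op) {
  case BarrierOp::SBarrier:
  case BarrierOp::SBarrierWait:
    // The whole workgroup fits in one wave, so a wave_barrier (nop) suffices.
    return BarrierOp::WaveBarrier;
  case BarrierOp::SBarrierSignal:
    // Assumption: the signal half has nothing left to signal and is removed.
    return std::nullopt;
  case BarrierOp::WaveBarrier:
    return op;
  }
  return op;
}
```

For example, with a wave size of 64, a kernel limited to 64 work items would have its `SBarrierWait` mapped to `WaveBarrier`, while a kernel allowing 256 work items keeps the original operation.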
File tree
3 files changed, +15 -7 lines changed
- llvm
  - lib/Target/AMDGPU
  - test/CodeGen/AMDGPU
Diff hunks (file names and changed-line contents were not captured; only hunk positions survive):

| File | Location (original line numbers) | Lines added | Lines removed |
|---|---|---|---|
| 1 | 1846-1847 | 3 | 2 |
| 1 | after 2163 | 2 | 0 |
| 2 | 9608 | 3 | 1 |
| 2 | 9618-9619 | 2 | 2 |
| 3 | 19 | 3 | 1 |
| 3 | 42 | 2 | 1 |