Commit 485b3af
[RISCV] Reduce minimum VL needed for vslidedown.vx in RISCVVLOptimizer (#168392)
Once #149042 is relanded, we will start EVL tail folding
vectorized loops that have live-outs, e.g.:
```c
int f(int *x, int n) {
  int y = 0; // declared outside the loop so it is live-out
  for (int i = 0; i < n; i++) {
    y = x[i] + 1;
    x[y] = y;
  }
  return y;
}
```
These are vectorized by extracting the last "active lane" in the loop's
exit:
```llvm
loop:
  %vl = call i32 @llvm.experimental.get.vector.length.i64(i64 %avl, i32 4, i1 true)
  ...
exit:
  %vl.zext = zext i32 %vl to i64
  %lastidx = sub i64 %vl.zext, 1
  %lastelt = extractelement <vscale x 4 x i32> %y, i64 %lastidx
```
On RISC-V, this translates to a vslidedown.vx with a VL of 1:
```llvm
bb.loop:
  %vl:gprnox0 = PseudoVSETVLI ...
  %y:vr = PseudoVADD_VI_M1 $noreg, %x, 1, AVL=-1
  ...
bb.exit:
  %lastidx:gprnox0 = ADDI %vl, -1
  %w:vr = PseudoVSLIDEDOWN_VX_M1 $noreg, %y, %lastidx, AVL=1
```
However, today we fail to reduce the VL of %y in the loop and end up
with two extra VL toggles. This is because RISCVVLOptimizer is
conservative with vslidedown.vx, since it can read lanes of %y past its
own VL. So in `getMinimumVLForUser` we say that vslidedown.vx demands
the entirety of %y.
One observation with the sequence above is that it only actually needs
to read the first %vl lanes of %y: with a VL of 1, the only lane of vs2
read is the one at index offset, so offset + 1 lanes are needed. In this
case, that's `%lastidx + 1 = %vl - 1 + 1 = %vl`.
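The arithmetic above can be checked with a small scalar model. This is only a sketch and not the code in this patch: `lanes_demanded` is a hypothetical helper, and it assumes the usual vslidedown semantics where `vd[i] = vs2[i + offset]` if `i + offset < VLMAX` and 0 otherwise.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not LLVM code: how many leading lanes of vs2 a
   vslidedown.vx can read, given its slide amount `offset`, its own VL `vl`,
   and `vlmax` lanes in the source register group. */
static uint64_t lanes_demanded(uint64_t offset, uint64_t vl, uint64_t vlmax) {
  assert(vl <= vlmax);
  if (vl == 0 || offset >= vlmax)
    return 0; /* every source index is out of range, so vs2 is not read */
  /* offset < vlmax here; compare this way so offset + vl cannot overflow. */
  if (vl >= vlmax - offset)
    return vlmax; /* the read window runs up to the end of vs2 */
  return offset + vl; /* highest lane read is offset + vl - 1 */
}
```

With a VL of 1 and `offset = %vl - 1`, the model returns `%vl`, which matches the derivation above.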
This PR teaches RISCVVLOptimizer about this case in
`getMinimumVLForVSLIDEDOWN_VX`, and in doing so removes the VL toggles.
The one case that I had to think about for a bit is when `ADDI %vl, -1`
wraps, i.e. when %vl=0 and the resulting offset is all ones. This
offset is always larger than the largest VLMAX, so vs2 will be
completely slid down and absent from the output. So we don't need to
read anything from vs2.
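The wrapping case can be illustrated with a scalar model as well. Again this is a sketch under assumptions: `slidedown` and `last_lane_offset` are hypothetical names, the elements are 32-bit, and the offset register is assumed to be 64 bits wide as on RV64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of `ADDI %vl, -1` on RV64: wraps to all ones for 0. */
static uint64_t last_lane_offset(uint64_t vl) { return vl - 1; }

/* Hypothetical scalar model of vslidedown.vx: lanes whose source index
   falls at or beyond vlmax are written as 0 and vs2 is not read for them. */
static void slidedown(int32_t *vd, const int32_t *vs2, uint64_t offset,
                      size_t vl, size_t vlmax) {
  assert(vl <= vlmax);
  for (size_t i = 0; i < vl; i++) {
    /* Same as offset + i < vlmax, written so the sum cannot overflow. */
    vd[i] = (offset < vlmax - i) ? vs2[offset + i] : 0;
  }
}
```

When `%vl` is 0, `last_lane_offset` yields `UINT64_MAX`, the bounds check fails for every lane, and no element of vs2 is touched.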
This patch on its own has no observable effect on llvm-test-suite or
SPEC CPU 2017 w/ rva23u64 today.
1 parent ea26d92
3 files changed, +120 -1 lines changed:
- llvm
- lib/Target/RISCV
- test/CodeGen/RISCV/rvv