Commit 25b65be

[RISCV][LSR] Account for temporary register for base addition (llvm#92296)
An LSR formula may require the addition of multiple base or scale registers, and this sum reduction requires a temporary register. Since the formulas are independent, we only need one temporary, regardless of the number of unique formulas; each formula can reuse the same temporary. A later CSE pass may come along and combine sub-expressions, but then the register pressure would be that pass's problem to consider.

This change fixes up the costing in a RISCV-specific way, but this is really a generic LSR problem. I just didn't feel like fighting with LSR and dealing with all the various targets swinging slightly in hard-to-reason-about ways. This problem is more pronounced on RISCV than on any other target due to our lack of addressing modes.

This change is not hugely important on its own, but I have an upcoming change to add support for shNadd in LSR which biases us fairly strongly towards adding more "base adds". Without this change, we see a net regression due to the increase in register pressure, which is not accounted for.
1 parent 6de14c6 commit 25b65be

2 files changed: +12 -10 lines changed

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

Lines changed: 6 additions & 2 deletions
@@ -1881,10 +1881,14 @@ unsigned RISCVTTIImpl::getMaximumVF(unsigned ElemWidth, unsigned Opcode) const {
 bool RISCVTTIImpl::isLSRCostLess(const TargetTransformInfo::LSRCost &C1,
                                  const TargetTransformInfo::LSRCost &C2) {
   // RISC-V specific here are "instruction number 1st priority".
-  return std::tie(C1.Insns, C1.NumRegs, C1.AddRecCost,
+  // If we need to emit adds inside the loop to add up base registers, then
+  // we need at least one extra temporary register.
+  unsigned C1NumRegs = C1.NumRegs + (C1.NumBaseAdds != 0);
+  unsigned C2NumRegs = C2.NumRegs + (C2.NumBaseAdds != 0);
+  return std::tie(C1.Insns, C1NumRegs, C1.AddRecCost,
                   C1.NumIVMuls, C1.NumBaseAdds,
                   C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
-         std::tie(C2.Insns, C2.NumRegs, C2.AddRecCost,
+         std::tie(C2.Insns, C2NumRegs, C2.AddRecCost,
                   C2.NumIVMuls, C2.NumBaseAdds,
                   C2.ScaleCost, C2.ImmCost, C2.SetupCost);
 }
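
To make the effect of the adjusted register count concrete, here is a minimal standalone sketch of the comparison before and after this change. It is not the LLVM code itself: `LSRCostSketch`, `effectiveNumRegs`, `isCostLessOld` and `isCostLessNew` are hypothetical names introduced only for illustration, and the struct merely mirrors the field names that appear in the hunk above.

```cpp
#include <tuple>

// Hypothetical stand-in for TargetTransformInfo::LSRCost. The field names
// follow the diff above; the struct itself exists only for this sketch.
struct LSRCostSketch {
  unsigned Insns = 0;
  unsigned NumRegs = 0;
  unsigned AddRecCost = 0;
  unsigned NumIVMuls = 0;
  unsigned NumBaseAdds = 0;
  unsigned ImmCost = 0;
  unsigned SetupCost = 0;
  unsigned ScaleCost = 0;
};

// One extra temporary is charged if the solution needs any base add at all.
// The count does not grow with the number of base adds, since every formula
// can reuse the same temporary.
unsigned effectiveNumRegs(const LSRCostSketch &C) {
  return C.NumRegs + (C.NumBaseAdds != 0);
}

// Ordering before the change: the raw NumRegs is the second key.
bool isCostLessOld(const LSRCostSketch &C1, const LSRCostSketch &C2) {
  return std::tie(C1.Insns, C1.NumRegs, C1.AddRecCost,
                  C1.NumIVMuls, C1.NumBaseAdds,
                  C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
         std::tie(C2.Insns, C2.NumRegs, C2.AddRecCost,
                  C2.NumIVMuls, C2.NumBaseAdds,
                  C2.ScaleCost, C2.ImmCost, C2.SetupCost);
}

// Ordering after the change: the adjusted register count is the second key.
// std::tie needs lvalues, hence the named locals.
bool isCostLessNew(const LSRCostSketch &C1, const LSRCostSketch &C2) {
  unsigned C1NumRegs = effectiveNumRegs(C1);
  unsigned C2NumRegs = effectiveNumRegs(C2);
  return std::tie(C1.Insns, C1NumRegs, C1.AddRecCost,
                  C1.NumIVMuls, C1.NumBaseAdds,
                  C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
         std::tie(C2.Insns, C2NumRegs, C2.AddRecCost,
                  C2.NumIVMuls, C2.NumBaseAdds,
                  C2.ScaleCost, C2.ImmCost, C2.SetupCost);
}
```

As an illustrative case with made-up numbers, take two solutions with equal `Insns`, where one uses 4 registers plus 2 base adds and the other uses 5 registers and no base adds. Under the old ordering the first wins on `NumRegs`. Under the new ordering both are charged 5 registers, so the tie falls through to `NumBaseAdds` and the second solution wins.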

llvm/test/CodeGen/RISCV/loop-strength-reduce-loop-invar.ll

Lines changed: 6 additions & 8 deletions
@@ -53,26 +53,24 @@ define void @test(i32 signext %row, i32 signext %N.in) nounwind {
 ; RV64: # %bb.0: # %entry
 ; RV64-NEXT: blez a1, .LBB0_3
 ; RV64-NEXT: # %bb.1: # %cond_true.preheader
-; RV64-NEXT: negw a1, a1
 ; RV64-NEXT: slli a0, a0, 6
 ; RV64-NEXT: lui a2, %hi(A)
 ; RV64-NEXT: addi a2, a2, %lo(A)
 ; RV64-NEXT: add a0, a0, a2
 ; RV64-NEXT: addi a2, a0, 4
+; RV64-NEXT: addiw a1, a1, 2
 ; RV64-NEXT: li a3, 2
 ; RV64-NEXT: li a4, 4
 ; RV64-NEXT: li a5, 5
-; RV64-NEXT: li a6, 2
 ; RV64-NEXT: .LBB0_2: # %cond_true
 ; RV64-NEXT: # =>This Inner Loop Header: Depth=1
 ; RV64-NEXT: sw a4, 0(a2)
-; RV64-NEXT: slli a7, a6, 2
-; RV64-NEXT: add a7, a0, a7
-; RV64-NEXT: sw a5, 0(a7)
-; RV64-NEXT: addiw a6, a6, 1
-; RV64-NEXT: addw a7, a1, a6
+; RV64-NEXT: slli a6, a3, 2
+; RV64-NEXT: add a6, a0, a6
+; RV64-NEXT: sw a5, 0(a6)
+; RV64-NEXT: addiw a3, a3, 1
 ; RV64-NEXT: addi a2, a2, 4
-; RV64-NEXT: bne a7, a3, .LBB0_2
+; RV64-NEXT: bne a3, a1, .LBB0_2
 ; RV64-NEXT: .LBB0_3: # %return
 ; RV64-NEXT: ret
 entry:
