[RISCV] Use Zilsd Pseudos in ISel #169580
Conversation
This is proposed as an alternative to llvm#169529. The idea here is, during selection, to choose between directly generating `LD`/`SD` or generating `PseudoLD_RV32_OPT`/`PseudoSD_RV32_OPT` based on the volatility of the access. Volatile operations will always become `LD`/`SD`, but non-volatile operations have a chance of becoming a pair of `LW`/`SW` instructions, depending on the register allocation, which might save some `MV` instructions. The advantage of this approach is that we don't need to go searching for instructions to pair (including comparing their memory operands) in the pre-RA pass; we already know these are paired, but they don't constrain the register allocator the way `LD`/`SD` do. This PR is maybe not enough on its own: we probably have to check that the passes between ISel and the pre-RA load/store pairing pass cope with this correctly, especially MergeBaseOffset. This also fixes a verifier error with the kill flags.
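The core decision described above is simple: volatile 64-bit accesses must stay a single paired memory operation, so they select the real `LD`/`SD` instruction (which constrains the result to an even/odd GPR pair), while non-volatile accesses select the `_OPT` pseudos and leave the allocator free. A minimal standalone sketch of that decision follows; the enum, function name, and string results are illustrative only — the real code in `RISCVDAGToDAGISel::Select` works on `SDNode`s and machine opcodes, not strings.

```cpp
#include <cassert>
#include <string>

// Toy model of the opcode choice for RISCVISD::LD_RV32 / RISCVISD::SD_RV32.
enum class Kind { Load, Store };

std::string selectZilsdOpcode(Kind K, bool IsVolatile) {
  if (IsVolatile) {
    // Volatile accesses must remain one paired access: emit the real
    // instruction, constraining the operands to an even/odd register pair.
    return K == Kind::Load ? "LD_RV32" : "SD_RV32";
  }
  // Non-volatile accesses keep two independent i32 values; after register
  // allocation the pseudo becomes LD/SD if the registers happen to form a
  // pair, or falls back to LW/LW (SW/SW) otherwise.
  return K == Kind::Load ? "PseudoLD_RV32_OPT" : "PseudoSD_RV32_OPT";
}
```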
@llvm/pr-subscribers-backend-risc-v Author: Sam Elliott (lenary). Full diff: https://github.com/llvm/llvm-project/pull/169580.diff (2 files affected):
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index 5025122db3681..0be5d8e731b60 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -1817,52 +1817,77 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
case RISCVISD::LD_RV32: {
assert(Subtarget->hasStdExtZilsd() && "LD_RV32 is only used with Zilsd");
+ auto *MemNode = cast<MemSDNode>(Node);
+
SDValue Base, Offset;
- SDValue Chain = Node->getOperand(0);
- SDValue Addr = Node->getOperand(1);
+ SDValue Chain = MemNode->getChain();
+ SDValue Addr = MemNode->getBasePtr();
SelectAddrRegImm(Addr, Base, Offset);
SDValue Ops[] = {Base, Offset, Chain};
- MachineSDNode *New = CurDAG->getMachineNode(
- RISCV::LD_RV32, DL, {MVT::Untyped, MVT::Other}, Ops);
- SDValue Lo = CurDAG->getTargetExtractSubreg(RISCV::sub_gpr_even, DL,
- MVT::i32, SDValue(New, 0));
- SDValue Hi = CurDAG->getTargetExtractSubreg(RISCV::sub_gpr_odd, DL,
- MVT::i32, SDValue(New, 0));
- CurDAG->setNodeMemRefs(New, {cast<MemSDNode>(Node)->getMemOperand()});
+ MachineSDNode *New;
+ SDValue Lo, Hi, OutChain;
+ if (MemNode->isVolatile()) {
+ New = CurDAG->getMachineNode(RISCV::LD_RV32, DL,
+ {MVT::Untyped, MVT::Other}, Ops);
+
+ Lo = CurDAG->getTargetExtractSubreg(RISCV::sub_gpr_even, DL, MVT::i32,
+ SDValue(New, 0));
+ Hi = CurDAG->getTargetExtractSubreg(RISCV::sub_gpr_odd, DL, MVT::i32,
+ SDValue(New, 0));
+ OutChain = SDValue(New, 1);
+ } else {
+ New = CurDAG->getMachineNode(RISCV::PseudoLD_RV32_OPT, DL,
+ {MVT::i32, MVT::i32, MVT::Other}, Ops);
+ Lo = SDValue(New, 0);
+ Hi = SDValue(New, 1);
+ OutChain = SDValue(New, 2);
+ }
+
+ CurDAG->setNodeMemRefs(New, {MemNode->getMemOperand()});
ReplaceUses(SDValue(Node, 0), Lo);
ReplaceUses(SDValue(Node, 1), Hi);
- ReplaceUses(SDValue(Node, 2), SDValue(New, 1));
+ ReplaceUses(SDValue(Node, 2), OutChain);
CurDAG->RemoveDeadNode(Node);
return;
}
case RISCVISD::SD_RV32: {
+ auto *MemNode = cast<MemSDNode>(Node);
+
SDValue Base, Offset;
- SDValue Chain = Node->getOperand(0);
+ SDValue Chain = MemNode->getChain();
SDValue Addr = Node->getOperand(3);
SelectAddrRegImm(Addr, Base, Offset);
SDValue Lo = Node->getOperand(1);
SDValue Hi = Node->getOperand(2);
- SDValue RegPair;
- // Peephole to use X0_Pair for storing zero.
- if (isNullConstant(Lo) && isNullConstant(Hi)) {
- RegPair = CurDAG->getRegister(RISCV::X0_Pair, MVT::Untyped);
- } else {
- SDValue Ops[] = {
- CurDAG->getTargetConstant(RISCV::GPRPairRegClassID, DL, MVT::i32), Lo,
- CurDAG->getTargetConstant(RISCV::sub_gpr_even, DL, MVT::i32), Hi,
- CurDAG->getTargetConstant(RISCV::sub_gpr_odd, DL, MVT::i32)};
+ MachineSDNode *New;
+ if (MemNode->isVolatile()) {
+ SDValue RegPair;
+ // Peephole to use X0_Pair for storing zero.
+ if (isNullConstant(Lo) && isNullConstant(Hi)) {
+ RegPair = CurDAG->getRegister(RISCV::X0_Pair, MVT::Untyped);
+ } else {
+ SDValue Ops[] = {
+ CurDAG->getTargetConstant(RISCV::GPRPairRegClassID, DL, MVT::i32),
+ Lo, CurDAG->getTargetConstant(RISCV::sub_gpr_even, DL, MVT::i32),
+ Hi, CurDAG->getTargetConstant(RISCV::sub_gpr_odd, DL, MVT::i32)};
+
+ RegPair = SDValue(CurDAG->getMachineNode(TargetOpcode::REG_SEQUENCE, DL,
+ MVT::Untyped, Ops),
+ 0);
+ }
- RegPair = SDValue(CurDAG->getMachineNode(TargetOpcode::REG_SEQUENCE, DL,
- MVT::Untyped, Ops),
- 0);
+ New = CurDAG->getMachineNode(RISCV::SD_RV32, DL, MVT::Other,
+ {RegPair, Base, Offset, Chain});
+ } else {
+ New = CurDAG->getMachineNode(RISCV::PseudoSD_RV32_OPT, DL, MVT::Other,
+ {Lo, Hi, Base, Offset, Chain});
}
- MachineSDNode *New = CurDAG->getMachineNode(RISCV::SD_RV32, DL, MVT::Other,
- {RegPair, Base, Offset, Chain});
- CurDAG->setNodeMemRefs(New, {cast<MemSDNode>(Node)->getMemOperand()});
+ CurDAG->setNodeMemRefs(New, {MemNode->getMemOperand()});
+
ReplaceUses(SDValue(Node, 0), SDValue(New, 0));
CurDAG->RemoveDeadNode(Node);
return;
diff --git a/llvm/test/CodeGen/RISCV/zilsd.ll b/llvm/test/CodeGen/RISCV/zilsd.ll
index 27b1ff76f6f05..4146535318fb8 100644
--- a/llvm/test/CodeGen/RISCV/zilsd.ll
+++ b/llvm/test/CodeGen/RISCV/zilsd.ll
@@ -9,9 +9,10 @@
define i64 @load(ptr %a) nounwind {
; CHECK-LABEL: load:
; CHECK: # %bb.0:
-; CHECK-NEXT: mv a2, a0
-; CHECK-NEXT: ld a0, 80(a0)
-; CHECK-NEXT: ld zero, 0(a2)
+; CHECK-NEXT: lw a2, 80(a0)
+; CHECK-NEXT: lw a1, 84(a0)
+; CHECK-NEXT: ld zero, 0(a0)
+; CHECK-NEXT: mv a0, a2
; CHECK-NEXT: ret
%1 = getelementptr i64, ptr %a, i32 10
%2 = load i64, ptr %1
@@ -44,10 +45,10 @@ define i64 @load_align4(ptr %a) nounwind {
define void @store(ptr %a, i64 %b) nounwind {
; CHECK-LABEL: store:
; CHECK: # %bb.0:
-; CHECK-NEXT: mv a3, a2
-; CHECK-NEXT: mv a2, a1
-; CHECK-NEXT: sd a2, 0(a0)
-; CHECK-NEXT: sd a2, 88(a0)
+; CHECK-NEXT: sw a1, 0(a0)
+; CHECK-NEXT: sw a2, 4(a0)
+; CHECK-NEXT: sw a1, 88(a0)
+; CHECK-NEXT: sw a2, 92(a0)
; CHECK-NEXT: ret
store i64 %b, ptr %a
%1 = getelementptr i64, ptr %a, i32 11
@@ -56,25 +57,11 @@ define void @store(ptr %a, i64 %b) nounwind {
}
define void @store_align4(ptr %a, i64 %b) nounwind {
-; SLOW-LABEL: store_align4:
-; SLOW: # %bb.0:
-; SLOW-NEXT: sw a1, 88(a0)
-; SLOW-NEXT: sw a2, 92(a0)
-; SLOW-NEXT: ret
-;
-; FAST-LABEL: store_align4:
-; FAST: # %bb.0:
-; FAST-NEXT: mv a3, a2
-; FAST-NEXT: mv a2, a1
-; FAST-NEXT: sd a2, 88(a0)
-; FAST-NEXT: ret
-;
-; 4BYTEALIGN-LABEL: store_align4:
-; 4BYTEALIGN: # %bb.0:
-; 4BYTEALIGN-NEXT: mv a3, a2
-; 4BYTEALIGN-NEXT: mv a2, a1
-; 4BYTEALIGN-NEXT: sd a2, 88(a0)
-; 4BYTEALIGN-NEXT: ret
+; CHECK-LABEL: store_align4:
+; CHECK: # %bb.0:
+; CHECK-NEXT: sw a1, 88(a0)
+; CHECK-NEXT: sw a2, 92(a0)
+; CHECK-NEXT: ret
%1 = getelementptr i64, ptr %a, i32 11
store i64 %b, ptr %1, align 4
ret void
@@ -158,9 +145,8 @@ define void @store_unaligned(ptr %p, i64 %v) {
;
; FAST-LABEL: store_unaligned:
; FAST: # %bb.0:
-; FAST-NEXT: mv a3, a2
-; FAST-NEXT: mv a2, a1
-; FAST-NEXT: sd a2, 0(a0)
+; FAST-NEXT: sw a1, 0(a0)
+; FAST-NEXT: sw a2, 4(a0)
; FAST-NEXT: ret
;
; 4BYTEALIGN-LABEL: store_unaligned:
@@ -213,11 +199,13 @@ define void @large_offset(ptr nocapture %p, i64 %d) nounwind {
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: lui a1, 4
; CHECK-NEXT: add a0, a0, a1
-; CHECK-NEXT: ld a2, -384(a0)
-; CHECK-NEXT: addi a2, a2, 1
-; CHECK-NEXT: seqz a1, a2
-; CHECK-NEXT: add a3, a3, a1
-; CHECK-NEXT: sd a2, -384(a0)
+; CHECK-NEXT: lw a1, -384(a0)
+; CHECK-NEXT: lw a2, -380(a0)
+; CHECK-NEXT: addi a1, a1, 1
+; CHECK-NEXT: seqz a3, a1
+; CHECK-NEXT: add a2, a2, a3
+; CHECK-NEXT: sw a1, -384(a0)
+; CHECK-NEXT: sw a2, -380(a0)
; CHECK-NEXT: ret
entry:
%add.ptr = getelementptr inbounds i64, ptr %p, i64 2000
I'm not familiar with […]
The change with this PR is that […]. I think with this change, we might generate a […]. I think we already have code for this for Zdinx, so hopefully we can reuse that. I don't think the lack of MergeBaseOffset changes should prevent this change.
Don't we need to call MRI->setRegAllocationHint for the PseudoLD_RV32_OPT registers after isel?
Maybe! I guess I forgot that step. I have a prospective patch to add eliminateFrameIndex support to these instructions, following the RV32Zdinx spill pseudos, but it's entirely untested right now. I did have a bit of a look at MergeBaseOffset, but I didn't make progress there.
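For context on why the allocation hint matters here: without it, the allocator may place the pseudo's two i32 values in registers that don't form an even/odd pair, forcing an extra `MV` before the pseudo can be rewritten to `LD`/`SD`. A hypothetical toy model of hint-aware register picking (this is not LLVM's `setRegAllocationHint` API, just an illustration of the idea):

```cpp
#include <cassert>
#include <set>

// Toy allocator step: honor the hinted register when it is still free,
// which avoids a later copy; otherwise fall back to any free register.
// Register numbers and logic are illustrative only.
int pickReg(std::set<int> &Free, int Hint) {
  if (Free.count(Hint)) {
    Free.erase(Hint);
    return Hint; // hint honored: no MV needed later
  }
  int R = *Free.begin(); // fall back to the lowest-numbered free register
  Free.erase(Free.begin());
  return R;
}
```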
🐧 Linux x64 Test Results

Failed tests:
- LLVM.CodeGen/RISCV/double-convert.ll
- LLVM.CodeGen/RISCV/double-imm.ll
- LLVM.CodeGen/RISCV/double-intrinsics-strict.ll
- LLVM.CodeGen/RISCV/double-intrinsics.ll
- LLVM.CodeGen/RISCV/double-mem.ll
- LLVM.CodeGen/RISCV/double-previous-failure.ll
- LLVM.CodeGen/RISCV/double-round-conv-sat.ll

If these failures are unrelated to your changes (for example, tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the […] label.
let hasSideEffects = false;
let mayLoad = true;
let mayStore = false;
let isCodeGenOnly = true;
I thought isCodeGenOnly was set by Pseudo already.
Oh it might be. I should have checked.