- 
                Notifications
    You must be signed in to change notification settings 
- Fork 15k
[DAG] Fold rem(rem(A, BCst), Op1Cst) -> rem(A, Op1Cst) #159517
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-selectiondag Author: None (kper) ChangesFixes 157370 General proof: https://alive2.llvm.org/ce/z/b_GQJX I have added it as rv32i and rv64i tests because they are the only architectures where I could verify that it works. Full diff: https://github.com/llvm/llvm-project/pull/159517.diff 2 Files Affected: 
 diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 77bc47f28fc80..8e23282aad0fd 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -54,6 +54,7 @@
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/Metadata.h"
+#include "llvm/IR/PatternMatch.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/CodeGen.h"
 #include "llvm/Support/CommandLine.h"
@@ -5405,6 +5406,14 @@ SDValue DAGCombiner::visitREM(SDNode *N) {
   if (SDValue DivRem = useDivRem(N))
     return DivRem.getValue(1);
 
+  // fold urem(urem(A, BCst), Op1Cst) -> urem(A, Op1Cst)
+  SDValue A;
+  APInt Op1Cst, BCCst;
+  if (sd_match(N0, m_URem(m_Value(A), m_ConstInt(BCCst))) &&
+      sd_match(N1, m_ConstInt(Op1Cst)) && BCCst.urem(Op1Cst).isZero()) {
+    return DAG.getNode(ISD::UREM, DL, VT, A, DAG.getConstant(Op1Cst, DL, VT));
+  }
+
   return SDValue();
 }
 
diff --git a/llvm/test/CodeGen/RISCV/urem.ll b/llvm/test/CodeGen/RISCV/urem.ll
new file mode 100644
index 0000000000000..aeec9abd0ed9f
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/urem.ll
@@ -0,0 +1,71 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=CHECK,RV32I %s
+; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=CHECK,RV64I %s
+
+define i32 @fold_urem_constants(i32 %v0) {
+; RV32I-LABEL: fold_urem_constants:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    li a1, 5
+; RV32I-NEXT:    tail __umodsi3
+;
+; RV64I-LABEL: fold_urem_constants:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    addi sp, sp, -16
+; RV64I-NEXT:    .cfi_def_cfa_offset 16
+; RV64I-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-NEXT:    .cfi_offset ra, -8
+; RV64I-NEXT:    slli a0, a0, 32
+; RV64I-NEXT:    srli a0, a0, 32
+; RV64I-NEXT:    li a1, 5
+; RV64I-NEXT:    call __umoddi3
+; RV64I-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-NEXT:    .cfi_restore ra
+; RV64I-NEXT:    addi sp, sp, 16
+; RV64I-NEXT:    .cfi_def_cfa_offset 0
+; RV64I-NEXT:    ret
+  %v1 = urem i32 %v0, 25
+  %v2 = urem i32 %v1, 5
+  ret i32 %v2
+}
+
+define i32 @dont_test_fold_urem_constants(i32 %v0) {
+; RV32I-LABEL: dont_test_fold_urem_constants:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    .cfi_def_cfa_offset 16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    .cfi_offset ra, -4
+; RV32I-NEXT:    li a1, 25
+; RV32I-NEXT:    call __umodsi3
+; RV32I-NEXT:    li a1, 3
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    .cfi_restore ra
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    .cfi_def_cfa_offset 0
+; RV32I-NEXT:    tail __umodsi3
+;
+; RV64I-LABEL: dont_test_fold_urem_constants:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    addi sp, sp, -16
+; RV64I-NEXT:    .cfi_def_cfa_offset 16
+; RV64I-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-NEXT:    .cfi_offset ra, -8
+; RV64I-NEXT:    slli a0, a0, 32
+; RV64I-NEXT:    srli a0, a0, 32
+; RV64I-NEXT:    li a1, 25
+; RV64I-NEXT:    call __umoddi3
+; RV64I-NEXT:    li a1, 3
+; RV64I-NEXT:    call __umoddi3
+; RV64I-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-NEXT:    .cfi_restore ra
+; RV64I-NEXT:    addi sp, sp, 16
+; RV64I-NEXT:    .cfi_def_cfa_offset 0
+; RV64I-NEXT:    ret
+  %v1 = urem i32 %v0, 25
+  %v2 = urem i32 %v1, 3
+  ret i32 %v2
+}
+;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+; CHECK: {{.*}}
 | 
| cc @RKSimon | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic looks correct to me. But should it have a "one use" check, like was mentioned in #157644?
|  | ||
| // fold urem(urem(A, BCst), Op1Cst) -> urem(A, Op1Cst) | ||
| SDValue A; | ||
| APInt Op1Cst, BCCst; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: call it BCst to match the comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, it was a typo :)
| 
 I wasn't really sure. We only added it because it showed that we're breaking up the div+rem patterns. Beside that there wasn't a functional reason. So should I add it again? | 
        
          
                llvm/test/CodeGen/RISCV/srem.ll
              
                Outdated
          
        
      | ; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \ | ||
| ; RUN: | FileCheck -check-prefixes=CHECK,RV64I %s | ||
|  | ||
| define i32 @fold_srem_constants(i32 %v0) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add nounwind to get rid of all the cfi noise
| #include "llvm/IR/DerivedTypes.h" | ||
| #include "llvm/IR/Function.h" | ||
| #include "llvm/IR/Metadata.h" | ||
| #include "llvm/IR/Type.h" | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary?
| if (sd_match(N, m_SRem(m_SRem(m_Value(A), m_ConstInt(BCst)), | ||
| m_ConstInt(Op1Cst))) && | ||
| BCst.srem(Op1Cst).isZero() && | ||
| Op1Cst.ne(APInt::getAllOnes(Op1Cst.getBitWidth()))) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!Op1Cst.isAllOnes() ?
| Gentle ping | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| LLVM Buildbot has detected a new failure on builder  Full details are available at: https://lab.llvm.org/buildbot/#/builders/27/builds/16404 Here is the relevant piece of the build log for the reference | 
| LLVM Buildbot has detected a new failure on builder  Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/22972 Here is the relevant piece of the build log for the reference | 
Fixes 157370
UREM General proof: https://alive2.llvm.org/ce/z/b_GQJX
SREM General proof: https://alive2.llvm.org/ce/z/Whkaxh
I have added it as rv32i and rv64i tests because they are the only architectures where I could verify that it works.