anoopkg6
Contributor

@anoopkg6 anoopkg6 commented Feb 6, 2025

Add support for the flag output operand "=@cc" inline assembly constraint on
SystemZ.

  • Clang now accepts "=@cc" assembly operands and sets the output operand to
    the 2-bit condition code on SystemZ.

  • Clang currently emits an assertion that flag output operands are boolean
    values, i.e. in the range [0, 2). Generalize this mechanism to allow
    targets to specify arbitrary range assertions for any inline assembly
    output operand. This will be used to assert that SystemZ two-bit
    condition-code values are in the range [0, 4).

  • The SystemZ backend lowers "@cc" targets by using an ipm sequence to
    extract the condition code from the PSW.

  • DAGCombine tries to optimize the lowered ipm sequence by combining
    CCReg and computing the effective CCMask and CCValid in combineCCMask for
    select_ccmask and br_ccmask.

  • A cost computation is done in SelectionDAG for merging the conditionals
    of a branch instruction, because splitting them may spread evaluation of
    the branch conditions across basic blocks, which makes them difficult to
    combine.
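
For illustration, here is a minimal sketch of how user code might consume the new operand, modeled on the foo_* functions in the added Clang test. The helper name add42_cc and the portable fallback branch are hypothetical and not part of the patch. The fallback assumes the usual condition-code meanings for AHI (0 = zero, 1 = negative, 2 = positive, 3 = overflow) so the snippet can also be built on hosts that are not s390x.

```cpp
#include <cstdint>
#include <limits>

// Adds 42 to x in place and returns the resulting condition code (0..3).
// Hypothetical helper; only the asm statement is taken from the PR's test.
int add42_cc(int &x) {
#if defined(__s390x__) && defined(__GCC_ASM_FLAG_OUTPUTS__)
  int cc;
  // "=@cc" asks the compiler to copy the condition code into cc.
  asm volatile("ahi %[x],42\n" : [x] "+d"(x), "=@cc"(cc));
  return cc;
#else
  // Portable stand-in (assumption): emulate the condition code AHI would set.
  std::int64_t wide = static_cast<std::int64_t>(x) + 42;
  bool overflow = wide > std::numeric_limits<int>::max();
  x = static_cast<int>(static_cast<std::uint32_t>(wide));
  if (overflow)
    return 3; // overflow
  if (wide == 0)
    return 0; // result is zero
  return wide < 0 ? 1 : 2; // negative : positive
#endif
}
```

A caller can then test the result as the PR's tests do, for example `cc == 0 || cc == 1 || cc == 3`, and with this patch the backend is meant to fold such a test into a single conditional branch on the CC mask.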

… conditional branch for 14 possible combinations of CC mask.
@llvmbot added the clang, backend:SystemZ, clang:frontend, clang:codegen, and llvm:SelectionDAG labels on Feb 6, 2025
@llvmbot
Member

llvmbot commented Feb 6, 2025

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-backend-systemz

@llvm/pr-subscribers-clang

Author: None (anoopkg6)

Changes

Add support for the flag output operand "=@cc" for SystemZ and optimize conditional branches for the 14 possible combinations of the CC mask.
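
One possible reading of "14 possible combinations" (an assumption, since the PR does not spell it out): a SystemZ branch mask has one bit per condition-code value, giving 16 masks, of which 0 (never taken) and 15 (always taken) are trivial. The sketch below uses hypothetical helper names and assumes the encoding in which CC 0 maps to mask bit 8 and CC 3 to mask bit 1.

```cpp
#include <initializer_list>

// Mask bit selecting a single condition-code value (assumed encoding:
// CC 0 -> 8, CC 1 -> 4, CC 2 -> 2, CC 3 -> 1).
inline unsigned maskForCC(unsigned cc) { return 8u >> (cc & 3u); }

// Combined mask for a condition such as (cc == 0 || cc == 1 || cc == 3).
inline unsigned maskForCCSet(std::initializer_list<unsigned> ccs) {
  unsigned mask = 0;
  for (unsigned cc : ccs)
    mask |= maskForCC(cc);
  return mask;
}

// True when a branch using this mask would be taken for the given CC.
inline bool maskMatches(unsigned mask, unsigned cc) {
  return (mask & maskForCC(cc)) != 0;
}

// Number of masks that are neither "never" (0) nor "always" (15).
inline unsigned countNonTrivialMasks() {
  unsigned count = 0;
  for (unsigned mask = 0; mask < 16; ++mask)
    if (mask != 0 && mask != 15)
      ++count;
  return count;
}
```

Under these assumptions, the condition from foo_013 in the test file, `cc == 0 || cc == 1 || cc == 3`, corresponds to `maskForCCSet({0, 1, 3})`, a single mask that one branch-on-condition instruction can test.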


Patch is 616.60 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125970.diff

21 Files Affected:

  • (modified) clang/lib/Basic/Targets/SystemZ.cpp (+11)
  • (modified) clang/lib/Basic/Targets/SystemZ.h (+5)
  • (modified) clang/lib/CodeGen/CGStmt.cpp (+8-2)
  • (added) clang/test/CodeGen/inline-asm-systemz-flag-output.c (+149)
  • (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+3)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+61-9)
  • (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+4)
  • (modified) llvm/lib/Target/SystemZ/SystemZISelLowering.cpp (+598-2)
  • (modified) llvm/lib/Target/SystemZ/SystemZISelLowering.h (+14)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccand.ll (+500)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccand_eq_noteq.ll (+939)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccand_not.ll (+779)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccmixed.ll (+2427)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccmixed_eq_noteq.ll (+5248)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccmixed_not.ll (+2543)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccor.ll (+1047)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccor_eq_noteq.ll (+854)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccor_not.ll (+806)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccxor.ll (+784)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccxor_eq_noteq.ll (+1083)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccxor_not.ll (+778)
diff --git a/clang/lib/Basic/Targets/SystemZ.cpp b/clang/lib/Basic/Targets/SystemZ.cpp
index 06f08db2eadd475..49f88b45220d0c4 100644
--- a/clang/lib/Basic/Targets/SystemZ.cpp
+++ b/clang/lib/Basic/Targets/SystemZ.cpp
@@ -90,6 +90,14 @@ bool SystemZTargetInfo::validateAsmConstraint(
   case 'T': // Likewise, plus an index
     Info.setAllowsMemory();
     return true;
+  case '@':
+    // CC condition changes.
+    if (strlen(Name) >= 3 && *(Name + 1) == 'c' && *(Name + 2) == 'c') {
+      Name += 2;
+      Info.setAllowsRegister();
+      return true;
+    }
+    return false;
   }
 }
 
@@ -150,6 +158,9 @@ unsigned SystemZTargetInfo::getMinGlobalAlign(uint64_t Size,
 
 void SystemZTargetInfo::getTargetDefines(const LangOptions &Opts,
                                          MacroBuilder &Builder) const {
+  // Inline assembly supports SystemZ flag outputs.
+  Builder.defineMacro("__GCC_ASM_FLAG_OUTPUTS__");
+
   Builder.defineMacro("__s390__");
   Builder.defineMacro("__s390x__");
   Builder.defineMacro("__zarch__");
diff --git a/clang/lib/Basic/Targets/SystemZ.h b/clang/lib/Basic/Targets/SystemZ.h
index ef9a07033a6e4ff..a6909ababdec001 100644
--- a/clang/lib/Basic/Targets/SystemZ.h
+++ b/clang/lib/Basic/Targets/SystemZ.h
@@ -118,6 +118,11 @@ class LLVM_LIBRARY_VISIBILITY SystemZTargetInfo : public TargetInfo {
                              TargetInfo::ConstraintInfo &info) const override;
 
   std::string convertConstraint(const char *&Constraint) const override {
+    if (strncmp(Constraint, "@cc", 3) == 0) {
+      std::string Converted = "{" + std::string(Constraint, 3) + "}";
+      Constraint += 3;
+      return Converted;
+    }
     switch (Constraint[0]) {
     case 'p': // Keep 'p' constraint.
       return std::string("p");
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index 41dc91c578c800a..27f7bb652895839 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -2563,9 +2563,15 @@ EmitAsmStores(CodeGenFunction &CGF, const AsmStmt &S,
     if ((i < ResultRegIsFlagReg.size()) && ResultRegIsFlagReg[i]) {
       // Target must guarantee the Value `Tmp` here is lowered to a boolean
       // value.
-      llvm::Constant *Two = llvm::ConstantInt::get(Tmp->getType(), 2);
+      unsigned CCUpperBound = 2;
+      if (CGF.getTarget().getTriple().getArch() == llvm::Triple::systemz) {
+        // On this target CC value can be in range [0, 3].
+        CCUpperBound = 4;
+      }
+      llvm::Constant *CCUpperBoundConst =
+          llvm::ConstantInt::get(Tmp->getType(), CCUpperBound);
       llvm::Value *IsBooleanValue =
-          Builder.CreateCmp(llvm::CmpInst::ICMP_ULT, Tmp, Two);
+          Builder.CreateCmp(llvm::CmpInst::ICMP_ULT, Tmp, CCUpperBoundConst);
       llvm::Function *FnAssume = CGM.getIntrinsic(llvm::Intrinsic::assume);
       Builder.CreateCall(FnAssume, IsBooleanValue);
     }
diff --git a/clang/test/CodeGen/inline-asm-systemz-flag-output.c b/clang/test/CodeGen/inline-asm-systemz-flag-output.c
new file mode 100644
index 000000000000000..ab90e031df1f2b8
--- /dev/null
+++ b/clang/test/CodeGen/inline-asm-systemz-flag-output.c
@@ -0,0 +1,149 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple s390x-linux -emit-llvm -o - %s | FileCheck %s
+// CHECK-LABEL: define dso_local signext i32 @foo_012(
+// CHECK-SAME: i32 noundef signext [[X:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*]]:
+// CHECK-NEXT:    [[X_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[CC:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store i32 [[X]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = call { i32, i32 } asm sideeffect "ahi $0,42\0A", "=d,={@cc},0"(i32 [[TMP0]]) #[[ATTR2:[0-9]+]], !srcloc [[META2:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { i32, i32 } [[TMP1]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { i32, i32 } [[TMP1]], 1
+// CHECK-NEXT:    store i32 [[ASMRESULT]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp ult i32 [[ASMRESULT1]], 4
+// CHECK-NEXT:    call void @llvm.assume(i1 [[TMP2]])
+// CHECK-NEXT:    store i32 [[ASMRESULT1]], ptr [[CC]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[TMP3]], 0
+// CHECK-NEXT:    br i1 [[CMP]], label %[[LOR_END:.*]], label %[[LOR_LHS_FALSE:.*]]
+// CHECK:       [[LOR_LHS_FALSE]]:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i32 [[TMP4]], 1
+// CHECK-NEXT:    br i1 [[CMP2]], label %[[LOR_END]], label %[[LOR_RHS:.*]]
+// CHECK:       [[LOR_RHS]]:
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP3:%.*]] = icmp eq i32 [[TMP5]], 2
+// CHECK-NEXT:    br label %[[LOR_END]]
+// CHECK:       [[LOR_END]]:
+// CHECK-NEXT:    [[TMP6:%.*]] = phi i1 [ true, %[[LOR_LHS_FALSE]] ], [ true, %[[ENTRY]] ], [ [[CMP3]], %[[LOR_RHS]] ]
+// CHECK-NEXT:    [[TMP7:%.*]] = zext i1 [[TMP6]] to i64
+// CHECK-NEXT:    [[COND:%.*]] = select i1 [[TMP6]], i32 42, i32 0
+// CHECK-NEXT:    ret i32 [[COND]]
+//
+int foo_012(int x) {
+  int cc;
+  asm volatile ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
+  return cc == 0 || cc == 1 || cc == 2 ? 42 : 0;
+}
+
+// CHECK-LABEL: define dso_local signext i32 @foo_013(
+// CHECK-SAME: i32 noundef signext [[X:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*]]:
+// CHECK-NEXT:    [[X_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[CC:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store i32 [[X]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = call { i32, i32 } asm sideeffect "ahi $0,42\0A", "=d,={@cc},0"(i32 [[TMP0]]) #[[ATTR2]], !srcloc [[META3:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { i32, i32 } [[TMP1]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { i32, i32 } [[TMP1]], 1
+// CHECK-NEXT:    store i32 [[ASMRESULT]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp ult i32 [[ASMRESULT1]], 4
+// CHECK-NEXT:    call void @llvm.assume(i1 [[TMP2]])
+// CHECK-NEXT:    store i32 [[ASMRESULT1]], ptr [[CC]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[TMP3]], 0
+// CHECK-NEXT:    br i1 [[CMP]], label %[[LOR_END:.*]], label %[[LOR_LHS_FALSE:.*]]
+// CHECK:       [[LOR_LHS_FALSE]]:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i32 [[TMP4]], 1
+// CHECK-NEXT:    br i1 [[CMP2]], label %[[LOR_END]], label %[[LOR_RHS:.*]]
+// CHECK:       [[LOR_RHS]]:
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP3:%.*]] = icmp eq i32 [[TMP5]], 3
+// CHECK-NEXT:    br label %[[LOR_END]]
+// CHECK:       [[LOR_END]]:
+// CHECK-NEXT:    [[TMP6:%.*]] = phi i1 [ true, %[[LOR_LHS_FALSE]] ], [ true, %[[ENTRY]] ], [ [[CMP3]], %[[LOR_RHS]] ]
+// CHECK-NEXT:    [[TMP7:%.*]] = zext i1 [[TMP6]] to i64
+// CHECK-NEXT:    [[COND:%.*]] = select i1 [[TMP6]], i32 42, i32 0
+// CHECK-NEXT:    ret i32 [[COND]]
+//
+int foo_013(int x) {
+  int cc;
+  asm volatile ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
+  return cc == 0 || cc == 1 || cc == 3 ? 42 : 0;
+}
+
+// CHECK-LABEL: define dso_local signext i32 @foo_023(
+// CHECK-SAME: i32 noundef signext [[X:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*]]:
+// CHECK-NEXT:    [[X_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[CC:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store i32 [[X]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = call { i32, i32 } asm sideeffect "ahi $0,42\0A", "=d,={@cc},0"(i32 [[TMP0]]) #[[ATTR2]], !srcloc [[META4:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { i32, i32 } [[TMP1]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { i32, i32 } [[TMP1]], 1
+// CHECK-NEXT:    store i32 [[ASMRESULT]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp ult i32 [[ASMRESULT1]], 4
+// CHECK-NEXT:    call void @llvm.assume(i1 [[TMP2]])
+// CHECK-NEXT:    store i32 [[ASMRESULT1]], ptr [[CC]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[TMP3]], 0
+// CHECK-NEXT:    br i1 [[CMP]], label %[[LOR_END:.*]], label %[[LOR_LHS_FALSE:.*]]
+// CHECK:       [[LOR_LHS_FALSE]]:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i32 [[TMP4]], 2
+// CHECK-NEXT:    br i1 [[CMP2]], label %[[LOR_END]], label %[[LOR_RHS:.*]]
+// CHECK:       [[LOR_RHS]]:
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP3:%.*]] = icmp eq i32 [[TMP5]], 3
+// CHECK-NEXT:    br label %[[LOR_END]]
+// CHECK:       [[LOR_END]]:
+// CHECK-NEXT:    [[TMP6:%.*]] = phi i1 [ true, %[[LOR_LHS_FALSE]] ], [ true, %[[ENTRY]] ], [ [[CMP3]], %[[LOR_RHS]] ]
+// CHECK-NEXT:    [[TMP7:%.*]] = zext i1 [[TMP6]] to i64
+// CHECK-NEXT:    [[COND:%.*]] = select i1 [[TMP6]], i32 42, i32 0
+// CHECK-NEXT:    ret i32 [[COND]]
+//
+int foo_023(int x) {
+  int cc;
+  asm volatile ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
+  return cc == 0 || cc == 2 || cc == 3 ? 42 : 0;
+}
+
+// CHECK-LABEL: define dso_local signext i32 @foo_123(
+// CHECK-SAME: i32 noundef signext [[X:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*]]:
+// CHECK-NEXT:    [[X_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[CC:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store i32 [[X]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = call { i32, i32 } asm sideeffect "ahi $0,42\0A", "=d,={@cc},0"(i32 [[TMP0]]) #[[ATTR2]], !srcloc [[META5:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { i32, i32 } [[TMP1]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { i32, i32 } [[TMP1]], 1
+// CHECK-NEXT:    store i32 [[ASMRESULT]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp ult i32 [[ASMRESULT1]], 4
+// CHECK-NEXT:    call void @llvm.assume(i1 [[TMP2]])
+// CHECK-NEXT:    store i32 [[ASMRESULT1]], ptr [[CC]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[TMP3]], 1
+// CHECK-NEXT:    br i1 [[CMP]], label %[[LOR_END:.*]], label %[[LOR_LHS_FALSE:.*]]
+// CHECK:       [[LOR_LHS_FALSE]]:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i32 [[TMP4]], 2
+// CHECK-NEXT:    br i1 [[CMP2]], label %[[LOR_END]], label %[[LOR_RHS:.*]]
+// CHECK:       [[LOR_RHS]]:
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP3:%.*]] = icmp eq i32 [[TMP5]], 3
+// CHECK-NEXT:    br label %[[LOR_END]]
+// CHECK:       [[LOR_END]]:
+// CHECK-NEXT:    [[TMP6:%.*]] = phi i1 [ true, %[[LOR_LHS_FALSE]] ], [ true, %[[ENTRY]] ], [ [[CMP3]], %[[LOR_RHS]] ]
+// CHECK-NEXT:    [[TMP7:%.*]] = zext i1 [[TMP6]] to i64
+// CHECK-NEXT:    [[COND:%.*]] = select i1 [[TMP6]], i32 42, i32 0
+// CHECK-NEXT:    ret i32 [[COND]]
+//
+int foo_123(int x) {
+  int cc;
+  asm volatile ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
+  return cc == 1 || cc == 2 || cc == 3 ? 42 : 0;
+}
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index e0b638201a04740..cb136fe2f446b43 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -5071,6 +5071,9 @@ class TargetLowering : public TargetLoweringBase {
                                             std::vector<SDValue> &Ops,
                                             SelectionDAG &DAG) const;
 
+  // Lower switch statement for flag output operand with SRL/IPM Sequence.
+  virtual bool canLowerSRL_IPM_Switch(SDValue Cond) const;
+
   // Lower custom output constraints. If invalid, return SDValue().
   virtual SDValue LowerAsmOutputForConstraint(SDValue &Chain, SDValue &Glue,
                                               const SDLoc &DL,
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 3b046aa25f54440..a32787bc882f175 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -2831,8 +2831,37 @@ void SelectionDAGBuilder::visitBr(const BranchInst &I) {
       Opcode = Instruction::And;
     else if (match(BOp, m_LogicalOr(m_Value(BOp0), m_Value(BOp1))))
       Opcode = Instruction::Or;
-
-    if (Opcode &&
+    auto &TLI = DAG.getTargetLoweringInfo();
+    bool BrSrlIPM = FuncInfo.MF->getTarget().getTargetTriple().getArch() ==
+                    Triple::ArchType::systemz;
+    // For Flag output operands SRL/IPM sequence, we want to avoid
+    // creating switch case, as it creates Basic Block and inhibits
+    // optimization in DAGCombiner for flag output operands.
+    const auto checkSRLIPM = [&TLI](const SDValue &Op) {
+      if (!Op.getNumOperands())
+        return false;
+      SDValue OpVal = Op.getOperand(0);
+      SDNode *N = OpVal.getNode();
+      if (N && N->getOpcode() == ISD::SRL)
+        return TLI.canLowerSRL_IPM_Switch(OpVal);
+      else if (N && OpVal.getNumOperands() &&
+               (N->getOpcode() == ISD::AND || N->getOpcode() == ISD::OR)) {
+        SDValue OpVal1 = OpVal.getOperand(0);
+        SDNode *N1 = OpVal1.getNode();
+        if (N1 && N1->getOpcode() == ISD::SRL)
+          return TLI.canLowerSRL_IPM_Switch(OpVal1);
+      }
+      return false;
+    };
+    if (BrSrlIPM) {
+      if (NodeMap.count(BOp0) && NodeMap[BOp0].getNode()) {
+        BrSrlIPM &= checkSRLIPM(getValue(BOp0));
+        if (NodeMap.count(BOp1) && NodeMap[BOp1].getNode())
+          BrSrlIPM &= checkSRLIPM(getValue(BOp1));
+      } else
+        BrSrlIPM = false;
+    }
+    if (Opcode && !BrSrlIPM &&
         !(match(BOp0, m_ExtractElt(m_Value(Vec), m_Value())) &&
           match(BOp1, m_ExtractElt(m_Specific(Vec), m_Value()))) &&
         !shouldKeepJumpConditionsTogether(
@@ -12043,18 +12072,41 @@ void SelectionDAGBuilder::lowerWorkItem(SwitchWorkListItem W, Value *Cond,
       const APInt &SmallValue = Small.Low->getValue();
       const APInt &BigValue = Big.Low->getValue();
 
+      // The switch-case optimizing transformation inhibits DAGCombiner on
+      // SystemZ for flag output operands. DAGCombiner computes a cumulative
+      // CCMask for the flag output operand SRL/IPM sequence, so we want to
+      // avoid creating switch cases: they create basic blocks and inhibit
+      // that optimization. This affects cases like
+      // (CC == 0) || (CC == 2) || (CC == 3), or
+      // (CC == 0) || (CC == 1) ^ (CC == 3); there could potentially be
+      // more cases like this.
+      const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+      bool IsSrlIPM = false;
+      if (NodeMap.count(Cond) && NodeMap[Cond].getNode())
+        IsSrlIPM = CurMF->getTarget().getTargetTriple().getArch() ==
+                       Triple::ArchType::systemz &&
+                   TLI.canLowerSRL_IPM_Switch(getValue(Cond));
       // Check that there is only one bit different.
       APInt CommonBit = BigValue ^ SmallValue;
-      if (CommonBit.isPowerOf2()) {
+      if (CommonBit.isPowerOf2() || IsSrlIPM) {
         SDValue CondLHS = getValue(Cond);
         EVT VT = CondLHS.getValueType();
         SDLoc DL = getCurSDLoc();
-
-        SDValue Or = DAG.getNode(ISD::OR, DL, VT, CondLHS,
-                                 DAG.getConstant(CommonBit, DL, VT));
-        SDValue Cond = DAG.getSetCC(
-            DL, MVT::i1, Or, DAG.getConstant(BigValue | SmallValue, DL, VT),
-            ISD::SETEQ);
+        SDValue Cond;
+
+        if (CommonBit.isPowerOf2()) {
+          SDValue Or = DAG.getNode(ISD::OR, DL, VT, CondLHS,
+                                   DAG.getConstant(CommonBit, DL, VT));
+          Cond = DAG.getSetCC(DL, MVT::i1, Or,
+                              DAG.getConstant(BigValue | SmallValue, DL, VT),
+                              ISD::SETEQ);
+        } else if (IsSrlIPM && BigValue == 3 && SmallValue == 0) {
+          SDValue SetCC =
+              DAG.getSetCC(DL, MVT::i32, CondLHS,
+                           DAG.getConstant(SmallValue, DL, VT), ISD::SETEQ);
+          Cond = DAG.getSetCC(DL, MVT::i32, SetCC,
+                              DAG.getConstant(BigValue, DL, VT), ISD::SETEQ);
+        }
 
         // Update successor info.
         // Both Small and Big will jump to Small.BB, so we sum up the
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 8287565336b54d1..3d48adac509cb9e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -5563,6 +5563,10 @@ const char *TargetLowering::LowerXConstraint(EVT ConstraintVT) const {
   return nullptr;
 }
 
+bool TargetLowering::canLowerSRL_IPM_Switch(SDValue Cond) const {
+  return false;
+}
+
 SDValue TargetLowering::LowerAsmOutputForConstraint(
     SDValue &Chain, SDValue &Glue, const SDLoc &DL,
     const AsmOperandInfo &OpInfo, SelectionDAG &DAG) const {
diff --git a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
index 3999b54de81b657..259da48a3b22321 100644
--- a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
@@ -1207,6 +1207,9 @@ SystemZTargetLowering::getConstraintType(StringRef Constraint) const {
     default:
       break;
     }
+  } else if (Constraint.size() == 5 && Constraint.starts_with("{")) {
+    if (StringRef("{@cc}").compare(Constraint) == 0)
+      return C_Other;
   }
   return TargetLowering::getConstraintType(Constraint);
 }
@@ -1389,6 +1392,10 @@ SystemZTargetLowering::getRegForInlineAsmConstraint(
       return parseRegisterNumber(Constraint, &SystemZ::VR128BitRegClass,
                                  SystemZMC::VR128Regs, 32);
     }
+    if (Constraint[1] == '@') {
+      if (StringRef("{@cc}").compare(Constraint) == 0)
+        return std::make_pair(0u, &SystemZ::GR32BitRegClass);
+    }
   }
   return TargetLowering::getRegForInlineAsmConstraint(TRI, Constraint, VT);
 }
@@ -1421,6 +1428,35 @@ Register SystemZTargetLowering::getExceptionSelectorRegister(
   return Subtarget.isTargetXPLINK64() ? SystemZ::R2D : SystemZ::R7D;
 }
 
+// Lower @cc targets via setcc.
+SDValue SystemZTargetLowering::LowerAsmOutputForConstraint(
+    SDValue &Chain, SDValue &Glue, const SDLoc &DL,
+    const AsmOperandInfo &OpInfo, SelectionDAG &DAG) const {
+  if (StringRef("{@cc}").compare(OpInfo.ConstraintCode) != 0)
+    return SDValue();
+
+  // Check that return type is valid.
+  if (OpInfo.ConstraintVT.isVector() || !OpInfo.ConstraintVT.isInteger() ||
+      OpInfo.ConstraintVT.getSizeInBits() < 8)
+    report_fatal_error("Glue output operand is of invalid type");
+
+  MachineFunction &MF = DAG.getMachineFunction();
+  MachineRegisterInfo &MRI = MF.getRegInfo();
+  MRI.addLiveIn(SystemZ::CC);
+
+  if (Glue.getNode()) {
+    Glue = DAG.getCopyFromReg(Chain, DL, SystemZ::CC, MVT::i32, Glue);
+    Chain = Glue.getValue(1);
+  } else
+    Glue = DAG.getCopyFromReg(Chain, DL, SystemZ::CC, MVT::i32);
+
+  SDValue IPM = DAG.getNode(SystemZISD::IPM, DL, MVT::i32, Glue);
+  SDValue CC = DAG.getNode(ISD::SRL, DL, MVT::i32, IPM,
+                           DAG.getConstant(SystemZ::IPM_CC, DL, MVT::i32));
+
+  return CC;
+}
+
 void SystemZTargetLowering::LowerAsmOperandForConstraint(
     SDValue Op, StringRef Constraint, std::vector<SDValue> &Ops,
     SelectionDAG &DAG) const {
@@ -2485,6 +2521,21 @@ static unsigned CCMaskForCondCode(ISD::CondCode CC) {
 #undef CONV
 }
 
+static unsigned CCMaskForSystemZCCVal(unsigned CC) {
+  switch (CC) {
+  default:
+    llvm_unreachable("invalid integer condition!");
+  case 0:
+    return SystemZ::CCMASK_CMP_EQ;
+  case 1:
+    return SystemZ::CCMASK_CMP_LT;
+  case 2:
+    return SystemZ::CCMASK_CMP_GT;
+  case 3:
+    return SystemZ::CCMASK_CMP_UO;
+  }
+}
+
 // If C can be converted to a comparison against zero, ...
[truncated]
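
As a rough illustration of the IPM/SRL pair built in LowerAsmOutputForConstraint in the diff above, the sketch below models the extraction on a plain 32-bit word. It assumes that IPM places the condition code in bits 29:28 of the low word, the program mask in bits 27:24, and zeros in bits 31:30, and that SystemZ::IPM_CC is 28. The function names are hypothetical and do not appear in the patch.

```cpp
#include <cstdint>

// Assumed value of SystemZ::IPM_CC: the shift that moves the CC to bit 0.
constexpr unsigned kIpmCcShift = 28;

// Builds a word laid out the way IPM is assumed to leave it. The low 24 bits
// are left unchanged by IPM, so they are modeled as arbitrary input.
constexpr std::uint32_t makeIpmWord(unsigned cc, unsigned programMask,
                                    std::uint32_t low24) {
  return ((cc & 3u) << kIpmCcShift) | ((programMask & 0xFu) << 24) |
         (low24 & 0x00FFFFFFu);
}

// Mirrors the ISD::SRL node: a logical right shift leaves only the CC,
// because the two bits above it are zero.
constexpr unsigned ccFromIpm(std::uint32_t ipm) { return ipm >> kIpmCcShift; }
```

Because the two bits above the condition code are zero, the shifted value always lies in [0, 4), which is the range the Clang side asserts with `icmp ult i32 ..., 4` followed by `llvm.assume`.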

@llvmbot
Member

llvmbot commented Feb 6, 2025

@llvm/pr-subscribers-clang-codegen

Author: None (anoopkg6)

+// CHECK-NEXT:    [[TMP2:%.*]] = icmp ult i32 [[ASMRESULT1]], 4
+// CHECK-NEXT:    call void @llvm.assume(i1 [[TMP2]])
+// CHECK-NEXT:    store i32 [[ASMRESULT1]], ptr [[CC]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[TMP3]], 0
+// CHECK-NEXT:    br i1 [[CMP]], label %[[LOR_END:.*]], label %[[LOR_LHS_FALSE:.*]]
+// CHECK:       [[LOR_LHS_FALSE]]:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i32 [[TMP4]], 2
+// CHECK-NEXT:    br i1 [[CMP2]], label %[[LOR_END]], label %[[LOR_RHS:.*]]
+// CHECK:       [[LOR_RHS]]:
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP3:%.*]] = icmp eq i32 [[TMP5]], 3
+// CHECK-NEXT:    br label %[[LOR_END]]
+// CHECK:       [[LOR_END]]:
+// CHECK-NEXT:    [[TMP6:%.*]] = phi i1 [ true, %[[LOR_LHS_FALSE]] ], [ true, %[[ENTRY]] ], [ [[CMP3]], %[[LOR_RHS]] ]
+// CHECK-NEXT:    [[TMP7:%.*]] = zext i1 [[TMP6]] to i64
+// CHECK-NEXT:    [[COND:%.*]] = select i1 [[TMP6]], i32 42, i32 0
+// CHECK-NEXT:    ret i32 [[COND]]
+//
+int foo_023(int x) {
+  int cc;
+  asm volatile ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
+  return cc == 0 || cc == 2 || cc == 3 ? 42 : 0;
+}
+
+// CHECK-LABEL: define dso_local signext i32 @foo_123(
+// CHECK-SAME: i32 noundef signext [[X:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*]]:
+// CHECK-NEXT:    [[X_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[CC:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store i32 [[X]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = call { i32, i32 } asm sideeffect "ahi $0,42\0A", "=d,={@cc},0"(i32 [[TMP0]]) #[[ATTR2]], !srcloc [[META5:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { i32, i32 } [[TMP1]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { i32, i32 } [[TMP1]], 1
+// CHECK-NEXT:    store i32 [[ASMRESULT]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp ult i32 [[ASMRESULT1]], 4
+// CHECK-NEXT:    call void @llvm.assume(i1 [[TMP2]])
+// CHECK-NEXT:    store i32 [[ASMRESULT1]], ptr [[CC]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[TMP3]], 1
+// CHECK-NEXT:    br i1 [[CMP]], label %[[LOR_END:.*]], label %[[LOR_LHS_FALSE:.*]]
+// CHECK:       [[LOR_LHS_FALSE]]:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i32 [[TMP4]], 2
+// CHECK-NEXT:    br i1 [[CMP2]], label %[[LOR_END]], label %[[LOR_RHS:.*]]
+// CHECK:       [[LOR_RHS]]:
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP3:%.*]] = icmp eq i32 [[TMP5]], 3
+// CHECK-NEXT:    br label %[[LOR_END]]
+// CHECK:       [[LOR_END]]:
+// CHECK-NEXT:    [[TMP6:%.*]] = phi i1 [ true, %[[LOR_LHS_FALSE]] ], [ true, %[[ENTRY]] ], [ [[CMP3]], %[[LOR_RHS]] ]
+// CHECK-NEXT:    [[TMP7:%.*]] = zext i1 [[TMP6]] to i64
+// CHECK-NEXT:    [[COND:%.*]] = select i1 [[TMP6]], i32 42, i32 0
+// CHECK-NEXT:    ret i32 [[COND]]
+//
+int foo_123(int x) {
+  int cc;
+  asm volatile ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
+  return cc == 1 || cc == 2 || cc == 3 ? 42 : 0;
+}
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index e0b638201a04740..cb136fe2f446b43 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -5071,6 +5071,9 @@ class TargetLowering : public TargetLoweringBase {
                                             std::vector<SDValue> &Ops,
                                             SelectionDAG &DAG) const;
 
+  // Lower switch statement for flag output operand with SRL/IPM Sequence.
+  virtual bool canLowerSRL_IPM_Switch(SDValue Cond) const;
+
   // Lower custom output constraints. If invalid, return SDValue().
   virtual SDValue LowerAsmOutputForConstraint(SDValue &Chain, SDValue &Glue,
                                               const SDLoc &DL,
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 3b046aa25f54440..a32787bc882f175 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -2831,8 +2831,37 @@ void SelectionDAGBuilder::visitBr(const BranchInst &I) {
       Opcode = Instruction::And;
     else if (match(BOp, m_LogicalOr(m_Value(BOp0), m_Value(BOp1))))
       Opcode = Instruction::Or;
-
-    if (Opcode &&
+    auto &TLI = DAG.getTargetLoweringInfo();
+    bool BrSrlIPM = FuncInfo.MF->getTarget().getTargetTriple().getArch() ==
+                    Triple::ArchType::systemz;
+    // For the flag output operand SRL/IPM sequence, avoid creating a
+    // switch case: it creates a basic block and inhibits the DAGCombiner
+    // optimization for flag output operands.
+    const auto checkSRLIPM = [&TLI](const SDValue &Op) {
+      if (!Op.getNumOperands())
+        return false;
+      SDValue OpVal = Op.getOperand(0);
+      SDNode *N = OpVal.getNode();
+      if (N && N->getOpcode() == ISD::SRL)
+        return TLI.canLowerSRL_IPM_Switch(OpVal);
+      else if (N && OpVal.getNumOperands() &&
+               (N->getOpcode() == ISD::AND || N->getOpcode() == ISD::OR)) {
+        SDValue OpVal1 = OpVal.getOperand(0);
+        SDNode *N1 = OpVal1.getNode();
+        if (N1 && N1->getOpcode() == ISD::SRL)
+          return TLI.canLowerSRL_IPM_Switch(OpVal1);
+      }
+      return false;
+    };
+    if (BrSrlIPM) {
+      if (NodeMap.count(BOp0) && NodeMap[BOp0].getNode()) {
+        BrSrlIPM &= checkSRLIPM(getValue(BOp0));
+        if (NodeMap.count(BOp1) && NodeMap[BOp1].getNode())
+          BrSrlIPM &= checkSRLIPM(getValue(BOp1));
+      } else
+        BrSrlIPM = false;
+    }
+    if (Opcode && !BrSrlIPM &&
         !(match(BOp0, m_ExtractElt(m_Value(Vec), m_Value())) &&
           match(BOp1, m_ExtractElt(m_Specific(Vec), m_Value()))) &&
         !shouldKeepJumpConditionsTogether(
@@ -12043,18 +12072,41 @@ void SelectionDAGBuilder::lowerWorkItem(SwitchWorkListItem W, Value *Cond,
       const APInt &SmallValue = Small.Low->getValue();
       const APInt &BigValue = Big.Low->getValue();
 
+      // Creating switch cases inhibits the DAGCombiner optimization for
+      // SystemZ flag output operands. DAGCombiner computes a cumulative
+      // CCMask for the flag output operand SRL/IPM sequence, so we want to
+      // avoid creating a switch case, as it creates a basic block and
+      // inhibits optimization in DAGCombiner for flag output operands.
+      // This covers cases like (CC == 0) || (CC == 2) || (CC == 3), or
+      // (CC == 0) || (CC == 1) ^ (CC == 3); there could potentially be
+      // more cases like this.
+      const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+      bool IsSrlIPM = false;
+      if (NodeMap.count(Cond) && NodeMap[Cond].getNode())
+        IsSrlIPM = CurMF->getTarget().getTargetTriple().getArch() ==
+                       Triple::ArchType::systemz &&
+                   TLI.canLowerSRL_IPM_Switch(getValue(Cond));
       // Check that there is only one bit different.
       APInt CommonBit = BigValue ^ SmallValue;
-      if (CommonBit.isPowerOf2()) {
+      if (CommonBit.isPowerOf2() || IsSrlIPM) {
         SDValue CondLHS = getValue(Cond);
         EVT VT = CondLHS.getValueType();
         SDLoc DL = getCurSDLoc();
-
-        SDValue Or = DAG.getNode(ISD::OR, DL, VT, CondLHS,
-                                 DAG.getConstant(CommonBit, DL, VT));
-        SDValue Cond = DAG.getSetCC(
-            DL, MVT::i1, Or, DAG.getConstant(BigValue | SmallValue, DL, VT),
-            ISD::SETEQ);
+        SDValue Cond;
+
+        if (CommonBit.isPowerOf2()) {
+          SDValue Or = DAG.getNode(ISD::OR, DL, VT, CondLHS,
+                                   DAG.getConstant(CommonBit, DL, VT));
+          Cond = DAG.getSetCC(DL, MVT::i1, Or,
+                              DAG.getConstant(BigValue | SmallValue, DL, VT),
+                              ISD::SETEQ);
+        } else if (IsSrlIPM && BigValue == 3 && SmallValue == 0) {
+          SDValue SetCC =
+              DAG.getSetCC(DL, MVT::i32, CondLHS,
+                           DAG.getConstant(SmallValue, DL, VT), ISD::SETEQ);
+          Cond = DAG.getSetCC(DL, MVT::i32, SetCC,
+                              DAG.getConstant(BigValue, DL, VT), ISD::SETEQ);
+        }
 
         // Update successor info.
         // Both Small and Big will jump to Small.BB, so we sum up the
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 8287565336b54d1..3d48adac509cb9e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -5563,6 +5563,10 @@ const char *TargetLowering::LowerXConstraint(EVT ConstraintVT) const {
   return nullptr;
 }
 
+bool TargetLowering::canLowerSRL_IPM_Switch(SDValue Cond) const {
+  return false;
+}
+
 SDValue TargetLowering::LowerAsmOutputForConstraint(
     SDValue &Chain, SDValue &Glue, const SDLoc &DL,
     const AsmOperandInfo &OpInfo, SelectionDAG &DAG) const {
diff --git a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
index 3999b54de81b657..259da48a3b22321 100644
--- a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
@@ -1207,6 +1207,9 @@ SystemZTargetLowering::getConstraintType(StringRef Constraint) const {
     default:
       break;
     }
+  } else if (Constraint.size() == 5 && Constraint.starts_with("{")) {
+    if (StringRef("{@cc}").compare(Constraint) == 0)
+      return C_Other;
   }
   return TargetLowering::getConstraintType(Constraint);
 }
@@ -1389,6 +1392,10 @@ SystemZTargetLowering::getRegForInlineAsmConstraint(
       return parseRegisterNumber(Constraint, &SystemZ::VR128BitRegClass,
                                  SystemZMC::VR128Regs, 32);
     }
+    if (Constraint[1] == '@') {
+      if (StringRef("{@cc}").compare(Constraint) == 0)
+        return std::make_pair(0u, &SystemZ::GR32BitRegClass);
+    }
   }
   return TargetLowering::getRegForInlineAsmConstraint(TRI, Constraint, VT);
 }
@@ -1421,6 +1428,35 @@ Register SystemZTargetLowering::getExceptionSelectorRegister(
   return Subtarget.isTargetXPLINK64() ? SystemZ::R2D : SystemZ::R7D;
 }
 
+// Lower @cc targets via an IPM/SRL sequence that extracts the condition code.
+SDValue SystemZTargetLowering::LowerAsmOutputForConstraint(
+    SDValue &Chain, SDValue &Glue, const SDLoc &DL,
+    const AsmOperandInfo &OpInfo, SelectionDAG &DAG) const {
+  if (StringRef("{@cc}").compare(OpInfo.ConstraintCode) != 0)
+    return SDValue();
+
+  // Check that return type is valid.
+  if (OpInfo.ConstraintVT.isVector() || !OpInfo.ConstraintVT.isInteger() ||
+      OpInfo.ConstraintVT.getSizeInBits() < 8)
+    report_fatal_error("Glue output operand is of invalid type");
+
+  MachineFunction &MF = DAG.getMachineFunction();
+  MachineRegisterInfo &MRI = MF.getRegInfo();
+  MRI.addLiveIn(SystemZ::CC);
+
+  if (Glue.getNode()) {
+    Glue = DAG.getCopyFromReg(Chain, DL, SystemZ::CC, MVT::i32, Glue);
+    Chain = Glue.getValue(1);
+  } else
+    Glue = DAG.getCopyFromReg(Chain, DL, SystemZ::CC, MVT::i32);
+
+  SDValue IPM = DAG.getNode(SystemZISD::IPM, DL, MVT::i32, Glue);
+  SDValue CC = DAG.getNode(ISD::SRL, DL, MVT::i32, IPM,
+                           DAG.getConstant(SystemZ::IPM_CC, DL, MVT::i32));
+
+  return CC;
+}
+
 void SystemZTargetLowering::LowerAsmOperandForConstraint(
     SDValue Op, StringRef Constraint, std::vector<SDValue> &Ops,
     SelectionDAG &DAG) const {
@@ -2485,6 +2521,21 @@ static unsigned CCMaskForCondCode(ISD::CondCode CC) {
 #undef CONV
 }
 
+static unsigned CCMaskForSystemZCCVal(unsigned CC) {
+  switch (CC) {
+  default:
+    llvm_unreachable("invalid integer condition!");
+  case 0:
+    return SystemZ::CCMASK_CMP_EQ;
+  case 1:
+    return SystemZ::CCMASK_CMP_LT;
+  case 2:
+    return SystemZ::CCMASK_CMP_GT;
+  case 3:
+    return SystemZ::CCMASK_CMP_UO;
+  }
+}
+
 // If C can be converted to a comparison against zero, ...
[truncated]
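
The `CCMaskForSystemZCCVal` hunk above maps each two-bit condition-code value to one mask bit. As a rough, standalone sketch of that idea (not the patch's code), the snippet below restates the mapping and shows how an OR chain such as `cc == 0 || cc == 2 || cc == 3` collapses into a single mask. The numeric mask values are an assumption about the SystemZ backend's `CCMASK_*` constants (CC 0 in the most significant of four bits), and the helper names `cc_mask_for_value`, `cc_mask_for_or3`, and `cc_matches` are hypothetical.

```c
#include <assert.h>

/* Assumed layout: one mask bit per condition-code value, with CC 0 in the
   most significant of the four bits. These values restate what the SystemZ
   backend is believed to use; treat them as an assumption. */
enum {
  CCMASK_0 = 1 << 3,
  CCMASK_1 = 1 << 2,
  CCMASK_2 = 1 << 1,
  CCMASK_3 = 1 << 0
};

/* Hypothetical helper: mask bit for a single CC value in [0, 4). */
static unsigned cc_mask_for_value(unsigned cc) {
  assert(cc < 4 && "condition code must be in [0, 4)");
  return 1u << (3 - cc);
}

/* Hypothetical helper: mask for (cc == a) || (cc == b) || (cc == c). */
static unsigned cc_mask_for_or3(unsigned a, unsigned b, unsigned c) {
  return cc_mask_for_value(a) | cc_mask_for_value(b) | cc_mask_for_value(c);
}

/* Returns nonzero when a runtime CC value is selected by the mask. */
static int cc_matches(unsigned mask, unsigned cc) {
  return (mask & cc_mask_for_value(cc)) != 0;
}
```

Under these assumptions, the `foo_023` predicate corresponds to the mask `CCMASK_0 | CCMASK_2 | CCMASK_3`, which a single branch-on-condition could test instead of three separate compares.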

@llvmbot
Member

llvmbot commented Feb 6, 2025

@llvm/pr-subscribers-llvm-selectiondag

Author: None (anoopkg6)

Changes

Add support for the flag output operand "=@cc" for SystemZ and optimize conditional branches for the 14 possible combinations of CC mask.
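
For orientation, here is a minimal sketch of how user code might consume the new operand, modeled on the `foo_023` test case in this patch. It uses the `=@cc` form only when both `__s390x__` and `__GCC_ASM_FLAG_OUTPUTS__` are defined; elsewhere it falls back to a portable emulation. The emulation assumes that `ahi` sets CC 0 for a zero result, 1 for a negative result, 2 for a positive result, and 3 on overflow. The names `add42_cc` and `foo_023_like` are hypothetical and not part of the patch.

```c
#include <limits.h>

/* Adds 42 to *x and returns the resulting condition code (0..3). */
static int add42_cc(int *x) {
#if defined(__s390x__) && defined(__GCC_ASM_FLAG_OUTPUTS__)
  int cc;
  /* Same inline assembly shape as the test file in this patch. */
  __asm__ volatile("ahi %[x],42\n" : [x] "+d"(*x), "=@cc"(cc));
  return cc;
#else
  /* Portable emulation; the CC meanings here are an assumption. */
  int overflow = *x > INT_MAX - 42;
  *x = (int)((unsigned)*x + 42u); /* wraps instead of overflowing */
  if (overflow)
    return 3;
  if (*x == 0)
    return 0;
  return *x < 0 ? 1 : 2;
#endif
}

/* Same predicate as foo_023 in the test file: true for CC 0, 2, or 3. */
static int foo_023_like(int x) {
  int cc = add42_cc(&x);
  return cc == 0 || cc == 2 || cc == 3 ? 42 : 0;
}
```

Guarding on `__GCC_ASM_FLAG_OUTPUTS__` keeps the source buildable with compilers that do not accept `=@cc` on this target.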


Patch is 616.60 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125970.diff

21 Files Affected:

  • (modified) clang/lib/Basic/Targets/SystemZ.cpp (+11)
  • (modified) clang/lib/Basic/Targets/SystemZ.h (+5)
  • (modified) clang/lib/CodeGen/CGStmt.cpp (+8-2)
  • (added) clang/test/CodeGen/inline-asm-systemz-flag-output.c (+149)
  • (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+3)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+61-9)
  • (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+4)
  • (modified) llvm/lib/Target/SystemZ/SystemZISelLowering.cpp (+598-2)
  • (modified) llvm/lib/Target/SystemZ/SystemZISelLowering.h (+14)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccand.ll (+500)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccand_eq_noteq.ll (+939)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccand_not.ll (+779)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccmixed.ll (+2427)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccmixed_eq_noteq.ll (+5248)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccmixed_not.ll (+2543)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccor.ll (+1047)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccor_eq_noteq.ll (+854)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccor_not.ll (+806)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccxor.ll (+784)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccxor_eq_noteq.ll (+1083)
  • (added) llvm/test/CodeGen/SystemZ/flag_output_operand_ccxor_not.ll (+778)
diff --git a/clang/lib/Basic/Targets/SystemZ.cpp b/clang/lib/Basic/Targets/SystemZ.cpp
index 06f08db2eadd475..49f88b45220d0c4 100644
--- a/clang/lib/Basic/Targets/SystemZ.cpp
+++ b/clang/lib/Basic/Targets/SystemZ.cpp
@@ -90,6 +90,14 @@ bool SystemZTargetInfo::validateAsmConstraint(
   case 'T': // Likewise, plus an index
     Info.setAllowsMemory();
     return true;
+  case '@':
+    // CC condition changes.
+    if (strlen(Name) >= 3 && *(Name + 1) == 'c' && *(Name + 2) == 'c') {
+      Name += 2;
+      Info.setAllowsRegister();
+      return true;
+    }
+    return false;
   }
 }
 
@@ -150,6 +158,9 @@ unsigned SystemZTargetInfo::getMinGlobalAlign(uint64_t Size,
 
 void SystemZTargetInfo::getTargetDefines(const LangOptions &Opts,
                                          MacroBuilder &Builder) const {
+  // Inline assembly supports SystemZ flag outputs.
+  Builder.defineMacro("__GCC_ASM_FLAG_OUTPUTS__");
+
   Builder.defineMacro("__s390__");
   Builder.defineMacro("__s390x__");
   Builder.defineMacro("__zarch__");
diff --git a/clang/lib/Basic/Targets/SystemZ.h b/clang/lib/Basic/Targets/SystemZ.h
index ef9a07033a6e4ff..a6909ababdec001 100644
--- a/clang/lib/Basic/Targets/SystemZ.h
+++ b/clang/lib/Basic/Targets/SystemZ.h
@@ -118,6 +118,11 @@ class LLVM_LIBRARY_VISIBILITY SystemZTargetInfo : public TargetInfo {
                              TargetInfo::ConstraintInfo &info) const override;
 
   std::string convertConstraint(const char *&Constraint) const override {
+    if (strncmp(Constraint, "@cc", 3) == 0) {
+      std::string Converted = "{" + std::string(Constraint, 3) + "}";
+      Constraint += 3;
+      return Converted;
+    }
     switch (Constraint[0]) {
     case 'p': // Keep 'p' constraint.
       return std::string("p");
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index 41dc91c578c800a..27f7bb652895839 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -2563,9 +2563,15 @@ EmitAsmStores(CodeGenFunction &CGF, const AsmStmt &S,
     if ((i < ResultRegIsFlagReg.size()) && ResultRegIsFlagReg[i]) {
       // Target must guarantee the Value `Tmp` here is lowered to a boolean
       // value.
-      llvm::Constant *Two = llvm::ConstantInt::get(Tmp->getType(), 2);
+      unsigned CCUpperBound = 2;
+      if (CGF.getTarget().getTriple().getArch() == llvm::Triple::systemz) {
+        // On SystemZ the CC value is in the range [0, 3].
+        CCUpperBound = 4;
+      }
+      llvm::Constant *CCUpperBoundConst =
+          llvm::ConstantInt::get(Tmp->getType(), CCUpperBound);
       llvm::Value *IsBooleanValue =
-          Builder.CreateCmp(llvm::CmpInst::ICMP_ULT, Tmp, Two);
+          Builder.CreateCmp(llvm::CmpInst::ICMP_ULT, Tmp, CCUpperBoundConst);
       llvm::Function *FnAssume = CGM.getIntrinsic(llvm::Intrinsic::assume);
       Builder.CreateCall(FnAssume, IsBooleanValue);
     }
diff --git a/clang/test/CodeGen/inline-asm-systemz-flag-output.c b/clang/test/CodeGen/inline-asm-systemz-flag-output.c
new file mode 100644
index 000000000000000..ab90e031df1f2b8
--- /dev/null
+++ b/clang/test/CodeGen/inline-asm-systemz-flag-output.c
@@ -0,0 +1,149 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple s390x-linux -emit-llvm -o - %s | FileCheck %s
+// CHECK-LABEL: define dso_local signext i32 @foo_012(
+// CHECK-SAME: i32 noundef signext [[X:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*]]:
+// CHECK-NEXT:    [[X_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[CC:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store i32 [[X]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = call { i32, i32 } asm sideeffect "ahi $0,42\0A", "=d,={@cc},0"(i32 [[TMP0]]) #[[ATTR2:[0-9]+]], !srcloc [[META2:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { i32, i32 } [[TMP1]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { i32, i32 } [[TMP1]], 1
+// CHECK-NEXT:    store i32 [[ASMRESULT]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp ult i32 [[ASMRESULT1]], 4
+// CHECK-NEXT:    call void @llvm.assume(i1 [[TMP2]])
+// CHECK-NEXT:    store i32 [[ASMRESULT1]], ptr [[CC]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[TMP3]], 0
+// CHECK-NEXT:    br i1 [[CMP]], label %[[LOR_END:.*]], label %[[LOR_LHS_FALSE:.*]]
+// CHECK:       [[LOR_LHS_FALSE]]:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i32 [[TMP4]], 1
+// CHECK-NEXT:    br i1 [[CMP2]], label %[[LOR_END]], label %[[LOR_RHS:.*]]
+// CHECK:       [[LOR_RHS]]:
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP3:%.*]] = icmp eq i32 [[TMP5]], 2
+// CHECK-NEXT:    br label %[[LOR_END]]
+// CHECK:       [[LOR_END]]:
+// CHECK-NEXT:    [[TMP6:%.*]] = phi i1 [ true, %[[LOR_LHS_FALSE]] ], [ true, %[[ENTRY]] ], [ [[CMP3]], %[[LOR_RHS]] ]
+// CHECK-NEXT:    [[TMP7:%.*]] = zext i1 [[TMP6]] to i64
+// CHECK-NEXT:    [[COND:%.*]] = select i1 [[TMP6]], i32 42, i32 0
+// CHECK-NEXT:    ret i32 [[COND]]
+//
+int foo_012(int x) {
+  int cc;
+  asm volatile ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
+  return cc == 0 || cc == 1 || cc == 2 ? 42 : 0;
+}
+
+// CHECK-LABEL: define dso_local signext i32 @foo_013(
+// CHECK-SAME: i32 noundef signext [[X:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*]]:
+// CHECK-NEXT:    [[X_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[CC:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store i32 [[X]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = call { i32, i32 } asm sideeffect "ahi $0,42\0A", "=d,={@cc},0"(i32 [[TMP0]]) #[[ATTR2]], !srcloc [[META3:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { i32, i32 } [[TMP1]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { i32, i32 } [[TMP1]], 1
+// CHECK-NEXT:    store i32 [[ASMRESULT]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp ult i32 [[ASMRESULT1]], 4
+// CHECK-NEXT:    call void @llvm.assume(i1 [[TMP2]])
+// CHECK-NEXT:    store i32 [[ASMRESULT1]], ptr [[CC]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[TMP3]], 0
+// CHECK-NEXT:    br i1 [[CMP]], label %[[LOR_END:.*]], label %[[LOR_LHS_FALSE:.*]]
+// CHECK:       [[LOR_LHS_FALSE]]:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i32 [[TMP4]], 1
+// CHECK-NEXT:    br i1 [[CMP2]], label %[[LOR_END]], label %[[LOR_RHS:.*]]
+// CHECK:       [[LOR_RHS]]:
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP3:%.*]] = icmp eq i32 [[TMP5]], 3
+// CHECK-NEXT:    br label %[[LOR_END]]
+// CHECK:       [[LOR_END]]:
+// CHECK-NEXT:    [[TMP6:%.*]] = phi i1 [ true, %[[LOR_LHS_FALSE]] ], [ true, %[[ENTRY]] ], [ [[CMP3]], %[[LOR_RHS]] ]
+// CHECK-NEXT:    [[TMP7:%.*]] = zext i1 [[TMP6]] to i64
+// CHECK-NEXT:    [[COND:%.*]] = select i1 [[TMP6]], i32 42, i32 0
+// CHECK-NEXT:    ret i32 [[COND]]
+//
+int foo_013(int x) {
+  int cc;
+  asm volatile ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
+  return cc == 0 || cc == 1 || cc == 3 ? 42 : 0;
+}
+
+// CHECK-LABEL: define dso_local signext i32 @foo_023(
+// CHECK-SAME: i32 noundef signext [[X:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*]]:
+// CHECK-NEXT:    [[X_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[CC:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store i32 [[X]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = call { i32, i32 } asm sideeffect "ahi $0,42\0A", "=d,={@cc},0"(i32 [[TMP0]]) #[[ATTR2]], !srcloc [[META4:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { i32, i32 } [[TMP1]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { i32, i32 } [[TMP1]], 1
+// CHECK-NEXT:    store i32 [[ASMRESULT]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp ult i32 [[ASMRESULT1]], 4
+// CHECK-NEXT:    call void @llvm.assume(i1 [[TMP2]])
+// CHECK-NEXT:    store i32 [[ASMRESULT1]], ptr [[CC]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[TMP3]], 0
+// CHECK-NEXT:    br i1 [[CMP]], label %[[LOR_END:.*]], label %[[LOR_LHS_FALSE:.*]]
+// CHECK:       [[LOR_LHS_FALSE]]:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i32 [[TMP4]], 2
+// CHECK-NEXT:    br i1 [[CMP2]], label %[[LOR_END]], label %[[LOR_RHS:.*]]
+// CHECK:       [[LOR_RHS]]:
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP3:%.*]] = icmp eq i32 [[TMP5]], 3
+// CHECK-NEXT:    br label %[[LOR_END]]
+// CHECK:       [[LOR_END]]:
+// CHECK-NEXT:    [[TMP6:%.*]] = phi i1 [ true, %[[LOR_LHS_FALSE]] ], [ true, %[[ENTRY]] ], [ [[CMP3]], %[[LOR_RHS]] ]
+// CHECK-NEXT:    [[TMP7:%.*]] = zext i1 [[TMP6]] to i64
+// CHECK-NEXT:    [[COND:%.*]] = select i1 [[TMP6]], i32 42, i32 0
+// CHECK-NEXT:    ret i32 [[COND]]
+//
+int foo_023(int x) {
+  int cc;
+  asm volatile ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
+  return cc == 0 || cc == 2 || cc == 3 ? 42 : 0;
+}
+
+// CHECK-LABEL: define dso_local signext i32 @foo_123(
+// CHECK-SAME: i32 noundef signext [[X:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*]]:
+// CHECK-NEXT:    [[X_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[CC:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store i32 [[X]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = call { i32, i32 } asm sideeffect "ahi $0,42\0A", "=d,={@cc},0"(i32 [[TMP0]]) #[[ATTR2]], !srcloc [[META5:![0-9]+]]
+// CHECK-NEXT:    [[ASMRESULT:%.*]] = extractvalue { i32, i32 } [[TMP1]], 0
+// CHECK-NEXT:    [[ASMRESULT1:%.*]] = extractvalue { i32, i32 } [[TMP1]], 1
+// CHECK-NEXT:    store i32 [[ASMRESULT]], ptr [[X_ADDR]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp ult i32 [[ASMRESULT1]], 4
+// CHECK-NEXT:    call void @llvm.assume(i1 [[TMP2]])
+// CHECK-NEXT:    store i32 [[ASMRESULT1]], ptr [[CC]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[TMP3]], 1
+// CHECK-NEXT:    br i1 [[CMP]], label %[[LOR_END:.*]], label %[[LOR_LHS_FALSE:.*]]
+// CHECK:       [[LOR_LHS_FALSE]]:
+// CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i32 [[TMP4]], 2
+// CHECK-NEXT:    br i1 [[CMP2]], label %[[LOR_END]], label %[[LOR_RHS:.*]]
+// CHECK:       [[LOR_RHS]]:
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[CC]], align 4
+// CHECK-NEXT:    [[CMP3:%.*]] = icmp eq i32 [[TMP5]], 3
+// CHECK-NEXT:    br label %[[LOR_END]]
+// CHECK:       [[LOR_END]]:
+// CHECK-NEXT:    [[TMP6:%.*]] = phi i1 [ true, %[[LOR_LHS_FALSE]] ], [ true, %[[ENTRY]] ], [ [[CMP3]], %[[LOR_RHS]] ]
+// CHECK-NEXT:    [[TMP7:%.*]] = zext i1 [[TMP6]] to i64
+// CHECK-NEXT:    [[COND:%.*]] = select i1 [[TMP6]], i32 42, i32 0
+// CHECK-NEXT:    ret i32 [[COND]]
+//
+int foo_123(int x) {
+  int cc;
+  asm volatile ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
+  return cc == 1 || cc == 2 || cc == 3 ? 42 : 0;
+}
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index e0b638201a04740..cb136fe2f446b43 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -5071,6 +5071,9 @@ class TargetLowering : public TargetLoweringBase {
                                             std::vector<SDValue> &Ops,
                                             SelectionDAG &DAG) const;
 
+  // Lower switch statement for flag output operand with SRL/IPM Sequence.
+  virtual bool canLowerSRL_IPM_Switch(SDValue Cond) const;
+
   // Lower custom output constraints. If invalid, return SDValue().
   virtual SDValue LowerAsmOutputForConstraint(SDValue &Chain, SDValue &Glue,
                                               const SDLoc &DL,
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 3b046aa25f54440..a32787bc882f175 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -2831,8 +2831,37 @@ void SelectionDAGBuilder::visitBr(const BranchInst &I) {
       Opcode = Instruction::And;
     else if (match(BOp, m_LogicalOr(m_Value(BOp0), m_Value(BOp1))))
       Opcode = Instruction::Or;
-
-    if (Opcode &&
+    auto &TLI = DAG.getTargetLoweringInfo();
+    bool BrSrlIPM = FuncInfo.MF->getTarget().getTargetTriple().getArch() ==
+                    Triple::ArchType::systemz;
+    // For the flag output operand SRL/IPM sequence, avoid creating a
+    // switch case: it creates a basic block and inhibits the DAGCombiner
+    // optimization for flag output operands.
+    const auto checkSRLIPM = [&TLI](const SDValue &Op) {
+      if (!Op.getNumOperands())
+        return false;
+      SDValue OpVal = Op.getOperand(0);
+      SDNode *N = OpVal.getNode();
+      if (N && N->getOpcode() == ISD::SRL)
+        return TLI.canLowerSRL_IPM_Switch(OpVal);
+      else if (N && OpVal.getNumOperands() &&
+               (N->getOpcode() == ISD::AND || N->getOpcode() == ISD::OR)) {
+        SDValue OpVal1 = OpVal.getOperand(0);
+        SDNode *N1 = OpVal1.getNode();
+        if (N1 && N1->getOpcode() == ISD::SRL)
+          return TLI.canLowerSRL_IPM_Switch(OpVal1);
+      }
+      return false;
+    };
+    if (BrSrlIPM) {
+      if (NodeMap.count(BOp0) && NodeMap[BOp0].getNode()) {
+        BrSrlIPM &= checkSRLIPM(getValue(BOp0));
+        if (NodeMap.count(BOp1) && NodeMap[BOp1].getNode())
+          BrSrlIPM &= checkSRLIPM(getValue(BOp1));
+      } else
+        BrSrlIPM = false;
+    }
+    if (Opcode && !BrSrlIPM &&
         !(match(BOp0, m_ExtractElt(m_Value(Vec), m_Value())) &&
           match(BOp1, m_ExtractElt(m_Specific(Vec), m_Value()))) &&
         !shouldKeepJumpConditionsTogether(
@@ -12043,18 +12072,41 @@ void SelectionDAGBuilder::lowerWorkItem(SwitchWorkListItem W, Value *Cond,
       const APInt &SmallValue = Small.Low->getValue();
       const APInt &BigValue = Big.Low->getValue();
 
+      // Creating switch cases inhibits the DAGCombiner optimization for
+      // SystemZ flag output operands. DAGCombiner computes a cumulative
+      // CCMask for the flag output operand SRL/IPM sequence, so we want to
+      // avoid creating a switch case, as it creates a basic block and
+      // inhibits optimization in DAGCombiner for flag output operands.
+      // This covers cases like (CC == 0) || (CC == 2) || (CC == 3), or
+      // (CC == 0) || (CC == 1) ^ (CC == 3); there could potentially be
+      // more cases like this.
+      const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+      bool IsSrlIPM = false;
+      if (NodeMap.count(Cond) && NodeMap[Cond].getNode())
+        IsSrlIPM = CurMF->getTarget().getTargetTriple().getArch() ==
+                       Triple::ArchType::systemz &&
+                   TLI.canLowerSRL_IPM_Switch(getValue(Cond));
       // Check that there is only one bit different.
       APInt CommonBit = BigValue ^ SmallValue;
-      if (CommonBit.isPowerOf2()) {
+      if (CommonBit.isPowerOf2() || IsSrlIPM) {
         SDValue CondLHS = getValue(Cond);
         EVT VT = CondLHS.getValueType();
         SDLoc DL = getCurSDLoc();
-
-        SDValue Or = DAG.getNode(ISD::OR, DL, VT, CondLHS,
-                                 DAG.getConstant(CommonBit, DL, VT));
-        SDValue Cond = DAG.getSetCC(
-            DL, MVT::i1, Or, DAG.getConstant(BigValue | SmallValue, DL, VT),
-            ISD::SETEQ);
+        SDValue Cond;
+
+        if (CommonBit.isPowerOf2()) {
+          SDValue Or = DAG.getNode(ISD::OR, DL, VT, CondLHS,
+                                   DAG.getConstant(CommonBit, DL, VT));
+          Cond = DAG.getSetCC(DL, MVT::i1, Or,
+                              DAG.getConstant(BigValue | SmallValue, DL, VT),
+                              ISD::SETEQ);
+        } else if (IsSrlIPM && BigValue == 3 && SmallValue == 0) {
+          SDValue SetCC =
+              DAG.getSetCC(DL, MVT::i32, CondLHS,
+                           DAG.getConstant(SmallValue, DL, VT), ISD::SETEQ);
+          Cond = DAG.getSetCC(DL, MVT::i32, SetCC,
+                              DAG.getConstant(BigValue, DL, VT), ISD::SETEQ);
+        }
 
         // Update successor info.
         // Both Small and Big will jump to Small.BB, so we sum up the
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 8287565336b54d1..3d48adac509cb9e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -5563,6 +5563,10 @@ const char *TargetLowering::LowerXConstraint(EVT ConstraintVT) const {
   return nullptr;
 }
 
+bool TargetLowering::canLowerSRL_IPM_Switch(SDValue Cond) const {
+  return false;
+}
+
 SDValue TargetLowering::LowerAsmOutputForConstraint(
     SDValue &Chain, SDValue &Glue, const SDLoc &DL,
     const AsmOperandInfo &OpInfo, SelectionDAG &DAG) const {
diff --git a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
index 3999b54de81b657..259da48a3b22321 100644
--- a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
@@ -1207,6 +1207,9 @@ SystemZTargetLowering::getConstraintType(StringRef Constraint) const {
     default:
       break;
     }
+  } else if (Constraint.size() == 5 && Constraint.starts_with("{")) {
+    if (StringRef("{@cc}").compare(Constraint) == 0)
+      return C_Other;
   }
   return TargetLowering::getConstraintType(Constraint);
 }
@@ -1389,6 +1392,10 @@ SystemZTargetLowering::getRegForInlineAsmConstraint(
       return parseRegisterNumber(Constraint, &SystemZ::VR128BitRegClass,
                                  SystemZMC::VR128Regs, 32);
     }
+    if (Constraint[1] == '@') {
+      if (StringRef("{@cc}").compare(Constraint) == 0)
+        return std::make_pair(0u, &SystemZ::GR32BitRegClass);
+    }
   }
   return TargetLowering::getRegForInlineAsmConstraint(TRI, Constraint, VT);
 }
@@ -1421,6 +1428,35 @@ Register SystemZTargetLowering::getExceptionSelectorRegister(
   return Subtarget.isTargetXPLINK64() ? SystemZ::R2D : SystemZ::R7D;
 }
 
+// Lower @cc targets via setcc.
+SDValue SystemZTargetLowering::LowerAsmOutputForConstraint(
+    SDValue &Chain, SDValue &Glue, const SDLoc &DL,
+    const AsmOperandInfo &OpInfo, SelectionDAG &DAG) const {
+  if (StringRef("{@cc}").compare(OpInfo.ConstraintCode) != 0)
+    return SDValue();
+
+  // Check that return type is valid.
+  if (OpInfo.ConstraintVT.isVector() || !OpInfo.ConstraintVT.isInteger() ||
+      OpInfo.ConstraintVT.getSizeInBits() < 8)
+    report_fatal_error("Glue output operand is of invalid type");
+
+  MachineFunction &MF = DAG.getMachineFunction();
+  MachineRegisterInfo &MRI = MF.getRegInfo();
+  MRI.addLiveIn(SystemZ::CC);
+
+  if (Glue.getNode()) {
+    Glue = DAG.getCopyFromReg(Chain, DL, SystemZ::CC, MVT::i32, Glue);
+    Chain = Glue.getValue(1);
+  } else
+    Glue = DAG.getCopyFromReg(Chain, DL, SystemZ::CC, MVT::i32);
+
+  SDValue IPM = DAG.getNode(SystemZISD::IPM, DL, MVT::i32, Glue);
+  SDValue CC = DAG.getNode(ISD::SRL, DL, MVT::i32, IPM,
+                           DAG.getConstant(SystemZ::IPM_CC, DL, MVT::i32));
+
+  return CC;
+}
+
 void SystemZTargetLowering::LowerAsmOperandForConstraint(
     SDValue Op, StringRef Constraint, std::vector<SDValue> &Ops,
     SelectionDAG &DAG) const {
@@ -2485,6 +2521,21 @@ static unsigned CCMaskForCondCode(ISD::CondCode CC) {
 #undef CONV
 }
 
+static unsigned CCMaskForSystemZCCVal(unsigned CC) {
+  switch (CC) {
+  default:
+    llvm_unreachable("invalid integer condition!");
+  case 0:
+    return SystemZ::CCMASK_CMP_EQ;
+  case 1:
+    return SystemZ::CCMASK_CMP_LT;
+  case 2:
+    return SystemZ::CCMASK_CMP_GT;
+  case 3:
+    return SystemZ::CCMASK_CMP_UO;
+  }
+}
+
 // If C can be converted to a comparison against zero, ...
[truncated]
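
As a reading aid for the lowering shown in the diff above, here is a minimal standalone sketch (plain C++, not LLVM code) of the two pieces of arithmetic involved: extracting the condition code from an IPM result with a right shift, and mapping a CC value to its 4-bit branch mask. The helper names `ccFromIpm` and `ccMaskForCcValue` are hypothetical. The shift amount 28 and the mask values 8/4/2/1 are assumed to correspond to `SystemZ::IPM_CC` and `CCMASK_CMP_EQ`/`LT`/`GT`/`UO` respectively.

```cpp
#include <cassert>
#include <cstdint>

// Assumed to match SystemZ::IPM_CC: the bit position of the condition code
// within the 32-bit value produced by IPM.
constexpr unsigned kIpmCcShift = 28;

// Models the lowered SRL(IPM, 28). This assumes the two bits above the CC
// field are zero, so a plain shift yields a value in [0, 4).
inline unsigned ccFromIpm(std::uint32_t ipm) { return ipm >> kIpmCcShift; }

// Models the switch in CCMaskForSystemZCCVal: one mask bit per CC value,
// most significant bit first (CC 0 -> 8, CC 1 -> 4, CC 2 -> 2, CC 3 -> 1).
inline unsigned ccMaskForCcValue(unsigned cc) {
  assert(cc < 4 && "condition code must be in [0, 4)");
  return 8u >> cc;
}
```

The low 28 bits of the IPM value (program mask and unrelated register contents) do not affect the extracted CC, which is why the lowering needs only a single shift.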

@arsenm arsenm left a comment

Removing llvm.assume intrinsic will cause performance hit.

I think it's more likely including the assume is the hit

… bound

  for all backends supporting flag output operands (X86, AArch64 and SystemZ).
- Remove all target-specific changes from SelectionDAGBuilder.cpp.
- Added getJumpConditionMergingParams for SystemZ for setting cost for
  merging srl/ipm/cc.
- TODO: Handle the cases where simplifyBranchOnICmpChain creates switch table
  while folding branch on And'd or Or'd chain of icmp instructions.

@uweigand uweigand left a comment

Not a full review, just some initial comments on combineCCMask. I think it would be good to have more comments explaining the specific transformations you're attempting to implement, with an argument why they are correct for all inputs.
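
One way to make such a correctness argument concrete: because the condition code takes only four values, a claimed equivalence between a chain of CC comparisons and a single branch mask can be checked exhaustively. The sketch below is a standalone model, not the patch's `combineCCMask`. All helper names are hypothetical, and the bit layout (CC 0 -> 8 down to CC 3 -> 1) is assumed to follow the SystemZ CCMASK convention.

```cpp
#include <cassert>
#include <initializer_list>

// CC value -> its bit in a 4-bit branch mask (CC 0 -> 8, ..., CC 3 -> 1).
inline unsigned maskBit(unsigned cc) {
  assert(cc < 4 && "condition code must be in [0, 4)");
  return 8u >> cc;
}

// Builds the mask for "CC is any one of the listed values".
inline unsigned maskForAnyOf(std::initializer_list<unsigned> ccs) {
  unsigned mask = 0;
  for (unsigned cc : ccs)
    mask |= maskBit(cc);
  return mask;
}

// Branch-on-mask semantics: the branch is taken when the bit selected by the
// current CC is set in the mask.
inline bool maskAccepts(unsigned mask, unsigned cc) {
  return (mask & maskBit(cc)) != 0;
}

// Checks, for every possible CC, that (CC == 0) || (CC == 2) || (CC == 3)
// agrees with a single test against the combined mask.
inline bool orChainMatchesMask() {
  const unsigned mask = maskForAnyOf({0, 2, 3});
  for (unsigned cc = 0; cc < 4; ++cc) {
    const bool chain = (cc == 0) || (cc == 2) || (cc == 3);
    if (chain != maskAccepts(mask, cc))
      return false;
  }
  return true;
}
```

The same loop-over-all-inputs pattern extends to AND and XOR chains by replacing the `chain` expression and the way the mask is built.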

@anoopkg6
anoopkg6 commented Apr 28, 2025 via email

anoopkg6 and others added 2 commits August 28, 2025 17:59
…and()

2. Change setOutputOperand() implementation by directly setting ImmRange fields
   Min/Max/isConstrainedlds.
anoopkg6 and others added 2 commits August 29, 2025 08:21
2. Ignore one git-clang-format error in SystemZISelLowering.cpp in this commit.

@uweigand uweigand left a comment

The code changes now look good to me, but the tests still need some work. In a nutshell, most of the tests don't actually seem to verify what you wanted to verify. These are IL tests, that only verify the translation from IL to assembler. You appear to be more interested in the translation from source to IL - but that isn't actually tested.

Apparently you generated those tests from source code. But that means the test doesn't actually verify this - if common optimizers change in the future, your test will not notice this. Also, this leads to most of the tests actually not being interesting at the IL level - the IL was in some cases fully optimized away (due to tautological optimizations), or else many different tests are actually the same (or extremely similar) at the IL level.

You should rather design a set of IL tests starting with IL, i.e. make sure that interesting combinations of IL are actually optimized by the back-end code you're adding.

If you do want to add tests that certain source code is transformed to particular IL, these should be clang-level test cases. This may not be very useful if it ends up verifying just generic optimizations, but there may be some interesting cases remaining (e.g. because optimizers are able to exploit the < 4 assertion).

@uweigand uweigand left a comment

This now looks good, I cannot see any functional issues any more. Two last remaining cosmetic issues pointed out inline. Otherwise, we only need a more complete commit message and then this looks good to go.

@uweigand uweigand left a comment

Now everything LGTM, thanks!

@uweigand uweigand merged commit 6712e20 into llvm:main Oct 14, 2025
10 checks passed
akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025
Added support for the flag output operand "=@cc" inline assembly constraint
for SystemZ.

- Clang now accepts "=@cc" assembly operands and sets a 2-bit condition
  code for the output operand on SystemZ.

- Clang currently emits an assertion that flag output operands are boolean
  values, i.e. in the range [0, 2). Generalize this mechanism to allow
  targets to specify arbitrary range assertions for any inline assembly
  output operand. This will be used to assert that SystemZ two-bit
  condition-code values are in the range [0, 4).

- The SystemZ backend lowers "@cc" targets by using an ipm sequence to
  extract the condition code from the PSW.

- DAGCombine tries to optimize the lowered ipm sequence by combining
  CCReg and computing the effective CCMask and CCValid in combineCCMask
  for select_ccmask and br_ccmask.

- Cost computation is done for merging conditionals for branch
  instructions in SelectionDAG, as a split may cause the evaluation of
  branch conditions to cross basic blocks, making them difficult to combine.
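
For orientation, a source-level use of the constraint described above might look like the following sketch. It is modelled on the `tm $1,$2` inline asm (memory operand `Q`, immediate 2) visible in the machine-code dump later in this thread. The function names are hypothetical, and guarding on `__GCC_ASM_FLAG_OUTPUTS__` is an assumption about how a compiler advertises support. On other targets the code falls back to a portable stand-in so the example still compiles and runs.

```cpp
#include <cassert>

// Returns the condition code that TEST UNDER MASK would set for mask 0x02
// applied to the byte at *p.
inline int testUnderMask2(const unsigned char *p) {
#if defined(__s390x__) && defined(__GCC_ASM_FLAG_OUTPUTS__)
  int cc;
  // "=@cc" asks the compiler to deliver the condition code as the output.
  asm("tm %1,%2" : "=@cc"(cc) : "Q"(*p), "i"(2));
  return cc;
#else
  // Portable stand-in: with a single-bit mask, TM yields CC 0 when the
  // selected bit is clear and CC 3 when it is set.
  return (*p & 2) ? 3 : 0;
#endif
}

// Typical consumer: a comparison against one CC value. The [0, 4) range
// assertion tells the optimizer that no other values can occur.
inline bool bit1IsSet(const unsigned char *p) {
  return testUnderMask2(p) == 3;
}
```

Note that an asm statement with a flag output operand should not also list "cc" as a clobber, since the condition code is the result being read.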

---------

Co-authored-by: anoopkg6 <[email protected]>
Co-authored-by: Ulrich Weigand <[email protected]>
@nathanchance

I am seeing a crash while building the Linux kernel for ARCH=s390 after this change.

# bad: [96da982128bf7b005afa24a8e6e41e5867d30bc4] [sanitizers] COMPILER_RT_ASAN_UNIT_TESTS_USE_HOST_RUNTIME to build only unit tests (#161455)
# good: [e8f721e621d85a2670f13307b1b99528cf5e8708] [clang][docs] Update doc and release note for probe instrumentation (#162606)
git bisect start '96da982128bf7b005afa24a8e6e41e5867d30bc4' 'e8f721e621d85a2670f13307b1b99528cf5e8708'
# bad: [da5fb5e964c213d0ec834ad0b560a523a57ce5cc] [ObjCopy][DX] Support for -dump-section flag (#159999)
git bisect bad da5fb5e964c213d0ec834ad0b560a523a57ce5cc
# bad: [69e0fd6d8dea666205fca52265f09b3eb5ee2f3d] [X86] Remove PREFETCHI from PTL (#163196)
git bisect bad 69e0fd6d8dea666205fca52265f09b3eb5ee2f3d
# good: [782dd178fcb3b146dd16792b54c867095b863ccc] [SPIRV] Do not emit @llvm.compiler.used (#162678)
git bisect good 782dd178fcb3b146dd16792b54c867095b863ccc
# good: [4a8dd4998dae8b7d67e416d20a1fa8a9451c64f5] [BOLT][NFC] Fix for a dangling reference UB (#163344)
git bisect good 4a8dd4998dae8b7d67e416d20a1fa8a9451c64f5
# bad: [d7fc7703402184792319f65570ad6a49ffe8cde7] [LLVM][DAGCombiner] Improve simplifyDivRem's effectiveness after type legalisation. (#162706)
git bisect bad d7fc7703402184792319f65570ad6a49ffe8cde7
# good: [3793e75b7af7e4908316e7869d8fc61517401865] [libc++][C++03] Cherry-pick #129348 (#162821)
git bisect good 3793e75b7af7e4908316e7869d8fc61517401865
# bad: [6712e20c5261376a6b0015fb3c8d15124757d47d] Add support for flag output operand "=@cc" for SystemZ. (#125970)
git bisect bad 6712e20c5261376a6b0015fb3c8d15124757d47d
# first bad commit: [6712e20c5261376a6b0015fb3c8d15124757d47d] Add support for flag output operand "=@cc" for SystemZ. (#125970)
$ make -skj"$(nproc)" ARCH=s390 LLVM=1 clean allmodconfig drivers/gpu/drm/nouveau/nouveau_fence.o

# Machine code for function nouveau_fence_context_kill: NoPHIs, TracksLiveness, TiedOpsRewritten
Function Live Ins: $r2d in %14, $r3d in %15, $cc, $cc

bb.0.entry:
  successors: %bb.1(0x50000000), %bb.2(0x30000000); %bb.1(62.50%), %bb.2(37.50%)
  liveins: $r2d, $r3d, $cc, $cc
  %15:gr64bit = COPY $r3d
  %14:addr64bit = COPY $r2d
  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  $r2d = COPY %14:addr64bit
  CallBRASL @_raw_spin_lock_irqsave, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc, implicit-def $r2d
  %17:gr64bit = COPY $r2d
  %1:addr64bit = LA %14:addr64bit, 80, $noreg
  %18:gr64bit = SRLG %1:addr64bit, $noreg, 3
  %19:gr64bit = LLIHH 28
  %20:addr64bit = AGRK %18:gr64bit, %19:gr64bit, implicit-def dead $cc
  CLI %20:addr64bit, 0, 0, implicit-def $cc :: (load (s8) from %ir.3)
  BRC 14, 8, %bb.2, implicit $cc
  J %bb.1

bb.1 (%ir-block.6):
; predecessors: %bb.0
  successors: %bb.2(0x80000000); %bb.2(100.00%)

  $r2d = COPY %1:addr64bit
  nomerge CallBRASL @__asan_report_load8_noabort, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc

bb.2 (%ir-block.8):
; predecessors: %bb.0, %bb.1
  successors: %bb.3(0x30000000), %bb.4(0x50000000); %bb.3(37.50%), %bb.4(62.50%)

  %2:gr64bit = LG %1:addr64bit, 0, $noreg :: (load (s64) from %ir.pending)
  CGR %2:gr64bit, %1:addr64bit, implicit-def $cc
  BRC 14, 6, %bb.4, implicit $cc
  J %bb.3

bb.3.entry.for.end_crit_edge:
; predecessors: %bb.2
  successors: %bb.36(0x80000000); %bb.36(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  J %bb.36

bb.4.for.body.lr.ph:
; predecessors: %bb.2
  successors: %bb.5(0x80000000); %bb.5(100.00%)

  %16:gr32bit = COPY %15.subreg_l32:gr64bit
  %21:gr64bit = LLGFR %16:gr32bit
  %22:gr64bit = LGHI 0
  $r2d = COPY %22:gr64bit
  $r3d = COPY %21:gr64bit
  CallBRASL @__sanitizer_cov_trace_const_cmp4, $r2d, $r3d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  %23:gr64bit = LLILF 4294963201
  $r2d = COPY %23:gr64bit
  $r3d = COPY %21:gr64bit
  CallBRASL @__sanitizer_cov_trace_const_cmp4, $r2d, $r3d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  %3:gr64bit = LA %14:addr64bit, 272, $noreg
  %25:gr64bit = LLIHH 28
  %36:gr64bit = LGHI 4
  %37:gr64bit = LGHI 3
  %41:gr64bit = LGHI 40
  %76:addr64bit = COPY %2:gr64bit

bb.5.for.body:
; predecessors: %bb.4, %bb.34
  successors: %bb.6(0x50000000), %bb.7(0x30000000); %bb.6(62.50%), %bb.7(37.50%)

  %4:addr64bit = COPY %76:addr64bit
  %24:gr64bit = SRLG %4:addr64bit, $noreg, 3
  %26:addr64bit = AGRK %24:gr64bit, %25:gr64bit, implicit-def dead $cc
  CLI %26:addr64bit, 0, 0, implicit-def $cc :: (load (s8) from %ir.13)
  BRC 14, 8, %bb.7, implicit $cc
  J %bb.6

bb.6 (%ir-block.16):
; predecessors: %bb.5
  successors: %bb.7(0x80000000); %bb.7(100.00%)

  $r2d = COPY %4:addr64bit
  nomerge CallBRASL @__asan_report_load8_noabort, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc

bb.7 (%ir-block.18):
; predecessors: %bb.5, %bb.6
  successors: %bb.8(0x30000000), %bb.9(0x50000000); %bb.8(37.50%), %bb.9(62.50%)

  %5:gr64bit = LAY %4:addr64bit, -64, $noreg
  %6:gr64bit = LG %4:addr64bit, 0, $noreg :: (load (s64) from %ir..pn.in55)
  CHIMux %16:gr32bit, 0, implicit-def $cc
  BRC 14, 6, %bb.9, implicit $cc
  J %bb.8

bb.8.for.body.if.end_crit_edge:
; predecessors: %bb.7
  successors: %bb.30(0x80000000); %bb.30(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  J %bb.30

bb.9.cond.false.i:
; predecessors: %bb.7
  successors: %bb.10(0x40000000), %bb.11(0x40000000); %bb.10(50.00%), %bb.11(50.00%)

  %28:gr64bit = AGHIK %4:addr64bit, -9, implicit-def $cc
  %29:addr64bit = COPY %28:gr64bit
  INLINEASM &"\09tm\09$1,$2" [sideeffect] [mayload] [maystore] [attdialect], $0:[regdef:GR32Bit], def dead %27:gr32bit, $1:[mem:Q], %29:addr64bit, 0, $noreg, $2:[imm], 2, !4
  %74:gr32bit = IPM implicit $cc
  %30:gr32bit = COPY %74:gr32bit
  %31:gr32bit = IPM implicit $cc
  undef %32.subreg_l32:gr64bit = COPY %31:gr32bit
  %34:gr64bit = RISBG undef %34:gr64bit(tied-def 0), %32:gr64bit, 60, 191, 36, implicit-def dead $cc
  $r2d = COPY %36:gr64bit
  $r3d = COPY %34:gr64bit
  CallBRASL @__sanitizer_cov_trace_const_cmp4, $r2d, $r3d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  $r2d = COPY %37:gr64bit
  $r3d = COPY %34:gr64bit
  CallBRASL @__sanitizer_cov_trace_const_cmp4, $r2d, $r3d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  $cc = COPY %30:gr32bit
  BRC 15, 14, %bb.11, implicit $cc
  J %bb.10

bb.10.cond.false.i.if.end_crit_edge:
; predecessors: %bb.9
  successors: %bb.30(0x80000000); %bb.30(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  J %bb.30

bb.11.if.end.i:
; predecessors: %bb.9
  successors: %bb.12(0x50000000), %bb.13(0x30000000); %bb.12(62.50%), %bb.13(37.50%)

  %7:addr64bit = AGHIK %4:addr64bit, -56, implicit-def dead $cc
  %38:gr64bit = SRLG %7:addr64bit, $noreg, 3
  %40:addr64bit = AGRK %38:gr64bit, %25:gr64bit, implicit-def dead $cc
  CLI %40:addr64bit, 0, 0, implicit-def $cc :: (load (s8) from %ir.24)
  BRC 14, 8, %bb.13, implicit $cc
  J %bb.12

bb.12 (%ir-block.27):
; predecessors: %bb.11
  successors: %bb.13(0x80000000); %bb.13(100.00%)

  $r2d = COPY %7:addr64bit
  nomerge CallBRASL @__asan_report_load8_noabort, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc

bb.13 (%ir-block.29):
; predecessors: %bb.11, %bb.12
  successors: %bb.14(0x50000000), %bb.15(0x30000000); %bb.14(62.50%), %bb.15(37.50%)

  %8:addr64bit = COPY %41:gr64bit
  %8:addr64bit = AG %8:addr64bit(tied-def 0), %7:addr64bit, 0, $noreg, implicit-def dead $cc :: (load (s64) from %ir.ops.i)
  %42:gr64bit = SRLG %8:addr64bit, $noreg, 3
  %44:addr64bit = AGRK %42:gr64bit, %25:gr64bit, implicit-def dead $cc
  CLI %44:addr64bit, 0, 0, implicit-def $cc :: (load (s8) from %ir.34)
  BRC 14, 8, %bb.15, implicit $cc
  J %bb.14

bb.14 (%ir-block.37):
; predecessors: %bb.13
  successors: %bb.15(0x80000000); %bb.15(100.00%)

  $r2d = COPY %8:addr64bit
  nomerge CallBRASL @__asan_report_load8_noabort, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc

bb.15 (%ir-block.39):
; predecessors: %bb.13, %bb.14
  successors: %bb.16(0x30000000), %bb.17(0x50000000); %bb.16(37.50%), %bb.17(62.50%)

  %9:addr64bit = LG %8:addr64bit, 0, $noreg :: (load (s64) from %ir.signaled.i)
  CGHI %9:addr64bit, 0, implicit-def $cc
  BRC 14, 6, %bb.17, implicit $cc
  J %bb.16

bb.16.if.end.i.cond.false.i45_crit_edge:
; predecessors: %bb.15
  successors: %bb.20(0x80000000); %bb.20(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  J %bb.20

bb.17.land.lhs.true11.i:
; predecessors: %bb.15
  successors: %bb.19(0x40000000), %bb.18(0x40000000); %bb.19(50.00%), %bb.18(50.00%)

  $r2d = COPY %5:gr64bit
  CallBASR %9:addr64bit, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc, implicit-def $r2d
  %45:gr64bit = COPY $r2d
  %46:grx32bit = COPY %45.subreg_l32:gr64bit
  CHIMux %46:grx32bit, 0, implicit-def $cc
  BRC 14, 6, %bb.19, implicit $cc
  J %bb.18

bb.18.land.lhs.true11.i.cond.false.i45_crit_edge:
; predecessors: %bb.17
  successors: %bb.20(0x80000000); %bb.20(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  J %bb.20

bb.19.if.then16.i:
; predecessors: %bb.17
  successors: %bb.30(0x80000000); %bb.30(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  $r2d = COPY %5:gr64bit
  CallBRASL @dma_fence_signal_locked, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc, implicit-def dead $r2d
  J %bb.30

bb.20.cond.false.i45:
; predecessors: %bb.18, %bb.16
  successors: %bb.22(0x00106035), %bb.21(0x7fef9fcb); %bb.22(0.05%), %bb.21(99.95%)

  %50:addr64bit = COPY %28:gr64bit
  INLINEASM &"\09tm\09$1,$2" [sideeffect] [mayload] [maystore] [attdialect], $0:[regdef:GR32Bit], def dead %48:gr32bit, $1:[mem:Q], %50:addr64bit, 0, $noreg, $2:[imm], 2, !4
  %75:gr32bit = IPM implicit $cc
  %51:gr32bit = COPY %75:gr32bit
  %52:gr32bit = IPM implicit $cc
  undef %53.subreg_l32:gr64bit = COPY %52:gr32bit
  %55:gr64bit = RISBG undef %55:gr64bit(tied-def 0), %53:gr64bit, 60, 191, 36, implicit-def dead $cc
  $r2d = COPY %36:gr64bit
  $r3d = COPY %55:gr64bit
  CallBRASL @__sanitizer_cov_trace_const_cmp4, $r2d, $r3d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  $r2d = COPY %37:gr64bit
  $r3d = COPY %55:gr64bit
  CallBRASL @__sanitizer_cov_trace_const_cmp4, $r2d, $r3d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  $cc = COPY %51:gr32bit
  BRC 15, 1, %bb.22, implicit killed $cc
  J %bb.21

bb.21.cond.false.i45.if.else59.i_crit_edge:
; predecessors: %bb.20
  successors: %bb.23(0x80000000); %bb.23(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  J %bb.23

bb.22.do.body26.i:
; predecessors: %bb.20
  successors: %bb.23(0x80000000); %bb.23(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  INLINEASM &"0:\09mc\090,0\0A.section .rodata.str,\22aMS\22,@progbits,1\0A1:\09.asciz\09\22include/linux/dma-fence.h\22\0A.previous\0A.section __bug_table,\22aw\22\0A2:\09.long\090b-.\0A\09.long\091b-.\0A\09.short\09$0,$1\0A\09.org\092b+$2\0A.previous\0A" [sideeffect] [mayload] [attdialect], $0:[imm], 585, $1:[imm], 2305, $2:[imm], 12, !6

bb.23.if.else59.i:
; predecessors: %bb.21, %bb.22
  successors: %bb.25(0x00106035), %bb.24(0x7fef9fcb); %bb.25(0.05%), %bb.24(99.95%)

  CLFIMux %16:gr32bit, 4294963201, implicit-def $cc
  BRC 14, 4, %bb.25, implicit killed $cc
  J %bb.24

bb.24.if.else59.i.dma_fence_set_error.exit_crit_edge:
; predecessors: %bb.23
  successors: %bb.26(0x80000000); %bb.26(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  J %bb.26

bb.25.do.body70.i:
; predecessors: %bb.23
  successors: %bb.26(0x80000000); %bb.26(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  INLINEASM &"0:\09mc\090,0\0A.section .rodata.str,\22aMS\22,@progbits,1\0A1:\09.asciz\09\22include/linux/dma-fence.h\22\0A.previous\0A.section __bug_table,\22aw\22\0A2:\09.long\090b-.\0A\09.long\091b-.\0A\09.short\09$0,$1\0A\09.org\092b+$2\0A.previous\0A" [sideeffect] [mayload] [attdialect], $0:[imm], 586, $1:[imm], 2305, $2:[imm], 12, !7

bb.26.dma_fence_set_error.exit:
; predecessors: %bb.24, %bb.25
  successors: %bb.27(0x00000800), %bb.29(0x7ffff800); %bb.27(0.00%), %bb.29(100.00%)

  %10:addr64bit = AGHIK %4:addr64bit, -4, implicit-def dead $cc
  %59:gr64bit = SRLG %10:addr64bit, $noreg, 3
  %61:addr64bit = AGRK %59:gr64bit, %25:gr64bit, implicit-def dead $cc
  %11:gr32bit = LBMux %61:addr64bit, 0, $noreg :: (load (s8) from %ir.46)
  CHIMux %11:gr32bit, 0, implicit-def $cc
  BRC 14, 8, %bb.29, implicit killed $cc
  J %bb.27

bb.27 (%ir-block.49):
; predecessors: %bb.26
  successors: %bb.28(0x40000000), %bb.29(0x40000000); %bb.28(50.00%), %bb.29(50.00%)

  %62:grx32bit = COPY %10.subreg_l32:addr64bit
  %63:grx32bit = RISBMux $noreg(tied-def 0), %62:grx32bit, 29, 159, 0
  %64:gr32bit = AHIMuxK %63:grx32bit, 3, implicit-def dead $cc
  CR %64:gr32bit, %11:gr32bit, implicit-def $cc
  BRC 14, 4, %bb.29, implicit killed $cc
  J %bb.28

bb.28 (%ir-block.55):
; predecessors: %bb.27
  successors: %bb.29(0x80000000); %bb.29(100.00%)

  $r2d = COPY %10:addr64bit
  nomerge CallBRASL @__asan_report_store4_noabort, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc

bb.29 (%ir-block.57):
; predecessors: %bb.26, %bb.27, %bb.28
  successors: %bb.30(0x80000000); %bb.30(100.00%)

  STMux %16:gr32bit, %10:addr64bit, 0, $noreg :: (store (s32) into %ir.error85.i)

bb.30.if.end:
; predecessors: %bb.19, %bb.29, %bb.10, %bb.8
  successors: %bb.32(0x40000000), %bb.31(0x40000000); %bb.32(50.00%), %bb.31(50.00%)

  $r2d = COPY %5:gr64bit
  CallBRASL @nouveau_fence_signal, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc, implicit-def $r2d
  %65:gr64bit = COPY $r2d
  %66:grx32bit = COPY %65.subreg_l32:gr64bit
  CHIMux %66:grx32bit, 0, implicit-def $cc
  BRC 14, 6, %bb.32, implicit killed $cc
  J %bb.31

bb.31.if.end.for.inc_crit_edge:
; predecessors: %bb.30
  successors: %bb.33(0x80000000); %bb.33(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  J %bb.33

bb.32.if.then18:
; predecessors: %bb.30
  successors: %bb.33(0x80000000); %bb.33(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  $r2d = COPY %3:gr64bit
  CallBRASL @nvif_event_block, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc, implicit-def dead $r2d

bb.33.for.inc:
; predecessors: %bb.31, %bb.32
  successors: %bb.35(0x04000000), %bb.34(0x7c000000); %bb.35(3.12%), %bb.34(96.88%)

  CGR %6:gr64bit, %1:addr64bit, implicit-def $cc
  BRC 14, 8, %bb.35, implicit killed $cc
  J %bb.34

bb.34.for.inc.for.body_crit_edge:
; predecessors: %bb.33
  successors: %bb.5(0x80000000); %bb.5(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  %76:addr64bit = COPY %6:gr64bit
  J %bb.5

bb.35.for.inc.for.end_crit_edge:
; predecessors: %bb.33
  successors: %bb.36(0x80000000); %bb.36(100.00%)

  nomerge CallBRASL @__sanitizer_cov_trace_pc, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc

bb.36.for.end:
; predecessors: %bb.35, %bb.3
  successors: %bb.37(0x00000800), %bb.39(0x7ffff800); %bb.37(0.00%), %bb.39(100.00%)

  %0:gr64bit = COPY %17:gr64bit
  %12:gr64bit = LA %14:addr64bit, 344, $noreg
  %68:gr64bit = SRLG %12:gr64bit, $noreg, 3
  %69:gr64bit = LLIHH 28
  %70:addr64bit = AGRK %68:gr64bit, %69:gr64bit, implicit-def dead $cc
  %13:gr32bit = LBMux %70:addr64bit, 0, $noreg :: (load (s8) from %ir.61)
  CHIMux %13:gr32bit, 0, implicit-def $cc
  BRC 14, 8, %bb.39, implicit killed $cc
  J %bb.37

bb.37 (%ir-block.64):
; predecessors: %bb.36
  successors: %bb.38(0x40000000), %bb.39(0x40000000); %bb.38(50.00%), %bb.39(50.00%)

  %71:grx32bit = COPY %12.subreg_l32:gr64bit
  %72:grx32bit = RISBMux $noreg(tied-def 0), %71:grx32bit, 29, 159, 0
  %73:gr32bit = AHIMuxK %72:grx32bit, 3, implicit-def dead $cc
  CR %73:gr32bit, %13:gr32bit, implicit-def $cc
  BRC 14, 4, %bb.39, implicit killed $cc
  J %bb.38

bb.38 (%ir-block.70):
; predecessors: %bb.37
  successors: %bb.39(0x80000000); %bb.39(100.00%)

  $r2d = COPY %12:gr64bit
  nomerge CallBRASL @__asan_report_store4_noabort, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc

bb.39 (%ir-block.72):
; predecessors: %bb.36, %bb.37, %bb.38

  MVHI %14:addr64bit, 344, 1 :: (store (s32) into %ir.sunkaddr, align 8)
  $r2d = COPY %14:addr64bit
  $r3d = COPY %0:gr64bit
  CallJG @_raw_spin_unlock_irqrestore, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit $r2d, implicit $r3d

# End machine code for function nouveau_fence_context_kill.

*** Bad machine code: Using an undefined physical register ***
- function:    nouveau_fence_context_kill
- basic block: %bb.20 cond.false.i45 (0x5559e7064b80)
- instruction: %75:gr32bit = IPM implicit $cc
- operand 1:   implicit $cc

*** Bad machine code: Using an undefined physical register ***
- function:    nouveau_fence_context_kill
- basic block: %bb.20 cond.false.i45 (0x5559e7064b80)
- instruction: %52:gr32bit = IPM implicit $cc
- operand 1:   implicit $cc
fatal error: error in backend: Found 2 machine code errors.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang ... -o drivers/gpu/drm/nouveau/nouveau_fence.o drivers/gpu/drm/nouveau/nouveau_fence.c
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module 'drivers/gpu/drm/nouveau/nouveau_fence.c'.
4.	Running pass 'Live Interval Analysis' on function '@nouveau_fence_context_kill'
 #0 0x00005559dd792528 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (clang+0x36fa528)
 #1 0x00005559dd78fca5 llvm::sys::RunSignalHandlers() (clang+0x36f7ca5)
 #2 0x00005559dd711e47 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) CrashRecoveryContext.cpp:0:0
 #3 0x00005559dd711ddf llvm::CrashRecoveryContext::HandleExit(int) (clang+0x3679ddf)
 #4 0x00005559dd78c767 llvm::sys::Process::Exit(int, bool) (clang+0x36f4767)
 #5 0x00005559dc3cb136 (clang+0x2333136)
 #6 0x00005559dd718206 llvm::report_fatal_error(llvm::Twine const&, bool) (clang+0x3680206)
 #7 0x00005559dcdaa41e (clang+0x2d1241e)
 #8 0x00005559dcdaaa86 llvm::MachineFunction::verify(llvm::Pass*, char const*, llvm::raw_ostream*, bool) const (clang+0x2d12a86)
 #9 0x00005559dcc43cc2 llvm::LiveRangeCalc::findReachingDefs(llvm::LiveRange&, llvm::MachineBasicBlock&, llvm::SlotIndex, llvm::Register, llvm::ArrayRef<llvm::SlotIndex>) (clang+0x2babcc2)
#10 0x00005559dcc42f5b llvm::LiveRangeCalc::extend(llvm::LiveRange&, llvm::SlotIndex, llvm::Register, llvm::ArrayRef<llvm::SlotIndex>) (clang+0x2baaf5b)
#11 0x00005559dcc46d19 llvm::LiveIntervalCalc::extendToUses(llvm::LiveRange&, llvm::Register, llvm::LaneBitmask, llvm::LiveInterval*) (clang+0x2baed19)
#12 0x00005559dcc2ee43 llvm::LiveIntervals::computeRegUnitRange(llvm::LiveRange&, unsigned int) (clang+0x2b96e43)
#13 0x00005559dcc2e1af llvm::LiveIntervals::computeLiveInRegUnits() (clang+0x2b961af)
#14 0x00005559dcc2caff llvm::LiveIntervals::analyze(llvm::MachineFunction&) (clang+0x2b94aff)
#15 0x00005559dcc2c8ac llvm::LiveIntervalsWrapperPass::runOnMachineFunction(llvm::MachineFunction&) (clang+0x2b948ac)
#16 0x00005559dccdff23 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (clang+0x2c47f23)
#17 0x00005559dd278cb8 llvm::FPPassManager::runOnFunction(llvm::Function&) (clang+0x31e0cb8)
#18 0x00005559dd2803d2 llvm::FPPassManager::runOnModule(llvm::Module&) (clang+0x31e83d2)
#19 0x00005559dd2796a0 llvm::legacy::PassManagerImpl::run(llvm::Module&) (clang+0x31e16a0)
#20 0x00005559ddeed7de clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (clang+0x3e557de)
#21 0x00005559ddf03008 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (clang+0x3e6b008)
#22 0x00005559df505819 clang::ParseAST(clang::Sema&, bool, bool) (clang+0x546d819)
#23 0x00005559de3ffb16 clang::FrontendAction::Execute() (clang+0x4367b16)
#24 0x00005559de3687cd clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (clang+0x42d07cd)
#25 0x00005559de4d7775 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (clang+0x443f775)
#26 0x00005559dc3caa77 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (clang+0x2332a77)
#27 0x00005559dc3c6875 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>) driver.cpp:0:0
#28 0x00005559dc3c8e4d int llvm::function_ref<int (llvm::SmallVectorImpl<char const*>&)>::callback_fn<clang_main(int, char**, llvm::ToolContext const&)::$_0>(long, llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#29 0x00005559de1cda69 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) Job.cpp:0:0
#30 0x00005559dd711d7e llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (clang+0x3679d7e)
#31 0x00005559de1cd2a3 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (clang+0x41352a3)
#32 0x00005559de18eadc clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (clang+0x40f6adc)
#33 0x00005559de18ecf7 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (clang+0x40f6cf7)
#34 0x00005559de1a8578 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (clang+0x4110578)
#35 0x00005559dc3c6130 clang_main(int, char**, llvm::ToolContext const&) (clang+0x232e130)
#36 0x00005559dc3d6247 main (clang+0x233e247)
#37 0x00007f2cc3027635 (/usr/lib/libc.so.6+0x27635)
#38 0x00007f2cc30276e9 __libc_start_main (/usr/lib/libc.so.6+0x276e9)
#39 0x00005559dc3c42e5 _start (clang+0x232c2e5)
clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)
ClangBuiltLinux clang version 22.0.0git (https://github.com/llvm/llvm-project.git 097f1e7625966673b881df63a241f755317b0bb9)
...

cvise spits out:

int arch_test_bit_cc, nouveau_fence_wait_legacy_fence;
long jiffies, nouveau_fence_wait_legacy_wait;
long nouveau_fence_wait_legacy() {
  long t = jiffies, timeout = nouveau_fence_wait_legacy_wait;
  while (nouveau_fence_wait_legacy_fence)
    asm("" : "=@cc"(arch_test_bit_cc));
  return timeout - t;
}
$ clang --target=s390x-linux-gnu -c -o /dev/null nouveau_fence.i

$ clang --target=s390x-linux-gnu -O2 -c -o /dev/null nouveau_fence.i

# Machine code for function nouveau_fence_wait_legacy: NoPHIs, TracksLiveness, TiedOpsRewritten
Function Live Ins: $cc

bb.0.entry:
  successors: %bb.3(0x30000000), %bb.1(0x50000000); %bb.3(37.50%), %bb.1(62.50%)
  liveins: $cc
  %0:addr64bit = LARL @nouveau_fence_wait_legacy_fence
  CHSI %0:addr64bit, 0, 0, implicit-def $cc :: (dereferenceable load (s32) from @nouveau_fence_wait_legacy_fence, !tbaa !4)
  BRC 14, 8, %bb.3, implicit $cc
  J %bb.1

bb.1.while.body.lr.ph:
; predecessors: %bb.0
  successors: %bb.2(0x80000000); %bb.2(100.00%)

  INLINEASM &"" [maystore] [attdialect], $0:[regdef:GR32Bit], def dead %1:gr32bit, !8
  %2:gr32bit = IPM implicit $cc
  %3:gr32bit = COPY %2:gr32bit
  %3:gr32bit = SRL %3:gr32bit(tied-def 0), $noreg, 28
  STRL %3:gr32bit, @arch_test_bit_cc :: (store (s32) into @arch_test_bit_cc, !tbaa !4)

bb.2.while.body:
; predecessors: %bb.1, %bb.2
  successors: %bb.2(0x80000000); %bb.2(100.00%)

  J %bb.2

bb.3.while.end:
; predecessors: %bb.0

  %4:gr64bit = LGRL @nouveau_fence_wait_legacy_wait :: (dereferenceable load (s64) from @nouveau_fence_wait_legacy_wait, !tbaa !9)
  %5:addr64bit = LARL @jiffies
  %6:gr64bit = COPY %4:gr64bit
  %6:gr64bit = nsw SG %6:gr64bit(tied-def 0), %5:addr64bit, 0, $noreg, implicit-def dead $cc :: (dereferenceable load (s64) from @jiffies, !tbaa !9)
  $r2d = COPY %6:gr64bit
  Return implicit killed $r2d

# End machine code for function nouveau_fence_wait_legacy.

*** Bad machine code: Using an undefined physical register ***
- function:    nouveau_fence_wait_legacy
- basic block: %bb.1 while.body.lr.ph (0x55dba4f72718)
- instruction: %2:gr32bit = IPM implicit $cc
- operand 1:   implicit $cc
fatal error: error in backend: Found 1 machine code errors.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang --target=s390x-linux-gnu -O2 -c -o /dev/null nouveau_fence.i
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'nouveau_fence.i'.
4.      Running pass 'Live Interval Analysis' on function '@nouveau_fence_wait_legacy'
 #0 0x000055db9b16c2b8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (clang-22+0x36f42b8)
 #1 0x000055db9b169a35 llvm::sys::RunSignalHandlers() (clang-22+0x36f1a35)
 #2 0x000055db9b0ebb97 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) CrashRecoveryContext.cpp:0:0
 #3 0x000055db9b0ebb2f llvm::CrashRecoveryContext::HandleExit(int) (clang-22+0x3673b2f)
 #4 0x000055db9b1664f7 llvm::sys::Process::Exit(int, bool) (clang-22+0x36ee4f7)
 #5 0x000055db99da40c6 (clang-22+0x232c0c6)
 #6 0x000055db9b0f1f56 llvm::report_fatal_error(llvm::Twine const&, bool) (clang-22+0x3679f56)
 #7 0x000055db9a783e5e (clang-22+0x2d0be5e)
 #8 0x000055db9a7844c6 llvm::MachineFunction::verify(llvm::Pass*, char const*, llvm::raw_ostream*, bool) const (clang-22+0x2d0c4c6)
 #9 0x000055db9a61d5c2 llvm::LiveRangeCalc::findReachingDefs(llvm::LiveRange&, llvm::MachineBasicBlock&, llvm::SlotIndex, llvm::Register, llvm::ArrayRef<llvm::SlotIndex>) (clang-22+0x2ba55c2)
#10 0x000055db9a61c85b llvm::LiveRangeCalc::extend(llvm::LiveRange&, llvm::SlotIndex, llvm::Register, llvm::ArrayRef<llvm::SlotIndex>) (clang-22+0x2ba485b)
#11 0x000055db9a620629 llvm::LiveIntervalCalc::extendToUses(llvm::LiveRange&, llvm::Register, llvm::LaneBitmask, llvm::LiveInterval*) (clang-22+0x2ba8629)
#12 0x000055db9a6087b3 llvm::LiveIntervals::computeRegUnitRange(llvm::LiveRange&, unsigned int) (clang-22+0x2b907b3)
#13 0x000055db9a607b1f llvm::LiveIntervals::computeLiveInRegUnits() (clang-22+0x2b8fb1f)
#14 0x000055db9a60646f llvm::LiveIntervals::analyze(llvm::MachineFunction&) (clang-22+0x2b8e46f)
#15 0x000055db9a60621c llvm::LiveIntervalsWrapperPass::runOnMachineFunction(llvm::MachineFunction&) (clang-22+0x2b8e21c)
#16 0x000055db9a6b9903 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (clang-22+0x2c41903)
#17 0x000055db9ac52a48 llvm::FPPassManager::runOnFunction(llvm::Function&) (clang-22+0x31daa48)
#18 0x000055db9ac5a162 llvm::FPPassManager::runOnModule(llvm::Module&) (clang-22+0x31e2162)
#19 0x000055db9ac53430 llvm::legacy::PassManagerImpl::run(llvm::Module&) (clang-22+0x31db430)
#20 0x000055db9b8c473e clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (clang-22+0x3e4c73e)
#21 0x000055db9b8d9f68 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (clang-22+0x3e61f68)
#22 0x000055db9cee0559 clang::ParseAST(clang::Sema&, bool, bool) (clang-22+0x5468559)
#23 0x000055db9bdd6b16 clang::FrontendAction::Execute() (clang-22+0x435eb16)
#24 0x000055db9bd3f77d clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (clang-22+0x42c777d)
#25 0x000055db9beaec05 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (clang-22+0x4436c05)
#26 0x000055db99da3a07 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (clang-22+0x232ba07)
#27 0x000055db99d9f7f5 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>) driver.cpp:0:0
#28 0x000055db99da1ddd int llvm::function_ref<int (llvm::SmallVectorImpl<char const*>&)>::callback_fn<clang_main(int, char**, llvm::ToolContext const&)::$_0>(long, llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#29 0x000055db9bba4b69 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) Job.cpp:0:0
#30 0x000055db9b0ebace llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (clang-22+0x3673ace)
#31 0x000055db9bba43a3 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (clang-22+0x412c3a3)
#32 0x000055db9bb65b2c clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (clang-22+0x40edb2c)
#33 0x000055db9bb65d47 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (clang-22+0x40edd47)
#34 0x000055db9bb7f5f8 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (clang-22+0x41075f8)
#35 0x000055db99d9f0b0 clang_main(int, char**, llvm::ToolContext const&) (clang-22+0x23270b0)
#36 0x000055db99daf1d7 main (clang-22+0x23371d7)
#37 0x00007f45f4827635 (/usr/lib/libc.so.6+0x27635)
#38 0x00007f45f48276e9 __libc_start_main (/usr/lib/libc.so.6+0x276e9)
#39 0x000055db99d9d265 _start (clang-22+0x2325265)
clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)
ClangBuiltLinux clang version 22.0.0git (https://github.com/llvm/llvm-project.git 6712e20c5261376a6b0015fb3c8d15124757d47d)
...
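
For readers following the `IPM` / `SRL ..., 28` pair in the dump above, here is a minimal sketch of the arithmetic that sequence performs, assuming the usual `IPM` layout in the low 32 bits of the register (two zero bits, then the 2-bit condition code, then the 4-bit program mask, with the remaining 24 bits left unchanged). The helper names `model_ipm` and `cc_from_ipm` are hypothetical and do not appear in the patch. The sketch only models the bit extraction; it says nothing about the undefined `$cc` that the machine verifier reports.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the low 32 bits of a register after IPM:
   bits 31-30 are zero, bits 29-28 hold the condition code,
   bits 27-24 hold the program mask, and bits 23-0 keep their
   previous contents (passed in here as low24). */
static uint32_t model_ipm(unsigned cc, unsigned program_mask, uint32_t low24) {
  return ((uint32_t)(cc & 3u) << 28) |
         ((uint32_t)(program_mask & 0xFu) << 24) |
         (low24 & 0x00FFFFFFu);
}

/* Models the logical right shift by 28 (SRL): the program mask and the
   unchanged low bits are shifted out, leaving only the condition code. */
static unsigned cc_from_ipm(uint32_t ipm) {
  unsigned cc = (unsigned)(ipm >> 28);
  /* Mirrors the [0, 4) range assertion described for "=@cc" outputs. */
  assert(cc < 4);
  return cc;
}
```

Because the top two bits produced by `IPM` are zero, a plain logical shift is enough and no extra masking is needed, which is consistent with the two-instruction sequence in the dump.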

@anoopkg6
Contributor Author

anoopkg6 commented Oct 16, 2025 via email
