Skip to content

Conversation

@arsenm
Copy link
Contributor

@arsenm arsenm commented May 27, 2025

The API is broken and doesn't provide a way to report the used
subregister, so it's unsafe to use. This will produce illegal
folds if the subregister is silently dropped.

The API is broken and doesn't provide a way to report the used
subregister, so it's unsafe to use. This will produce illegal
folds if the subregister is silently dropped.
@arsenm arsenm marked this pull request as ready for review May 27, 2025 18:35
Copy link
Contributor Author

arsenm commented May 27, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@llvmbot
Copy link
Member

llvmbot commented May 27, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Matt Arsenault (arsenm)

Changes

The API is broken and doesn't provide a way to report the used
subregister, so it's unsafe to use. This will produce illegal
folds if the subregister is silently dropped.


Full diff: https://github.com/llvm/llvm-project/pull/141654.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+10-1)
  • (added) llvm/test/CodeGen/AArch64/peephole-opt-analyzeCompare-subreg-use.mir (+67)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index d6225bca233b6..f95d7b7ac8c73 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -1277,8 +1277,9 @@ bool AArch64InstrInfo::analyzeCompare(const MachineInstr &MI, Register &SrcReg,
                                       int64_t &CmpValue) const {
   // The first operand can be a frame index where we'd normally expect a
   // register.
+  // FIXME: Pass subregisters out of analyzeCompare
   assert(MI.getNumOperands() >= 2 && "All AArch64 cmps should have 2 operands");
-  if (!MI.getOperand(1).isReg())
+  if (!MI.getOperand(1).isReg() || MI.getOperand(1).getSubReg())
     return false;
 
   switch (MI.getOpcode()) {
@@ -1288,6 +1289,9 @@ bool AArch64InstrInfo::analyzeCompare(const MachineInstr &MI, Register &SrcReg,
   case AArch64::PTEST_PP_ANY:
     SrcReg = MI.getOperand(0).getReg();
     SrcReg2 = MI.getOperand(1).getReg();
+    if (MI.getOperand(2).getSubReg())
+      return false;
+
     // Not sure about the mask and value for now...
     CmpMask = ~0;
     CmpValue = 0;
@@ -1307,6 +1311,11 @@ bool AArch64InstrInfo::analyzeCompare(const MachineInstr &MI, Register &SrcReg,
     // Replace SUBSWrr with SUBWrr if NZCV is not used.
     SrcReg = MI.getOperand(1).getReg();
     SrcReg2 = MI.getOperand(2).getReg();
+
+    // FIXME: Pass subregisters out of analyzeCompare
+    if (MI.getOperand(2).getSubReg())
+      return false;
+
     CmpMask = ~0;
     CmpValue = 0;
     return true;
diff --git a/llvm/test/CodeGen/AArch64/peephole-opt-analyzeCompare-subreg-use.mir b/llvm/test/CodeGen/AArch64/peephole-opt-analyzeCompare-subreg-use.mir
new file mode 100644
index 0000000000000..c9d6bc7cc8f87
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/peephole-opt-analyzeCompare-subreg-use.mir
@@ -0,0 +1,67 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=aarch64-- -run-pass=peephole-opt -o - %s | FileCheck %s
+
+# Make sure that analyzeCompare doesn't produce illegal folds due to
+# ignoring the subregister index on the use operands.
+
+---
+name:            analyze_compare_subreg_use_lhs
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $x0, $x1
+
+    ; CHECK-LABEL: name: analyze_compare_subreg_use_lhs
+    ; CHECK: liveins: $x0, $x1
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64 = COPY $x1
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x0
+    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr32 = COPY $wzr
+    ; CHECK-NEXT: [[SUBSWrr:%[0-9]+]]:gpr32 = SUBSWrr [[COPY]].sub_32, [[COPY2]], implicit-def dead $nzcv
+    ; CHECK-NEXT: [[DEF:%[0-9]+]]:gpr64all = IMPLICIT_DEF
+    ; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:gpr64 = INSERT_SUBREG [[DEF]], killed [[SUBSWrr]], %subreg.sub_32
+    ; CHECK-NEXT: [[RORVXr:%[0-9]+]]:gpr64 = RORVXr [[COPY1]], killed [[INSERT_SUBREG]]
+    ; CHECK-NEXT: $x0 = COPY [[RORVXr]]
+    ; CHECK-NEXT: RET_ReallyLR implicit $x0
+    %0:gpr64 = COPY $x1
+    %1:gpr64 = COPY $x0
+    %2:gpr32 = COPY $wzr
+    %3:gpr32 = SUBSWrr %0.sub_32, %2, implicit-def dead $nzcv
+    %4:gpr64all = IMPLICIT_DEF
+    %5:gpr64 = INSERT_SUBREG %4, killed %3, %subreg.sub_32
+    %6:gpr64 = RORVXr %1, killed %5
+    $x0 = COPY %6
+    RET_ReallyLR implicit $x0
+
+...
+
+---
+name:            analyze_compare_subreg_use_rhs
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $x0, $x1
+
+    ; CHECK-LABEL: name: analyze_compare_subreg_use_rhs
+    ; CHECK: liveins: $x0, $x1
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64 = COPY $x1
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x0
+    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr32 = COPY $wzr
+    ; CHECK-NEXT: [[SUBSWrr:%[0-9]+]]:gpr32 = SUBSWrr [[COPY2]], [[COPY]].sub_32, implicit-def dead $nzcv
+    ; CHECK-NEXT: [[DEF:%[0-9]+]]:gpr64all = IMPLICIT_DEF
+    ; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:gpr64 = INSERT_SUBREG [[DEF]], killed [[SUBSWrr]], %subreg.sub_32
+    ; CHECK-NEXT: [[RORVXr:%[0-9]+]]:gpr64 = RORVXr [[COPY1]], killed [[INSERT_SUBREG]]
+    ; CHECK-NEXT: $x0 = COPY [[RORVXr]]
+    ; CHECK-NEXT: RET_ReallyLR implicit $x0
+    %0:gpr64 = COPY $x1
+    %1:gpr64 = COPY $x0
+    %2:gpr32 = COPY $wzr
+    %3:gpr32 = SUBSWrr %2, %0.sub_32, implicit-def dead $nzcv
+    %4:gpr64all = IMPLICIT_DEF
+    %5:gpr64 = INSERT_SUBREG %4, killed %3, %subreg.sub_32
+    %6:gpr64 = RORVXr %1, killed %5
+    $x0 = COPY %6
+    RET_ReallyLR implicit $x0
+
+...

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Note that analyzeCompare has exactly one caller, which immediately passes the result to optimizeCompareInstr; maybe we should just merge them together.

@arsenm arsenm merged commit f4ca6d9 into main Jun 3, 2025
15 checks passed
@arsenm arsenm deleted the users/arsenm/aarch64-skip-analyzeCompare-subregisters branch June 3, 2025 20:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants