Conversation

@diggerlin
Contributor

@diggerlin diggerlin commented Sep 9, 2024

This patch is based on Esme's patch #99766.

This patch utilizes getReservedRegs() to find asm-clobberable registers.
To make the result of getReservedRegs() accurate, it also implements the TODO of making r2 allocatable on AIX for some leaf functions.

Thanks to Esme for her work; I am taking over the patch.
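For illustration, a leaf function of the kind the TODO refers to might look like the following (a hypothetical sketch, not taken from the patch): with no TOC-based accesses and no inline asm, r2/x2 no longer needs to be reserved.

```llvm
; Hypothetical sketch of a leaf function with no TOC use and no inline asm.
; With this patch, getReservedRegs() should no longer reserve r2/x2 on AIX
; for such a function, so the register allocator may use it.
define i64 @leaf_add(i64 %a, i64 %b) {
entry:
  %sum = add i64 %a, %b
  ret i64 %sum
}
```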

@llvmbot
Member

llvmbot commented Sep 9, 2024

@llvm/pr-subscribers-backend-powerpc

Author: zhijian lin (diggerlin)

Changes

This patch is based on Esme's patch #99766.

This patch utilizes getReservedRegs() to find asm-clobberable registers.
To make the result of getReservedRegs() accurate, it also implements the TODO of making r2 allocatable on AIX for some leaf functions.

Thanks to Esme for her work; I am taking over the patch.


Patch is 24.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107863.diff

12 Files Affected:

  • (modified) llvm/lib/Target/PowerPC/PPCCallingConv.td (+3-1)
  • (modified) llvm/lib/Target/PowerPC/PPCISelLowering.cpp (+2)
  • (modified) llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp (+18-17)
  • (modified) llvm/lib/Target/PowerPC/PPCRegisterInfo.td (+12-4)
  • (modified) llvm/lib/Target/PowerPC/PPCSubtarget.h (+1-1)
  • (added) llvm/test/CodeGen/PowerPC/aix-inline-asm-clobber-warning.ll (+12)
  • (modified) llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir (+2-3)
  • (modified) llvm/test/CodeGen/PowerPC/inline-asm-clobber-warning.ll (+23-2)
  • (modified) llvm/test/CodeGen/PowerPC/ldst-16-byte.mir (+33-37)
  • (modified) llvm/test/CodeGen/PowerPC/mflr-store.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/peephole-replaceInstr-after-eliminate-extsw.mir (+7-4)
  • (modified) llvm/test/CodeGen/PowerPC/tocdata-non-zero-addend.mir (+2)
diff --git a/llvm/lib/Target/PowerPC/PPCCallingConv.td b/llvm/lib/Target/PowerPC/PPCCallingConv.td
index 825c1a29ed62cb..d966d2a09aa78c 100644
--- a/llvm/lib/Target/PowerPC/PPCCallingConv.td
+++ b/llvm/lib/Target/PowerPC/PPCCallingConv.td
@@ -423,8 +423,10 @@ def CSR_SVR64_ColdCC_R2_VSRP : CalleeSavedRegs<(add CSR_SVR64_ColdCC_VSRP, X2)>;
 def CSR_64_AllRegs_VSRP :
   CalleeSavedRegs<(add CSR_64_AllRegs_VSX, CSR_ALL_VSRP)>;
 
+def CSR_AIX64_R2 : CalleeSavedRegs<(add X2, CSR_PPC64)>;
+
 def CSR_AIX64_VSRP : CalleeSavedRegs<(add CSR_PPC64_Altivec, CSR_VSRP)>;
 
-def CSR_AIX64_R2_VSRP : CalleeSavedRegs<(add CSR_AIX64_VSRP, X2)>;
+def CSR_AIX64_R2_VSRP : CalleeSavedRegs<(add X2, CSR_AIX64_VSRP)>;
 
 def CSR_AIX32_VSRP : CalleeSavedRegs<(add CSR_AIX32_Altivec, CSR_VSRP)>;
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 459a96eca1ff20..4ee9f3301e3bc1 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -3434,6 +3434,8 @@ SDValue PPCTargetLowering::LowerGlobalTLSAddressAIX(SDValue Op,
   if (Subtarget.hasAIXShLibTLSModelOpt())
     updateForAIXShLibTLSModelOpt(Model, DAG, getTargetMachine());
 
+  setUsesTOCBasePtr(DAG);
+
   bool IsTLSLocalExecModel = Model == TLSModel::LocalExec;
 
   if (IsTLSLocalExecModel || Model == TLSModel::InitialExec) {
diff --git a/llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp b/llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
index 9e8da59615dfb3..d43bf473d80cfc 100644
--- a/llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
+++ b/llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
@@ -240,7 +240,7 @@ PPCRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
     if (Subtarget.pairedVectorMemops()) {
       if (Subtarget.isAIXABI()) {
         if (!TM.getAIXExtendedAltivecABI())
-          return SaveR2 ? CSR_PPC64_R2_SaveList : CSR_PPC64_SaveList;
+          return SaveR2 ? CSR_AIX64_R2_SaveList : CSR_PPC64_SaveList;
         return SaveR2 ? CSR_AIX64_R2_VSRP_SaveList : CSR_AIX64_VSRP_SaveList;
       }
       return SaveR2 ? CSR_SVR464_R2_VSRP_SaveList : CSR_SVR464_VSRP_SaveList;
@@ -250,7 +250,9 @@ PPCRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
       return SaveR2 ? CSR_PPC64_R2_Altivec_SaveList
                     : CSR_PPC64_Altivec_SaveList;
     }
-    return SaveR2 ? CSR_PPC64_R2_SaveList : CSR_PPC64_SaveList;
+    return SaveR2 ? (Subtarget.isAIXABI() ? CSR_AIX64_R2_SaveList
+                                          : CSR_PPC64_R2_SaveList)
+                  : CSR_PPC64_SaveList;
   }
   // 32-bit targets.
   if (Subtarget.isAIXABI()) {
@@ -380,6 +382,8 @@ BitVector PPCRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
 
   markSuperRegs(Reserved, PPC::VRSAVE);
 
+  const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
+  bool UsesTOCBasePtr = FuncInfo->usesTOCBasePtr();
   // The SVR4 ABI reserves r2 and r13
   if (Subtarget.isSVR4ABI()) {
     // We only reserve r2 if we need to use the TOC pointer. If we have no
@@ -387,16 +391,15 @@ BitVector PPCRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
     // no constant-pool loads, etc.) and we have no potential uses inside an
     // inline asm block, then we can treat r2 has an ordinary callee-saved
     // register.
-    const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
-    if (!TM.isPPC64() || FuncInfo->usesTOCBasePtr() || MF.hasInlineAsm())
-      markSuperRegs(Reserved, PPC::R2);  // System-reserved register
-    markSuperRegs(Reserved, PPC::R13); // Small Data Area pointer register
+    if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
+      markSuperRegs(Reserved, PPC::R2); // System-reserved register.
+    markSuperRegs(Reserved, PPC::R13);  // Small Data Area pointer register.
   }
 
-  // Always reserve r2 on AIX for now.
-  // TODO: Make r2 allocatable on AIX/XCOFF for some leaf functions.
   if (Subtarget.isAIXABI())
-    markSuperRegs(Reserved, PPC::R2);  // System-reserved register
+    // We only reserve r2 if we need to use the TOC pointer on AIX.
+    if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
+      markSuperRegs(Reserved, PPC::R2); // System-reserved register.
 
   // On PPC64, r13 is the thread pointer. Never allocate this register.
   if (TM.isPPC64())
@@ -441,14 +444,12 @@ BitVector PPCRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
 
 bool PPCRegisterInfo::isAsmClobberable(const MachineFunction &MF,
                                        MCRegister PhysReg) const {
-  // We cannot use getReservedRegs() to find the registers that are not asm
-  // clobberable because there are some reserved registers which can be
-  // clobbered by inline asm. For example, when LR is clobbered, the register is
-  // saved and restored. We will hardcode the registers that are not asm
-  // cloberable in this function.
-
-  // The stack pointer (R1/X1) is not clobberable by inline asm
-  return PhysReg != PPC::R1 && PhysReg != PPC::X1;
+  // CTR and LR registers are always reserved, but they are asm clobberable.
+  if (PhysReg == PPC::CTR || PhysReg == PPC::CTR8 || PhysReg == PPC::LR ||
+      PhysReg == PPC::LR8)
+    return true;
+
+  return !getReservedRegs(MF).test(PhysReg);
 }
 
 bool PPCRegisterInfo::requiresFrameIndexScavenging(const MachineFunction &MF) const {
diff --git a/llvm/lib/Target/PowerPC/PPCRegisterInfo.td b/llvm/lib/Target/PowerPC/PPCRegisterInfo.td
index 3cb7cd9d8f2299..56e170b1230f6f 100644
--- a/llvm/lib/Target/PowerPC/PPCRegisterInfo.td
+++ b/llvm/lib/Target/PowerPC/PPCRegisterInfo.td
@@ -341,7 +341,9 @@ def GPRC : RegisterClass<"PPC", [i32,f32], 32, (add (sequence "R%u", 2, 12),
   // This also helps setting the correct `NumOfGPRsSaved' in traceback table.
   let AltOrders = [(add (sub GPRC, R2), R2),
                    (add (sequence "R%u", 2, 12),
-                        (sequence "R%u", 31, 13), R0, R1, FP, BP)];
+                        (sequence "R%u", 31, 13), R0, R1, FP, BP),
+                   (add (sequence "R%u", 3, 12),
+                        (sequence "R%u", 31, 13), R2, R0, R1, FP, BP)];
   let AltOrderSelect = [{
     return MF.getSubtarget<PPCSubtarget>().getGPRAllocationOrderIdx();
   }];
@@ -354,7 +356,9 @@ def G8RC : RegisterClass<"PPC", [i64], 64, (add (sequence "X%u", 2, 12),
   // put it at the end of the list.
   let AltOrders = [(add (sub G8RC, X2), X2),
                    (add (sequence "X%u", 2, 12),
-                        (sequence "X%u", 31, 13), X0, X1, FP8, BP8)];
+                        (sequence "X%u", 31, 13), X0, X1, FP8, BP8),
+                   (add (sequence "X%u", 3, 12),
+                        (sequence "X%u", 31, 13), X2, X0, X1, FP8, BP8)];
   let AltOrderSelect = [{
     return MF.getSubtarget<PPCSubtarget>().getGPRAllocationOrderIdx();
   }];
@@ -368,7 +372,9 @@ def GPRC_NOR0 : RegisterClass<"PPC", [i32,f32], 32, (add (sub GPRC, R0), ZERO)>
   // put it at the end of the list.
   let AltOrders = [(add (sub GPRC_NOR0, R2), R2),
                    (add (sequence "R%u", 2, 12),
-                        (sequence "R%u", 31, 13), R1, FP, BP, ZERO)];
+                        (sequence "R%u", 31, 13), R1, FP, BP, ZERO),
+                   (add (sequence "R%u", 3, 12),
+                        (sequence "R%u", 31, 13), R2, R1, FP, BP, ZERO)];
   let AltOrderSelect = [{
     return MF.getSubtarget<PPCSubtarget>().getGPRAllocationOrderIdx();
   }];
@@ -379,7 +385,9 @@ def G8RC_NOX0 : RegisterClass<"PPC", [i64], 64, (add (sub G8RC, X0), ZERO8)> {
   // put it at the end of the list.
   let AltOrders = [(add (sub G8RC_NOX0, X2), X2),
                    (add (sequence "X%u", 2, 12),
-                        (sequence "X%u", 31, 13), X1, FP8, BP8, ZERO8)];
+                        (sequence "X%u", 31, 13), X1, FP8, BP8, ZERO8),
+                   (add (sequence "X%u", 3, 12),
+                        (sequence "X%u", 31, 13), X2, X1, FP8, BP8, ZERO8)];
   let AltOrderSelect = [{
     return MF.getSubtarget<PPCSubtarget>().getGPRAllocationOrderIdx();
   }];
diff --git a/llvm/lib/Target/PowerPC/PPCSubtarget.h b/llvm/lib/Target/PowerPC/PPCSubtarget.h
index 2079dc0acc3cf7..9453f692add597 100644
--- a/llvm/lib/Target/PowerPC/PPCSubtarget.h
+++ b/llvm/lib/Target/PowerPC/PPCSubtarget.h
@@ -303,7 +303,7 @@ class PPCSubtarget : public PPCGenSubtargetInfo {
     if (is64BitELFABI())
       return 1;
     if (isAIXABI())
-      return 2;
+      return IsPPC64 ? 3 : 2;
     return 0;
   }
 
diff --git a/llvm/test/CodeGen/PowerPC/aix-inline-asm-clobber-warning.ll b/llvm/test/CodeGen/PowerPC/aix-inline-asm-clobber-warning.ll
new file mode 100644
index 00000000000000..933bd9837f9a66
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/aix-inline-asm-clobber-warning.ll
@@ -0,0 +1,12 @@
+; RUN: llc < %s -mtriple=powerpc-unknown-aix-xcoff -verify-machineinstrs  2>&1 | FileCheck %s
+
+; CHECK: warning: inline asm clobber list contains reserved registers: R2
+; CHECK-NEXT: note: Reserved registers on the clobber list may not be preserved across the asm statement, and clobbering them may lead to undefined behaviour.
+
+@a = external global i32, align 4
+
+define void @bar() {
+  store i32 0, ptr @a, align 4
+  call void asm sideeffect "li 2, 1", "~{r2}"()
+  ret void
+}
diff --git a/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir b/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
index 7d96f7feabe2be..9a1483d5dac48c 100644
--- a/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
+++ b/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
@@ -17,6 +17,5 @@ body: |
     BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $f1
 ...
 # CHECK-DAG: AllocationOrder(VFRC) = [ $vf2 $vf3 $vf4 $vf5 $vf0 $vf1 $vf6 $vf7 $vf8 $vf9 $vf10 $vf11 $vf12 $vf13 $vf14 $vf15 $vf16 $vf17 $vf18 $vf19 $vf31 $vf30 $vf29 $vf28 $vf27 $vf26 $vf25 $vf24 $vf23 $vf22 $vf21 $vf20 ]
-# CHECK-DAG: AllocationOrder(G8RC_and_G8RC_NOX0) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x1
-# CHECK-DAG: 4 ]
-# CHECK-DAG: AllocationOrder(F8RC) = [ $f0 $f1 $f2 $f3 $f4 $f5 $f6 $f7 $f8 $f9 $f10 $f11 $f12 $f13 $f31 $f30 $f29 $f28 $f27 $f26 $f25 $f24 $f23 $f22 $f21 $f20 $f19 $f18 $f17 $f16 $f15 $f14 ]
\ No newline at end of file
+# CHECK-DAG: AllocationOrder(G8RC_and_G8RC_NOX0) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x14 $x2 ]
+# CHECK-DAG: AllocationOrder(F8RC) = [ $f0 $f1 $f2 $f3 $f4 $f5 $f6 $f7 $f8 $f9 $f10 $f11 $f12 $f13 $f31 $f30 $f29 $f28 $f27 $f26 $f25 $f24 $f23 $f22 $f21 $f20 $f19 $f18 $f17 $f16 $f15 $f14 ]
diff --git a/llvm/test/CodeGen/PowerPC/inline-asm-clobber-warning.ll b/llvm/test/CodeGen/PowerPC/inline-asm-clobber-warning.ll
index 7f13f5072d97f1..4c460cf6d8059e 100644
--- a/llvm/test/CodeGen/PowerPC/inline-asm-clobber-warning.ll
+++ b/llvm/test/CodeGen/PowerPC/inline-asm-clobber-warning.ll
@@ -1,7 +1,7 @@
 ; RUN: llc < %s -verify-machineinstrs -mtriple=powerpc-unknown-unkown \
-; RUN:   -mcpu=pwr7 2>&1 | FileCheck %s
+; RUN:   -mcpu=pwr7 -O0 2>&1 | FileCheck %s
 ; RUN: llc < %s -verify-machineinstrs -mtriple=powerpc64-unknown-unkown \
-; RUN:   -mcpu=pwr7 2>&1 | FileCheck %s
+; RUN:   -mcpu=pwr7 -O0 2>&1 | FileCheck %s
 
 define void @test_r1_clobber() {
 entry:
@@ -20,3 +20,24 @@ entry:
 
 ; CHECK: warning: inline asm clobber list contains reserved registers: X1
 ; CHECK-NEXT: note: Reserved registers on the clobber list may not be preserved across the asm statement, and clobbering them may lead to undefined behaviour.
+
+; CHECK: warning: inline asm clobber list contains reserved registers: R31
+; CHECK-NEXT: note: Reserved registers on the clobber list may not be preserved across the asm statement, and clobbering them may lead to undefined behaviour.
+
+@a = dso_local global i32 100, align 4
+define dso_local signext i32 @test_r31_r30_clobber() {
+entry:
+  %retval = alloca i32, align 4
+  %old = alloca i64, align 8
+  store i32 0, ptr %retval, align 4
+  call void asm sideeffect "li 31, 1", "~{r31}"()
+  call void asm sideeffect "li 30, 1", "~{r30}"()
+  %0 = call i64 asm sideeffect "mr $0, 31", "=r"()
+  store i64 %0, ptr %old, align 8
+  %1 = load i32, ptr @a, align 4
+  %conv = sext i32 %1 to i64
+  %2 = alloca i8, i64 %conv, align 16
+  %3 = load i64, ptr %old, align 8
+  %conv1 = trunc i64 %3 to i32
+  ret i32 %conv1
+}
diff --git a/llvm/test/CodeGen/PowerPC/ldst-16-byte.mir b/llvm/test/CodeGen/PowerPC/ldst-16-byte.mir
index b9c541feae5acf..7888d297072346 100644
--- a/llvm/test/CodeGen/PowerPC/ldst-16-byte.mir
+++ b/llvm/test/CodeGen/PowerPC/ldst-16-byte.mir
@@ -8,18 +8,18 @@ alignment: 8
 tracksRegLiveness: true
 body: |
   bb.0.entry:
-  liveins: $x3, $x4
+  liveins: $x5, $x4
     ; CHECK-LABEL: name: foo
-    ; CHECK: liveins: $x3, $x4
+    ; CHECK: liveins: $x4, $x5
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: early-clobber renamable $g8p3 = LQ 128, $x4
-    ; CHECK-NEXT: $x3 = OR8 $x7, $x7
-    ; CHECK-NEXT: STQ killed renamable $g8p3, 160, $x3
-    ; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3
+    ; CHECK-NEXT: $x5 = OR8 $x7, $x7
+    ; CHECK-NEXT: STQ killed renamable $g8p3, 160, $x5
+    ; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x5
   %0:g8prc = LQ 128, $x4
-  $x3 = COPY %0.sub_gp8_x1:g8prc
-  STQ %0, 160, $x3
-  BLR8 implicit $lr8, implicit undef $rm, implicit $x3
+  $x5 = COPY %0.sub_gp8_x1:g8prc
+  STQ %0, 160, $x5
+  BLR8 implicit $lr8, implicit undef $rm, implicit $x5
 ...
 
 ---
@@ -73,8 +73,9 @@ body: |
   bb.0.entry:
   liveins: $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12
     ; CHECK-LABEL: name: spill_g8prc
-    ; CHECK: liveins: $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x14, $x15, $x16, $x17, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $x29, $x30, $x31
+    ; CHECK: liveins: $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x2, $x14, $x15, $x16, $x17, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $x29, $x30, $x31
     ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: STD killed $x2, -152, $x1 :: (store (s64) into %stack.4)
     ; CHECK-NEXT: STD killed $x14, -144, $x1 :: (store (s64) into %fixed-stack.17, align 16)
     ; CHECK-NEXT: STD killed $x15, -136, $x1 :: (store (s64) into %fixed-stack.16)
     ; CHECK-NEXT: STD killed $x16, -128, $x1 :: (store (s64) into %fixed-stack.15, align 16)
@@ -95,42 +96,40 @@ body: |
     ; CHECK-NEXT: STD killed $x31, -8, $x1 :: (store (s64) into %fixed-stack.0)
     ; CHECK-NEXT: $x7 = OR8 $x3, $x3
     ; CHECK-NEXT: renamable $g8p4 = LQARX $x5, $x6
-    ; CHECK-NEXT: STD killed $x8, -160, $x1
-    ; CHECK-NEXT: STD killed $x9, -152, $x1
-    ; CHECK-NEXT: renamable $g8p13 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: renamable $g8p4 = LQARX $x3, renamable $x4
     ; CHECK-NEXT: STD killed $x8, -176, $x1
     ; CHECK-NEXT: STD killed $x9, -168, $x1
-    ; CHECK-NEXT: renamable $g8p4 = LQARX $x3, renamable $x4
+    ; CHECK-NEXT: renamable $g8p1 = LQARX $x3, renamable $x4
+    ; CHECK-NEXT: renamable $g8p4 = LQARX renamable $x7, renamable $x4
     ; CHECK-NEXT: STD killed $x8, -192, $x1
     ; CHECK-NEXT: STD killed $x9, -184, $x1
-    ; CHECK-NEXT: renamable $g8p4 = LQARX $x3, renamable $x4
+    ; CHECK-NEXT: renamable $g8p4 = LQARX renamable $x7, renamable $x4
     ; CHECK-NEXT: STD killed $x8, -208, $x1
     ; CHECK-NEXT: STD killed $x9, -200, $x1
-    ; CHECK-NEXT: renamable $g8p4 = LQARX $x3, renamable $x4
+    ; CHECK-NEXT: renamable $g8p4 = LQARX renamable $x7, renamable $x4
     ; CHECK-NEXT: STD killed $x8, -224, $x1
     ; CHECK-NEXT: STD killed $x9, -216, $x1
-    ; CHECK-NEXT: renamable $g8p10 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: renamable $g8p9 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: renamable $g8p8 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: renamable $g8p7 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: renamable $g8p15 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: renamable $g8p11 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: renamable $g8p12 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: renamable $g8p14 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: renamable $g8p5 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: renamable $g8p4 = LQARX $x3, renamable $x4
-    ; CHECK-NEXT: $x3 = OR8 $x27, $x27
+    ; CHECK-NEXT: renamable $g8p12 = LQARX renamable $x7, renamable $x4
+    ; CHECK-NEXT: renamable $g8p11 = LQARX renamable $x7, renamable $x4
+    ; CHECK-NEXT: renamable $g8p10 = LQARX renamable $x7, renamable $x4
+    ; CHECK-NEXT: renamable $g8p9 = LQARX renamable $x7, renamable $x4
+    ; CHECK-NEXT: renamable $g8p8 = LQARX renamable $x7, renamable $x4
+    ; CHECK-NEXT: renamable $g8p7 = LQARX renamable $x7, renamable $x4
+    ; CHECK-NEXT: renamable $g8p15 = LQARX renamable $x7, renamable $x4
+    ; CHECK-NEXT: renamable $g8p13 = LQARX renamable $x7, renamable $x4
+    ; CHECK-NEXT: renamable $g8p14 = LQARX renamable $x7, renamable $x4
+    ; CHECK-NEXT: renamable $g8p5 = LQARX renamable $x7, renamable $x4
+    ; CHECK-NEXT: renamable $g8p4 = LQARX renamable $x7, renamable $x4
     ; CHECK-NEXT: STQCX killed renamable $g8p4, renamable $x7, renamable $x4, implicit-def dead $cr0
     ; CHECK-NEXT: STQCX killed renamable $g8p5, renamable $x7, renamable $x4, implicit-def dead $cr0
     ; CHECK-NEXT: STQCX killed renamable $g8p14, renamable $x7, renamable $x4, implicit-def dead $cr0
-    ; CHECK-NEXT: STQCX killed renamable $g8p12, renamable $x7, renamable $x4, implicit-def dead $cr0
-    ; CHECK-NEXT: STQCX killed renamable $g8p11, renamable $x7, renamable $x4, implicit-def dead $cr0
+    ; CHECK-NEXT: STQCX killed renamable $g8p13, renamable $x7, renamable $x4, implicit-def dead $cr0
     ; CHECK-NEXT: STQCX killed renamable $g8p15, renamable $x7, renamable $x4, implicit-def dead $cr0
     ; CHECK-NEXT: STQCX killed renamable $g8p7, renamable $x7, renamable $x4, implicit-def dead $cr0
     ; CHECK-NEXT: STQCX killed renamable $g8p8, renamable $x7, renamable $x4, implicit-def dead $cr0
     ; CHECK-NEXT: STQCX killed renamable $g8p9, renamable $x7, renamable $x4, implicit-def dead $cr0
     ; CHECK-NEXT: STQCX killed renamable $g8p10, renamable $x7, renamable $x4, implicit-def dead $cr0
+    ; CHECK-NEXT: STQCX killed renamable $g8p11, renamable $x7, renamable $x4, implicit-def dead $cr0
+    ; CHECK-NEXT: STQCX killed renamable $g8p12, renamable $x7, renamable $x4, implicit-def dead $cr0
     ; CHECK-NEXT: $x8 = LD -224, $x1
     ; CHECK-NEXT: $x9 = LD -216, $x1
     ; CHECK-NEXT: STQCX killed renamable $g8p4, renamable $x7, renamable $x4, implicit-def dead $cr0
@@ -140,12 +139,9 @@ body: |
     ; CHECK-NEXT: $x8 = LD -192, $x1
     ; CHECK-NEXT: $x9 = LD -184, $x1
     ; CHECK-NEXT: STQCX killed renamable $g8p4, renamable $x7, renamable $x4, implicit-def dead $cr0
+    ; CHECK-NEXT: STQCX renamable $g8p1, killed renamable $x7, killed renamable $x4, implicit-def dead $cr0
     ; CHECK-NEXT: $x8 = LD -176, $x1
     ; CHECK-NEXT: $x9 = LD -168, $x1
-    ; CHECK-NEXT: STQCX killed renamable $g8p4, renamable $x7, renamable $x4, implicit-def dead $cr0
-    ; CHECK-NEXT: STQCX killed renamable $g8p13, killed renamable $x7, killed renamable $x4, implicit-def dead $cr0
-    ; CHECK-NEXT: $x8 = LD -160, $x1
-    ; CHECK-NEXT: $x9 = LD -152, $x1
     ; CHECK-NEXT: STQCX killed renamable $g8p4, $x5, $x6, implicit-def dead $cr0
     ; CHECK-NEXT: $x31 = LD -8, $x1 :: (load (s64) from %fixed-stack.0)
     ; CHECK-NEXT: $x30 = LD -16, $x1 :: (load (s64) from %fixed-stack.1, align 16)
@@ -165,6 +161,7 @@ body: |
     ; CHECK-NEXT: $x16 = LD -128, $x1 :: (load (s64) from %fixed-stack.15, align 16)
     ; CHECK-NEXT: $x15 = LD -136, $x1 :: (load (s64) from %fixed-stack.16)
     ; CHECK-NEXT: $x14 = LD -144, $x1 :: (load (s64) from %fixed-stack.17, align 16)
+    ; CHECK-NEXT: $x2 = LD -152, $x1 :: (load (s64) from %stack.4)
     ; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3
   %addr0:g8rc_nox0 = COPY $x3
   %addr1:g8rc = COPY $x4
@@ -216,10 +213,9 @@ body: |
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: $x4 = OR8 $x16, $x16
     ; CHECK-NEXT: $x5 = OR8 $x17, $x17
-    ; CHECK-NEXT: $x3 = OR8 $x5, $x5
-    ; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit killed $x3, implicit $x4
+    ; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x5, implicit $x4
   %0:g8prc = COPY $g8p8
-  $x3 =...
[truncated]

@diggerlin diggerlin requested review from amy-kwan, mandlebug and stefanp-synopsys and removed request for EsmeYi and chenzheng1030 September 9, 2024 14:00
@lei137
Contributor

lei137 commented Oct 10, 2024

So #99766 should be closed now referencing this new PR?

; CHECK-NEXT: STD killed $x0, 16, $x1
; CHECK-NEXT: $x1 = STDU $x1, -32752, $x1
; CHECK-NEXT: BL8 @test_callee, csr_ppc64, implicit-def dead $lr8, implicit $rm, implicit $x2, implicit-def $r1, implicit-def $x3
; CHECK-NEXT: BL8 @test_callee, csr_ppc64_r2, implicit-def dead $lr8, implicit $rm, implicit-def $r1, implicit-def $x3
Contributor

What changes caused this change from csr_ppc64 to csr_ppc64_r2?

Contributor Author

Since we have the following code in llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:

if (Subtarget.isAIXABI())
    // We only reserve r2 if we need to use the TOC pointer on AIX.
    if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
      markSuperRegs(Reserved, PPC::R2); // System-reserved register.

we need to change the test case to

BL8 @test_callee, csr_ppc64_r2, implicit-def dead $lr8, implicit $rm, implicit-def $r1, implicit-def $x3

otherwise it is an illegal instruction at implicit $x2.

Contributor Author

Since there is

if (Subtarget.isAIXABI())
  // We only reserve r2 if we need to use the TOC pointer on AIX.
  if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
    markSuperRegs(Reserved, PPC::R2); // System-reserved register.

in llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp,

r2 is not marked as reserved in the test case, so the instruction cannot have implicit $x2; otherwise we get "Bad machine code: Using an undefined physical register" when the MIR of the function is parsed.

@diggerlin diggerlin requested a review from lei137 October 16, 2024 20:38
markSuperRegs(Reserved, PPC::R2); // System-reserved register
// We only reserve r2 if we need to use the TOC pointer on AIX.
if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
markSuperRegs(Reserved, PPC::R2); // System-reserved register.
Contributor

This looks to be the same as lines 392-393. Can we just combine the conditions and do it in one place?

if (Subtarget.isSVR4ABI() || Subtarget.isAIXABI())

Contributor

I think this makes sense. And then the R13 for isSVR4ABI() can be handled separately.
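Combined, the reservation logic could look like the following sketch (an illustration of the reviewer's suggestion only, not code from the patch):

```cpp
// Sketch only: merge the SVR4 and AIX r2 reservation into one place,
// and handle the SVR4-specific R13 reservation separately, as suggested.
if (Subtarget.isSVR4ABI() || Subtarget.isAIXABI()) {
  // Reserve r2 only when the TOC pointer may be needed.
  if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
    markSuperRegs(Reserved, PPC::R2); // System-reserved register.
}
if (Subtarget.isSVR4ABI())
  markSuperRegs(Reserved, PPC::R13); // Small Data Area pointer register.
```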

Contributor

@lei137 lei137 left a comment

In general this LGTM. Just one more nit.
Maybe give it a few days to see if anyone else has additional comments before committing.

Thx

Comment on lines 3437 to 3438
// Whenever accessing the TLS variable, it is done through the TC entries.
// Therefore, we set the DAG to use the TOC base.
Contributor

Suggested change
// Whenever accessing the TLS variable, it is done through the TC entries.
// Therefore, we set the DAG to use the TOC base.
// TLS variables are accessed through TOC entries.
// To support this, set the DAG to use the TOC base pointer.

@@ -0,0 +1,13 @@
; RUN: llc < %s -mtriple=powerpc-unknown-aix-xcoff -verify-machineinstrs \
Contributor

Should we add a 64-bit run line, too?
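Such a run line might look like the following (a sketch; the exact triple and options are assumptions, not taken from the patch):

```llvm
; Hypothetical 64-bit RUN line for the new test:
; RUN: llc < %s -mtriple=powerpc64-unknown-aix-xcoff -verify-machineinstrs \
; RUN:   2>&1 | FileCheck %s
```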

@diggerlin diggerlin requested a review from amy-kwan October 31, 2024 17:28
@diggerlin diggerlin merged commit 2cd3213 into llvm:main Nov 4, 2024
4 of 5 checks passed
PhilippRados pushed a commit to PhilippRados/llvm-project that referenced this pull request Nov 6, 2024
…lvm#107863)

This patch utilizes getReservedRegs() to find asm clobberable registers.
And to make the result of getReservedRegs() accurate, this patch
implements the todo, which is to make r2 allocatable on AIX for some
leaf functions.
@diggerlin diggerlin deleted the digger/r2_aix branch April 23, 2025 13:12