[PowerPC] Utilize getReservedRegs to find asm clobberable registers. #107863
Conversation
@llvm/pr-subscribers-backend-powerpc

Author: zhijian lin (diggerlin)

Changes: The patch is based on Esme's patch #99766. This patch utilizes getReservedRegs() to find asm clobberable registers. Thanks for Esme's work; I am taking over the patch.

Patch is 24.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107863.diff

12 Files Affected:
diff --git a/llvm/lib/Target/PowerPC/PPCCallingConv.td b/llvm/lib/Target/PowerPC/PPCCallingConv.td
index 825c1a29ed62cb..d966d2a09aa78c 100644
--- a/llvm/lib/Target/PowerPC/PPCCallingConv.td
+++ b/llvm/lib/Target/PowerPC/PPCCallingConv.td
@@ -423,8 +423,10 @@ def CSR_SVR64_ColdCC_R2_VSRP : CalleeSavedRegs<(add CSR_SVR64_ColdCC_VSRP, X2)>;
def CSR_64_AllRegs_VSRP :
CalleeSavedRegs<(add CSR_64_AllRegs_VSX, CSR_ALL_VSRP)>;
+def CSR_AIX64_R2 : CalleeSavedRegs<(add X2, CSR_PPC64)>;
+
def CSR_AIX64_VSRP : CalleeSavedRegs<(add CSR_PPC64_Altivec, CSR_VSRP)>;
-def CSR_AIX64_R2_VSRP : CalleeSavedRegs<(add CSR_AIX64_VSRP, X2)>;
+def CSR_AIX64_R2_VSRP : CalleeSavedRegs<(add X2, CSR_AIX64_VSRP)>;
def CSR_AIX32_VSRP : CalleeSavedRegs<(add CSR_AIX32_Altivec, CSR_VSRP)>;
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 459a96eca1ff20..4ee9f3301e3bc1 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -3434,6 +3434,8 @@ SDValue PPCTargetLowering::LowerGlobalTLSAddressAIX(SDValue Op,
if (Subtarget.hasAIXShLibTLSModelOpt())
updateForAIXShLibTLSModelOpt(Model, DAG, getTargetMachine());
+ setUsesTOCBasePtr(DAG);
+
bool IsTLSLocalExecModel = Model == TLSModel::LocalExec;
if (IsTLSLocalExecModel || Model == TLSModel::InitialExec) {
diff --git a/llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp b/llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
index 9e8da59615dfb3..d43bf473d80cfc 100644
--- a/llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
+++ b/llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
@@ -240,7 +240,7 @@ PPCRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
if (Subtarget.pairedVectorMemops()) {
if (Subtarget.isAIXABI()) {
if (!TM.getAIXExtendedAltivecABI())
- return SaveR2 ? CSR_PPC64_R2_SaveList : CSR_PPC64_SaveList;
+ return SaveR2 ? CSR_AIX64_R2_SaveList : CSR_PPC64_SaveList;
return SaveR2 ? CSR_AIX64_R2_VSRP_SaveList : CSR_AIX64_VSRP_SaveList;
}
return SaveR2 ? CSR_SVR464_R2_VSRP_SaveList : CSR_SVR464_VSRP_SaveList;
@@ -250,7 +250,9 @@ PPCRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
return SaveR2 ? CSR_PPC64_R2_Altivec_SaveList
: CSR_PPC64_Altivec_SaveList;
}
- return SaveR2 ? CSR_PPC64_R2_SaveList : CSR_PPC64_SaveList;
+ return SaveR2 ? (Subtarget.isAIXABI() ? CSR_AIX64_R2_SaveList
+ : CSR_PPC64_R2_SaveList)
+ : CSR_PPC64_SaveList;
}
// 32-bit targets.
if (Subtarget.isAIXABI()) {
@@ -380,6 +382,8 @@ BitVector PPCRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
markSuperRegs(Reserved, PPC::VRSAVE);
+ const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
+ bool UsesTOCBasePtr = FuncInfo->usesTOCBasePtr();
// The SVR4 ABI reserves r2 and r13
if (Subtarget.isSVR4ABI()) {
// We only reserve r2 if we need to use the TOC pointer. If we have no
@@ -387,16 +391,15 @@ BitVector PPCRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
// no constant-pool loads, etc.) and we have no potential uses inside an
// inline asm block, then we can treat r2 has an ordinary callee-saved
// register.
- const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
- if (!TM.isPPC64() || FuncInfo->usesTOCBasePtr() || MF.hasInlineAsm())
- markSuperRegs(Reserved, PPC::R2); // System-reserved register
- markSuperRegs(Reserved, PPC::R13); // Small Data Area pointer register
+ if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
+ markSuperRegs(Reserved, PPC::R2); // System-reserved register.
+ markSuperRegs(Reserved, PPC::R13); // Small Data Area pointer register.
}
- // Always reserve r2 on AIX for now.
- // TODO: Make r2 allocatable on AIX/XCOFF for some leaf functions.
if (Subtarget.isAIXABI())
- markSuperRegs(Reserved, PPC::R2); // System-reserved register
+ // We only reserve r2 if we need to use the TOC pointer on AIX.
+ if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
+ markSuperRegs(Reserved, PPC::R2); // System-reserved register.
// On PPC64, r13 is the thread pointer. Never allocate this register.
if (TM.isPPC64())
@@ -441,14 +444,12 @@ BitVector PPCRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
bool PPCRegisterInfo::isAsmClobberable(const MachineFunction &MF,
MCRegister PhysReg) const {
- // We cannot use getReservedRegs() to find the registers that are not asm
- // clobberable because there are some reserved registers which can be
- // clobbered by inline asm. For example, when LR is clobbered, the register is
- // saved and restored. We will hardcode the registers that are not asm
- // cloberable in this function.
-
- // The stack pointer (R1/X1) is not clobberable by inline asm
- return PhysReg != PPC::R1 && PhysReg != PPC::X1;
+ // CTR and LR registers are always reserved, but they are asm clobberable.
+ if (PhysReg == PPC::CTR || PhysReg == PPC::CTR8 || PhysReg == PPC::LR ||
+ PhysReg == PPC::LR8)
+ return true;
+
+ return !getReservedRegs(MF).test(PhysReg);
}
bool PPCRegisterInfo::requiresFrameIndexScavenging(const MachineFunction &MF) const {
diff --git a/llvm/lib/Target/PowerPC/PPCRegisterInfo.td b/llvm/lib/Target/PowerPC/PPCRegisterInfo.td
index 3cb7cd9d8f2299..56e170b1230f6f 100644
--- a/llvm/lib/Target/PowerPC/PPCRegisterInfo.td
+++ b/llvm/lib/Target/PowerPC/PPCRegisterInfo.td
@@ -341,7 +341,9 @@ def GPRC : RegisterClass<"PPC", [i32,f32], 32, (add (sequence "R%u", 2, 12),
// This also helps setting the correct `NumOfGPRsSaved' in traceback table.
let AltOrders = [(add (sub GPRC, R2), R2),
(add (sequence "R%u", 2, 12),
- (sequence "R%u", 31, 13), R0, R1, FP, BP)];
+ (sequence "R%u", 31, 13), R0, R1, FP, BP),
+ (add (sequence "R%u", 3, 12),
+ (sequence "R%u", 31, 13), R2, R0, R1, FP, BP)];
let AltOrderSelect = [{
return MF.getSubtarget<PPCSubtarget>().getGPRAllocationOrderIdx();
}];
@@ -354,7 +356,9 @@ def G8RC : RegisterClass<"PPC", [i64], 64, (add (sequence "X%u", 2, 12),
// put it at the end of the list.
let AltOrders = [(add (sub G8RC, X2), X2),
(add (sequence "X%u", 2, 12),
- (sequence "X%u", 31, 13), X0, X1, FP8, BP8)];
+ (sequence "X%u", 31, 13), X0, X1, FP8, BP8),
+ (add (sequence "X%u", 3, 12),
+ (sequence "X%u", 31, 13), X2, X0, X1, FP8, BP8)];
let AltOrderSelect = [{
return MF.getSubtarget<PPCSubtarget>().getGPRAllocationOrderIdx();
}];
@@ -368,7 +372,9 @@ def GPRC_NOR0 : RegisterClass<"PPC", [i32,f32], 32, (add (sub GPRC, R0), ZERO)>
// put it at the end of the list.
let AltOrders = [(add (sub GPRC_NOR0, R2), R2),
(add (sequence "R%u", 2, 12),
- (sequence "R%u", 31, 13), R1, FP, BP, ZERO)];
+ (sequence "R%u", 31, 13), R1, FP, BP, ZERO),
+ (add (sequence "R%u", 3, 12),
+ (sequence "R%u", 31, 13), R2, R1, FP, BP, ZERO)];
let AltOrderSelect = [{
return MF.getSubtarget<PPCSubtarget>().getGPRAllocationOrderIdx();
}];
@@ -379,7 +385,9 @@ def G8RC_NOX0 : RegisterClass<"PPC", [i64], 64, (add (sub G8RC, X0), ZERO8)> {
// put it at the end of the list.
let AltOrders = [(add (sub G8RC_NOX0, X2), X2),
(add (sequence "X%u", 2, 12),
- (sequence "X%u", 31, 13), X1, FP8, BP8, ZERO8)];
+ (sequence "X%u", 31, 13), X1, FP8, BP8, ZERO8),
+ (add (sequence "X%u", 3, 12),
+ (sequence "X%u", 31, 13), X2, X1, FP8, BP8, ZERO8)];
let AltOrderSelect = [{
return MF.getSubtarget<PPCSubtarget>().getGPRAllocationOrderIdx();
}];
diff --git a/llvm/lib/Target/PowerPC/PPCSubtarget.h b/llvm/lib/Target/PowerPC/PPCSubtarget.h
index 2079dc0acc3cf7..9453f692add597 100644
--- a/llvm/lib/Target/PowerPC/PPCSubtarget.h
+++ b/llvm/lib/Target/PowerPC/PPCSubtarget.h
@@ -303,7 +303,7 @@ class PPCSubtarget : public PPCGenSubtargetInfo {
if (is64BitELFABI())
return 1;
if (isAIXABI())
- return 2;
+ return IsPPC64 ? 3 : 2;
return 0;
}
diff --git a/llvm/test/CodeGen/PowerPC/aix-inline-asm-clobber-warning.ll b/llvm/test/CodeGen/PowerPC/aix-inline-asm-clobber-warning.ll
new file mode 100644
index 00000000000000..933bd9837f9a66
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/aix-inline-asm-clobber-warning.ll
@@ -0,0 +1,12 @@
+; RUN: llc < %s -mtriple=powerpc-unknown-aix-xcoff -verify-machineinstrs 2>&1 | FileCheck %s
+
+; CHECK: warning: inline asm clobber list contains reserved registers: R2
+; CHECK-NEXT: note: Reserved registers on the clobber list may not be preserved across the asm statement, and clobbering them may lead to undefined behaviour.
+
+@a = external global i32, align 4
+
+define void @bar() {
+ store i32 0, ptr @a, align 4
+ call void asm sideeffect "li 2, 1", "~{r2}"()
+ ret void
+}
diff --git a/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir b/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
index 7d96f7feabe2be..9a1483d5dac48c 100644
--- a/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
+++ b/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
@@ -17,6 +17,5 @@ body: |
BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $f1
...
# CHECK-DAG: AllocationOrder(VFRC) = [ $vf2 $vf3 $vf4 $vf5 $vf0 $vf1 $vf6 $vf7 $vf8 $vf9 $vf10 $vf11 $vf12 $vf13 $vf14 $vf15 $vf16 $vf17 $vf18 $vf19 $vf31 $vf30 $vf29 $vf28 $vf27 $vf26 $vf25 $vf24 $vf23 $vf22 $vf21 $vf20 ]
-# CHECK-DAG: AllocationOrder(G8RC_and_G8RC_NOX0) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x1
-# CHECK-DAG: 4 ]
-# CHECK-DAG: AllocationOrder(F8RC) = [ $f0 $f1 $f2 $f3 $f4 $f5 $f6 $f7 $f8 $f9 $f10 $f11 $f12 $f13 $f31 $f30 $f29 $f28 $f27 $f26 $f25 $f24 $f23 $f22 $f21 $f20 $f19 $f18 $f17 $f16 $f15 $f14 ]
\ No newline at end of file
+# CHECK-DAG: AllocationOrder(G8RC_and_G8RC_NOX0) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x14 $x2 ]
+# CHECK-DAG: AllocationOrder(F8RC) = [ $f0 $f1 $f2 $f3 $f4 $f5 $f6 $f7 $f8 $f9 $f10 $f11 $f12 $f13 $f31 $f30 $f29 $f28 $f27 $f26 $f25 $f24 $f23 $f22 $f21 $f20 $f19 $f18 $f17 $f16 $f15 $f14 ]
diff --git a/llvm/test/CodeGen/PowerPC/inline-asm-clobber-warning.ll b/llvm/test/CodeGen/PowerPC/inline-asm-clobber-warning.ll
index 7f13f5072d97f1..4c460cf6d8059e 100644
--- a/llvm/test/CodeGen/PowerPC/inline-asm-clobber-warning.ll
+++ b/llvm/test/CodeGen/PowerPC/inline-asm-clobber-warning.ll
@@ -1,7 +1,7 @@
; RUN: llc < %s -verify-machineinstrs -mtriple=powerpc-unknown-unkown \
-; RUN: -mcpu=pwr7 2>&1 | FileCheck %s
+; RUN: -mcpu=pwr7 -O0 2>&1 | FileCheck %s
; RUN: llc < %s -verify-machineinstrs -mtriple=powerpc64-unknown-unkown \
-; RUN: -mcpu=pwr7 2>&1 | FileCheck %s
+; RUN: -mcpu=pwr7 -O0 2>&1 | FileCheck %s
define void @test_r1_clobber() {
entry:
@@ -20,3 +20,24 @@ entry:
; CHECK: warning: inline asm clobber list contains reserved registers: X1
; CHECK-NEXT: note: Reserved registers on the clobber list may not be preserved across the asm statement, and clobbering them may lead to undefined behaviour.
+
+; CHECK: warning: inline asm clobber list contains reserved registers: R31
+; CHECK-NEXT: note: Reserved registers on the clobber list may not be preserved across the asm statement, and clobbering them may lead to undefined behaviour.
+
+@a = dso_local global i32 100, align 4
+define dso_local signext i32 @test_r31_r30_clobber() {
+entry:
+ %retval = alloca i32, align 4
+ %old = alloca i64, align 8
+ store i32 0, ptr %retval, align 4
+ call void asm sideeffect "li 31, 1", "~{r31}"()
+ call void asm sideeffect "li 30, 1", "~{r30}"()
+ %0 = call i64 asm sideeffect "mr $0, 31", "=r"()
+ store i64 %0, ptr %old, align 8
+ %1 = load i32, ptr @a, align 4
+ %conv = sext i32 %1 to i64
+ %2 = alloca i8, i64 %conv, align 16
+ %3 = load i64, ptr %old, align 8
+ %conv1 = trunc i64 %3 to i32
+ ret i32 %conv1
+}
diff --git a/llvm/test/CodeGen/PowerPC/ldst-16-byte.mir b/llvm/test/CodeGen/PowerPC/ldst-16-byte.mir
index b9c541feae5acf..7888d297072346 100644
--- a/llvm/test/CodeGen/PowerPC/ldst-16-byte.mir
+++ b/llvm/test/CodeGen/PowerPC/ldst-16-byte.mir
@@ -8,18 +8,18 @@ alignment: 8
tracksRegLiveness: true
body: |
bb.0.entry:
- liveins: $x3, $x4
+ liveins: $x5, $x4
; CHECK-LABEL: name: foo
- ; CHECK: liveins: $x3, $x4
+ ; CHECK: liveins: $x4, $x5
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: early-clobber renamable $g8p3 = LQ 128, $x4
- ; CHECK-NEXT: $x3 = OR8 $x7, $x7
- ; CHECK-NEXT: STQ killed renamable $g8p3, 160, $x3
- ; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3
+ ; CHECK-NEXT: $x5 = OR8 $x7, $x7
+ ; CHECK-NEXT: STQ killed renamable $g8p3, 160, $x5
+ ; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x5
%0:g8prc = LQ 128, $x4
- $x3 = COPY %0.sub_gp8_x1:g8prc
- STQ %0, 160, $x3
- BLR8 implicit $lr8, implicit undef $rm, implicit $x3
+ $x5 = COPY %0.sub_gp8_x1:g8prc
+ STQ %0, 160, $x5
+ BLR8 implicit $lr8, implicit undef $rm, implicit $x5
...
---
@@ -73,8 +73,9 @@ body: |
bb.0.entry:
liveins: $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12
; CHECK-LABEL: name: spill_g8prc
- ; CHECK: liveins: $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x14, $x15, $x16, $x17, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $x29, $x30, $x31
+ ; CHECK: liveins: $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x2, $x14, $x15, $x16, $x17, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $x29, $x30, $x31
; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: STD killed $x2, -152, $x1 :: (store (s64) into %stack.4)
; CHECK-NEXT: STD killed $x14, -144, $x1 :: (store (s64) into %fixed-stack.17, align 16)
; CHECK-NEXT: STD killed $x15, -136, $x1 :: (store (s64) into %fixed-stack.16)
; CHECK-NEXT: STD killed $x16, -128, $x1 :: (store (s64) into %fixed-stack.15, align 16)
@@ -95,42 +96,40 @@ body: |
; CHECK-NEXT: STD killed $x31, -8, $x1 :: (store (s64) into %fixed-stack.0)
; CHECK-NEXT: $x7 = OR8 $x3, $x3
; CHECK-NEXT: renamable $g8p4 = LQARX $x5, $x6
- ; CHECK-NEXT: STD killed $x8, -160, $x1
- ; CHECK-NEXT: STD killed $x9, -152, $x1
- ; CHECK-NEXT: renamable $g8p13 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: renamable $g8p4 = LQARX $x3, renamable $x4
; CHECK-NEXT: STD killed $x8, -176, $x1
; CHECK-NEXT: STD killed $x9, -168, $x1
- ; CHECK-NEXT: renamable $g8p4 = LQARX $x3, renamable $x4
+ ; CHECK-NEXT: renamable $g8p1 = LQARX $x3, renamable $x4
+ ; CHECK-NEXT: renamable $g8p4 = LQARX renamable $x7, renamable $x4
; CHECK-NEXT: STD killed $x8, -192, $x1
; CHECK-NEXT: STD killed $x9, -184, $x1
- ; CHECK-NEXT: renamable $g8p4 = LQARX $x3, renamable $x4
+ ; CHECK-NEXT: renamable $g8p4 = LQARX renamable $x7, renamable $x4
; CHECK-NEXT: STD killed $x8, -208, $x1
; CHECK-NEXT: STD killed $x9, -200, $x1
- ; CHECK-NEXT: renamable $g8p4 = LQARX $x3, renamable $x4
+ ; CHECK-NEXT: renamable $g8p4 = LQARX renamable $x7, renamable $x4
; CHECK-NEXT: STD killed $x8, -224, $x1
; CHECK-NEXT: STD killed $x9, -216, $x1
- ; CHECK-NEXT: renamable $g8p10 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: renamable $g8p9 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: renamable $g8p8 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: renamable $g8p7 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: renamable $g8p15 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: renamable $g8p11 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: renamable $g8p12 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: renamable $g8p14 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: renamable $g8p5 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: renamable $g8p4 = LQARX $x3, renamable $x4
- ; CHECK-NEXT: $x3 = OR8 $x27, $x27
+ ; CHECK-NEXT: renamable $g8p12 = LQARX renamable $x7, renamable $x4
+ ; CHECK-NEXT: renamable $g8p11 = LQARX renamable $x7, renamable $x4
+ ; CHECK-NEXT: renamable $g8p10 = LQARX renamable $x7, renamable $x4
+ ; CHECK-NEXT: renamable $g8p9 = LQARX renamable $x7, renamable $x4
+ ; CHECK-NEXT: renamable $g8p8 = LQARX renamable $x7, renamable $x4
+ ; CHECK-NEXT: renamable $g8p7 = LQARX renamable $x7, renamable $x4
+ ; CHECK-NEXT: renamable $g8p15 = LQARX renamable $x7, renamable $x4
+ ; CHECK-NEXT: renamable $g8p13 = LQARX renamable $x7, renamable $x4
+ ; CHECK-NEXT: renamable $g8p14 = LQARX renamable $x7, renamable $x4
+ ; CHECK-NEXT: renamable $g8p5 = LQARX renamable $x7, renamable $x4
+ ; CHECK-NEXT: renamable $g8p4 = LQARX renamable $x7, renamable $x4
; CHECK-NEXT: STQCX killed renamable $g8p4, renamable $x7, renamable $x4, implicit-def dead $cr0
; CHECK-NEXT: STQCX killed renamable $g8p5, renamable $x7, renamable $x4, implicit-def dead $cr0
; CHECK-NEXT: STQCX killed renamable $g8p14, renamable $x7, renamable $x4, implicit-def dead $cr0
- ; CHECK-NEXT: STQCX killed renamable $g8p12, renamable $x7, renamable $x4, implicit-def dead $cr0
- ; CHECK-NEXT: STQCX killed renamable $g8p11, renamable $x7, renamable $x4, implicit-def dead $cr0
+ ; CHECK-NEXT: STQCX killed renamable $g8p13, renamable $x7, renamable $x4, implicit-def dead $cr0
; CHECK-NEXT: STQCX killed renamable $g8p15, renamable $x7, renamable $x4, implicit-def dead $cr0
; CHECK-NEXT: STQCX killed renamable $g8p7, renamable $x7, renamable $x4, implicit-def dead $cr0
; CHECK-NEXT: STQCX killed renamable $g8p8, renamable $x7, renamable $x4, implicit-def dead $cr0
; CHECK-NEXT: STQCX killed renamable $g8p9, renamable $x7, renamable $x4, implicit-def dead $cr0
; CHECK-NEXT: STQCX killed renamable $g8p10, renamable $x7, renamable $x4, implicit-def dead $cr0
+ ; CHECK-NEXT: STQCX killed renamable $g8p11, renamable $x7, renamable $x4, implicit-def dead $cr0
+ ; CHECK-NEXT: STQCX killed renamable $g8p12, renamable $x7, renamable $x4, implicit-def dead $cr0
; CHECK-NEXT: $x8 = LD -224, $x1
; CHECK-NEXT: $x9 = LD -216, $x1
; CHECK-NEXT: STQCX killed renamable $g8p4, renamable $x7, renamable $x4, implicit-def dead $cr0
@@ -140,12 +139,9 @@ body: |
; CHECK-NEXT: $x8 = LD -192, $x1
; CHECK-NEXT: $x9 = LD -184, $x1
; CHECK-NEXT: STQCX killed renamable $g8p4, renamable $x7, renamable $x4, implicit-def dead $cr0
+ ; CHECK-NEXT: STQCX renamable $g8p1, killed renamable $x7, killed renamable $x4, implicit-def dead $cr0
; CHECK-NEXT: $x8 = LD -176, $x1
; CHECK-NEXT: $x9 = LD -168, $x1
- ; CHECK-NEXT: STQCX killed renamable $g8p4, renamable $x7, renamable $x4, implicit-def dead $cr0
- ; CHECK-NEXT: STQCX killed renamable $g8p13, killed renamable $x7, killed renamable $x4, implicit-def dead $cr0
- ; CHECK-NEXT: $x8 = LD -160, $x1
- ; CHECK-NEXT: $x9 = LD -152, $x1
; CHECK-NEXT: STQCX killed renamable $g8p4, $x5, $x6, implicit-def dead $cr0
; CHECK-NEXT: $x31 = LD -8, $x1 :: (load (s64) from %fixed-stack.0)
; CHECK-NEXT: $x30 = LD -16, $x1 :: (load (s64) from %fixed-stack.1, align 16)
@@ -165,6 +161,7 @@ body: |
; CHECK-NEXT: $x16 = LD -128, $x1 :: (load (s64) from %fixed-stack.15, align 16)
; CHECK-NEXT: $x15 = LD -136, $x1 :: (load (s64) from %fixed-stack.16)
; CHECK-NEXT: $x14 = LD -144, $x1 :: (load (s64) from %fixed-stack.17, align 16)
+ ; CHECK-NEXT: $x2 = LD -152, $x1 :: (load (s64) from %stack.4)
; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x3
%addr0:g8rc_nox0 = COPY $x3
%addr1:g8rc = COPY $x4
@@ -216,10 +213,9 @@ body: |
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: $x4 = OR8 $x16, $x16
; CHECK-NEXT: $x5 = OR8 $x17, $x17
- ; CHECK-NEXT: $x3 = OR8 $x5, $x5
- ; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit killed $x3, implicit $x4
+ ; CHECK-NEXT: BLR8 implicit $lr8, implicit undef $rm, implicit $x5, implicit $x4
%0:g8prc = COPY $g8p8
- $x3 =...
[truncated]
So #99766 should be closed now, referencing this new PR?
; CHECK-NEXT: STD killed $x0, 16, $x1
; CHECK-NEXT: $x1 = STDU $x1, -32752, $x1
; CHECK-NEXT: BL8 @test_callee, csr_ppc64, implicit-def dead $lr8, implicit $rm, implicit $x2, implicit-def $r1, implicit-def $x3
; CHECK-NEXT: BL8 @test_callee, csr_ppc64_r2, implicit-def dead $lr8, implicit $rm, implicit-def $r1, implicit-def $x3
What change caused the switch from csr_ppc64 to csr_ppc64_r2?
Since we have the following code in llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:

if (Subtarget.isAIXABI())
  // We only reserve r2 if we need to use the TOC pointer on AIX.
  if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
    markSuperRegs(Reserved, PPC::R2); // System-reserved register.

we need to change the test case to

BL8 @test_callee, csr_ppc64_r2, implicit-def dead $lr8, implicit $rm, implicit-def $r1, implicit-def $x3

otherwise it is an illegal instruction at implicit $x2.
Since there is

if (Subtarget.isAIXABI())
  // We only reserve r2 if we need to use the TOC pointer on AIX.
  if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
    markSuperRegs(Reserved, PPC::R2); // System-reserved register.

in llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp, r2 is not marked as reserved in this test case, so the call cannot carry implicit $x2; otherwise the verifier reports "Bad machine code: Using an undefined physical register" when parsing the MIR of the function.
markSuperRegs(Reserved, PPC::R2); // System-reserved register
// We only reserve r2 if we need to use the TOC pointer on AIX.
if (!TM.isPPC64() || UsesTOCBasePtr || MF.hasInlineAsm())
  markSuperRegs(Reserved, PPC::R2); // System-reserved register.
This looks to be the same as lines 392-393. Can we just combine the conditions and do it in one place?

if (Subtarget.isSVR4ABI() || Subtarget.isAIXABI())
I think this makes sense. And then the R13 for isSVR4ABI() can be handled separately.
lei137 left a comment:

In general this LGTM. Just one more nit. Maybe give it a few days to see if anyone else has additional comments before committing.

Thx
Suggested change:

- // Whenever accessing the TLS variable, it is done through the TC entries.
- // Therefore, we set the DAG to use the TOC base.
+ // TLS variables are accessed through TOC entries.
+ // To support this, set the DAG to use the TOC base pointer.
@@ -0,0 +1,13 @@
; RUN: llc < %s -mtriple=powerpc-unknown-aix-xcoff -verify-machineinstrs \

Should we add a 64-bit run line, too?
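A 64-bit companion RUN line could look like the sketch below. This is only an illustration: the 2>&1 redirection and --check-prefix usage follow the existing run lines in this PR, but the exact diagnostic on 64-bit (presumably X2 rather than R2) would need to be confirmed against llc's actual output.

```llvm
; RUN: llc < %s -mtriple=powerpc64-unknown-aix-xcoff -verify-machineinstrs 2>&1 \
; RUN:   | FileCheck %s --check-prefix=CHECK64
; CHECK64: warning: inline asm clobber list contains reserved registers: X2
```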
…lvm#107863) This patch utilizes getReservedRegs() to find asm clobberable registers. And to make the result of getReservedRegs() accurate, this patch implements the TODO of making r2 allocatable on AIX for some leaf functions.