Skip to content

Conversation

@pestctrl
Copy link
Contributor

@pestctrl pestctrl commented Nov 19, 2024

Currently, the relative position of GPRCS2 (with respect to other instructions in the prologue of a function) can be different depending on the type of ARMSubtarget::PushPopSplitVariant.

When the PushPopSpiltVariant is SplitR11WindowsSEH, GPRCS2 comes after both GPRCS1 and DPRCS2:

GPRCS1
DPRCS1
GPRCS2

However, in all other cases, GPRCS2 comes before DPRCS1, like so:

GPRCS1
GPRCS2
DPRCS1

This makes the MI walking code in ARMFrameLowering::emitPrologue a bit confusing. If GPRCS2Size is non-zero, we also have to check the PushPopSplitVariant to know if we will encounter the DPRCS1 push instruction first or the GPRCS2 push instruction first.

This commit changes to SplitR11WindowsSEH such that the spill area is as follows:

GPRCS1
DPRCS1
GPRCS3

This disambiguates a lot of the ARMFrameLowering.cpp MI traversal code.

@llvmbot
Copy link
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-backend-arm

Author: Benson Chu (pestctrl)

Changes

Currently, the relative position of GPRCS2 (with respect to other instructions in the prologue of a function) can be different depending on the type of ARMSubtarget::PushPopSplitVariant.

When the PushPopSpiltVariant is SplitR11WindowsSEH, the ordering of the spill instructions is as follows:

GPRCS1
DPRCS1
GPRCS2

However, in all other cases, GPRCS2 comes before DPRCS1, like so:

GPRCS1
GPRCS2
DPRCS1

This makes the MI walking code in ARMFrameLowering::emitPrologue a bit confusing. If GPRCS2Size is non-zero, we also have to check the PushPopSplitVariant to know if we will encounter the DPRCS1 push instruction first or the GPRCS2 push instruction first.

This commit changes to SplitR11WindowsSEH such that the spill area is as follows:

GPRCS1
DPRCS1
GPRCS3

This disambiguates a lot of the ARMFrameLowering.cpp MI traversal code.


Full diff: https://github.com/llvm/llvm-project/pull/116716.diff

3 Files Affected:

  • (modified) llvm/lib/Target/ARM/ARMFrameLowering.cpp (+77-51)
  • (modified) llvm/lib/Target/ARM/ARMMachineFunctionInfo.h (+9-6)
  • (modified) llvm/lib/Target/ARM/Thumb1FrameLowering.cpp (+3-3)
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.cpp b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
index 2f24006c198a2a..8dc0656ace236e 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
@@ -177,6 +177,7 @@ enum class SpillArea {
   GPRCS2,
   DPRCS1,
   DPRCS2,
+  GPRCS3,
   FPCXT,
 };
 
@@ -197,7 +198,7 @@ SpillArea getSpillArea(Register Reg,
   // SplitR11WindowsSEH:
   // push {r0-r10, r12}  GPRCS1
   // vpush {r8-d15}      DPRCS1
-  // push {r11, lr}      GPRCS2
+  // push {r11, lr}      GPRCS3
   //
   // SplitR11AAPCSSignRA:
   // push {r0-r10, r12}  GPRSC1
@@ -238,10 +239,13 @@ SpillArea getSpillArea(Register Reg,
       return SpillArea::GPRCS1;
 
   case ARM::R11:
-    if (Variation == ARMSubtarget::NoSplit)
-      return SpillArea::GPRCS1;
-    else
+    if (Variation == ARMSubtarget::SplitR7 ||
+        Variation == ARMSubtarget::SplitR11AAPCSSignRA)
       return SpillArea::GPRCS2;
+    if (Variation == ARMSubtarget::SplitR11WindowsSEH)
+      return SpillArea::GPRCS3;
+
+    return SpillArea::GPRCS1;
 
   case ARM::R12:
     if (Variation == ARMSubtarget::SplitR7)
@@ -250,11 +254,12 @@ SpillArea getSpillArea(Register Reg,
       return SpillArea::GPRCS1;
 
   case ARM::LR:
-    if (Variation == ARMSubtarget::SplitR11WindowsSEH ||
-        Variation == ARMSubtarget::SplitR11AAPCSSignRA)
+    if (Variation == ARMSubtarget::SplitR11AAPCSSignRA)
       return SpillArea::GPRCS2;
-    else
-      return SpillArea::GPRCS1;
+    if (Variation == ARMSubtarget::SplitR11WindowsSEH)
+      return SpillArea::GPRCS3;
+
+    return SpillArea::GPRCS1;
 
   case ARM::D0:
   case ARM::D1:
@@ -912,7 +917,8 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
 
   // Determine the sizes of each callee-save spill areas and record which frame
   // belongs to which callee-save spill areas.
-  unsigned GPRCS1Size = 0, GPRCS2Size = 0, DPRCSSize = 0;
+  unsigned GPRCS1Size = 0, GPRCS2Size = 0, DPRCS1Size = 0, GPRCS3Size = 0,
+           DPRCS2Size = 0;
   int FramePtrSpillFI = 0;
   int D8SpillFI = 0;
 
@@ -970,14 +976,19 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
       GPRCS2Size += 4;
       break;
     case SpillArea::DPRCS1:
-      DPRCSSize += 8;
+      DPRCS1Size += 8;
+      break;
+    case SpillArea::GPRCS3:
+      GPRCS3Size += 4;
       break;
     case SpillArea::DPRCS2:
+      DPRCS2Size += 4;
       break;
     }
   }
 
-  MachineBasicBlock::iterator LastPush = MBB.end(), GPRCS1Push, GPRCS2Push;
+  MachineBasicBlock::iterator LastPush = MBB.end(), GPRCS1Push, GPRCS2Push,
+                              DPRCS1Push, GPRCS3Push;
 
   // Move past the PAC computation.
   if (AFI->shouldSignReturnAddress())
@@ -1012,20 +1023,14 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
   unsigned FPCXTOffset = NumBytes - ArgRegsSaveSize - FPCXTSaveSize;
   unsigned GPRCS1Offset = FPCXTOffset - GPRCS1Size;
   unsigned GPRCS2Offset = GPRCS1Offset - GPRCS2Size;
-  Align DPRAlign = DPRCSSize ? std::min(Align(8), Alignment) : Align(4);
-  unsigned DPRGapSize = GPRCS1Size + FPCXTSaveSize + ArgRegsSaveSize;
-  if (PushPopSplit != ARMSubtarget::SplitR11WindowsSEH) {
-    DPRGapSize += GPRCS2Size;
-  }
-  DPRGapSize %= DPRAlign.value();
 
-  unsigned DPRCSOffset;
-  if (PushPopSplit == ARMSubtarget::SplitR11WindowsSEH) {
-    DPRCSOffset = GPRCS1Offset - DPRGapSize - DPRCSSize;
-    GPRCS2Offset = DPRCSOffset - GPRCS2Size;
-  } else {
-    DPRCSOffset = GPRCS2Offset - DPRGapSize - DPRCSSize;
-  }
+  Align DPRAlign = DPRCS1Size ? std::min(Align(8), Alignment) : Align(4);
+  unsigned DPRGapSize = (ArgRegsSaveSize + FPCXTSaveSize + GPRCS1Size +
+                         GPRCS2Size) %
+                        DPRAlign.value();
+
+  unsigned DPRCS1Offset = GPRCS2Offset - DPRGapSize - DPRCS1Size;
+
   if (HasFP) {
     // Offset from the CFA to the saved frame pointer, will be negative.
     [[maybe_unused]] int FPOffset = MFI.getObjectOffset(FramePtrSpillFI);
@@ -1038,11 +1043,11 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
   }
   AFI->setGPRCalleeSavedArea1Offset(GPRCS1Offset);
   AFI->setGPRCalleeSavedArea2Offset(GPRCS2Offset);
-  AFI->setDPRCalleeSavedAreaOffset(DPRCSOffset);
+  AFI->setDPRCalleeSavedArea1Offset(DPRCS1Offset);
 
-  // Move GPRCS2, unless using SplitR11WindowsSEH, in which case it will be
-  // after DPRCS1.
-  if (GPRCS2Size > 0 && PushPopSplit != ARMSubtarget::SplitR11WindowsSEH) {
+  // Move past area 2.
+  if (GPRCS2Size > 0) {
+    assert(PushPopSplit != ARMSubtarget::SplitR11WindowsSEH);
     GPRCS2Push = LastPush = MBBI++;
     DefCFAOffsetCandidates.addInst(LastPush, GPRCS2Size, BeforeFPPush);
     if (FramePtrSpillArea == SpillArea::GPRCS2)
@@ -1063,19 +1068,19 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
     }
   }
 
-  // Move past DPRCS1.
-  if (DPRCSSize > 0) {
+  // Move past DPRCS1Size.
+  if (DPRCS1Size > 0) {
     // Since vpush register list cannot have gaps, there may be multiple vpush
     // instructions in the prologue.
     while (MBBI != MBB.end() && MBBI->getOpcode() == ARM::VSTMDDB_UPD) {
       DefCFAOffsetCandidates.addInst(MBBI, sizeOfSPAdjustment(*MBBI),
                                      BeforeFPPush);
-      LastPush = MBBI++;
+      DPRCS1Push = LastPush = MBBI++;
     }
   }
 
   // Move past the aligned DPRCS2 area.
-  if (AFI->getNumAlignedDPRCS2Regs() > 0) {
+  if (DPRCS2Size > 0) {
     MBBI = skipAlignedDPRCS2Spills(MBBI, AFI->getNumAlignedDPRCS2Regs());
     // The code inserted by emitAlignedDPRCS2Spills realigns the stack, and
     // leaves the stack pointer pointing to the DPRCS2 area.
@@ -1083,13 +1088,14 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
     // Adjust NumBytes to represent the stack slots below the DPRCS2 area.
     NumBytes += MFI.getObjectOffset(D8SpillFI);
   } else
-    NumBytes = DPRCSOffset;
-
-  // Move GPRCS2, if using using SplitR11WindowsSEH.
-  if (GPRCS2Size > 0 && PushPopSplit == ARMSubtarget::SplitR11WindowsSEH) {
-    GPRCS2Push = LastPush = MBBI++;
-    DefCFAOffsetCandidates.addInst(LastPush, GPRCS2Size, BeforeFPPush);
-    if (FramePtrSpillArea == SpillArea::GPRCS2)
+    NumBytes = DPRCS1Offset;
+
+  // Move GPRCS3, if using using SplitR11WindowsSEH.
+  if (GPRCS3Size > 0) {
+    assert(PushPopSplit == ARMSubtarget::SplitR11WindowsSEH);
+    GPRCS3Push = LastPush = MBBI++;
+    DefCFAOffsetCandidates.addInst(LastPush, GPRCS3Size, BeforeFPPush);
+    if (FramePtrSpillArea == SpillArea::GPRCS3)
       BeforeFPPush = false;
   }
 
@@ -1211,11 +1217,18 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
       FPOffsetAfterPush = MFI.getObjectOffset(FramePtrSpillFI) +
                           ArgRegsSaveSize + FPCXTSaveSize + GPRCS1Size +
                           sizeOfSPAdjustment(*FPPushInst);
-      if (PushPopSplit == ARMSubtarget::SplitR11WindowsSEH)
-        FPOffsetAfterPush += DPRCSSize + DPRGapSize;
       LLVM_DEBUG(dbgs() << "Frame pointer in GPRCS2, offset "
                         << FPOffsetAfterPush << "  after that push\n");
       break;
+    case SpillArea::GPRCS3:
+      FPPushInst = GPRCS3Push;
+      FPOffsetAfterPush = MFI.getObjectOffset(FramePtrSpillFI) +
+                          ArgRegsSaveSize + FPCXTSaveSize + GPRCS1Size +
+                          DPRCS1Size + DPRGapSize +
+                          sizeOfSPAdjustment(*FPPushInst);
+      LLVM_DEBUG(dbgs() << "Frame pointer in GPRCS3, offset "
+                        << FPOffsetAfterPush << "  after that push\n");
+      break;
     default:
       llvm_unreachable("frame pointer in unknown spill area");
       break;
@@ -1279,7 +1292,10 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
         CFIPos = std::next(GPRCS2Push);
         break;
       case SpillArea::DPRCS1:
-        CFIPos = std::next(LastPush);
+        CFIPos = std::next(DPRCS1Push);
+        break;
+      case SpillArea::GPRCS3:
+        CFIPos = std::next(GPRCS3Push);
         break;
       case SpillArea::FPCXT:
       case SpillArea::DPRCS2:
@@ -1317,7 +1333,8 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
   AFI->setGPRCalleeSavedArea1Size(GPRCS1Size);
   AFI->setGPRCalleeSavedArea2Size(GPRCS2Size);
   AFI->setDPRCalleeSavedGapSize(DPRGapSize);
-  AFI->setDPRCalleeSavedAreaSize(DPRCSSize);
+  AFI->setDPRCalleeSavedArea1Size(DPRCS1Size);
+  AFI->setGPRCalleeSavedArea3Size(GPRCS3Size);
 
   // If we need dynamic stack realignment, do it here. Be paranoid and make
   // sure if we also have VLAs, we have a base pointer for frame access.
@@ -1443,7 +1460,8 @@ void ARMFrameLowering::emitEpilogue(MachineFunction &MF,
                  AFI->getGPRCalleeSavedArea1Size() +
                  AFI->getGPRCalleeSavedArea2Size() +
                  AFI->getDPRCalleeSavedGapSize() +
-                 AFI->getDPRCalleeSavedAreaSize());
+                 AFI->getDPRCalleeSavedArea1Size() +
+                 AFI->getGPRCalleeSavedArea3Size());
 
     // Reset SP based on frame pointer only if the stack frame extends beyond
     // frame pointer stack slot or target is ELF and the function has FP.
@@ -1491,11 +1509,12 @@ void ARMFrameLowering::emitEpilogue(MachineFunction &MF,
                    MachineInstr::FrameDestroy);
 
     // Increment past our save areas.
-    if (AFI->getGPRCalleeSavedArea2Size() &&
-        PushPopSplit == ARMSubtarget::SplitR11WindowsSEH)
+    if (AFI->getGPRCalleeSavedArea3Size()) {
+      assert(PushPopSplit == ARMSubtarget::SplitR11WindowsSEH);
       MBBI++;
+    }
 
-    if (MBBI != MBB.end() && AFI->getDPRCalleeSavedAreaSize()) {
+    if (MBBI != MBB.end() && AFI->getDPRCalleeSavedArea1Size()) {
       MBBI++;
       // Since vpop register list cannot have gaps, there may be multiple vpop
       // instructions in the epilogue.
@@ -1509,9 +1528,10 @@ void ARMFrameLowering::emitEpilogue(MachineFunction &MF,
                    MachineInstr::FrameDestroy);
     }
 
-    if (AFI->getGPRCalleeSavedArea2Size() &&
-        PushPopSplit != ARMSubtarget::SplitR11WindowsSEH)
+    if (AFI->getGPRCalleeSavedArea2Size()) {
+      assert(PushPopSplit != ARMSubtarget::SplitR11WindowsSEH);
       MBBI++;
+    }
     if (AFI->getGPRCalleeSavedArea1Size()) MBBI++;
 
     if (ReservedArgStack || IncomingArgStackToRestore) {
@@ -2128,6 +2148,9 @@ bool ARMFrameLowering::spillCalleeSavedRegisters(
   auto IsDPRCS1 = [&CheckRegArea](unsigned Reg) {
     return CheckRegArea(Reg, SpillArea::DPRCS1);
   };
+  auto IsGPRCS3 = [&CheckRegArea](unsigned Reg) {
+    return CheckRegArea(Reg, SpillArea::GPRCS3);
+  };
 
   // Windows SEH requires the floating-point registers to be pushed between the
   // two blocks of GPRs in some situations. In all other cases, they are pushed
@@ -2135,7 +2158,7 @@ bool ARMFrameLowering::spillCalleeSavedRegisters(
   if (PushPopSplit == ARMSubtarget::SplitR11WindowsSEH) {
     emitPushInst(MBB, MI, CSI, PushOpc, PushOneOpc, false, IsGPRCS1);
     emitPushInst(MBB, MI, CSI, FltOpc, 0, true, IsDPRCS1);
-    emitPushInst(MBB, MI, CSI, PushOpc, PushOneOpc, false, IsGPRCS2);
+    emitPushInst(MBB, MI, CSI, PushOpc, PushOneOpc, false, IsGPRCS3);
   } else {
     emitPushInst(MBB, MI, CSI, PushOpc, PushOneOpc, false, IsGPRCS1);
     emitPushInst(MBB, MI, CSI, PushOpc, PushOneOpc, false, IsGPRCS2);
@@ -2190,9 +2213,12 @@ bool ARMFrameLowering::restoreCalleeSavedRegisters(
   auto IsDPRCS1 = [&CheckRegArea](unsigned Reg) {
     return CheckRegArea(Reg, SpillArea::DPRCS1);
   };
+  auto IsGPRCS3 = [&CheckRegArea](unsigned Reg) {
+    return CheckRegArea(Reg, SpillArea::GPRCS3);
+  };
 
   if (PushPopSplit == ARMSubtarget::SplitR11WindowsSEH) {
-    emitPopInst(MBB, MI, CSI, PopOpc, LdrOpc, isVarArg, false, IsGPRCS2);
+    emitPopInst(MBB, MI, CSI, PopOpc, LdrOpc, isVarArg, false, IsGPRCS3);
     emitPopInst(MBB, MI, CSI, FltOpc, 0, isVarArg, true, IsDPRCS1);
     emitPopInst(MBB, MI, CSI, PopOpc, LdrOpc, isVarArg, false, IsGPRCS1);
   } else {
diff --git a/llvm/lib/Target/ARM/ARMMachineFunctionInfo.h b/llvm/lib/Target/ARM/ARMMachineFunctionInfo.h
index 54bf5fffd3942c..e330d83cd80d57 100644
--- a/llvm/lib/Target/ARM/ARMMachineFunctionInfo.h
+++ b/llvm/lib/Target/ARM/ARMMachineFunctionInfo.h
@@ -81,7 +81,7 @@ class ARMFunctionInfo : public MachineFunctionInfo {
   /// Some may be spilled after the stack has been realigned.
   unsigned GPRCS1Offset = 0;
   unsigned GPRCS2Offset = 0;
-  unsigned DPRCSOffset = 0;
+  unsigned DPRCS1Offset = 0;
 
   /// GPRCS1Size, GPRCS2Size, DPRCSSize - Sizes of callee saved register spills
   /// areas.
@@ -90,7 +90,8 @@ class ARMFunctionInfo : public MachineFunctionInfo {
   unsigned GPRCS1Size = 0;
   unsigned GPRCS2Size = 0;
   unsigned DPRCSAlignGapSize = 0;
-  unsigned DPRCSSize = 0;
+  unsigned DPRCS1Size = 0;
+  unsigned GPRCS3Size = 0;
 
   /// NumAlignedDPRCS2Regs - The number of callee-saved DPRs that are saved in
   /// the aligned portion of the stack frame.  This is always a contiguous
@@ -194,25 +195,27 @@ class ARMFunctionInfo : public MachineFunctionInfo {
 
   unsigned getGPRCalleeSavedArea1Offset() const { return GPRCS1Offset; }
   unsigned getGPRCalleeSavedArea2Offset() const { return GPRCS2Offset; }
-  unsigned getDPRCalleeSavedAreaOffset()  const { return DPRCSOffset; }
+  unsigned getDPRCalleeSavedArea1Offset() const { return DPRCS1Offset; }
 
   void setGPRCalleeSavedArea1Offset(unsigned o) { GPRCS1Offset = o; }
   void setGPRCalleeSavedArea2Offset(unsigned o) { GPRCS2Offset = o; }
-  void setDPRCalleeSavedAreaOffset(unsigned o)  { DPRCSOffset = o; }
+  void setDPRCalleeSavedArea1Offset(unsigned o) { DPRCS1Offset = o; }
 
   unsigned getFPCXTSaveAreaSize() const       { return FPCXTSaveSize; }
   unsigned getFrameRecordSavedAreaSize() const { return FRSaveSize; }
   unsigned getGPRCalleeSavedArea1Size() const { return GPRCS1Size; }
   unsigned getGPRCalleeSavedArea2Size() const { return GPRCS2Size; }
   unsigned getDPRCalleeSavedGapSize() const   { return DPRCSAlignGapSize; }
-  unsigned getDPRCalleeSavedAreaSize()  const { return DPRCSSize; }
+  unsigned getDPRCalleeSavedArea1Size() const { return DPRCS1Size; }
+  unsigned getGPRCalleeSavedArea3Size() const { return GPRCS3Size; }
 
   void setFPCXTSaveAreaSize(unsigned s)       { FPCXTSaveSize = s; }
   void setFrameRecordSavedAreaSize(unsigned s) { FRSaveSize = s; }
   void setGPRCalleeSavedArea1Size(unsigned s) { GPRCS1Size = s; }
   void setGPRCalleeSavedArea2Size(unsigned s) { GPRCS2Size = s; }
   void setDPRCalleeSavedGapSize(unsigned s)   { DPRCSAlignGapSize = s; }
-  void setDPRCalleeSavedAreaSize(unsigned s)  { DPRCSSize = s; }
+  void setDPRCalleeSavedArea1Size(unsigned s) { DPRCS1Size = s; }
+  void setGPRCalleeSavedArea3Size(unsigned s) { GPRCS3Size = s; }
 
   unsigned getArgumentStackSize() const { return ArgumentStackSize; }
   void setArgumentStackSize(unsigned size) { ArgumentStackSize = size; }
diff --git a/llvm/lib/Target/ARM/Thumb1FrameLowering.cpp b/llvm/lib/Target/ARM/Thumb1FrameLowering.cpp
index 0a6bb5d1b9b167..3f8c50d6173102 100644
--- a/llvm/lib/Target/ARM/Thumb1FrameLowering.cpp
+++ b/llvm/lib/Target/ARM/Thumb1FrameLowering.cpp
@@ -290,7 +290,7 @@ void Thumb1FrameLowering::emitPrologue(MachineFunction &MF,
     AFI->setFrameRecordSavedAreaSize(FRSize);
   AFI->setGPRCalleeSavedArea1Offset(GPRCS1Offset);
   AFI->setGPRCalleeSavedArea2Offset(GPRCS2Offset);
-  AFI->setDPRCalleeSavedAreaOffset(DPRCSOffset);
+  AFI->setDPRCalleeSavedArea1Offset(DPRCSOffset);
   NumBytes = DPRCSOffset;
 
   int FramePtrOffsetInBlock = 0;
@@ -440,7 +440,7 @@ void Thumb1FrameLowering::emitPrologue(MachineFunction &MF,
 
   AFI->setGPRCalleeSavedArea1Size(GPRCS1Size);
   AFI->setGPRCalleeSavedArea2Size(GPRCS2Size);
-  AFI->setDPRCalleeSavedAreaSize(DPRCSSize);
+  AFI->setDPRCalleeSavedArea1Size(DPRCSSize);
 
   if (RegInfo->hasStackRealignment(MF)) {
     const unsigned NrBitsToZero = Log2(MFI.getMaxAlign());
@@ -529,7 +529,7 @@ void Thumb1FrameLowering::emitEpilogue(MachineFunction &MF,
     NumBytes -= (AFI->getFrameRecordSavedAreaSize() +
                  AFI->getGPRCalleeSavedArea1Size() +
                  AFI->getGPRCalleeSavedArea2Size() +
-                 AFI->getDPRCalleeSavedAreaSize() +
+                 AFI->getDPRCalleeSavedArea1Size() +
                  ArgRegsSaveSize);
 
     // We are likely to need a scratch register and we know all callee-save

Copy link
Contributor Author

@pestctrl pestctrl Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left these asserts here to serve as reminders of what GPRCS2Size > 0 and GPRCS3Size > 0 likewise imply, but someone doesn't like them being here, I can remove them. Checking for GPRCS2Size/GPRCS3Size provides all the information that the frame code needs to know.

@pestctrl pestctrl force-pushed the gprcs3 branch 2 times, most recently from ff6955e to 2b212cb Compare November 19, 2024 00:22
@github-actions
Copy link

github-actions bot commented Nov 19, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 48591953e97b9ecf32e60fe0233ca0ba2765184e 0acdd71ff48577db149582eae8f34b3826333023 --extensions cpp,h -- llvm/lib/Target/ARM/ARMFrameLowering.cpp llvm/lib/Target/ARM/ARMMachineFunctionInfo.h llvm/lib/Target/ARM/Thumb1FrameLowering.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.cpp b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
index 054e46611a..d6b94fecf6 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
@@ -1455,13 +1455,11 @@ void ARMFrameLowering::emitEpilogue(MachineFunction &MF,
     }
 
     // Move SP to start of FP callee save spill area.
-    NumBytes -= (ReservedArgStack +
-                 AFI->getFPCXTSaveAreaSize() +
-                 AFI->getGPRCalleeSavedArea1Size() +
-                 AFI->getGPRCalleeSavedArea2Size() +
-                 AFI->getDPRCalleeSavedGapSize() +
-                 AFI->getDPRCalleeSavedArea1Size() +
-                 AFI->getGPRCalleeSavedArea3Size());
+    NumBytes -=
+        (ReservedArgStack + AFI->getFPCXTSaveAreaSize() +
+         AFI->getGPRCalleeSavedArea1Size() + AFI->getGPRCalleeSavedArea2Size() +
+         AFI->getDPRCalleeSavedGapSize() + AFI->getDPRCalleeSavedArea1Size() +
+         AFI->getGPRCalleeSavedArea3Size());
 
     // Reset SP based on frame pointer only if the stack frame extends beyond
     // frame pointer stack slot or target is ELF and the function has FP.
diff --git a/llvm/lib/Target/ARM/Thumb1FrameLowering.cpp b/llvm/lib/Target/ARM/Thumb1FrameLowering.cpp
index 3f8c50d617..890c2344c2 100644
--- a/llvm/lib/Target/ARM/Thumb1FrameLowering.cpp
+++ b/llvm/lib/Target/ARM/Thumb1FrameLowering.cpp
@@ -526,11 +526,10 @@ void Thumb1FrameLowering::emitEpilogue(MachineFunction &MF,
     }
 
     // Move SP to start of FP callee save spill area.
-    NumBytes -= (AFI->getFrameRecordSavedAreaSize() +
-                 AFI->getGPRCalleeSavedArea1Size() +
-                 AFI->getGPRCalleeSavedArea2Size() +
-                 AFI->getDPRCalleeSavedArea1Size() +
-                 ArgRegsSaveSize);
+    NumBytes -=
+        (AFI->getFrameRecordSavedAreaSize() +
+         AFI->getGPRCalleeSavedArea1Size() + AFI->getGPRCalleeSavedArea2Size() +
+         AFI->getDPRCalleeSavedArea1Size() + ArgRegsSaveSize);
 
     // We are likely to need a scratch register and we know all callee-save
     // registers are free at this point in the epilogue, so pick one.

// Windows SEH requires the floating-point registers to be pushed between the
// two blocks of GPRs in some situations. In all other cases, they are pushed
// below the GPRs.
if (PushPopSplit == ARMSubtarget::SplitR11WindowsSEH) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this if be removed now that the blocks are always pushed in the same order?

return CheckRegArea(Reg, SpillArea::GPRCS3);
};

if (PushPopSplit == ARMSubtarget::SplitR11WindowsSEH) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this if be removed now that the blocks are always pushed in the same order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! Good idea.

case SpillArea::GPRCS3:
FPPushInst = GPRCS3Push;
FPOffsetAfterPush = MFI.getObjectOffset(FramePtrSpillFI) +
ArgRegsSaveSize + FPCXTSaveSize + GPRCS1Size +
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to include GPRCS2Size here for consistency, or at least assert that it is zero, so that this will be correct of we ever start using GPRCS2 and GPRCS3 together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with including GPRCS2Size here.

Copy link
Collaborator

@ostannard ostannard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM, assuming you apply the code_formatter bot's suggested changes before merging.

@pestctrl pestctrl force-pushed the gprcs3 branch 3 times, most recently from 0acdd71 to 7d24648 Compare November 19, 2024 16:40
Currently, the relative position of GPRCS2 (with respect to other
instructions in the prologue of a function) can be different depending
on the type of ARMSubtarget::PushPopSplitVariant.

When the PushPopSpiltVariant is SplitR11WindowsSEH, GPRCS2 comes
after both GPRCS1 and DPRCS2:

GPRCS1
DPRCS1
GPRCS2

However, in all other cases, GPRCS2 comes before DPRCS1, like so:

GPRCS1
GPRCS2
DPRCS1

This makes the MI walking code in ARMFrameLowering::emitPrologue a bit
confusing. If GPRCS2Size is non-zero, we also have to check the
PushPopSplitVariant to know if we will encounter the DPRCS1 push
instruction first or the GPRCS2 push instruction first.

This commit changes to SplitR11WindowsSEH such that the spill area is
as follows:

GPRCS1
DPRCS1
GPRCS3

This disambiguates a lot of the ARMFrameLowering.cpp MI traversal
code.
@pestctrl pestctrl merged commit d37554b into llvm:main Nov 19, 2024
4 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants