Skip to content

Conversation

@arsenm
Copy link
Contributor

@arsenm arsenm commented Aug 24, 2025

This function should not be virtual; making this virtual was
an AMDGPU hack that should be removed not spread to other
backends.

This does not need to be overridden to reserve registers. The
register reservation mechanism is orthogonal to to the register
class constraints of the instruction, this should be reporting
the underlying instruction constraint. The registers are separately
reserved, so they will be removed from the allocation order anyway.
If the actual class needs to change based on the subtarget,
it should probably generalize the LookupPtrRegClass mechanism.

This was added by #70958. The new tests there for the class are
probably not useful anymore. These instead should compile to the
end and try to stress the allocation behavior.

Copy link
Contributor Author

arsenm commented Aug 24, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@arsenm arsenm requested review from KanRobert and nikic August 24, 2025 01:57
@llvmbot
Copy link
Member

llvmbot commented Aug 24, 2025

@llvm/pr-subscribers-backend-x86

Author: Matt Arsenault (arsenm)

Changes

This function should not be virtual; making this virtual was
an AMDGPU hack that should be removed not spread to other
backends.

This does not need to be overridden to reserve registers. The
register reservation mechanism is orthogonal to to the register
class constraints of the instruction, this should be reporting
the underlying instruction constraint. The registers are separately
reserved, so they will be removed from the allocation order anyway.
If the actual class needs to change based on the subtarget,
it should probably generalize the LookupPtrRegClass mechanism.

This was added by #70958. The new tests there for the class are
probably not useful anymore. These instead should compile to the
end and try to stress the allocation behavior.


Full diff: https://github.com/llvm/llvm-project/pull/155128.diff

6 Files Affected:

  • (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (-17)
  • (modified) llvm/lib/Target/X86/X86InstrInfo.h (-11)
  • (modified) llvm/test/CodeGen/X86/apx/no-rex2-general.ll (+8-5)
  • (modified) llvm/test/CodeGen/X86/apx/no-rex2-pseudo-amx.ll (+2-2)
  • (modified) llvm/test/CodeGen/X86/apx/no-rex2-pseudo-x87.ll (+2-2)
  • (modified) llvm/test/CodeGen/X86/apx/no-rex2-special.ll (+4-4)
diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index abf365eedec39..75ed096904629 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -91,23 +91,6 @@ X86InstrInfo::X86InstrInfo(X86Subtarget &STI)
                       X86::CATCHRET, (STI.is64Bit() ? X86::RET64 : X86::RET32)),
       Subtarget(STI), RI(STI.getTargetTriple()) {}
 
-const TargetRegisterClass *
-X86InstrInfo::getRegClass(const MCInstrDesc &MCID, unsigned OpNum,
-                          const TargetRegisterInfo *TRI,
-                          const MachineFunction &MF) const {
-  auto *RC = TargetInstrInfo::getRegClass(MCID, OpNum, TRI, MF);
-  // If the target does not have egpr, then r16-r31 will be resereved for all
-  // instructions.
-  if (!RC || !Subtarget.hasEGPR())
-    return RC;
-
-  if (X86II::canUseApxExtendedReg(MCID))
-    return RC;
-
-  const X86RegisterInfo *RI = Subtarget.getRegisterInfo();
-  return RI->constrainRegClassToNonRex2(RC);
-}
-
 bool X86InstrInfo::isCoalescableExtInstr(const MachineInstr &MI,
                                          Register &SrcReg, Register &DstReg,
                                          unsigned &SubIdx) const {
diff --git a/llvm/lib/Target/X86/X86InstrInfo.h b/llvm/lib/Target/X86/X86InstrInfo.h
index 9dc5f4b0e086e..b2e7e0e9eea28 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.h
+++ b/llvm/lib/Target/X86/X86InstrInfo.h
@@ -240,17 +240,6 @@ class X86InstrInfo final : public X86GenInstrInfo {
 public:
   explicit X86InstrInfo(X86Subtarget &STI);
 
-  /// Given a machine instruction descriptor, returns the register
-  /// class constraint for OpNum, or NULL. Returned register class
-  /// may be different from the definition in the TD file, e.g.
-  /// GR*RegClass (definition in TD file)
-  /// ->
-  /// GR*_NOREX2RegClass (Returned register class)
-  const TargetRegisterClass *
-  getRegClass(const MCInstrDesc &MCID, unsigned OpNum,
-              const TargetRegisterInfo *TRI,
-              const MachineFunction &MF) const override;
-
   /// getRegisterInfo - TargetInstrInfo is a superset of MRegister info.  As
   /// such, whenever a client has an instance of instruction info, it should
   /// always be able to get register info as well (through this method).
diff --git a/llvm/test/CodeGen/X86/apx/no-rex2-general.ll b/llvm/test/CodeGen/X86/apx/no-rex2-general.ll
index 805fc7ccaab76..74d8fc2d801a8 100644
--- a/llvm/test/CodeGen/X86/apx/no-rex2-general.ll
+++ b/llvm/test/CodeGen/X86/apx/no-rex2-general.ll
@@ -14,6 +14,7 @@ define i32 @map0(ptr nocapture noundef readonly %a, i64 noundef %b) {
   ; SSE-NEXT:   [[MOV32rm:%[0-9]+]]:gr32 = MOV32rm [[COPY1]], 4, [[COPY]], 0, $noreg :: (load (s32) from %ir.add.ptr)
   ; SSE-NEXT:   $eax = COPY [[MOV32rm]]
   ; SSE-NEXT:   RET 0, $eax
+  ;
   ; AVX-LABEL: name: map0
   ; AVX: bb.0.entry:
   ; AVX-NEXT:   liveins: $rdi, $rsi
@@ -38,12 +39,13 @@ define i32 @map1_or_vex(<2 x double> noundef %a) {
   ; SSE-NEXT:   [[CVTSD2SIrr_Int:%[0-9]+]]:gr32 = nofpexcept CVTSD2SIrr_Int [[COPY]], implicit $mxcsr
   ; SSE-NEXT:   $eax = COPY [[CVTSD2SIrr_Int]]
   ; SSE-NEXT:   RET 0, $eax
+  ;
   ; AVX-LABEL: name: map1_or_vex
   ; AVX: bb.0.entry:
   ; AVX-NEXT:   liveins: $xmm0
   ; AVX-NEXT: {{  $}}
   ; AVX-NEXT:   [[COPY:%[0-9]+]]:vr128 = COPY $xmm0
-  ; AVX-NEXT:   [[VCVTSD2SIrr_Int:%[0-9]+]]:gr32_norex2 = nofpexcept VCVTSD2SIrr_Int [[COPY]], implicit $mxcsr
+  ; AVX-NEXT:   [[VCVTSD2SIrr_Int:%[0-9]+]]:gr32 = nofpexcept VCVTSD2SIrr_Int [[COPY]], implicit $mxcsr
   ; AVX-NEXT:   $eax = COPY [[VCVTSD2SIrr_Int]]
   ; AVX-NEXT:   RET 0, $eax
 entry:
@@ -56,17 +58,18 @@ define <2 x i64> @map2_or_vex(ptr nocapture noundef readonly %b, i64 noundef %c)
   ; SSE: bb.0.entry:
   ; SSE-NEXT:   liveins: $rdi, $rsi
   ; SSE-NEXT: {{  $}}
-  ; SSE-NEXT:   [[COPY:%[0-9]+]]:gr64_norex2_nosp = COPY $rsi
-  ; SSE-NEXT:   [[COPY1:%[0-9]+]]:gr64_norex2 = COPY $rdi
+  ; SSE-NEXT:   [[COPY:%[0-9]+]]:gr64_nosp = COPY $rsi
+  ; SSE-NEXT:   [[COPY1:%[0-9]+]]:gr64 = COPY $rdi
   ; SSE-NEXT:   [[PABSBrm:%[0-9]+]]:vr128 = PABSBrm [[COPY1]], 4, [[COPY]], 0, $noreg :: (load (s128) from %ir.add.ptr)
   ; SSE-NEXT:   $xmm0 = COPY [[PABSBrm]]
   ; SSE-NEXT:   RET 0, $xmm0
+  ;
   ; AVX-LABEL: name: map2_or_vex
   ; AVX: bb.0.entry:
   ; AVX-NEXT:   liveins: $rdi, $rsi
   ; AVX-NEXT: {{  $}}
-  ; AVX-NEXT:   [[COPY:%[0-9]+]]:gr64_norex2_nosp = COPY $rsi
-  ; AVX-NEXT:   [[COPY1:%[0-9]+]]:gr64_norex2 = COPY $rdi
+  ; AVX-NEXT:   [[COPY:%[0-9]+]]:gr64_nosp = COPY $rsi
+  ; AVX-NEXT:   [[COPY1:%[0-9]+]]:gr64 = COPY $rdi
   ; AVX-NEXT:   [[VPABSBrm:%[0-9]+]]:vr128 = VPABSBrm [[COPY1]], 4, [[COPY]], 0, $noreg :: (load (s128) from %ir.add.ptr)
   ; AVX-NEXT:   $xmm0 = COPY [[VPABSBrm]]
   ; AVX-NEXT:   RET 0, $xmm0
diff --git a/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-amx.ll b/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-amx.ll
index 5fa4cb4c8826b..812ea87fd0baa 100644
--- a/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-amx.ll
+++ b/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-amx.ll
@@ -7,8 +7,8 @@ define dso_local void @amx(ptr noundef %data) {
   ; CHECK: bb.0.entry:
   ; CHECK-NEXT:   liveins: $rdi
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gr64_norex2 = COPY $rdi
-  ; CHECK-NEXT:   [[MOV32ri64_:%[0-9]+]]:gr64_norex2_nosp = MOV32ri64 8
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gr64 = COPY $rdi
+  ; CHECK-NEXT:   [[MOV32ri64_:%[0-9]+]]:gr64_nosp = MOV32ri64 8
   ; CHECK-NEXT:   PTILELOADD 4, [[COPY]], 1, killed [[MOV32ri64_]], 0, $noreg
   ; CHECK-NEXT:   RET 0
   entry:
diff --git a/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-x87.ll b/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-x87.ll
index a9ca591a156c2..5156047dac7b8 100644
--- a/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-x87.ll
+++ b/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-x87.ll
@@ -7,8 +7,8 @@ define void @x87(ptr %0, ptr %1) {
   ; CHECK: bb.0 (%ir-block.2):
   ; CHECK-NEXT:   liveins: $rdi, $rsi
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gr64_norex2 = COPY $rsi
-  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gr64_norex2 = COPY $rdi
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gr64 = COPY $rsi
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gr64 = COPY $rdi
   ; CHECK-NEXT:   [[LD_Fp32m:%[0-9]+]]:rfp32 = nofpexcept LD_Fp32m [[COPY1]], 1, $noreg, 0, $noreg, implicit-def dead $fpsw, implicit $fpcw :: (load (s32) from %ir.0)
   ; CHECK-NEXT:   nofpexcept ST_Fp32m [[COPY]], 1, $noreg, 0, $noreg, killed [[LD_Fp32m]], implicit-def dead $fpsw, implicit $fpcw :: (store (s32) into %ir.1)
   ; CHECK-NEXT:   RET 0
diff --git a/llvm/test/CodeGen/X86/apx/no-rex2-special.ll b/llvm/test/CodeGen/X86/apx/no-rex2-special.ll
index 86534427a9eae..51b7fe9391cf4 100644
--- a/llvm/test/CodeGen/X86/apx/no-rex2-special.ll
+++ b/llvm/test/CodeGen/X86/apx/no-rex2-special.ll
@@ -9,7 +9,7 @@ define void @test_xsave(ptr %ptr, i32 %hi, i32 %lo) {
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gr32 = COPY $edx
   ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gr32 = COPY $esi
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gr64_norex2 = COPY $rdi
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gr64 = COPY $rdi
   ; CHECK-NEXT:   $edx = COPY [[COPY1]]
   ; CHECK-NEXT:   $eax = COPY [[COPY]]
   ; CHECK-NEXT:   XSAVE [[COPY2]], 1, $noreg, 0, $noreg, implicit $edx, implicit $eax
@@ -26,7 +26,7 @@ define void @test_xsave64(ptr %ptr, i32 %hi, i32 %lo) {
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gr32 = COPY $edx
   ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gr32 = COPY $esi
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gr64_norex2 = COPY $rdi
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gr64 = COPY $rdi
   ; CHECK-NEXT:   $edx = COPY [[COPY1]]
   ; CHECK-NEXT:   $eax = COPY [[COPY]]
   ; CHECK-NEXT:   XSAVE64 [[COPY2]], 1, $noreg, 0, $noreg, implicit $edx, implicit $eax
@@ -43,7 +43,7 @@ define void @test_xrstor(ptr %ptr, i32 %hi, i32 %lo) {
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gr32 = COPY $edx
   ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gr32 = COPY $esi
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gr64_norex2 = COPY $rdi
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gr64 = COPY $rdi
   ; CHECK-NEXT:   $edx = COPY [[COPY1]]
   ; CHECK-NEXT:   $eax = COPY [[COPY]]
   ; CHECK-NEXT:   XRSTOR [[COPY2]], 1, $noreg, 0, $noreg, implicit $edx, implicit $eax
@@ -60,7 +60,7 @@ define void @test_xrstor64(ptr %ptr, i32 %hi, i32 %lo) {
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gr32 = COPY $edx
   ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gr32 = COPY $esi
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gr64_norex2 = COPY $rdi
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gr64 = COPY $rdi
   ; CHECK-NEXT:   $edx = COPY [[COPY1]]
   ; CHECK-NEXT:   $eax = COPY [[COPY]]
   ; CHECK-NEXT:   XRSTOR64 [[COPY2]], 1, $noreg, 0, $noreg, implicit $edx, implicit $eax

@arsenm arsenm requested review from RKSimon, e-kud, mortbopet, phoebewang and qcolombet and removed request for mortbopet August 24, 2025 01:57
@arsenm arsenm marked this pull request as ready for review August 24, 2025 01:57
@KanRobert
Copy link
Contributor

KanRobert commented Aug 24, 2025

It seems this PR should not only remove the overriding. If there is an alternative/better way to achieve that

it should probably generalize the LookupPtrRegClass mechanism.

, it should be included too, otherwise there is a correctness issue.

@arsenm
Copy link
Contributor Author

arsenm commented Aug 24, 2025

It seems this PR should not only remove the overriding. If there is an alternative/better way to achieve that

The original PR already reserved these registers

it should probably generalize the LookupPtrRegClass mechanism.

, it should be included too, otherwise there is a correctness issue.

That would be necessary if the underlying class fundamentally changed (e.g. 32-bit to 64-bit), but this just swapping to a different subset, which also aligns with the reserved registers so I don't think there's anything to do here.

@KanRobert
Copy link
Contributor

It seems this PR should not only remove the overriding. If there is an alternative/better way to achieve that

The original PR already reserved these registers

it should probably generalize the LookupPtrRegClass mechanism.

, it should be included too, otherwise there is a correctness issue.

That would be necessary if the underlying class fundamentally changed (e.g. 32-bit to 64-bit), but this just swapping to a different subset, which also aligns with the reserved registers so I don't think there's anything to do here.

It's different from the logic of reserved registers, which does not depend on the instruction itself. Here we have the check

X86II::canUseApxExtendedReg(MCID)

b/c even when feature APX/EGPR is on, some instructions can not use R16-R31.

@arsenm
Copy link
Contributor Author

arsenm commented Aug 24, 2025

It's different from the logic of reserved registers, which does not depend on the instruction itself. Here we have the check

X86II::canUseApxExtendedReg(MCID)

b/c even when feature APX/EGPR is on, some instructions can not use R16-R31.

Can you take care of this then? Consider this a post-commit review. That patch was not actually approved, and does not have adequate test coverage for this restriction

@KanRobert
Copy link
Contributor

KanRobert commented Aug 25, 2025

It's different from the logic of reserved registers, which does not depend on the instruction itself. Here we have the check

X86II::canUseApxExtendedReg(MCID)

b/c even when feature APX/EGPR is on, some instructions can not use R16-R31.

Can you take care of this then? Consider this a post-commit review. That patch was not actually approved, and does not have adequate test coverage for this restriction

The tests updated in this PR should prove the adequate test coverage. We can generalize LookupPtrRegClass to replace current implementation, but that would require the the function XXRegisterInfo::getPointerRegClass to take a parameter const MCInstrDesc &MCID, and X86 GR8/16/32/64 to be pointer-like regclass. Does it sound good to you?

@arsenm
Copy link
Contributor Author

arsenm commented Oct 10, 2025

The tests updated in this PR should prove the adequate test coverage. We can generalize LookupPtrRegClass to replace current implementation, but that would require the the function XXRegisterInfo::getPointerRegClass to take a parameter const MCInstrDesc &MCID, and X86 GR8/16/32/64 to be pointer-like regclass. Does it sound good to you?

We can do better now. This can use RegClassByHwMode. I'm planning on removing getPointerRegClass in #159883

@RKSimon
Copy link
Collaborator

RKSimon commented Oct 30, 2025

@KanRobert @arsenm Whats this plan for this PR please?

@KanRobert
Copy link
Contributor

@KanRobert @arsenm Whats this plan for this PR please?

I suppose @arsenm would re-implement this after rebasing.

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't merge with trunk latest

@arsenm
Copy link
Contributor Author

arsenm commented Nov 4, 2025

@KanRobert @arsenm Whats this plan for this PR please?

I suppose @arsenm would re-implement this after rebasing.

@KanRobert I can take care of reposting this in its current delete-as-is form. You previously claimed this will break something, but the current test coverage is inadequate and this approach insufficient. I have no intention of taking on that work of fixing the test coverage or switching to use RegClassByHwMode, I'm asking you as the original submitter of the implementation to take care of that

This function should not be virtual; making this virtual was
an AMDGPU hack that should be removed not spread to other
backends.

This does not need to be overridden to reserve registers. The
register reservation mechanism is orthogonal to to the register
class constraints of the instruction, this should be reporting
the underlying instruction constraint. The registers are separately
reserved, so they will be removed from the allocation order anyway.
If the actual class needs to change based on the subtarget,
it should probably generalize the LookupPtrRegClass mechanism.

This was added by #70958. The new tests there for the class are
probably not useful anymore. These instead should compile to the
end and try to stress the allocation behavior.
@arsenm arsenm force-pushed the users/arsenm/x86/stop-overriding-getRegClass branch from c53a6ad to 6d9f319 Compare November 4, 2025 21:40
@KanRobert
Copy link
Contributor

@arsenm Synced with @phoebewang offline, she will help improve the test coverage.

@phoebewang
Copy link
Contributor

@arsenm Synced with @phoebewang offline, she will help improve the test coverage.

Created #167275

@RKSimon RKSimon self-requested a review November 10, 2025 11:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants