X86: Use LiveRegUnits in findDeadCallerSavedReg #156817

arsenm · 2025-09-04T07:15:25Z

My goal here was to eliminate the use of getGPRsForTailCall.
I don't think it makes sense to use here, considering this function
handles other non-tail call cases. It seems to have been used to
find a non-callee saved register in a roundabout way. We can be more
precise by letting LiveRegUnits figure out the CSR set for the current
function.

There are a few things I find confusing about this function; the API
isn't right. The callers ideally would be maintaining LiveRegUnits
in the context where they need the free register. It also doesn't
provide a RegisterClass to use. Also, this should work for any
block and this shouldn't need to special case this set of return
opcodes. I'm also not sure why there is a special case for RIP. It was introduced in
fe5e5dc but doesn't appear to
be tested.

There are a few codegen test changes. One of them I think is a correct
improvement since the old code didn't consider undef uses as available.
The others I think are just to allocation order changes, since we're now
using the broader GPR_*NOSP classes.

arsenm · 2025-09-04T07:15:43Z

X86: Stop using MachineFunction in getPointerRegClass #156880
X86: Use LiveRegUnits in findDeadCallerSavedReg #156817 👈 (View in Graphite)
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

llvmbot · 2025-09-04T07:16:03Z

@llvm/pr-subscribers-backend-x86

Author: Matt Arsenault (arsenm)

Changes

My goal here was to eliminate the use of getGPRsForTailCall.
I don't think it makes sense to use here, considering this function
handles other non-tail call cases. It seems to have been used to
find a non-callee saved register in a roundabout way. We can be more
precise by letting LiveRegUnits figure out the CSR set for the current
function.

There are a few things I find confusing about this function; the API
isn't right. The callers ideally would be maintaining LiveRegUnits
in the context where they need the free register. It also doesn't
provide a RegisterClass to use. Also, this should work for any
block and this shouldn't need to special case this set of return
opcodes.

There are a few codegen test changes. One of them I think is a correct
improvement since the old code didn't consider undef uses as available.
The others I think are just to allocation order changes, since we're now
using the broader GPR_*NOSP classes.

Full diff: https://github.com/llvm/llvm-project/pull/156817.diff

5 Files Affected:

(modified) llvm/lib/Target/X86/X86RegisterInfo.cpp (+13-15)
(modified) llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll (+4-4)
(modified) llvm/test/CodeGen/X86/lvi-hardening-ret.ll (+6-6)
(modified) llvm/test/CodeGen/X86/pr40289-64bit.ll (+1-1)
(modified) llvm/test/CodeGen/X86/pr40289.ll (+1-1)

diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp
index b79e508df7c97..3f4955f28e68b 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.cpp
+++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -1002,8 +1002,6 @@ unsigned X86RegisterInfo::findDeadCallerSavedReg(
   if (MF->callsEHReturn())
     return 0;
 
-  const TargetRegisterClass &AvailableRegs = *getGPRsForTailCall(*MF);
-
   if (MBBI == MBB.end())
     return 0;
 
@@ -1025,20 +1023,20 @@ unsigned X86RegisterInfo::findDeadCallerSavedReg(
   case X86::TCRETURNmi64:
   case X86::EH_RETURN:
   case X86::EH_RETURN64: {
-    SmallSet<uint16_t, 8> Uses;
-    for (MachineOperand &MO : MBBI->operands()) {
-      if (!MO.isReg() || MO.isDef())
-        continue;
-      Register Reg = MO.getReg();
-      if (!Reg)
-        continue;
-      for (MCRegAliasIterator AI(Reg, this, true); AI.isValid(); ++AI)
-        Uses.insert(*AI);
+    LiveRegUnits LRU(*this);
+    LRU.addLiveOuts(MBB);
+    LRU.stepBackward(*MBBI);
+
+    // FIXME: Why do we need to special case this register? Is it missing from
+    // return implicit uses?
+    LRU.removeReg(X86::RIP);
+
+    const TargetRegisterClass &RC =
+        Is64Bit ? X86::GR64_NOSPRegClass : X86::GR32_NOSPRegClass;
+    for (MCRegister Reg : RC) {
+      if (LRU.available(Reg))
+        return Reg;
     }
-
-    for (auto CS : AvailableRegs)
-      if (!Uses.count(CS) && CS != X86::RIP && CS != X86::RSP && CS != X86::ESP)
-        return CS;
   }
   }
 
diff --git a/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll b/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
index ad24608d338ad..d6d4db3509103 100644
--- a/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
+++ b/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
@@ -81,7 +81,7 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; LIN-NEXT:    .cfi_def_cfa_offset 32
 ; LIN-NEXT:    pop2 %rbp, %r15
 ; LIN-NEXT:    .cfi_def_cfa_offset 16
-; LIN-NEXT:    popq %rcx
+; LIN-NEXT:    popq %rax
 ; LIN-NEXT:    .cfi_def_cfa_offset 8
 ; LIN-NEXT:    retq
 ;
@@ -116,7 +116,7 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; LIN-PPX-NEXT:    .cfi_def_cfa_offset 32
 ; LIN-PPX-NEXT:    pop2p %rbp, %r15
 ; LIN-PPX-NEXT:    .cfi_def_cfa_offset 16
-; LIN-PPX-NEXT:    popq %rcx
+; LIN-PPX-NEXT:    popq %rax
 ; LIN-PPX-NEXT:    .cfi_def_cfa_offset 8
 ; LIN-PPX-NEXT:    retq
 ;
@@ -180,7 +180,7 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-NEXT:    pop2 %rbp, %rbx
 ; WIN-NEXT:    pop2 %r13, %r12
 ; WIN-NEXT:    pop2 %r15, %r14
-; WIN-NEXT:    popq %rcx
+; WIN-NEXT:    popq %rax
 ; WIN-NEXT:    .seh_endepilogue
 ; WIN-NEXT:    retq
 ; WIN-NEXT:    .seh_endproc
@@ -211,7 +211,7 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-PPX-NEXT:    pop2p %rbp, %rbx
 ; WIN-PPX-NEXT:    pop2p %r13, %r12
 ; WIN-PPX-NEXT:    pop2p %r15, %r14
-; WIN-PPX-NEXT:    popq %rcx
+; WIN-PPX-NEXT:    popq %rax
 ; WIN-PPX-NEXT:    .seh_endepilogue
 ; WIN-PPX-NEXT:    retq
 ; WIN-PPX-NEXT:    .seh_endproc
diff --git a/llvm/test/CodeGen/X86/lvi-hardening-ret.ll b/llvm/test/CodeGen/X86/lvi-hardening-ret.ll
index faa8bff8f0943..954985a3798b7 100644
--- a/llvm/test/CodeGen/X86/lvi-hardening-ret.ll
+++ b/llvm/test/CodeGen/X86/lvi-hardening-ret.ll
@@ -41,9 +41,9 @@ entry:
   %add = add nsw i32 %0, %1
   ret i32 %add
 ; CHECK-NOT:  retq
-; CHECK:      popq %rcx
+; CHECK:      popq %rsi
 ; CHECK-NEXT: lfence
-; CHECK-NEXT: jmpq *%rcx
+; CHECK-NEXT: jmpq *%rsi
 }
 
 ; Function Attrs: noinline nounwind optnone uwtable
@@ -52,9 +52,9 @@ define dso_local preserve_mostcc void @preserve_most() #0 {
 entry:
   ret void
 ; CHECK-NOT:  retq
-; CHECK:      popq %rax
+; CHECK:      popq %r11
 ; CHECK-NEXT: lfence
-; CHECK-NEXT: jmpq *%rax
+; CHECK-NEXT: jmpq *%r11
 }
 
 ; Function Attrs: noinline nounwind optnone uwtable
@@ -63,9 +63,9 @@ define dso_local preserve_allcc void @preserve_all() #0 {
 entry:
   ret void
 ; CHECK-NOT:  retq
-; CHECK:      popq %rax
+; CHECK:      popq %r11
 ; CHECK-NEXT: lfence
-; CHECK-NEXT: jmpq *%rax
+; CHECK-NEXT: jmpq *%r11
 }
 
 define { i64, i128 } @ret_i64_i128() #0 {
diff --git a/llvm/test/CodeGen/X86/pr40289-64bit.ll b/llvm/test/CodeGen/X86/pr40289-64bit.ll
index 58da5258a6703..a83674ac81998 100644
--- a/llvm/test/CodeGen/X86/pr40289-64bit.ll
+++ b/llvm/test/CodeGen/X86/pr40289-64bit.ll
@@ -6,5 +6,5 @@ define cc 92 < 9 x i64 > @clobber() {
   ret < 9 x i64 > undef
   ; CHECK-LABEL: clobber:
   ; CHECK-NOT: popq %rsp
-  ; CHECK: addq $8, %rsp
+  ; CHECK: pushq %rax
 }
diff --git a/llvm/test/CodeGen/X86/pr40289.ll b/llvm/test/CodeGen/X86/pr40289.ll
index 851b23c002bdb..21e50931b40f2 100644
--- a/llvm/test/CodeGen/X86/pr40289.ll
+++ b/llvm/test/CodeGen/X86/pr40289.ll
@@ -6,5 +6,5 @@ define < 3 x i32 > @clobber() {
   ret < 3 x i32 > undef
   ; CHECK-LABEL: clobber:
   ; CHECK-NOT: popl %esp
-  ; CHECK: addl $4, %esp
+  ; CHECK: popl %eax
 }

llvm/test/CodeGen/X86/pr40289-64bit.ll

My goal here was to eliminate the use of getGPRsForTailCall. I don't think it makes sense to use here, considering this function handles other non-tail call cases. It seems to have been used to find a non-callee saved register in a roundabout way. We can be more precise by letting LiveRegUnits figure out the CSR set for the current function. There are a few things I find confusing about this function; the API isn't right. The callers ideally would be maintaining LiveRegUnits in the context where they need the free register. It also doesn't provide a RegisterClass to use. Also, this should work for any block and this shouldn't need to special case this set of return opcodes. There are a few codegen test changes. One of them I think is a correct improvement since the old code didn't consider undef uses as available. The others I think are just to allocation order changes, since we're now using the broader GPR_*NOSP classes.

arsenm · 2025-09-08T23:39:25Z

ping

phoebewang · 2025-09-09T02:15:03Z

llvm/lib/Target/X86/X86RegisterInfo.cpp

+    // FIXME: Why do we need to special case this register? Is it missing from
+    // return implicit uses?
+    LRU.removeReg(X86::RIP);


Isn't X86::GR64_NOSPRegClass excluding RIP?

Not sure if that's the class constraint an all of these instructions. This is attempting to reproduce the existing logic

https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86RegisterInfo.td#L685

llvm/lib/Target/X86/X86RegisterInfo.cpp

phoebewang

LGTM.

…ss (#157798) Fixes regression casued by #156817.

arsenm added the backend:X86 label Sep 4, 2025 — with Graphite App

arsenm requested review from KanRobert, MatzeB, RKSimon, phoebewang, qcolombet, rnk and topperc September 4, 2025 07:16

arsenm marked this pull request as ready for review September 4, 2025 07:17

phoebewang reviewed Sep 4, 2025

View reviewed changes

llvm/test/CodeGen/X86/pr40289-64bit.ll Outdated Show resolved Hide resolved

arsenm force-pushed the users/arsenm/x86/use-liveregunits-findDeadCallerSavedReg branch from 73e2257 to c831a2a Compare September 4, 2025 12:46

arsenm mentioned this pull request Sep 4, 2025

X86: Stop using MachineFunction in getPointerRegClass #156880

Merged

phoebewang reviewed Sep 9, 2025

View reviewed changes

arsenm commented Sep 9, 2025

View reviewed changes

llvm/lib/Target/X86/X86RegisterInfo.cpp Outdated Show resolved Hide resolved

Update llvm/lib/Target/X86/X86RegisterInfo.cpp

8b22709

phoebewang approved these changes Sep 9, 2025

View reviewed changes

arsenm enabled auto-merge (squash) September 9, 2025 03:35

arsenm merged commit 6076b07 into main Sep 9, 2025
9 checks passed

arsenm deleted the users/arsenm/x86/use-liveregunits-findDeadCallerSavedReg branch September 9, 2025 03:52

phoebewang mentioned this pull request Sep 10, 2025

[X86][LiveRegUnits] Exclude reserved registers from TargetRegisterClass #157798

Merged

phoebewang added a commit that referenced this pull request Sep 10, 2025

[X86][LiveRegUnits] Exclude reserved registers from TargetRegisterCla…

253d18d

…ss (#157798) Fixes regression casued by #156817.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

X86: Use LiveRegUnits in findDeadCallerSavedReg #156817

X86: Use LiveRegUnits in findDeadCallerSavedReg #156817

Uh oh!

arsenm commented Sep 4, 2025 •

edited

Loading

Uh oh!

arsenm commented Sep 4, 2025 •

edited

Loading

Uh oh!

llvmbot commented Sep 4, 2025

Uh oh!

Uh oh!

arsenm commented Sep 8, 2025

Uh oh!

phoebewang Sep 9, 2025

Uh oh!

arsenm Sep 9, 2025

Uh oh!

phoebewang Sep 9, 2025

Uh oh!

Uh oh!

phoebewang left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

X86: Use LiveRegUnits in findDeadCallerSavedReg #156817

X86: Use LiveRegUnits in findDeadCallerSavedReg #156817

Uh oh!

Conversation

arsenm commented Sep 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arsenm commented Sep 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Sep 4, 2025

Uh oh!

Uh oh!

arsenm commented Sep 8, 2025

Uh oh!

phoebewang Sep 9, 2025

Choose a reason for hiding this comment

Uh oh!

arsenm Sep 9, 2025

Choose a reason for hiding this comment

Uh oh!

phoebewang Sep 9, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

phoebewang left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

arsenm commented Sep 4, 2025 •

edited

Loading

arsenm commented Sep 4, 2025 •

edited

Loading