@KRM7
Contributor

@KRM7 KRM7 commented Aug 4, 2025

Currently, when an instruction rematerialized by the register coalescer defines more subregs of the destination register
than the original COPY instruction did, we only add dead defs for the newly defined subregs if they were not defined anywhere
else. For example, consider something like this before rematerialization:

 %0:reg64 = CONSTANT 1
 %1:reg128.sub_lo64_lo32 = COPY %0.lo32
 %1:reg128.sub_lo64_hi32 = ...
 ...

that would look like this after rematerializing %0:

 %0:reg64 = CONSTANT 1
 %1:reg128.sub_lo64 = CONSTANT 1
 %1:reg128.sub_lo64_hi32 = ...
 ...

A dead def would not be added for %1.sub_lo64_hi32 at the 2nd instruction because its subrange wasn't empty beforehand (it is defined by the 3rd instruction), even though that lane is not live at the rematerialized def.
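To illustrate the difference between the two conditions, here is a minimal standalone sketch. It does not use LLVM's `LiveRange` API: `ToySubRange`, `addDeadDefOld`, `addDeadDefNew` and `checkExample` are hypothetical names, instruction indices are plain integers, the live segment `[2, 5)` for the hi32 lane is an assumed value, and a dead def is modeled as a one-slot segment.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a live subrange: a list of half-open [Start, End)
// segments over instruction indices. Illustration only, not llvm::LiveRange.
struct ToySubRange {
  struct Segment {
    unsigned Start, End;
  };
  std::vector<Segment> Segments;

  // True when the lane has no live segments anywhere in the function.
  bool empty() const { return Segments.empty(); }

  // True when some segment covers the given instruction index.
  bool liveAt(unsigned Idx) const {
    for (const Segment &S : Segments)
      if (S.Start <= Idx && Idx < S.End)
        return true;
    return false;
  }

  // Model a dead def as a segment that covers only the defining slot.
  void createDeadDef(unsigned Idx) { Segments.push_back({Idx, Idx + 1}); }
};

// Old condition: add a dead def only if the lane is not live anywhere.
// Returns true when a dead def was added.
bool addDeadDefOld(ToySubRange &SR, unsigned DefIdx) {
  if (!SR.empty())
    return false;
  SR.createDeadDef(DefIdx);
  return true;
}

// New condition: add a dead def whenever the lane is not live at the def.
bool addDeadDefNew(ToySubRange &SR, unsigned DefIdx) {
  if (SR.liveAt(DefIdx))
    return false;
  SR.createDeadDef(DefIdx);
  return true;
}

// Scenario from the description: the rematerialized def sits at index 1 and
// the hi32 lane is first defined at index 2 (assumed live until index 5).
void checkExample() {
  const unsigned RematIdx = 1;

  ToySubRange Old;
  Old.Segments.push_back({2, 5});
  assert(!addDeadDefOld(Old, RematIdx)); // skipped: the subrange is not empty
  assert(!Old.liveAt(RematIdx));         // the lane def at index 1 is missed

  ToySubRange New;
  New.Segments.push_back({2, 5});
  assert(addDeadDefNew(New, RematIdx));  // added: the lane is not live here
  assert(New.liveAt(RematIdx));
}
```

Calling `checkExample()` exercises both variants on the same starting subrange. Only the `liveAt`-style check records the def at the rematerialized instruction.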

@KRM7 KRM7 marked this pull request as ready for review August 5, 2025 08:49
@KRM7
Contributor Author

KRM7 commented Aug 5, 2025

@qcolombet @arsenm

@llvmbot
Member

llvmbot commented Aug 5, 2025

@llvm/pr-subscribers-llvm-regalloc

@llvm/pr-subscribers-backend-systemz

Author: None (KRM7)

Full diff: https://github.com/llvm/llvm-project/pull/151974.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/RegisterCoalescer.cpp (+5-5)
  • (modified) llvm/test/CodeGen/SystemZ/regcoal-subranges-update-remat.mir (+25)
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index 2d7987a2e1988..514f2f02d6425 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -1624,11 +1624,11 @@ bool RegisterCoalescer::reMaterializeTrivialDef(const CoalescerPair &CP,
           UpdatedSubRanges = true;
         } else {
           // We know that this lane is defined by this instruction,
-          // but at this point it may be empty because it is not used by
-          // anything. This happens when updateRegDefUses adds the missing
-          // lanes. Assign that lane a dead def so that the interferences
-          // are properly modeled.
-          if (SR.empty())
+          // but at this point it might not be live because it was not defined
+          // by the original instruction. This happens when the
+          // rematerialization widens the defined register. Assign that lane a
+          // dead def so that the interferences are properly modeled.
+          if (!SR.liveAt(DefIndex))
             SR.createDeadDef(DefIndex, Alloc);
         }
       }
diff --git a/llvm/test/CodeGen/SystemZ/regcoal-subranges-update-remat.mir b/llvm/test/CodeGen/SystemZ/regcoal-subranges-update-remat.mir
index e3207df799449..d9fe810e23b3a 100644
--- a/llvm/test/CodeGen/SystemZ/regcoal-subranges-update-remat.mir
+++ b/llvm/test/CodeGen/SystemZ/regcoal-subranges-update-remat.mir
@@ -43,3 +43,28 @@ body:             |
     %3:gr32bit = COPY killed %1
     Return implicit %3
 ...
+
+---
+name:            test_dead_at_remat_later_defined
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    ; CHECK-LABEL: name: test_dead_at_remat_later_defined
+    ; CHECK: undef [[LHI:%[0-9]+]].subreg_l32:gr128bit = LHI 0
+    ; CHECK-NEXT: [[LHI:%[0-9]+]].subreg_l64:gr128bit = LGHI 2
+    ; CHECK-NEXT: [[LHI1:%[0-9]+]]:gr32bit = LHI 1
+    ; CHECK-NEXT: [[LHI:%[0-9]+]].subreg_lh32:gr128bit = COPY [[LHI1]]
+    ; CHECK-NEXT: [[LGHI:%[0-9]+]]:gr64bit = LGHI 2
+    ; CHECK-NEXT: [[LHI:%[0-9]+]].subreg_h32:gr128bit = COPY [[LGHI]].subreg_l32
+    ; CHECK-NEXT: $r0q = COPY [[LHI]]
+    ; CHECK-NEXT: $r4d = COPY [[LGHI]].subreg_h32
+    %0:gr64bit = LGHI 2
+    %1:gr32bit = LHI 0
+    %2:gr32bit = LHI 1
+    undef %3.subreg_ll32:gr128bit = COPY %0.subreg_l32
+    %3.subreg_lh32:gr128bit = COPY %2
+    %3.subreg_l32:gr128bit = COPY %1
+    %3.subreg_h32:gr128bit = COPY %0.subreg_l32
+    $r0q = COPY %3
+    $r4d = COPY %0.subreg_h32
+...

@arsenm arsenm merged commit ee47427 into llvm:main Aug 5, 2025
15 checks passed