AMDGPU/UniformityAnalysis: fix G_ZEXTLOAD and G_SEXTLOAD #157845
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking.
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-backend-amdgpu

Author: Petar Avramovic (petar-avramovic)

Changes: Use the same rules for G_ZEXTLOAD and G_SEXTLOAD as for G_LOAD.

Full diff: https://github.com/llvm/llvm-project/pull/157845.diff

2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 5c958dfe6954f..398c99b3bd127 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -10281,7 +10281,7 @@ unsigned SIInstrInfo::getInstrLatency(const InstrItineraryData *ItinData,
InstructionUniformity
SIInstrInfo::getGenericInstructionUniformity(const MachineInstr &MI) const {
const MachineRegisterInfo &MRI = MI.getMF()->getRegInfo();
- unsigned opcode = MI.getOpcode();
+ unsigned Opcode = MI.getOpcode();
auto HandleAddrSpaceCast = [this, &MRI](const MachineInstr &MI) {
Register Dst = MI.getOperand(0).getReg();
@@ -10301,7 +10301,7 @@ SIInstrInfo::getGenericInstructionUniformity(const MachineInstr &MI) const {
// If the target supports globally addressable scratch, the mapping from
// scratch memory to the flat aperture changes therefore an address space cast
// is no longer uniform.
- if (opcode == TargetOpcode::G_ADDRSPACE_CAST)
+ if (Opcode == TargetOpcode::G_ADDRSPACE_CAST)
return HandleAddrSpaceCast(MI);
if (auto *GI = dyn_cast<GIntrinsic>(&MI)) {
@@ -10329,7 +10329,8 @@ SIInstrInfo::getGenericInstructionUniformity(const MachineInstr &MI) const {
//
// All other loads are not divergent, because if threads issue loads with the
// same arguments, they will always get the same result.
- if (opcode == AMDGPU::G_LOAD) {
+ if (Opcode == AMDGPU::G_LOAD || Opcode == AMDGPU::G_ZEXTLOAD ||
+ Opcode == AMDGPU::G_SEXTLOAD) {
if (MI.memoperands_empty())
return InstructionUniformity::NeverUniform; // conservative assumption
@@ -10343,10 +10344,10 @@ SIInstrInfo::getGenericInstructionUniformity(const MachineInstr &MI) const {
return InstructionUniformity::Default;
}
- if (SIInstrInfo::isGenericAtomicRMWOpcode(opcode) ||
- opcode == AMDGPU::G_ATOMIC_CMPXCHG ||
- opcode == AMDGPU::G_ATOMIC_CMPXCHG_WITH_SUCCESS ||
- AMDGPU::isGenericAtomic(opcode)) {
+ if (SIInstrInfo::isGenericAtomicRMWOpcode(Opcode) ||
+ Opcode == AMDGPU::G_ATOMIC_CMPXCHG ||
+ Opcode == AMDGPU::G_ATOMIC_CMPXCHG_WITH_SUCCESS ||
+ AMDGPU::isGenericAtomic(Opcode)) {
return InstructionUniformity::NeverUniform;
}
return InstructionUniformity::Default;
diff --git a/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/loads-gmir.mir b/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/loads-gmir.mir
index cb3c2de5b8753..d799cd2057f47 100644
--- a/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/loads-gmir.mir
+++ b/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/loads-gmir.mir
@@ -46,13 +46,13 @@ body: |
%6:_(p5) = G_IMPLICIT_DEF
; Atomic load
- ; CHECK-NOT: DIVERGENT
-
+ ; CHECK: DIVERGENT
+ ; CHECK-SAME: G_ZEXTLOAD
%0:_(s32) = G_ZEXTLOAD %1(p0) :: (load seq_cst (s16) from `ptr undef`)
; flat load
- ; CHECK-NOT: DIVERGENT
-
+ ; CHECK: DIVERGENT
+ ; CHECK-SAME: G_ZEXTLOAD
%2:_(s32) = G_ZEXTLOAD %1(p0) :: (load (s16) from `ptr undef`)
; Gloabal load
@@ -60,7 +60,8 @@ body: |
%3:_(s32) = G_ZEXTLOAD %4(p1) :: (load (s16) from `ptr addrspace(1) undef`, addrspace 1)
; Private load
- ; CHECK-NOT: DIVERGENT
+ ; CHECK: DIVERGENT
+ ; CHECK-SAME: G_ZEXTLOAD
%5:_(s32) = G_ZEXTLOAD %6(p5) :: (volatile load (s16) from `ptr addrspace(5) undef`, addrspace 5)
G_STORE %2(s32), %4(p1) :: (volatile store (s32) into `ptr addrspace(1) undef`, addrspace 1)
G_STORE %3(s32), %4(p1) :: (volatile store (s32) into `ptr addrspace(1) undef`, addrspace 1)
@@ -80,11 +81,13 @@ body: |
%6:_(p5) = G_IMPLICIT_DEF
; Atomic load
- ; CHECK-NOT: DIVERGENT
+ ; CHECK: DIVERGENT
+ ; CHECK-SAME: G_SEXTLOAD
%0:_(s32) = G_SEXTLOAD %1(p0) :: (load seq_cst (s16) from `ptr undef`)
; flat load
- ; CHECK-NOT: DIVERGENT
+ ; CHECK: DIVERGENT
+ ; CHECK-SAME: G_SEXTLOAD
%2:_(s32) = G_SEXTLOAD %1(p0) :: (load (s16) from `ptr undef`)
; Gloabal load
@@ -92,7 +95,8 @@ body: |
%3:_(s32) = G_SEXTLOAD %4(p1) :: (load (s16) from `ptr addrspace(1) undef`, addrspace 1)
; Private load
- ; CHECK-NOT: DIVERGENT
+ ; CHECK: DIVERGENT
+ ; CHECK-SAME: G_SEXTLOAD
%5:_(s32) = G_SEXTLOAD %6(p5) :: (volatile load (s16) from `ptr addrspace(5) undef`, addrspace 5)
G_STORE %2(s32), %4(p1) :: (volatile store (s32) into `ptr addrspace(1) undef`, addrspace 1)
G_STORE %3(s32), %4(p1) :: (volatile store (s32) into `ptr addrspace(1) undef`, addrspace 1)
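For readers outside the backend, the rule this hunk implements can be modeled with a small, self-contained sketch. This is not the LLVM code (the real check lives in SIInstrInfo::getGenericInstructionUniformity and inspects the MachineMemOperands of the instruction); the enum values and the helper name below are illustrative stand-ins only.

```cpp
#include <cstdio>
#include <initializer_list>

// Simplified stand-ins for the real opcode / address-space / result enums.
enum class Opc { G_LOAD, G_ZEXTLOAD, G_SEXTLOAD, G_STORE };
enum class AddrSpace { Flat = 0, Global = 1, Private = 5 };
enum class Uniformity { Default, NeverUniform };

// After the patch, zero- and sign-extending loads are classified exactly like
// G_LOAD: a load whose memory operand is in the flat (0) or private (5)
// address space is never uniform; everything else keeps the default rule.
Uniformity classifyGenericLoad(Opc Opcode,
                               std::initializer_list<AddrSpace> MemOperands) {
  if (Opcode != Opc::G_LOAD && Opcode != Opc::G_ZEXTLOAD &&
      Opcode != Opc::G_SEXTLOAD)
    return Uniformity::Default;
  if (MemOperands.size() == 0)
    return Uniformity::NeverUniform; // conservative, as in the real code
  for (AddrSpace AS : MemOperands)
    if (AS == AddrSpace::Flat || AS == AddrSpace::Private)
      return Uniformity::NeverUniform;
  return Uniformity::Default;
}

int main() {
  auto Name = [](Uniformity U) {
    return U == Uniformity::NeverUniform ? "NeverUniform" : "Default";
  };
  // Mirrors the cases exercised in loads-gmir.mir: flat and private extending
  // loads are reported divergent, global ones are not.
  std::printf("flat    G_ZEXTLOAD -> %s\n",
              Name(classifyGenericLoad(Opc::G_ZEXTLOAD, {AddrSpace::Flat})));
  std::printf("global  G_SEXTLOAD -> %s\n",
              Name(classifyGenericLoad(Opc::G_SEXTLOAD, {AddrSpace::Global})));
  std::printf("private G_SEXTLOAD -> %s\n",
              Name(classifyGenericLoad(Opc::G_SEXTLOAD, {AddrSpace::Private})));
  return 0;
}
```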
Force-pushed bfa8de6 to bb121e2.
Force-pushed 8573b17 to f426257.
Force-pushed f426257 to 310a7b6.
Merge activity
Use the same rules for G_ZEXTLOAD and G_SEXTLOAD as for G_LOAD. Flat addrspace(0) and private addrspace(5) G_ZEXTLOAD and G_SEXTLOAD should always be divergent.
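The reason flat and private loads get this treatment, restating the comment in SIInstrInfo.cpp, is that threads can execute the same load with the same address and still get different results, because scratch memory is per lane. The toy model below is purely illustrative (the buffer names, sizes, and two-lane setup are made up for the example) and is not how the backend models memory; it only demonstrates why such a load can never be assumed uniform.

```cpp
#include <array>
#include <cstdio>

// Toy model: two lanes of a wavefront, one shared global buffer and one
// private (scratch) buffer per lane.
constexpr int NumLanes = 2;
std::array<int, 4> GlobalMem = {10, 11, 12, 13};
std::array<std::array<int, 4>, NumLanes> PrivateMem = {{{1, 2, 3, 4},
                                                        {5, 6, 7, 8}}};

int main() {
  const int Addr = 2; // every lane loads from the same address
  // Global load: same address, same backing store -> the same value in every
  // lane, so the result can be uniform.
  for (int Lane = 0; Lane < NumLanes; ++Lane)
    std::printf("lane %d global  load: %d\n", Lane, GlobalMem[Addr]);
  // Private load: same address, but each lane reads its own scratch memory,
  // so identical inputs can produce different results -> never uniform.
  for (int Lane = 0; Lane < NumLanes; ++Lane)
    std::printf("lane %d private load: %d\n", Lane, PrivateMem[Lane][Addr]);
  return 0;
}
```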