
Conversation

jyknight
Member

@jyknight jyknight commented Oct 7, 2025

These passes call getSubtarget<GCNSubtarget>, which doesn't work on R600 targets, as those use an R600Subtarget type instead. Unfortunately, TargetMachine::getSubtarget<ST> does an unchecked static_cast to ST&, which makes it easy for this error to go undetected.

The modifications here were verified by running check-llvm with an assert added to getSubtarget. However, that assert requires RTTI to be enabled, which LLVM doesn't use, so I've reverted the assert before sending this fix upstream.

These errors have been present for some time, but were detected after #162040 caused an uninitialized memory read to be reported by asan/msan.

@llvmbot
Member

llvmbot commented Oct 7, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: James Y Knight (jyknight)

Changes

Unfortunately, TargetMachine::getSubtarget<ST> does an unchecked static_cast to ST&, which makes it easy to get wrong.

The modifications here were created by running check-llvm with an assert added to getSubtarget. However, that assert requires RTTI to be enabled, which LLVM doesn't use, so I've reverted the assert before sending this fix upstream.

These errors have been present for some time, but were detected after #162040 caused an uninitialized memory read to be reported by asan/msan.


Full diff: https://github.com/llvm/llvm-project/pull/162207.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp (+16-10)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUImageIntrinsicOptimizer.cpp (+1-1)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
index cb49936871e74..f6186f1fae777 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
@@ -162,12 +162,16 @@ class AMDGPUInformationCache : public InformationCache {
 
   /// Check if the subtarget has aperture regs.
   bool hasApertureRegs(Function &F) {
+    if (!TM.getTargetTriple().isAMDGCN())
+      return false;
     const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
     return ST.hasApertureRegs();
   }
 
   /// Check if the subtarget supports GetDoorbellID.
   bool supportsGetDoorbellID(Function &F) {
+    if (!TM.getTargetTriple().isAMDGCN())
+      return false;
     const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
     return ST.supportsGetDoorbellID();
   }
@@ -182,18 +186,18 @@ class AMDGPUInformationCache : public InformationCache {
 
   std::pair<unsigned, unsigned>
   getDefaultFlatWorkGroupSize(const Function &F) const {
-    const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
+    const AMDGPUSubtarget &ST = AMDGPUSubtarget::get(TM, F);
     return ST.getDefaultFlatWorkGroupSize(F.getCallingConv());
   }
 
   std::pair<unsigned, unsigned>
   getMaximumFlatWorkGroupRange(const Function &F) {
-    const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
+    const AMDGPUSubtarget &ST = AMDGPUSubtarget::get(TM, F);
     return {ST.getMinFlatWorkGroupSize(), ST.getMaxFlatWorkGroupSize()};
   }
 
   SmallVector<unsigned> getMaxNumWorkGroups(const Function &F) {
-    const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
+    const AMDGPUSubtarget &ST = AMDGPUSubtarget::get(TM, F);
     return ST.getMaxNumWorkGroups(F);
   }
 
@@ -206,7 +210,7 @@ class AMDGPUInformationCache : public InformationCache {
   std::pair<unsigned, unsigned>
   getWavesPerEU(const Function &F,
                 std::pair<unsigned, unsigned> FlatWorkGroupSize) {
-    const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
+    const AMDGPUSubtarget &ST = AMDGPUSubtarget::get(TM, F);
     return ST.getWavesPerEU(FlatWorkGroupSize, getLDSSize(F), F);
   }
 
@@ -217,7 +221,7 @@ class AMDGPUInformationCache : public InformationCache {
     if (!Val)
       return std::nullopt;
     if (!Val->second) {
-      const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
+      const AMDGPUSubtarget &ST = AMDGPUSubtarget::get(TM, F);
       Val->second = ST.getMaxWavesPerEU();
     }
     return std::make_pair(Val->first, *(Val->second));
@@ -227,13 +231,13 @@ class AMDGPUInformationCache : public InformationCache {
   getEffectiveWavesPerEU(const Function &F,
                          std::pair<unsigned, unsigned> WavesPerEU,
                          std::pair<unsigned, unsigned> FlatWorkGroupSize) {
-    const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
+    const AMDGPUSubtarget &ST = AMDGPUSubtarget::get(TM, F);
     return ST.getEffectiveWavesPerEU(WavesPerEU, FlatWorkGroupSize,
                                      getLDSSize(F));
   }
 
   unsigned getMaxWavesPerEU(const Function &F) {
-    const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
+    const AMDGPUSubtarget &ST = AMDGPUSubtarget::get(TM, F);
     return ST.getMaxWavesPerEU();
   }
 
@@ -1511,9 +1515,11 @@ static bool runImpl(Module &M, AnalysisGetter &AG, TargetMachine &TM,
       A.getOrCreateAAFor<AAAMDWavesPerEU>(IRPosition::function(*F));
     }
 
-    const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(*F);
-    if (!F->isDeclaration() && ST.hasClusters())
-      A.getOrCreateAAFor<AAAMDGPUClusterDims>(IRPosition::function(*F));
+    if (TM.getTargetTriple().isAMDGCN()) {
+      const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(*F);
+      if (!F->isDeclaration() && ST.hasClusters())
+        A.getOrCreateAAFor<AAAMDGPUClusterDims>(IRPosition::function(*F));
+    }
 
     for (auto &I : instructions(F)) {
       Value *Ptr = nullptr;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUImageIntrinsicOptimizer.cpp b/llvm/lib/Target/AMDGPU/AMDGPUImageIntrinsicOptimizer.cpp
index 639089c75a33e..155c7dad904c5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUImageIntrinsicOptimizer.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUImageIntrinsicOptimizer.cpp
@@ -281,7 +281,7 @@ bool optimizeSection(ArrayRef<SmallVector<IntrinsicInst *, 4>> MergeableInsts) {
 }
 
 static bool imageIntrinsicOptimizerImpl(Function &F, const TargetMachine *TM) {
-  if (!TM)
+  if (!TM || !TM->getTargetTriple().isAMDGCN())
     return false;
 
   // This optimization only applies to GFX11 and beyond.

Contributor

@arsenm arsenm left a comment


Instead we should just not add the pass for r600

@jyknight
Member Author

jyknight commented Oct 7, 2025

AMDGPUImageIntrinsicOptimizer: OK, will change that one.

But are you saying AMDGPUAttributor also shouldn't run on R600? I can change it if you like, but naively the pass seems like it's doing things that do apply to R600 too, and has some code explicitly for r600 intrinsics.

@shiltian
Contributor

shiltian commented Oct 7, 2025

I'm not sure how much longer we still want to support r600.

@jyknight
Member Author

jyknight commented Oct 7, 2025

I'm not sure how much longer we still want to support r600.

I have no opinion on that. But surely it should not be invoking UB in the test suite while it's still included, right?

@shiltian
Contributor

shiltian commented Oct 7, 2025

That's right. However, I'm not sure if we instead want to disable AMDGPUAttributor for r600.

@jyknight
Member Author

jyknight commented Oct 7, 2025

Ah, got it -- another vote for skipping the pass on r600 instead of making it work -- done!

if (!isLTOPreLink(Phase)) {
  AMDGPUAttributorOptions Opts;
  MPM.addPass(AMDGPUAttributorPass(*this, Opts, Phase));
if (getTargetTriple().isAMDGCN()) {
Contributor


I'd put this check into the callback, closely wrapping the attributor pass, since we might want to add more passes at this EP.

Member Author


Done.

@jyknight jyknight changed the title AMDGPU: don't call getSubtarget<GCNSubtarget> on R600 targets. AMDGPU: skip AMDGPUAttributor and AMDGPUImageIntrinsicOptimizerPass on R600 targets. Oct 7, 2025
@jyknight jyknight changed the title AMDGPU: skip AMDGPUAttributor and AMDGPUImageIntrinsicOptimizerPass on R600 targets. AMDGPU: skip AMDGPUAttributor and AMDGPUImageIntrinsicOptimizerPass on R600. Oct 7, 2025
@jyknight jyknight enabled auto-merge (squash) October 7, 2025 16:12
@jyknight jyknight merged commit 2f4275b into llvm:main Oct 7, 2025
9 checks passed
@jyknight jyknight deleted the fix-r600 branch October 7, 2025 16:18