Skip to content

Conversation

@rampitec
Copy link
Collaborator

If a wavefrontsize32 or wavefrontsize64 is the only possible value
insert it into feature list by default and use that value as an
indication that another wavefront size is not legal.

Copy link
Collaborator Author

rampitec commented Aug 21, 2025

@rampitec rampitec requested review from arsenm and yxsamliu August 21, 2025 22:00
@rampitec rampitec marked this pull request as ready for review August 21, 2025 22:01
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Aug 21, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 21, 2025

@llvm/pr-subscribers-clang

Author: Stanislav Mekhanoshin (rampitec)

Changes

If a wavefrontsize32 or wavefrontsize64 is the only possible value
insert it into feature list by default and use that value as an
indication that another wavefront size is not legal.


Full diff: https://github.com/llvm/llvm-project/pull/154850.diff

3 Files Affected:

  • (modified) clang/lib/Basic/Targets/AMDGPU.cpp (+2-2)
  • (modified) clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl (+2)
  • (modified) llvm/lib/TargetParser/TargetParser.cpp (+22-49)
diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index 639e735202f26..dba1bab88c481 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -197,11 +197,11 @@ bool AMDGPUTargetInfo::initFeatureMap(
     const std::vector<std::string> &FeatureVec) const {
 
   using namespace llvm::AMDGPU;
-  fillAMDGPUFeatureMap(CPU, getTriple(), Features);
+
   if (!TargetInfo::initFeatureMap(Features, Diags, CPU, FeatureVec))
     return false;
 
-  // TODO: Should move this logic into TargetParser
+    // TODO: Should move this logic into TargetParser
   auto HasError = insertWaveSizeFeature(CPU, getTriple(), Features);
   switch (HasError.first) {
   default:
diff --git a/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl b/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
index 4e2f7f86e8402..04de5dca3f6c0 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
@@ -1,8 +1,10 @@
 // RUN: not %clang_cc1 -triple amdgcn -target-feature +wavefrontsize32 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s
 // RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx1103 -target-feature +wavefrontsize32 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s
 // RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx900 -target-feature +wavefrontsize32 -o /dev/null %s 2>&1 | FileCheck %s --check-prefix=GFX9
+// RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx1250 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s --check-prefix=GFX1250
 
 // CHECK: error: invalid feature combination: 'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive
 // GFX9: error: option 'wavefrontsize32' cannot be specified on this target
+// GFX1250: error: option 'wavefrontsize64' cannot be specified on this target
 
 kernel void test() {}
diff --git a/llvm/lib/TargetParser/TargetParser.cpp b/llvm/lib/TargetParser/TargetParser.cpp
index 50b97d3257540..91ac4cc71d34b 100644
--- a/llvm/lib/TargetParser/TargetParser.cpp
+++ b/llvm/lib/TargetParser/TargetParser.cpp
@@ -471,6 +471,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["setprio-inc-wg-inst"] = true;
       Features["atomic-fmin-fmax-global-f32"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize32"] = true;
       break;
     case GK_GFX1201:
     case GK_GFX1200:
@@ -638,6 +639,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["gws"] = true;
       Features["vmem-to-lds-load-insts"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_GFX90A:
       Features["gfx90a-insts"] = true;
@@ -681,6 +683,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["image-insts"] = true;
       Features["s-memtime-inst"] = true;
       Features["gws"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_GFX705:
     case GK_GFX704:
@@ -698,6 +701,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["gws"] = true;
       Features["atomic-fmin-fmax-global-f32"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_NONE:
       break;
@@ -734,68 +738,37 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
   }
 }
 
-static bool isWave32Capable(StringRef GPU, const Triple &T) {
-  bool IsWave32Capable = false;
-  // XXX - What does the member GPU mean if device name string passed here?
-  if (T.isAMDGCN()) {
-    switch (parseArchAMDGCN(GPU)) {
-    case GK_GFX1250:
-    case GK_GFX1201:
-    case GK_GFX1200:
-    case GK_GFX1153:
-    case GK_GFX1152:
-    case GK_GFX1151:
-    case GK_GFX1150:
-    case GK_GFX1103:
-    case GK_GFX1102:
-    case GK_GFX1101:
-    case GK_GFX1100:
-    case GK_GFX1036:
-    case GK_GFX1035:
-    case GK_GFX1034:
-    case GK_GFX1033:
-    case GK_GFX1032:
-    case GK_GFX1031:
-    case GK_GFX1030:
-    case GK_GFX1012:
-    case GK_GFX1011:
-    case GK_GFX1013:
-    case GK_GFX1010:
-    case GK_GFX12_GENERIC:
-    case GK_GFX11_GENERIC:
-    case GK_GFX10_3_GENERIC:
-    case GK_GFX10_1_GENERIC:
-      IsWave32Capable = true;
-      break;
-    default:
-      break;
-    }
-  }
-  return IsWave32Capable;
-}
-
 std::pair<FeatureError, StringRef>
 AMDGPU::insertWaveSizeFeature(StringRef GPU, const Triple &T,
                               StringMap<bool> &Features) {
-  bool IsWave32Capable = isWave32Capable(GPU, T);
+  StringMap<bool> DefaultFeatures;
+  fillAMDGPUFeatureMap(GPU, T, DefaultFeatures);
+
   const bool IsNullGPU = GPU.empty();
+  const bool TargetHasWave32 = DefaultFeatures.count("wavefrontsize32");
+  const bool TargetHasWave64 = DefaultFeatures.count("wavefrontsize64");
   const bool HaveWave32 = Features.count("wavefrontsize32");
   const bool HaveWave64 = Features.count("wavefrontsize64");
   if (HaveWave32 && HaveWave64) {
     return {AMDGPU::INVALID_FEATURE_COMBINATION,
             "'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive"};
   }
-  if (HaveWave32 && !IsNullGPU && !IsWave32Capable) {
+  if (HaveWave32 && !IsNullGPU && TargetHasWave64) {
     return {AMDGPU::UNSUPPORTED_TARGET_FEATURE, "wavefrontsize32"};
   }
+  if (HaveWave64 && !IsNullGPU && TargetHasWave32) {
+    return {AMDGPU::UNSUPPORTED_TARGET_FEATURE, "wavefrontsize64"};
+  }
   // Don't assume any wavesize with an unknown subtarget.
-  if (!IsNullGPU) {
-    // Default to wave32 if available, or wave64 if not
-    if (!HaveWave32 && !HaveWave64) {
-      StringRef DefaultWaveSizeFeature =
-          IsWave32Capable ? "wavefrontsize32" : "wavefrontsize64";
-      Features.insert(std::make_pair(DefaultWaveSizeFeature, true));
-    }
+  // Default to wave32 is target supports both.
+  if (!IsNullGPU && !HaveWave32 && !HaveWave64 && !TargetHasWave32 &&
+      !TargetHasWave64)
+    Features.insert(std::make_pair("wavefrontsize32", true));
+
+  for (const auto &Entry : DefaultFeatures) {
+    if (!Features.count(Entry.getKey()))
+      Features[Entry.getKey()] = Entry.getValue();
   }
+
   return {NO_ERROR, StringRef()};
 }

@llvmbot
Copy link
Member

llvmbot commented Aug 21, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

If a wavefrontsize32 or wavefrontsize64 is the only possible value
insert it into feature list by default and use that value as an
indication that another wavefront size is not legal.


Full diff: https://github.com/llvm/llvm-project/pull/154850.diff

3 Files Affected:

  • (modified) clang/lib/Basic/Targets/AMDGPU.cpp (+2-2)
  • (modified) clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl (+2)
  • (modified) llvm/lib/TargetParser/TargetParser.cpp (+22-49)
diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index 639e735202f26..dba1bab88c481 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -197,11 +197,11 @@ bool AMDGPUTargetInfo::initFeatureMap(
     const std::vector<std::string> &FeatureVec) const {
 
   using namespace llvm::AMDGPU;
-  fillAMDGPUFeatureMap(CPU, getTriple(), Features);
+
   if (!TargetInfo::initFeatureMap(Features, Diags, CPU, FeatureVec))
     return false;
 
-  // TODO: Should move this logic into TargetParser
+    // TODO: Should move this logic into TargetParser
   auto HasError = insertWaveSizeFeature(CPU, getTriple(), Features);
   switch (HasError.first) {
   default:
diff --git a/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl b/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
index 4e2f7f86e8402..04de5dca3f6c0 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
@@ -1,8 +1,10 @@
 // RUN: not %clang_cc1 -triple amdgcn -target-feature +wavefrontsize32 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s
 // RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx1103 -target-feature +wavefrontsize32 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s
 // RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx900 -target-feature +wavefrontsize32 -o /dev/null %s 2>&1 | FileCheck %s --check-prefix=GFX9
+// RUN: not %clang_cc1 -triple amdgcn -target-cpu gfx1250 -target-feature +wavefrontsize64 -o /dev/null %s 2>&1 | FileCheck %s --check-prefix=GFX1250
 
 // CHECK: error: invalid feature combination: 'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive
 // GFX9: error: option 'wavefrontsize32' cannot be specified on this target
+// GFX1250: error: option 'wavefrontsize64' cannot be specified on this target
 
 kernel void test() {}
diff --git a/llvm/lib/TargetParser/TargetParser.cpp b/llvm/lib/TargetParser/TargetParser.cpp
index 50b97d3257540..91ac4cc71d34b 100644
--- a/llvm/lib/TargetParser/TargetParser.cpp
+++ b/llvm/lib/TargetParser/TargetParser.cpp
@@ -471,6 +471,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["setprio-inc-wg-inst"] = true;
       Features["atomic-fmin-fmax-global-f32"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize32"] = true;
       break;
     case GK_GFX1201:
     case GK_GFX1200:
@@ -638,6 +639,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["gws"] = true;
       Features["vmem-to-lds-load-insts"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_GFX90A:
       Features["gfx90a-insts"] = true;
@@ -681,6 +683,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["image-insts"] = true;
       Features["s-memtime-inst"] = true;
       Features["gws"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_GFX705:
     case GK_GFX704:
@@ -698,6 +701,7 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
       Features["gws"] = true;
       Features["atomic-fmin-fmax-global-f32"] = true;
       Features["atomic-fmin-fmax-global-f64"] = true;
+      Features["wavefrontsize64"] = true;
       break;
     case GK_NONE:
       break;
@@ -734,68 +738,37 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const Triple &T,
   }
 }
 
-static bool isWave32Capable(StringRef GPU, const Triple &T) {
-  bool IsWave32Capable = false;
-  // XXX - What does the member GPU mean if device name string passed here?
-  if (T.isAMDGCN()) {
-    switch (parseArchAMDGCN(GPU)) {
-    case GK_GFX1250:
-    case GK_GFX1201:
-    case GK_GFX1200:
-    case GK_GFX1153:
-    case GK_GFX1152:
-    case GK_GFX1151:
-    case GK_GFX1150:
-    case GK_GFX1103:
-    case GK_GFX1102:
-    case GK_GFX1101:
-    case GK_GFX1100:
-    case GK_GFX1036:
-    case GK_GFX1035:
-    case GK_GFX1034:
-    case GK_GFX1033:
-    case GK_GFX1032:
-    case GK_GFX1031:
-    case GK_GFX1030:
-    case GK_GFX1012:
-    case GK_GFX1011:
-    case GK_GFX1013:
-    case GK_GFX1010:
-    case GK_GFX12_GENERIC:
-    case GK_GFX11_GENERIC:
-    case GK_GFX10_3_GENERIC:
-    case GK_GFX10_1_GENERIC:
-      IsWave32Capable = true;
-      break;
-    default:
-      break;
-    }
-  }
-  return IsWave32Capable;
-}
-
 std::pair<FeatureError, StringRef>
 AMDGPU::insertWaveSizeFeature(StringRef GPU, const Triple &T,
                               StringMap<bool> &Features) {
-  bool IsWave32Capable = isWave32Capable(GPU, T);
+  StringMap<bool> DefaultFeatures;
+  fillAMDGPUFeatureMap(GPU, T, DefaultFeatures);
+
   const bool IsNullGPU = GPU.empty();
+  const bool TargetHasWave32 = DefaultFeatures.count("wavefrontsize32");
+  const bool TargetHasWave64 = DefaultFeatures.count("wavefrontsize64");
   const bool HaveWave32 = Features.count("wavefrontsize32");
   const bool HaveWave64 = Features.count("wavefrontsize64");
   if (HaveWave32 && HaveWave64) {
     return {AMDGPU::INVALID_FEATURE_COMBINATION,
             "'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive"};
   }
-  if (HaveWave32 && !IsNullGPU && !IsWave32Capable) {
+  if (HaveWave32 && !IsNullGPU && TargetHasWave64) {
     return {AMDGPU::UNSUPPORTED_TARGET_FEATURE, "wavefrontsize32"};
   }
+  if (HaveWave64 && !IsNullGPU && TargetHasWave32) {
+    return {AMDGPU::UNSUPPORTED_TARGET_FEATURE, "wavefrontsize64"};
+  }
   // Don't assume any wavesize with an unknown subtarget.
-  if (!IsNullGPU) {
-    // Default to wave32 if available, or wave64 if not
-    if (!HaveWave32 && !HaveWave64) {
-      StringRef DefaultWaveSizeFeature =
-          IsWave32Capable ? "wavefrontsize32" : "wavefrontsize64";
-      Features.insert(std::make_pair(DefaultWaveSizeFeature, true));
-    }
+  // Default to wave32 is target supports both.
+  if (!IsNullGPU && !HaveWave32 && !HaveWave64 && !TargetHasWave32 &&
+      !TargetHasWave64)
+    Features.insert(std::make_pair("wavefrontsize32", true));
+
+  for (const auto &Entry : DefaultFeatures) {
+    if (!Features.count(Entry.getKey()))
+      Features[Entry.getKey()] = Entry.getValue();
   }
+
   return {NO_ERROR, StringRef()};
 }

@github-actions
Copy link

github-actions bot commented Aug 21, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@rampitec rampitec force-pushed the users/rampitec/08-21-_amdgpu_refactor_insertwavesizefeature branch from a6ae3b4 to 9694899 Compare August 21, 2025 22:04
Features.insert(std::make_pair("wavefrontsize32", true));

for (const auto &Entry : DefaultFeatures) {
if (!Features.count(Entry.getKey()))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish there is a method to detach entry from one map and add it to another, but I didn't find one.

@rampitec rampitec force-pushed the users/rampitec/08-21-_amdgpu_refactor_insertwavesizefeature branch 2 times, most recently from ecc47cd to ca5b895 Compare August 25, 2025 08:20
@rampitec
Copy link
Collaborator Author

Ping. Here I did that and just that -- rely purely on features. It is somewhat spaghetti, but it was so before. Then there is a follow-up to hide the implementation details from language drivers: #155222.


using namespace llvm::AMDGPU;
fillAMDGPUFeatureMap(CPU, getTriple(), Features);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit concerned moving fillAMDGPUFeatureMap from here to insertWaveSizeFeature. Since we are supposed to initialize the feature map with the target defaults, then override them with command line options.

How about we keep fillAMDGPUFeatureMap here and make a copy of default target features and pass it to insertWaveSizeFeature as an extra argument. In insertWaveSizeFeature, we only use the default target feature for checking target support of wave32/64.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the next #155222. Here I have only done what you have asked for. There I have refactored the whole thing.

Copy link
Collaborator Author

@rampitec rampitec Aug 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want I can squash both into a single commit, but I thought that way it is better reviewable.

If a wavefrontsize32 or wavefrontsize64 is the only possible value
insert it into feature list by default and use that value as an
indication that another wavefront size is not legal.
@rampitec rampitec force-pushed the users/rampitec/08-21-_amdgpu_refactor_insertwavesizefeature branch from c7e8319 to 7ffe0f9 Compare August 26, 2025 18:22
@rampitec rampitec merged commit 8c6b7af into main Aug 27, 2025
9 checks passed
@rampitec rampitec deleted the users/rampitec/08-21-_amdgpu_refactor_insertwavesizefeature branch August 27, 2025 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants