AMDGPU: Add subtarget feature for memory atomic fadd f64 #96444
Merged: arsenm merged 1 commit into main from users/arsenm/amdgpu-add-feature-for-buffer-atomic-fadd-f64 on Jul 10, 2024
@llvm/pr-subscribers-backend-amdgpu
Author: Matt Arsenault (arsenm)
Changes
Full diff: https://github.com/llvm/llvm-project/pull/96444.diff
3 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 5f798b4391704..fe3cd75d81009 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -788,6 +788,13 @@ def FeatureFlatAtomicFaddF32Inst
   "Has flat_atomic_add_f32 instruction"
 >;

+def FeatureFlatBufferGlobalAtomicFaddF64Inst
+  : SubtargetFeature<"flat-buffer-global-fadd-f64-inst",
+  "HasFlatBufferGlobalAtomicFaddF64Inst",
+  "true",
+  "Has flat, buffer, and global instructions for f64 atomic fadd"
+>;
+
 def FeatureMemoryAtomicFaddF32DenormalSupport
   : SubtargetFeature<"memory-atomic-fadd-f32-denormal-support",
   "HasAtomicMemoryAtomicFaddF32DenormalSupport",
@@ -1388,7 +1395,8 @@ def FeatureISAVersion9_0_A : FeatureSet<
      FeatureBackOffBarrier,
      FeatureKernargPreload,
      FeatureAtomicFMinFMaxF64GlobalInsts,
-     FeatureAtomicFMinFMaxF64FlatInsts
+     FeatureAtomicFMinFMaxF64FlatInsts,
+     FeatureFlatBufferGlobalAtomicFaddF64Inst
      ])>;

 def FeatureISAVersion9_0_C : FeatureSet<
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.h b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
index 674d84422538f..922435c5efaa6 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -174,6 +174,7 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
   bool HasAtomicGlobalPkAddBF16Inst = false;
   bool HasAtomicBufferPkAddBF16Inst = false;
   bool HasFlatAtomicFaddF32Inst = false;
+  bool HasFlatBufferGlobalAtomicFaddF64Inst = false;
   bool HasDefaultComponentZero = false;
   bool HasAgentScopeFineGrainedRemoteMemoryAtomics = false;
   bool HasDefaultComponentBroadcast = false;
@@ -873,6 +874,12 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
   bool hasFlatAtomicFaddF32Inst() const { return HasFlatAtomicFaddF32Inst; }

+  /// \return true if the target has flat, global, and buffer atomic fadd for
+  /// double.
+  bool hasFlatBufferGlobalAtomicFaddF64Inst() const {
+    return HasFlatBufferGlobalAtomicFaddF64Inst;
+  }
+
   /// \return true if the target's flat, global, and buffer atomic fadd for
   /// float supports denormal handling.
   bool hasMemoryAtomicFaddF32DenormalSupport() const {
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index eec750e5b8251..6b5ba160d6402 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -16028,7 +16028,7 @@ SITargetLowering::shouldExpandAtomicRMWInIR(AtomicRMWInst *RMW) const {
       return AtomicExpansionKind::CmpXChg;

     // global and flat atomic fadd f64: gfx90a, gfx940.
-    if (Subtarget->hasGFX90AInsts() && Ty->isDoubleTy())
+    if (Subtarget->hasFlatBufferGlobalAtomicFaddF64Inst() && Ty->isDoubleTy())
       return ReportUnsafeHWInst(AtomicExpansionKind::None);

     if (AS != AMDGPUAS::FLAT_ADDRESS && Ty->isFloatTy()) {
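
For readers skimming the hunk above, here is a rough standalone sketch of what the change in the f64 check amounts to: the decision keys off a dedicated capability flag rather than a generation-wide one. Everything in it (`ToySubtarget`, `ExpansionKind`, `expandKindForF64FAdd`) is a hypothetical stand-in, not the real LLVM classes, and the fallback to `CmpXChg` is an assumption made for the sketch; the real function has more branches than the one shown in the diff.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not the LLVM types.
enum class ExpansionKind { None, CmpXChg };

struct ToySubtarget {
  bool HasGFX90AInsts = false;                       // generation-style flag
  bool HasFlatBufferGlobalAtomicFaddF64Inst = false; // capability-style flag

  bool hasFlatBufferGlobalAtomicFaddF64Inst() const {
    return HasFlatBufferGlobalAtomicFaddF64Inst;
  }
};

// Models only the f64 fadd branch: keep the atomic as a hardware instruction
// when the capability flag is set. Returning CmpXChg otherwise is an
// assumption of this sketch, not a claim about the real fallthrough path.
inline ExpansionKind expandKindForF64FAdd(const ToySubtarget &ST) {
  if (ST.hasFlatBufferGlobalAtomicFaddF64Inst())
    return ExpansionKind::None;
  return ExpansionKind::CmpXChg;
}
```

In this toy model, setting only `HasGFX90AInsts` no longer affects the result, which is the point of splitting the capability out into its own feature.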
Collaborator: Use it in a predicate when defining pseudos?
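
One possible reading of this suggestion, sketched in plain C++ rather than TableGen: each pseudo carries a predicate over the subtarget, and only pseudos whose predicate holds are considered. All names here (`PredSubtarget`, `ToyPseudo`, `TOY_ATOMIC_ADD_F64`, `availablePseudos`) are hypothetical and do not correspond to real LLVM definitions; this only illustrates the gating idea, not how the backend implements it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a subtarget with a single capability flag.
struct PredSubtarget {
  bool HasFlatBufferGlobalAtomicFaddF64Inst = false;
};

// A toy "pseudo" definition guarded by a predicate over the subtarget.
struct ToyPseudo {
  std::string Name;                          // made-up instruction name
  bool (*Predicate)(const PredSubtarget &);  // availability check
};

inline bool hasF64FAdd(const PredSubtarget &ST) {
  return ST.HasFlatBufferGlobalAtomicFaddF64Inst;
}

inline bool alwaysAvailable(const PredSubtarget &) { return true; }

// Returns the names of the pseudos whose predicate is satisfied.
inline std::vector<std::string>
availablePseudos(const std::vector<ToyPseudo> &Defs, const PredSubtarget &ST) {
  std::vector<std::string> Out;
  for (const ToyPseudo &P : Defs)
    if (P.Predicate(ST))
      Out.push_back(P.Name);
  return Out;
}
```

With a predicate like this, the same feature flag that drives the expansion decision would also decide whether the f64 fadd pseudos exist for a given subtarget.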
rampitec approved these changes on Jun 24, 2024
Base automatically changed from users/arsenm/amdgpu-subtarget-feature-fadd-denormal-support to main on July 10, 2024 12:48
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request on Jul 14, 2024

No description provided.