-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[AMDGPU][NFCI] Add IEEEMinimumMaximumInsts SubtargetFeature #141081
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-backend-amdgpu Author: Mirko Brkušanin (mbrkusanin) ChangesAlso remove unused hasIEEEMinMax3 which is replaced with Full diff: https://github.com/llvm/llvm-project/pull/141081.diff 3 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 667c466a998e0..11645841f73db 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -2098,7 +2098,7 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
G_SADDO, G_SSUBO})
.lower();
- if (ST.hasIEEEMinMax()) {
+ if (ST.hasIEEEMinMaxInsts()) {
getActionDefinitionsBuilder({G_FMINIMUM, G_FMAXIMUM})
.legalFor(FPTypesPK16)
.clampMaxNumElements(0, S16, 2)
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.h b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
index 202e5b38f0a48..08bce273d1ee7 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -1447,10 +1447,7 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
bool hasIEEEMode() const { return getGeneration() < GFX12; }
// \returns true if the target has IEEE fminimum/fmaximum instructions
- bool hasIEEEMinMax() const { return getGeneration() >= GFX12; }
-
- // \returns true if the target has IEEE fminimum3/fmaximum3 instructions
- bool hasIEEEMinMax3() const { return hasIEEEMinMax(); }
+ bool hasIEEEMinMaxInsts() const { return getGeneration() >= GFX12; }
// \returns true if the target has WG_RR_MODE kernel descriptor mode bit
bool hasRrWGMode() const { return getGeneration() >= GFX12; }
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 2d337fafe6dc2..59dee1c4635bc 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -858,7 +858,7 @@ SITargetLowering::SITargetLowering(const TargetMachine &TM,
if (Subtarget->hasPrefetch() && Subtarget->hasSafeSmemPrefetch())
setOperationAction(ISD::PREFETCH, MVT::Other, Custom);
- if (Subtarget->hasIEEEMinMax()) {
+ if (Subtarget->hasIEEEMinMaxInsts()) {
setOperationAction({ISD::FMAXIMUM, ISD::FMINIMUM},
{MVT::f16, MVT::f32, MVT::f64, MVT::v2f16}, Legal);
} else {
@@ -6975,7 +6975,8 @@ SDValue SITargetLowering::lowerFMINIMUM_FMAXIMUM(SDValue Op,
if (VT.isVector())
return splitBinaryVectorOp(Op, DAG);
- assert(!Subtarget->hasIEEEMinMax() && !Subtarget->hasMinimum3Maximum3F16() &&
+ assert(!Subtarget->hasIEEEMinMaxInsts() &&
+ !Subtarget->hasMinimum3Maximum3F16() &&
Subtarget->hasMinimum3Maximum3PKF16() && VT == MVT::f16 &&
"should not need to widen f16 minimum/maximum to v2f16");
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just thinking we need to rename this, but to be more specific about which ones because this means 3 possible operations. IEEEMinimumMaximumInsts? and comment this is the IEEE-754 2019 minimum and maximum. Also this needs to be broken down per type, since for gfx950 it's only present for v2f16
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which also implies turning it into a set of proper subtarget features
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you mean v_pk_minimum3_f16 and v_pk_maximum3_f16, then FeatureMinimum3Maximum3PKF16 already exists.
New IEEEMinimumMaximumInsts will include v_pk_minimum_f16 and v_pk_maximum_f16.
You want to have just v2f16 legal when only FeatureMinimum3Maximum3PKF16 exists because of MinimumMaximumByMinimum3Maximum3VOP3P pattern? That would not be NFC so I do not want to put it here.
dabbce3 to
e93956c
Compare
llvm/lib/Target/AMDGPU/AMDGPU.td
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is these are not bundled instructions, we need at least 2 separate features. gfx950 has its own weird subset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gfx950 doesn't have any of these instructions, does it? At least I don't see any MC tests for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arsenm ping
Is this fine or did you have some other idea in mind about splitting this into multiple features and how?
Also remove unused hasIEEEMinMax3 which is replaced with hasMinimum3Maximum3F32 and hasMinimum3Maximum3F16
e93956c to
2b6efa5
Compare
|
Already merged in: #147594 |
Rename IEEEMinMax to IEEEMinimumMaximumInsts and turn it into a proper
subtarget feature.
Also remove unused hasIEEEMinMax3 which is replaced with
hasMinimum3Maximum3F32 and hasMinimum3Maximum3F16