Skip to content

Conversation

@mbrkusanin
Copy link
Collaborator

@mbrkusanin mbrkusanin commented May 22, 2025

Rename IEEEMinMax to IEEEMinimumMaximumInsts and turn it into a proper
subtarget feature.

Also remove unused hasIEEEMinMax3 which is replaced with
hasMinimum3Maximum3F32 and hasMinimum3Maximum3F16

@llvmbot
Copy link
Member

llvmbot commented May 22, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Mirko Brkušanin (mbrkusanin)

Changes

Also remove unused hasIEEEMinMax3 which is replaced with
hasMinimum3Maximum3F32 and hasMinimum3Maximum3F16


Full diff: https://github.com/llvm/llvm-project/pull/141081.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/GCNSubtarget.h (+1-4)
  • (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+3-2)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 667c466a998e0..11645841f73db 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -2098,7 +2098,7 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
        G_SADDO, G_SSUBO})
       .lower();
 
-  if (ST.hasIEEEMinMax()) {
+  if (ST.hasIEEEMinMaxInsts()) {
     getActionDefinitionsBuilder({G_FMINIMUM, G_FMAXIMUM})
         .legalFor(FPTypesPK16)
         .clampMaxNumElements(0, S16, 2)
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.h b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
index 202e5b38f0a48..08bce273d1ee7 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -1447,10 +1447,7 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
   bool hasIEEEMode() const { return getGeneration() < GFX12; }
 
   // \returns true if the target has IEEE fminimum/fmaximum instructions
-  bool hasIEEEMinMax() const { return getGeneration() >= GFX12; }
-
-  // \returns true if the target has IEEE fminimum3/fmaximum3 instructions
-  bool hasIEEEMinMax3() const { return hasIEEEMinMax(); }
+  bool hasIEEEMinMaxInsts() const { return getGeneration() >= GFX12; }
 
   // \returns true if the target has WG_RR_MODE kernel descriptor mode bit
   bool hasRrWGMode() const { return getGeneration() >= GFX12; }
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 2d337fafe6dc2..59dee1c4635bc 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -858,7 +858,7 @@ SITargetLowering::SITargetLowering(const TargetMachine &TM,
   if (Subtarget->hasPrefetch() && Subtarget->hasSafeSmemPrefetch())
     setOperationAction(ISD::PREFETCH, MVT::Other, Custom);
 
-  if (Subtarget->hasIEEEMinMax()) {
+  if (Subtarget->hasIEEEMinMaxInsts()) {
     setOperationAction({ISD::FMAXIMUM, ISD::FMINIMUM},
                        {MVT::f16, MVT::f32, MVT::f64, MVT::v2f16}, Legal);
   } else {
@@ -6975,7 +6975,8 @@ SDValue SITargetLowering::lowerFMINIMUM_FMAXIMUM(SDValue Op,
   if (VT.isVector())
     return splitBinaryVectorOp(Op, DAG);
 
-  assert(!Subtarget->hasIEEEMinMax() && !Subtarget->hasMinimum3Maximum3F16() &&
+  assert(!Subtarget->hasIEEEMinMaxInsts() &&
+         !Subtarget->hasMinimum3Maximum3F16() &&
          Subtarget->hasMinimum3Maximum3PKF16() && VT == MVT::f16 &&
          "should not need to widen f16 minimum/maximum to v2f16");
 

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just thinking we need to rename this, but to be more specific about which ones because this means 3 possible operations. IEEEMinimumMaximumInsts? and comment this is the IEEE-754 2019 minimum and maximum. Also this needs to be broken down per type, since for gfx950 it's only present for v2f16

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which also implies turning it into a set of proper subtarget features

Copy link
Collaborator Author

@mbrkusanin mbrkusanin May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean v_pk_minimum3_f16 and v_pk_maximum3_f16, then FeatureMinimum3Maximum3PKF16 already exists.

New IEEEMinimumMaximumInsts will include v_pk_minimum_f16 and v_pk_maximum_f16.

You want to have just v2f16 legal when only FeatureMinimum3Maximum3PKF16 exists because of MinimumMaximumByMinimum3Maximum3VOP3P pattern? That would not be NFC so I do not want to put it here.

@mbrkusanin mbrkusanin force-pushed the rename-ieeeminmax branch from dabbce3 to e93956c Compare May 26, 2025 16:41
@mbrkusanin mbrkusanin changed the title [AMDGPU][NFC] Rename IEEEMinMax to IEEEMinMaxInsts [AMDGPU][NFCI] Add IEEEMinimumMaximumInsts SubtargetFeature May 26, 2025
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is these are not bundled instructions, we need at least 2 separate features. gfx950 has its own weird subset

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gfx950 doesn't have any of these instructions, does it? At least I don't see any MC tests for them.

Copy link
Collaborator Author

@mbrkusanin mbrkusanin Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arsenm ping

Is this fine or did you have some other idea in mind about splitting this into multiple features and how?

Also remove unused hasIEEEMinMax3 which is replaced with
hasMinimum3Maximum3F32 and hasMinimum3Maximum3F16
@mbrkusanin
Copy link
Collaborator Author

Already merged in: #147594

@mbrkusanin mbrkusanin closed this Jul 17, 2025
@mbrkusanin mbrkusanin deleted the rename-ieeeminmax branch July 17, 2025 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants