[AMDGPU] expand-fp: Change frem expansion criterion #158285

frederik-h · 2025-09-12T12:17:40Z

The existing condition for checking whether or not to expand an frem instruction in the pass is not sufficiently precise. Right now, it is sufficient to ensure the correct working of the pass for the targets currently using the pass. But this is only true in conjunction with the existing check for the MaxLegalFpConvertBitWidth value which happens to exit early on targets on which the frem condition is insufficient.

The correct working of the pass should not rely on this interaction. The possibility of using the pass for handling further expansions (e.g. merging the very similar ExpandLargeDivRem into it) is also limited by this.

Make the expansion criterion more precise and use it to exit early from the pass run if no expansions are required for a target.

The existing condition for checking whether or not to expand an frem instruction in the pass is not sufficiently precise. Right now, it is sufficient to ensure the correct working of the pass. But this is only true in conjunction with the existing check for the MaxLegalFpConvertBitWidth value which happens to exit early on targets on which the frem condition is insufficient.

The correct working of the pass should not rely on this interaction. The possibility of using the pass for handling further expansions (e.g. merging the very similar ExpandLargeDivRem into it) is also limited by this.

This patch changes the pass to expand frem for a target iff the target's legalization action for the instruction with the scalar type corresponding to the instruction type is LibCall but the libcall does not exist. The legalization action for frem in the AMDGPU backend is adjusted accordingly.

llvmbot · 2025-09-12T12:18:16Z

@llvm/pr-subscribers-backend-amdgpu

Author: Frederik Harwath (frederik-h)

Changes

The existing condition for checking whether or not to expand an frem instruction in the pass is not sufficiently precise. Right now, it is sufficient to ensure the correct working of the pass. But this is only true in conjunction with the existing check for the MaxLegalFpConvertBitWidth value which happens to exit early on targets on which the frem condition is insufficient.

The correct working of the pass should not rely on this interaction. The possibility of using the pass for handling further expansions (e.g. merging the very similar ExpandLargeDivRem into it) is also limited by this.

This patch changes the pass to expand frem for a target iff the target's legalization action for the instruction with the scalar type corresponding to the instruction type is LibCall but the libcall does not exist. The legalization action for frem in the AMDGPU backend is adjusted accordingly.

Full diff: https://github.com/llvm/llvm-project/pull/158285.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/ExpandFp.cpp (+25-12)
(modified) llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (+1-1)

diff --git a/llvm/lib/CodeGen/ExpandFp.cpp b/llvm/lib/CodeGen/ExpandFp.cpp
index 9cc6c6a706c58..6f4f049cc7f8e 100644
--- a/llvm/lib/CodeGen/ExpandFp.cpp
+++ b/llvm/lib/CodeGen/ExpandFp.cpp
@@ -979,14 +979,22 @@ static RTLIB::Libcall fremToLibcall(Type *Ty) {
   llvm_unreachable("Unknown floating point type");
 }
 
-/* Return true if, according to \p LibInfo, the target either directly
-   supports the frem instruction for the \p Ty, has a custom lowering,
-   or uses a libcall. */
-static bool targetSupportsFrem(const TargetLowering &TLI, Type *Ty) {
-  if (!TLI.isOperationExpand(ISD::FREM, EVT::getEVT(Ty)))
-    return true;
-
-  return TLI.getLibcallName(fremToLibcall(Ty->getScalarType()));
+/// Return true if the pass should expand a "frem" instruction of the
+/// given \p Ty for the target represented by \p TLI. Expansion
+/// should happen if the legalization for the scalar type uses a
+/// non-existing libcall. The scalar type is considered because it is
+/// easier to do so and it is highly unlikely that a vector type can
+/// be legalized without a libcall if the scalar type cannot.
+static bool shouldExpandFremType(const TargetLowering &TLI, Type *Ty) {
+  Type *ScalarTy = Ty->getScalarType();
+  EVT VT = EVT::getEVT(ScalarTy);
+
+  TargetLowering::LegalizeAction LA = TLI.getOperationAction(ISD::FREM, VT);
+  if (LA != TargetLowering::LegalizeAction::LibCall)
+    return false;
+
+  bool MissingLibcall = !TLI.getLibcallName(fremToLibcall(ScalarTy));
+  return MissingLibcall && FRemExpander::canExpandType(ScalarTy);
 }
 
 static bool runImpl(Function &F, const TargetLowering &TLI,
@@ -1000,8 +1008,8 @@ static bool runImpl(Function &F, const TargetLowering &TLI,
   if (ExpandFpConvertBits != llvm::IntegerType::MAX_INT_BITS)
     MaxLegalFpConvertBitWidth = ExpandFpConvertBits;
 
-  if (MaxLegalFpConvertBitWidth >= llvm::IntegerType::MAX_INT_BITS)
-    return false;
+  bool TargetSkipExpandLargeFp =
+      MaxLegalFpConvertBitWidth >= llvm::IntegerType::MAX_INT_BITS;
 
   for (auto &I : instructions(F)) {
     switch (I.getOpcode()) {
@@ -1011,8 +1019,7 @@ static bool runImpl(Function &F, const TargetLowering &TLI,
       if (Ty->isScalableTy())
         continue;
 
-      if (targetSupportsFrem(TLI, Ty) ||
-          !FRemExpander::canExpandType(Ty->getScalarType()))
+      if (!shouldExpandFremType(TLI, Ty))
         continue;
 
       Replace.push_back(&I);
@@ -1022,6 +1029,9 @@ static bool runImpl(Function &F, const TargetLowering &TLI,
     }
     case Instruction::FPToUI:
     case Instruction::FPToSI: {
+      if (TargetSkipExpandLargeFp)
+        continue;
+
       // TODO: This pass doesn't handle scalable vectors.
       if (I.getOperand(0)->getType()->isScalableTy())
         continue;
@@ -1039,6 +1049,9 @@ static bool runImpl(Function &F, const TargetLowering &TLI,
     }
     case Instruction::UIToFP:
     case Instruction::SIToFP: {
+      if (TargetSkipExpandLargeFp)
+        continue;
+
       // TODO: This pass doesn't handle scalable vectors.
       if (I.getOperand(0)->getType()->isScalableTy())
         continue;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
index 5c9b616e9bc21..3892d7949a0fc 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
@@ -423,7 +423,7 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM,
   setOperationAction({ISD::LRINT, ISD::LLRINT}, {MVT::f16, MVT::f32, MVT::f64},
                      Expand);
 
-  setOperationAction(ISD::FREM, {MVT::f16, MVT::f32, MVT::f64}, Expand);
+  setOperationAction(ISD::FREM, {MVT::f16, MVT::f32, MVT::f64}, LibCall);
 
   if (Subtarget->has16BitInsts()) {
     setOperationAction(ISD::IS_FPCLASS, {MVT::f16, MVT::f32, MVT::f64}, Legal);
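The control-flow change in runImpl above can be illustrated in isolation. What follows is a minimal sketch, not the pass itself: `Opcode`, `collectWorklist`, and the two boolean parameters are hypothetical stand-ins for the instruction opcodes, the loop in runImpl, the result of shouldExpandFremType, and the TargetSkipExpandLargeFp flag. The per-instruction bit-width checks that the real pass applies to conversions are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the opcodes the pass inspects.
enum class Opcode { FRem, FPToSI, SIToFP, Other };

// Returns the indices of the instructions that would be queued for expansion.
// SkipLargeFpConvert no longer ends the whole run; it only skips the
// conversion cases, so frem instructions are still considered.
std::vector<std::size_t> collectWorklist(const std::vector<Opcode> &Insts,
                                         bool ShouldExpandFrem,
                                         bool SkipLargeFpConvert) {
  std::vector<std::size_t> Replace;
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    switch (Insts[I]) {
    case Opcode::FRem:
      if (ShouldExpandFrem)
        Replace.push_back(I);
      break;
    case Opcode::FPToSI:
    case Opcode::SIToFP:
      if (!SkipLargeFpConvert)
        Replace.push_back(I);
      break;
    case Opcode::Other:
      break;
    }
  }
  return Replace;
}
```

With the previous early `return false`, a target for which the conversion check fires would never reach the frem case. In this sketch, `collectWorklist(Insts, true, true)` still selects the frem instructions.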


… expansion This commit adds a function to check whether a target needs expansion for any type of frem instruction. In conjunction with the trivial check for whether the large fp conversion expansions are necessary, this can be used to perform an early exit from the pass if no expansions are needed for a target.
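The early exit described in this commit message might look roughly like the sketch below. All names here (`FpType`, `TargetModel`, `canExitEarly`, and the helper functions) are hypothetical and model the target queries with plain data instead of the real TargetLowering interface. `MaxIntBits` stands in for llvm::IntegerType::MAX_INT_BITS, and its value is chosen only for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the scalar floating-point types the pass knows.
enum class FpType : std::size_t { Half = 0, Float = 1, Double = 2 };

constexpr std::array<FpType, 3> FremTypes = {
    {FpType::Half, FpType::Float, FpType::Double}};

// Stand-in for llvm::IntegerType::MAX_INT_BITS; the value is illustrative.
constexpr unsigned MaxIntBits = 1u << 23;

// Hypothetical summary of the target properties the pass queries.
struct TargetModel {
  std::array<bool, 3> FremNeedsExpansion; // indexed by FpType
  unsigned MaxLegalFpConvertBitWidth;
};

bool needsFremExpansion(const TargetModel &T, FpType Ty) {
  const std::size_t Index = static_cast<std::size_t>(Ty);
  assert(Index < T.FremNeedsExpansion.size() && "unknown type");
  return T.FremNeedsExpansion[Index];
}

// True if frem of at least one known type must be expanded by the pass.
bool targetNeedsAnyFremExpansion(const TargetModel &T) {
  return std::any_of(FremTypes.begin(), FremTypes.end(),
                     [&T](FpType Ty) { return needsFremExpansion(T, Ty); });
}

// The "trivial check": conversions need expansion only below the maximum.
bool targetNeedsLargeFpConvertExpansion(const TargetModel &T) {
  return T.MaxLegalFpConvertBitWidth < MaxIntBits;
}

// The pass can return immediately if neither kind of expansion is needed.
bool canExitEarly(const TargetModel &T) {
  return !targetNeedsAnyFremExpansion(T) &&
         !targetNeedsLargeFpConvertExpansion(T);
}
```

Keeping the two checks separate means that a target needing only one kind of expansion still runs the pass, which is the interaction the PR description wants to stop relying on.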


github-actions · 2025-09-16T15:07:42Z

✅ With the latest revision this PR passed the C/C++ code formatter.


arsenm · 2025-09-17T11:38:18Z

llvm/lib/CodeGen/ExpandFp.cpp

+    auto Libcall = getFremLibcallForType(VT);
+    return Libcall.has_value() && !TLI.getLibcallName(*Libcall);


This pass shouldn't really need to consider the libcall name. Can you fix codegen to use the libcall action for frem

frederik-h · 2025-09-17

> This pass shouldn't really need to consider the libcall name. Can you fix codegen to use the libcall action for frem

Here you wrote:

> Expand is the correct action. Every other target should be using LibCall. I.e. change the default action to Libcall. This is another stalled migration, Expand should imply emit standalone code that doesn't depend on a function

That is, you suggest that I change the legalization action for ISD::FREM for every target that currently uses Expand to LibCall (unless it provides a custom lowering function, I suppose; in this case Custom, as e.g. used by the NVPTX target, would be correct?) and this pass would simply start handling frem for the types that it supports if the target uses Expand for frem instructions of this type. Is this correct?

arsenm · 2025-09-18

> That is, you suggest that I change the legalization action for ISD::FREM for every target that currently uses Expand to LibCall (unless it provides a custom lowering function, I suppose; in this case Custom, as e.g. used by the NVPTX target, would be correct?)

Yes. However, for the PTX case, that is the same bug. The custom expansion is the same buggy expansion you're fixing. As a hack until that's also converted, you could just ignore custom too

frederik-h · 2025-10-16

I have implemented the change now. I think this should go into a separate PR if we want to keep it this way as it affects many targets. I also think it is a bit confusing to make this "Expand" -> "LibCall" change for ISD::FREM only. Several times, I had to exclude it from a list of instructions which should probably all be handled in the same way.
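The convention agreed on in this thread can be summarized in a small sketch. It is only a model under stated assumptions: `LegalizeAction` mirrors the action names used in the discussion, while `FremHandling`, `classifyFrem`, and `PassCanExpandType` are hypothetical names introduced here. "Ignore custom" is read as "the pass leaves Custom alone", which is an interpretation of the review comment and not something the thread confirms.

```cpp
#include <cassert>

// Hypothetical mirror of the legalization actions named in the discussion.
enum class LegalizeAction { Legal, Custom, Expand, LibCall };

// Hypothetical labels for who ends up handling an frem instruction.
enum class FremHandling { ExpandInPass, LeftToCodegen };

// Expand is taken to mean "emit standalone code", so only that action makes
// the pass expand frem, and only for types the pass supports. LibCall and
// Legal are left to later codegen. Custom is also left alone here, following
// the temporary measure suggested in the review (an interpretation).
FremHandling classifyFrem(LegalizeAction Action, bool PassCanExpandType) {
  if (Action == LegalizeAction::Expand && PassCanExpandType)
    return FremHandling::ExpandInPass;
  return FremHandling::LeftToCodegen;
}
```

Under this rule the pass no longer needs to ask whether a libcall name exists, which is the dependency the first review comment objects to.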


frederik-h requested a review from arsenm September 12, 2025 12:17

llvmbot added backend:AMDGPU llvm:codegen labels Sep 12, 2025

frederik-h commented Sep 12, 2025

llvm/lib/CodeGen/ExpandFp.cpp (resolved)

arsenm reviewed Sep 12, 2025

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (outdated, resolved)

frederik-h added 4 commits September 12, 2025 09:21

Revert Operation Action for frem to Expand

534b3e2

Merge remote-tracking branch 'upstream/main' into expand-fp-frem-expansion-criterion

7dee6b1

Merge remote-tracking branch 'upstream/main' into expand-fp-frem-expansion-criterion

0a30d40

frederik-h requested a review from arsenm September 16, 2025 09:55

arsenm reviewed Sep 16, 2025

llvm/lib/CodeGen/ExpandFp.cpp (four review threads, all outdated and resolved)

Review changes

c1814f0

* Use constexpr std::array instead of inline const SmallVector.
* shouldExpandFremType
  - Remove LibCall action handling
  - Add assert to document that vectors are not handled
  - additionally: Inline variable

frederik-h added 3 commits September 16, 2025 11:09

fixup! Review changes

a752a2f

Try fix Windows build problem

307252a

The compiler in the Windows build failed to deduce the type for the auto.

Further fixup for Windows build

4728696

arsenm reviewed Sep 17, 2025

frederik-h added 6 commits October 16, 2025 07:51

Change ISD::FREM legalization actions from Expand to LibCall for scalar types

77d862b

expand-fp: always expand frem if legalization action is "Expand"

df7066c

Merge remote-tracking branch 'upstream/main' into expand-fp-frem-expansion-criterion

53fdc7a

Revert deletion of comment line

a33ab1d

Add back deleted line

a58e1c7

clang-format changes

1ea0b3a

jmmartinez reviewed Oct 17, 2025

llvm/lib/CodeGen/ExpandFp.cpp (outdated, resolved)

jmmartinez reviewed Oct 17, 2025

llvm/lib/CodeGen/ExpandFp.cpp (outdated, resolved)

frederik-h added 3 commits October 17, 2025 04:10

Replace two function uses by better llvm alternatives

4d5d984

trigger CI

17f470b

Merge remote-tracking branch 'upstream/main' into expand-fp-frem-expansion-criterion

4225845
