Skip to content

Conversation

@andykaylor
Copy link
Contributor

Change the handling of -ffp-contract=fast such that the contract
semantics described in the IR are always respected. The front end was
already generating IR that enabled or disabled contraction as requested
by relevant pragmas, but it was setting a TargetOption (AllowFPOpFusion)
that causes the LLVM code generator to disregard the IR restrictions.

This makes -ffp-contract=fast-honor-pragmas redundant, so it is now
reported as deprecated.

For more discussion, see: https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast/80797

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Aug 22, 2024
@llvmbot
Copy link
Member

llvmbot commented Aug 22, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang-driver

Author: Andy Kaylor (andykaylor)

Changes

Change the handling of -ffp-contract=fast such that the contract
semantics described in the IR are always respected. The front end was
already generating IR that enabled or disabled contraction as requested
by relevant pragmas, but it was setting a TargetOption (AllowFPOpFusion)
that causes the LLVM code generator to disregard the IR restrictions.

This makes -ffp-contract=fast-honor-pragmas redundant, so it is now
reported as deprecated.

For more discussion, see: https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast/80797


Full diff: https://github.com/llvm/llvm-project/pull/105746.diff

16 Files Affected:

  • (modified) clang/docs/LanguageExtensions.rst (+1-3)
  • (modified) clang/docs/ReleaseNotes.rst (+4)
  • (modified) clang/docs/UsersManual.rst (+3-3)
  • (modified) clang/include/clang/Basic/LangOptions.h (+1-9)
  • (modified) clang/include/clang/Driver/Options.td (+3-3)
  • (modified) clang/lib/Basic/LangOptions.cpp (+2-12)
  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+3-17)
  • (modified) clang/lib/CodeGen/CGCall.cpp (+1-3)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+7)
  • (modified) clang/lib/Frontend/CompilerInvocation.cpp (+2-4)
  • (modified) clang/lib/Sema/SemaAttr.cpp (-1)
  • (modified) clang/test/CodeGen/ffp-contract-fast-honor-pramga-option.cpp (+1-1)
  • (modified) clang/test/CodeGen/ffp-contract-fhp-pragma-override.cpp (+1-1)
  • (modified) clang/test/CodeGen/fp-function-attrs.cpp (+14-1)
  • (modified) clang/test/CodeGenCUDA/fp-contract.cu (+2-25)
  • (modified) clang/test/Driver/fp-contract.c (+4-3)
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 62903fc3744cad..cef239346f58c5 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -5037,9 +5037,7 @@ statements in C).
 
 The pragma can also be used with ``off`` which turns FP contraction off for a
 section of the code. This can be useful when fast contraction is otherwise
-enabled for the translation unit with the ``-ffp-contract=fast-honor-pragmas`` flag.
-Note that ``-ffp-contract=fast`` will override pragmas to fuse multiply and
-addition across statements regardless of any controlling pragmas.
+enabled for the translation unit with the ``-ffp-contract=fast`` flag.
 
 ``#pragma clang fp exceptions`` specifies floating point exception behavior. It
 may take one of the values: ``ignore``, ``maytrap`` or ``strict``. Meaning of
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 12a924acc14331..2ab1938b12f006 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -190,6 +190,10 @@ Modified Compiler Flags
   the ``promoted`` algorithm for complex division when possible rather than the
   less basic (limited range) algorithm.
 
+- The ``-ffp-contract`` option now honors pragmas by default when the ``fast``
+  argument is used. The ``fast-honor-pragmas`` option is now deprecated and acts
+  as an alias for ``fast``.
+
 Removed Compiler Flags
 -------------------------
 
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index d19b77ae40b0d7..0452cc0f774311 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -1444,7 +1444,7 @@ describes the various floating point semantic modes and the corresponding option
   "ffp-exception-behavior", "{ignore, strict, maytrap}",
   "fenv_access", "{off, on}", "(none)"
   "frounding-math", "{dynamic, tonearest, downward, upward, towardzero}"
-  "ffp-contract", "{on, off, fast, fast-honor-pragmas}"
+  "ffp-contract", "{on, off, fast}"
   "fdenormal-fp-math", "{IEEE, PreserveSign, PositiveZero}"
   "fdenormal-fp-math-fp32", "{IEEE, PreserveSign, PositiveZero}"
   "fmath-errno", "{on, off}"
@@ -1646,10 +1646,10 @@ for more details.
 
    Valid values are:
 
-   * ``fast`` (fuse across statements disregarding pragmas, default for CUDA)
+   * ``fast`` (fuse across statements unless dictated by pragmas, default for CUDA/HIP)
    * ``on`` (fuse in the same statement unless dictated by pragmas, default for languages other than CUDA/HIP)
    * ``off`` (never fuse)
-   * ``fast-honor-pragmas`` (fuse across statements unless dictated by pragmas, default for HIP)
+   * ``fast-honor-pragmas`` (deprecated, aliases fast)
 
 .. option:: -f[no-]honor-infinities
 
diff --git a/clang/include/clang/Basic/LangOptions.h b/clang/include/clang/Basic/LangOptions.h
index eb4cb4b5a7e93f..2fcc211e8249d7 100644
--- a/clang/include/clang/Basic/LangOptions.h
+++ b/clang/include/clang/Basic/LangOptions.h
@@ -265,11 +265,8 @@ class LangOptionsBase {
     // Enable the floating point pragma
     FPM_On,
 
-    // Aggressively fuse FP ops (E.g. FMA) disregarding pragmas.
-    FPM_Fast,
-
     // Aggressively fuse FP ops and honor pragmas.
-    FPM_FastHonorPragmas
+    FPM_Fast
   };
 
   /// Possible floating point exception behavior.
@@ -831,12 +828,7 @@ class FPOptions {
   }
   explicit FPOptions(const LangOptions &LO) {
     Value = 0;
-    // The language fp contract option FPM_FastHonorPragmas has the same effect
-    // as FPM_Fast in frontend. For simplicity, use FPM_Fast uniformly in
-    // frontend.
     auto LangOptContractMode = LO.getDefaultFPContractMode();
-    if (LangOptContractMode == LangOptions::FPM_FastHonorPragmas)
-      LangOptContractMode = LangOptions::FPM_Fast;
     setFPContractMode(LangOptContractMode);
     setRoundingMath(LO.RoundingMath);
     setConstRoundingMode(LangOptions::RoundingMode::Dynamic);
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 111608d30ff827..c50e458c06ab1b 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2658,11 +2658,11 @@ def fno_trapping_math : Flag<["-"], "fno-trapping-math">, Group<f_Group>;
 def ffp_contract : Joined<["-"], "ffp-contract=">, Group<f_Group>,
   Visibility<[ClangOption, CC1Option, FC1Option, FlangOption]>,
   DocBrief<"Form fused FP ops (e.g. FMAs):"
-  " fast (fuses across statements disregarding pragmas)"
+  " fast (fuses across statements unless dictated by pragmas)"
   " | on (only fuses in the same statement unless dictated by pragmas)"
   " | off (never fuses)"
-  " | fast-honor-pragmas (fuses across statements unless dictated by pragmas)."
-  " Default is 'fast' for CUDA, 'fast-honor-pragmas' for HIP, and 'on' otherwise.">,
+  " | fast-honor-pragmas (deprecated, aliases fast)."
+  " Default is 'fast' for CUDA or HIP, and 'on' otherwise.">,
   HelpText<"Form fused FP ops (e.g. FMAs)">,
   Values<"fast,on,off,fast-honor-pragmas">;
 
diff --git a/clang/lib/Basic/LangOptions.cpp b/clang/lib/Basic/LangOptions.cpp
index 9331a63d91b173..b8c35b94130f2a 100644
--- a/clang/lib/Basic/LangOptions.cpp
+++ b/clang/lib/Basic/LangOptions.cpp
@@ -182,18 +182,8 @@ void LangOptions::setLangDefaults(LangOptions &Opts, Language Lang,
 
   Opts.HIP = Lang == Language::HIP;
   Opts.CUDA = Lang == Language::CUDA || Opts.HIP;
-  if (Opts.HIP) {
-    // HIP toolchain does not support 'Fast' FPOpFusion in backends since it
-    // fuses multiplication/addition instructions without contract flag from
-    // device library functions in LLVM bitcode, which causes accuracy loss in
-    // certain math functions, e.g. tan(-1e20) becomes -0.933 instead of 0.8446.
-    // For device library functions in bitcode to work, 'Strict' or 'Standard'
-    // FPOpFusion options in backends is needed. Therefore 'fast-honor-pragmas'
-    // FP contract option is used to allow fuse across statements in frontend
-    // whereas respecting contract flag in backend.
-    Opts.setDefaultFPContractMode(LangOptions::FPM_FastHonorPragmas);
-  } else if (Opts.CUDA) {
-    if (T.isSPIRV()) {
+  if (Opts.HIP || Opts.CUDA) {
+    if (Opts.CUDA && T.isSPIRV()) {
       // Emit OpenCL version metadata in LLVM IR when targeting SPIR-V.
       Opts.OpenCLVersion = 200;
     }
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index fdd89edd72e109..e880347aed9f5e 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -348,20 +348,8 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
           .Default(llvm::FloatABI::Default);
 
   // Set FP fusion mode.
-  switch (LangOpts.getDefaultFPContractMode()) {
-  case LangOptions::FPM_Off:
-    // Preserve any contraction performed by the front-end.  (Strict performs
-    // splitting of the muladd intrinsic in the backend.)
-    Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
-    break;
-  case LangOptions::FPM_On:
-  case LangOptions::FPM_FastHonorPragmas:
-    Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
-    break;
-  case LangOptions::FPM_Fast:
-    Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
-    break;
-  }
+  // All allowed fusion is indicated in the IR.
+  Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
 
   Options.BinutilsVersion =
       llvm::TargetMachine::parseBinutilsVersion(CodeGenOpts.BinutilsVersion);
@@ -386,9 +374,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.UnsafeFPMath = LangOpts.AllowFPReassoc && LangOpts.AllowRecip &&
                          LangOpts.NoSignedZero && LangOpts.ApproxFunc &&
                          (LangOpts.getDefaultFPContractMode() ==
-                              LangOptions::FPModeKind::FPM_Fast ||
-                          LangOpts.getDefaultFPContractMode() ==
-                              LangOptions::FPModeKind::FPM_FastHonorPragmas);
+                              LangOptions::FPModeKind::FPM_Fast);
   Options.ApproxFuncFPMath = LangOpts.ApproxFunc;
 
   Options.BBAddrMap = CodeGenOpts.BBAddrMap;
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index ca2c79b51ac96b..06e633f88c6001 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1919,9 +1919,7 @@ static void getTrivialDefaultFunctionAttributes(
     if (LangOpts.AllowFPReassoc && LangOpts.AllowRecip &&
         LangOpts.NoSignedZero && LangOpts.ApproxFunc &&
         (LangOpts.getDefaultFPContractMode() ==
-             LangOptions::FPModeKind::FPM_Fast ||
-         LangOpts.getDefaultFPContractMode() ==
-             LangOptions::FPModeKind::FPM_FastHonorPragmas))
+             LangOptions::FPModeKind::FPM_Fast))
       FuncAttrs.addAttribute("unsafe-fp-math", "true");
     if (CodeGenOpts.SoftFloat)
       FuncAttrs.addAttribute("use-soft-float", "true");
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 9f1d57f43b6565..cb63f0083884e0 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -3178,6 +3178,13 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
       StringRef Val = A->getValue();
       if (Val == "fast" || Val == "on" || Val == "off" ||
           Val == "fast-honor-pragmas") {
+        // fast-honor-pragmas is deprecated -- replace it with fast
+        if (Val == "fast-honor-pragmas") {
+          D.Diag(diag::warn_drv_deprecated_arg)
+            << A->getAsString(Args) << /*hasReplacement=*/true
+            << "-ffp-contract=fast";
+          Val = "fast";
+        }
         if (Val != FPContract && LastFpContractOverrideOption != "") {
           D.Diag(clang::diag::warn_drv_overriding_option)
               << LastFpContractOverrideOption
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index f510d3067d4d58..32ccfb082ee5af 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -3760,8 +3760,6 @@ void CompilerInvocationBase::GenerateLangArgs(const LangOptions &Opts,
     GenerateArg(Consumer, OPT_ffp_contract, "on");
   else if (Opts.DefaultFPContractMode == LangOptions::FPM_Off)
     GenerateArg(Consumer, OPT_ffp_contract, "off");
-  else if (Opts.DefaultFPContractMode == LangOptions::FPM_FastHonorPragmas)
-    GenerateArg(Consumer, OPT_ffp_contract, "fast-honor-pragmas");
 
   for (StringRef Sanitizer : serializeSanitizerKinds(Opts.Sanitize))
     GenerateArg(Consumer, OPT_fsanitize_EQ, Sanitizer);
@@ -4261,8 +4259,8 @@ bool CompilerInvocation::ParseLangArgs(LangOptions &Opts, ArgList &Args,
       Opts.setDefaultFPContractMode(LangOptions::FPM_On);
     else if (Val == "off")
       Opts.setDefaultFPContractMode(LangOptions::FPM_Off);
-    else if (Val == "fast-honor-pragmas")
-      Opts.setDefaultFPContractMode(LangOptions::FPM_FastHonorPragmas);
+    else if (Val == "fast-honor-pragmas") // Deprecated
+      Opts.setDefaultFPContractMode(LangOptions::FPM_Fast);
     else
       Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Val;
   }
diff --git a/clang/lib/Sema/SemaAttr.cpp b/clang/lib/Sema/SemaAttr.cpp
index a1724820472b59..70783a7727beb6 100644
--- a/clang/lib/Sema/SemaAttr.cpp
+++ b/clang/lib/Sema/SemaAttr.cpp
@@ -1269,7 +1269,6 @@ void Sema::ActOnPragmaFPContract(SourceLocation Loc,
     NewFPFeatures.setAllowFPContractWithinStatement();
     break;
   case LangOptions::FPM_Fast:
-  case LangOptions::FPM_FastHonorPragmas:
     NewFPFeatures.setAllowFPContractAcrossStatement();
     break;
   case LangOptions::FPM_Off:
diff --git a/clang/test/CodeGen/ffp-contract-fast-honor-pramga-option.cpp b/clang/test/CodeGen/ffp-contract-fast-honor-pramga-option.cpp
index fef4da1edf1fc9..c3d8909c33bd69 100644
--- a/clang/test/CodeGen/ffp-contract-fast-honor-pramga-option.cpp
+++ b/clang/test/CodeGen/ffp-contract-fast-honor-pramga-option.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -O3 -ffp-contract=fast-honor-pragmas -triple %itanium_abi_triple -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -O3 -ffp-contract=fast -triple %itanium_abi_triple -emit-llvm -o - %s | FileCheck %s
 
 float fp_contract_1(float a, float b, float c) {
   // CHECK-LABEL: fp_contract_1fff(
diff --git a/clang/test/CodeGen/ffp-contract-fhp-pragma-override.cpp b/clang/test/CodeGen/ffp-contract-fhp-pragma-override.cpp
index ff35c9204c79cd..fe4cf21861f004 100644
--- a/clang/test/CodeGen/ffp-contract-fhp-pragma-override.cpp
+++ b/clang/test/CodeGen/ffp-contract-fhp-pragma-override.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -O3 -ffp-contract=fast-honor-pragmas -triple %itanium_abi_triple -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -O3 -ffp-contract=fast -triple %itanium_abi_triple -emit-llvm -o - %s | FileCheck %s
 
 float fp_contract_on_1(float a, float b, float c) {
   // CHECK-LABEL: fp_contract_on_1fff(
diff --git a/clang/test/CodeGen/fp-function-attrs.cpp b/clang/test/CodeGen/fp-function-attrs.cpp
index e92c26cdb75f7f..eb0801dd6bea61 100644
--- a/clang/test/CodeGen/fp-function-attrs.cpp
+++ b/clang/test/CodeGen/fp-function-attrs.cpp
@@ -1,5 +1,4 @@
 // RUN: %clang_cc1 -triple x86_64-linux-gnu -ffast-math -ffp-contract=fast -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple x86_64-linux-gnu -ffast-math -ffp-contract=fast-honor-pragmas -emit-llvm -o - %s | FileCheck %s
 
 float test_default(float a, float b, float c) {
   float tmp = a;
@@ -53,6 +52,20 @@ float test_contract_on_pragma(float a, float b, float c) {
 // CHECK: fmul fast float {{%.+}}, {{%.+}}
 // CHECK: fadd reassoc nnan ninf nsz arcp afn float {{%.+}}, {{%.+}}
 
+float test_contract_off_pragma(float a, float b, float c) {
+  float tmp = a * b;
+  {
+    #pragma clang fp contract(off)
+    tmp += c;
+  }
+  return tmp;
+}
+
+// CHECK: define{{.*}} float @_Z24test_contract_off_pragmafff(float noundef nofpclass(nan inf) %a, float noundef nofpclass(nan inf) %b, float noundef nofpclass(nan inf) %c) [[NO_UNSAFE_ATTRS:#[0-9]+]]
+// CHECK: fmul fast float {{%.+}}, {{%.+}}
+// CHECK: fadd reassoc nnan ninf nsz arcp afn float {{%.+}}, {{%.+}}
+
+
 // CHECK: attributes [[FAST_ATTRS]] = { {{.*}}"no-infs-fp-math"="true" {{.*}}"no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" {{.*}}"unsafe-fp-math"="true"{{.*}} }
 // CHECK: attributes [[PRECISE_ATTRS]] = { {{.*}}"no-infs-fp-math"="false" {{.*}}"no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" {{.*}}"unsafe-fp-math"="false"{{.*}} }
 // CHECK: attributes [[NO_UNSAFE_ATTRS]] = { {{.*}}"no-infs-fp-math"="true" {{.*}}"no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" {{.*}}"unsafe-fp-math"="false"{{.*}} }
diff --git a/clang/test/CodeGenCUDA/fp-contract.cu b/clang/test/CodeGenCUDA/fp-contract.cu
index 60824ba59ddfb0..58558f956d06de 100644
--- a/clang/test/CodeGenCUDA/fp-contract.cu
+++ b/clang/test/CodeGenCUDA/fp-contract.cu
@@ -1,6 +1,6 @@
 // REQUIRES: x86-registered-target, nvptx-registered-target, amdgpu-registered-target
 
-// By default CUDA uses -ffp-contract=fast, HIP uses -ffp-contract=fast-honor-pragmas.
+// By default CUDA and HIP use -ffp-contract=fast.
 // we should fuse multiply/add into fma instruction.
 // In IR, fmul/fadd instructions with contract flag are emitted.
 // In backend
@@ -68,35 +68,12 @@
 // RUN:   -O3 -target-cpu gfx906 -o - -x ir %t.ll \
 // RUN:   | FileCheck -check-prefixes=COMMON,AMD-OPT-FASTSTD %s
 
-// Explicit -ffp-contract=fast-honor-pragmas
-// In IR, fmul/fadd instructions with contract flag are emitted.
-// In backend
-//    nvptx/amdgcn - assumes standard fp fuse option, which only
-//                   fuses mult/add insts with contract flag or
-//                   llvm.fmuladd intrinsics.
-
-// RUN: %clang_cc1 -fcuda-is-device -triple nvptx-nvidia-cuda -S \
-// RUN:   -ffp-contract=fast-honor-pragmas -disable-llvm-passes -o - %s \
-// RUN:   | FileCheck -check-prefixes=COMMON,NV-ON %s
-// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -S \
-// RUN:   -target-cpu gfx906 -disable-llvm-passes -o - -x hip %s \
-// RUN:   -ffp-contract=fast-honor-pragmas \
-// RUN:   | FileCheck -check-prefixes=COMMON,AMD-ON %s
-// RUN: %clang_cc1 -fcuda-is-device -triple nvptx-nvidia-cuda -S \
-// RUN:   -O3 -o - %s \
-// RUN:   -ffp-contract=fast-honor-pragmas \
-// RUN:   | FileCheck -check-prefixes=COMMON,NV-OPT-FASTSTD %s
-// RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -S \
-// RUN:   -O3 -target-cpu gfx906 -o - -x hip %s \
-// RUN:   -ffp-contract=fast-honor-pragmas \
-// RUN:   | FileCheck -check-prefixes=COMMON,AMD-OPT-FASTSTD %s
-
 // Check separate compile/backend steps corresponding to -save-temps.
 // When input is IR, -ffp-contract has no effect. Backend uses default
 // default FP fuse option.
 
 // RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
-// RUN:   -ffp-contract=fast-honor-pragmas \
+// RUN:   -ffp-contract=fast \
 // RUN:   -O3 -disable-llvm-passes -target-cpu gfx906 -o %t.ll -x hip %s
 // RUN: cat %t.ll  | FileCheck -check-prefixes=COMMON,AMD-OPT-FAST-IR %s
 // RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -S \
diff --git a/clang/test/Driver/fp-contract.c b/clang/test/Driver/fp-contract.c
index cab63683ee8132..06241caa2fbd5f 100644
--- a/clang/test/Driver/fp-contract.c
+++ b/clang/test/Driver/fp-contract.c
@@ -6,7 +6,6 @@
 // before the drive options that are checked below the run lines.
 // WARN_FM_OFF: warning: overriding '-ffast-math' option with '-ffp-contract=off'
 // WARN_FM_ON: warning: overriding '-ffast-math' option with '-ffp-contract=on'
-// WARN_FM_FHP: warning: overriding '-ffast-math' option with '-ffp-contract=fast-honor-pragmas'
 // WARN_UM_OFF: warning: overriding '-funsafe-math-optimizations' option with '-ffp-contract=off'
 // WARN_UM_ON: warning: overriding '-funsafe-math-optimizations' option with '-ffp-contract=on'
 
@@ -30,8 +29,10 @@
 // RUN:   | FileCheck --check-prefix=CHECK-FPC-FAST %s
 
 // RUN: %clang -### -ffast-math -ffp-contract=fast-honor-pragmas -c %s 2>&1 \
-// RUN:   | FileCheck --check-prefixes=CHECK-FPC-FAST-HONOR,WARN_FM_FHP %s
-// CHECK-FPC-FAST-HONOR:     "-ffp-contract=fast-honor-pragmas"
+// RUN:   | FileCheck --check-prefixes=CHECK-FPC-FAST-HONOR,WARN_FHP_DEPRECATED %s
+// WARN_FHP_DEPRECATED: clang: warning: argument '-ffp-contract=fast-honor-pragmas' is deprecated, use '-ffp-contract=fast' instead [-Wdeprecated]
+// CHECK-FPC-FAST-HONOR:     "-ffp-contract=fast"
+// CHECK-FPC-FAST-HONOR-NOT: "-honor-pragmas"
 
 // RUN: %clang -### -Werror -ffp-contract=fast -ffast-math -c %s 2>&1 \
 // RUN:   | FileCheck --check-prefix=CHECK-FPC-FAST %s

@github-actions
Copy link

github-actions bot commented Aug 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the backend actually ready for this? Looking quickly at the backend, it looks like target-independent code is fine, but some of the target-specific code doesn't respect the "contract" flag on instructions.

break;
}
// All allowed fusion is indicated in the IR.
Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant, I think; Standard is the default anyway.

} else if (Opts.CUDA) {
if (T.isSPIRV()) {
if (Opts.HIP || Opts.CUDA) {
if (Opts.CUDA && T.isSPIRV()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment below needs to be updated.

<< A->getAsString(Args) << /*hasReplacement=*/true
<< "-ffp-contract=fast";
Val = "fast";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that actually diagnosing this is a good idea. We have to support it forever; let's just silently consider an alias.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we document it as deprecated (or just remove it from the documentation) but silently accept it? I was copying this code from another option where an option was deprecated. I certainly don't see a problem with leaving it permanently.

// RUN: -O3 -target-cpu gfx906 -o - -x hip %s \
// RUN: -ffp-contract=fast-honor-pragmas \
// RUN: | FileCheck -check-prefixes=COMMON,AMD-OPT-FASTSTD %s

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've verified that we aren't testing for any actual IR-gen differences in this mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a glance, I thought it was mostly duplicating the run lines above, but now I see that there are some additional checks. This test probably needs more attention to see if the checks can be reorganized.

@andykaylor
Copy link
Contributor Author

Is the backend actually ready for this? Looking quickly at the backend, it looks like target-independent code is fine, but some of the target-specific code doesn't respect the "contract" flag on instructions.

Doesn't respect it in what way? Do you mean it doesn't attempt to fuse operations when "contract" is set, or it fuses when it isn't set?

@efriedma-quic
Copy link
Collaborator

It doesn't fuse unless AllowFPOpFusion is set.

@andykaylor
Copy link
Contributor Author

Is the backend actually ready for this? Looking quickly at the backend, it looks like target-independent code is fine, but some of the target-specific code doesn't respect the "contract" flag on instructions.

I didn't realize I wasn't building all targets in my local sandbox, so the fp-contract.cu test wasn't being run. After fixing my build configuration, that test failed on one of the NV-OPT-FAST checks. So, it does look like at least the NVPTX backend isn't ready for this.

There was a clang-to-backend-codegen test for AArch64 test that does pass after the change, and I know the x86 target works. I'll look at the other backends that are checking AllowFpOpFusion and see if I can bring them up-to-date with the "contract" flag. Several are also checking the "UnsafeFPMath" function attribute and enabling fusion if that's set, but I think clang sets that to "false" if contraction is disabled anywhere within a function, so that should be OK.

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! I support this direction as mentioned in #100453

(I'm starting a 3-week vacation this Friday and will have limited availability. Happy for other reviewers to approve.)

// RUN: | FileCheck --check-prefixes=CHECK-FPC-FAST-HONOR,WARN_FM_FHP %s
// CHECK-FPC-FAST-HONOR: "-ffp-contract=fast-honor-pragmas"
// RUN: | FileCheck --check-prefixes=CHECK-FPC-FAST-HONOR,WARN_FHP_DEPRECATED %s
// WARN_FHP_DEPRECATED: clang: warning: argument '-ffp-contract=fast-honor-pragmas' is deprecated, use '-ffp-contract=fast' instead [-Wdeprecated]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally don't check the string before "warning" (can be customized)

@andykaylor
Copy link
Contributor Author

Is the backend actually ready for this? Looking quickly at the backend, it looks like target-independent code is fine, but some of the target-specific code doesn't respect the "contract" flag on instructions.

As I just mentioned on my RFC, it turns out the state of backend support is worse than I thought. I started with NVPTX and quickly realized that I wasn't going to be able to create an easy fix that I could just tack on to this PR, because their instruction selection pattern matching code is relying on the global flag without passing the nodes involved to the function that does the check.

I'm going to close this PR and try to start of push for the backends to clean up their handling and respect what the IR says rather than relying on the global setting.

@andykaylor andykaylor closed this Aug 23, 2024
@andykaylor andykaylor reopened this Aug 28, 2024
@andykaylor
Copy link
Contributor Author

This may not be as bad as I thought. After looking more closely at the one test case that was failing with this change, it turned out that the failure was happening because the test contained a pragma to turn contraction off but checks to verify that the fma was generated anyway. Unless that behavior is needed, this may be OK.

@pratyay-p
Copy link
Contributor

Hello @andykaylor ,

I am working on this pragma operation, with a different scenario. For example: https://godbolt.org/z/nTfYW64j4

Is it ok to merge this patch?

@yxsamliu
Copy link
Collaborator

LGTM for HIP

@andykaylor
Copy link
Contributor Author

Hello @andykaylor ,

I am working on this pragma operation, with a different scenario. For example: https://godbolt.org/z/nTfYW64j4

Is it ok to merge this patch?

As mentioned here, I ran into some problems with inconsistent handling of contract in the backends. There have been some changes in that respect but I don't think the problems are entirely resolved.

There's a bit more discussion here: https://discourse.llvm.org/t/allowfpopfusion-vs-sdnodeflags-hasallowcontract/80909

@wjristow
Copy link
Collaborator

Over in #162549, I proposed what is essentially a PlayStation-specific version of the suggestion of this PR. In discussion there, people indicated that revisiting this PR may be a better approach (rather than doing a PlayStation-specific fix), because things have improved in various backends since this PR was first proposed.

I don't have a good handle on which backends would run into trouble, but in that other PR, @efriedma-quic suggested:

We could blacklist specific backends which still have issues.

That sounds like a good approach to me. (Alternatively, we could do the PlayStation-specific change of #162549, and others that want the behavior of the pragmas always being honored could "opt in" to using the approach that PlayStation is using. Essentially evolving toward the approach suggested in this PR, rather than forcing it on all backends at once.)

@arsenm arsenm changed the title Honor pragmas with -ffp-contract=fast, depecate fast-honor-pragmas Honor pragmas with -ffp-contract=fast, deprecate fast-honor-pragmas Oct 21, 2025
@paperchalice
Copy link
Contributor

Hi @andykaylor,
FYI, unsafe-fp-math attribute support is removed, could you rebase this branch and test again?

dvbuka pushed a commit to dvbuka/llvm-project that referenced this pull request Oct 27, 2025
Lukacma pushed a commit to Lukacma/llvm-project that referenced this pull request Oct 29, 2025
aokblast pushed a commit to aokblast/llvm-project that referenced this pull request Oct 30, 2025
@andykaylor
Copy link
Contributor Author

Apologies for the delay. I haven't had time to re-visit this, but I hope to get to it soon.

@Andarwinux
Copy link
Contributor

The current -ffp-contract=fast is actually a feature, please at least add a -ffp-contract=ignore or set -ffp-contract=fast-honor-pragmas to default.

Change the handling of -ffp-contract=fast such that the contract
semantics described in the IR are always respected. The front end was
already generating IR that enabled or disabled contraction as requested
by relevant pragmas, but it was setting a TargetOption (AllowFPOpFusion)
that causes the LLVM code generator to disregard the IR restrictions.

This makes -ffp-contract=fast-honor-pragmas redundant, so it is now
reported as deprecated.

For more discussion, see: https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast/80797
@andykaylor
Copy link
Contributor Author

The current -ffp-contract=fast is actually a feature, please at least add a -ffp-contract=ignore or set -ffp-contract=fast-honor-pragmas to default.

I can understand why you'd want that, but it's a bit problematic. The current behavior is relying on a global flag (TargetOptions::AllowFPOpFusion). To override the semantics described in the IR. This is bad behavior. The correct way to do that would be to add an option to tell the front end to disregard the related pragmas and generate IR with the corresponding semantics.

@andykaylor andykaylor force-pushed the contract-honor-pragmas branch from 9ad8c16 to 7a06cfd Compare November 17, 2025 22:02
@github-actions
Copy link

github-actions bot commented Nov 17, 2025

🐧 Linux x64 Test Results

  • 111315 tests passed
  • 4419 tests skipped

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants