Skip to content

Conversation

@mikolaj-pirog
Copy link
Member

@mikolaj-pirog mikolaj-pirog commented Nov 11, 2025

As in title. Without this, fpext behaves in selectionDAG as always having no fast-math flags.

@llvmbot llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Nov 11, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 11, 2025

@llvm/pr-subscribers-backend-powerpc

@llvm/pr-subscribers-llvm-selectiondag

Author: Mikołaj Piróg (mikolaj-pirog)

Changes

As in title. Without this, fpext behaves in selectionDAG as always having no fmf flags.


Full diff: https://github.com/llvm/llvm-project/pull/167574.diff

1 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+4-1)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 9baf72b266aa7..16f555b16a621 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -3976,7 +3976,10 @@ void SelectionDAGBuilder::visitFPExt(const User &I) {
   SDValue N = getValue(I.getOperand(0));
   EVT DestVT = DAG.getTargetLoweringInfo().getValueType(DAG.getDataLayout(),
                                                         I.getType());
-  setValue(&I, DAG.getNode(ISD::FP_EXTEND, getCurSDLoc(), DestVT, N));
+  SDNodeFlags Flags;
+  if (auto *TruncInst = dyn_cast<FPMathOperator>(&I))
+    Flags.copyFMF(*TruncInst);
+  setValue(&I, DAG.getNode(ISD::FP_EXTEND, getCurSDLoc(), DestVT, N, Flags));
 }
 
 void SelectionDAGBuilder::visitFPToUI(const User &I) {

@mikolaj-pirog
Copy link
Member Author

I'm seeing no testing failures on this one, meaning no code path checks in meaningful way fmf on fpext nodes. I run into this issue while working on strengthening fmf semantics

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be testable? fpext flags are new, I'd assume that would have covered this?

@mikolaj-pirog
Copy link
Member Author

mikolaj-pirog commented Nov 11, 2025

Should be testable? fpext flags are new, I'd assume that would have covered this?

I don't think we have explicit testing for this stuff (i.e. if flags propagate correctly on DAG construction), I couldn't find an easy to test it either. I believe other flags get naturally tested if they work with tests for respective transformations. Would writing a unittest for it be an overkill?

I've filled a PR that requires this fixes, so that could be repurposed as a test for it

@arsenm
Copy link
Contributor

arsenm commented Nov 12, 2025

Should be testable? fpext flags are new, I'd assume that would have covered this?

I don't think we have explicit testing for this stuff (i.e. if flags propagate correctly on DAG construction), I couldn't find an easy to test it either. I believe other flags get naturally tested if they work with tests for respective transformations. Would writing a unittest for it be an overkill?

The brute force way would be use -stop-after=finalize-isel and check the MIR, but it would be better to have a transform dependent on the flags

I've filled a PR that requires this fixes, so that could be repurposed as a test for it

That one's big and not obviously related to fpext, I'd rather test it here. Is there some other context the flag will do something?

@mikolaj-pirog
Copy link
Member Author

Should be testable? fpext flags are new, I'd assume that would have covered this?

I don't think we have explicit testing for this stuff (i.e. if flags propagate correctly on DAG construction), I couldn't find an easy to test it either. I believe other flags get naturally tested if they work with tests for respective transformations. Would writing a unittest for it be an overkill?

The brute force way would be use -stop-after=finalize-isel and check the MIR, but it would be better to have a transform dependent on the flags

I've filled a PR that requires this fixes, so that could be repurposed as a test for it

That one's big and not obviously related to fpext, I'd rather test it here. Is there some other context the flag will do something?

Some combines peek through FP_EXTEND to perform fmf rewrite, but none checks if FP_EXTEND has any flags. I could add this check to one of the combines and update the tests. DAGCombiner::isContractableFMUL is one of these combines.

@github-actions
Copy link

github-actions bot commented Nov 13, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@mikolaj-pirog
Copy link
Member Author

Should be testable? fpext flags are new, I'd assume that would have covered this?

I don't think we have explicit testing for this stuff (i.e. if flags propagate correctly on DAG construction), I couldn't find an easy to test it either. I believe other flags get naturally tested if they work with tests for respective transformations. Would writing a unittest for it be an overkill?

The brute force way would be use -stop-after=finalize-isel and check the MIR, but it would be better to have a transform dependent on the flags

I've filled a PR that requires this fixes, so that could be repurposed as a test for it

That one's big and not obviously related to fpext, I'd rather test it here. Is there some other context the flag will do something?

The FPEXT in fdiv from other PR is a good candidate for testing this -- I've included checking arcp for FPEXT node here. One test caught the difference -- would that suffice for testing?

@mikolaj-pirog mikolaj-pirog requested a review from arsenm November 13, 2025 13:41
@arsenm arsenm merged commit e7b41df into llvm:main Nov 15, 2025
10 checks passed
I.getType());
setValue(&I, DAG.getNode(ISD::FP_EXTEND, getCurSDLoc(), DestVT, N));
SDNodeFlags Flags;
if (auto *TruncInst = dyn_cast<FPMathOperator>(&I))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TruncInst is confusing here (I guess it has been copied from somewhere). You could perhaps all it Operator or FPExtInst or something, but it seems like FPOp is a common variable name when dyn_cast-ing to a FPMathOperator. So I would suggest FPOp.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I will follow-up with a correct name

@s-watanabe314
Copy link
Contributor

It seems our tests started failing after this patch. I'm not familiar with IR or DAG, so I'm not sure if this patch is the cause, if it's an optimization issue, or if this is the correct behavior. Any comments would be greatly appreciated.

Here's the C program in question:
https://godbolt.org/z/jqM5foWEj

#include <stdio.h>
#include <complex.h>

_Complex double XD = (3.123457687 + 0i);
_Complex double YD = (6.123457687 + 0i);

int main() {
  _Complex float cf;
  double real;

  real = creal(XD + YD);
  // Expect double-precision output
  printf("%40.35lf\n", real);
  
  cf = XD + YD;
  // Expect single-precision output
  printf("%40.35f\n",crealf(cf));
  
  return 0;
}

In this program, we expect a single-precision output on line 17, but when -ffast-math is specified, it outputs a double-precision value. Looking at the IR, fptrunc and fpext have fast flags, and it seems that the conversion from double-precision to single-precision is removed because of this.

Is this the intended behavior? Is such an optimization with the fast-math flag allowed?

@mikolaj-pirog
Copy link
Member Author

From quick look at the example it appears to be okay. In C vararg functions like printf promote float to doubles. The reason it was working differently before is probably this change -- now fpext has correctly set fast-math while previously it always had none. The effect of this was that if fpext was being rewritten with some other nodes, to a new node, this new node had an intersection of flags, but since fpext had no flags, this new node had none as well -- meaning some rewrites couldn't be made on this new node because it had no flags -- and now they can be made.

So in other words: promoting floats to double in this case is okay because of how printf works. The change in value is probably because this patch allowed (or in better words -- stopped blocking) some rewrites from kicking in. Keep in mind this patch doesn't provide new features, it merely fixes what was broken before (i.e. fpext, internally, never had any flags set).

@s-watanabe314
Copy link
Contributor

Thanks for the reply. I understand that this patch itself is correct.

However, since the value to be printed is the return value of crealf, I believe that values exceeding the float range should be truncated. In other words, I'm concerned about whether it's okay for the return value of crealf to be double. This might be related to the C language standard. Additionally, the documentation for the fast flag in IR does not mention optimizations that would offset type conversions, so I'm unsure if this kind of optimization is permissible. Even if there are no problems with this patch, is it possible that other optimizations could introduce issues?

@arsenm
Copy link
Contributor

arsenm commented Nov 19, 2025

Additionally, the documentation for the fast flag in IR does not mention optimizations that would offset type conversions, so I'm unsure if this kind of optimization is permissible.

Basically anything and everything is permissible with fast math flags. It's explicitly opting out of any standard to do something faster. It's not going to micro-disallow specific conversions or anything like that

@s-watanabe314
Copy link
Contributor

From Clang's perspective, this optimization appears permissible because -ffast-math, which enables aggressive optimizations, is specified. From an IR perspective, it also seems acceptable, as this optimization is permitted by the semantics of the fast flags. Is this understanding correct? My interpretation at the IR level is that the contract flags (which is implied by 'fast') led to the removal of fptrunc and fpext in this case.

@mikolaj-pirog
Copy link
Member Author

mikolaj-pirog commented Nov 19, 2025

From Clang's perspective, this optimization appears permissible because -ffast-math, which enables aggressive optimizations, is specified. From an IR perspective, it also seems acceptable, as this optimization is permitted by the semantics of the fast flags. Is this understanding correct? My interpretation at the IR level is that the contract flags (which is implied by 'fast') led to the removal of fptrunc and fpext in this case.

Yes, clang sees fast enabled from CLI and puts fast on IR instructions, backend sees fast flag on IR and does various fast-math rewrites1. It looks like the contract is the flag responsible, but I haven't tested it.

Footnotes

  1. There are also global flags that dictate fast-math behavior, i.e. backend may consult global flag instead of IR flag. There's active work being done to make backend rely only on instruction-level flags

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:PowerPC llvm:SelectionDAG SelectionDAGISel as well

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants