-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[SelectionDAGBuilder] Propagate fast-math flags to fpext #167574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SelectionDAGBuilder] Propagate fast-math flags to fpext #167574
Conversation
|
@llvm/pr-subscribers-backend-powerpc @llvm/pr-subscribers-llvm-selectiondag Author: Mikołaj Piróg (mikolaj-pirog) ChangesAs in title. Without this, fpext behaves in selectionDAG as always having no fmf flags. Full diff: https://github.com/llvm/llvm-project/pull/167574.diff 1 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 9baf72b266aa7..16f555b16a621 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -3976,7 +3976,10 @@ void SelectionDAGBuilder::visitFPExt(const User &I) {
SDValue N = getValue(I.getOperand(0));
EVT DestVT = DAG.getTargetLoweringInfo().getValueType(DAG.getDataLayout(),
I.getType());
- setValue(&I, DAG.getNode(ISD::FP_EXTEND, getCurSDLoc(), DestVT, N));
+ SDNodeFlags Flags;
+ if (auto *TruncInst = dyn_cast<FPMathOperator>(&I))
+ Flags.copyFMF(*TruncInst);
+ setValue(&I, DAG.getNode(ISD::FP_EXTEND, getCurSDLoc(), DestVT, N, Flags));
}
void SelectionDAGBuilder::visitFPToUI(const User &I) {
|
|
I'm seeing no testing failures on this one, meaning no code path checks in meaningful way fmf on fpext nodes. I run into this issue while working on strengthening fmf semantics |
arsenm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be testable? fpext flags are new, I'd assume that would have covered this?
I don't think we have explicit testing for this stuff (i.e. if flags propagate correctly on DAG construction), I couldn't find an easy to test it either. I believe other flags get naturally tested if they work with tests for respective transformations. Would writing a unittest for it be an overkill? I've filled a PR that requires this fixes, so that could be repurposed as a test for it |
The brute force way would be use -stop-after=finalize-isel and check the MIR, but it would be better to have a transform dependent on the flags
That one's big and not obviously related to fpext, I'd rather test it here. Is there some other context the flag will do something? |
Some combines peek through FP_EXTEND to perform fmf rewrite, but none checks if FP_EXTEND has any flags. I could add this check to one of the combines and update the tests. |
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
The FPEXT in fdiv from other PR is a good candidate for testing this -- I've included checking arcp for FPEXT node here. One test caught the difference -- would that suffice for testing? |
| I.getType()); | ||
| setValue(&I, DAG.getNode(ISD::FP_EXTEND, getCurSDLoc(), DestVT, N)); | ||
| SDNodeFlags Flags; | ||
| if (auto *TruncInst = dyn_cast<FPMathOperator>(&I)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TruncInst is confusing here (I guess it has been copied from somewhere). You could perhaps all it Operator or FPExtInst or something, but it seems like FPOp is a common variable name when dyn_cast-ing to a FPMathOperator. So I would suggest FPOp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I will follow-up with a correct name
|
It seems our tests started failing after this patch. I'm not familiar with IR or DAG, so I'm not sure if this patch is the cause, if it's an optimization issue, or if this is the correct behavior. Any comments would be greatly appreciated. Here's the C program in question: #include <stdio.h>
#include <complex.h>
_Complex double XD = (3.123457687 + 0i);
_Complex double YD = (6.123457687 + 0i);
int main() {
_Complex float cf;
double real;
real = creal(XD + YD);
// Expect double-precision output
printf("%40.35lf\n", real);
cf = XD + YD;
// Expect single-precision output
printf("%40.35f\n",crealf(cf));
return 0;
}In this program, we expect a single-precision output on line 17, but when Is this the intended behavior? Is such an optimization with the fast-math flag allowed? |
|
From quick look at the example it appears to be okay. In C vararg functions like printf promote float to doubles. The reason it was working differently before is probably this change -- now fpext has correctly set fast-math while previously it always had none. The effect of this was that if fpext was being rewritten with some other nodes, to a new node, this new node had an intersection of flags, but since fpext had no flags, this new node had none as well -- meaning some rewrites couldn't be made on this new node because it had no flags -- and now they can be made. So in other words: promoting floats to double in this case is okay because of how printf works. The change in value is probably because this patch allowed (or in better words -- stopped blocking) some rewrites from kicking in. Keep in mind this patch doesn't provide new features, it merely fixes what was broken before (i.e. fpext, internally, never had any flags set). |
|
Thanks for the reply. I understand that this patch itself is correct. However, since the value to be printed is the return value of |
Basically anything and everything is permissible with fast math flags. It's explicitly opting out of any standard to do something faster. It's not going to micro-disallow specific conversions or anything like that |
|
From Clang's perspective, this optimization appears permissible because |
Yes, clang sees Footnotes
|
As in title. Without this, fpext behaves in selectionDAG as always having no fast-math flags.