-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[DFAJumpThreading] Enable DFAJumpThread by default. #157646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[DFAJumpThreading] Enable DFAJumpThread by default. #157646
Conversation
Co-Authored-By: YinZd <[email protected]> Co-Authored-By: ict-ql <[email protected]> Co-Authored-By: Lin Wang <[email protected]>
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
See previous discussion on #83033. |
This pass currently does not have a maintainer. We should have one if we enable it by default. @XChy What is your opinion on the current quality of this pass? @buggfg Beyond running SPEC, have you performed any due diligence to ensure this pass is ready for default enablement? Have you performed any targeted fuzzing for this enablement? I'll provide new compile-time numbers. |
The current implementation is conservative and optimizes only one switch. After numerous fixes, this pass has become more stable and safe. For me, it's acceptable to enable this pass now if there is no severe compile-time regression. And enabling it will uncover more hidden problems. |
Generally looks okay, with some outliers that need to be investigated (libclamav_nsis_LZMADecode.c +95%, NativeInlineSiteSymbol.cpp +33%). These might be fine assuming the increase is due to unavoidable second order effects from the transformation being performed. |
Hi, I’m really glad to hear that both reviewers support enabling DFAJumpThreading by default. I haven’t had the chance to do any additional research beyond the SPEC2017 benchmarks yet. However, I believe that enabling this optimization by default is essential for the embedded domain, as it will help increase LLVM's impact and uncover other potential issues. :) |
@nikic, I can be a candidate for maintainer, as I am familiar with the codebase of this pass and actively involved in it. But it would be better if there were someone else to co-maintain. I'm concerned that only one person will be working on it, actually. @dybv-sc @UsmanNadeem, do you volunteer too? |
Will look into these cases. |
I believe #161632 has resolved most outliers. The bottleneck mostly lies in the later optimizations in the pipeline after duplicating blocks, instead of lying in DFAJumpThreading itself. |
After effort on reducing compile-time, we get a better number: https://llvm-compile-time-tracker.com/compare.php?from=ed113e7904943565b4cd05588f6b639e40187510&to=2b901ca2e1b77d2b7a31cbcb57a921aa662341f9&stat=instructions:u. |
Yeah, those results look great. I also started a compile-time run on llvm-opt-benchmark with these results: dtcxzyw/llvm-opt-benchmark#2906 (comment) If you have time, looking at the big outlier there (openssl/smime.ll with +175%) would be good. |
@zyw-bot csmith-fuzz |
#162447 resolves it partially, but the main cause remains there: the number of phi nodes increases massively, and thus SLPVectorizer slows down. Not come up with a solution for DFAJumpThreading yet. Maybe we can redesign the SSA updating method there. |
We recommend setting
dfa-jump-thread
to be enabled by default. It’s a mature optimization that’s been supported since GCC 9.1.0. At the-O2
opt level, both the GCC and ICX compilers have this optimization enabled by default.Once it’s enabled, we saw a 13% performance improvement in the CoreMark benchmark on the X86 platform (Intel i9-11900K Rocket Lake), and even a 15% increase on the KunMingHu FPGA. Additionally, we verified the correctness of this pass using SPEC 2017.