-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[Flang-RT] Fix GCC 15.1 Fortran Runtime libstdc++ Undefined Symbols #157385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@Meinersbur please review. |
openmp/runtime/src/CMakeLists.txt
Outdated
@@ -170,7 +170,7 @@ endif() | |||
# avoid an unwanted dependency on libstdc++.so. | |||
add_compile_definitions(_GLIBCXX_NO_ASSERTIONS) | |||
if(NOT WIN32) | |||
add_definitions(-U_GLIBCXX_ASSERTIONS) | |||
add_definitions(-U_GLIBCXX_ASSERTIONS -D_GLIBCXX_NO_ASSERTIONS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't _GLIBCXX_NO_ASSERTIONS already being set two lines above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add more details on what is happening. Which symbols from libstdc++ is the linker complaining about?
There is a test no-cpp-dep.c
that should fail if this happens. I works for be with gcc-15, non-optimzed build, shared as well as static flang-rt.runtime. This is what I get with a shared build:
LD_DEBUG=libs LD_LIBRARY_PATH=/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib ./a.out
53008: find library=libflang_rt.runtime.so [0]; searching
53008: search path=/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib/glibc-hwcaps/x86-64-v4:/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib/glibc-hwcaps/x86-64-v3:/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib/glibc-hwcaps/x86-64-v2:/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib (LD_LIBRARY_PATH)
53008: trying file=/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib/glibc-hwcaps/x86-64-v4/libflang_rt.runtime.so
53008: trying file=/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib/glibc-hwcaps/x86-64-v3/libflang_rt.runtime.so
53008: trying file=/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib/glibc-hwcaps/x86-64-v2/libflang_rt.runtime.so
53008: trying file=/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib/libflang_rt.runtime.so
53008:
53008: find library=libc.so.6 [0]; searching
53008: search path=/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib (LD_LIBRARY_PATH)
53008: trying file=/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib/libc.so.6
53008: search cache=/etc/ld.so.cache
53008: trying file=/lib/x86_64-linux-gnu/libc.so.6
53008:
53008: find library=libm.so.6 [0]; searching
53008: search path=/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib (LD_LIBRARY_PATH)
53008: trying file=/home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib/libm.so.6
53008: search cache=/etc/ld.so.cache
53008: trying file=/lib/x86_64-linux-gnu/libm.so.6
53008:
53008:
53008: calling init: /lib64/ld-linux-x86-64.so.2
53008:
53008:
53008: calling init: /lib/x86_64-linux-gnu/libc.so.6
53008:
53008:
53008: calling init: /lib/x86_64-linux-gnu/libm.so.6
53008:
53008:
53008: calling init: /home/meinersbur/src/llvm/work/debug_flangrt/flang-rt/lib/libflang_rt.runtime.so
53008:
53008:
53008: initialize program: ./a.out
53008:
53008:
53008: transferring control: ./a.out
No libstdc++ in sight.
Not being able to create non-optimized builds is what I would consider only as last resort. Debugging is sometimes necessary to find the cause of bugs.
Please update the summary according the the latest state of the patch.
# Must use minimum optimization level of -O2 to prevent dependency on libstdc++ | ||
string(REPLACE "-O0" "-O2" CMAKE_CXX_FLAGS CMAKE_CXX_FLAGS) | ||
string(REPLACE "-O1" "-O2" CMAKE_CXX_FLAGS CMAKE_CXX_FLAGS) | ||
string(FIND "-O" CMAKE_CXX_FLAGS opt_idx) | ||
if (opt_idx EQUAL -1) | ||
target_compile_options(${tgtname} PRIVATE -O2) | ||
endif () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be in foreach (tgtname IN LISTS srctargets)
. If building libflang-rt.runtime.so
as well as libflang-rt.runtime.so
, the mechanism in add_flang_rt_library adds a library of type that is in srctargets, but not in libtargets. The compiler is invoked on targets in srctargets, linker in libtargets
I just built and tested latest LLVM to make sure the problem still exists. It's pulling in symbols related to std::exception for me with GCC 15.1. The underlying problem is related to libstdc++. If you built GCC against an older distro-provided libstdc++, you may not be using a version of libstdc++ that causes this problem. Likewise if you used a distro-provided backport that does the same thing. I updated the issue to clarify.
|
66bb30e
to
3a47b73
Compare
The PR summary still reads
please update and add more details, including in the PR title. "remaining work" presumes having followed the progress of "work", which very few people have. |
Your version of libstdc++ is 15.1, mine is 15.2. What makes the difference might be gcc-mirror/gcc@0dbc588, but I think you might have exceptions enabled because in all cases throwing of I would rather ask users to update their version of libstdc++ than making debugging impossible. Or at least only do it as a workaround for 15.1. |
This is the command. Since Command with this PR:
Command with master:
|
The issue is |
@jprotze look again; there's a |
@Meinersbur I'm upgrading my (rolling release) Slackware install to 15.2 in order to verify that the problem does not exist for me in 15.2. In the meantime, are we sure that this issue is specific to GCC 15? Has anyone tried, for instance, GCC 14.2? |
@Meinersbur if it's only GCC 15.1 we might be able to get away with not fixing this, because I don't think any big distro has done a release with GCC 15.1 as the compiler. If it's GCC 14.2 then Debian Trixie released with 14.2, so we can't ignore it. And if it's GCC 15.2 for me but not for you then we have to figure out why it doesn't reproduce for you. |
@Meinersbur the error is still present for me on GCC 15.2. |
Thanks for the additional info, I can new reproduce the problem now and identified the problem: The previously unused exception is now "used" to avoid an unused-parameter error. I checked with gcc and it still does not emit the casted bad_variant_access (which it not even a parameter that the patch above is targeting) into the object file, but Clang does. Knowing what is happening gives us alternative options:
IMHO, any of those solutions is better than forcing a specific optimization level. The least invasive would be defining |
@Meinersbur I agree that these are all better solutions. I will implement (3) since it is least invasive. Note that we'll have to define it in the source code of the runtime rather than on the command line as we do with the other symbols. Any header file included across the entire runtime should do; I'll find an appropriate one. It wouldn't hurt to pursue (2) as a longer-term solution if you think GCC will be willing to make that change for us. It would probably be best for someone who has worked with the GCC team in the past to file that issue with them if any such person is available. (1) would align our behavior with GCC, which is nice, but I don't know if it may have cascading effects we can't predict. (4) sounds like overkill. |
For (3), a command-line-only replacement would be adding For (1), even if cast-to-void, the side-effects must still execute. It's proably that gcc does not emit anything to the mid-end that it can predetermine is never executed (in this case: after a For (4), using C++ without its link library is not an advertised feature of libstdc++ nor libc++, msvc STL (AFAIK), so problems like these may accumulate over time and eventually make a complete replacement necessary, like libc did from the beginning. But I agree, if we can get away with it for now by just defining |
3a47b73
to
96838b7
Compare
@Meinersbur I implemented (3). What do you think? |
Needs documention about
Please also update the PR summary which will be suggested for the commit message Otherwise, looks good. It is possible that now someone gets the warning that gcc-mirror/gcc@d87c0d5 was meant to suppress. Imaging being someone who compiles with |
flang/include/flang/Common/variant.h
Outdated
@@ -24,6 +24,9 @@ using cuda::std::variant_size_v; | |||
using cuda::std::visit; | |||
} // namespace std | |||
#else // !RT_USE_LIBCUDACXX | |||
#include <initializer_list> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is initializer_list
needed for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Meinersbur bits/c++config
needs to be #include
d prior to variant
, because we need to replace that macro after it's defined and before variant
uses it. You're not really supposed to #include
things in the bits
dir, so I used initializer_list
to accomplish this indirectly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#ifdef _GLIBCXX_THROW_OR_ABORT
#undef _GLIBCXX_THROW_OR_ABORT
#endif
I would have preferred if we could ensure that our _GLIBCXX_THROW_OR_ABORT
is always the first, so that which _GLIBCXX_THROW_OR_ABORT is active does not depend on #include
order.
Be sure to document this stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Meinersbur I will add a comment. I don't see a way to avoid two inclusions, because, if we define first, and then bits/c++config
gets #include
d, it won't matter that we defined first since, by GCC standards, the most recent definition of a macro controls:
https://gcc.gnu.org/onlinedocs/cpp/Undefining-and-Redefining-Macros.html
(By language standards, it's just undefined behavior, but I think everyone does the same thing as GCC in practice.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have a look into how libstdc++ defines the macro:
https://github.com/gcc-mirror/gcc/blob/d87c0d5443ba860dcbc6be24921e0ffb463cc96f/libstdc%2B%2B-v3/include/bits/c%2B%2Bconfig#L260-L266
#ifndef _GLIBCXX_THROW_OR_ABORT
# if __cpp_exceptions
# define _GLIBCXX_THROW_OR_ABORT(_EXC) (throw (_EXC))
# else
# define _GLIBCXX_THROW_OR_ABORT(_EXC) (__builtin_abort(), (void)(_EXC))
# endif
#endif
I.e. it will NOT redefine if it was already defined. Same mechanism as _GLIBCXX_ASSERTIONS
which is documented to be user-definable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Meinersbur thanks, good catch. I'll take out the initializer_list hack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Meinersbur I've done it with no source code changes as you originally suggested :)
96838b7
to
f1befb2
Compare
@Meinersbur I have added comments and believe this is ready for merge. |
flang/include/flang/Common/variant.h
Outdated
// initializer_list is included to load bits/c++config, which can't be included | ||
// directly and which defines a macro we need to redefine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"a macro" -> _GLIBCXX_THROW_OR_ABORT
flang/include/flang/Common/variant.h
Outdated
@@ -24,6 +24,9 @@ using cuda::std::variant_size_v; | |||
using cuda::std::visit; | |||
} // namespace std | |||
#else // !RT_USE_LIBCUDACXX | |||
#include <initializer_list> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have a look into how libstdc++ defines the macro:
https://github.com/gcc-mirror/gcc/blob/d87c0d5443ba860dcbc6be24921e0ffb463cc96f/libstdc%2B%2B-v3/include/bits/c%2B%2Bconfig#L260-L266
#ifndef _GLIBCXX_THROW_OR_ABORT
# if __cpp_exceptions
# define _GLIBCXX_THROW_OR_ABORT(_EXC) (throw (_EXC))
# else
# define _GLIBCXX_THROW_OR_ABORT(_EXC) (__builtin_abort(), (void)(_EXC))
# endif
#endif
I.e. it will NOT redefine if it was already defined. Same mechanism as _GLIBCXX_ASSERTIONS
which is documented to be user-definable.
✅ With the latest revision this PR passed the C/C++ code formatter. |
ca070a0
to
6c22816
Compare
# Define our own _GLIBCXX_THROW_OR_ABORT because libstdc++ headers | ||
# reference std::exception in its definition, and we do not want | ||
# to link against std::exception since doing that would link us to | ||
# the C++ runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is somewhat imprecise: libstdc++ does NOT use std::exception
in its definition, (in the sense of #define _GLIBCXX_THROW_OR_ABORT(...) std::exception
), but uses the _EXC
argument which is typically is an object (pre-instantiated or an expression creating one) derived from std::exception
.
Please also add the following details:
- libstdc++ uses
(void)_EXC
to silence a warning (since 15.1) - Some compilers (specifically Clang) may not always optimize away that unreachable code (unreachable because it occurs after
__builtin_abort()
) - The intention is to only remove that
(void)_EXC
, and nothing else - Redefining _GLIBCXX_THROW_OR_ABORT is not officially supported by libstdc++
Since this use of _GLIBCXX_THROW_OR_ABORT is unsupported/undocumented, describing (correctly) what and why we are doing this is REALLY important. We do not want to mislead anyone who stumbles onto this. For instance, someone would like to remove this (e.g. because it might create problems with a future version of libstdc++), and sees that it still works. But they tested an optimized build only, not knowing that it is meant to fix unoptimized builds (with Clang).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
db0ccf5
to
3ca6376
Compare
@Meinersbur I believe this is ready now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 just a nit left. Will approve as soon as it is fixed
# causes a link reference to std::exception, and we do not want | ||
# to link against std::exception since doing that would link us | ||
# against the C++ runtime library, and we do not want to link | ||
# the Fortran runtime against the C++ runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reads like "causes A, and we do not want to do A since doing that would do B, and we do not want to do B" with A = "link to std::exception" and B = "link against C++ standard library". Can shorten and state the reason instead.
# causes a link reference to std::exception, and we do not want | |
# to link against std::exception since doing that would link us | |
# against the C++ runtime library, and we do not want to link | |
# the Fortran runtime against the C++ runtime. | |
# calls std::exception methods in the libstdc++ static or dynamic library. We cannot link libstdc++ because it may conflict with a C++ runtime of a hybrid Fortran/C++ application. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Meinersbur I revised the comment in a way similar to your suggestion; please take a look.
Also, although I did mention .so version conflicts in the comment as the reason we don't want to link against libstdc++, are you sure that's really true? The GNU .so files are forward compatible, so, as long as our users use a version at least as new as our build version, we should be fine there. And, we do link the Fortran runtime against the GNU C runtime, which has the same forward-but-not-backward compatibility profile as the GNU C++ runtime. So, I'm not sure .so conflicts would actually manifest.
My understanding was that we didn't want to link against libstdc++ because it's big and bulky and we just don't need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some problems that could occur:
- flang-rt uses GCC libstdc++, application uses LLVM libc++ (or potentially another commercial runtime implementation)
- flang-rt statically links an older version of libstdc++.a, application links a newer version of libstdc++.so/.a
Generally, bad things happen if the linker or loader finds the same mangled symbol name with different implementations. There are solutions to this, like only supporting libstdc++.a and libflang_rt.runtime.so with -fvisibility=hidden
, requiring the user to only link the same C++ standard library or newer version of it dynamically, or not depending on libstdc++.a/.so at all. We chose the latter, staying out of the symbol conflict game.
The size of libstdc++ is not a problem. When linking dynamically, libstdc++.so is present on the system anyway, and only touched pages are loaded into memory. When linking statically, the linker only takes the object files from libstdc++.a that are actually needed. C++ has a "you only pay for what you use" principle. The two C++ features that violate this are exceptions and RTTI. Both are disabled for all of LLVM, including Flang-RT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
If you need someone to land this PR, you can ask me. |
Co-authored-by: Michael Kruse <[email protected]>
0dd27d4
to
78b7b50
Compare
@Meinersbur yes, thank you, please land it after tests finish since I don't have permission to push. (I rebased one last time just to make sure everything was still fine.) |
I updated the commit message because the PR summary does not apply. I used GitHub's "land as soon CI passes" feature, but it seemed to not wait for |
Current fix is to use at least -O2 on Flang runtime to prevent linking in libstdc++.