
Commit 0399026

Drop flat namespace option (#1109)
This PR fixes two subtle, related issues that are blocking updates from going through downstream in the Kasmer project. At a high level, the issues are:
- Flat namespace linking on macOS produces incorrect symbol lookups in dynamic libraries.
- #1097 misses a subtle edge case related to tail-call optimisation.

The actual code changes required are small, but warrant some detailed explanation.

## Flat Namespaces

For a long time, macOS has implemented a system known as _two-level_ namespaces, whereby undefined symbol names in a dynamic library are prefixed with the name of the library in which the loader expects to be able to find them at run-time. This is a conservative behaviour; even if a symbol with the same name exists in a different library, it won't be selected.

For example, the dynamic libraries built by `llvm-kompile` in `c` mode link against `libgmp`. Two-level namespaces produce dynamic symbol tables that look like:
```console
$ dyld_info test/c/Output/flat-namespace.kore.tmp.dir/libtest.so -symbolic_fixups | grep gmpz_clear
+0x2B28 bind pointer libgmp.10.dylib/___gmpz_clear
```

This behaviour is different to Linux, which does not have a notion of two-level namespaces. For legacy compatibility purposes, Apple supply a linker flag `-flat_namespace` that behaves more similarly to Linux behaviour.
Its use is discouraged in new code, but we had enabled it to work around an issue in the Python bindings (python/cpython#97524) that should be fixed in a future CPython / macOS combination.[^1]

When enabled, the symbol table looks something like this for the same example:
```console
$ dyld_info test/c/Output/flat-namespace.kore.tmp.dir/libtest.so -symbolic_fixups | grep gmpz_clear
+0x2EE8 bind pointer flat-namespace/___gmpz_clear
```

As a consequence of this, if the symbol `___gmpz_clear` exists in multiple dynamic libraries loaded by the same process, then the order in which the dynamic loader selects between them is not clearly defined,[^2] and a reference to it could resolve to either the correct or the incorrect symbol.

This caused the initial bug, observed as follows:[^3]
- The Haskell backend statically links the `kore-rpc-booster` executable against `libgmp`, meaning that some GMP symbols appear in that binary.
- The backend compiles shared libraries that dynamically link against `libgmp`.
- `kore-rpc-booster` dynamically loads one of these libraries, and when resolving symbols to load, the flat namespace environment selects the static version for some and the dynamic version for others.
- A call to `__gmpz_clear` from a backend hook ends up referencing the statically linked symbol, rather than the dynamically linked version. Generally, I think this situation is harmless - GMP is very stable and it's plausible that doing this for most symbols is not observable.
- However, the dynamically-linked GMP library has been set up to use the KORE memory management functions. When the static version is called, it tries to `free()` a pointer allocated by the backend's GC, and crashes.

The fix for this issue is to drop our usage of `-flat_namespace` for C shared libraries compiled by the backend.
This breaks a few places where we were relying on the old (incorrect) behaviour in the presence of C++ RTTI; having multiple instances of identically-named typeinfo symbols in a process is known to be broken there:
- `libunwind` is actually implicitly linked via the macOS system library; if we explicitly link it as well, then code that handles exceptions will break.
- The `k-rule-apply` tool linked two copies of the KORE AST library, causing `dynamic_cast` to break. #1110 addresses this.

## Tail-Call Optimisation

In #1097, we made some changes that explicitly mark K functions as `musttail` when we know they're tail recursive. In doing so, we removed the need to use the `-tailcallopt` flag in most cases. However, the change in that PR missed that, in addition to performing IR-level transformations, `-tailcallopt` also sets a lower-level flag in the backend[^4] code generator that guarantees tail-call code generation. For large programs, this meant I could observe stack overflows when traversing large terms.

The fix is just to enforce that this internal option gets set properly; doing so is just a restoration of the behaviour we got from `-tailcallopt` before.

[^1]: But isn't yet fixed, unfortunately - the underlying bug is still present on my system. Should be revisited in the future, ideally!
[^2]: It might be defined somewhere, but the initial manifestation of this bug appeared in an apparently unrelated commit, so I think we were just getting lucky previously. The fix in this PR is morally correct whether or not things worked accidentally beforehand.
[^3]: I intend to write this up fully later in a separate issue.
[^4]: As in the X86 or ARM backend of LLVM itself.
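
The two `dyld_info` listings in the Flat Namespaces section differ only in the namespace part of the bind target. As a rough illustration, the sketch below pulls that part out of a single fixup line so a script could flag flat-namespace bindings. The helper names `binding_namespace` and `is_flat_binding` are hypothetical, and the parsing assumes the `<offset> bind pointer <namespace>/<symbol>` layout shown in the listings rather than any documented `dyld_info` format.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the repository.
# Assumes each fixup line ends with "<namespace>/<symbol>", as in the
# dyld_info listings quoted in the commit message.

# Print the namespace (a library name, or "flat-namespace") of one fixup line.
binding_namespace() {
  local line="$1"
  local target="${line##* }"     # keep the last space-separated field
  printf '%s\n' "${target%%/*}"  # keep the text before the first '/'
}

# Succeed only if the line describes a flat-namespace binding.
is_flat_binding() {
  [[ "$(binding_namespace "$1")" == "flat-namespace" ]]
}

binding_namespace "+0x2B28 bind pointer libgmp.10.dylib/___gmpz_clear"
binding_namespace "+0x2EE8 bind pointer flat-namespace/___gmpz_clear"
```

With the two lines from the commit message as input, this prints `libgmp.10.dylib` and then `flat-namespace`. In practice the input would come from piping real `dyld_info -symbolic_fixups` output through the helper line by line.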
1 parent feb291b commit 0399026

6 files changed: +6602 -6 lines changed

bin/llvm-kompile-clang

Lines changed: 21 additions & 5 deletions
@@ -136,7 +136,6 @@ if [[ "$OSTYPE" == "darwin"* ]]; then
   flags=(
     "-L@BREW_PREFIX@/opt/libffi/lib"
     "-L@BREW_PREFIX@/lib"
-    "-L@LLVM_LIBRARY_DIR@"
     "-Wl,-u,_table_getArgumentSortsForTag"
     "-I" "@BREW_PREFIX@/include"
   )
@@ -158,6 +157,19 @@ else
   set_visibility_hidden="$LIBDIR/libSetVisibilityHidden.so"
 fi
 
+# On macOS, we get libunwind supplied as part of the developer tools in the OS,
+# and so don't need to link it directly. If we instead try to explictly link
+# against the libunwind that's part of Homebrew-supplied LLVM, it's easy to end
+# up in a situation where exceptions thrown in a shared library are not
+# compatible with the unwinding machinery in the main application binary. This
+# then manifests as BAD_ACCESS errors (or similar) when an exception is thrown,
+# _even if the exception should in principle be caught_.
+if [[ "$OSTYPE" == "darwin"* ]]; then
+  libunwind=""
+else
+  libunwind="-lunwind"
+fi
+
 # When building the Python AST module, there's no runtime and no main file, so
 # we skip this entire step. The library code is just C++, so we can skip
 # straight to invoking the C++ compiler.
@@ -207,7 +219,7 @@ if [[ "$OSTYPE" == "darwin"* ]]; then
   start_whole_archive="-force_load"
   end_whole_archive=""
 
-  flags+=("-Wl,-flat_namespace" "-Wl,-undefined" "-Wl,dynamic_lookup")
+  flags+=("-Wl,-undefined" "-Wl,dynamic_lookup")
 else
   start_whole_archive="-Wl,--whole-archive"
   end_whole_archive="-Wl,--no-whole-archive"
@@ -218,9 +230,13 @@ if [ "$main" = "static" ]; then
 elif [[ "$main" =~ "python" ]]; then
   # Don't link jemalloc when building a python library; it clashes with the
   # pymalloc implementation that Python expects you to use.
-  all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "-lunwind")
+  all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "$libunwind")
   flags+=("-fPIC" "-shared" "-I${INCDIR}" "-fvisibility=hidden")
 
+  if [[ "$OSTYPE" == "darwin"* ]]; then
+    flags+=("-Wl,-flat_namespace")
+  fi
+
   read -r -a python_include_flags <<< "$("${python_cmd}" -m pybind11 --includes)"
   flags+=("${python_include_flags[@]}")
 
@@ -240,11 +256,11 @@ elif [ "$main" = "c" ]; then
 
   # Avoid jemalloc for similar reasons as Python; we don't know who is loading
   # this library so don't want to impose it.
-  all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "-lunwind")
+  all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "$libunwind")
   flags+=("-fPIC" "-shared" "$start_whole_archive" "$LIBDIR/libkllvmcruntime.a" "$end_whole_archive")
   clangpp_args+=("-o" "${output_file}")
 else
-  all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "-ljemalloc" "-lunwind")
+  all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "-ljemalloc" "$libunwind")
 fi
 
 if $link; then
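
The `libunwind` handling in `bin/llvm-kompile-clang` above stores either `-lunwind` or an empty string in a variable and then places `"$libunwind"` in each library array. The sketch below shows one alternative way of expressing the same per-platform choice: it appends the flag only when it is wanted, so the array never holds an empty element. The helper `build_libraries` and its `ostype` parameter are hypothetical and exist only to make the behaviour easy to exercise; this is not the script's actual structure.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of llvm-kompile-clang.
# Takes an OSTYPE-style string and prints one library flag per line,
# adding -lunwind only when the platform is not macOS ("darwin*").
build_libraries() {
  local ostype="$1"
  local libs=("-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi")
  if [[ "$ostype" != "darwin"* ]]; then
    libs+=("-lunwind")
  fi
  printf '%s\n' "${libs[@]}"
}

echo "macOS:"
build_libraries "darwin23"
echo "Linux:"
build_libraries "linux-gnu"
```

One caveat with this variation is that `nix/llvm-backend.nix` rewrites the literal text `"$libunwind"` in the script, so the form used in the commit is the one that substitution depends on.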

lib/codegen/ApplyPasses.cpp

Lines changed: 1 addition & 0 deletions
@@ -123,6 +123,7 @@ void generate_object_file(llvm::Module &mod, llvm::raw_ostream &os) {
 
   auto features_string = features.getString();
   auto options = TargetOptions{};
+  options.GuaranteedTailCallOpt = true;
 
 #if LLVM_VERSION_MAJOR >= 16
   std::optional<CodeModel::Model> model = std::nullopt;

nix/llvm-backend.nix

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ stdenv.mkDerivation {
       --replace '"-liconv"' '"-L${libiconv}/lib" "-liconv"' \
       --replace '"-lncurses"' '"-L${ncurses}/lib" "-lncurses"' \
       --replace '"-ltinfo"' '"-L${ncurses}/lib" "-ltinfo"' \
-      --replace '"-lunwind"' '"-L${libunwind}/lib" "-lunwind"' \
+      --replace '"$libunwind"' '"-L${libunwind}/lib" "-lunwind"' \
       --replace '"-L@BREW_PREFIX@/opt/libffi/lib"' ' ' \
       --replace '"-L@BREW_PREFIX@/lib"' '-L${libcxx}/lib' \
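
The pattern in the Nix derivation changes because the script no longer contains the literal text `"-lunwind"`; it now contains `"$libunwind"`. The sketch below imitates that rewrite on a single line, assuming `--replace` performs a plain, character-for-character text substitution. The variable names and the store path are made-up placeholders.

```shell
#!/usr/bin/env bash
# Sketch of a literal (non-regex) text substitution, assumed here to mirror
# what `--replace` does to the script. The store path is a placeholder.
script_line='all_libraries=("-lffi" "$libunwind")'
pattern='"$libunwind"'
replacement='"-L/nix/store/EXAMPLE-libunwind/lib" "-lunwind"'

# Quoting "$pattern" makes bash match it literally instead of as a glob.
patched=${script_line//"$pattern"/$replacement}
printf '%s\n' "$patched"
```

Under that assumption, the patched line names the library explicitly, which suggests that Nix builds link `-lunwind` regardless of the `$OSTYPE` check in the script.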

test/c/Inputs/flat-namespace.c

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+#include "api.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+
+/**
+ * The K program corresponding to this test will evaluate the `Int2Bytes` hooked
+ * function when it is run; this hook ends up allocating and freeing a temporary
+ * MPZ integer locally. If we're in a situation where `-flat_namespace` has been
+ * enabled on macOS, it's possible for the hook to end up resolving
+ * `__gmpz_clear` to a symbol defined in the host binary (rather than libgmp's
+ * dynamic library!).
+ *
+ * This originally manifested when the HB booster loaded a C bindings library
+ * (the booster statically links libgmp and so contains a symbol with this
+ * name); this test is a minimised reproduction of that issue.
+ */
+void __gmpz_clear(void *p) {
+  abort();
+}
+
+int main(int argc, char **argv) {
+  if (argc <= 1) {
+    return 1;
+  }
+
+  struct kllvm_c_api api = load_c_api(argv[1]);
+
+  api.kllvm_init();
+
+  kore_sort *sort_foo = api.kore_composite_sort_new("SortFoo");
+  kore_pattern *pat = api.kore_composite_pattern_new("Lblfoo");
+
+  kore_pattern *input = api.kore_pattern_make_interpreter_input(pat, sort_foo);
+
+  block *term = api.kore_pattern_construct(input);
+  block *after = api.take_steps(-1, term);
+
+  printf("%s", api.kore_block_dump(after));
+}
