[Bazel] Export compiler-rt builtins sources #157200
This provides a structured collection of the source files used in the compiler-rt builtins library in Bazel.
Normal build rules often don't work for runtime libraries, as they may need to be built for a specific target platform and in an environment with the associated SDK available to build for that target. Instead, this PR exports the sources in a structured way that downstream users can use to collect and build these runtimes in a target-appropriate manner.
Currently, this includes AArch64, AArch32 (with and without VFP), x86-64, i386, PPC, and RISC-V. Where I could see a useful division of functionality, those divisions are also exposed.
The rules use over-wide globs to minimize the need to manually update lists of files and to reduce the risk of those lists slipping out of date.
Do keep in mind that the wide globs could lead to accidentally "overincluding" files in the future which might lead to bugs that are harder to track down than more obvious "missing file" bugs which can fairly easily be fixed via visual comparison to the CMake sources. Not sure what the better tradeoff is.
"echo '#define MODEL " + model + "' >> $(OUTS) && " +
"cat $(SRCS) >> $(OUTS)"
),
)] for (pat, size, model) in AARCH64_OUTLINE_ATOMICS]
nit: It looks like CMake uses the preprocessor here to resolve these values directly for the outputs. IIUC, prepending them to the sources could leak these defines to consumers. The builtins_aarch64_srcs seems to be a final target, so it wouldn't be noticeable in the LLVM build itself.
Since these are .S files, I'm not sure how relevant this actually is. If it is a concern, an alternative might be something like the following. It might initially seem inefficient, but (at least with the toolchain I tested this with) the loop breaks after one iteration, because the compiler comes first in this list and the list itself has consistent ordering.
cmd = """
$$(for tool in $(locations @bazel_tools//tools/cpp:current_cc_toolchain); do
if [[ $$tool == *clang ]] || [[ $$tool == *gcc ]]; then
echo $$tool
break
fi
done) -E WHATEVER_COMMAND_ARGS > $@
""",
tools = ["@bazel_tools//tools/cpp:current_cc_toolchain"],
COMMAND ${CMAKE_COMMAND} -E ${COMPILER_RT_LINK_OR_COPY} "${source_asm}" "${helper_asm}"
My thought was precisely that by keeping these as .S files we can have clang do this for us while assembling.
Downstream, this pattern has been working well for some time, so I'm hesitant to change it to something more complex at this stage. We can of course revisit this if anyone encounters problems with the approach.
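For readers following along, the echo-then-cat pattern discussed above can be sketched outside Bazel. This is only an illustration in Python, not the rule from the PR: the helper name `prepend_defines`, the file names, the `SIZE` define name, and the values are all hypothetical. Only the `#define MODEL` line and the `cat $(SRCS)` step appear in the quoted snippet.

```python
import tempfile
from pathlib import Path


def prepend_defines(src, out, defines):
    """Write '#define NAME VALUE' lines, then the unmodified source text.

    Mirrors the shape of: echo '#define ...' >> out && cat src >> out
    (hypothetical helper, for illustration only).
    """
    header = "".join(f"#define {name} {value}\n" for name, value in defines.items())
    out.write_text(header + src.read_text())


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the shared assembly source; the real file is not reproduced here.
    src = Path(tmp) / "lse.S"
    src.write_text("// body of the shared assembly source\n")

    # One generated variant; the define names and values are assumptions.
    out = Path(tmp) / "outline_atomic_example.S"
    prepend_defines(src, out, {"SIZE": "4", "MODEL": "1"})
    generated = out.read_text()
```

Because the output keeps the .S extension, the defines are only resolved when the compiler driver preprocesses the file during assembly, which is the behavior the reply above relies on.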
Ah yeah, I did mean to keep the .S files but use the preprocessor, so that the defines don't somehow collide with anyone who includes these files or the assembly.h file. (The lse.S file doesn't seem to have include guards.)
But tbh this seems like something that might make more sense to change on the CMake side first, if at all. Once the rule-based cc_toolchains in rules_cc get a bit more stable and widely used, it might make sense to revisit this if they make it easier to get just a preprocessor. At the moment, the echos seem like the better option, given that this already works for current downstream usage.
bazel_skylib's template expansion isn't really an option here either, as the usage of these macros is not something that can easily be substituted in via the template engine.
So I think this is good the way it is 👍
Thanks for the review!
Note that in testing I discovered I needed to do something similar for the header-only SipHash used by some of the builtins. PTAL if you can?
Do keep in mind that the wide globs could lead to accidentally "overincluding" files in the future which might lead to bugs that are harder to track down than more obvious "missing file" bugs which can fairly easily be fixed via visual comparison to the CMake sources. Not sure what the better tradeoff is.
Well, in working with an explicit list for a few years downstream, we have definitely ended up with missing files that went unnoticed. So I think at least initially that motivates using the globs.
When initially building this version, I ended up with overly broad globs in several cases, and most (but not all) were easily caught because they failed to build for the relevant target platform. So I'm hopeful that this tradeoff works well in practice, but I'm certainly happy for us to revisit it if over-inclusion becomes a problem.
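One way to make both failure modes visible is to diff what a glob matches against an explicit reference list, such as one transcribed from the CMake sources. The sketch below is an assumption-laden illustration, not part of this PR: `audit_glob` is a hypothetical helper, the file names are placeholders, and Python's `fnmatch` is only an approximation of Bazel's glob semantics.

```python
from fnmatch import fnmatch


def audit_glob(all_files, patterns, expected):
    """Return (over_included, missing) for a set of glob patterns.

    over_included: files the patterns match that the reference list lacks.
    missing: files in the reference list that no pattern matches.
    """
    matched = {f for f in all_files if any(fnmatch(f, p) for p in patterns)}
    reference = set(expected)
    return sorted(matched - reference), sorted(reference - matched)


# Placeholder inputs: a directory listing, the glob patterns, and a reference list.
over_included, missing = audit_glob(
    all_files=["a.c", "b.c", "notes.txt", "x.S"],
    patterns=["*.c"],
    expected=["a.c", "x.S"],
)
```

Here `over_included` holds the files the glob picked up unexpectedly and `missing` holds the ones an explicit list would have caught, so a periodic check of this kind covers both sides of the tradeoff discussed above.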
LGTM 👍
Thanks for the review, merging!
These were discovered by testing on more platforms and manual comparison of the built runtimes with working ones to spot issues. It also includes fixes enabled by switching to the upstream LLVM CompilerRT support added in llvm/llvm-project#157200