Skip to content

Conversation

@jathu
Copy link
Contributor

@jathu jathu commented Sep 28, 2023

Fixes #63890 | Migration of D158942


These builtin headers are required to exist alongside clang-tidy for it to work. Specifically, the tool looks for headers relatively in ../lib/clang/{major version}/include.

cc @rnk @rupprecht @aeubanks @chapuni

@jathu jathu marked this pull request as ready for review September 28, 2023 01:24
@jathu jathu changed the title [clang-tidy][bazel] Include builtin headers with clang-tidy [Bazel][Clang Tidy] Include builtin headers with clang-tidy Sep 29, 2023
@aeubanks
Copy link
Contributor

I think there are two approaches to this problem

  1. point -resource-dir to the directory where the headers are, e.g. clang-tidy --extra-arg=-resource-dir=path/to/include
  2. mirror the CMake binary/include layout (this patch)

issues with 1: extra argument to pass, have to figure out include path to pass
issues with 2: relies on bazel paths, i.e. this puts the symlink at bazel-bin/external/llvm-project/clang-tools-extra/ which happens to be two directories up from bazel-bin/external/llvm-project/clang-tools-extra/clang-tidy/clang-tidy

for 1, I'm not sure how hard it is in bazel to get the path for where we put the builtins
for 2, is it possible to explicitly copy clang-tidy and the headers to a path that mirrors the CMake build? e.g. put clang-tidy in a bin/ dir and the headers in lib/clang/18/include. or is that too un-bazel-like? the gn build does this.

@aeubanks aeubanks self-requested a review October 23, 2023 21:33
@rnk
Copy link
Collaborator

rnk commented Oct 23, 2023

I think option 1 isn't really a permanent solution. We have lots of clang tools that need to find the resource directory, and it should happen automatically.

For option 2, we'd have to reimplement that for every other clang tool that needs to find resources, like LLD as well.

Should we consider baking in some kind of configurable resource directory searching logic into the binary, similar to an RPATH? Essentially, instead of searching at "$(dirname $argv0)/../lib/clang/${version}", we'd make "../lib/clang" configurable, or provide an absolute path.

Actually, a CMake option for this already exists, CLANG_RESOURCE_DIR:
https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Config/config.h.cmake#L39C1-L39C1

Should Bazel go ahead and use that directly? Should we perhaps enhance it so that it is baked into each executable, rather than having it global to in config.h?

@aeubanks
Copy link
Contributor

my worry about baking in the resource dir for the bazel build is that it'll be tied to the bazel build directory structure implementation details, so it's likely that you can't copy the binaries around and create a toolchain out of it. if bazel users don't care about this, then that's fine, but it seems like a common use case to assemble a toolchain that resembles a toolchain built with CMake. that's why I think it'd be nice to explicitly copy files into a typical toolchain structure, and clang tools that rely on the default builtin headers relative path will all "just work"

or something along the lines of this patch where we symlink the headers (choosing relative directories in a more structured way if possible as mentioned earlier, otherwise this is fine if there's no cleaner way)

Copy link

@erl4ng erl4ng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@jathu
Copy link
Contributor Author

jathu commented Nov 3, 2023

@aeubanks @rnk

issues with 2: relies on bazel paths, i.e. this puts the symlink at bazel-bin/external/llvm-project/clang-tools-extra/ which happens to be two directories up from bazel-bin/external/llvm-project/clang-tools-extra/clang-tidy/clang-tidy

I also think this is weird, however this seems to be the location clang-tidy expects the headers to be located. This symlink approach was the least intrusive solution I can think of without modifying the overlay logic.

my worry about baking in the resource dir for the bazel build is that it'll be tied to the bazel build directory structure implementation details, so it's likely that you can't copy the binaries around and create a toolchain out of it

Maybe I'm misunderstanding your point — since these headers are included in the runfiles of the binary, if we do a copy/package the binary, the runfiles will be bundled with the binary. If there is a solution to bundle the headers within the binary, that would be great, but I think that goes beyond the goal of this diff.

@chapuni
Copy link
Contributor

chapuni commented Nov 26, 2023

I still think this is not a right way to control Bazel's build directory tree.

IMO, we should publish the tree with the archive (with filegroup), or produce the utility shell script to construct the symlink-ed tree for one's convenience.

@jathu
Copy link
Contributor Author

jathu commented Nov 26, 2023

@chapuni can you help my understand what you mean by those two suggestions by providing some examples?

construct the symlink-ed tree for one's convenience

Isn't that what we are doing in this diff? We are producing the symlinked tree for the tool.

@jathu
Copy link
Contributor Author

jathu commented Feb 9, 2024

@rnk @aeubanks any feedback or update on this? I believe this solution is reasonable enough to land as is — we can follow up with a preferred solution later on. Please note that the Bazel clang-tidy binary is unusable in its current state

@aeubanks
Copy link
Contributor

$ cat /tmp/b.c
#include <stddef.h>
$ bazel-bin/external/llvm-project/clang/clang -fsyntax-only /tmp/b.c
/tmp/b.c:1:10: fatal error: 'stddef.h' file not found
    1 | #include <stddef.h>
      |          ^~~~~~~~~~
1 error generated.

clang also doesn't work directly from bazel-bin. I don't think it makes sense to add in a workaround just for clang-tidy if clang doesn't even work. I still think this points to wanting a more complete solution where we properly mirror the upstream build directory.

@jathu
Copy link
Contributor Author

jathu commented Feb 17, 2024

@aeubanks

I still think this points to wanting a more complete solution where we properly mirror the upstream build directory.

Seems like this large effort can only be accomplished by the team at Google. Would this be prioritized at all by the Bazel/Clang team? This will ladder up to having a true hermetic C++ build environment

I don't think it makes sense to add in a workaround just for clang-tidy if clang doesn't even work.

Please help me understand the reasoning behind this — this change is relatively independent to the rest of the llvm repo. It primarily exists (based on the mirror) to support clang-tools-extra. The change is pretty benign considering it's just symlinking a tree without affecting the source


If this proposed change is a hard no, is there a way to get clang-tidy to work in its current state? I've tried various attempts (#63890), and it doesn't seem possible.

@aeubanks
Copy link
Contributor

Just checked our internal bazel build, and it's the same issue, a just-built clang without any setting up can't find system headers. We have to copy all built binaries/libraries into a certain directory structure as a separate build step to roughly match the CMake build. Either you can do that as a separate step outside of bazel, or have bazel setup the proper directory structure, but this PR is very specific to clang-tidy and I'd rather see a more complete and maintainable solution.

For example, the gn build has a rule to copy headers to a specific directory to mirror the CMake build, and by default it puts executables in $build_dir/bin, which also mirrors the CMake build. In contrast, this PR creates a symlink to headers that only works for clang-tidy, and sort of by happenstance as I've mentioned before, which is too hacky for my taste.

Perhaps @rupprecht has some thoughts, I believe he's been doing something similar recently.

@rupprecht
Copy link
Collaborator

Just checked our internal bazel build, and it's the same issue, a just-built clang without any setting up can't find system headers. We have to copy all built binaries/libraries into a certain directory structure as a separate build step to roughly match the CMake build. Either you can do that as a separate step outside of bazel, or have bazel setup the proper directory structure, but this PR is very specific to clang-tidy and I'd rather see a more complete and maintainable solution.

For example, the gn build has a rule to copy headers to a specific directory to mirror the CMake build, and by default it puts executables in $build_dir/bin, which also mirrors the CMake build. In contrast, this PR creates a symlink to headers that only works for clang-tidy, and sort of by happenstance as I've mentioned before, which is too hacky for my taste.

Perhaps @rupprecht has some thoughts, I believe he's been doing something similar recently.

It's definitely not at all ready (even internally), but the gist of what I have is to specify a toolchain configuration (i.e. what binaries you want to include in your toolchain, what platforms to support, etc.) and be able to use that as input to construct a runfiles tree that mirrors what an actual toolchain would look like -- so binaries would be in the same bin dir even if they're in different bazel packages, the headers would be in the right relative location to all the clang tools (like that GN link), etc.

I think it might be quite a while until I decouple my changes from all the internal stuff it uses. Maybe I could simplify it and put it somewhere in utils/bazel?

@rnk
Copy link
Collaborator

rnk commented Feb 21, 2024

I think the important thing I want to get out of the review is agreement about the long term direction, and it sounds like @rupprecht 's solution is a good candidate.

If we agree on that, are folks OK approving this in the short term? Even if this clang-tidy symlink will be removed later, it doesn't seem like high interest tech debt.

@jathu
Copy link
Contributor Author

jathu commented Feb 24, 2024

@aeubanks @rupprecht thanks for the update. The toolchain configuration seems like a great long term solution — I'm glad this is being developed internally and hope to see it in the public repos soon.

For now, as @rnk suggested, can we land this diff to allow clang-tidy to be used out of the box. We can update this once @rupprecht lands his changes.

@jathu jathu closed this Nov 28, 2024
@jathu jathu deleted the clang-tools-headers branch November 28, 2024 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bazel][Clang Tidy] Unable to find standard library

6 participants