Address the most important (decided by me) of Teresa's DTLTO review comments #2

bd1976bris · 2025-03-05T16:06:18Z

No description provided.

This will be used for supplying a `-target=<triple>` option to clang for DTLTO. Note that If DTLTO is changed to use an optimisation tool that does not require an explicit triple to be passed then the triple handling can be removed entirely.

The DTLTO ThinLTO backend will override this function to perform code generation. The DTLTO ThinLTO backend will not do any codegen when invoked for each task. Instead, it will generate the required information (e.g., the summary index shard, import list, etc.) to allow for the codegen to be performed externally. The backend's `wait` function will then invoke an external distributor process to do backend compilations.

The DTLTO ThinLTO backend will need to customize the file paths used for the summary index shard files. This is so that the names of these files can include a unique ID so that multiple DTLTO links do not overwrite each other's summary index shard files. It also needs to be able to get the import files list for each job in memory rather than writing it to a file. This is to allow the import lists to be included in the JSON used to communicate with the external distributor process.

The new setup() member is now called to allow the ThinLTO backends to prepare for code generation. For the DTLTO backend, this will be used to preallocate storage for the information required to perform the backend compilation jobs.

Move some setup code and state to a base class. This will allow the DTLTO ThinLTO backend to share this setup code and state by also deriving from this base class.

Structural changes: 1. A new ThinLTO backend implementing DTLTO has been added. 2. Both the new backend and the InProcess backend derive from a common base class to share common setup code and state. 3. The target triple is now stored for the ThinLTO bitcode files. 4. A new setup() member is called, which the ThinLTO backends can use to prepare for code generation. For the DTLTO backend, this is used to pre-allocate storage for the information required to perform the backend compilation jobs. 5. The functions for emitting summary index shard and imports files have been altered to allow the caller to specify the filenames to write and to allow the list of imports to be stored in a container rather than written to a file.

Intentionally minimal for now. Additional state translation will be added in future commits.

Replace the use of AddStream in the DTLTO ThinLTO backend to add files to the link with AddBuffer. Unlike the InProcess ThinLTO backend, DTLTO runs the backend compilation jobs by invoking an external process (currently clang). This writes the output object file to disk. Therefore, DTLTO requires a performant way of adding an existing file to the link. Note that the AddBuffer mechanism is also used for adding a file to the link if there is a cache hit.

Some of the code in LTO.h did not conform to the LLVM coding standard. However, it did match the style already used in that file. However this was causing automated code-formatting checks from Github to fail which was confusing on the PR. I decided to apply clang-format everywhere to prevent this - even though the new code no longer matches the style of the existing.

1. Use "with" when opening files. 2. Add module docstrings.

- Replace `thinlto-remote-opt-tool` with `thinlto-remote-compiler` - Make `thinlto-remote-compiler` a cl::opt

- Pull in documentation improvements.

I have changed this to be `PID`. Note that I expect the naming strategy to change in the future as we get feedback from more distribution systems. I have some concerns that mixing the PID into the filenames might prevent the caching systems for some distribution systems from working effectively.

Thanks. Fixed. > Redundant with earlier line Thanks. Remove the first check leaving only the -save-temps one. > Similar to above suggest s/INDEX/IMPORT/ Thanks. Fixed.

No. Thanks. Fixed now.

…nt string be used here? Thanks. Nice comment. I have removed all the tracing code from the new backend now. I proposed to add time tracing in a separate later PR. Is that acceptable?

Apologies! Something left over from an older revision. I have fixed it to read "(optionally) imports to disk".

…lang and lld patches go in? If so, do we need cross project testing right now? Wondering if it would be better to just stick with the testing you have in this patch that is under llvm/test, and add the cross project testing using only external tools once those patches are in. This test checks that the DTLTO code that the code that creates Clang command lines generates the expected options and that those options are accepted by clang. I wrote it to use llvm-lto and opt so it could be part of this commit before the clang and lld patches go in. However, I think it is a nice principle to only use external tools in cross-project-tests. I have removed the test from this PR with the intention of adding it back later. Additionally if we can use your idea (llvm#127749 (comment)) of storing the Clang `-c` step options and applying them to the remote backend compilations then this test will be rewritten to test that behaviour instead.

The function already exposes a work list to avoid deep recursion, this commit starts utilizing it in a helper that could also lead to a deep recursion. We have observed this crash on `clang/test/C/C99/n590.c` with our internal builds that enable aggressive optimizations and hit the limit earlier than default release builds of Clang. See the added test for an example with a deeper recursion that used to crash in upstream Clang before this change with the following stack trace: ``` #0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13 #1 llvm::sys::RunSignalHandlers() /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Signals.cpp:106:18 #2 SignalHandler(int, siginfo_t*, void*) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3 #3 (/lib/x86_64-linux-gnu/libc.so.6+0x3fdf0) #4 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12772:0 llvm#5 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#6 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#7 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#8 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#9 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#10 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#11 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#12 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#13 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#14 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#15 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#16 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#17 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#18 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#19 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 ... 700+ more stack frames. ```

Fix unnecessary conversion of C-String to StringRef in the `Cmp` lambda inside `lookupLLVMIntrinsicByName`. This both fixes an ASAN error in the code that happens when the `Name` StringRef passed in is not a Null terminated StringRef, and additionally can potentially speed up the code as well by eliminating the unnecessary computation of string length every time a C String is converted to StringRef in this code (It seems practically this computation is eliminated in optimized builds, but this will avoid it in O0 builds as well). Added a unit test that demonstrates this issue by building LLVM with these options: ``` CMAKE_BUILD_TYPE=Debug LLVM_USE_SANITIZER=Address LLVM_OPTIMIZE_SANITIZED_BUILDS=OFF ``` The error reported is as follows: ``` ==462665==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5030000391a2 at pc 0x56525cc30bbf bp 0x7fff9e4ccc60 sp 0x7fff9e4cc428 READ of size 19 at 0x5030000391a2 thread T0 #0 0x56525cc30bbe in strlen (upstream-llvm-second/llvm-project/build/unittests/IR/IRTests+0x713bbe) (BuildId: 0651acf1e582a4d2) #1 0x7f8ff22ad334 in std::char_traits<char>::length(char const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/char_traits.h:399:9 #2 0x7f8ff22a34a0 in llvm::StringRef::StringRef(char const*) /home/rjoshi/upstream-llvm-second/llvm-project/llvm/include/llvm/ADT/StringRef.h:96:33 #3 0x7f8ff28ca184 in _ZZL25lookupLLVMIntrinsicByNameN4llvm8ArrayRefIjEENS_9StringRefES2_ENK3$_0clIjPKcEEDaT_T0_ upstream-llvm-second/llvm-project/llvm/lib/IR/Intrinsics.cpp:673:18 ```

Tracked at llvm#112294 This patch implements from [basic.link]p14 to [basic.link]p18 partially. The explicitly missing parts are: - Anything related to specializations. - Decide if a pointer is associated with a TU-local value at compile time. - [basic.link]p15.1.2 to decide if a type is TU-local. - Diagnose if TU-local functions from other TU are collected to the overload set. See [basic.link]p19, the call to 'h(N::A{});' in translation unit #2 There should be other implicitly missing parts as the wording uses "names" briefly several times. But to implement this precisely, we have to visit the whole AST, including Decls, Expression and Types, which may be harder to implement and be more time-consuming for compilation time. So I choose to implement the common parts. It won't be too bad to miss some cases since we DIDN'T do any such checks in the past 3 years. Any new check is an improvement. Given modules have been basically available since clang15 without such checks, it will be user unfriendly if we give a hard error now. And there are a lot of cases which violating the rule actually just fine. So I decide to emit it as warnings instead of hard errors.

Extend support in LLDB for WebAssembly. This PR adds a new Process plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly targets. It adds support for WebAssembly's memory model with separate address spaces, and the ability to fetch the call stack from the WebAssembly runtime. I have tested this change with the WebAssembly Micro Runtime (WAMR, https://github.com/bytecodealliance/wasm-micro-runtime) which implements a GDB debug stub and supports the qWasmCallStack packet. ``` (lldb) process connect --plugin wasm connect://localhost:4567 Process 1 stopped * thread #1, name = 'nobody', stop reason = trace frame #0: 0x40000000000001ad wasm32_args.wasm`main: -> 0x40000000000001ad <+3>: global.get 0 0x40000000000001b3 <+9>: i32.const 16 0x40000000000001b5 <+11>: i32.sub 0x40000000000001b6 <+12>: local.set 0 (lldb) b add Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c (lldb) c Process 1 resuming Process 1 stopped * thread #1, name = 'nobody', stop reason = breakpoint 1.1 frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12 1 int 2 add(int a, int b) 3 { -> 4 return a + b; 5 } 6 7 int (lldb) bt * thread #1, name = 'nobody', stop reason = breakpoint 1.1 * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12 frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12 frame #2: 0x40000000000001fe wasm32_args.wasm ``` This PR is based on an unmerged patch from Paolo Severini: https://reviews.llvm.org/D78801. I intentionally stuck to the foundations to keep this PR small. I have more PRs in the pipeline to support the other features/packets. My motivation for supporting Wasm is to support debugging Swift compiled to WebAssembly: https://www.swift.org/documentation/articles/wasm-getting-started.html

Pointers and GEP are untyped. SPIR-V required structured OpAccessChain. This means the backend will have to determine a good way to retrieve the structured access from an untyped GEP. This is not a trivial problem, and needs to be addressed to have a robust compiler. The issue is other workstreams relies on the access chain deduction to work. So we have 2 options: - pause all dependent work until we have a good chain deduction. - submit this limited fix to we can work on both this and other features in parallel. Choice we want to make is #2: submitting this **knowing this is not a good** fix. It only increase the number of patterns we can work with, thus allowing others to continue working on other parts of the backend. This patch as-is has many limitations: - If cannot robustly determine the depth of the structured access from a GEP. Fixing this would require looking ahead at the full GEP chain. - It cannot always figure out the correct access indices, especially with dynamic indices. This will require frontend collaboration. Because we know this is a temporary hack, this patch only impacts the logical SPIR-V target. Physical SPIR-V, which can rely on pointer cast remains on the old method. Related to llvm#145002

bd1976bris added 22 commits February 19, 2025 13:28

[DTLTO][LLVM] Derive the InProcess backend from a base class

d55d8c0

Move some setup code and state to a base class. This will allow the DTLTO ThinLTO backend to share this setup code and state by also deriving from this base class.

[DTLTO][LLVM] Translate some LTO configuration state into clang options.

0490b3b

Intentionally minimal for now. Additional state translation will be added in future commits.

[DTLTO][LLVM][Doc] Add DTLTO documentation

0c46c0c

Improve the test distributors

9b5162d

1. Use "with" when opening files. 2. Add module docstrings.

Address minor test nits

2be4b0c

Support more than one LTO partition.

8923484

UI improvements to the current remote-opt-tool

0bfafdd

- Replace `thinlto-remote-opt-tool` with `thinlto-remote-compiler` - Make `thinlto-remote-compiler` a cl::opt

Update python module docstrings to match the script names

2f9d381

Sync with llvm#127749

a828b70

- Pull in documentation improvements.

> Maybe NOIMPORTFILES (they are imports files not index files)?

1eebf91

Thanks. Fixed. > Redundant with earlier line Thanks. Remove the first check leaving only the -save-temps one. > Similar to above suggest s/INDEX/IMPORT/ Thanks. Fixed.

> Is this description correct?

b46bfab

No. Thanks. Fixed now.

> This isn't really tracing the full "thin backend", should a differe…

a98fe79

…nt string be used here? Thanks. Nice comment. I have removed all the tracing code from the new backend now. I proposed to add time tracing in a separate later PR. Is that acceptable?

> What is IndexPath?

88e6151

Apologies! Something left over from an older revision. I have fixed it to read "(optionally) imports to disk".

bd1976bris force-pushed the DTLTO_llvm_only branch from 142bd00 to 8b109cb Compare May 23, 2025 03:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Address the most important (decided by me) of Teresa's DTLTO review comments #2

Address the most important (decided by me) of Teresa's DTLTO review comments #2

Uh oh!

bd1976bris commented Mar 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Address the most important (decided by me) of Teresa's DTLTO review comments #2

Are you sure you want to change the base?

Address the most important (decided by me) of Teresa's DTLTO review comments #2

Uh oh!

Conversation

bd1976bris commented Mar 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant