[SYCL-RTC] Distribute SYCL toolchain header files inside libsycl-jit.so #19924
Conversation
We only run SYCL JIT on a single TU at a time, so using `ClangTool` is a bit awkward: it is primarily meant to run the same action across a set of files: https://github.com/intel/llvm/blob/357f96b7e19d8acb972eb2f1fb276dbc6aa2060b/clang/include/clang/Tooling/Tooling.h#L310-L317

Using `ToolInvocation` better matches our scenario of always performing a single clang invocation: https://github.com/intel/llvm/blob/357f96b7e19d8acb972eb2f1fb276dbc6aa2060b/clang/include/clang/Tooling/Tooling.h#L244-L245

Another benefit is that we get more control over the virtual file system, which I'm planning to use in a subsequent PR to distribute the SYCL toolchain headers inside `libsycl-jit.so` and load them into an `llvm::vfs::InMemoryFileSystem` once, to be reused across all compilation queries.

I'm also simplifying the inheritance scheme around `clang::ToolAction`. Instead of having both the hashing and compiling paths derive from it, I'm providing a single helper that accepts a reference to the `FrontendAction`, which can be kept on the caller's stack. This reduces the amount of boilerplate needed: the `RTCToolActionBase`/`GetSourceHashAction`/`GetLLVMModuleAction` trio before vs. a single `SYCLToolchain::Action` after.
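For illustration, here is a minimal sketch of a single `ToolInvocation` run over a virtual file system. The file path, command line, and `compileOnce` helper are assumptions made up for this example, not the PR's actual code:

```cpp
#include "clang/Basic/FileManager.h"
#include "clang/Frontend/FrontendActions.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/VirtualFileSystem.h"

bool compileOnce(llvm::StringRef Source) {
  // Hold the virtual TU in memory; no real file ever touches disk.
  auto InMemFS = llvm::makeIntrusiveRefCnt<llvm::vfs::InMemoryFileSystem>();
  InMemFS->addFile("/rtc/input.cpp", /*ModificationTime=*/0,
                   llvm::MemoryBuffer::getMemBuffer(Source));

  // Overlay the in-memory files on top of the real FS so system headers
  // still resolve (a later PR replaces those with embedded headers).
  auto OverlayFS = llvm::makeIntrusiveRefCnt<llvm::vfs::OverlayFileSystem>(
      llvm::vfs::getRealFileSystem());
  OverlayFS->pushOverlay(InMemFS);

  clang::FileManager Files(clang::FileSystemOptions(), OverlayFS);

  // One invocation per query; no ClangTool, no compilation database.
  clang::tooling::ToolInvocation Invocation(
      {"clang++", "-fsyntax-only", "/rtc/input.cpp"},
      std::make_unique<clang::SyntaxOnlyAction>(), &Files);
  return Invocation.run();
}
```

Note that `ToolInvocation` also has an overload taking a `ToolAction *`; the sketch uses the `FrontendAction` overload only for brevity.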
Please ignore the first commit during review; it's uploaded separately as #19922.
We're still accessing `libdevice`'s `*.bc` files from the file system; those are left for another PR.
```cmake
  # TODO: libdevice
)

set(CMAKE_CXX_COMPILER ${CMAKE_BINARY_DIR}/bin/clang++)
```
That's not enough (see the Win CI failure), but I don't know if we have any precedents for doing something similar. Any pointers would be greatly appreciated.
Could try various combinations of `PARENT_SCOPE`, `CACHE`, and `FORCE`.
Also, maybe try setting the C compiler too, since `cl` is both a C and a C++ compiler?
Also, doing it this way is unsupported; it seems that if we make this a subproject, it will reliably work.
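For reference, a hedged sketch of the subproject approach via `ExternalProject`; the target name and directory layout here are made up for illustration:

```cmake
include(ExternalProject)

# Configure the resource-generation step as a separate CMake project so it
# can use the just-built clang++ without mutating this project's compiler,
# which mid-configure `set(CMAKE_CXX_COMPILER ...)` does unsupportedly.
ExternalProject_Add(sycl-jit-resources
  SOURCE_DIR  ${CMAKE_CURRENT_SOURCE_DIR}/resources
  CMAKE_ARGS  -DCMAKE_C_COMPILER=${CMAKE_BINARY_DIR}/bin/clang
              -DCMAKE_CXX_COMPILER=${CMAKE_BINARY_DIR}/bin/clang++
  INSTALL_COMMAND ""
  DEPENDS clang
)
```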
```cpp
namespace jit_compiler {
// Defined in the auto-generated file:
extern const std::pair<std::string_view, std::string_view> ToolchainFiles[];
```
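For context, a hedged sketch of how such an embedded file table could be loaded into an in-memory VFS once and shared across compilation queries; the `NumToolchainFiles` companion count and `loadToolchainFS` helper are assumptions, not the PR's actual names:

```cpp
#include <cstddef>
#include <string_view>
#include <utility>

#include "llvm/ADT/IntrusiveRefCntPtr.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/VirtualFileSystem.h"

namespace jit_compiler {
extern const std::pair<std::string_view, std::string_view> ToolchainFiles[];
extern const std::size_t NumToolchainFiles; // assumed companion count
} // namespace jit_compiler

llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem> loadToolchainFS() {
  auto FS = llvm::makeIntrusiveRefCnt<llvm::vfs::InMemoryFileSystem>();
  for (std::size_t I = 0; I < jit_compiler::NumToolchainFiles; ++I) {
    const auto &Entry = jit_compiler::ToolchainFiles[I];
    llvm::StringRef Path(Entry.first.data(), Entry.first.size());
    llvm::StringRef Contents(Entry.second.data(), Entry.second.size());
    // Zero mtime: the embedded headers never change during the process.
    FS->addFile(Path, /*ModificationTime=*/0,
                llvm::MemoryBuffer::getMemBuffer(
                    Contents, Path, /*RequiresNullTerminator=*/false));
  }
  return FS;
}
```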
Are there any ABI problems looming here?
I wouldn't think so; the use is limited to `libsycl-jit.so`, and this isn't exported outside the DSO.
Makes sense. What I was getting at was that, IIUC, the TU containing `ToolchainFiles` is compiled with a different compiler and flags than the rest of `sycl-jit.so`, so maybe there's a future ABI problem hiding in there. Seems fine for now, though.
> the TU containing `ToolchainFiles` is compiled with a different compiler and flags than the rest of `sycl-jit.so`, so maybe there's a future ABI problem hiding in there.
I see now what you meant. I believe that if any such issues happen, we'll catch them immediately with our E2E tests, because now that these files are distributed with `libsycl-jit.so`, the behavior is much more reproducible. Also, `resource.cpp.o` is very simple after compiling with optimizations.
Driver changes LGTM.
Failed after #19924. `remangled-*` requiring `libspirv-builtins` on top of `libclc` is technically a separate issue, but I realized it while working on fixing the CI, so it can go in the same PR.
Follows intel#19924, which did the same for headers. With this PR, all SYCL toolchain dependencies are linked as resources into `libsycl-jit.so`, and the DPC++ installation on disk is no longer accessed at runtime.
This change breaks cross-compilation. Can you take a look, please?
I don't have access to such a configuration, and it's not part of the CI. I think it would be far more effective if you looked at it on your side.
It seems like the problem here is using the just-built compiler at all, but we already do that for libdevice, so I'm not sure why that case works but this fails.
For libdevice, we always build for a specific target and include an appropriate […]. We could add a […].
Ah ok, got it. I don't fully remember/understand the use case for the recent sycl-jit change, so I'll let @aelovikov-intel decide if just using the host triple is sufficient.
The cross-compilation PR is #20013. But we'll probably remove cross-compilation on our end soon, as we'll no longer be in a position to contribute fixes when it breaks.
We use `/MT` downstream, where #19924 has caused some issues. This fixes that.

Co-authored-by: Jinsong Ji <[email protected]>
…in (#20360): This regressed due to #19924 but, apparently, we didn't have proper tests in place. I'm not sure exactly what's causing this, but having each compilation create its own `ToolchainFS` instead of all of them sharing the same `llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem> SYCLToolchain::ToolchainFS` somehow makes the test (added in this PR) pass consistently.
#19924 essentially made it `static`, but that caused data races, which were later fixed by #20360 changing each use to re-create this in-memory FS (essentially "removing" `static`), incurring significant performance costs. This PR addresses the issue by "adding" `thread_local` instead of "removing" `static`, giving us both no crashes from data races and minimal overhead. No tests added, as the one from #20360 verifies this.
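A hedged sketch of the `thread_local` shape described above; `createToolchainFS` and `getToolchainFS` are assumed names, not the patch's actual code:

```cpp
#include "llvm/ADT/IntrusiveRefCntPtr.h"
#include "llvm/Support/VirtualFileSystem.h"

// Assumed helper that builds and populates the in-memory toolchain FS.
llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem> createToolchainFS();

// One FS per thread: no cross-thread data races, and the construction cost
// is paid once per thread instead of once per compilation.
llvm::vfs::InMemoryFileSystem &getToolchainFS() {
  thread_local llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem> FS =
      createToolchainFS();
  return *FS;
}
```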