forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 2
Gpu san backup #8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
vidsinghal
wants to merge
46
commits into
jdoerfert:gpu_san
Choose a base branch
from
vidsinghal:gpu_san_backup
base: gpu_san
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Through the new `-foffload-via-llvm` flag, CUDA kernels can now be lowered to the LLVM/Offload API. On the Clang side, this is simply done by using the OpenMP offload toolchain and emitting calls to `llvm*` functions to orchestrate the kernel launch rather than `cuda*` functions. These `llvm*` functions are implemented on top of the existing LLVM/Offload API. As we are about to redefine the Offload API, this wil help us in the design process as a second offload language. We do not support any CUDA APIs yet, however, we could: https://www.osti.gov/servlets/purl/1892137 For proper host execution we need to resurrect/rebase https://tianshilei.me/wp-content/uploads/2021/12/llpp-2021.pdf (which was designed for debugging). ``` ❯❯❯ cat test.cu extern "C" { void *llvm_omp_target_alloc_shared(size_t Size, int DeviceNum); void llvm_omp_target_free_shared(void *DevicePtr, int DeviceNum); } __global__ void square(int *A) { *A = 42; } int main(int argc, char **argv) { int DevNo = 0; int *Ptr = reinterpret_cast<int *>(llvm_omp_target_alloc_shared(4, DevNo)); *Ptr = 7; printf("Ptr %p, *Ptr %i\n", Ptr, *Ptr); square<<<1, 1>>>(Ptr); printf("Ptr %p, *Ptr %i\n", Ptr, *Ptr); llvm_omp_target_free_shared(Ptr, DevNo); } ❯❯❯ clang++ test.cu -O3 -o test123 -foffload-via-llvm --offload-arch=native ❯❯❯ llvm-objdump --offloading test123 test123: file format elf64-x86-64 OFFLOADING IMAGE [0]: kind elf arch gfx90a triple amdgcn-amd-amdhsa producer openmp ❯❯❯ LIBOMPTARGET_INFO=16 ./test123 Ptr 0x155448ac8000, *Ptr 7 Ptr 0x155448ac8000, *Ptr 42 ```
This provides the header overlay for cuda_runtime.h which is found before any CUDA installation (none is necessary). Some basic APIs are defined in terms of the omp_target_* ones, but with the API redesign the requirements of CUDA should be taken into account. Based on: llvm#94549
The offload APIs, and the CUDA wrappers in clang, now support "default streams" per thread (and per device). It should be per context but we don't really expose that concept yet. The KernelArguments allow an LLVM/Offload user to provide a "AsyncInfoQueue", which is plugin dependent and can later also be created outside or queried from the runtime. User managed "queues" are kept persistent, thus not returned to the pool once synchronized. The CUDA tests will synchronize via `cudaDeviceSynchronize` before checking the results. Based on: llvm#94821
1edce23 to
9b7ad4b
Compare
3a99d18 to
acf4219
Compare
jdoerfert
pushed a commit
that referenced
this pull request
Aug 30, 2024
…104523) Compilers and language runtimes often use helper functions that are fundamentally uninteresting when debugging anything but the compiler/runtime itself. This patch introduces a user-extensible mechanism that allows for these frames to be hidden from backtraces and automatically skipped over when navigating the stack with `up` and `down`. This does not affect the numbering of frames, so `f <N>` will still provide access to the hidden frames. The `bt` output will also print a hint that frames have been hidden. My primary motivation for this feature is to hide thunks in the Swift programming language, but I'm including an example recognizer for `std::function::operator()` that I wished for myself many times while debugging LLDB. rdar://126629381 Example output. (Yes, my proof-of-concept recognizer could hide even more frames if we had a method that returned the function name without the return type or I used something that isn't based off regex, but it's really only meant as an example). before: ``` (lldb) thread backtrace --filtered=false * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10 frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25 frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12 frame #3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12 frame #4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10 frame #5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12 frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10 frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10 frame #8: 0x0000000183cdf154 dyld`start + 2476 (lldb) ``` after ``` (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10 frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25 frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12 frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10 frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10 frame #8: 0x0000000183cdf154 dyld`start + 2476 Note: Some frames were hidden by frame recognizers ```
jdoerfert
pushed a commit
that referenced
this pull request
Sep 20, 2024
…lvm#107294) Random testing revealed it's possible to crash the analyzer with the command line invocation: clang -cc1 -analyze -analyzer-checker=nullability empty.c where the source file, empty.c is an empty source file. ``` clang: <root>/clang/lib/StaticAnalyzer/Core/CheckerManager.cpp:56: void clang::ento::CheckerManager::finishedCheckerRegistration(): Assertion `Event.second.HasDispatcher && "No dispatcher registered for an event"' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ Stack dump: 0. Program arguments: clang -cc1 -analyze -analyzer-checker=nullability nullability-nocrash.c #0 ... ... #7 <addr> clang::ento::CheckerManager::finishedCheckerRegistration() #8 <addr> clang::ento::CheckerManager::CheckerManager(clang::ASTContext&, clang::AnalyzerOptions&, clang::Preprocessor const&, llvm::ArrayRef<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, llvm::ArrayRef<std::function<void (clang::ento::CheckerRegistry&)>>) ``` This commit removes the assertion which failed here, because it was logically incorrect: it required that if an Event is handled by some (enabled) checker, then there must be an **enabled** checker which can emit that kind of Event. It should be OK to disable the event-producing checkers but enable an event-consuming checker which has different responsibilities in addition to handling the events. Note that this assertion was in an `#ifndef NDEBUG` block, so this change does not impact the non-debug builds. Co-authored-by: Vince Bridgers <[email protected]>
jdoerfert
pushed a commit
that referenced
this pull request
Sep 20, 2024
Random testing found that the Z3 wrapper does not support UnarySymExpr, which was added recently and not included in the original Z3 wrapper. For now, just avoid submitting expressions to Z3 to avoid compiler crashes. Some crash context ... clang -cc1 -analyze -analyzer-checker=core z3-unarysymexpr.c -analyzer-constraints=z3 Unsupported expression to reason about! UNREACHABLE executed at clang/include/clang/StaticAnalyzer/Core/PathSensitive/SMTConstraintManager.h:297! Stack dump: 3. <root>/clang/test/Analysis/z3-unarysymexpr.c:13:7: Error evaluating branch #0 <addr> llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) #1 <addr> llvm::sys::RunSignalHandlers() #8 <addr> clang::ento::SimpleConstraintManager::assumeAux( llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) #9 <addr> clang::ento::SimpleConstraintManager::assume( llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) Co-authored-by: einvbri <[email protected]>
jdoerfert
pushed a commit
that referenced
this pull request
May 9, 2025
The mcmodel=tiny memory model is only valid on ARM targets. While trying this on X86 compiler throws an internal error along with stack dump. llvm#125641 This patch resolves the issue. Reduced test case: ``` #include <stdio.h> int main( void ) { printf( "Hello, World!\n" ); return 0; } ``` ``` 0. Program arguments: /opt/compiler-explorer/clang-trunk/bin/clang++ -gdwarf-4 -g -o /app/output.s -fno-verbose-asm -S --gcc-toolchain=/opt/compiler-explorer/gcc-snapshot -fcolor-diagnostics -fno-crash-diagnostics -mcmodel=tiny <source> 1. <eof> parser at end of file #0 0x0000000003b10218 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b10218) #1 0x0000000003b0e35c llvm::sys::CleanupOnSignal(unsigned long) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b0e35c) #2 0x0000000003a5dbc3 llvm::CrashRecoveryContext::HandleExit(int) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a5dbc3) #3 0x0000000003b05cfe llvm::sys::Process::Exit(int, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b05cfe) #4 0x0000000000d4e3eb LLVMErrorHandler(void*, char const*, bool) cc1_main.cpp:0:0 #5 0x0000000003a67c93 llvm::report_fatal_error(llvm::Twine const&, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a67c93) #6 0x0000000003a67df8 (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a67df8) #7 0x0000000002549148 llvm::X86TargetMachine::X86TargetMachine(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, std::optional<llvm::Reloc::Model>, std::optional<llvm::CodeModel::Model>, llvm::CodeGenOptLevel, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x2549148) #8 0x00000000025491fc llvm::RegisterTargetMachine<llvm::X86TargetMachine>::Allocator(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, std::optional<llvm::Reloc::Model>, std::optional<llvm::CodeModel::Model>, llvm::CodeGenOptLevel, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x25491fc) #9 0x0000000003db74cc clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3db74cc) #10 0x0000000004460d95 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4460d95) #11 0x00000000060005ec clang::ParseAST(clang::Sema&, bool, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x60005ec) #12 0x00000000044614b5 clang::CodeGenAction::ExecuteAction() (/opt/compiler-explorer/clang-trunk/bin/clang+++0x44614b5) #13 0x0000000004737121 clang::FrontendAction::Execute() (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4737121) #14 0x00000000046b777b clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x46b777b) #15 0x00000000048229e3 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x48229e3) #16 0x0000000000d50621 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd50621) #17 0x0000000000d48e2d ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0 #18 0x00000000044acc99 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0 #19 0x0000000003a5dac3 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a5dac3) #20 0x00000000044aceb9 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0 #21 0x00000000044710dd clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x44710dd) #22 0x0000000004472071 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4472071) #23 0x000000000447c3fc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x447c3fc) llvm#24 0x0000000000d4d2b1 clang_main(int, char**, llvm::ToolContext const&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd4d2b1) llvm#25 0x0000000000c12464 main (/opt/compiler-explorer/clang-trunk/bin/clang+++0xc12464) llvm#26 0x00007ae43b029d90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90) llvm#27 0x00007ae43b029e40 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e40) llvm#28 0x0000000000d488c5 _start (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd488c5) ``` --------- Co-authored-by: Shashwathi N <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.