-
Notifications
You must be signed in to change notification settings - Fork 0
[SYCL] Add libsycl, a SYCL RT library implementation project #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
tahonermann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial review feedback.
libsycl/include/sycl/platform.hpp
Outdated
| #include <sycl/detail/export.hpp> | ||
|
|
||
| namespace sycl { | ||
| inline namespace _V1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we've proven to ourselves that this helps us in any meaningful way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a versioned inline namespace is consistent with what libc++ does. Is it good to do likewise? I'm not sure. Perhaps worth getting an opinion from libc++ maintainers.
|
This is a great starting point. I agree with Tom. It will be helpful to populate README doc with following information:
It will also be useful to add a very simple test case here. Thanks |
tahonermann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few more comments regarding CMake files.
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
… `getForwardSlice` matchers (llvm#115670) Improve mlir-query tool by implementing `getBackwardSlice` and `getForwardSlice` matchers. As an addition `SetQuery` also needed to be added to enable custom configuration for each query. e.g: `inclusive`, `omitUsesFromAbove`, `omitBlockArguments`. Note: backwardSlice and forwardSlice algoritms are the same as the ones in `mlir/lib/Analysis/SliceAnalysis.cpp` Example of current matcher. The query was made to the file: `mlir/test/mlir-query/complex-test.mlir` ```mlir ./mlir-query /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir -c "match getDefinitions(hasOpName(\"arith.add f\"),2)" Match #1: /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:5:8: %0 = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg0 : tensor<5x5xf32>) outs(%arg1 : tensor<5x5xf32>) { ^ /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:7:10: note: "root" binds here %2 = arith.addf %in, %in : f32 ^ Match #2: /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:10:16: %collapsed = tensor.collapse_shape %0 [[0, 1]] : tensor<5x5xf32> into tensor<25xf32> ^ /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:13:11: %c2 = arith.constant 2 : index ^ /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:14:18: %extracted = tensor.extract %collapsed[%c2] : tensor<25xf32> ^ /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:15:10: note: "root" binds here %2 = arith.addf %extracted, %extracted : f32 ^ 2 matches. ```
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
|
Hi folks, @tahonermann @dvrogozh @asudarsa @aelovikov-intel I'd like to ask for your review again. I think Sergey and I either addressed or answered all your questions/comments about code. I also updated version & ABI namespace approach (closer to libcxx now, but not a complete copy). There were many questions about docs & instructions. We can't foresee if community will request it in this PR but:
Please let me know what you think. Could you please review my changes & comments? Are we ready to publish this PR to community? Thank you. |
dvrogozh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my perspective, this can be brought to upstream review.
tahonermann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates @KseniyaTikhomirova! They all look good. I added a few more hopefully final comments regarding the PDB handling. Once these are addressed, I expect to be satisfied with the changes.
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
|
@tahonermann, I uploaded the fixes to the recent comments. |
tahonermann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few final comments. The only one I think that might matter is the one about cmake_path.
libsycl/CMakeLists.txt
Outdated
| if(MSVC) | ||
| set_property(GLOBAL PROPERTY USE_FOLDERS ON) | ||
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for restricting the USE_FOLDERS property to MSVC? Other projects that set this property (llvm, runtimes, compier-rt) don't (some conditionally set it based on a "standalone" build mode).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that this option is widely used for a proper project view in VS IDE. it seems to have not much sense to enable it for linux.
BTW even llvm CMakeLists.txt states that they do it specifically for VS (but still unconditionally).
https://github.com/sergey-semenov/llvm-project/blob/main/llvm/CMakeLists.txt#L378
https://cmake.org/cmake/help/latest/prop_gbl/USE_FOLDERS.html
"Not all CMake generators support recording folder details for targets. The Xcode and Visual Studio generators are examples of generators that do. Similarly, not all IDEs support presenting targets using folder hierarchies, even if the CMake generator used provides the necessary information."
https://devblogs.microsoft.com/cppblog/cmake-support-in-visual-studio-targets-view-single-file-compilation-and-cache-generation-settings/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I do agree that this line is a little strange since this option is important for generator and IDE. Why do we have condition with compiler check? I will remove this condition, it should be fine to set it unconditionally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, thanks. I agree Linux users likely wouldn't benefit from this setting, but Apple Xcode users presumably would.
This seems like one of those settings that should be more centrally managed in LLVM, but addressing that would be far out of scope for this PR.
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
|
fixed or replied to new comments. I also noticed that version.hpp that is generated by cmake configure_file was placed in a wrong folder (source folder). would be nice if someone can review it either. |
tahonermann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working with me through all these iterations @KseniyaTikhomirova! I noted two more nits to be addressed, but went ahead and approved the changes. I'll be on vacation for a week starting tomorrow; please feel free to proceed without any further input from me. This looks great!
Also, could you please paste the directory tree that is generated from the equivalent of a make install on either Windows or Linux? That would help to confirm that the install commands are working as expected.
libsycl/CMakeLists.txt
Outdated
| # Install generated headers. | ||
| install(FILES | ||
| "${LIBSYCL_BUILD_INCLUDE_DIR}/sycl/version.hpp" | ||
| DESTINATION "${LIBSYCL_INCLUDE_DIR}/sycl" | ||
| COMPONENT sycl-headers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest colocating this install command with the other install command further down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it down but didn't join them.
I reviewed it and I think it looks ok. I do wonder if these changes make the custom command to copy the |
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
Thank you too. We have pretty good synchronization since I will be on vacation next week too. I plan to create upstreaming PR after my return. I think it is better to do it later but to be able to quickly respond to any questions/comments from community. ssfork_llvm$ ninja -C $build_llvm install | grep "sycl" |
Signed-off-by: Tikhomirova, Kseniya <[email protected]>
I think it is more convenient and transparent than to have 2 dirs to include - build dir with generated headers and source include dir with the rest. |
These were failing on our Windows on Arm bot, or more precisely, not even completing. This is because Microsoft's C runtime does extra parameter validation. So when we called _read with an invalid fd, it called an invalid parameter handler instead of returning an error. https://learn.microsoft.com/en-us/%20cpp/c-runtime-library/reference/read?view=msvc-170 https://learn.microsoft.com/en-us/%20cpp/c-runtime-library/parameter-validation?view=msvc-170 (lldb) run Process 8440 launched: 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\unittests\Host\HostTests.exe' (aarch64) Process 8440 stopped * thread #1, stop reason = Exception 0xc0000409 encountered at address 0x7ffb7453564c frame #0: 0x00007ffb7453564c ucrtbase.dll`_get_thread_local_invalid_parameter_handler + 652 ucrtbase.dll`_get_thread_local_invalid_parameter_handler: -> 0x7ffb7453564c <+652>: brk #0xf003 ucrtbase.dll`_invalid_parameter_noinfo: 0x7ffb74535650 <+0>: b 0x7ffb745354d8 ; _get_thread_local_invalid_parameter_handler + 280 0x7ffb74535654 <+4>: nop 0x7ffb74535658 <+8>: nop You can override this handler but I'm assuming that this reading after close isn't a crucial feature, so disabling the tests seems like the way to go. If it is crucial, we can check the fd before we use it. Tests added by llvm#143946.
# Benefit This patch fixes: 1. After `platform select ios-simulator`, `platform process list` will now print processes which are running in the iOS simulator. Previously, no process will be listed. 2. After `platform select ios-simulator`, `platform attach --name <name>` will succeed. Previously, it will error out saying no process is found. # Several bugs that is being fixed 1. During the process listing, add `aarch64` to the list of CPU types for which iOS simulators are checked for. 2. Given a candidate process, when checking for simulators, the original code will find the desired environment variable (`SIMULATOR_UDID`) and set the OS to iOS, but then the immediate next environment variable will set it back to macOS. 3. For processes running on simulator, set the triple's `Environment` to `Simulator`, so that such processes can pass the filtering [in this line](https://fburl.com/8nivnrjx). The original code leave it as the default `UnknownEnvironment`. # Manual test **With this patch:** ``` royshi-mac-home ~/public_llvm/build % bin/lldb (lldb) platform select ios-simulator (lldb) platform process list 240 matching processes were found on "ios-simulator" PID PARENT USER TRIPLE NAME ====== ====== ========== ============================== ============================ 40511 28844 royshi arm64-apple-ios-simulator FocusPlayground // my toy iOS app running on simulator ... // omit 28844 1 royshi arm64-apple-ios-simulator launchd_sim (lldb) process attach --name FocusPlayground Process 40511 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x0000000104e3cb70 libsystem_kernel.dylib`mach_msg2_trap + 8 libsystem_kernel.dylib`mach_msg2_trap: -> 0x104e3cb70 <+8>: ret ... // omit ``` **Without this patch:** ``` $ bin/lldb (lldb) platform select ios-simulator (lldb) platform process list error: no processes were found on the "ios-simulator" platform (lldb) process attach --name FocusPlayground error: attach failed: could not find a process named FocusPlayground ``` # Unittest See PR.
The function already exposes a work list to avoid deep recursion, this commit starts utilizing it in a helper that could also lead to a deep recursion. We have observed this crash on `clang/test/C/C99/n590.c` with our internal builds that enable aggressive optimizations and hit the limit earlier than default release builds of Clang. See the added test for an example with a deeper recursion that used to crash in upstream Clang before this change with the following stack trace: ``` #0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13 #1 llvm::sys::RunSignalHandlers() /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Signals.cpp:106:18 #2 SignalHandler(int, siginfo_t*, void*) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3 #3 (/lib/x86_64-linux-gnu/libc.so.6+0x3fdf0) #4 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12772:0 llvm#5 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#6 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#7 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#8 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#9 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#10 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#11 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#12 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#13 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#14 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#15 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#16 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#17 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#18 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#19 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 ... 700+ more stack frames. ```
Extend support in LLDB for WebAssembly. This PR adds a new Process plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly targets. It adds support for WebAssembly's memory model with separate address spaces, and the ability to fetch the call stack from the WebAssembly runtime. I have tested this change with the WebAssembly Micro Runtime (WAMR, https://github.com/bytecodealliance/wasm-micro-runtime) which implements a GDB debug stub and supports the qWasmCallStack packet. ``` (lldb) process connect --plugin wasm connect://localhost:4567 Process 1 stopped * thread #1, name = 'nobody', stop reason = trace frame #0: 0x40000000000001ad wasm32_args.wasm`main: -> 0x40000000000001ad <+3>: global.get 0 0x40000000000001b3 <+9>: i32.const 16 0x40000000000001b5 <+11>: i32.sub 0x40000000000001b6 <+12>: local.set 0 (lldb) b add Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c (lldb) c Process 1 resuming Process 1 stopped * thread #1, name = 'nobody', stop reason = breakpoint 1.1 frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12 1 int 2 add(int a, int b) 3 { -> 4 return a + b; 5 } 6 7 int (lldb) bt * thread #1, name = 'nobody', stop reason = breakpoint 1.1 * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12 frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12 frame #2: 0x40000000000001fe wasm32_args.wasm ``` This PR is based on an unmerged patch from Paolo Severini: https://reviews.llvm.org/D78801. I intentionally stuck to the foundations to keep this PR small. I have more PRs in the pipeline to support the other features/packets. My motivation for supporting Wasm is to support debugging Swift compiled to WebAssembly: https://www.swift.org/documentation/articles/wasm-getting-started.html
…erver (llvm#148774) Summary: There was a deadlock was introduced by [PR llvm#146441](llvm#146441) which changed `CurrentThreadIsPrivateStateThread()` to `CurrentThreadPosesAsPrivateStateThread()`. This change caused the execution path in [`ExecutionContextRef::SetTargetPtr()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L513) to now enter a code block that was previously skipped, triggering [`GetSelectedFrame()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L522) which leads to a deadlock. Thread 1 gets m_modules_mutex in [`ModuleList::AppendImpl`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Core/ModuleList.cpp#L218), Thread 3 gets m_language_runtimes_mutex in [`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501), but then Thread 1 waits for m_language_runtimes_mutex in [`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501) while Thread 3 waits for m_modules_mutex in [`ScanForGNUstepObjCLibraryCandidate`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Plugins/LanguageRuntime/ObjC/GNUstepObjCRuntime/GNUstepObjCRuntime.cpp#L57). This fixes the deadlock by adding a scoped block around the mutex lock before the call to the notifier, and moved the notifier call outside of the mutex-guarded section. The notifier call [`NotifyModuleAdded`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Target.cpp#L1810) should be thread-safe, since the module should be added to the `ModuleList` before the mutex is released, and the notifier doesn't modify the module list further, and the call is operates on local state and the `Target` instance. ### Deadlocked Thread backtraces: ``` * thread #3, name = 'dbg.evt-handler', stop reason = signal SIGSTOP * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563786bd5f40) at futex-internal.h:146:13 /*... a bunch of mutex related bt ... */ liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007f2f0f1927b0, __m=0x0000563786bd5f40) at std_mutex.h:229:19 frame llvm#8: 0x00007f2f27946eb7 liblldb.so.21.0git`ScanForGNUstepObjCLibraryCandidate(modules=0x0000563786bd5f28, TT=0x0000563786bd5eb8) at GNUstepObjCRuntime.cpp:60:41 frame llvm#9: 0x00007f2f27946c80 liblldb.so.21.0git`lldb_private::GNUstepObjCRuntime::CreateInstance(process=0x0000563785e1d360, language=eLanguageTypeObjC) at GNUstepObjCRuntime.cpp:87:8 frame llvm#10: 0x00007f2f2746fca5 liblldb.so.21.0git`lldb_private::LanguageRuntime::FindPlugin(process=0x0000563785e1d360, language=eLanguageTypeObjC) at LanguageRuntime.cpp:210:36 frame llvm#11: 0x00007f2f2742c9e3 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeObjC) at Process.cpp:1516:9 ... frame llvm#21: 0x00007f2f2750b5cc liblldb.so.21.0git`lldb_private::Thread::GetSelectedFrame(this=0x0000563785e064d0, select_most_relevant=DoNoSelectMostRelevantFrame) at Thread.cpp:274:48 frame llvm#22: 0x00007f2f273f9957 liblldb.so.21.0git`lldb_private::ExecutionContextRef::SetTargetPtr(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:525:32 frame llvm#23: 0x00007f2f273f9714 liblldb.so.21.0git`lldb_private::ExecutionContextRef::ExecutionContextRef(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:413:3 frame llvm#24: 0x00007f2f270e80af liblldb.so.21.0git`lldb_private::Debugger::GetSelectedExecutionContext(this=0x0000563785d83bc0) at Debugger.cpp:1225:23 frame llvm#25: 0x00007f2f271bb7fd liblldb.so.21.0git`lldb_private::Statusline::Redraw(this=0x0000563785d83f30, update=true) at Statusline.cpp:136:41 ... * thread #1, name = 'lldb', stop reason = signal SIGSTOP * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563785e1dd98) at futex-internal.h:146:13 /*... a bunch of mutex related bt ... */ liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007ffe62be0488, __m=0x0000563785e1dd98) at std_mutex.h:229:19 frame llvm#8: 0x00007f2f2742c8d1 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeC_plus_plus) at Process.cpp:1510:41 frame llvm#9: 0x00007f2f2743c46f liblldb.so.21.0git`lldb_private::Process::ModulesDidLoad(this=0x0000563785e1d360, module_list=0x00007ffe62be06a0) at Process.cpp:6082:36 ... frame llvm#13: 0x00007f2f2715cf03 liblldb.so.21.0git`lldb_private::ModuleList::AppendImpl(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, use_notifier=true) at ModuleList.cpp:246:19 frame llvm#14: 0x00007f2f2715cf4c liblldb.so.21.0git`lldb_private::ModuleList::Append(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, notify=true) at ModuleList.cpp:251:3 ... frame llvm#19: 0x00007f2f274349b3 liblldb.so.21.0git`lldb_private::Process::ConnectRemote(this=0x0000563785e1d360, remote_url=(Data = "connect://localhost:1234", Length = 24)) at Process.cpp:3250:9 frame llvm#20: 0x00007f2f27411e0e liblldb.so.21.0git`lldb_private::Platform::DoConnectProcess(this=0x0000563785c59990, connect_url=(Data = "connect://localhost:1234", Length = 24), plugin_name=(Data = "gdb-remote", Length = 10), debugger=0x0000563785d83bc0, stream=0x00007ffe62be3128, target=0x0000563786bd5be0, error=0x00007ffe62be1ca0) at Platform.cpp:1926:23 ``` ## Test Plan: Built a hello world a.out Run server in one terminal: ``` ~/llvm/build/Debug/bin/lldb-server g :1234 a.out ``` Run client in another terminal ``` ~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3" ``` Before: Client hangs indefinitely ``` ~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b main" (lldb) gdb-remote 1234 ^C^C ``` After: ``` ~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3" (lldb) gdb-remote 1234 Process 837068 stopped * thread #1, name = 'a.out', stop reason = signal SIGSTOP frame #0: 0x00007ffff7fe4a60 ld-linux-x86-64.so.2`_start: -> 0x7ffff7fe4a60 <+0>: movq %rsp, %rdi 0x7ffff7fe4a63 <+3>: callq 0x7ffff7fe5780 ; _dl_start at rtld.c:522:1 ld-linux-x86-64.so.2`_dl_start_user: 0x7ffff7fe4a68 <+0>: movq %rax, %r12 0x7ffff7fe4a6b <+3>: movl 0x18067(%rip), %eax ; _dl_skip_args (lldb) b hello.cc:3 Breakpoint 1: where = a.out`main + 15 at hello.cc:4:13, address = 0x00005555555551bf (lldb) c Process 837068 resuming Process 837068 stopped * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x00005555555551bf a.out`main at hello.cc:4:13 1 #include <iostream> 2 3 int main() { -> 4 std::cout << "Hello World" << std::endl; 5 return 0; 6 } ```
…lvm#152156) With this new A320 in-order core, we follow adding the FeatureUseFixedOverScalableIfEqualCost feature to A510 and A520 (llvm#132246), which reaps the same code generation benefits of preferring fixed over scalable when the cost is equal. So when we have: ``` void foo(float* a, float* b, float* dst, unsigned n) { for (unsigned i = 0; i < n; ++i) dst[i] = a[i] + b[i]; } ``` When compiling without the feature enabled, we get: ``` ... ld1b { z0.b }, p0/z, [x0, x10] ld1b { z2.b }, p0/z, [x1, x10] add x12, x0, x10 ldr z1, [x12, #1, mul vl] add x12, x1, x10 ldr z3, [x12, #1, mul vl] fadd z0.s, z2.s, z0.s add x12, x2, x10 fadd z1.s, z3.s, z1.s dech x11 st1b { z0.b }, p0, [x2, x10] incb x10, all, mul #2 str z1, [x12, #1, mul vl] ... ``` When compiling with, we get: ``` ... ldp q0, q1, [x12, #-16] ldp q2, q3, [x11, #-16] subs x13, x13, llvm#8 fadd v0.4s, v2.4s, v0.4s fadd v1.4s, v3.4s, v1.4s add x11, x11, llvm#32 add x12, x12, llvm#32 stp q0, q1, [x10, #-16] add x10, x10, llvm#32 ... ```
Specifically, `X & M ?= C --> (C << clz(M)) ?= (X << clz(M))` where M is a non-empty sequence of ones starting at the least significant bit with the remainder zero and C is a constant subset of M that cannot be materialised into a SUBS (immediate). Proof: https://alive2.llvm.org/ce/z/haqdJ4. This improves the comparison in isinf, for example: ```cpp int isinf(float x) { return __builtin_isinf(x); } ``` Before: ``` isinf: fmov w9, s0 mov w8, #2139095040 and w9, w9, #0x7fffffff cmp w9, w8 cset w0, eq ret ``` After: ``` isinf: fmov w9, s0 mov w8, #-16777216 cmp w8, w9, lsl #1 cset w0, eq ret ```
This patch introduces libsycl, a SYCL runtime library implementation, as a top-level LLVM runtime project. It contains the basic folder layout and CMake infrastructure to build a dummy SYCL library.
This is part of the SYCL support upstreaming effort. The relevant RFCs can be found here:
https://discourse.llvm.org/t/rfc-add-full-support-for-the-sycl-programming-model/74080
https://discourse.llvm.org/t/rfc-sycl-runtime-upstreaming/74479