[CI] Change nightly benchmarking run to use no assertions linux_shared_build #18071
  run-sycl-benchmarks:
-   needs: [ubuntu2204_build]
+   needs: [linux_shared_build]
Have you done a perf comparison of static vs shared linking?
Admittedly no, although I suppose we will find out in https://github.com/intel/llvm/actions/runs/14579704073 if it's slower
This commit adds the bundle state as an argument to the -fsyclbin driver option (default to executable) and --syclbin clang-linker-wrapper option (no default). This argument is propagated to the SYCLBIN files. --------- Signed-off-by: Larsen, Steffen <[email protected]>
This PR takes care of a potential leftover/typo in https://github.com/intel/llvm/edit/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_tangle.asciidoc#:~:text=320-,321,-322, as suggested by @AlexeySachkov in #16151 (comment)
…of a function is annotated. (#18590) Fixes #17591. --------- Co-authored-by: aelovikov-intel <[email protected]>
Fixes benchmark results presentation on https://oneapi-src.github.io/unified-runtime/performance/ The same runs now have the same color on subsequent charts Signed-off-by: Mateusz P. Nowak <[email protected]>
…18645) We would like to extend urDeviceSelectBinary downstream to allow for device-specific binary targets. This commit refactors the current handling to make this easier. Additionally L0 specific tests are added for urDeviceSelectBinary to verify that the fallback logic works as expected.
Improve readability of Historical Results chart titles, tooltips, and all charts' legends by introducing new Result object field: `display_name` for Compute Benchmarks
Combined, the following changes enable adding guards around sections in linker script files which do not support conditional inclusion in their syntax. This is done by pre-processing the linker scripts as part of build configuration.

- Clean up template `helper.py` and add typing hints
- Use dict for functions in loader script templates
  - The `get_loader_functions()` template helper now returns `list[dict]` instead of `list[str]`; this enables passing additional information into the loader script templates
- Add script to strip guarded lines from file
  - In order to pre-process files which don't support conditional inclusion of line blocks, such as linker scripts, we can use this script to remove lines which should not be included unless specified
- Hook up `strip-guarded-lines.py`
  - Actually use the `strip-guarded-lines.py` script when using `configure_file()` on linker scripts
An MSVC compiler update changed the assertion message slightly, the test needed a simple update. Fixes #17116
On some systems `CUDA_Toolkit_ROOT` might be empty, even though CUDA is properly found by CMake. This can cause a failed search for `generated_cuda_meta.h`. In that case only a warning is emitted when building `cuda_trace_collector`, but this can fail the `-Werror` build. This patch ensures that if we can find `CUPTI`, we can find this file.
In practice we only build (and thus test) three libclc targets: 'nvptx64--nvidiacl', 'amdgcn--amdhsa' and 'native_cpu'. All other upstream libclc targets are never built in our CI and would in fact fail to build. This commit rectifies this by selectively building libspirv only for those three supported targets. More can be added in time if required. There are still certain OpenCL libclc targets that can't be built with this commit. The r600 target, for example, can't build because we unconditionally enable the fp64 OpenCL extension across the board, but the r600 target doesn't support that. The clspv and clspv64 targets also fail to build due to SOURCES files referencing missing files. This will be resolved in the next pulldown.
#18431) To align with the comment in the file that specifies 32 storage locations and 128 bits per warp. Change file to opaque pointer mode. Add more global variables for different sizes to resolve `Reducing storage for small data types`.
The macro __CLC_FUNCTION is special and is used in CLC headers. Defining it here in an implementation file - nextafter.cl - for another purpose, and then including half_nextafter.inc, which pulls in the SPIR-V headers, results in a macro redefinition, which is warned about. Since the name isn't important and is local to this one file, this commit just changes the macro definition to fix the issue.
- Implements the dynamic_local_accessor class with compiler support.
- Refactor the recently added dynamic_work_group_memory class to only use one `impl` member variable. This brings it closer to the design of other sycl classes and avoids future ABI break issues.
- There are 2 ABI breaking changes. However, they are both related to the `dynamic_work_group_memory` class whose [specification](#16712) has not been merged yet and is not yet officially supported.
Some resource destruction is done in the destructor, but if we don't manually clear the map, the destructor is called after the adapter release, which leads to a leak report and possibly some UB (trying to use the adapter after it is released).
After #18437, the runtime library is producing a warning about an unused variable AccTarget in handler.cpp. This is due to the variable only being used in assert, which may in turn be removed when assertions are disabled. This commit removes the variable in favor of making the conversion inside the assert. Signed-off-by: Larsen, Steffen <[email protected]>
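The warning pattern described above can be sketched in isolation. The names below (`Target`, `toTarget`, `checkTarget`) are hypothetical and are not taken from handler.cpp; the sketch only illustrates why moving the conversion into the assert leaves nothing unused once assertions are compiled out.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the actual types used in handler.cpp.
enum class Target { Device, HostTask };

inline Target toTarget(int Raw) { return static_cast<Target>(Raw); }

// Shape that triggers the warning when NDEBUG is defined, because assert()
// then expands to nothing and AccTarget is never read:
//
//   Target AccTarget = toTarget(Raw);
//   assert(AccTarget == Target::Device || AccTarget == Target::HostTask);

// Shape after the fix: the conversion lives inside the assert, so no
// variable is left over when assertions are disabled.
inline int checkTarget(int Raw) {
  assert(toTarget(Raw) == Target::Device ||
         toTarget(Raw) == Target::HostTask);
  return Raw;
}
```

The trade-off is that the conversion is written twice inside the assert; for a cheap, side-effect-free conversion this is usually acceptable.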
#18627) The proprietary Intel Compiler (ICX) uses a different installation layout than clang. It puts tools into a bin/compiler subdirectory (to not expose them on the PATH by default). Handle this by not assuming that the compiler is in the same directory as other llvm tools. Ask the compiler for the path to `llvm-config` to find the tools directory, using the `-print-prog-name` option. `llvm-config` was chosen because lit already assumes it is in the tools directory; see, for example, [llvm/utils/lit/lit/llvm/config.py:285][1] [1]: https://github.com/llvm/llvm-project/blob/9cac4bf485e64f7992f2c01bb9517f6379e58164/llvm/utils/lit/lit/llvm/config.py#L285
- Remove unused Context parameters
- Avoid unnecessary copy in `guessLocalWorkSize`
- Simplify the control flow in setKernelParams
- Move cached properties fetching code to constructors
- Query HIP for occupancy in `guessLocalWorkSize`
Co-authored-by: Kornev, Nikita <[email protected]>
…t function (#18824) This PR adds new e2e tests for the free function kernels extension based on the test plan https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/FreeFunctionKernels/test-plan.md#perform-test-that-free-function-kernel-can-be-used-as-device-function-within-another-kernel Extension spec: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_free_function_kernels.asciidoc The overall idea behind this test is to verify that a free function kernel, even if marked with one of the properties (`nd_range_kernel` and `single_task_kernel`), can still be used as a device or host function.
…8914) A recent community change llvm/llvm-project#124858 affected how we render some non-type template parameters in the integration header. We were generating malformed arguments of the form 'value-parameter-0-1', which obviously led to compilation errors when including the header in host compilations.
This resolves Coverity issue `440250` (https://scan.coverity.com/projects/intel-llvm?tab=overview). We know that `ND` cannot be `nullptr`, but presence of the check makes Coverity think that earlier accesses (within the `while` loop condition) are potentially unsafe.
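A minimal sketch of the "dereference, then null-check" shape that static analyzers tend to flag. The `Node` type and `rootDepth` function are hypothetical and unrelated to the actual code touched by the patch.

```cpp
#include <cassert>

// Hypothetical node type; the real `ND` in the patch is a different class.
struct Node {
  int Depth;
  Node *Parent;
};

// Flagged shape: ND is dereferenced in the loop condition and only
// afterwards compared against nullptr, which suggests to the analyzer
// that ND might have been null at the earlier dereference.
//
//   while (ND->Parent) ND = ND->Parent;
//   if (ND) return ND->Depth;   // redundant check
//   return -1;

// Cleaned-up shape: the precondition is stated once, up front, and the
// redundant check is gone.
inline int rootDepth(Node *ND) {
  assert(ND && "caller guarantees a non-null node");
  while (ND->Parent)
    ND = ND->Parent;
  return ND->Depth;
}
```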
…ease on... (#18619) Command Buffer, while it is still executing.
…set (#18869) We are getting a warning on some scorecard, `read-all` doesn't warn. --------- Signed-off-by: Sarnie, Nick <[email protected]>
#18787) This PR decreases the number of TLS accesses in the `NestedCallsTracker` and `tls_code_loc_t`. The idea is to cache TLS location in the reference. As a result, we have only a single lookup for the TLS location.
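The caching idea can be illustrated with a small RAII guard. This is only a sketch under assumptions: `nestingDepth` and `NestedCallGuard` are hypothetical names, and the real `NestedCallsTracker` and `tls_code_loc_t` may be structured differently.

```cpp
#include <cassert>

// Hypothetical thread-local counter standing in for the tracker's state.
inline int &nestingDepth() {
  thread_local int Depth = 0;
  return Depth;
}

// RAII guard that resolves the thread-local slot once, in the constructor,
// and reuses the cached reference in the destructor and accessors.
class NestedCallGuard {
  int &Depth; // cached reference to the thread-local slot

public:
  NestedCallGuard() : Depth(nestingDepth()) { ++Depth; }
  ~NestedCallGuard() { --Depth; }
  NestedCallGuard(const NestedCallGuard &) = delete;
  NestedCallGuard &operator=(const NestedCallGuard &) = delete;

  bool isTopLevel() const { return Depth == 1; }
};
```

A guard like this has to be created and destroyed on the same thread, since the cached reference points at that thread's storage.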
Before this PR one thread could add new events to the queue while another removes events, both modifying and potentially corrupting NativeCPU queue::events. This PR adds a mutex to the NativeCPU queue handle to prevent this potential corruption. Aims to at least fix: `SYCL/HostInteropTask/host-task-two-queues.cpp`
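As a rough illustration of the fix, the sketch below guards a container of events with a mutex so that concurrent adds and removes cannot interleave. `EventQueue` and its members are hypothetical; the NativeCPU adapter's actual queue handle is not shown in this PR description.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical queue handle; the NativeCPU adapter's real type differs.
class EventQueue {
  std::mutex Mutex;        // guards Events
  std::vector<int> Events; // event ids stand in for event handles

public:
  void addEvent(int Id) {
    std::lock_guard<std::mutex> Lock(Mutex);
    Events.push_back(Id);
  }

  // Removes one pending event, if any; returns whether one was removed.
  bool removeEvent() {
    std::lock_guard<std::mutex> Lock(Mutex);
    if (Events.empty())
      return false;
    Events.pop_back();
    return true;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> Lock(Mutex);
    return Events.size();
  }
};
```

Every access to `Events` goes through the same mutex, including the read in `size()`, so one thread calling `addEvent` while another calls `removeEvent` sees a consistent container.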
…not compressed (#18906)

**Problem**
When linking device images, we reject dependencies whose image format does not match the parent image. However, consider the case when parent image is compressed, while dependencies are not (demonstrated in the test case attached to this PR). In this case, we are incorrectly rejecting device images and thus causing `No device image found for external symbol` error.

**Solution**
If the format of the main and dependent device image differs and one of them is compressed, we decompress them and recheck the format of decompressed device images. One side-effect of this solution is that now we'll have to decompress device images, even if we are not using them. For example, when format of decompressed main and dependent images differs. Unfortunately, there's no way to find format of the compressed device image, without first decompressing it. However, I don't think this will incur a significant overhead as (1) we decompress device image only once and cache it for subsequent use, and (2) we decompress only if the dependent device image has an export symbol that main device image wants (when finding which images to link) and if it is compatible with the device.
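The decompress-and-recheck logic can be modelled with a toy example. Everything here (`Format`, `Image`, `decompress`, `formatsMatch`) is a hypothetical simplification and does not reflect the SYCL runtime's real device image types or API.

```cpp
#include <cassert>

// Hypothetical model of a device image; not the SYCL runtime's types.
enum class Format { SPIRV, Native, Compressed };

struct Image {
  Format Outer;      // format visible before decompression
  Format Payload;    // format of the data once decompressed
  bool Decompressed; // set once the payload has been revealed
};

// Stand-in for real decompression: reveal the payload format once and
// remember that it happened (mirrors "decompress once and cache").
inline void decompress(Image &Img) {
  if (Img.Outer == Format::Compressed) {
    Img.Outer = Img.Payload;
    Img.Decompressed = true;
  }
}

// If the formats differ and one side is compressed, decompress and
// compare again; otherwise compare the formats directly.
inline bool formatsMatch(Image &Main, Image &Dep) {
  if (Main.Outer != Dep.Outer &&
      (Main.Outer == Format::Compressed || Dep.Outer == Format::Compressed)) {
    decompress(Main);
    decompress(Dep);
  }
  return Main.Outer == Dep.Outer;
}
```

The sketch also shows the side effect mentioned above: an image may end up decompressed even when the formats turn out not to match.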
…/intel/llvm into ianayl/benchmark-ci-use-noassert
Oh crap... sorry all for the ping
Change benchmarking CI to use the shared library no-assertions build, in case assertions end up affecting performance