Commit c9b9f70

Update dpnp.isclose with scalar-specific SYCL kernels (#2540)
This PR updates `dpnp.isclose()` by adding scalar-specific SYCL kernels for both the contiguous and strided cases, which improves performance when `rtol` and `atol` are scalars. It also extends and updates the tests for `dpnp.isclose()`. The new kernels **improve performance** by **up to 10x** compared to the previous implementation when `rtol` and `atol` are scalars (tested on PVC).
1 parent 6d339e9 commit c9b9f70
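
For readers unfamiliar with the check being accelerated, the sketch below spells out the per-element test for scalar tolerances. It assumes `dpnp.isclose` follows the criterion documented for `numpy.isclose`, namely `|a - b| <= atol + rtol * |b|`, with NaN and infinity handled separately. The helper name `isclose_scalar_tol` is hypothetical, and this is plain Python for illustration only, not the SYCL kernel added by this commit.

```python
import math


def isclose_scalar_tol(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
    """Element-wise closeness test for two equal-length sequences of floats.

    Hypothetical helper: rtol and atol are scalars, as in the fast path
    described in the commit message.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    out = []
    for x, y in zip(a, b):
        if math.isnan(x) or math.isnan(y):
            # NaN is close to NaN only when equal_nan is requested.
            out.append(equal_nan and math.isnan(x) and math.isnan(y))
        elif math.isinf(x) or math.isinf(y):
            # An infinity is close only to an infinity of the same sign.
            out.append(x == y)
        else:
            # Asymmetric test: the relative term scales with |b| only.
            out.append(abs(x - y) <= atol + rtol * abs(y))
    return out


print(isclose_scalar_tol([1.0, 1.0], [1.0 + 1e-9, 1.1]))  # [True, False]
```

Because both tolerances are scalars, the test needs only the two input values per element, which is the situation a dedicated kernel can exploit.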

9 files changed, +1009 -17 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 * Changed Windows-specific logic in dpnp initialization [#2553](https://github.com/IntelPython/dpnp/pull/2553)
 * Added missing includes to files in ufunc and VM pybind11 extensions [#2571](https://github.com/IntelPython/dpnp/pull/2571)
 * Refactored backend implementation of `dpnp.linalg.solve` to use oneMKL LAPACK `gesv` directly [#2558](https://github.com/IntelPython/dpnp/pull/2558)
+* Improved performance of `dpnp.isclose` function by implementing a dedicated kernel for scalar `rtol` and `atol` arguments [#2540](https://github.com/IntelPython/dpnp/pull/2540)

 ### Deprecated


dpnp/backend/extensions/common/ext/details/validation_utils_internal.hpp

Lines changed: 3 additions & 1 deletion
@@ -114,8 +114,10 @@ inline void check_no_overlap(const array_ptr &input,
     }

     const auto &overlap = dpctl::tensor::overlap::MemoryOverlap();
+    const auto &same_logical_tensors =
+        dpctl::tensor::overlap::SameLogicalTensors();

-    if (overlap(*input, *output)) {
+    if (overlap(*input, *output) && !same_logical_tensors(*input, *output)) {
         throw py::value_error(name_of(input, names) +
                               " has overlapping memory segments with " +
                               name_of(output, names));
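
The changed condition in `check_no_overlap` can be illustrated with a simplified model. In the sketch below, a tensor is reduced to a hypothetical 1-D `View` (offset, size, stride) into one shared buffer; `memory_overlap` and `same_logical_tensors` are stand-ins written for this example and are not the dpctl functors named in the diff. The assumption is that two views count as the same logical tensor when they start at the same position with the same shape and strides, in which case the overlap is presumably harmless for an element-wise operation writing in place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    """Hypothetical 1-D strided view into a shared buffer (units: elements)."""
    offset: int
    size: int
    stride: int

    def span(self):
        """Half-open range [lo, hi) of buffer positions the view can touch."""
        last = self.offset + (self.size - 1) * self.stride
        return min(self.offset, last), max(self.offset, last) + 1


def memory_overlap(a, b):
    """Conservative test: do the two views' position ranges intersect?"""
    if a.size == 0 or b.size == 0:
        return False
    a_lo, a_hi = a.span()
    b_lo, b_hi = b.span()
    return a_lo < b_hi and b_lo < a_hi


def same_logical_tensors(a, b):
    """Same start, same shape, same strides (dataclass equality)."""
    return a == b


def check_no_overlap(inp, out):
    # Mirrors the new condition: overlap is rejected only when the two
    # views are not the very same logical tensor.
    if memory_overlap(inp, out) and not same_logical_tensors(inp, out):
        raise ValueError("input has overlapping memory segments with output")


x = View(offset=0, size=8, stride=1)
check_no_overlap(x, View(0, 8, 1))  # identical view: accepted
check_no_overlap(x, View(8, 8, 1))  # disjoint views: accepted
```

Under the old condition, the first call would have been rejected because an identical view trivially overlaps itself; a partially shifted view such as `View(4, 8, 1)` is rejected in both versions.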

dpnp/backend/extensions/ufunc/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ set(_elementwise_sources
     ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/heaviside.cpp
     ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/i0.cpp
     ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/interpolate.cpp
+    ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/isclose.cpp
     ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/lcm.cpp
     ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/ldexp.cpp
     ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/logaddexp2.cpp

dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp

Lines changed: 2 additions & 0 deletions
@@ -37,6 +37,7 @@
 #include "heaviside.hpp"
 #include "i0.hpp"
 #include "interpolate.hpp"
+#include "isclose.hpp"
 #include "lcm.hpp"
 #include "ldexp.hpp"
 #include "logaddexp2.hpp"
@@ -66,6 +67,7 @@ void init_elementwise_functions(py::module_ m)
     init_heaviside(m);
     init_i0(m);
     init_interpolate(m);
+    init_isclose(m);
     init_lcm(m);
     init_ldexp(m);
     init_logaddexp2(m);
