Reuse dpnp.nan_to_num in dpnp.nansum and dpnp.nanprod
#2339
View rendered docs @ https://intelpython.github.io/dpnp/index.html
Array API standard conformance tests for dpnp=0.18.0dev0=py312he4f9c94_16 ran successfully.
5997cf3 to 4c0908b
This relatively simple and non-invasive change improves performance significantly.

On Max GPU before:

In [1]: import dpnp
In [2]: x = dpnp.ones(3*10**8, dtype="f4")
In [3]: q = x.sycl_queue
In [4]: %timeit r = dpnp.nansum(x); q.wait()
9.37 ms ± 33.8 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [5]: %timeit r = dpnp.nansum(x); q.wait()
9.42 ms ± 18.8 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [6]: x = dpnp.ones(10**8, dtype="f4")
In [7]: %timeit r = dpnp.nansum(x); q.wait()
4.5 ms ± 8.8 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [8]: %timeit r = dpnp.nansum(x); q.wait()
4.51 ms ± 11 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

after:

In [1]: import dpnp
In [2]: x = dpnp.ones(3*10**8, dtype="f4")
In [3]: q = x.sycl_queue
In [4]: %timeit r = dpnp.nansum(x); q.wait()
6.5 ms ± 24.5 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [5]: %timeit r = dpnp.nansum(x); q.wait()
6.47 ms ± 35.7 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [6]: x = dpnp.ones(10**8, dtype="f4")
In [7]: %timeit r = dpnp.nansum(x); q.wait()
2.78 ms ± 14.3 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [8]: %timeit r = dpnp.nansum(x); q.wait()
2.78 ms ± 14 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
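For a rough sense of scale, the reported means can be turned into speedup ratios. This is only a back-of-the-envelope sketch using the first `%timeit` result of each pair above; the dictionary layout and the `speedup` helper are illustrative names, not part of dpnp.

```python
# Mean timings in milliseconds, copied from the first %timeit run of each pair.
timings_ms = {
    300_000_000: {"before": 9.37, "after": 6.5},
    100_000_000: {"before": 4.5, "after": 2.78},
}


def speedup(before_ms, after_ms):
    """Ratio of old to new runtime; values above 1 mean the new code is faster."""
    return before_ms / after_ms


speedups = {n: speedup(t["before"], t["after"]) for n, t in timings_ms.items()}

for n, s in speedups.items():
    print(f"{n:>11,d} elements: {s:.2f}x faster")
# prints:
# 300,000,000 elements: 1.44x faster
# 100,000,000 elements: 1.62x faster
```

These ratios only summarize the numbers posted in this comment for `float32` arrays of ones; they say nothing about other dtypes, devices, or inputs that actually contain NaNs.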
aa48c71 to 4552fe8
Changes to I will revert the commits changing the nanarg functions and add a warning about synchronization.
d0dad9b to f69ef28
Moved warnings relating to all-NaN and all-negative-inf slices to near the synchronization warning
8d78920 to 1995cd5
antonwolfy left a comment
Thank you @ndgrigorian, LGTM!
Reuse `dpnp.nan_to_num` in `dpnp.nansum` and `dpnp.nanprod` 14274d8

This PR proposes the use of `nan_to_num` over `_replace_nan` in `nansum`, `nanprod`, `nancumsum`, and `nancumprod`, using a new internal function `_replace_nan_no_mask`.
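
The idea can be sketched in a few lines. The example below is an illustration, not the code from this PR: it uses NumPy as a stand-in for dpnp (assuming the two expose the same `nan_to_num` keywords), and the helper and `*_sketch` function names are hypothetical. The assumption is that NaNs are replaced by the identity of the reduction (0 for a sum, 1 for a product) and that infinities are passed through unchanged.

```python
import numpy as np  # stand-in for dpnp in this sketch


def replace_nan_no_mask(a, val):
    """Hypothetical helper: replace NaN with `val` without building a boolean mask."""
    a = np.asarray(a)
    if not np.issubdtype(a.dtype, np.inexact):
        # Integer and boolean arrays cannot hold NaN, so there is nothing to do.
        return a
    # By default nan_to_num also clamps +/-inf to large finite values;
    # passing them explicitly keeps infinities intact.
    return np.nan_to_num(a, nan=val, posinf=np.inf, neginf=-np.inf)


def nansum_sketch(a, axis=None):
    # 0 is the additive identity, so replaced NaNs do not affect the sum.
    return np.sum(replace_nan_no_mask(a, 0), axis=axis)


def nanprod_sketch(a, axis=None):
    # 1 is the multiplicative identity, so replaced NaNs do not affect the product.
    return np.prod(replace_nan_no_mask(a, 1), axis=axis)


x = np.array([1.0, np.nan, 2.0, np.inf], dtype="f4")
y = np.array([[2.0, np.nan], [3.0, 4.0]])

total = nansum_sketch(x)         # NaN ignored, infinity preserved
prod = nanprod_sketch(y, axis=0)  # column-wise product with NaN treated as 1
```

Because the reductions here never need to know where the NaNs were, skipping the mask avoids an extra boolean array and the work of filling it, which is one plausible reading of where the speedup comes from.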