@AlexanderKalistratov commented Sep 11, 2024

Implementation of histogram with a SYCL kernel.
This PR adds a generic histogram kernel that can be used in the future to implement other histogram variants, such as bincount, histogram2d, and histogramdd, or to specialize the kernel for special cases like uniform bins.

The SYCL kernel covers only specific data types and USM memory types. Unsupported cases are handled through an additional copy.

@oleksandr-pavlyk
Contributor

Quick validation via independent implementation:

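import dpctl.tensor as dpt

def histogram1d_impl_tensor(data: dpt.usm_ndarray, bins: dpt.usm_ndarray) -> dpt.usm_ndarray:
    # Expect 1-D data and at least two bin edges (that is, at least one bin).
    assert data.ndim == 1
    assert bins.ndim == 1
    assert bins.shape[0] > 1
    # Insertion index of each value among the sorted bin edges.
    bin_idx = dpt.searchsorted(bins, data)
    # Count occurrences of each index; indices that never occur are omitted,
    # so this matches a histogram only when every bin is populated.
    _, c = dpt.unique_counts(dpt.sort(bin_idx))
    return c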

In [22]: x = dpnp.random.randn(10**7).get_array()

In [23]: bins = dpnp.asarray([-10, -4, -3, -2, -1, -0.5, -0.25, 0, 0.25, 0.5, 1, 2, 3, 4, 6], dtype=x.dtype).get_array()

In [24]: %time c, _ = dpnp.histogram(dpnp.asarray(x), bins=dpnp.asarray(bins)); print(c)
[    284   13222  214243 1359725 1497540  927356  987069  987352  927886
 1498601 1358597  214704   13114     307]
CPU times: user 10.6 ms, sys: 10.4 ms, total: 21 ms
Wall time: 15.2 ms

In [25]: %time c = histogram1d_impl_tensor(x, bins=bins); print(c)
[    284   13222  214243 1359725 1497540  927356  987069  987352  927886
 1498601 1358597  214704   13114     307]
CPU times: user 711 ms, sys: 581 ms, total: 1.29 s
Wall time: 396 ms
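
The quick check above relies on `unique_counts`, which reports only the indices that actually occur, so it lines up with `dpnp.histogram` when every bin is populated and no value falls outside the edges. A plain-Python sketch that keeps empty bins is shown below, assuming NumPy-style bin semantics (every bin half-open except the last, which is closed). The name `histogram1d_reference` is hypothetical and is not part of dpnp or dpctl; it is meant for small host-side sanity checks, not for performance.

```python
from bisect import bisect_right


def histogram1d_reference(data, edges):
    """Count values per bin, keeping empty bins (hypothetical helper).

    Assumes `edges` is sorted ascending. Bins are [edges[i], edges[i+1])
    except the last one, which also includes its right edge.
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for value in data:
        # Skip values outside the overall range; NaN fails this test too.
        if not (edges[0] <= value <= edges[-1]):
            continue
        idx = bisect_right(edges, value) - 1
        if idx == n_bins:
            # value equals the last edge: it belongs to the closed last bin.
            idx = n_bins - 1
        counts[idx] += 1
    return counts


# One value per group instead of 10**7, same edges as in this thread.
print(histogram1d_reference([2.0, 1.0, 4.0], [-2.0, 1.0, 2.0, 4.0]))  # [0, 1, 2]
```

Allocating the `counts` list up front is what preserves the zero entry for the empty first bin.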

@AlexanderKalistratov
Collaborator Author

@antonwolfy

@oleksandr-pavlyk
Contributor

I think this is a bug:


In [4]: data, edges = dpt.concat([dpt.full(10**7, fill_value=2., dtype="f"), dpt.full(10**7, fill_value=1., dtype="f"), dpt.full(10**7, fill_value=4., dtype="f")]), dpt.asarray([-2,1, 2, 4], dtype="f")

In [5]: dpnp.histogram(data, edges)
Out[5]:
(array([       0, 10000000, 20000000]),
 usm_ndarray([-2.,  1.,  2.,  4.], dtype=float32))

In [6]: dpnp.histogram(data, edges, density=True)
Out[6]:
(array([0.       , 0.3333333, 0.3333333], dtype=float32),
 usm_ndarray([-2.,  1.,  2.,  4.], dtype=float32))

The density should be (0, 1/3, 2/3), instead of (0, 1/3, 1/3).

@AlexanderKalistratov force-pushed the histogram branch 2 times, most recently from 3721b6e to bfc7ede, October 17, 2024 15:51
@AlexanderKalistratov
Collaborator Author

@oleksandr-pavlyk it is not a bug. NumPy demonstrates the same behavior:

>>> import numpy
>>> data, edges = numpy.concatenate([numpy.full(10**7, fill_value=2., dtype="f"), numpy.full(10**7, fill_value=1., dtype="f"), numpy.full(10**7, fill_value=4., dtype="f")]), numpy.asarray([-2,1, 2, 4], dtype="f")
>>> numpy.histogram(data, edges)
(array([       0, 10000000, 20000000]), array([-2.,  1.,  2.,  4.], dtype=float32))
>>> numpy.histogram(data, edges, density=True)
(array([0.        , 0.33333333, 0.33333333]), array([-2.,  1.,  2.,  4.], dtype=float32))
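
The matching result follows from how `density=True` is defined in NumPy: each count is divided by the total count and by the width of its bin, so that the histogram integrates to 1 over the range. Here the last bin holds 2/3 of the samples but is 2 units wide, which gives (2/3)/2 = 1/3. A small plain-Python sketch of that arithmetic (the helper name `density_from_counts` is hypothetical, not a NumPy or dpnp function):

```python
def density_from_counts(counts, edges):
    """Normalize counts so that sum(density * bin_width) == 1."""
    total = sum(counts)
    widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
    return [c / (total * w) for c, w in zip(counts, widths)]


counts = [0, 10_000_000, 20_000_000]
edges = [-2.0, 1.0, 2.0, 4.0]  # bin widths: 3, 1, 2

density = density_from_counts(counts, edges)
print(density)  # second and third entries are both 1/3

# The integral over the range is 1: 0*3 + (1/3)*1 + (1/3)*2
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
print(sum(d * w for d, w in zip(density, widths)))
```

The values (0, 1/3, 2/3) would be the per-bin fractions of samples, which equal the density only when every bin has width 1.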

@AlexanderKalistratov
Collaborator Author

@oleksandr-pavlyk @antonwolfy please re-review

@AlexanderKalistratov force-pushed the histogram branch 2 times, most recently from 3849eed to e2b6217, October 31, 2024 05:39
Contributor

@antonwolfy left a comment

Thank you @AlexanderKalistratov for improving the histogram implementation.
I don't have any more comments.

@AlexanderKalistratov merged commit 8464d9b into IntelPython:master on Oct 31, 2024
45 of 46 checks passed
github-actions bot added a commit that referenced this pull request Oct 31, 2024
Implementation of histogram with sycl kernel

---------

Co-authored-by: Anton <[email protected]> 8464d9b