Closed
Description
Assuming we eventually want to do end-user analysis on GPUs, the big picture looks like:
Although histogramming is not the biggest bottleneck (for example, we don't know how to do GPUDirect for .root files at the moment), it is something we have to solve.

To that end, scikit-hep has https://github.com/scikit-hep/cuda-histogram, but at the moment it simply calls CuPy kernels:
https://github.com/scikit-hep/cuda-histogram/blob/ef17af76959683e461fff4c73d28d2778cd658e9/src/cuda_histogram/hist.py#L244
This has a few problems:
- It's not very fast (CuPy is not very fast, and we launch multiple kernels to get weights and sumw2)
- It doesn't fuse the kernel with the analysis loop
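
To make the first problem concrete, here is a minimal sketch of the two-pass pattern, using NumPy as a CPU stand-in for CuPy (the two share most of their array API). The data, bin edges, and variable names are made up for illustration and are not taken from cuda-histogram. Each `histogram` call bins the same values again, so on a GPU this would mean at least one launch per accumulator:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)              # values to histogram (illustrative)
w = rng.uniform(0.5, 1.5, size=10_000)   # per-event weights (illustrative)
edges = np.linspace(-4.0, 4.0, 41)       # 40 uniform bins

# Pass 1: sum of weights per bin.
sumw, _ = np.histogram(x, bins=edges, weights=w)

# Pass 2: sum of squared weights per bin; this re-bins the same data.
sumw2, _ = np.histogram(x, bins=edges, weights=w**2)
```

Both passes compute identical bin indices; only the accumulated quantity differs, which is why a single kernel that fills both would save work.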
In the long run, in order to fuse the analysis loop and the histogram kernel, we probably need both of them to be written in JAX (or something similar).
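
As a rough sketch of what that could look like, the snippet below puts a toy analysis step and a weighted fill of both sumw and sumw2 into one `jax.jit` function. Everything here is an assumption for illustration: the function name `analyse_and_fill`, the `pt` calculation, and the uniform binning from 0 to 200 are hypothetical, not an existing scikit-hep API. Whether XLA actually fuses the steps into one kernel is up to the compiler; the point is only that both live in a single traced function.

```python
import jax
import jax.numpy as jnp

N_BINS = 40          # hypothetical binning: 40 uniform bins
LO, HI = 0.0, 200.0  # hypothetical axis range


@jax.jit
def analyse_and_fill(px, py, weight):
    # Toy "analysis" step (assumption): transverse momentum from components.
    pt = jnp.hypot(px, py)

    # Uniform binning: map each value to a bin index.
    idx = jnp.floor((pt - LO) / (HI - LO) * N_BINS).astype(jnp.int32)
    in_range = (pt >= LO) & (pt < HI)
    # Send out-of-range events to index N_BINS, which mode="drop" discards.
    idx = jnp.where(in_range, idx, N_BINS)

    # Stack w and w^2 so a single scatter-add fills both accumulators.
    vals = jnp.stack([weight, weight**2], axis=-1)
    acc = jnp.zeros((N_BINS, 2), dtype=vals.dtype)
    acc = acc.at[idx].add(vals, mode="drop")
    return acc[:, 0], acc[:, 1]


px = jnp.array([3.0, 30.0, 300.0])
py = jnp.array([5.0, 42.0, 0.0])
w = jnp.array([2.0, 3.0, 5.0])
sumw, sumw2 = analyse_and_fill(px, py, w)
```

This sketch has no underflow or overflow bins and only handles uniform axes; a real implementation would need both.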
