Commit 8d3d750

Merge pull request #167 from jeremiedb/master
small grant for EvoTrees.jl
2 parents e3a7254 + b2cfd89 commit 8d3d750

1 file changed: +27 -0 lines changed

small_grants.md

Lines changed: 27 additions & 0 deletions
@@ -523,3 +523,30 @@ in Julia.
**Reviewers**: Chris Rackauckas

## Improve training performance of GPU backend in EvoTrees.jl (\$2000)

[EvoTrees.jl](https://github.com/Evovest/EvoTrees.jl) is an efficient pure-Julia
implementation of boosted trees. Its CPU performance is competitive with, and even superior to,
peers such as XGBoost. However, the GPU backend lags behind.

The objective of this project is to improve the GPU backend so that its training benchmarks
fall within a competitive range of XGBoost. A premium of \$500 will be awarded if the solution is implemented
with [KernelAbstractions.jl](https://github.com/JuliaGPU/KernelAbstractions.jl), which would allow support for AMD GPUs.

**Information to Get Started**: A key bottleneck is assumed to be the significant overhead
from the large number of kernels launched as the depth of the tree grows. In addition, only the gradients
and histograms are currently computed on the GPU; computing the gains and best node splits on the GPU as well
would reduce GPU-to-CPU communication. Potential solution paths and preliminary work are
discussed in this [issue](https://github.com/Evovest/EvoTrees.jl/issues/288).
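To make the terms above concrete, here is a minimal CPU-side sketch in Python of the two stages the paragraph mentions: accumulating gradient/hessian histograms, then scanning them for the best split. It is illustrative only and is not EvoTrees.jl's code. The gain formula is the standard second-order one used in gradient boosting, the L2 regularization term `lam` is an assumption, and all function names are hypothetical.

```python
def build_histogram(bins, grads, hess, n_bins):
    """Accumulate gradient and hessian sums per feature bin."""
    hist_g = [0.0] * n_bins
    hist_h = [0.0] * n_bins
    for b, g, h in zip(bins, grads, hess):
        hist_g[b] += g
        hist_h[b] += h
    return hist_g, hist_h


def best_split(hist_g, hist_h, lam=1.0):
    """Scan bins left to right and return (best_gain, best_bin).

    A split at bin b sends bins 0..b to the left child and the rest right.
    `lam` is an assumed L2 regularization term.
    """
    total_g, total_h = sum(hist_g), sum(hist_h)
    parent_score = total_g ** 2 / (total_h + lam)
    best_gain, best_bin = 0.0, None
    left_g = left_h = 0.0
    # Skip the last bin: splitting there would leave the right child empty.
    for b in range(len(hist_g) - 1):
        left_g += hist_g[b]
        left_h += hist_h[b]
        right_g, right_h = total_g - left_g, total_h - left_h
        gain = (left_g ** 2 / (left_h + lam)
                + right_g ** 2 / (right_h + lam)
                - parent_score)
        if gain > best_gain:
            best_gain, best_bin = gain, b
    return best_gain, best_bin


# Toy data: one feature already discretized into 4 bins.
bins = [0, 0, 1, 1, 2, 2, 3, 3]
grads = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
hess = [1.0] * 8
hist_g, hist_h = build_histogram(bins, grads, hess, n_bins=4)
gain, split_bin = best_split(hist_g, hist_h)
```

If both stages ran on the GPU, only the selected split (a gain and a bin index per node) would need to travel back to the CPU rather than the full histograms, which is the communication saving the paragraph above points to.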

**Success Criteria**: A PR is merged into EvoTrees.jl that brings the benchmarked GPU training time to
less than 125% of XGBoost's for the 1M- and 10M-observation benchmarks, as discussed in the core
[issue](https://github.com/Evovest/EvoTrees.jl/issues/288).

The result should be reproducible on an RTX 3090, RTX 4090, or RTX A4000.
The solution should be purely Julia-based and should not result in a significant increase in code complexity or lines of code (LoC).
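The 125% threshold can be read as a simple ratio check applied to each benchmark size. The Python sketch below uses made-up timings purely for illustration; the function name and every number in it are hypothetical, not benchmark results.

```python
def meets_criterion(evotrees_seconds, xgboost_seconds, max_ratio=1.25):
    """True if the EvoTrees GPU training time is below max_ratio times XGBoost's."""
    return evotrees_seconds < max_ratio * xgboost_seconds


# Hypothetical timings in seconds, keyed by number of observations.
timings = {
    "1M": {"evotrees": 2.4, "xgboost": 2.0},
    "10M": {"evotrees": 13.0, "xgboost": 10.0},
}

results = {
    size: meets_criterion(t["evotrees"], t["xgboost"])
    for size, t in timings.items()
}
# Both benchmark sizes must pass for the criterion to be met.
passed = all(results.values())
```

With these invented numbers the 1M case passes (2.4 s against a 2.5 s limit) while the 10M case does not (13.0 s against a 12.5 s limit), so the overall criterion would not be met.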

**Recommended Skills**: Experience in GPU kernel development, preferably with CUDA.jl or [KernelAbstractions.jl](https://github.com/JuliaGPU/KernelAbstractions.jl).
General performance optimization and multi-threading.

**Reviewers**: [Jeremie Desgagne-Bouchard](https://github.com/jeremiedb)
