GPU benchmarks

A collection of GPU benchmarks for evaluating the performance of the GPU software stack. The tests are written in CUDA. A simple HIP compatibility layer lets them run on AMD GPUs without modification, and NVIDIA systems do not need HIP as a dependency.
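As a rough illustration of how such a compatibility layer can work, the sketch below maps a handful of CUDA runtime names onto their HIP equivalents when compiled with hipcc, and includes the plain CUDA runtime otherwise. The macro list, the `scale` kernel, and the `scale_on_device` helper are assumptions made for this example and are not taken from the repository's own layer.

```cuda
// Hypothetical shim, for illustration only: the project's real layer may differ.
#if defined(__HIPCC__)
// Building with hipcc: map the CUDA runtime names used below onto HIP.
#include <hip/hip_runtime.h>
#define cudaSuccess            hipSuccess
#define cudaMalloc             hipMalloc
#define cudaFree               hipFree
#define cudaMemcpy             hipMemcpy
#define cudaMemcpyHostToDevice hipMemcpyHostToDevice
#define cudaMemcpyDeviceToHost hipMemcpyDeviceToHost
#define cudaGetLastError       hipGetLastError
#define cudaDeviceSynchronize  hipDeviceSynchronize
#else
// Building with nvcc: use CUDA directly, no HIP headers required.
#include <cuda_runtime.h>
#endif

#include <cassert>
#include <cstddef>
#include <vector>

// Test code written against CUDA names only (hypothetical example kernel).
__global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Multiplies every element of `host` by `factor` on the GPU.
// Returns true if every runtime call succeeded.
bool scale_on_device(std::vector<float>& host, float factor) {
    if (host.empty()) return true;  // nothing to do, avoid a zero-block launch
    const int n = static_cast<int>(host.size());
    const std::size_t bytes = host.size() * sizeof(float);

    float* dev = nullptr;
    if (cudaMalloc((void**)&dev, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        scale<<<blocks, threads>>>(dev, factor, n);
        ok = cudaGetLastError() == cudaSuccess
          && cudaDeviceSynchronize() == cudaSuccess
          && cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}
```

With this approach the benchmark sources only ever mention CUDA names, so the same file can be passed to either nvcc or hipcc.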

ROCm core API

  • Memory allocations
  • Page faults
  • Launch latencies
  • Memory access latencies
  • Memory bandwidth
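To give a sense of what a launch latency test might look like, here is a minimal sketch that times an empty kernel launch together with the wait for its completion, using a host-side clock. The function name `median_launch_latency_us` and the choice of the median as the summary statistic are assumptions for this example; the repository's own tests may measure and report this differently.

```cuda
#include <cuda_runtime.h>

#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

__global__ void empty_kernel() {}

// Median wall-clock time, in microseconds, to launch an empty kernel and wait
// for it to finish (the upper median when `iterations` is even).
// Returns a negative value if `iterations` is not positive or a call fails.
double median_launch_latency_us(int iterations) {
    if (iterations <= 0) return -1.0;

    // Warm-up launch so one-time context and module setup is not timed.
    empty_kernel<<<1, 1>>>();
    if (cudaDeviceSynchronize() != cudaSuccess) return -1.0;

    std::vector<double> samples;
    samples.reserve(iterations);
    for (int i = 0; i < iterations; ++i) {
        auto start = std::chrono::steady_clock::now();
        empty_kernel<<<1, 1>>>();
        if (cudaDeviceSynchronize() != cudaSuccess) return -1.0;
        auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::micro>(stop - start).count());
    }

    // Partially sort so that the middle element is the median sample.
    auto mid = samples.begin() + samples.size() / 2;
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}
```

Note that this measures the full launch-and-synchronize round trip rather than the launch call alone, and the median is used here because individual samples can be skewed by scheduling noise.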

Parallel algorithms

Supports both ROCm's rocPRIM and NVIDIA's CUB/Thrust.

  • Radix sort
  • Prefix sums
  • Reductions
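The three primitives listed above can be sketched with Thrust, which is available on NVIDIA systems and, through rocThrust, on ROCm. The `PrimitiveResults` struct and `run_primitives` function are hypothetical names for this example; an actual benchmark would time each call over large inputs instead of returning the results. For primitive key types, `thrust::sort` typically dispatches to a radix sort, but that is a backend detail, not something this sketch relies on.

```cuda
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/scan.h>
#include <thrust/sort.h>

#include <cassert>
#include <vector>

// Hypothetical container for the outputs of the three primitives.
struct PrimitiveResults {
    std::vector<unsigned> sorted;       // keys in ascending order
    std::vector<unsigned> prefix_sums;  // inclusive prefix sums of the input
    unsigned total;                     // sum of all input elements
};

PrimitiveResults run_primitives(const std::vector<unsigned>& input) {
    // Copy the input to device memory.
    thrust::device_vector<unsigned> keys(input.begin(), input.end());
    thrust::device_vector<unsigned> scanned(keys.size());

    PrimitiveResults r;
    // Reduction: sum of all elements, starting from 0.
    r.total = thrust::reduce(keys.begin(), keys.end(), 0u);
    // Prefix sums: element i holds input[0] + ... + input[i].
    thrust::inclusive_scan(keys.begin(), keys.end(), scanned.begin());
    // Sort the keys in place on the device (done last, after the scan).
    thrust::sort(keys.begin(), keys.end());

    // Copy the results back to host memory.
    r.sorted.resize(keys.size());
    r.prefix_sums.resize(scanned.size());
    thrust::copy(keys.begin(), keys.end(), r.sorted.begin());
    thrust::copy(scanned.begin(), scanned.end(), r.prefix_sums.begin());
    return r;
}
```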

FFT

FFT benchmark for 1D, 2D and 3D transforms. Supports hipFFT / rocFFT and cuFFT.
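As an illustration of the 1D case, the following sketch runs a single forward complex-to-complex transform with cuFFT. The helper name `forward_fft_1d` is an assumption for this example, and only cuFFT is shown; hipFFT exposes a closely matching API with `hipfft` prefixes. A real benchmark would reuse the plan and time repeated executions.

```cuda
#include <cuda_runtime.h>
#include <cufft.h>

#include <cassert>
#include <cstddef>
#include <vector>

// Forward 1D complex-to-complex FFT of `signal`, computed in place on the
// device. Returns an empty vector if the input is empty or any call fails.
std::vector<cufftComplex> forward_fft_1d(const std::vector<cufftComplex>& signal) {
    std::vector<cufftComplex> result;
    if (signal.empty()) return result;

    const int n = static_cast<int>(signal.size());
    const std::size_t bytes = signal.size() * sizeof(cufftComplex);

    cufftComplex* dev = nullptr;
    if (cudaMalloc((void**)&dev, bytes) != cudaSuccess) return result;

    // One plan describes one transform size and type; batch count is 1.
    cufftHandle plan;
    if (cufftPlan1d(&plan, n, CUFFT_C2C, 1) != CUFFT_SUCCESS) {
        cudaFree(dev);
        return result;
    }

    bool ok = cudaMemcpy(dev, signal.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cufftExecC2C(plan, dev, dev, CUFFT_FORWARD) == CUFFT_SUCCESS
           && cudaDeviceSynchronize() == cudaSuccess;
    if (ok) {
        result.resize(signal.size());
        if (cudaMemcpy(result.data(), dev, bytes, cudaMemcpyDeviceToHost) != cudaSuccess) {
            result.clear();
        }
    }

    cufftDestroy(plan);
    cudaFree(dev);
    return result;
}
```

The transform is unnormalized, so a constant input of ones with length n produces a first output bin equal to n and zeros elsewhere. Building this requires linking against cuFFT (for example with `-lcufft`).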

rocSOLVER