GPU accelerated backends

Now that rank-1 through rank-4 arrays are supported (fp32 and fp64), it's time to look into supporting GPU acceleration for function evaluation.

To keep the code portable between NVIDIA and AMD GPUs, I'm thinking of using AMD's HIP. Because of the current state of ROCm, with spotty support on Windows and no support on macOS, this build feature will need to be optional. Additionally, some users may be on systems that have only the CUDA toolkit installed and not ROCm, so we'll need to use preprocessing to map the GPU memory management procedures to either their CUDA or HIP equivalents.
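One way the preprocessor mapping could look is sketched below. This is only an illustration: the `USE_HIP` and `USE_CUDA` macros, the `gpu*` names, and the `roundTrip` helper are hypothetical and not existing symbols in this project. The host-only branch is an added assumption so the sketch also compiles on machines with neither toolkit.

```cpp
// Hypothetical portability shim. USE_HIP / USE_CUDA and the gpu* names are
// illustrative assumptions, not existing project symbols.
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

#if defined(USE_HIP)
  #include <hip/hip_runtime.h>
  #define gpuSuccess            hipSuccess
  #define gpuMalloc             hipMalloc
  #define gpuFree               hipFree
  #define gpuMemcpy             hipMemcpy
  #define gpuMemcpyHostToDevice hipMemcpyHostToDevice
  #define gpuMemcpyDeviceToHost hipMemcpyDeviceToHost
#elif defined(USE_CUDA)
  #include <cuda_runtime.h>
  #define gpuSuccess            cudaSuccess
  #define gpuMalloc             cudaMalloc
  #define gpuFree               cudaFree
  #define gpuMemcpy             cudaMemcpy
  #define gpuMemcpyHostToDevice cudaMemcpyHostToDevice
  #define gpuMemcpyDeviceToHost cudaMemcpyDeviceToHost
#else
  // Host-only fallback (assumption): "device" memory is ordinary heap memory.
  enum gpuMemcpyKind { gpuMemcpyHostToDevice, gpuMemcpyDeviceToHost };
  const int gpuSuccess = 0;
  inline int gpuMalloc(void** ptr, std::size_t bytes) {
    *ptr = std::malloc(bytes);
    return *ptr ? 0 : 1;
  }
  inline int gpuFree(void* ptr) {
    std::free(ptr);
    return 0;
  }
  inline int gpuMemcpy(void* dst, const void* src, std::size_t bytes, gpuMemcpyKind) {
    std::memcpy(dst, src, bytes);
    return 0;
  }
#endif

// Copy n doubles to the "device" and back; returns true if every call succeeded.
inline bool roundTrip(const double* in, double* out, std::size_t n) {
  assert(in != nullptr && out != nullptr);
  const std::size_t bytes = n * sizeof(double);
  void* dev = nullptr;
  if (gpuMalloc(&dev, bytes) != gpuSuccess) return false;
  const bool ok =
      gpuMemcpy(dev, in, bytes, gpuMemcpyHostToDevice) == gpuSuccess &&
      gpuMemcpy(out, dev, bytes, gpuMemcpyDeviceToHost) == gpuSuccess;
  (void)gpuFree(dev);
  return ok;
}
```

With a shim like this, the rest of the code would call only the `gpu*` names, and the build option would decide which branch is compiled.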


## To do list

### Build system
- [ ] Add option for enabling HIP
- [ ] Add option for enabling CUDA
- [ ] Add CUDA and HIP build options to spack package
- [ ] Build with HIP support with fpm?
- [ ] Build with CUDA support with fpm?

### Compute Kernels
We will need the following element-wise functions and operations defined as HIP kernels that operate on device pointers, for both 32-bit and 64-bit data:

- [ ] `c = a+b`
- [ ] `c = a-b`
- [ ] `c = a*b`
- [ ] `c = a/b`
- [ ] `c = a^s` (`s` is a scalar)
- [ ] `c = \abs(a)`
- [ ] `c = \cos(a)`
- [ ] `c = \sin(a)`
- [ ] `c = \tan(a)`
- [ ] `c = \acos(a)`
- [ ] `c = \asin(a)`
- [ ] `c = \atan(a)`
- [ ] `c = \sinh(a)`
- [ ] `c = \cosh(a)`
- [ ] `c = \tanh(a)`
- [ ] `c = \sqrt(a)`
- [ ] `c = \ln(a)` (natural logarithm)
- [ ] `c = \log(a)` (log base-10)
- [ ] `c = -a` (sign flip)
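
As a rough sketch of how the kernels in this list might be organized, the example below templates each kernel on the data type (`float` or `double`) and on a small functor per operation, so one kernel body covers a whole family of operations. All names here (`Add`, `Cos`, `PowScalar`, `unaryKernel`, `binaryKernel`, and the host reference loops) are hypothetical. The `__global__` kernels are compiled only under `hipcc` or `nvcc`; the host loops are included as an assumption so the operations can be checked without a GPU.

```cpp
// Hypothetical element-wise kernel layout; none of these names exist in the project.
#include <cassert>
#include <cmath>
#include <cstddef>

#if defined(__HIPCC__)
  #include <hip/hip_runtime.h>
#endif

#if defined(__HIPCC__) || defined(__CUDACC__)
  #define HOST_DEVICE __host__ __device__
#else
  #define HOST_DEVICE
#endif

// One functor per operation; templated call operators cover fp32 and fp64.
struct Add {
  template <typename T> HOST_DEVICE T operator()(T a, T b) const { return a + b; }
};
struct Cos {
  template <typename T> HOST_DEVICE T operator()(T a) const { return std::cos(a); }
};
template <typename T>
struct PowScalar {  // c = a^s, with s a scalar
  T s;
  HOST_DEVICE T operator()(T a) const { return std::pow(a, s); }
};

#if defined(__HIPCC__) || defined(__CUDACC__)
// Device kernels: one thread per element, guarded against running past n.
template <typename T, typename Op>
__global__ void unaryKernel(const T* a, T* c, std::size_t n, Op op) {
  std::size_t i = blockIdx.x * static_cast<std::size_t>(blockDim.x) + threadIdx.x;
  if (i < n) c[i] = op(a[i]);
}
template <typename T, typename Op>
__global__ void binaryKernel(const T* a, const T* b, T* c, std::size_t n, Op op) {
  std::size_t i = blockIdx.x * static_cast<std::size_t>(blockDim.x) + threadIdx.x;
  if (i < n) c[i] = op(a[i], b[i]);
}
#endif

// Host reference loops that apply the same functors, useful for validation.
template <typename T, typename Op>
void unaryHost(const T* a, T* c, std::size_t n, Op op) {
  assert(a != nullptr && c != nullptr);
  for (std::size_t i = 0; i < n; ++i) c[i] = op(a[i]);
}
template <typename T, typename Op>
void binaryHost(const T* a, const T* b, T* c, std::size_t n, Op op) {
  assert(a != nullptr && b != nullptr && c != nullptr);
  for (std::size_t i = 0; i < n; ++i) c[i] = op(a[i], b[i]);
}
```

The remaining operations in the list would follow the same pattern with one extra functor each, and the host loops give a reference to compare device results against for both precisions.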
