Unable to run the FADIAB GPU test without data transfer #50

@sjsprecious

What happened?

Currently, the FADIAB test runs on Derecho's GPUs, but it transfers data between the CPU and the GPU at every time step.

Ideally, the data would be copied to the GPU only once, at the first time step, and then stay GPU-resident, since the FADIAB test computes no physics. However, when I manually comment out the d_p_coupling and p_d_coupling routines in stepon.F90, the run fails with the following error message in the atm.log.xxx file:

SHR_REPROSUM_CALC: Input contains  0.00000E+00 NaNs and  0.37101E+05 INFs on process       0
 shr_reprosum_calc ERROR: NaNs or INFs in input
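
For reference, the log above reports 0 NaNs and 37101 INFs (0.37101E+05) in the input on process 0. The sketch below shows that kind of non-finite count on a plain list of floats, in case it helps when checking which fields become INF once the coupling routines are skipped. It is not the shr_reprosum_calc implementation; the function name `count_nonfinite` and the sample data are hypothetical.

```python
import math


def count_nonfinite(values):
    """Return (nan_count, inf_count) for an iterable of floats.

    Hypothetical helper for illustration only; it mirrors the two
    counts printed in the log message, not the CAM/CESM code itself.
    """
    nan_count = 0
    inf_count = 0
    for v in values:
        if math.isnan(v):
            nan_count += 1
        elif math.isinf(v):
            # Counts both +inf and -inf.
            inf_count += 1
    return nan_count, inf_count


# Made-up sample data: two INFs (one of each sign) and one NaN.
sample = [1.0, float("inf"), -float("inf"), float("nan"), 2.5]
nans, infs = count_nonfinite(sample)
print(f"{nans} NaNs and {infs} INFs")
```

Applying a check like this to individual fields before the reproducible sum is called would narrow down which input carries the INFs; which fields to inspect is left open here.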

What are the steps to reproduce the bug?

Create the FADIAB test with the GNU compiler, enable the Kokkos CUDA target on Derecho, choose the theta-l_kokkos dycore, and manually comment out the d_p_coupling and p_d_coupling routines in stepon.F90.

What CAM tag were you using?

stormspeed

What machine were you running CAM on?

CISL machine (e.g. cheyenne)

What compiler were you using?

GNU

Path to a case directory, if applicable

/glade/derecho/scratch/sunjian/stormspeed/run/FADIAB.ne30_ne30_mg17.derecho.gnu.theta-l_kokkos.gpu04_mpi0004.30tracers.no_p_d_coupling/run

Will you be addressing this bug yourself?

No

Extra info

No response

Labels

bug: Something isn't working
