Open
Description
The implementation of gas_optical_depths_minor_kernel has a synchronization bug.
rte-rrtmgp-cpp/src_kernels_cuda/gas_optics_kernels.cu
Lines 585 to 594 in 6712fc1
                }
            }
            scalings[threadIdx.z][threadIdx.y] = scaling;
        }
        __syncthreads();
        scaling = scalings[threadIdx.z][threadIdx.y];
        const int gpt_start = minor_limits_gpt[2*imnr]-1;
This kernel writes to shared memory, calls __syncthreads, and then reads from shared memory again. However, since there is no barrier after the read, some threads may already start the next loop iteration and overwrite shared memory before other threads have read the value from the previous iteration.
Verification shows that this bug does indeed sometimes occur.
The solution would be to add another __syncthreads on L593, after reading scalings.
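
For illustration, here is a minimal sketch of the write / barrier / read / barrier pattern in a loop. It is not the actual kernel: `broadcast_loop`, `run_broadcast_loop`, `BLOCK_X`, `N_ITER` and the single `shared_value` slot are hypothetical stand-ins, and it assumes that one thread writes the shared value while the whole block reads it. The second `__syncthreads()` is the barrier proposed above. Without it, a thread in another warp could begin iteration `i + 1` and overwrite `shared_value` while slower threads are still reading the value for iteration `i`.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int BLOCK_X = 64;  // hypothetical block size (two warps)
constexpr int N_ITER = 128;  // hypothetical number of loop iterations

// Hypothetical stand-in for a kernel that reuses shared memory in a loop.
__global__ void broadcast_loop(int* out)
{
    __shared__ int shared_value;

    for (int i = 0; i < N_ITER; ++i)
    {
        // One thread produces the value for this iteration.
        if (threadIdx.x == 0)
            shared_value = i;

        // Barrier 1: the write must be visible before any thread reads it.
        __syncthreads();

        const int value = shared_value;

        // Barrier 2 (the proposed fix): no thread may enter iteration i + 1
        // and overwrite shared_value until every thread has read it.
        __syncthreads();

        out[i * BLOCK_X + threadIdx.x] = value;
    }
}

// Launches one block and returns the number of threads that read a wrong
// value, or -1 if a CUDA call failed.
int run_broadcast_loop()
{
    const int n = N_ITER * BLOCK_X;
    int* d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess)
        return -1;

    broadcast_loop<<<1, BLOCK_X>>>(d_out);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    std::vector<int> h_out(n);
    const cudaError_t err =
        cudaMemcpy(h_out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess)
        return -1;

    int mismatches = 0;
    for (int i = 0; i < N_ITER; ++i)
        for (int t = 0; t < BLOCK_X; ++t)
            if (h_out[i * BLOCK_X + t] != i)
                ++mismatches;
    return mismatches;
}
```

Calling `run_broadcast_loop()` from a host `main` should report zero mismatches with both barriers in place. Removing the second barrier reintroduces the race. Because the race is timing dependent, a run without it may still pass, which matches the observation that the bug only sometimes occurs.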