cudaLimitDevRuntimeSyncDepth is the maximum grid depth at which a thread
can issue the device runtime call cudaDeviceSynchronize() to wait on
child grid launches to complete.
Use of cudaDeviceSynchronize() in device code was deprecated in CUDA
11.6 and removed for devices with compute capability 9.0 or higher.
On other devices it requires explicit opt-in via the compile-time flag
-DCUDA_FORCE_CDP1_IF_SUPPORTED.
The current code fails at runtime on NVIDIA H100 and newer GPUs.