Remove eager synchronization with HtoD copies. (#2625)
We assumed that copies from unpinned memory would always
synchronize, but that does not seem to be the case. For some copy
sizes (and potentially on some memory architectures, e.g. coherent
ones) the copy is fully asynchronous.
This optimization was made to make `CuRef` of a scalar fully async.
I considered making the `CuRef` ctor call `memset` instead, which
is always asynchronous because the fill value is passed by value.
However, that does not support 64-bit floats, while a `memcpy` of
64 bits is still executed fully asynchronously.