Commit 3d45d85

Remove eager synchronization with HtoD copies. (#2625)
We assumed that copies from unpinned memory would always synchronize, but that does not seem to be the case. For some copy sizes (and potentially on some memory architectures, e.g. coherent ones) the copy is fully asynchronous. This optimization was made to make `CuRef` of a scalar fully async. I considered making the `CuRef` ctor call `memset` instead, which is always asynchronous by virtue of passing the memory by value; however, that does not support 64-bit floats, while a `memcpy` of 64 bits is still executed fully asynchronously.
1 parent d07a245 commit 3d45d85

1 file changed, +4 -12 lines changed

src/array.jl

Lines changed: 4 additions & 12 deletions
@@ -527,12 +527,9 @@ Base.copyto!(dest::DenseCuArray{T}, src::DenseCuArray{T}) where {T} =
 function Base.unsafe_copyto!(dest::DenseCuArray{T}, doffs,
                              src::Array{T}, soffs, n) where T
   context!(context(dest)) do
-    # operations on unpinned memory cannot be executed asynchronously, and synchronize
-    # without yielding back to the Julia scheduler. prevent that by eagerly synchronizing.
-    if use_nonblocking_synchronization
-      is_pinned(pointer(src)) || synchronize()
-    end
-
+    # the copy below may block in `libcuda`, so it'd be good to perform a nonblocking
+    # synchronization here, but the exact cases are hard to know and detect (e.g., unpinned
+    # memory normally blocks, but not for all sizes, and not on all memory architectures).
     GC.@preserve src dest begin
       unsafe_copyto!(pointer(dest, doffs), pointer(src, soffs), n; async=true)
       if Base.isbitsunion(T)
@@ -546,12 +543,7 @@ end
 function Base.unsafe_copyto!(dest::Array{T}, doffs,
                              src::DenseCuArray{T}, soffs, n) where T
   context!(context(src)) do
-    # operations on unpinned memory cannot be executed asynchronously, and synchronize
-    # without yielding back to the Julia scheduler. prevent that by eagerly synchronizing.
-    if use_nonblocking_synchronization
-      is_pinned(pointer(dest)) || synchronize()
-    end
-
+    # the copy below may block in `libcuda`; see the note above.
     GC.@preserve src dest begin
       # semantically, it is not safe for this operation to execute asynchronously, because
       # the Array may be collected before the copy starts executing. However, when using
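
The commit message says that whether a host-to-device copy blocks depends on the copy size and on whether the host memory is pinned. One rough way to look at this is to time how long the host spends inside the copy call itself, before any explicit synchronization. The sketch below is not part of this commit. It assumes a working CUDA.jl installation; `host_call_time` is a hypothetical helper and the sizes are arbitrary. It only measures host wall-clock time, so it hints at which copies return early but does not show what `libcuda` does internally.

```julia
using CUDA

# Hypothetical helper: seconds the host spends inside the copy call,
# measured before any explicit synchronization.
function host_call_time(dest::CuArray, src::Array)
    CUDA.synchronize()                # start from an idle stream
    t = @elapsed copyto!(dest, src)   # short if the call returned before the copy finished
    CUDA.synchronize()                # wait for the copy before the buffers are reused
    return t
end

if CUDA.functional()
    for n in (1, 2^10, 2^20, 2^24)    # arbitrary element counts
        unpinned = rand(Float64, n)
        pinned = CUDA.pin(rand(Float64, n))   # page-lock the host array
        dest = CUDA.zeros(Float64, n)

        # warm-up calls so compilation time is not measured
        host_call_time(dest, unpinned)
        host_call_time(dest, pinned)

        t_unpinned = host_call_time(dest, unpinned)
        t_pinned = host_call_time(dest, pinned)
        println("n = ", n, ": unpinned ", t_unpinned, " s, pinned ", t_pinned, " s")
    end
end
```

If the timings for small unpinned copies are close to the pinned ones, that is consistent with the observation that unpinned copies do not always synchronize. Results will vary with the driver, the GPU and the memory architecture.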