Open
Description
I observed that broadcast tends to allocate less memory than map!. E.g.,
julia> @time X .= .+(X,X); @time map!(+, X, X, X);
0.000074 seconds (33 allocations: 608 bytes)
0.000121 seconds (64 allocations: 2.078 KiB)

Also, we should cache the spmv buffer to reduce the matmul time.
https://github.com/JuliaGPU/CUDA.jl/blob/1389800a26078df16cc689ed5138f0691185ce61/lib/cusparse/generic.jl#L181-L189
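To make the caching suggestion concrete, here is a minimal sketch of the idea, assuming the cost comes from requesting a fresh scratch buffer on every call. The names BufferCache, get_buffer! and with_scratch are hypothetical and not part of CUDA.jl, and a plain Vector{UInt8} stands in for the GPU buffer so the snippet runs without a GPU.

```julia
# Hypothetical sketch: reuse one scratch buffer across calls instead of
# allocating a new one each time. None of these names come from CUDA.jl;
# a CPU Vector{UInt8} stands in for the device buffer.

mutable struct BufferCache
    buf::Vector{UInt8}
    allocations::Int   # number of times the buffer had to be (re)allocated
end

BufferCache() = BufferCache(UInt8[], 0)

# Return a buffer of at least `nbytes` bytes, growing only when needed.
function get_buffer!(cache::BufferCache, nbytes::Integer)
    if length(cache.buf) < nbytes
        cache.buf = Vector{UInt8}(undef, nbytes)
        cache.allocations += 1
    end
    return cache.buf
end

# Stand-in for an operation that needs `nbytes` of scratch space.
function with_scratch(f, cache::BufferCache, nbytes::Integer)
    buf = get_buffer!(cache, nbytes)
    return f(buf)
end

cache = BufferCache()
for _ in 1:100
    with_scratch(cache, 1024) do buf
        length(buf)   # placeholder for the real work
    end
end
```

After the loop, cache.allocations is 1 rather than 100, since every call after the first reuses the same buffer. A real version would also need to decide where the cache lives (per array, per task, or per stream) and when to release it, which this sketch does not address.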