Commit f50a1ea

use 256 threads
1 parent 3a52020 commit f50a1ea

1 file changed (+1, -1)

src/backends/cudanative/cudanative.jl

Lines changed: 1 addition & 1 deletion
@@ -293,7 +293,7 @@ function acc_mapreduce{T, OT, N}(
     )
     dev = context(A).device
     @assert(CUDAdrv.capability(dev) >= v"3.0", "Current CUDA reduce implementation requires a newer GPU")
-    threads = 512
+    threads = 256
     blocks = min((length(A) + threads - 1) ÷ threads, 1024)
     out = similar(buffer(A), OT, (blocks,))
     args = map(unpack_cu_array, rest)
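
The change only touches `threads`, but `blocks` is derived from it on the next line, so the launch shape shifts for most input sizes. The sketch below reproduces that arithmetic in plain Julia, with no GPU packages, to show the effect. `launch_config` and `max_blocks` are hypothetical names introduced here for illustration; they are not part of the package.

```julia
# Hypothetical helper mirroring the launch-size arithmetic from the diff.
# `launch_config` and `max_blocks` are illustrative names, not package API.
function launch_config(n::Integer; threads::Integer = 256, max_blocks::Integer = 1024)
    # Ceiling division gives one thread per element, then the result is
    # capped at `max_blocks`, exactly as in the `blocks = min(...)` line.
    blocks = min((n + threads - 1) ÷ threads, max_blocks)
    return threads, blocks
end

# Compare the old (512) and new (256) settings for a few input lengths.
for n in (1, 256, 257, 100_000, 1_000_000)
    _, b_old = launch_config(n; threads = 512)
    _, b_new = launch_config(n; threads = 256)
    println("n = $n: 512 threads -> $b_old blocks, 256 threads -> $b_new blocks")
end
```

Below the cap, halving `threads` roughly doubles `blocks` (for `n = 100_000`, 196 blocks become 391). Once the cap of 1024 blocks is reached, the total number of launched threads drops from 1024 × 512 to 1024 × 256, so each thread presumably has to cover more elements; how the kernel handles that is not visible in this diff.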
