Commit f299182

Use a flooring division to ensure we launch the required number of threads.
1 parent f742727 commit f299182

1 file changed: +1 -1 lines changed

src/device/execution.jl

Lines changed: 1 addition & 1 deletion
@@ -99,7 +99,7 @@ function launch_configuration(backend::AbstractGPUBackend, heuristic;
 
     if elements_per_thread > 1 && blocks > heuristic.blocks
         # we want to launch more blocks than required, so prefer a grid-stride loop instead
-        nelem = clamp(cld(blocks, heuristic.blocks), 1, elements_per_thread)
+        nelem = clamp(fld(blocks, heuristic.blocks), 1, elements_per_thread)
         blocks = cld(blocks, nelem)
         (threads=threads, blocks=blocks, elements_per_thread=nelem)
     else
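
The effect of swapping `cld` for `fld` can be seen with a small arithmetic sketch. The helper `grid_stride_config` below is hypothetical (it is not part of the package) and only mirrors the two changed lines. The input numbers are illustrative, and it assumes `heuristic.blocks` is the block count the launch heuristic wants to keep busy.

```julia
# Hypothetical helper mirroring the changed branch of `launch_configuration`.
# `divide` is the division used to pick elements per thread: `cld` (old) or `fld` (new).
function grid_stride_config(divide, blocks, heuristic_blocks, elements_per_thread)
    nelem = clamp(divide(blocks, heuristic_blocks), 1, elements_per_thread)
    return (blocks = cld(blocks, nelem), elements_per_thread = nelem)
end

# Illustrative inputs (not taken from the commit):
# 5 blocks needed, the heuristic suggests 4, at most 8 elements per thread.
old = grid_stride_config(cld, 5, 4, 8)  # rounds up:   nelem = 2, blocks = 3
new = grid_stride_config(fld, 5, 4, 8)  # rounds down: nelem = 1, blocks = 5
```

With the ceiling division, the launch in this example drops to 3 blocks, below the 4 the heuristic asked for. With the flooring division, `nelem` never exceeds `blocks / heuristic_blocks`, so the resulting block count stays at or above the heuristic's value.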
