Commit 5d09154

Simplify and optimize mGPU tile intersection sort (#258)
1. Add prefetching for input and output keys and values for radix sort to avoid page faults
2. Avoid forking threads for radix sort merging to avoid pthread overhead

Reduces time per iteration by about 2 ms

Signed-off-by: Matthew Cong <mcong@nvidia.com>
1 parent 151faf3 commit 5d09154

1 file changed: +181 -177 lines changed
