Open
Description
This benchmark should show how many tuples a given number of GPU threads can scan and copy to a send buffer per unit of time. Expected outcome: a plot with the number of threads on the x-axis and tuple throughput on the y-axis.
We should test this with different tuple sizes to see if there are interesting effects when the data to be copied per tuple is larger.
With these results, we can estimate roughly how large the tuples must be and how many threads we need to fill the send buffers fast enough for the shuffle to reach a throughput close to the available bandwidth (BW).
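As a rough illustration of that estimate, here is a minimal C++ sketch. The function name `threads_needed` and all of its parameters are hypothetical, and it assumes that throughput scales linearly with the thread count, which is exactly what the benchmark would have to confirm or refute.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of any existing code base.
// Estimates how many GPU threads are needed so that the rate at which
// tuples are copied into the send buffers matches a target bandwidth.
//
// Assumption: every thread sustains the same measured tuple rate, so the
// total throughput grows linearly with the number of threads.
std::size_t threads_needed(double target_bw_bytes_per_s,
                           double tuples_per_s_per_thread,
                           std::size_t tuple_size_bytes) {
    assert(target_bw_bytes_per_s >= 0.0);
    assert(tuples_per_s_per_thread > 0.0);
    assert(tuple_size_bytes > 0);

    // Bytes one thread moves per second at the measured tuple rate.
    const double bytes_per_s_per_thread =
        tuples_per_s_per_thread * static_cast<double>(tuple_size_bytes);

    // Round up: a fractional thread still requires a whole one.
    return static_cast<std::size_t>(
        std::ceil(target_bw_bytes_per_s / bytes_per_s_per_thread));
}
```

With placeholder inputs (not measurements) of a 100 GB/s target, 10^6 tuples per second per thread and 100-byte tuples, `threads_needed(100e9, 1e6, 100)` gives 1000 threads. If the measured curve flattens at higher thread counts, this linear estimate is only a lower bound.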