This is a Spark scheduling question, not a spark-rapids plugin question.
I'm assuming you are allowing more than one task to run on the GPU (i.e. spark.task.resource.gpu.amount=1/24)? If so, then Spark is doing just fine.
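For reference, here is a minimal sketch (not from the original post) of the kind of GPU-sharing setup being assumed above: one GPU per executor, shared by up to 24 concurrent tasks. Only spark.task.resource.gpu.amount=1/24 comes from the question; the app name and the other settings are illustrative assumptions.

```scala
// Minimal sketch of the assumed configuration: one GPU per executor,
// shared by ~24 concurrent tasks. Values are illustrative, not prescriptive.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("gpu-sharing-example")                               // hypothetical app name
  .config("spark.plugins", "com.nvidia.spark.SQLPlugin")        // enable the spark-rapids plugin
  .config("spark.executor.resource.gpu.amount", "1")            // each executor owns one GPU
  .config("spark.task.resource.gpu.amount", (1.0 / 24).toString) // ~0.0417: up to 24 tasks share the GPU
  .getOrCreate()
```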

More than likely the reason is locality. Spark uses data locality to decide where to place tasks, putting them where it thinks is most efficient, meaning the node where most of their data already lives, so you don't have to transfer as much over the network.
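For what it's worth, the standard locality-wait settings are one knob that influences this placement. This is only a hedged illustration; the 0s values below are assumptions for demonstration, not a recommendation from this thread.

```scala
// Relaxing locality wait makes the scheduler stop waiting for a data-local
// executor sooner, which tends to spread tasks across nodes at the cost of
// more network transfer. The 0s values are assumptions for illustration.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.locality.wait", "0s")       // don't wait for process/node-local slots
  .set("spark.locality.wait.node", "0s")  // node-level override (optional)
```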

We were actually doing a little experimenting with adding an option to Spark to force it to spread the tasks, but in that particular case the performance was the same or worse. You may get some efficiencies from spreading them by having more resources there, but at t…

Answer selected by sameerz
This discussion was converted from issue #678 on April 28, 2022 23:29.