GPU type 1: please try much faster output-driven gpu_method=3 #745
ahbarnett
announced in
Announcements
Replies: 1 comment
I would also add that for type 2 we (thanks to Robert Blackwell) tweaked the parallelism to make better use of newer GPUs. On recent hardware this transform can also be 2-4x faster.
Please note that for 3-10x faster type-1 GPU transforms, use the option gpu_method=3. This is a new output-driven algorithm that exploits more threads, written by Marco Barbone and building on Julia work by Juan Polanco. It is not the default yet, since we'd like users to try it out first. It doesn't affect type-2 transforms, which are already fast. Enjoy!
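For anyone wanting to try this from Python, here is a minimal sketch of opting in. It assumes the cufinufft Python package with CuPy arrays, and that extra keyword arguments to `cufinufft.nufft2d1` are forwarded to the GPU options; the helper names `type1_options` and `run_type1_2d`, the problem sizes, and the tolerance are illustrative choices, not part of the announcement.

```python
def type1_options(output_driven=True, eps=1e-6):
    """Build keyword options for a type-1 GPU transform.

    gpu_method=3 is the option named in the announcement. Leaving it out
    keeps whatever method the library currently uses by default.
    """
    opts = {"eps": eps}
    if output_driven:
        opts["gpu_method"] = 3
    return opts


def run_type1_2d(n_points=1_000_000, n_modes=(256, 256), output_driven=True):
    """Run one 2D type-1 transform (needs cufinufft, cupy, and a CUDA GPU)."""
    # Imported lazily so that type1_options() stays usable without a GPU.
    import numpy as np
    import cupy as cp
    import cufinufft

    rng = np.random.default_rng(0)
    # Nonuniform points in [-pi, pi) and complex strengths, single precision.
    x = cp.asarray(rng.uniform(-np.pi, np.pi, n_points).astype(np.float32))
    y = cp.asarray(rng.uniform(-np.pi, np.pi, n_points).astype(np.float32))
    c = cp.asarray(
        (rng.standard_normal(n_points) + 1j * rng.standard_normal(n_points))
        .astype(np.complex64)
    )

    # Assumption: keyword options are passed through to the GPU options struct.
    return cufinufft.nufft2d1(x, y, c, n_modes, **type1_options(output_driven))
```

Calling `run_type1_2d(output_driven=True)` and `run_type1_2d(output_driven=False)` on the same machine, with your own timing around each call, is one way to compare the new method against the current default on your hardware.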