There is no one-size-fits-all answer here, because it's all about tradeoffs.

Broadly speaking, you have an array of inputs that you would like to map to an array of outputs. You also have a device (GPU or TPU) that's purpose-built for array-oriented computing.

There are two options:

(1) Embrace array-oriented computing: compute your function for every entry in the array, taking advantage of the implicit parallelism in the device architecture, and then mask out the results you don't want. Some computation is wasted, because you compute an expensive result that is thrown away in some cases, but the benefit is that you are using the hardware in precisely th…
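
As a rough illustration of option (1), here is a minimal sketch assuming the code is written in JAX. `expensive_fn` is a hypothetical stand-in for the costly function, and the even/odd mask and the fill value `0.0` are arbitrary choices for the example, not anything from the original question.

```python
import jax
import jax.numpy as jnp


def expensive_fn(x):
    # Hypothetical stand-in for a costly per-element computation.
    return jnp.sin(x) ** 2 + jnp.sqrt(jnp.abs(x))


@jax.jit
def compute_then_mask(x, mask):
    # Evaluate the function for every entry, whether it is wanted or not.
    y = expensive_fn(x)
    # Discard the unwanted entries by replacing them with a fill value.
    return jnp.where(mask, y, 0.0)


x = jnp.arange(6, dtype=jnp.float32)
mask = x % 2 == 0  # illustrative assumption: keep only even-valued inputs
out = compute_then_mask(x, mask)
```

The work done for the masked-out entries is wasted, but the whole computation stays a single vectorized array operation with static shapes, which is the tradeoff described above.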

Answer selected by itk22