[Backend] Fix predicates for device assert inside reduction/scan region (#5033)
Reductions have special handling for side-effectful "combine ops" (e.g.
"add" for a sum reduction). When the combine op has side effects, a
predicate is computed to determine whether a thread should participate
in the reduction, so that invalid/uninitialized data is never operated on.
See #4811 for more details.
~Previously, the predicate logic was incorrect for 2D reductions. This
PR fixes the logic and adds a python test.~
Edit: after additional discussion with @peterbell10, we removed the
lanePred logic. Here's our thinking on why this is valid:
* lanePred info is computed based entirely on the blocked layout info
and properties of the reduction
* the blocked layout won't tell you which threads do or don't have
uninitialized data
Instead, the motivation for #4811 appears to be uninitialized values,
which can be indicated by the `pred` variable passed into
`warpReduce()`.
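
To illustrate the idea of gating a side-effectful combine op on a per-lane predicate, here is a minimal plain-Python model. It is not Triton's C++ `warpReduce()`: the names `warp_reduce` and `checked_add` are hypothetical, uninitialized data is modeled as `-1`, and the sketch assumes the predicate is uniform within each aligned reduction group, so a participating lane never shuffles in a value from a non-participating one.

```python
def warp_reduce(values, pred, combine, group_size):
    """Butterfly (XOR-shuffle) reduction over one simulated warp.

    `combine` runs only on lanes whose `pred` entry is True, so a
    side-effectful combine op never sees data from masked-off lanes.
    `group_size` must be a power of two.
    """
    acc = list(values)
    offset = group_size // 2
    while offset >= 1:
        # Each lane reads its partner's accumulator (models a warp shuffle).
        shuffled = [acc[lane ^ offset] for lane in range(len(acc))]
        # Masked-off lanes keep their old value and skip the combine op.
        acc = [combine(a, s) if p else a
               for a, s, p in zip(acc, shuffled, pred)]
        offset //= 2
    return acc


def checked_add(a, b):
    # Stand-in for a combine region that contains a device assert.
    assert a >= 0 and b >= 0, "assert inside combine op fired"
    return a + b


# Lanes 0-3 hold valid data; lanes 4-7 hold garbage (modeled as -1).
values = [1, 2, 3, 4, -1, -1, -1, -1]
pred = [True] * 4 + [False] * 4
result = warp_reduce(values, pred, checked_add, group_size=4)
```

With the predicate in place, the assert inside `checked_add` is never evaluated on the garbage lanes. Passing an all-`True` predicate instead makes the assert fire on lanes 4-7, which is the kind of spurious failure the predicate is meant to prevent.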