Commit 510aaf5
Adds mask reduction utility function
Implements a device function that performs a logical OR reduction across mask tensor elements and combines the result across cooperating threads using warp-level primitives.
Enables efficient sparse attention pattern processing by allowing threads to collectively determine whether any mask elements are active within a given region.
1 parent e23b08f commit 510aaf5
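The diff body is not reproduced here, so the following is only a minimal sketch of the kind of OR reduction the commit message describes, not the committed code. The names `any_active_in_block`, `any_active_kernel`, and `run_any_active` are hypothetical, as is the flat `bool` mask layout. The sketch assumes the reduction covers one thread block and uses the CUDA intrinsic `__syncthreads_or`; a warp-only variant could use `__any_sync(0xffffffff, local)` instead.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper (name and signature are assumptions, not from the commit).
// Returns true on every thread of the block if any of the n mask entries is set.
// Must be called by all threads of the block, since it contains a barrier.
__device__ bool any_active_in_block(const bool* mask, int n) {
    // Each thread ORs together a strided slice of the mask region.
    bool local = false;
    for (int i = threadIdx.x; i < n; i += blockDim.x) {
        local = local || mask[i];
    }
    // Block-wide barrier combined with an OR over every thread's predicate.
    return __syncthreads_or(local ? 1 : 0) != 0;
}

// Hypothetical test kernel: thread 0 writes the block-wide result to *out.
__global__ void any_active_kernel(const bool* mask, int n, int* out) {
    bool any = any_active_in_block(mask, n);
    if (threadIdx.x == 0) {
        *out = any ? 1 : 0;
    }
}

// Host-side driver: builds an all-false mask of n entries, optionally sets one
// entry (active_index >= 0), and returns 1 if the kernel saw an active entry.
int run_any_active(int n, int active_index) {
    bool* d_mask = nullptr;
    int* d_out = nullptr;
    int h_out = -1;
    cudaMalloc(&d_mask, n * sizeof(bool));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemset(d_mask, 0, n * sizeof(bool));
    if (active_index >= 0) {
        bool one = true;
        cudaMemcpy(d_mask + active_index, &one, sizeof(bool),
                   cudaMemcpyHostToDevice);
    }
    any_active_kernel<<<1, 128>>>(d_mask, n, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_mask);
    cudaFree(d_out);
    return h_out;
}
```

In a sparse attention kernel, a result like this would typically let a whole block skip a tile whose mask region is entirely inactive. Because every thread receives the same answer, the branch that follows does not diverge.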
1 file changed: 19 additions, 0 deletions
Lines 336 to 354 of the new file are added after original line 335; original lines 336 to 338 become lines 355 to 357.