Commit 1cc51c6
[CUDA][avgpool2d] Fix backward launch bounds again for `sm100`, `sm120` (pytorch#150640)
`__CUDA_ARCH__` is not visible in host code, which causes incorrect launch bounds and a `too many resources requested for launch` error on Blackwell.
Pull Request resolved: pytorch#150640
Approved by: https://github.com/malfet, https://github.com/drisspg, https://github.com/atalman
(cherry picked from commit 09c4da9)
Co-authored-by: Eddie Yan <[email protected]>
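
The pitfall named in the commit message can be illustrated with a minimal host-side sketch. This is not the code from this commit: the function names and the thread counts (`kDefaultThreads`, `kReducedThreads`) are hypothetical and chosen only for illustration. The one fact it relies on is that `__CUDA_ARCH__` is defined only during device compilation passes, so a host-side `#if` on it always takes the fallback branch.

```cpp
// Hypothetical thread limits, chosen only for illustration.
constexpr int kDefaultThreads = 1024;
constexpr int kReducedThreads = 512;

// Pitfall: __CUDA_ARCH__ is defined only while compiling device code.
// In host code this condition is always false, so the default value is
// returned no matter which GPU is installed.
int threads_from_macro() {
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 1000)
  return kReducedThreads;
#else
  return kDefaultThreads;
#endif
}

// Alternative: decide at run time from the device's compute capability.
// In a real CUDA program `major` would typically come from
// cudaGetDeviceProperties (the cudaDeviceProp::major field); here it is a
// plain argument so the sketch compiles without the CUDA toolkit.
int threads_from_runtime(int major) {
  return major >= 10 ? kReducedThreads : kDefaultThreads;
}
```

Under these assumptions, `threads_from_macro()` yields the default on the host even when the device is `sm100` or `sm120`, while `threads_from_runtime(10)` and `threads_from_runtime(12)` pick the reduced value. If the kernel itself was compiled with tighter `__launch_bounds__` for those architectures, a host-side launch using the default can exceed them, which is consistent with the `too many resources requested for launch` error described above.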
1 parent 28ca4dd commit 1cc51c6
1 file changed, +6 -5 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 402 | 402 | |
| 403 | 403 | |
| 404 | 404 | |
| 405 | | - |
| 406 | | - |
| 407 | | - |
| 408 | | - |
| 409 | | - |
| | 405 | + |
| | 406 | + |
| | 407 | + |
| | 408 | + |
| | 409 | + |
| | 410 | + |
| 410 | 411 | |
| 411 | 412 | |
| 412 | 413 | |