Commit 781c5f5
[ROCm] cpp_extension allow user to override default flags (pytorch#152432) (#2374)
cherry-pick of
pytorch@e4adf5d
We need -fgpu-rdc for projects such as DeepEP + rocSHMEM. The default of
-fno-gpu-rdc doesn't work for such cases.
As per
pytorch#152432 (comment):
"rocshmem shares the same global variable in different files, as deepEP
uses CUDAExtention to build the project
https://github.com/deepseek-ai/DeepEP/blob/65e2a700f0330f3fb1c26f49a0250d1f9d0ac1e3/setup.py#L51
and depends on rocshmem, this -fgpu-rdc is needed. The current logic in
Pytorch prevents users from overriding this flag."
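The override behaviour described above can be sketched in plain Python. This is only an illustration of the idea, not the code from this commit: the function name `merge_hip_flags` and the constant `DEFAULT_RDC_FLAG` are hypothetical, and the sketch assumes the default is spelled `-fno-gpu-rdc` and that any user flag containing `gpu-rdc` counts as an explicit choice.

```python
# Hypothetical sketch: these names are illustrative and are not PyTorch's API.
DEFAULT_RDC_FLAG = "-fno-gpu-rdc"  # assumed spelling of the default flag


def merge_hip_flags(user_flags, default_flags=(DEFAULT_RDC_FLAG,)):
    """Return the user's flags plus any defaults the user did not override."""
    # Treat either -fgpu-rdc or -fno-gpu-rdc as an explicit user decision.
    user_sets_rdc = any("gpu-rdc" in flag for flag in user_flags)

    merged = list(user_flags)
    for flag in default_flags:
        if "gpu-rdc" in flag and user_sets_rdc:
            continue  # the user already chose an RDC mode; do not force ours
        if flag not in merged:
            merged.append(flag)
    return merged


if __name__ == "__main__":
    print(merge_hip_flags(["-O3"]))               # default RDC flag is appended
    print(merge_hip_flags(["-O3", "-fgpu-rdc"]))  # user's choice is kept as-is
```

In a real build the user flags would presumably arrive through the extension's `extra_compile_args`. The design point is that a default is appended only when the user has not expressed a preference.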
Pull Request resolved: pytorch#152432
Approved by: https://github.com/jeffdaily
Co-authored-by: Jithun Nair <[email protected]>
Co-authored-by: Jeff Daily <[email protected]>
1 parent 76481f7 commit 781c5f5
1 file changed: 11 additions, 4 deletions (diff hunks at new lines 2110-2127, 2134-2140, and 2319-2326; line contents not preserved)