Commit cd0f7aa
[ROCm] cpp_extension allow user to override default flags (pytorch#152432) (#2374)
Cherry-pick of pytorch@e4adf5d
We need -fgpu-rdc for projects such as DeepEP + rocSHMEM. The default of
-fno-gpu-rdc doesn't work for such cases.
As per pytorch#152432 (comment):
"rocshmem shares the same global variable in different files, as deepEP
uses CUDAExtention to build the project
https://github.com/deepseek-ai/DeepEP/blob/65e2a700f0330f3fb1c26f49a0250d1f9d0ac1e3/setup.py#L51
and depends on rocshmem, this -fgpu-rdc is needed. The current logic in
Pytorch prevents users from overriding this flag."
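
The override behavior described above can be illustrated with a minimal sketch. This is not the PyTorch implementation: the helper names `rocm_default_flags` and `final_flags` are hypothetical, and the substring check on `gpu-rdc` is an assumption about how a user-supplied choice might be detected. The idea is only that the default is appended when the user has not already picked either RDC mode.

```python
from typing import List, Optional

# Default discussed in the commit message; everything else here is illustrative.
DEFAULT_RDC_FLAG = "-fno-gpu-rdc"


def rocm_default_flags(user_flags: Optional[List[str]] = None) -> List[str]:
    """Return the default flags to append (hypothetical helper).

    The RDC default is skipped when the user already passed either
    -fgpu-rdc or -fno-gpu-rdc, so the user's choice is never contradicted.
    """
    flags = user_flags or []
    # "gpu-rdc" is a substring of both -fgpu-rdc and -fno-gpu-rdc.
    user_set_rdc = any("gpu-rdc" in flag for flag in flags)
    return [] if user_set_rdc else [DEFAULT_RDC_FLAG]


def final_flags(user_flags: Optional[List[str]] = None) -> List[str]:
    """Combine user flags with whichever defaults still apply."""
    flags = list(user_flags or [])
    return flags + rocm_default_flags(flags)


if __name__ == "__main__":
    # No RDC preference given: the default is appended.
    print(final_flags(["-O3"]))
    # User asks for relocatable device code: the default is left out.
    print(final_flags(["-O3", "-fgpu-rdc"]))
```

In a project such as the one quoted above, the user-side flag would typically be passed through the extension's `extra_compile_args`; the sketch only models what happens to that list afterwards.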
Pull Request resolved: pytorch#152432
Approved by: https://github.com/jeffdaily
Co-authored-by: Jithun Nair <[email protected]>
Co-authored-by: Jeff Daily <[email protected]>
1 parent 22c98ea commit cd0f7aa
1 file changed, +11 -4 lines changed, across three hunks (new-file lines 2407-2424, 2431-2437, and 2619-2626).