Commit 9975edc
[Distributed] Fix broadcast_module_parameter for CPU-resident models
Use each rank's own GPU device for NCCL broadcast instead of the
module's execution device, which may be CPU or shared across ranks
when the model is not GPU-resident.
Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
1 parent 0f3e1f9
1 file changed, +3 -2 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 6-8 | 6-8 | context (unchanged) |
| 9 | 9 | 1 line removed, 1 line added |
| 10-12 | 10-12 | context (unchanged) |
| 147-149 | 147-149 | context (unchanged) |
| 150 | 150-151 | 1 line removed, 2 lines added |
| 151-153 | 152-154 | context (unchanged) |
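The idea in the commit message can be illustrated with a small, pure-Python sketch. It is not the code from this commit: the function names are hypothetical, devices are modelled as plain strings, and it assumes a single node where rank `i` owns GPU `i`. It also assumes, as a simplification, that a usable NCCL layout needs every rank on a CUDA device and no two ranks on the same one.

```python
from typing import List


def is_valid_nccl_layout(devices: List[str]) -> bool:
    """Check the simplified rule assumed here: every rank is on a CUDA
    device, and no two ranks share the same device."""
    all_cuda = all(d.startswith("cuda:") for d in devices)
    return all_cuda and len(set(devices)) == len(devices)


def execution_device_choice(execution_device: str, world_size: int) -> List[str]:
    """Old behaviour as described in the message: every rank broadcasts
    from the module's execution device, whatever that happens to be."""
    return [execution_device] * world_size


def per_rank_device_choice(world_size: int) -> List[str]:
    """New behaviour as described in the message: each rank uses its own
    GPU (assumed here to be the GPU whose index equals the rank)."""
    return [f"cuda:{rank}" for rank in range(world_size)]


if __name__ == "__main__":
    # CPU-resident model: the execution device is not a CUDA device.
    print(is_valid_nccl_layout(execution_device_choice("cpu", 4)))
    # Execution device shared across ranks: two ranks collide on one GPU.
    print(is_valid_nccl_layout(execution_device_choice("cuda:0", 2)))
    # Each rank on its own GPU.
    print(is_valid_nccl_layout(per_rank_device_choice(4)))
```

Under these assumptions, only the per-rank choice passes the check, which matches the motivation given in the commit message.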