Commit 72ed4b2
[Distributed] Fix broadcast_module_parameter for CPU-resident models
Use each rank's own GPU device for NCCL broadcast instead of the
module's execution device, which may be CPU or shared across ranks
when the model is not GPU-resident.
Signed-off-by: Itay Etlis <itayetlis@gmail.com>

1 parent 44d65e3 commit 72ed4b2
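The device choice described in the commit message can be illustrated with a small, torch-free sketch. This is not the code from the commit: `RankInfo`, `old_broadcast_device`, `new_broadcast_device`, and the `rank % gpus_per_node` mapping are all hypothetical names and assumptions used only to show why the module's execution device is a poor choice when the model lives on the CPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankInfo:
    """Hypothetical description of one process in a distributed job."""
    rank: int           # global rank of this process
    gpus_per_node: int  # assumption: every node has the same GPU count

    @property
    def local_rank(self) -> int:
        # Assumption: ranks are laid out node by node, one process per GPU.
        return self.rank % self.gpus_per_node


def old_broadcast_device(module_device: str, info: RankInfo) -> str:
    """Pre-fix behaviour as described: reuse the module's execution device."""
    return module_device


def new_broadcast_device(module_device: str, info: RankInfo) -> str:
    """Post-fix behaviour as described: use this rank's own GPU,
    regardless of where the module itself executes."""
    return f"cuda:{info.local_rank}"


# A CPU-resident model seen from four ranks on one 4-GPU node.
ranks = [RankInfo(rank=r, gpus_per_node=4) for r in range(4)]
old = [old_broadcast_device("cpu", r) for r in ranks]
new = [new_broadcast_device("cpu", r) for r in ranks]
```

With a CPU-resident module, the old choice yields `"cpu"` on every rank, which NCCL cannot use because it only operates on CUDA tensors. The new choice gives each rank a distinct GPU. In a real PyTorch program the parameter would presumably be staged on that device before calling `torch.distributed.broadcast` and copied back afterwards, but the exact staging logic of this commit is not shown here.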
1 file changed: +3 -2 lines changed

| Original file lines | New file lines | Change |
|---|---|---|
| 6-12 | 6-12 | line 9 replaced by new line 9 |
| 147-153 | 147-154 | line 150 replaced by new lines 150-151 |