A value of 4 is generally a good compromise to improve point-to-point performance without affecting collectives performance.
Setting it to a higher value such as 16 or 32 can further improve send/recv performance, but may degrade collectives performance, so the optimal value depends on the mix of operations an application uses.
- The option is undocumented, but [this issue](https://github.com/NVIDIA/nccl/issues/1272) contains additional details.
+ The option is undocumented, but [this issue](https://github.com/NVIDIA/nccl/issues/1272) and the paper linked above contain additional details.
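As a minimal sketch of how such a value might be applied, the helper below builds a process environment with the option set. The variable name `NCCL_NCHANNELS_PER_NET_PEER` is an assumption (the excerpt above does not name the option), and `nccl_env` is a hypothetical helper, not part of NCCL or any library.

```python
import os
from typing import Dict, Optional

# Assumption: the option discussed above is this NCCL environment variable.
# Substitute the actual name if it differs.
OPTION_NAME = "NCCL_NCHANNELS_PER_NET_PEER"


def nccl_env(channels: int = 4, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the option set to `channels`.

    `base` defaults to the current process environment; it is never mutated.
    """
    if channels < 1:
        raise ValueError("channels must be a positive integer")
    env = dict(os.environ if base is None else base)
    env[OPTION_NAME] = str(channels)
    return env


# The compromise value suggested in the text.
balanced = nccl_env(4, base={})

# A send/recv-heavy application might try a higher value,
# accepting that collectives performance may degrade.
p2p_heavy = nccl_env(32, base={})
```

The resulting dictionary can be passed as `env=` to `subprocess.run` when launching a job. Since the best value depends on the application's mix of operations, it is worth benchmarking a few settings rather than relying on a single default.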
!!! warning "NCCL watchdog timeout or hanging process"
    In some cases, still under investigation, NCCL may hang, resulting in a stuck process or a watchdog timeout error.