Commit ba12b9b
[integ-tests] Fix NCCL test on Ubuntu 24 on p6-b200
Add NCCL_SOCKET_FAMILY=AF_INET to force NCCL to use IPv4. On Ubuntu 24 with p6-b200, NCCL hangs on IPv6 when this parameter is not set, and IPv6 is not supported by ParallelCluster.
Reference: https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html#nccl-socket-family
1 parent c244455 commit ba12b9b
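As a rough illustration of the setting described in the commit message, the sketch below exports the variable and builds a flag for passing it to remote ranks. Only `NCCL_SOCKET_FAMILY=AF_INET` comes from the commit. The `MPI_ENV_FLAGS` variable, the use of Open MPI's `-x NAME=VALUE` forwarding option, and the `<nccl-test-binary>` placeholder are assumptions for illustration and are not taken from the changed file.

```shell
#!/bin/sh
# Sketch only: the variable name and value come from the commit message.
# Everything else is a hypothetical stand-in for the real test script.

# Restrict NCCL's socket-based communication to IPv4 interfaces.
export NCCL_SOCKET_FAMILY=AF_INET

# Assumption: the job is launched with Open MPI, whose mpirun accepts
# `-x NAME=VALUE` to export an environment variable to all ranks.
MPI_ENV_FLAGS="-x NCCL_SOCKET_FAMILY=${NCCL_SOCKET_FAMILY}"

# Print the launch line instead of running it; the binary name is a placeholder.
echo "mpirun ${MPI_ENV_FLAGS} <nccl-test-binary>"
```

Setting the variable in the launch environment, rather than relying on per-node shell profiles, keeps the IPv4 restriction tied to the test itself.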
File tree
1 file changed, +2 -0 lines changed
- tests/integration-tests/tests/common/data/nccl
Lines changed: 2 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 15 | 15 | |
| 16 | 16 | |
| 17 | 17 | |
| | 18 | + |
| 18 | 19 | |
| 19 | 20 | |
| 20 | 21 | |
| 21 | 22 | |
| 22 | 23 | |
| | 24 | + |
| 23 | 25 | |
| 24 | 26 | |