Commit 20a8b3c

msimberg and RMeli authored
Update docs/software/communication/nccl.md
Co-authored-by: Rocco Meli <[email protected]>
1 parent 36262c8 commit 20a8b3c

1 file changed: +4 -4 lines changed

docs/software/communication/nccl.md

Lines changed: 4 additions & 4 deletions
@@ -17,14 +17,14 @@ The environment variables described below must be set to ensure that NCCL uses t
 While the container engine sets these automatically when using the NCCL hook, the following environment variables should always be set for correctness and optimal performance when using NCCL:
 
 ```bash
-export NCCL_NET="AWS Libfabric" # (1)
-export NCCL_NET_GDR_LEVEL=PHB # (2)
-export FI_CXI_DEFAULT_CQ_SIZE=131072 # (3)
+export NCCL_NET="AWS Libfabric" # (1)!
+export NCCL_NET_GDR_LEVEL=PHB # (2)!
+export FI_CXI_DEFAULT_CQ_SIZE=131072 # (3)!
 export FI_CXI_DEFAULT_TX_SIZE=32768
 export FI_CXI_DISABLE_HOST_REGISTER=1
 export FI_CXI_RX_MATCH_MODE=software
 export FI_MR_CACHE_MONITOR=userfaultfd
-export MPICH_GPU_SUPPORT_ENABLED=0 # (4)
+export MPICH_GPU_SUPPORT_ENABLED=0 # (4)!
 ```
 
 1. This forces NCCL to use the libfabric plugin, enabling full use of the Slingshot network. If the plugin can not be found, applications will fail to start. With the default value, applications would instead fall back to e.g. TCP, which would be significantly slower than with the plugin. [More information about `NCCL_NET`](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html#nccl-net).
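
Since the documentation touched by this hunk says these variables should always be set, a job script could check for them before launching an application. The following is only a minimal sketch: the function name `check_nccl_env` is hypothetical and is not part of the documentation or of NCCL, and it assumes a bash shell. It reports each variable from the hunk that is unset or empty and returns a non-zero status if any is missing.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the docs or of NCCL): checks that every
# variable listed in the hunk above is set to a non-empty value.
check_nccl_env() {
    local missing=0
    local var
    for var in NCCL_NET NCCL_NET_GDR_LEVEL \
               FI_CXI_DEFAULT_CQ_SIZE FI_CXI_DEFAULT_TX_SIZE \
               FI_CXI_DISABLE_HOST_REGISTER FI_CXI_RX_MATCH_MODE \
               FI_MR_CACHE_MONITOR MPICH_GPU_SUPPORT_ENABLED; do
        # ${!var:-} is bash indirect expansion: the value of the variable
        # whose name is stored in $var, or an empty string if it is unset.
        if [ -z "${!var:-}" ]; then
            echo "unset or empty: ${var}" >&2
            missing=1
        fi
    done
    return "${missing}"
}
```

Calling `check_nccl_env || exit 1` before the application starts would stop the job early with a list of the missing variables, rather than leaving a missing setting to surface later.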