
Commit ef8f49a

fix title levels (#12470)
1 parent e618a33 commit ef8f49a

File tree

1 file changed: +2, -2 lines changed


docs/source/advanced/model_parallel.rst

Lines changed: 2 additions & 2 deletions
@@ -719,7 +719,7 @@ DDP Optimizations
 
 
 When Using DDP Strategies, Set find_unused_parameters=False
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
 
 By default, we have set ``find_unused_parameters=True`` for compatibility reasons that have been observed in the past (refer to the `discussion <https://github.com/PyTorchLightning/pytorch-lightning/discussions/6219>`_ for more details).
 When enabled, it can result in a performance hit and can be disabled in most cases. Read more about it `here <https://pytorch.org/docs/stable/notes/ddp.html#internal-design>`_.
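
For reference (not part of this commit), a minimal sketch of how the flag discussed in the section above can be passed explicitly, assuming a PyTorch Lightning version that exposes ``DDPStrategy``:

import pytorch_lightning as pl
from pytorch_lightning.strategies import DDPStrategy

# Disable the per-iteration search for unused parameters when every parameter
# participates in the forward/backward pass; this removes the overhead the
# section above describes. Device counts here are placeholders.
trainer = pl.Trainer(
    accelerator="gpu",
    devices=4,
    strategy=DDPStrategy(find_unused_parameters=False),
)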
@@ -765,7 +765,7 @@ training and apply special optimizations during runtime.
 
 
 When Using DDP on a Multi-node Cluster, Set NCCL Parameters
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
 
 `NCCL <https://developer.nvidia.com/nccl>`__ is the NVIDIA Collective Communications Library that is used by PyTorch to handle communication across nodes and GPUs. There are reported benefits in terms of speedups when adjusting NCCL parameters as seen in this `issue <https://github.com/PyTorchLightning/pytorch-lightning/issues/7179>`__. In the issue, we see a 30% speed improvement when training the Transformer XLM-RoBERTa and a 15% improvement in training with Detectron2.
 
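
For reference (again not part of this commit), a minimal sketch of adjusting NCCL through environment variables before any distributed initialization; the variables and values shown are illustrative assumptions, not settings prescribed by this documentation:

import os

# NCCL reads these at initialization, so set them before the Trainer (and
# therefore before torch.distributed) starts any process-group setup.
# Values are placeholders; tune them for your cluster's network.
os.environ["NCCL_NSOCKS_PERTHREAD"] = "4"
os.environ["NCCL_SOCKET_NTHREADS"] = "2"

import pytorch_lightning as pl

# Hypothetical multi-node launch: 2 nodes with 8 GPUs each, using the DDP strategy.
trainer = pl.Trainer(accelerator="gpu", devices=8, num_nodes=2, strategy="ddp")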
