@@ -641,11 +641,11 @@ OpenSHMEM Collectives
 Network Support
 ---------------

-- There are four main MPI network models available: "ob1", "cm",
-  "yalla", and "ucx". "ob1" uses BTL ("Byte Transfer Layer")
+- There are several main MPI network models available: "ob1", "cm",
+  "ucx", and "yalla". "ob1" uses BTL ("Byte Transfer Layer")
   components for each supported network. "cm" uses MTL ("Matching
-  Transport Layer") components for each supported network. "yalla"
-  uses the Mellanox MXM transport. "ucx" uses the OpenUCX transport.
+  Transport Layer") components for each supported network. "ucx" uses
+  the OpenUCX transport.

 - "ob1" supports a variety of networks that can be used in
   combination with each other:
@@ -668,42 +668,93 @@ Network Support
   - OpenFabrics Interfaces ("libfabric" tag matching)
   - Portals 4

-  Open MPI will, by default, choose to use "cm" when one of the
-  above transports can be used, unless OpenUCX or MXM support is
-  detected, in which case the "ucx" or "yalla" PML will be used
-  by default. Otherwise, "ob1" will be used and the corresponding
-  BTLs will be selected. Users can force the use of ob1 or cm if
-  desired by setting the "pml" MCA parameter at run-time:
+  - UCX is the Unified Communication X (UCX) communication library
+    (http://www.openucx.org/). This is an open-source project
+    developed in collaboration between industry, laboratories, and
+    academia to create an open-source production grade communication
+    framework for data centric and high-performance applications. The
+    UCX library can be downloaded from repositories (e.g.,
+    Fedora/RedHat yum repositories). The UCX library is also part of
+    Mellanox OFED and Mellanox HPC-X binary distributions.

-    shell$ mpirun --mca pml ob1 ...
+    UCX currently supports:
+
+    - OpenFabrics Verbs (including InfiniBand and RoCE)
+    - Cray's uGNI
+    - TCP
+    - Shared memory
+    - NVIDIA CUDA drivers
+
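+    To see which of these transports UCX itself detects on a given
+    node, UCX's ucx_info utility can help (note that ucx_info ships
+    with UCX, not with Open MPI):
+
+      shell$ ucx_info -d
+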
+  While users can manually select any of the above transports at run
+  time, Open MPI will select a default transport as follows:
+
+  1. If InfiniBand devices are available, use the UCX PML.
+
+  2. If PSM, PSM2, or other tag-matching-supporting Libfabric
+     transport devices are available (e.g., Cray uGNI), use the "cm"
+     PML and a single appropriate corresponding "mtl" module.
+
+  3. If MXM/InfiniBand devices are available, use the "yalla" PML
+     (NOTE: the "yalla"/MXM PML is deprecated -- see below).
+
+  4. Otherwise, use the ob1 PML and one or more appropriate "btl"
+     modules.
+
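+  One way to observe which PML Open MPI actually selected at run time
+  is to increase the PML framework's verbosity via its standard MCA
+  verbosity parameter:
+
+    shell$ mpirun --mca pml_base_verbose 10 ...
+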
+  Users can override Open MPI's default selection algorithms and force
+  the use of a specific transport if desired by setting the "pml" MCA
+  parameter (and potentially the "btl" and/or "mtl" MCA parameters) at
+  run-time:
+
+    shell$ mpirun --mca pml ob1 --mca btl [comma-delimited-BTLs] ...
+  or
+    shell$ mpirun --mca pml cm --mca mtl [MTL] ...
   or
-    shell$ mpirun --mca pml cm ...
-
-- Similarly, there are two OpenSHMEM network models available: "ucx",
-  and "ikrit":
-  - "ucx" interfaces directly with UCX;
-  - "ikrit" interfaces directly with Mellanox MXM.
-
-- UCX is the Unified Communication X (UCX) communication library
-  (http://www.openucx.org/).
-  This is an open-source project developed in collaboration between
-  industry, laboratories, and academia to create an open-source
-  production grade communication framework for data centric and
-  high-performance applications.
-  UCX currently supports:
-  - OFA Verbs;
-  - Cray's uGNI;
-  - NVIDIA CUDA drivers.
-
-- MXM is the Mellanox Messaging Accelerator library utilizing a full
-  range of IB transports to provide the following messaging services
-  to the upper level MPI/OpenSHMEM libraries:
-
-  - Usage of all available IB transports
-  - Native RDMA support
-  - Progress thread
-  - Shared memory communication
-  - Hardware-assisted reliability
+    shell$ mpirun --mca pml ucx ...
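+
+  For example, to force the ob1 PML with the TCP and shared memory
+  BTLs (just one possible combination; the set of BTLs available on a
+  given system may differ):
+
+    shell$ mpirun --mca pml ob1 --mca btl tcp,vader,self ...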
+
+  As alluded to above, there is actually a fourth MPI point-to-point
+  transport, but it is deprecated and will likely be removed in a
+  future Open MPI release:
+
+  - "yalla" uses the Mellanox MXM transport library. MXM is the
+    deprecated Mellanox Messaging Accelerator library, utilizing a
+    full range of IB transports to provide messaging services to the
+    upper level MPI/OpenSHMEM libraries. MXM is only included in this
+    release of Open MPI for backwards compatibility; the "ucx" PML
+    should be used instead.
+
+- The main OpenSHMEM network model is "ucx"; it interfaces directly
+  with UCX.
+
+  The "ikrit" OpenSHMEM network model is also available, but is
+  deprecated, as it uses the deprecated Mellanox Messaging Accelerator
+  (MXM) library.
+
+- In prior versions of Open MPI, InfiniBand and RoCE support was
+  provided through the openib BTL and ob1 PML plugins. Starting with
+  Open MPI 4.0.0, InfiniBand support through the openib+ob1 plugins is
+  both deprecated and superseded by the ucx PML component.
+
+  While the openib BTL depended on libibverbs, the UCX PML depends on
+  the UCX library.
+
+  Once installed, Open MPI can be built with UCX support by adding
+  --with-ucx to the Open MPI configure command. Once Open MPI is
+  configured to use UCX, the runtime will automatically select the UCX
+  PML if one of the supported networks is detected (e.g., InfiniBand).
+  It's possible to force using UCX in the mpirun or oshrun command
+  lines by specifying any or all of the following MCA parameters:
+  "--mca pml ucx" for MPI point-to-point operations, "--mca spml ucx"
+  for OpenSHMEM support, and "--mca osc ucx" for MPI RMA (one-sided)
+  operations.
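+
+  For example (a minimal sketch; a real configure invocation will
+  typically include other options, and UCX must already be installed):
+
+    shell$ ./configure --with-ucx ...
+    shell$ make all install
+    shell$ mpirun --mca pml ucx --mca osc ucx ...
+    shell$ oshrun --mca spml ucx ...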
+
+- Although the ob1 PML+openib BTL is still the default for iWARP and
+  RoCE devices, it will reject InfiniBand devices (by default) so
+  that they will use the ucx PML. If using the openib BTL is still
+  desired, set the following MCA parameters:
+
+    # Note that "vader" is Open MPI's shared memory BTL
+    $ mpirun --mca pml ob1 --mca btl openib,vader,self \
+        --mca btl_openib_allow_ib 1 ...

 - The usnic BTL is support for Cisco's usNIC device ("userspace NIC")
   on Cisco UCS servers with the Virtualized Interface Card (VIC).
@@ -756,32 +807,6 @@ Network Support
   mechanisms for Open MPI to utilize single-copy semantics for shared
   memory.

-- In prior versions of Open MPI, InfiniBand and RoCE support was
-  provided through the openib BTL and ob1 PML plugins. Starting with
-  Open MPI 4.0.0, InfiniBand support through the openib+ob1 plugins is
-  both deprecated and superseded by the UCX PML component.
-
-  UCX is an open-source optimized communication library which supports
-  multiple networks, including RoCE, InfiniBand, uGNI, TCP, shared
-  memory, and others.
-
-  While the openib BTL depended on libibverbs, the UCX PML depends on
-  the UCX library. The UCX library can be downloaded from
-  http://www.openucx.org/ or from various Linux distribution
-  repositories (e.g., Fedora/RedHat yum repositories). The UCX
-  library is also part of Mellanox OFED and Mellanox HPC-X binary
-  distributions.
-
-  Once installed, Open MPI can be built with UCX support by adding
-  --with-ucx to the Open MPI configure command. Once Open MPI is
-  configured to use UCX, the runtime will automatically select the UCX
-  PML if one of the supported networks is detected (e.g., InfiniBand).
-  It's possible to force using UCX in the mpirun or oshrun command
-  lines by specifying any or all of the following mca parameters:
-  "-mca pml ucx" for MPI point-to-point operations, "-mca spml ucx"
-  for OpenSHMEM support, and "-mca osc ucx" for MPI RMA (one-sided)
-  operations.
-
 Open MPI Extensions
 -------------------
