@@ -38,15 +38,32 @@ Open MPI offers two flavors of CUDA support:
   shell$ ./configure --prefix=/path/to/ucx-cuda-install --with-cuda=/usr/local/cuda --with-gdrcopy=/usr

   # Configure Open MPI this way
-   shell$ ./configure --with-cuda=/usr/local/cuda --with-ucx=/path/to/ucx-cuda-install <other configure params>
+   shell$ ./configure --with-cuda=/usr/local/cuda --with-cuda-libdir=/usr/local/cuda/lib64/stubs/ --with-ucx=/path/to/ucx-cuda-install <other configure params>

#. Via internal Open MPI CUDA support

Regardless of which flavor of CUDA support (or both) you plan to use,
Open MPI should be configured using the ``--with-cuda=<path-to-cuda>``
-configure option to build CUDA support into Open MPI.
+and ``--with-cuda-libdir=<path-to-libcuda.so>`` configure options to
+build CUDA support into Open MPI.

-This affects the smcuda shared memory btl, as well as the uct btl.
+Open MPI supports building with CUDA libraries and running on systems
+without CUDA libraries or hardware. To take advantage of this
+functionality, the CUDA-dependent components must be built as DSOs,
+which is done by passing the
+``--enable-mca-dso=<comma-delimited-list-of-cuda-components>``
+configure option.
+
+This affects the ``smcuda`` shared memory and ``uct`` BTLs, as well
+as the ``rgpusm`` and ``gpusm`` rcache components.
+
+An example configure command would look like the following:
+
+.. code-block:: sh
+
+   # Configure Open MPI this way
+   shell$ ./configure --with-cuda=/usr/local/cuda --with-cuda-libdir=/usr/local/cuda/lib64/stubs \
+       --enable-mca-dso=btl-smcuda,rcache-rgpusm,rcache-gpusm,accelerator-cuda <other configure params>
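
After installing such a build, one way to double-check that the CUDA-dependent
components were produced is to list them with ``ompi_info`` (a quick sanity
sketch; component names and output format can vary between Open MPI versions):

.. code-block:: sh

   # Look for the CUDA-dependent components in the component listing
   shell$ ompi_info | grep -E "smcuda|rgpusm|gpusm|accelerator"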

/////////////////////////////////////////////////////////////////////////

@@ -124,6 +141,7 @@ CUDA-aware support is available in:

* The UCX (``ucx``) PML
* The PSM2 (``psm2``) MTL with the CM (``cm``) PML.
+* The OFI (``ofi``) MTL with the CM (``cm``) PML.
* Both CUDA-ized shared memory (``smcuda``) and TCP (``tcp``) BTLs
  with the OB1 (``ob1``) PML.
* The HCOLL (``hcoll``) COLL
@@ -152,6 +170,22 @@ For more information refer to the `Intel Omni-Path documentation

/////////////////////////////////////////////////////////////////////////

+OFI support for CUDA
+---------------------
+
+CUDA-aware support is present in the OFI MTL. When running CUDA-aware
+Open MPI over Libfabric, the OFI MTL will check if there are any
+providers capable of handling GPU (or other accelerator) memory
+through the ``hmem``-related flags. If a CUDA-capable provider is
+available, the OFI MTL will directly send GPU buffers through
+Libfabric's API after registering the memory. If there are no
+CUDA-capable providers available, the buffers will automatically
+be copied to host buffers before being transferred through
+Libfabric's API.
+
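As an illustration of selecting this path at run time, the sketch below forces
the CM PML and the OFI MTL on the command line; it assumes a Libfabric
provider with HMEM (GPU memory) support is installed, and ``./my_cuda_app``
stands in for any CUDA-aware MPI application:

.. code-block:: sh

   # Pin the run to the CM PML and OFI MTL; Libfabric picks the provider.
   shell$ mpirun --mca pml cm --mca mtl ofi -np 2 ./my_cuda_app

If no HMEM-capable provider is found, the same command still runs; as
described above, the OFI MTL then stages GPU buffers through host memory.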
+/////////////////////////////////////////////////////////////////////////
+
+
How can I tell if Open MPI was built with CUDA support?
-------------------------------------------------------
