Commit 02ba45b

doc: added details about cuda-aware mpi in doc
1 parent ec88371 commit 02ba45b

2 files changed: +47 -7 lines changed

docs/source/gpu.rst

Lines changed: 9 additions & 1 deletion
@@ -11,7 +11,7 @@ This library must be installed *before* PyLops-mpi is installed.
1111

1212
.. note::
1313

14-
Set environment variable ``CUPY_PYLOPS=0`` to force PyLops to ignore the ``cupy`` backend.
14+
Set the environment variable ``CUPY_PYLOPS=0`` to force PyLops to ignore the ``cupy`` backend.
1515
This can be also used if a previous (or faulty) version of ``cupy`` is installed in your system,
1616
otherwise you will get an error when importing PyLops.
1717

@@ -22,6 +22,14 @@ can handle both scenarios. Note that, since most operators in PyLops-mpi are thi
2222
some of the operators in PyLops that lack a GPU implementation cannot be used also in PyLops-mpi when working with
2323
cupy arrays.
2424

25+
.. note::
26+
27+
By default when using ``cupy`` arrays, PyLops-MPI will try to use methods in MPI4Py that communicate memory buffers.
28+
However, this requires a CUDA-Aware MPI installation. If your MPI installation is not CUDA-Aware, set the
29+
environment variable ``PYLOPS_MPI_CUDA_AWARE=0`` to force PyLops-MPI to use methods in MPI4Py that communicate
30+
general Python objects (this will incur a loss of performance!).
31+
32+
2533
Moreover, PyLops-MPI also supports the Nvidia's Collective Communication Library (NCCL) for highly-optimized
2634
collective operations, such as AllReduce, AllGather, etc. This allows PyLops-MPI users to leverage the
2735
proprietary technology like NVLink that might be available in their infrastructure for fast data communication.
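
The two environment variables named in the notes of this file (``CUPY_PYLOPS`` and ``PYLOPS_MPI_CUDA_AWARE``) can also be set from Python instead of the shell. The following is a minimal sketch, assuming both variables are read when the packages are first imported, so they have to be set beforehand. The helper name ``disable_gpu_features`` is hypothetical and not part of PyLops or PyLops-MPI.

```python
import os


def disable_gpu_features(ignore_cupy: bool = False,
                         no_cuda_aware_mpi: bool = False) -> dict:
    """Set the opt-out environment variables and return what was set.

    Hypothetical helper (not part of PyLops or PyLops-MPI). It assumes the
    variables are read at import time, so it should be called before
    importing ``pylops`` or ``pylops_mpi``.
    """
    settings = {}
    if ignore_cupy:
        # Force PyLops to ignore the cupy backend
        settings["CUPY_PYLOPS"] = "0"
    if no_cuda_aware_mpi:
        # Force PyLops-MPI to communicate general Python objects
        settings["PYLOPS_MPI_CUDA_AWARE"] = "0"
    os.environ.update(settings)
    return settings


# Example: cupy works, but the MPI installation is not CUDA-aware
settings = disable_gpu_features(no_cuda_aware_mpi=True)
```

Only the opt-out value ``0`` is written here because that is the only value the documentation describes; leaving a variable unset keeps the default behaviour.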

docs/source/installation.rst

Lines changed: 38 additions & 6 deletions
@@ -15,7 +15,13 @@ The minimal set of dependencies for the PyLops-MPI project is:
1515
* `MPI4py <https://mpi4py.readthedocs.io/en/stable/>`_
1616
* `PyLops <https://pylops.readthedocs.io/en/stable/>`_
1717

18-
Additionally, to use the NCCL engine, the following additional
18+
Additionally, to use the CUDA-aware MPI engine, the following additional
19+
dependencies are required:
20+
21+
* `CuPy <https://cupy.dev/>`_
22+
* CUDA-aware MPI
23+
24+
Similarly, to use the NCCL engine, the following additional
1925
dependencies are required:
2026

2127
* `CuPy <https://cupy.dev/>`_
@@ -27,12 +33,18 @@ if this is not possible, some of the dependencies must be installed prior to ins
2733

2834
Download and Install MPI
2935
========================
30-
Visit the official MPI website to download an appropriate MPI implementation for your system.
31-
Follow the installation instructions provided by the MPI vendor.
36+
Visit the official website of your MPI vendor of choice to download an appropriate MPI
37+
implementation for your system:
38+
39+
* `Open MPI <https://docs.open-mpi.org/>`_
40+
* `MPICH <https://www.mpich.org/>`_
41+
* `Intel MPI <https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html>`_
42+
* ...
3243

33-
* `Open MPI <https://www.open-mpi.org/software/ompi/v1.10/>`_
34-
* `MPICH <https://www.mpich.org/downloads/>`_
35-
* `Intel MPI <https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html#gs.10j8fx>`_
44+
Alternatively, the conda-forge community provides ready-to-use binary packages for four MPI implementations
45+
(see `MPI4Py documentation <https://mpi4py.readthedocs.io/en/stable/install.html#conda-packages>`_ for more
46+
details). In this case, you can defer the installation to the stage when the conda environment for your project
47+
is created - see below for more details.
3648

3749
Verify MPI Installation
3850
=======================
@@ -42,6 +54,17 @@ After installing MPI, verify its installation by opening a terminal and running
4254
4355
>> mpiexec --version
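
The same check can be scripted, for example in a setup or CI step. This is a rough sketch under the assumption that the launcher is called ``mpiexec`` and is on the ``PATH``; the function name ``mpiexec_version`` is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def mpiexec_version(executable: str = "mpiexec") -> Optional[str]:
    """Return the output of ``<executable> --version``, or None if not found."""
    path = shutil.which(executable)
    if path is None:
        # No MPI launcher with this name on the PATH
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some launchers print the version on stderr rather than stdout
    text = (result.stdout or result.stderr).strip()
    return text or None


if __name__ == "__main__":
    print(mpiexec_version() or "mpiexec not found")
```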
4456
57+
Install CUDA-Aware MPI (optional)
58+
=================================
59+
To be able to achieve the best performance when using PyLops-MPI with CuPy arrays, a CUDA-Aware version of
60+
MPI must be installed.
61+
62+
For `Open MPI`, the conda-forge package has built-in CUDA support, as long as a pre-installed CUDA is detected.
63+
Run the following `commands <https://docs.open-mpi.org/en/v5.0.x/tuning-apps/networking/cuda.html#how-do-i-verify-that-open-mpi-has-been-built-with-cuda-support>`_
64+
for diagnostics.
65+
66+
For the other MPI implementations, refer to their specific documentation.
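
For Open MPI, the diagnostic described in the linked documentation inspects the ``mpi_built_with_cuda_support`` entry reported by ``ompi_info --parsable --all``. A minimal sketch of that check from Python follows; the function names are hypothetical, and the parsing assumes the entry ends in ``:value:true`` or ``:value:false`` as shown in the Open MPI documentation.

```python
import shutil
import subprocess
from typing import Optional


def parse_cuda_support(ompi_info_output: str) -> Optional[bool]:
    """Extract the CUDA-support flag from ``ompi_info --parsable --all`` output.

    Returns None when the flag is not present in the output.
    """
    for line in ompi_info_output.splitlines():
        if "mpi_built_with_cuda_support:value:" in line:
            # The flag is the last colon-separated field of the line
            return line.strip().rsplit(":", 1)[-1].lower() == "true"
    return None


def openmpi_has_cuda_support() -> Optional[bool]:
    """Run ``ompi_info`` if available; None means the answer is unknown."""
    ompi_info = shutil.which("ompi_info")
    if ompi_info is None:
        return None
    result = subprocess.run(
        [ompi_info, "--parsable", "--all"],
        capture_output=True, text=True, check=False,
    )
    return parse_cuda_support(result.stdout)


if __name__ == "__main__":
    print("CUDA-aware Open MPI:", openmpi_has_cuda_support())
```

A result of ``False`` or ``None`` suggests setting ``PYLOPS_MPI_CUDA_AWARE=0`` as described in ``docs/source/gpu.rst``.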
67+
4568
Install NCCL (optional)
4669
=======================
4770
To obtain highly-optimized performance on GPU clusters, PyLops-MPI also supports the Nvidia's collective communication calls
@@ -103,6 +126,15 @@ For a ``conda`` environment, run
103126
This will create and activate an environment called ``pylops_mpi``, with all
104127
required and optional dependencies.
105128

129+
If you want to also install MPI as part of the creation process of the conda environment,
130+
modify the ``environment-dev.yml`` file by adding one of ``openmpi``, ``mpich``, ``impi_rt``, or ``msmpi``
131+
just above ``mpi4py``. Note that only ``openmpi`` provides a CUDA-Aware MPI installation.
132+
133+
If you want to leverage CUDA-Aware MPI but prefer to use another MPI installation, you must
134+
either switch to a `Pip`-based installation (see below), or move ``mpi4py`` into the ``pip``
135+
section of the ``environment-dev.yml`` file and export the variable ``MPICC`` pointing to
136+
the path of your CUDA-Aware MPI installation.
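
The ``MPICC`` route can be sketched as follows. The helper ``mpi4py_pip_install`` and the compiler path in the example are hypothetical. The ``--no-binary=mpi4py`` and ``--no-cache-dir`` flags are an assumption on top of the text above: they ask ``pip`` to build ``mpi4py`` from source against the chosen MPI rather than reuse a prebuilt or cached wheel.

```python
import os
import sys


def mpi4py_pip_install(mpicc_path: str):
    """Build the pip command and environment for a source build of mpi4py.

    Hypothetical helper: it only prepares the command, it does not run it.
    """
    # Copy the current environment and point MPICC at the desired compiler wrapper
    env = dict(os.environ, MPICC=mpicc_path)
    cmd = [
        sys.executable, "-m", "pip", "install",
        "--no-binary=mpi4py",  # assumption: force a build from source
        "--no-cache-dir",      # assumption: avoid reusing a cached wheel
        "mpi4py",
    ]
    return cmd, env


# Example with a hypothetical path to a CUDA-aware MPI compiler wrapper
cmd, env = mpi4py_pip_install("/opt/openmpi-cuda/bin/mpicc")
```

The returned pair can be passed to ``subprocess.run(cmd, env=env, check=True)`` once the path has been adjusted to the actual installation.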
137+
106138
If you want to enable `NCCL <https://developer.nvidia.com/nccl>`_ in PyLops-MPI, run this instead
107139

108140
.. code-block:: bash
