Commit b192faf

add nccl to README, installation guides, Makefile, and index
1 parent fe1b41e commit b192faf

4 files changed

+36
-5
lines changed

Makefile

Lines changed: 4 additions & 0 deletions
@@ -24,6 +24,10 @@ dev-install:
 	make pipcheck
 	$(PIP) install -r requirements-dev.txt && $(PIP) install -e .
 
+dev-install_nccl:
+	make pipcheck
+	$(PIP) install cupy-cuda12x nvidia-nccl-cu12 && $(PIP) install -r requirements-dev.txt && $(PIP) install -e .
+
 install_conda:
 	conda env create -f environment.yml && conda activate pylops_mpi && pip install .
 

README.md

Lines changed: 2 additions & 0 deletions
@@ -34,6 +34,8 @@ and running the following command:
 ```
 make install_conda
 ```
+Optionally, if you work in a multi-GPU environment and want to enable Nvidia's Collective Communication Library
+[(NCCL)](https://developer.nvidia.com/nccl), please visit the [installation guide](https://pylops.github.io/pylops-mpi/installation.html) for further details.
 
 ## Run Pylops-MPI
 Once you have installed the prerequisites and pylops-mpi, you can run pylops-mpi using the `mpiexec` command.

docs/source/index.rst

Lines changed: 4 additions & 0 deletions
@@ -14,6 +14,10 @@ By integrating MPI (Message Passing Interface), PyLops-MPI optimizes the collabo
 computing nodes, enabling large and intricate tasks to be divided, solved, and aggregated in an efficient and
 parallelized manner.
 
+PyLops-MPI also supports Nvidia's Collective Communication Library `(NCCL) <https://developer.nvidia.com/nccl>`_ for high-performance
+GPU-to-GPU communications. PyLops-MPI's NCCL engine works alongside MPI by delegating GPU-to-GPU communication tasks to
+the highly optimized NCCL, while leveraging MPI for CPU-side coordination and orchestration.
+
 Get started by :ref:`installing PyLops-MPI <Installation>` and following our quick tour.
 
 Terminology

docs/source/installation.rst

Lines changed: 26 additions & 5 deletions
@@ -45,6 +45,14 @@ Fork the `PyLops-MPI repository <https://github.com/PyLops/pylops-mpi>`_ and clo
 We recommend installing dependencies into a separate environment.
 For that end, we provide a `Makefile` with useful commands for setting up the environment.
 
+Enable Nvidia Collective Communication Library
+=======================================================
+To obtain highly optimized performance on GPU clusters, PyLops-MPI also supports Nvidia's Collective Communication Library
+`(NCCL) <https://developer.nvidia.com/nccl>`_. Two additional dependencies are required: CuPy and NCCL.
+
+* `CuPy with NCCL <https://docs.cupy.dev/en/stable/install.html>`_
+
+
 Step-by-step installation for users
 ***********************************
 
@@ -89,6 +97,12 @@ For a ``conda`` environment, run
 
 This will create and activate an environment called ``pylops_mpi``, with all required and optional dependencies.
 
+If you want to enable `NCCL <https://developer.nvidia.com/nccl>`_ in PyLops-MPI, run this instead
+
+.. code-block:: bash
+
+   >> make dev-install_conda_nccl
+
 Pip
 ---
 If you prefer a ``pip`` installation, we provide the following command
@@ -100,15 +114,22 @@ If you prefer a ``pip`` installation, we provide the following command
 Note that, differently from the ``conda`` command, the above **will not** create a virtual environment.
 Make sure you create and activate your environment previously.
 
-Enable Nvidia Collective Communication Library (NCCL)
-=======================================================
-To obtain highly-optimized performance on GPU clusters, PyLops-MPI also supports the Nvidia's collective communication calls (NCCL).
-`NCCL <https://developer.nvidia.com/nccl>
+Similarly, if you want to enable `NCCL <https://developer.nvidia.com/nccl>`_ but prefer using pip,
+you must first check the CUDA version of your system:
+
+.. code-block:: bash
+
+   >> nvidia-smi
+
+The `Makefile` is pre-configured with CUDA 12.x. If you use this version, run
 
 .. code-block:: bash
 
-   >> make dev-install_conda_nc
+   >> make dev-install_nccl
 
+Otherwise, change the command in the `Makefile` to match your CUDA version.
+For example, if you use CUDA 11.x, change ``cupy-cuda12x`` and ``nvidia-nccl-cu12`` to ``cupy-cuda11x`` and ``nvidia-nccl-cu11``
+and run the command.
 
 Run tests
 =========
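
Note on the last hunk: the CUDA-version substitution described for the pip route could be sketched as a small helper. This is only an illustration, not part of the commit. The function name `nccl_pip_packages` is hypothetical, and the sketch assumes the `nvidia-smi` banner contains a `CUDA Version: X.Y` field and that only the CUDA 11.x and 12.x package names mentioned in the diff are relevant.

```python
import re


def nccl_pip_packages(nvidia_smi_output: str) -> list[str]:
    """Return the CuPy and NCCL package names matching the CUDA major version.

    Hypothetical helper: it only covers the two cases named in the
    installation guide (CUDA 11.x and CUDA 12.x).
    """
    # Assumption: the banner printed by `nvidia-smi` has a "CUDA Version: X.Y" field.
    match = re.search(r"CUDA Version:\s*(\d+)\.\d+", nvidia_smi_output)
    if match is None:
        raise ValueError("no 'CUDA Version: X.Y' field found in the given text")
    major = int(match.group(1))
    if major not in (11, 12):
        raise ValueError(f"CUDA major version {major} is not covered by this sketch")
    return [f"cupy-cuda{major}x", f"nvidia-nccl-cu{major}"]


# Example banner line (illustrative text, not captured from a real system).
sample = "| NVIDIA-SMI 535.54.03   Driver Version: 535.54.03   CUDA Version: 12.2 |"
print(nccl_pip_packages(sample))  # ['cupy-cuda12x', 'nvidia-nccl-cu12']
```

The returned names are the ones to substitute into the `dev-install_nccl` recipe. The version reported by `nvidia-smi` reflects what the driver supports, so it is worth checking against the CUDA toolkit actually installed before editing the `Makefile`.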
