
Commit 9adee00

Style fixes from resolved comments in PR
1 parent 8d654ea commit 9adee00

5 files changed: +21 -18 lines changed

Makefile

Lines changed: 6 additions & 3 deletions
@@ -2,7 +2,7 @@ PIP := $(shell command -v pip3 2> /dev/null || command which pip 2> /dev/null)
 PYTHON := $(shell command -v python3 2> /dev/null || command which python 2> /dev/null)
 NUM_PROCESSES = 3
 
-.PHONY: install dev-install install_conda dev-install_conda tests tests_nccl doc docupdate run_examples run_tutorials
+.PHONY: install dev-install dev-install_nccl install_conda install_conda_nccl dev-install_conda dev-install_conda_nccl tests tests_nccl doc docupdate run_examples run_tutorials
 
 pipcheck:
 ifndef PIP
@@ -26,16 +26,19 @@ dev-install:
 
 dev-install_nccl:
 	make pipcheck
-	$(PIP) install cupy-cuda12x nvidia-nccl-cu12 && $(PIP) install -r requirements-dev.txt && $(PIP) install -e .
+	$(PIP) install -r requirements-dev.txt && $(PIP) install cupy-cuda12x nvidia-nccl-cu12 && $(PIP) install -e .
 
 install_conda:
 	conda env create -f environment.yml && conda activate pylops_mpi && pip install .
 
+install_conda_nccl:
+	conda env create -f environment.yml && conda activate pylops_mpi && conda install -c conda-forge cupy nccl && pip install .
+
 dev-install_conda:
 	conda env create -f environment-dev.yml && conda activate pylops_mpi && pip install -e .
 
 dev-install_conda_nccl:
-	conda env create -f environment-dev.yml && conda activate pylops_mpi && conda install -c conda-forge cupy cudnn cutensor nccl && pip install -e .
+	conda env create -f environment-dev.yml && conda activate pylops_mpi && conda install -c conda-forge cupy nccl && pip install -e .
 
 lint:
	flake8 pylops_mpi/ tests/ examples/ tutorials/
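For context on what the new `*_nccl` targets provide: after running one of them, both CuPy and its NCCL bindings should be importable. The snippet below is a minimal sanity check, not part of this commit; it only uses standard CuPy version queries, and the example version numbers in the comments are indicative.

```python
# Minimal post-install check (illustrative, not part of this commit):
# verifies that CuPy and its NCCL bindings were installed by an *_nccl target.
import cupy as cp
from cupy.cuda import nccl

print("CUDA runtime:", cp.cuda.runtime.runtimeGetVersion())  # e.g. 12020 for CUDA 12.2
print("NCCL version:", nccl.get_version())                   # e.g. 22304 for NCCL 2.23.4
```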

README.md

Lines changed: 4 additions & 2 deletions
@@ -34,8 +34,10 @@ and running the following command:
 ```
 make install_conda
 ```
-Optionally, if you work with multi-GPU environment and want to have Nvidia's collective communication calls
-[(NCCL)](https://developer.nvidia.com/nccl>) enabled, please visit the [installation guide](https://pylops.github.io/pylops-mpi/installation.html) for further detail
+Optionally, if you work in a multi-GPU environment and want Nvidia's collective communication calls (NCCL) enabled, install your environment with
+```
+make install_conda_nccl
+```
 
 ## Run Pylops-MPI
 Once you have installed the prerequisites and pylops-mpi, you can run pylops-mpi using the `mpiexec` command.
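To illustrate the `mpiexec` workflow mentioned in the last context line of this hunk, here is a hypothetical minimal script (the file name, array size, and reliance on `DistributedArray` default arguments are assumptions, not taken from the README) that could be launched as `mpiexec -n 2 python hello_pylops_mpi.py`.

```python
# hello_pylops_mpi.py -- illustrative sketch, assuming DistributedArray's defaults;
# run with: mpiexec -n 2 python hello_pylops_mpi.py
from mpi4py import MPI
import numpy as np
import pylops_mpi

rank = MPI.COMM_WORLD.Get_rank()

# each rank owns a portion of the global array
x = pylops_mpi.DistributedArray(global_shape=100, dtype=np.float32)
x[:] = np.ones(x.local_shape, dtype=np.float32)
print(f"rank {rank}: local shape {x.local_shape}")
```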

docs/source/gpu.rst

Lines changed: 8 additions & 10 deletions
@@ -30,7 +30,7 @@ proprietary technology like NVLink that might be available in their infrastructure
 
 Set environment variable ``NCCL_PYLOPS_MPI=0`` to explicitly force PyLops-MPI to ignore the ``NCCL`` backend.
 However, this is optional as users may opt-out for NCCL by skip passing `cupy.cuda.nccl.NcclCommunicator` to
-the :class:`pylops_mpi.StackedDistributedArray`
+the :class:`pylops_mpi.DistributedArray`
 
 Example
 -------
@@ -88,7 +88,7 @@ your GPU:
 The code is almost unchanged apart from the fact that we now use ``cupy`` arrays,
 PyLops-mpi will figure this out!
 
-If NCCL is available, ``cupy.cuda.nccl.NcclCommunicator`` can be initialized and passed to :class:`pylops_mpi.DistributedArray`
+Finally, if NCCL is available, a ``cupy.cuda.nccl.NcclCommunicator`` can be initialized and passed to :class:`pylops_mpi.DistributedArray`
 as follows:
 
 .. code-block:: python
@@ -102,9 +102,9 @@ as follows:
     nxl, nt = 20, 20
     dtype = np.float32
     d_dist = pylops_mpi.DistributedArray(global_shape=nxl * nt,
-                        base_comm_nccl=nccl_comm,
-                        partition=pylops_mpi.Partition.BROADCAST,
-                        engine="cupy", dtype=dtype)
+                                         base_comm_nccl=nccl_comm,
+                                         partition=pylops_mpi.Partition.BROADCAST,
+                                         engine="cupy", dtype=dtype)
     d_dist[:] = cp.ones(d_dist.local_shape, dtype=dtype)
 
     # Create and apply VStack operator
@@ -113,18 +113,16 @@ as follows:
     y_dist = HOp @ d_dist
 
 Under the hood, PyLops-MPI use both MPI Communicator and NCCL Communicator to manage distributed operations. Each GPU is logically binded to
-one MPI process. Generally speaking, the small operation like array-related shape and size remain using MPI while the collective calls
-like AllReduce will be carried through NCCL.
+one MPI process. In fact, minor communications like those dealing with array-related shapes and sizes are still performed using MPI, while collective calls on arrays like AllReduce are carried through NCCL.
 
 .. note::
 
    The CuPy and NCCL backend is in active development, with many examples not yet in the docs.
    You can find many `other examples <https://github.com/PyLops/pylops_notebooks/tree/master/developement-mpi/Cupy_MPI>`_ from the `PyLops Notebooks repository <https://github.com/PyLops/pylops_notebooks>`_.
 
 Supports for NCCL Backend
--------------------
-In the following, we provide a list of modules in which operates on :class:`pylops_mpi.DistributedArray`
-that can leverage NCCL backend
+----------------------------
+In the following, we provide a list of modules (i.e., operators and solvers) where we plan to support NCCL and the current status:
 
 .. list-table::
    :widths: 50 25
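The gpu.rst snippet in the hunks above assumes an ``nccl_comm`` already exists. Below is a sketch of how one might build it by hand with ``cupy.cuda.nccl`` and ``mpi4py``; the one-GPU-per-rank device selection and the array size are assumptions, and PyLops-MPI may also provide its own helper for this step.

```python
# Hand-rolled NCCL communicator setup (illustrative; assumes one GPU per MPI rank)
import numpy as np
import cupy as cp
from cupy.cuda import nccl
from mpi4py import MPI

import pylops_mpi

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# bind this MPI process to a GPU
cp.cuda.Device(rank % cp.cuda.runtime.getDeviceCount()).use()

# rank 0 creates the NCCL unique id and shares it with the other ranks over MPI
nccl_id = nccl.get_unique_id() if rank == 0 else None
nccl_id = comm.bcast(nccl_id, root=0)
nccl_comm = nccl.NcclCommunicator(size, nccl_id, rank)

# the communicator is then passed to DistributedArray exactly as in the docs above
dtype = np.float32
d_dist = pylops_mpi.DistributedArray(global_shape=400,
                                     base_comm_nccl=nccl_comm,
                                     partition=pylops_mpi.Partition.BROADCAST,
                                     engine="cupy", dtype=dtype)
d_dist[:] = cp.ones(d_dist.local_shape, dtype=dtype)
```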

docs/source/index.rst

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ computing nodes, enabling large and intricate tasks to be divided, solved, and a
 parallelized manner.
 
 PyLops-MPI also supports the Nvidia's Collective Communication Library `(NCCL) <https://developer.nvidia.com/nccl>`_ for high-performance
-GPU-to-GPU communications.This PyLops-MPI's NCCL engine works congruently with MPI by delegating the GPU-to-GPU communication tasks to
+GPU-to-GPU communications. The PyLops-MPI's NCCL engine works congruently with MPI by delegating the GPU-to-GPU communication tasks to
 highly-optimized NCCL, while leveraging MPI for CPU-side coordination and orchestration.
 
 Get started by :ref:`installing PyLops-MPI <Installation>` and following our quick tour.

docs/source/installation.rst

Lines changed: 2 additions & 2 deletions
@@ -115,7 +115,7 @@ Note that, differently from the ``conda`` command, the above **will not** creat
 Make sure you create and activate your environment previously.
 
 Simlarly, if you want to enable `NCCL <https://developer.nvidia.com/nccl>`_ but prefer using pip,
-you must first check CUDA version of your system:
+you must first check the CUDA version of your system:
 
 .. code-block:: bash
 
@@ -127,7 +127,7 @@ The `Makefile` is pre-configured with CUDA 12.x. If you use this version, run
 
    >> make dev-install_nccl
 
-Otherwise, you can change the command in `Makefile` to appropriate CUDA version
+Otherwise, you can change the command in `Makefile` to an appropriate CUDA version
 i.e., If you use CUDA 11.x, change ``cupy-cuda12x`` and ``nvidia-nccl-cu12`` to ``cupy-cuda11x`` and ``nvidia-nccl-cu11``
 and run the command.
 
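As a complement to the CUDA-version check described in this hunk, the following hypothetical helper (not part of PyLops-MPI or its Makefile) parses the ``CUDA Version`` field reported by ``nvidia-smi`` and prints which wheel pair to install.

```python
# Hypothetical helper (illustrative only): suggest cupy/nccl wheels from nvidia-smi output
import re
import subprocess

out = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
match = re.search(r"CUDA Version:\s*(\d+)\.", out)
cuda_major = int(match.group(1)) if match else None

if cuda_major == 11:
    print("pip install cupy-cuda11x nvidia-nccl-cu11")
else:
    print("pip install cupy-cuda12x nvidia-nccl-cu12")  # Makefile default (CUDA 12.x)
```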
