
Commit 9adee00

Style fixes from resolved comments in PR
1 parent 8d654ea commit 9adee00

5 files changed: +21 -18 lines changed

Makefile

Lines changed: 6 additions & 3 deletions
@@ -2,7 +2,7 @@ PIP := $(shell command -v pip3 2> /dev/null || command which pip 2> /dev/null)
 PYTHON := $(shell command -v python3 2> /dev/null || command which python 2> /dev/null)
 NUM_PROCESSES = 3
 
-.PHONY: install dev-install install_conda dev-install_conda tests tests_nccl doc docupdate run_examples run_tutorials
+.PHONY: install dev-install dev-install_nccl install_conda install_conda_nccl dev-install_conda dev-install_conda_nccl tests tests_nccl doc docupdate run_examples run_tutorials
 
 pipcheck:
 ifndef PIP
@@ -26,16 +26,19 @@ dev-install:
 
 dev-install_nccl:
 	make pipcheck
-	$(PIP) install cupy-cuda12x nvidia-nccl-cu12 && $(PIP) install -r requirements-dev.txt && $(PIP) install -e .
+	$(PIP) install -r requirements-dev.txt && $(PIP) install cupy-cuda12x nvidia-nccl-cu12 && $(PIP) install -e .
 
 install_conda:
 	conda env create -f environment.yml && conda activate pylops_mpi && pip install .
 
+install_conda_nccl:
+	conda env create -f environment.yml && conda activate pylops_mpi && conda install -c conda-forge cupy nccl && pip install .
+
 dev-install_conda:
 	conda env create -f environment-dev.yml && conda activate pylops_mpi && pip install -e .
 
 dev-install_conda_nccl:
-	conda env create -f environment-dev.yml && conda activate pylops_mpi && conda install -c conda-forge cupy cudnn cutensor nccl && pip install -e .
+	conda env create -f environment-dev.yml && conda activate pylops_mpi && conda install -c conda-forge cupy nccl && pip install -e .
 
 lint:
	flake8 pylops_mpi/ tests/ examples/ tutorials/
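For context on what the new `*_nccl` targets provide: after running one of them, both CuPy and its NCCL bindings should be importable. The snippet below is a minimal sanity check, not part of this commit; it only uses standard CuPy version queries, and the example version numbers in the comments are indicative.

```python
# Minimal post-install check (illustrative, not part of this commit):
# verifies that CuPy and its NCCL bindings were installed by an *_nccl target.
import cupy as cp
from cupy.cuda import nccl

print("CUDA runtime:", cp.cuda.runtime.runtimeGetVersion())  # e.g. 12020 for CUDA 12.2
print("NCCL version:", nccl.get_version())                   # e.g. 22304 for NCCL 2.23.4
```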

README.md

Lines changed: 4 additions & 2 deletions
@@ -34,8 +34,10 @@ and running the following command:
 ```
 make install_conda
 ```
-Optionally, if you work with multi-GPU environment and want to have Nvidia's collective communication calls
-[(NCCL)](https://developer.nvidia.com/nccl>) enabled, please visit the [installation guide](https://pylops.github.io/pylops-mpi/installation.html) for further detail
+Optionally, if you work in a multi-GPU environment and want Nvidia's collective communication calls (NCCL) enabled, install your environment with
+```
+make install_conda_nccl
+```
 
 ## Run Pylops-MPI
 Once you have installed the prerequisites and pylops-mpi, you can run pylops-mpi using the `mpiexec` command.
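To illustrate the `mpiexec` workflow mentioned in the last context line of this hunk, here is a hypothetical minimal script (the file name, array size, and reliance on `DistributedArray` default arguments are assumptions, not taken from the README) that could be launched as `mpiexec -n 2 python hello_pylops_mpi.py`.

```python
# hello_pylops_mpi.py -- illustrative sketch, assuming DistributedArray's defaults;
# run with: mpiexec -n 2 python hello_pylops_mpi.py
from mpi4py import MPI
import numpy as np
import pylops_mpi

rank = MPI.COMM_WORLD.Get_rank()

# each rank owns a portion of the global array
x = pylops_mpi.DistributedArray(global_shape=100, dtype=np.float32)
x[:] = np.ones(x.local_shape, dtype=np.float32)
print(f"rank {rank}: local shape {x.local_shape}")
```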

docs/source/gpu.rst

Lines changed: 8 additions & 10 deletions
@@ -30,7 +30,7 @@ proprietary technology like NVLink that might be available in their infrastructure
 
 Set environment variable ``NCCL_PYLOPS_MPI=0`` to explicitly force PyLops-MPI to ignore the ``NCCL`` backend.
 However, this is optional as users may opt-out for NCCL by skip passing `cupy.cuda.nccl.NcclCommunicator` to
-the :class:`pylops_mpi.StackedDistributedArray`
+the :class:`pylops_mpi.DistributedArray`
 
 Example
 -------
@@ -88,7 +88,7 @@ your GPU:
 The code is almost unchanged apart from the fact that we now use ``cupy`` arrays,
 PyLops-mpi will figure this out!
 
-If NCCL is available, ``cupy.cuda.nccl.NcclCommunicator`` can be initialized and passed to :class:`pylops_mpi.DistributedArray`
+Finally, if NCCL is available, a ``cupy.cuda.nccl.NcclCommunicator`` can be initialized and passed to :class:`pylops_mpi.DistributedArray`
 as follows:
 
 .. code-block:: python
@@ -102,9 +102,9 @@ as follows:
     nxl, nt = 20, 20
     dtype = np.float32
     d_dist = pylops_mpi.DistributedArray(global_shape=nxl * nt,
-                        base_comm_nccl=nccl_comm,
-                        partition=pylops_mpi.Partition.BROADCAST,
-                        engine="cupy", dtype=dtype)
+                                         base_comm_nccl=nccl_comm,
+                                         partition=pylops_mpi.Partition.BROADCAST,
+                                         engine="cupy", dtype=dtype)
     d_dist[:] = cp.ones(d_dist.local_shape, dtype=dtype)
 
     # Create and apply VStack operator
@@ -113,18 +113,16 @@ as follows:
     y_dist = HOp @ d_dist
 
 Under the hood, PyLops-MPI use both MPI Communicator and NCCL Communicator to manage distributed operations. Each GPU is logically binded to
-one MPI process. Generally speaking, the small operation like array-related shape and size remain using MPI while the collective calls
-like AllReduce will be carried through NCCL.
+one MPI process. In fact, minor communications like those dealing with array-related shapes and sizes are still performed using MPI, while collective calls on arrays like AllReduce are carried through NCCL.
 
 .. note::
 
    The CuPy and NCCL backend is in active development, with many examples not yet in the docs.
    You can find many `other examples <https://github.com/PyLops/pylops_notebooks/tree/master/developement-mpi/Cupy_MPI>`_ from the `PyLops Notebooks repository <https://github.com/PyLops/pylops_notebooks>`_.
 
 Supports for NCCL Backend
--------------------
-In the following, we provide a list of modules in which operates on :class:`pylops_mpi.DistributedArray`
-that can leverage NCCL backend
+----------------------------
+In the following, we provide a list of modules (i.e., operators and solvers) where we plan to support NCCL and the current status:
 
 .. list-table::
    :widths: 50 25
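The gpu.rst snippet in the hunks above assumes an ``nccl_comm`` already exists. Below is a sketch of how one might build it by hand with ``cupy.cuda.nccl`` and ``mpi4py``; the one-GPU-per-rank device selection and the array size are assumptions, and PyLops-MPI may also provide its own helper for this step.

```python
# Hand-rolled NCCL communicator setup (illustrative; assumes one GPU per MPI rank)
import numpy as np
import cupy as cp
from cupy.cuda import nccl
from mpi4py import MPI

import pylops_mpi

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# bind this MPI process to a GPU
cp.cuda.Device(rank % cp.cuda.runtime.getDeviceCount()).use()

# rank 0 creates the NCCL unique id and shares it with the other ranks over MPI
nccl_id = nccl.get_unique_id() if rank == 0 else None
nccl_id = comm.bcast(nccl_id, root=0)
nccl_comm = nccl.NcclCommunicator(size, nccl_id, rank)

# the communicator is then passed to DistributedArray exactly as in the docs above
dtype = np.float32
d_dist = pylops_mpi.DistributedArray(global_shape=400,
                                     base_comm_nccl=nccl_comm,
                                     partition=pylops_mpi.Partition.BROADCAST,
                                     engine="cupy", dtype=dtype)
d_dist[:] = cp.ones(d_dist.local_shape, dtype=dtype)
```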

docs/source/index.rst

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ computing nodes, enabling large and intricate tasks to be divided, solved, and a
 parallelized manner.
 
 PyLops-MPI also supports the Nvidia's Collective Communication Library `(NCCL) <https://developer.nvidia.com/nccl>`_ for high-performance
-GPU-to-GPU communications.This PyLops-MPI's NCCL engine works congruently with MPI by delegating the GPU-to-GPU communication tasks to
+GPU-to-GPU communications. The PyLops-MPI's NCCL engine works congruently with MPI by delegating the GPU-to-GPU communication tasks to
 highly-optimized NCCL, while leveraging MPI for CPU-side coordination and orchestration.
 
 Get started by :ref:`installing PyLops-MPI <Installation>` and following our quick tour.

docs/source/installation.rst

Lines changed: 2 additions & 2 deletions
@@ -115,7 +115,7 @@ Note that, differently from the ``conda`` command, the above **will not** creat
 Make sure you create and activate your environment previously.
 
 Simlarly, if you want to enable `NCCL <https://developer.nvidia.com/nccl>`_ but prefer using pip,
-you must first check CUDA version of your system:
+you must first check the CUDA version of your system:
 
 .. code-block:: bash
 
@@ -127,7 +127,7 @@ The `Makefile` is pre-configured with CUDA 12.x. If you use this version, run
 
    >> make dev-install_nccl
 
-Otherwise, you can change the command in `Makefile` to appropriate CUDA version
+Otherwise, you can change the command in `Makefile` to an appropriate CUDA version
 i.e., If you use CUDA 11.x, change ``cupy-cuda12x`` and ``nvidia-nccl-cu12`` to ``cupy-cuda11x`` and ``nvidia-nccl-cu11``
 and run the command.
 
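As a complement to the CUDA-version check described in this hunk, the following hypothetical helper (not part of PyLops-MPI or its Makefile) parses the ``CUDA Version`` field reported by ``nvidia-smi`` and prints which wheel pair to install.

```python
# Hypothetical helper (illustrative only): suggest cupy/nccl wheels from nvidia-smi output
import re
import subprocess

out = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
match = re.search(r"CUDA Version:\s*(\d+)\.", out)
cuda_major = int(match.group(1)) if match else None

if cuda_major == 11:
    print("pip install cupy-cuda11x nvidia-nccl-cu11")
else:
    print("pip install cupy-cuda12x nvidia-nccl-cu12")  # Makefile default (CUDA 12.x)
```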
