[](https://pylops.slack.com)
[](https://doi.org/10.21105/joss.07512)

# Distributed linear operators and solvers
Pylops-mpi is a Python library built on top of [PyLops](https://pylops.readthedocs.io/en/stable/), designed to enable distributed and parallel processing of
large-scale linear algebra operations and computations.

## Installation
To install pylops-mpi, you need to have the Message Passing Interface (MPI) and, optionally, NVIDIA's Collective Communications Library (NCCL) installed on your system.

1. **Download and install MPI**: Visit the website of an MPI implementation to download a version appropriate for your system,
and follow the installation instructions provided by the vendor:
   - [Open MPI](https://www.open-mpi.org/software/ompi/v1.10/)
   - [MPICH](https://www.mpich.org/downloads/)
   - [Intel MPI](https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html#gs.10j8fx)

2. **Verify MPI installation**: After installing MPI, verify its installation by opening a terminal or command prompt
and running the following command:
   ```
   mpiexec --version
   ```

3. **Install pylops-mpi**: Once MPI is installed and verified, you can proceed to install `pylops-mpi` via `pip`:
   ```
   pip install pylops-mpi
   ```

4. (Optional) To enable the NCCL backend on multi-GPU systems, install `cupy` and `nccl` via `pip`:
   ```
   pip install cupy-cudaXx nvidia-nccl-cuX
   ```
   with `X=11,12`.

Alternatively, if the Conda package manager is used to set up the Python environment, steps 1 and 2 can be skipped and `mpi4py` can be installed directly alongside the MPI distribution of choice:

```
conda install -c conda-forge mpi4py X
```

with `X=mpich, openmpi, impi_rt, msmpi`. Similarly, step 4 can be accomplished using:

```
conda install -c conda-forge cupy nccl
```

See the [Installation](https://pylops.github.io/pylops-mpi/installation.html) page of the documentation for more information.

## Run Pylops-MPI
Once you have installed the prerequisites and pylops-mpi, you can run pylops-mpi programs using the `mpiexec` command.

Here is an example of how to run a Python script called `<script_name>.py`:
```
mpiexec -n <NUM_PROCESSES> python <script_name>.py
```

## Example: A distributed finite-difference operator
The following example is a modified version of the starting example in
[PyLops' README](https://github.com/PyLops/pylops/blob/dev/README.md),
adapted to handle a 2D array distributed across ranks over the first dimension
via the `DistributedArray` object:

```python
import numpy as np
import pylops_mpi
from pylops_mpi import DistributedArray, Partition

# Initialize a DistributedArray with partition set to Scatter
nx, ny = 11, 21
x = np.zeros((nx, ny), dtype=np.float64)
x[nx // 2, ny // 2] = 1.0

x_dist = pylops_mpi.DistributedArray.to_dist(
            x=x.flatten(),
            partition=Partition.SCATTER)

# Distributed first-derivative operator
D_op = pylops_mpi.MPIFirstDerivative((nx, ny), dtype=np.float64)

# y = Dx
y_dist = D_op @ x_dist

# xadj = D^H y
xadj_dist = D_op.H @ y_dist

# xinv = D^-1 y
x0_dist = pylops_mpi.DistributedArray(D_op.shape[1], dtype=np.float64)
x0_dist[:] = 0
xinv_dist = pylops_mpi.cgls(D_op, y_dist, x0=x0_dist, niter=10)[0]
```

Note that the `DistributedArray` class provides the `to_dist` class method, which accepts a NumPy array as input and converts it into an instance of `DistributedArray`. This method is used to transform a regular NumPy array into a DistributedArray that is distributed and processed across multiple nodes or processes.
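
To illustrate the partitioning semantics without requiring an MPI launch, here is a single-process NumPy sketch of what `Partition.SCATTER` does conceptually: each rank holds only a near-equal chunk of the flattened global array. Note that `np.array_split` is used here purely as a stand-in for the MPI scatter and is not part of the pylops-mpi API:

```python
import numpy as np

# Conceptual (single-process) sketch of Partition.SCATTER:
# each of n_ranks processes holds a near-equal chunk of the
# flattened global array; the real DistributedArray performs
# the actual communication under the hood.
n_ranks = 4
x = np.arange(10, dtype=np.float64)

local_chunks = np.array_split(x, n_ranks)  # one chunk per "rank"

# Each rank sees only its local portion...
for rank, chunk in enumerate(local_chunks):
    print(f"rank {rank}: local shape {chunk.shape}")

# ...but concatenating all local portions recovers the global array
x_global = np.concatenate(local_chunks)
assert np.array_equal(x_global, x)
```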

Moreover, the `DistributedArray` class also provides fundamental mathematical operations, such as element-wise addition, subtraction, and multiplication, as well as dot product and an equivalent of the [`np.linalg.norm`](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html) function, all operating in a distributed fashion,
thus utilizing the efficiency of the MPI/NCCL protocols. This enables efficient computation and processing of large-scale distributed arrays.
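
As a conceptual, single-process sketch of how such a distributed reduction works (chunking with `np.array_split` stands in for the data each rank would hold; the real library uses MPI `Allreduce` or NCCL collectives instead), a distributed 2-norm can be computed from per-rank partial sums of squares:

```python
import numpy as np

# Sketch of a distributed 2-norm: each "rank" reduces its local
# chunk to a partial sum of squares, then a global sum-reduction
# (Allreduce in the real library) combines the partial results.
x = np.random.default_rng(42).normal(size=1000)
local_chunks = np.array_split(x, 4)  # stand-in for 4 ranks

partials = [np.sum(c ** 2) for c in local_chunks]  # local reductions
dist_norm = np.sqrt(np.sum(partials))              # global reduction

# The distributed result matches the single-array norm
assert np.isclose(dist_norm, np.linalg.norm(x))
```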

## Running Tests
The MPI test scripts are located in the `tests` folder.
Use the following command to run the tests:
```
mpiexec -n <NUM_PROCESSES> pytest tests/ --with-mpi
```
where the `--with-mpi` option tells pytest to enable the `pytest-mpi` plugin, allowing the tests to utilize MPI functionality.

Similarly, to run the NCCL test scripts in the `tests_nccl` folder, use:
```
mpiexec -n <NUM_PROCESSES> pytest tests_nccl/ --with-mpi
```

## Documentation
The official documentation of Pylops-MPI is available [here](https://pylops.github.io/pylops-mpi/).
Visit the official docs to learn more about pylops-mpi.

## Contributors
* Rohan Babbar, rohanbabbar04
* Yuxi Hong, hongyx11
* Matteo Ravasi, mrava87
* Tharit Tangkijwanichakul, tharittk