Commit b6df40a

Merge pull request #187 from mrava87/v0.4.0
build: prepare for v0.4.0
2 parents ab15936 + 1f5ac7c

File tree

4 files changed: +44 −10 lines changed

CHANGELOG.md

Lines changed: 15 additions & 0 deletions

@@ -1,3 +1,18 @@
+# 0.4.0
+* Added `pylops_mpi.Distributed.DistributedMixIn` class with
+  communicator-agnostic calls to communication methods.
+* Added `pylops_mpi.utils._mpi` with implementations of MPI
+  communication methods.
+* Added `kind="summa"` implementation in the
+  `pylops_mpi.basicoperators.MPIMatrixMult` operator.
+* Added `kind` parameter to all operators in `pylops_mpi.basicoperators.MPILaplacian`.
+* Added `cp.cuda.Device().synchronize()` before any MPI call when using
+  CUDA-Aware MPI.
+* Modified `pylops_mpi.utils._nccl.initialize_nccl_comm` to
+  handle nodes with more GPUs than ranks.
+* Fixed bug in `pylops_mpi.DistributedArray.__neg__` by
+  explicitly passing `base_comm_nccl` during internal creation
+  of the distributed array.
 # 0.3.0
 * Added `pylops_mpi.basicoperators.MPIMatrixMult` operator.

docs/source/changelog.rst

Lines changed: 21 additions & 0 deletions

@@ -4,6 +4,27 @@ Changelog
 =========


+Version 0.4.0
+-------------
+
+*Released on: 08/03/2026*
+
+* Added :class:`pylops_mpi.Distributed.DistributedMixIn` class with
+  communicator-agnostic calls to communication methods.
+* Added :mod:`pylops_mpi.utils._mpi` with implementations of MPI
+  communication methods.
+* Added ``kind="summa"`` implementation in the
+  :class:`pylops_mpi.basicoperators.MPIMatrixMult` operator.
+* Added ``kind`` parameter to all operators in :class:`pylops_mpi.basicoperators.MPILaplacian`.
+* Added ``cp.cuda.Device().synchronize()`` before any MPI call when using
+  CUDA-Aware MPI.
+* Modified :func:`pylops_mpi.utils._nccl.initialize_nccl_comm` to
+  handle nodes with more GPUs than ranks.
+* Fixed bug in :func:`pylops_mpi.DistributedArray.__neg__` by
+  explicitly passing ``base_comm_nccl`` during internal creation
+  of the distributed array.
+
 Version 0.3.0
 -------------

docs/source/gpu.rst

Lines changed: 7 additions & 9 deletions

@@ -24,11 +24,9 @@ cupy arrays.

 .. note::

-   By default when using ``cupy`` arrays, PyLops-MPI will try to use methods in MPI4Py that communicate memory buffers.
-   However, this requires a CUDA-Aware MPI installation. If your MPI installation is not CUDA-Aware, set the
-   environment variable ``PYLOPS_MPI_CUDA_AWARE=0`` to force PyLops-MPI to use methods in MPI4Py that communicate
-   general Python objects (this will incur a loss of performance!).
+   By default when using ``cupy`` arrays, PyLops-MPI will use methods in MPI4Py that communicate general
+   Python objects (this will incur a loss of performance!). If you have a CUDA-Aware MPI installation, set
+   ``PYLOPS_MPI_CUDA_AWARE=1`` for PyLops-MPI to use methods in MPI4Py that communicate memory buffers.

 Moreover, PyLops-MPI also supports Nvidia's Collective Communication Library (NCCL) for highly-optimized
 collective operations, such as AllReduce, AllGather, etc. This allows PyLops-MPI users to leverage the

@@ -53,11 +51,11 @@ In summary:
   - Default
   - Cannot be disabled
 * - CuPy + MPI
-  - ``PYLOPS_MPI_CUDA_AWARE=0``
-  - ``PYLOPS_MPI_CUDA_AWARE=1`` (default)
+  - ``PYLOPS_MPI_CUDA_AWARE=0`` (default)
+  - ``PYLOPS_MPI_CUDA_AWARE=1``
 * - CuPy + CUDA-Aware MPI
-  - ``PYLOPS_MPI_CUDA_AWARE=1`` (default)
-  - ``PYLOPS_MPI_CUDA_AWARE=0``
+  - ``PYLOPS_MPI_CUDA_AWARE=1``
+  - ``PYLOPS_MPI_CUDA_AWARE=0`` (default)
 * - CuPy + NCCL
   - ``NCCL_PYLOPS_MPI=1`` (default)
   - ``NCCL_PYLOPS_MPI=0``
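The documentation change above makes buffer-based (CUDA-Aware) communication opt-in rather than opt-out. A minimal sketch of how a user would opt in under the new default; since the flag is read at import time, it must be set before importing `pylops_mpi` (the import itself is commented out here, as it requires an MPI environment):

```python
import os

# Per the updated gpu.rst note: buffered (CUDA-aware) MPI4Py communication
# is now opt-in. Only enable it if your MPI build is actually CUDA-aware,
# otherwise communication of cupy arrays will fail.
os.environ["PYLOPS_MPI_CUDA_AWARE"] = "1"

# import pylops_mpi  # would now pick up the flag and use buffer-based methods
print(os.environ["PYLOPS_MPI_CUDA_AWARE"])
```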

pylops_mpi/utils/deps.py

Lines changed: 1 addition & 1 deletion

@@ -40,7 +40,7 @@ def nccl_import(message: Optional[str] = None) -> str:


 cuda_aware_mpi_enabled: bool = (
-    True if int(os.getenv("PYLOPS_MPI_CUDA_AWARE", 1)) == 1 else False
+    False if int(os.getenv("PYLOPS_MPI_CUDA_AWARE", 0)) == 0 else True
 )

 nccl_enabled: bool = (
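The one-line change above flips the flag's default from enabled to disabled. A minimal sketch of the new semantics, written as a pure function for testability (the helper name and dict-based environment argument are illustrative, not part of the library):

```python
def cuda_aware_mpi_enabled_flag(env: dict) -> bool:
    # Mirrors the new expression in pylops_mpi/utils/deps.py:
    # an unset variable now defaults to 0, i.e. CUDA-aware MPI is opt-in.
    return int(env.get("PYLOPS_MPI_CUDA_AWARE", 0)) != 0

# Unset or "0" -> disabled (safe default when MPI is not CUDA-aware)
print(cuda_aware_mpi_enabled_flag({}))                              # False
print(cuda_aware_mpi_enabled_flag({"PYLOPS_MPI_CUDA_AWARE": "1"}))  # True
```

Before this commit the same expression defaulted to `1`, so users on plain MPI builds had to explicitly export `PYLOPS_MPI_CUDA_AWARE=0`.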
