@tharittk (Collaborator) commented Jul 6, 2025

This PR adds documentation for Multi-Dimensional Deconvolution (MDD) and Least-Squares Migration (LSM) with NCCL communication.

It was tested on 3 GPUs (UIUC Delta) and ran successfully.
Quick timing stats from those runs:

LSM (iterations = 100)

CuPy + NCCL Total time (s) = 1.96
CuPy + MPI Total time (s) = 1.94
NumPy + MPI Total time (s) = 1.19

MDD (iterations = 50)

CuPy + NCCL Total time (s) = 1.62
CuPy + MPI Total time (s) = 2.57
NumPy + MPI Total time (s) = 6.83

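The LSM timings are not yet as expected: both CuPy runs (1.96 s with NCCL, 1.94 s with MPI) are slower than NumPy + MPI (1.19 s). It could be that the input size is too small for the GPU computations to pay off.
Further tests will be carried out.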
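
As a quick sanity check on the numbers above, the timings can be converted into speedups relative to the NumPy + MPI run. This is only a minimal sketch: the `timings` dictionary and the `speedups` helper are illustrative names, not part of PyLops or of this PR, and the values are copied from the stats listed above.

```python
# Timings in seconds, copied from the stats reported in this comment.
timings = {
    "LSM": {"CuPy + NCCL": 1.96, "CuPy + MPI": 1.94, "NumPy + MPI": 1.19},
    "MDD": {"CuPy + NCCL": 1.62, "CuPy + MPI": 2.57, "NumPy + MPI": 6.83},
}


def speedups(results, baseline="NumPy + MPI"):
    """Return baseline_time / time for each configuration (hypothetical helper)."""
    ref = results[baseline]
    return {name: ref / t for name, t in results.items()}


# Print one line per problem and configuration; values above 1 mean
# the configuration is faster than the NumPy + MPI baseline.
for problem, results in timings.items():
    for name, s in speedups(results).items():
        print(f"{problem:3s} {name:12s} {s:5.2f}x")
```

On these numbers, MDD is roughly 4.2x faster with CuPy + NCCL and roughly 2.7x faster with CuPy + MPI than with NumPy + MPI, while both CuPy runs of LSM come out at about 0.6x, i.e. slower than the CPU baseline.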

@mrava87 (Contributor) left a comment

Good work @tharittk, just left some minor suggestions for you 😄

@mrava87 (Contributor) commented Jul 9, 2025

@tharittk nearly ready. I just left two comments above, as I think you have a couple of typos that may prevent things from rendering properly.

@mrava87 merged commit f2b2f62 into PyLops:main on Jul 10, 2025
61 checks passed