MPI 4.0: Allow MPI_WIN_SHARED_QUERY on regular windows #13330
I think I got this working now. I have a test case for it but I'm not sure where to put it. It's hard to make this a hard test because the result depends on whether the underlying implementation (like UCX) provides shared memory support. The test can still be used to check that we don't have a hard fail anywhere.
Pull Request Overview
This PR implements MPI 4.0 functionality that allows `MPI_WIN_SHARED_QUERY` to be called on any allocated and created windows, not just shared memory windows. The function may return the pointer, size, and displacement unit if the peer is located on the same node and the memory is shared.
- Adds support for `MPI_WIN_SHARED_QUERY` on regular windows in OSC/SM and OSC/RDMA modules
- Modifies the core `win_shared_query` function to gracefully handle unsupported cases
- Includes partial implementation for OSC/UCX (incomplete according to PR description)
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
File | Description |
---|---|
ompi/mpi/c/win_shared_query.c.in | Updates core function to handle unsupported cases gracefully instead of raising errors |
ompi/mca/osc/ucx/osc_ucx_component.c | Adds shared query support for UCX with optimization for size/displacement unit handling |
ompi/mca/osc/ucx/osc_ucx.h | Adds helper functions and structure fields for size and displacement unit management |
ompi/mca/osc/sm/osc_sm_component.c | Removes flavor restriction for shared memory module and fixes alignment calculations |
ompi/mca/osc/rdma/osc_rdma_peer.h | Adds CPU atomics flag and helper function for peer capabilities |
ompi/mca/osc/rdma/osc_rdma_component.c | Implements shared query for RDMA module and updates peer flag handling |
ompi/mca/osc/rdma/osc_rdma_accumulate.c | Updates atomic operations to use new CPU atomics check instead of local base check |
Comments suppressed due to low confidence (2)
ompi/mca/osc/sm/osc_sm_component.c:286
- This line was removed but the assignment to `total` is still needed for the non-contiguous case. The variable `total` should be assigned the page-aligned size when `module->noncontig` is true.
} else {
ompi/mca/osc/sm/osc_sm_component.c:288
- This line was removed but the assignment to `total` is still needed for the contiguous case. The variable `total` should be assigned the actual size when `module->noncontig` is false.
"allocating window using contiguous strategy");
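The size rule that the two suppressed comments describe can be sketched as follows. This is only an illustration and not the osc/sm code: the helper names `round_up_to_page` and `segment_size` are hypothetical, and the page size is passed in as a parameter and assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of page_size.
 * Assumption: page_size is a non-zero power of two. */
static size_t round_up_to_page(size_t size, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (size + page_size - 1) & ~(page_size - 1);
}

/* Hypothetical stand-in for the value assigned to `total`:
 * page-aligned size in the non-contiguous case, actual size otherwise. */
static size_t segment_size(size_t size, int noncontig, size_t page_size)
{
    return noncontig ? round_up_to_page(size, page_size) : size;
}
```

With a 4096-byte page, a 100-byte request would occupy 4096 bytes in the non-contiguous case and 100 bytes in the contiguous case.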
MPI 4.0 allows applications to query regular windows for shared memory. This patch enables it for osc/rdma and osc/ucx and otherwise makes sure we fail gracefully if the component does not provide the query callback. For osc/rdma, this is currently supported only for allocated windows but could later be extended to windows with application-provided memory through xpmem.
Signed-off-by: Joseph Schuchart <[email protected]>
I squashed everything down to one commit. This should be ready for review.
MPI 4.0 allows MPI_WIN_SHARED_QUERY to be called on any allocated and created windows. It may return the pointer, size, and displacement unit if the peer is located on the same node and the memory is shared.
This PR adds this functionality for osc/sm and osc/rdma. There is an attempt to implement it in osc/ucx but it's incomplete. We need to exchange and store the base pointer for the peer processes on the same node so we can pass them to `ucp_rkey_ptr()`. I'm not sure how to easily determine whether a process is on the same node. Some help would be appreciated.