-
Notifications
You must be signed in to change notification settings - Fork 935
Implementation of the current state of MPI Continuations proposal [WIP] #9502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
bosilca
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the impact of the continuations on the latency ? What if we mix normal requests and continuations in the same app ?
|
I pushed a bunch of changes. The biggest change is that a continuation request is now derived from |
|
Oh, and I will squash everything down to one commit once we're done... |
bosilca
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitively much cleaner.
ompi/mpiext/continue/configure.m4
Outdated
|
|
||
| # This example can always build, so we just execute $1 if it was | ||
| # requested. | ||
| AS_IF([test "$ENABLE_continue" = "1" || \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These IFs looks similar enough to be merged.
2f49fa2 to
fa149c2
Compare
|
@bosilca I pushed a few changes:
|
| int ompi_continue_progress_request(ompi_request_t *cont_req); | ||
|
|
||
| /** | ||
| * Attach a continuation to a set of operations represented by \c requests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this \c ?
|
|
||
| /** | ||
| * Register the provided continuation request to be included in the | ||
| * global progress loop (used while a thread is waiting for the contnuation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo. contnuation
|
|
||
| opal_atomic_unlock(&cont_req->cont_lock); | ||
|
|
||
| if (NULL == thread_progress_list) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this global object not protected for concurrent accesses ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a thread_local variable that stores the continuation requests the current thread is waiting on
0fffb9c to
e014717
Compare
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: d80d3ca: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
03c36a5 to
e2979db
Compare
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: f3babb5: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
5 similar comments
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: f3babb5: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: f3babb5: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: f3babb5: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: f3babb5: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: f3babb5: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
df79651 to
c8c0437
Compare
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: f3babb5: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
3 similar comments
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: f3babb5: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: f3babb5: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: f3babb5: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
a02d677 to
b6440c0
Compare
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: a32e7e3: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
2 similar comments
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: a32e7e3: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: a32e7e3: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
Signed-off-by: Joseph Schuchart <[email protected]>
Also: always store continuations directly in poll-only CRs. Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
- Swap flags and max_poll parameter in MPI_Continue_init - Swap cont_req and info parameter in MPI_Continue_init - Add new flags and remove support for info keys mpi_continue_poll_only, mpi_continue_max_poll, and mpi_continue_enqueue_complete Signed-off-by: Joseph Schuchart <[email protected]>
This was a temporary bandaid to fix the C11 sequentially consistent store... Signed-off-by: Joseph Schuchart <[email protected]>
Continuation requests may be marked as complete before another continuation is registered. In that case, we should mark the continuation request as pending again, unless a sync object has been registered already. Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
The cont->cont_req field was reset *after* it was returned to the freelist... Signed-off-by: Joseph Schuchart <[email protected]>
We must protect the activation of the completion of a continuation request to ensure that we don't miss any updates. Signed-off-by: Joseph Schuchart <[email protected]>
Runnable continuations should be added to cont_complete_defer_list if the continuation request is inactive, so that it can be enqueued for execution once the request is enabled. Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
This was a temporary idea that has been removed again. Signed-off-by: Joseph Schuchart <[email protected]>
MPIX_CONT_PERSISTENT will not be part of the first proposal and has been removed. Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
790a7b6 to
68b887c
Compare
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: d77ca1c: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
It is not sufficient to just set one value, we need proper initialization through OMPI_REQUEST_INIT. Signed-off-by: Joseph Schuchart <[email protected]>
|
Hello! The Git Commit Checker CI bot found a few problems with this PR: d77ca1c: Remove re-iteration of continuation requests in te...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
|
I found something that may need more clarification. The continuation proposal says
It seems to me that the MPI runtime should set the status of the continuation request to This difference matters because it decides whether the following code logic is correct: if my application never tested (and then re-started) the continuation request (imagining I have some magic way to keep making progress on the runtime e.g., testing other regular requests), could I keep attaching new continues to it and assume they always get executed without delay? |
|
I think this is a good point. The idea was that CRs may complete once all registered continuations have executed and the CR is tested/waited on. It will not be complete otherwise, so it is safe to register new continuations and expect them to be executed (provided they are not poll-only). |
This PR adds an implementation of the MPI Continuations proposal as extension to Open MPI. This proposal is current under discussion in the MPI hybrid and accelerator working group. By integrating it as an extension into Open MPI, I am hoping to provide an up-to-date implementation for the community to experiment with. It reflects the current state of the proposal and will be kept in sync with the evolving proposal.
The implementation is mostly confined to the extension itself, with the exception of hooks in request test and wait functions used to allow polling on a continuation request to complete outstanding continuations. This is needed for the implementation of the
"mpi_continue_poll_only"info key (see the description below).Note: this PR is currently WIP as it relies on a workaround to define
OMPI_HAVE_MPI_EXT_CONTINUE, which is used to disable the hooks in request test/wait functions if the extension is not enabled. The underlying issue is that the extensions integration currently does not support including thempiext.hheader in implementation files (needed to disable the hooks mentioned above).Overview of MPI Continuations
Continuations provide a mechanism for attaching callbacks to outstanding operation requests. A call to
MPIX_Continuetakes a request, a function pointer, a user-provided data pointer, and a status object (orMPI_STATUS_IGNORE), along with a continuation request and attaches the continuation to the operation request:The ownership of non-persistent requests is returned to MPI and the pointer to the request will be set to
MPI_REQUEST_NULL. The callback is passed the status pointer and the user-provided data pointer:The status has to remain valid until the invocation of the callback and is set according to the operation before the callback is invoked.
The continuation is registered with the provided continuation request. The continuation request is a request allocated using
MPIX_Continue_init:Continuation requests may be used to test/wait for completion of all continuations registered with it using
MPI_Test/Wait. Supported info keys are:"mpi_continue_poll_only": if true, only execute continuations whenMPI_Test/Waitis called on the continuation request. If false, continuations may be executed at any time inside a call into MPI (inside a callback registered withopal_progressin the implementation; default: false)."mpi_continue_enqueue_complete": if false, the continuation is executed immediately if the operations are already complete when MPIX_Continue is called. Execution is deferred otherwise (default: false)"mpi_continue_max_poll": the maximum number of continuations to execute when callingMPI_Teston the continuation request (default: -1, meaning unlimited)A continuation may in turn be attached to a continuation request, in which case it will be executed once all continuations registered with the continuation request have completed.
In addition to
MPIX_Continue, the proposal also includesMPIX_Continueallwhich attaches a continuation to a set of requests such that the continuation is executed once all operations have completed.Signed-off-by: Joseph Schuchart [email protected]
Signed-off-by: George Bosilca [email protected]