Oshmem multiple contexts #6492
This reverts commit f1b095c. Signed-off-by: Tomislav Janjusic <[email protected]>
Signed-off-by: Tomislav Janjusic <[email protected]>
Signed-off-by: Xin Zhao <[email protected]> Signed-off-by: Tomislav Janjusic <[email protected]>
Signed-off-by: Tomislav Janjusic <[email protected]>
… track of ucx_ctx_default's rkeys Signed-off-by: Tomislav Janjusic <[email protected]>
Can one of the admins verify this patch?
@yosefe @brminich @hoopoepg @xinzhao3 @jladd-mlnx
ok to test
The IBM CI (XL Compiler) build failed! Please review the log, linked below. Gist: https://gist.github.com/f05fa592b4652af5e5403e9818f34b90
The IBM CI (GNU Compiler) build failed! Please review the log, linked below. Gist: https://gist.github.com/b27aaa4fc3d8fe6c00aa6f6111124511
@janjust Looks like there are some compile errors in the UCX stuff. Can you update/fix?
c8e5405 to 280f330
@jsquyres fixed, last minute squash went wrong :(
@jsquyres @bwbarrett The build checker failed with an obviously unrelated error. Who is the right person to ask to investigate?
Just re-run the test and see if the same thing happens. bot:ompi:retest
oshmem/mca/spml/ucx/spml_ucx.c
Outdated
if (array->ctxs_count < array->ctxs_num) {
    array->ctxs[array->ctxs_count] = ctx;
}
minor: } else {
typedef spml_ucx_mkey_t * (*mca_spml_ucx_get_mkey_slow_fn_t)(shmem_ctx_t ctx, int pe, void *va, void **rva);
typedef struct mca_spml_ucx_ctx_array {
What about using a linked list instead of an array? List entries are much simpler to operate on (add/remove/shift between lists) than array elements, and opal has a set of macros for manipulating linked lists.
You weren't part of the discussion leading up to this PR, but @yosefe suggested an implementation similar to the opal callback one, hence the arrays. It's possibly faster; I haven't measured.
opal/mca/common/ucx/common_ucx.c
Outdated
OPAL_DECLSPEC int opal_common_ucx_del_procs(opal_common_ucx_del_proc_t *procs, size_t count,
                                            size_t my_rank, size_t max_disconnect, ucp_worker_h worker)
{
OPAL_DECLSPEC int opal_common_ucx_del_procs_nb(opal_common_ucx_del_proc_t *procs, size_t count,
minor: this call blocks while waiting for all disconnection requests, so _nb may confuse :)
You're right. The difference between the two is that the original had a fence, which in the oshmem case is a global sync. I'll remove _nb and maybe add _nofence to the name.
}
else {
    array->ctxs = realloc(array->ctxs, (array->ctxs_num + 8) * sizeof(mca_spml_ucx_ctx_t *));
    opal_atomic_wmb ();
Why do we need a write barrier here? What data needs to be stored?
For the same reason it's done in _opal_progress_register() (opal_progress.c:409). That function doesn't call realloc, but it looks like it reimplements it; I'm not sure why. To stay consistent I added a wmb() after the realloc().
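For readers wondering what a write barrier buys in this kind of code, here is a minimal sketch in portable C11, not the OPAL code itself. It assumes a single writer and a reader that may scan the table without taking a lock (an assumption about how a progress loop might use it). `slot_table_t`, `table_publish` and `table_last` are hypothetical names, and `atomic_thread_fence(memory_order_release)` stands in for the role `opal_atomic_wmb()` would play.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TABLE_SLOTS 8

/* Hypothetical table: slots are plain memory, count is the published size. */
typedef struct slot_table {
    void *slots[TABLE_SLOTS];
    atomic_size_t count;
} slot_table_t;

/* Single writer: store the slot first, then publish the new count.
 * The release fence keeps the slot store from being reordered after
 * the count store, so a reader never sees a count covering an unset slot. */
static void table_publish(slot_table_t *t, void *item)
{
    size_t n = atomic_load_explicit(&t->count, memory_order_relaxed);
    assert(n < TABLE_SLOTS);
    t->slots[n] = item;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&t->count, n + 1, memory_order_relaxed);
}

/* Reader: load the count, then fence, then read the slot it covers. */
static void *table_last(slot_table_t *t)
{
    size_t n = atomic_load_explicit(&t->count, memory_order_relaxed);
    if (n == 0) {
        return NULL;
    }
    atomic_thread_fence(memory_order_acquire);
    return t->slots[n - 1];
}
```

If every access to the array already happens under a mutex, the lock provides this ordering and the explicit barrier adds nothing, which may be what the reviewer's question is getting at.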
SHMEM_MUTEX_LOCK(mca_spml_ucx.internal_mutex);
_ctx_remove(&mca_spml_ucx.active_array, (mca_spml_ucx_ctx_t *)ctx);
_ctx_add(&mca_spml_ucx.idle_array, (mca_spml_ucx_ctx_t *)ctx);
You are removing the context from progress, so all outstanding operations should be flushed.
As far as I can see, the context is not destroyed here; it is just moved into the idle array, and all outstanding ops may be left uncompleted. ucp_worker_flush may depend on the remote side for fetch_and_op calls. But as far as I can see, the user can't remove an OSHMEM context while there are incomplete ops, e.g. by calling ***_nbi and then ctx_destroy without calling shmem_ctx_quiet... but I'm not sure about this :)
It should be ok, because disconnect_nb (invoked by ctx_cleanup) will flush the eps.
added flush() - @xinzhao3 this needs to run through the unit tests again
opal/mca/common/ucx/common_ucx.c
Outdated
if (OPAL_SUCCESS != (ret = opal_pmix.fence_nb(NULL, 0,
        opal_common_ucx_mca_fence_complete_cb, (void*)fenced))){
    return ret;
No need to check ret here. You can just return opal_pmix.fence_nb (or use it directly without this wrapper).
done
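The simplification the reviewer asks for can be sketched as follows. This is an illustration only: `sketch_fence_nb` is a hypothetical stand-in for the real non-blocking fence call, the `SKETCH_*` codes are made up, and the "before" form assumes nothing else happens between the check and the final return.

```c
#include <assert.h>

#define SKETCH_SUCCESS 0
#define SKETCH_ERROR   (-1)

/* Hypothetical stand-in for a non-blocking fence call. */
static int sketch_fence_nb(int should_fail)
{
    return should_fail ? SKETCH_ERROR : SKETCH_SUCCESS;
}

/* Before: the result is stored and checked, yet both paths return it. */
static int fence_checked(int should_fail)
{
    int ret;
    if (SKETCH_SUCCESS != (ret = sketch_fence_nb(should_fail))) {
        return ret;
    }
    return ret;
}

/* After: same behavior, with the callee's result forwarded directly. */
static int fence_direct(int should_fail)
{
    return sketch_fence_nb(should_fail);
}

/* Both forms agree on the success path and on the failure path. */
static void check_equivalence(void)
{
    assert(fence_checked(0) == fence_direct(0));
    assert(fence_checked(1) == fence_direct(1));
}
```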
opal/mca/common/ucx/common_ucx.c
Outdated
{
    int ret = OPAL_SUCCESS;
    opal_common_ucx_del_procs_nb(procs, count, my_rank, max_disconnect, worker);
    if (OPAL_SUCCESS != (ret = opal_common_ucx_mca_pmix_fence(worker))) {
No need to check ret; you can just return.
done
mca_spml_ucx.active_array.ctxs = calloc(mca_spml_ucx.active_array.ctxs_num,
    sizeof(mca_spml_ucx_ctx_t *));
mca_spml_ucx.idle_array.ctxs = calloc(mca_spml_ucx.idle_array.ctxs_num,
    sizeof(mca_spml_ucx_ctx_t *));
minor: indentation
done
oshmem/mca/spml/ucx/spml_ucx.c
Outdated
    array->ctxs[array->ctxs_count] = ctx;
}
else {
    array->ctxs = realloc(array->ctxs, (array->ctxs_num + 8) * sizeof(mca_spml_ucx_ctx_t *));
Better to use a macro definition for 8 instead of a bare number.
done in latest commit
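A minimal sketch of what the named-constant version of the growth step could look like, assuming a simplified array of opaque pointers. `CTX_ARRAY_GROW_STEP`, `ctx_array_t` and `ctx_array_add` are hypothetical names, not the ones used in the PR. The sketch also assigns the result of `realloc` to a temporary first, since assigning it straight back to `array->ctxs` would lose the old block if the allocation failed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical name: slots added each time the array fills up. */
#define CTX_ARRAY_GROW_STEP 8

/* Simplified stand-in for the context array; contexts are opaque here. */
typedef struct ctx_array {
    void **ctxs;        /* slots holding context pointers */
    size_t ctxs_count;  /* slots currently in use */
    size_t ctxs_num;    /* slots allocated */
} ctx_array_t;

/* Append ctx, growing by CTX_ARRAY_GROW_STEP when the array is full.
 * Returns 0 on success, -1 if allocation fails (array left unchanged). */
static int ctx_array_add(ctx_array_t *array, void *ctx)
{
    assert(array != NULL);
    if (array->ctxs_count == array->ctxs_num) {
        size_t new_num = array->ctxs_num + CTX_ARRAY_GROW_STEP;
        void **grown = realloc(array->ctxs, new_num * sizeof(*grown));
        if (grown == NULL) {
            return -1;
        }
        array->ctxs = grown;
        array->ctxs_num = new_num;
    }
    array->ctxs[array->ctxs_count++] = ctx;
    return 0;
}
```

Starting from a zero-initialized `ctx_array_t`, the first call allocates the first `CTX_ARRAY_GROW_STEP` slots, because `realloc` with a NULL pointer behaves like `malloc`.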
6edc838 to db7541e
The IBM CI (GNU Compiler) build failed! Please review the log, linked below. Gist: https://gist.github.com/7f03afbb3d3d229153a57dd21dc90058
The IBM CI (XL Compiler) build failed! Please review the log, linked below. Gist: https://gist.github.com/f3568fb647f9904f1134791e020b3104
db7541e to b6d07da
oshmem/mca/spml/ucx/spml_ucx.c
Outdated
_ctx_add(&mca_spml_ucx.idle_array, (mca_spml_ucx_ctx_t *)ctx);
SHMEM_MUTEX_UNLOCK(mca_spml_ucx.internal_mutex);
ucp_worker_flush(ucx_ctx->ucp_worker);
I did not notice that quiet(ctx) is called at the beginning of this function, so it makes no sense to call flush here.
I'm sorry for the confusion.
Yep, my fault. In the current implementation of quiet, flush_nb is called, and it has its own progress loop for the worker (independent of opal progress).
ok I'll remove it
done, removed flush
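The ordering the thread settles on (quiet first, then take the context out of progress, with no extra flush) can be sketched with a toy model. Everything here is hypothetical: `sketch_ctx_t` replaces the real context and worker, and an integer counter stands in for outstanding UCX operations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified context; the real one would hold a UCX worker. */
typedef struct sketch_ctx {
    int  outstanding;  /* operations issued but not yet completed */
    bool active;       /* true while the context is being progressed */
} sketch_ctx_t;

/* Stand-in for one progress call: completes at most one pending op. */
static void sketch_progress(sketch_ctx_t *ctx)
{
    if (ctx->outstanding > 0) {
        ctx->outstanding--;
    }
}

/* Stand-in for quiet: runs its own progress loop until nothing is pending,
 * so it does not rely on the context staying registered for progress. */
static void sketch_quiet(sketch_ctx_t *ctx)
{
    while (ctx->outstanding > 0) {
        sketch_progress(ctx);
    }
}

/* Destroy path: drain first, then deactivate. No second flush is needed
 * because quiet has already completed every outstanding operation. */
static void sketch_ctx_destroy(sketch_ctx_t *ctx)
{
    sketch_quiet(ctx);
    assert(ctx->outstanding == 0);
    /* In the PR this step moves the context from the active array to the
     * idle array while holding the internal mutex. */
    ctx->active = false;
}
```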
b6d07da to e294277
👍 looks good, team.
Is e294277 a set of fixes to prior commits on this PR?
If so, please fix the problem in the respective commits themselves -- do not make commits and then have additional commits to fix the original commits.
Thanks.
Signed-off-by: Tomislav Janjusic <[email protected]>
Signed-off-by: Tomislav Janjusic <[email protected]>
…s/delete oshmem_barrier in shmem_ctx_destroy ompi/oshmem/spml/ucx: optimize spml ucx progress Signed-off-by: Tomislav Janjusic <[email protected]>
e294277 to 9c3d00b
@jsquyres done: removed the commit and addressed the comments in the respective commits.
The PR addresses several issues with creating multiple contexts in oshmem/ucx.