Skip to content

Conversation

@janjust
Copy link
Contributor

@janjust janjust commented Mar 15, 2019

The PR addresses several issues with creating multiple contexts in oshmem/ucx.

@ompiteam-bot
Copy link

Can one of the admins verify this patch?

@janjust
Copy link
Contributor Author

janjust commented Mar 15, 2019

@yosefe @brminich @hoopoepg @xinzhao3 @jladd-mlnx
The PR is up, as changes are requested we'll push against this PR.

@jsquyres
Copy link
Member

ok to test

@ibm-ompi
Copy link

The IBM CI (XL Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/f05fa592b4652af5e5403e9818f34b90

@ibm-ompi
Copy link

The IBM CI (GNU Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/b27aaa4fc3d8fe6c00aa6f6111124511

@jsquyres
Copy link
Member

@janjust Looks like there's some compile errors in the UCX stuff. Can you update/fix?

@janjust janjust force-pushed the oshmem-multiple-contexts-master branch from c8e5405 to 280f330 Compare March 15, 2019 18:11
@janjust
Copy link
Contributor Author

janjust commented Mar 15, 2019

@jsquyres fixed, last minute squash went wrong :(

@artpol84
Copy link
Contributor

@jsquyres @bwbarrett build checker failed with obviously unrelated error:

FATAL: Channel "unknown": Remote call on JNLP4-connect connection from 
proxy-s1.lanl.gov/192.12.184.6:36160 failed. The channel is closing down or has closed down

Who is the right person to ask for investigation.

@jsquyres
Copy link
Member

Just re-run the test and see if the same thing happens.

bot:ompi:retest


if (array->ctxs_count < array->ctxs_num) {
array->ctxs[array->ctxs_count] = ctx;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: } else {


typedef spml_ucx_mkey_t * (*mca_spml_ucx_get_mkey_slow_fn_t)(shmem_ctx_t ctx, int pe, void *va, void **rva);

typedef struct mca_spml_ucx_ctx_array {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about use linked list instead of array? it is much simpler to operate with list entries (add/remove/shift between lists) than array elements, there are set of macro in opal to manipulate linked lists

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you weren't part of the discussion leading up to this PR, but @yosefe suggested to use a similar implementation to opal callback, hence why the arrays - it's possibly faster, I haven't measured.

OPAL_DECLSPEC int opal_common_ucx_del_procs(opal_common_ucx_del_proc_t *procs, size_t count,
size_t my_rank, size_t max_disconnect, ucp_worker_h worker)
{
OPAL_DECLSPEC int opal_common_ucx_del_procs_nb(opal_common_ucx_del_proc_t *procs, size_t count,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: this call is blocking on wait all disconnection requests... _nb may confuse :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right, so the difference between the two is that the original had a fence which in oshmem case is a global sync, I'll remove _nb, maybe add _nofence to the call.

}
else {
array->ctxs = realloc(array->ctxs, (array->ctxs_num + 8) * sizeof(mca_spml_ucx_ctx_t *));
opal_atomic_wmb ();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we need write barrier here? what data should be stored?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the same reason why it's done in _opal_progress_register(); opal_progress.c:409, although it doesn't use a realloc, but it looks like it reimplements realloc - not sure why. So to keep consistent I added a wmb() after the realloc().


SHMEM_MUTEX_LOCK(mca_spml_ucx.internal_mutex);
_ctx_remove(&mca_spml_ucx.active_array, (mca_spml_ucx_ctx_t *)ctx);
_ctx_add(&mca_spml_ucx.idle_array, (mca_spml_ucx_ctx_t *)ctx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are removing context from progress, all outstanding operations should be flushed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as I can see context is not destroyed here - it is just moved into idle array and all outstanding ops may be left uncompleted. ucp_worker_flush may depend from remote side for fetch_and_op calls, but as I can see user can't remove OSHMEM context when there are incomplete ops, like called ***_nbi and after this call ctx_destroy without call shmem_ctx_quiet... but I'm not sure about this :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be ok, because disconnect_nb (invoked by ctx_cleanup) will flush eps

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the other side, flushing worker may help to avoid 'hard-to-debug' issues. I tend to agree with @hoopoepg that worker flush is worth to be done here. @yosefe, wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added flush() - @xinzhao3 need to run this through unit tests again


if (OPAL_SUCCESS != (ret = opal_pmix.fence_nb(NULL, 0,
opal_common_ucx_mca_fence_complete_cb, (void*)fenced))){
return ret;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to check ret here. Can just return opal_pmix.fence_nb (or just use it without this wrapper)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

{
int ret = OPAL_SUCCESS;
opal_common_ucx_del_procs_nb(procs, count, my_rank, max_disconnect, worker);
if (OPAL_SUCCESS != (ret = opal_common_ucx_mca_pmix_fence(worker))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to check ret, can just return

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

mca_spml_ucx.active_array.ctxs = calloc(mca_spml_ucx.active_array.ctxs_num,
sizeof(mca_spml_ucx_ctx_t *));
mca_spml_ucx.idle_array.ctxs = calloc(mca_spml_ucx.idle_array.ctxs_num,
sizeof(mca_spml_ucx_ctx_t *));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: indentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

array->ctxs[array->ctxs_count] = ctx;
}
else {
array->ctxs = realloc(array->ctxs, (array->ctxs_num + 8) * sizeof(mca_spml_ucx_ctx_t *));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better use macro definition for 8 instead of number

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in latest commit

@janjust janjust force-pushed the oshmem-multiple-contexts-master branch from 6edc838 to db7541e Compare March 18, 2019 17:34
@ibm-ompi
Copy link

The IBM CI (GNU Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/7f03afbb3d3d229153a57dd21dc90058

@ibm-ompi
Copy link

The IBM CI (XL Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/f3568fb647f9904f1134791e020b3104

@janjust janjust force-pushed the oshmem-multiple-contexts-master branch from db7541e to b6d07da Compare March 18, 2019 18:02
_ctx_add(&mca_spml_ucx.idle_array, (mca_spml_ucx_ctx_t *)ctx);
SHMEM_MUTEX_UNLOCK(mca_spml_ucx.internal_mutex);

ucp_worker_flush(ucx_ctx->ucp_worker);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not notice that quiet(ctx) is called in the beginning of this function, so it makes no sense to call flush here then.
I'm sorry for this confusion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, my fault. in current implementation of quiet there is flush_nb is called and it has own progress loop for worker (independent from opal progress).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I'll remove it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, removed flush

@janjust janjust force-pushed the oshmem-multiple-contexts-master branch from b6d07da to e294277 Compare March 19, 2019 13:49
@jladd-mlnx
Copy link
Member

👍 looks good, team.

Copy link
Member

@jsquyres jsquyres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is e294277 fixes to prior commits on this PR?

If so, please fix the problem in the respective commits themselves -- do not make commits and then have additional commits to fix the original commits.

Thanks.

…s/delete oshmem_barrier in shmem_ctx_destroy

ompi/oshmem/spml/ucx: optimize spml ucx progress

Signed-off-by: Tomislav Janjusic <[email protected]>
@janjust janjust force-pushed the oshmem-multiple-contexts-master branch from e294277 to 9c3d00b Compare March 21, 2019 21:06
@janjust
Copy link
Contributor Author

janjust commented Mar 21, 2019

@jsquyres done, remove commit, and addressed comments in respective commits.

@jsquyres
Copy link
Member

@janjust Many thanks. You'll probably need to update the corresponding cherry-pick hash notices in #6509, too.

@jladd-mlnx jladd-mlnx merged commit 9ab6ecb into open-mpi:master Mar 22, 2019
@janjust janjust deleted the oshmem-multiple-contexts-master branch March 27, 2025 02:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants