
Conversation

@hppritcha
Member

The PSM2 MTL was setting the PSM2_DEVICES environment
variable to self,shm for single node jobs. This is
causing single node/single process jobs to fail on
some omnipath systems. Not setting this environment
variable fixes these issues.

This fix is needed as part of bringup of omnipath
clusters at several customer sites.

Fixes issue #1559

@matcabral
@rhc54 (copying you just for fyi)

Signed-off-by: Howard Pritchard <[email protected]>
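For context, a minimal sketch of the kind of logic this PR removes. The helper name and the num_local_procs/num_total_procs parameters are illustrative assumptions; the actual psm2 MTL component code is organized differently:

```c
#include <stdlib.h>

/* Sketch only: when every rank of the job is on the local node, the old
 * behavior told PSM2 to skip the HFI device and use just self + shm.
 * This PR drops that restriction because it breaks single-node /
 * single-process jobs on some Omni-Path systems. */
static void psm2_restrict_devices(int num_local_procs, int num_total_procs)
{
    if (num_local_procs == num_total_procs) {
        /* third argument 0: do not overwrite a value the user already set */
        setenv("PSM2_DEVICES", "self,shm", 0);
    }
}
```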

@hppritcha hppritcha added this to the v1.10.3 milestone Apr 25, 2016
@hjelmn
Member

hjelmn commented Apr 27, 2016

@hppritcha The psm2 shared memory performance is already pretty bad. Does this make it worse?

@hppritcha
Member Author

Doubtful. But Intel says if you ask for the HFI device for single-node jobs you probably can't run as many ranks as there are cores - probably for BWL or the like. Iterating with Intel on this. Meanwhile TOSS took the patch for now.

@matcabral
Contributor

matcabral commented Apr 27, 2016

Hi @hppritcha,
When initializing with PSM2_DEVICES=hfi, the maximum number of ranks on a single node will be defined by the number of HW contexts in the HFI card. The cards have 160 contexts. By default the hfi1.ko driver will load 1 HW context per physical core on the system, but this can be changed with the module load-time parameter num_user_contexts. Therefore, there is no limitation on having one rank per core, as described by the numbers above. Additionally, it can support up to 8-way sharing of the hardware contexts to get to ~1280 ranks per node.
In regards to performance, the shm device will continue to be used transparently (in single-node jobs). There will be only a little overhead at init time to initialize the hfi device.
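Spelling out the arithmetic behind those figures (using the default of one context per physical core and the 8-way sharing described above): 160 HW contexts x 8-way sharing ≈ 1280 ranks per node.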

@hppritcha
Member Author

@matcabral thanks for the detailed answer, this helps a lot. Given this, though, can you explain why it's not good to get rid of this PSM2_DEVICES env setting in the psm2 mtl?

@matcabral
Contributor

matcabral commented Apr 28, 2016

@hppritcha,
I still need to test the proposed change in our lab (I'm waiting for resources) to evaluate potential impacts. By analytical review, I see the change could fit as an immediate fix for the issues found (root cause research is ongoing). However, I am not fully satisfied with the PSM2 MTL having PSM lock HW resources (HFI card contexts) that are not used. In the short term there may not be a significant impact when running under-subscribed. However, as core counts grow (KNx series) this will become more impactful.

@hppritcha
Member Author

closing this. looks like a better fix is available.

@hppritcha hppritcha closed this May 3, 2016
@hppritcha
Member Author

reopening in case it may be useful for @matcabral

@matcabral
Contributor

matcabral commented May 11, 2016

@hppritcha, with fixes for both of the problems reported in #1559 (the root cause fix), there is no need for this PR. However, it would still be convenient to check the environment before setting PSM2_DEVICES. I will send a new PR for that.

thanks,
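A minimal sketch of the check described above, assuming a hypothetical helper name and a precomputed single-node flag; this is not the actual follow-up patch:

```c
#include <stdlib.h>

/* Sketch only: honor a PSM2_DEVICES value that the user has already
 * exported instead of unconditionally overriding it. */
static void psm2_maybe_set_devices(int job_is_single_node)
{
    if (NULL == getenv("PSM2_DEVICES") && job_is_single_node) {
        setenv("PSM2_DEVICES", "self,shm", 0);
    }
}
```

Note that passing 0 as the last argument to setenv() already preserves an existing value, so the explicit getenv() check mainly makes the intent visible.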

@ibm-ompi

Test passed.

@hppritcha
Member Author

closing this PR, fixed elsewhere.

@hppritcha hppritcha closed this May 24, 2016
@hppritcha hppritcha deleted the topic/psm2_mtl_fix_for_issue_1559 branch May 2, 2018 02:56