Commit caf7824
dpm: don't use locality info for multi PMIX namespace environments
Some of our collective frameworks are now locality aware and make use
of this information to make decisions on how to handle app coll ops.
It turns out that in certain multi-namespace situations (jobid in ompi speak), some procs
can get locality info about other procs but not in a symmetric fashion
using PMIx mechanisms. This can lead to communicators with different
locality information on different procs, which in turn can lead to deadlock
when using certain collectives.
This situation can be seen with the ompi-tests/ibm/dynamic/intercomm_merge.c test.
In this test the following happens:
1. process set A is started with mpirun
2. process set A spawns a set of processes B
3. processes in sets A and B create an intra comm using the intercomm from MPI_Comm_spawn and MPI_Comm_get_parent in the spawners and spawnees respectively
4. process set A spawns a set of processes C
5. processes in sets A and C create an intra comm using the intercomm from MPI_Comm_spawn and MPI_Comm_get_parent in the spawners and spawnees respectively
6. processes in A and B create new intercomm
7. processes in A and C create new intercomm
8. processes in A, B, and C create a new intra comm using the intercomms from steps 6 and 7
9. processes in A, B, and C try to do an MPI_Barrier using the intra comm from step 8
It turns out that in step 8 the locality info supplied by PMIx is asymmetric. Processes in sets B and C aren't able to determine locality info
about each other (PMIx returns not found when attempts are made to get locality info for the remote processes). This causes issues when
step 9 is executed.
Processes in set A are trying to use the tuned collective component for the barrier.
Processes in sets B and C are trying to use the HAN collective component for the barrier.
In process sets B and C, HAN thinks that the communicator has both local and remote procs so tries to use a hierarchical algorithm.
Meanwhile, procs in set A can retrieve locality from all procs in sets B and C and think the collective is occurring on a single node, which in fact it is.
This behavior can be observed using prrte master at 8ecee645de and openpmix master at a083d8f9.
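The mismatch described above can be illustrated with a toy model. This is not Open MPI or PMIx code: the names `lookup_locality` and `sees_remote_peer` and the `proc_set` labels are hypothetical. The sketch assumes one process per set, all on one node, that procs in B and C can resolve procs in A, and that a peer whose locality lookup fails is counted as remote.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one process per set, all running on the same node. */
enum proc_set { SET_A = 0, SET_B = 1, SET_C = 2, NUM_SETS = 3 };

/* Hypothetical stand-in for a locality query; returns true when the
 * lookup succeeds. Modeled on the commit text: A can resolve everyone,
 * while B and C cannot resolve each other ("not found"). */
bool lookup_locality(enum proc_set self, enum proc_set peer)
{
    assert(self < NUM_SETS && peer < NUM_SETS);
    if (self == peer) {
        return true;
    }
    if (self == SET_A || peer == SET_A) {
        return true;   /* assumption: lookups involving A succeed */
    }
    return false;      /* B <-> C: lookup fails */
}

/* Assumption: a peer with unknown locality is treated as remote, so the
 * caller believes the communicator spans more than one node. */
bool sees_remote_peer(enum proc_set self)
{
    for (int peer = 0; peer < NUM_SETS; peer++) {
        if (!lookup_locality(self, (enum proc_set)peer)) {
            return true;
        }
    }
    return false;
}
```

In this model `sees_remote_peer(SET_A)` is false while the same call for `SET_B` and `SET_C` is true, so the three sets would not agree on whether a hierarchical algorithm applies to the same communicator.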
This patch restricts using locality info for a proc if it is in a different PMIx namespace.
It also removes some comments which are now no longer accurate.
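The idea of the restriction can be sketched as a small guard. This is only an illustration, not the code in the patch: the type `toy_proc_name_t`, the function `toy_effective_locality`, and the flag values are hypothetical, and the value the real patch assigns to procs in another namespace is not shown here, so the sketch assumes a fallback to "unknown".

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical process name: a namespace string plus a rank. */
typedef struct {
    char nspace[64];
    uint32_t rank;
} toy_proc_name_t;

/* Hypothetical locality flag values, for illustration only. */
#define TOY_LOCALITY_UNKNOWN 0x0000u
#define TOY_LOCALITY_ON_NODE 0x0001u

/* Return the locality value the caller should act on for a peer.
 * Reported locality is used only when both procs share a namespace;
 * otherwise it is ignored, so every proc reaches the same answer for
 * cross-namespace peers whether or not its own lookup succeeded. */
uint16_t toy_effective_locality(const toy_proc_name_t *self,
                                const toy_proc_name_t *peer,
                                uint16_t reported)
{
    if (strncmp(self->nspace, peer->nspace, sizeof(self->nspace)) != 0) {
        return TOY_LOCALITY_UNKNOWN;  /* assumption: fall back to unknown */
    }
    return reported;
}
```

Because the decision depends only on the namespace comparison, which is the same from both sides, the asymmetric lookup results no longer influence how the communicator is classified.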
Signed-off-by: Howard Pritchard <[email protected]>
1 parent ecd206d commit caf7824
3 files changed: 10 additions, 6 deletions