Description
Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
v5.0.8
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
5.0.8: micromamba using the conda-forge channel
4.1.4: spack
Please describe the system on which you are running
- Operating system/version: Rocky Linux 8
- Computer hardware:
- Network type: InfiniBand / high-speed Ethernet
Details of the problem
Please describe, in detail, the problem that you are having, including the behavior you expect to see, the actual behavior that you are seeing, steps to reproduce the problem, etc. It is most helpful if you can attach a small program that a developer can use to reproduce your problem.
Note: If you include verbatim output (or a code block), please use a GitHub Markdown code block like below:
In a Slurm job I request 2 tasks with 8 CPUs per task, so 16 CPUs are available in total.
With version 4.1.4, the following worked without any issue:
NUMBA_NUM_THREADS=${SLURM_CPUS_PER_TASK:?}
mpiexec -n "${SLURM_NTASKS:?}" --map-by slot:pe="${NUMBA_NUM_THREADS:?}" python pi_hybrid.py
This throws an error with 5.0.8.
I don't understand why. Are the MPI processes themselves now counted as well? When I set NUMBA_NUM_THREADS to 7 it works, but then 2 CPUs are essentially unused, because the MPI processes sit idle during the Numba parallelization.
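The arithmetic behind my expectation can be sketched as follows. This is only an illustration: `required_cpus` is a hypothetical helper, and the fallback values 2 and 8 are the numbers from this job, used when the Slurm variables are not set. It does not reproduce how Open MPI counts slots internally.

```bash
#!/usr/bin/env bash
# Hypothetical helper: number of cores a "--map-by slot:pe=N" launch asks for,
# assuming each of the ntasks ranks gets exactly pe cores and nothing else.
required_cpus() {
    local ntasks=$1 pe=$2
    echo $(( ntasks * pe ))
}

# Inside a job the Slurm values are used; the defaults mirror the job above.
ntasks=${SLURM_NTASKS:-2}
cpus_per_task=${SLURM_CPUS_PER_TASK:-8}

allocated=$(( ntasks * cpus_per_task ))
needed=$(required_cpus "$ntasks" "$cpus_per_task")

echo "allocated=${allocated} needed=${needed}"
```

Under this assumption the request with pe=8 exactly fills the allocation, and pe=7 leaves 2 CPUs unused, which is why I did not expect an error.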
Any of the following works as a workaround:
mpiexec -n "${SLURM_NTASKS:?}" --cpus-per-proc "${NUMBA_NUM_THREADS:?}" python pi_hybrid.py
mpiexec -n "${SLURM_NTASKS:?}" --bind-to none python pi_hybrid.py
mpiexec -n "${SLURM_NTASKS:?}" --oversubscribe --map-by slot:pe="${NUMBA_NUM_THREADS:?}" python pi_hybrid.py
--cpus-per-proc seems the most obvious to me, but the docs (https://docs.open-mpi.org/en/v5.0.8/man-openmpi/man1/mpirun.1.html#options-old-hard-coded-content-mdash-to-be-audited) mention the following:
deprecated in favor of
--map-by <obj>:PE=n
So I would rather use --map-by, but it only works together with --oversubscribe, which seems counterintuitive. No other <obj> seems to work either; using core, for instance, just quits without writing any output to .out / .err.
Using --bind-to none is also not what I want because this does not ensure that all numba threads are on the same socket.
So what am I missing here?
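To compare what each rank is actually allowed to run on under the different options (my concern with --bind-to none above), a small wrapper like the sketch below could be launched in place of the Python script. It assumes Linux, where /proc/self/status exposes Cpus_allowed_list; `print_affinity` is a hypothetical helper name, and OMPI_COMM_WORLD_RANK is the rank variable Open MPI sets for launched processes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the CPUs this process (and its children) may run on.
# Assumes Linux; reports "unknown" where /proc/self/status is not available.
print_affinity() {
    local cpus
    # awk inherits the affinity mask of this shell, so its own status is representative.
    cpus=$(awk '/^Cpus_allowed_list:/ {print $2}' /proc/self/status 2>/dev/null)
    echo "rank=${OMPI_COMM_WORLD_RANK:-none} cpus=${cpus:-unknown}"
}

print_affinity
```

Saved as an executable script, it can be started with the same mpiexec options as pi_hybrid.py; each rank then prints its allowed CPU list. The list shows CPU IDs rather than sockets, so it has to be matched against the topology reported by a tool such as lscpu.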