mpirun is slow to start #12943

@giraldeau

Description

Starting a simple program with OpenMPI is very slow compared to MPICH (several hundred times slower in the timings below). This is very annoying when developing and running tests.

$ time mpirun.openmpi -n 4 hostname
uqam-TECRA-A50-K
uqam-TECRA-A50-K
uqam-TECRA-A50-K
uqam-TECRA-A50-K

real 0m3,659s
user 0m0,020s
sys 0m0,374s

$ time mpirun.mpich -n 4 hostname
uqam-TECRA-A50-K
uqam-TECRA-A50-K
uqam-TECRA-A50-K
uqam-TECRA-A50-K

real 0m0,005s
user 0m0,001s
sys 0m0,006s

Using strace, I saw that the slowdown is related to PCI bus scanning, which seems to be done by the mca_ess_hnp component. Here is the stack trace:

$ strace -k -o out -e trace=openat mpirun -n 4 hostname
...
 > /usr/lib/x86_64-linux-gnu/libc.so.6(openat64+0x42) [0x11b2e2]
 > /usr/lib/x86_64-linux-gnu/libhwloc.so.15.7.0(hwloc_linux_get_tid_last_cpu_location+0xe38d) [0x4558d]
 > /usr/lib/x86_64-linux-gnu/libhwloc.so.15.7.0(hwloc_linux_get_tid_last_cpu_location+0x63eb) [0x3d5eb]
 > /usr/lib/x86_64-linux-gnu/libhwloc.so.15.7.0(hwloc_linux_get_tid_last_cpu_location+0xda46) [0x44c46]
 > /usr/lib/x86_64-linux-gnu/libhwloc.so.15.7.0(hwloc_topology_load+0xfed) [0x106cd]
 > /usr/lib/x86_64-linux-gnu/libopen-pal.so.40.30.3(opal_hwloc_base_get_topology+0x12a6) [0x7b976]
 > /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_ess_hnp.so() [0x4c56]
 > /usr/lib/x86_64-linux-gnu/libopen-rte.so.40.30.3(orte_init+0x2aa) [0x9815a]
 > /usr/lib/x86_64-linux-gnu/libopen-rte.so.40.30.3(orte_submit_init+0x911) [0x420e1]
 > /usr/bin/orterun() [0x11e8]
 > /usr/lib/x86_64-linux-gnu/libc.so.6(__libc_init_first+0x8a) [0x2a1ca]
 > /usr/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b) [0x2a28b]
 > /usr/bin/orterun() [0x1415]
openat(-1, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 10

The scan even seems to query the same device multiple times:

$ grep "/sys/bus/pci/devices/0000:00:07.1/config" out 
openat(-1, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 10
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(AT_FDCWD, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 12
openat(-1, "/sys/bus/pci/devices/0000:00:07.1/config", O_RDONLY) = 15
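To see how widespread the repetition is across all devices, the strace log can be summarized per config file. This is only a minimal sketch: it assumes the log was written with `strace -o out -e trace=openat ...` as above, and `count_pci_opens` is a hypothetical helper name, not part of any tool.

```shell
# Count how many times each PCI device config file is opened in an strace log.
# Assumption: the log contains openat() lines with the path in double quotes,
# as in the excerpt above. count_pci_opens is a hypothetical helper name.
count_pci_opens() {
    # Extract only the quoted /sys/bus/pci/devices/.../config paths,
    # then count identical paths and list the most frequent first.
    grep -o '"/sys/bus/pci/devices/[^"]*/config"' "$1" \
        | sort | uniq -c | sort -rn
}

# Usage, with the log file produced by the strace command above:
#   count_pci_opens out
```

Each output line is a count followed by the path, so a device that is scanned many times shows up at the top.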

I didn't find any MCA parameter to disable this component. I found that disabling some hwloc components through HWLOC_COMPONENTS (#11783) drastically reduces the startup time, especially pci and opencl. The linux component seems to be essential; without it the launch fails.

export HWLOC_COMPONENTS=-pci,-opencl,-x86,-no_os,-gl
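The same filter can be applied to a single command instead of being exported for the whole shell session, which makes it easier to compare timings with and without it. This is a sketch under assumptions: `run_filtered` is a hypothetical helper name, and the component list is simply the one from the export line above.

```shell
# Run one command with the hwloc component filter set only for that command.
# run_filtered is a hypothetical helper name; the component list is copied
# from the export line above.
run_filtered() {
    HWLOC_COMPONENTS=-pci,-opencl,-x86,-no_os,-gl "$@"
}

# Usage (assumes mpirun is on PATH); compare against a plain
# `time mpirun -n 4 hostname` in the same shell:
#   time run_filtered mpirun -n 4 hostname
```

Since the variable is set only in the environment of the launched command, other programs started from the same shell keep the default hwloc behavior.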

I wonder what the ess hnp component actually does (I can't find any mention of it in the documentation) and how to disable it with an MCA parameter.

Thanks

Background information

OpenMPI version

Simply the standard package from Ubuntu 24.04

$ dpkg -l | grep openmpi-bin
ii  openmpi-bin                                      4.1.6-7ubuntu2                             amd64        high performance message passing library -- binaries

Please describe the system on which you are running

  • Operating system/version: Ubuntu 24.04
  • Computer hardware: generic x86_64 laptop 12th Gen Intel(R) Core(TM) i7-1270P
  • Network type: none, slowdown occurs on localhost
