Closed
Description
Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
I'm using v5.0.8
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
From a git clone of the GitHub repository.
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.
❯ git submodule status
907b1ccaeec61a1197f0ee5264d4fef20b257b84 3rd-party/openpmix (v5.0.8)
222f03fbb98b71abd293aa205b38fa9a38e57965 3rd-party/prrte (v3.0.10-3-g222f03fbb9)
dfff67569fb72dbf8d73a1dcf74d091dad93f71b config/oac (dfff675)
Please describe the system on which you are running
- Operating system/version: Arch Linux
- Computer hardware: 12th Gen Intel(R) Core(TM) i5-12500H
- Network type: WLAN
Details of the problem
I built UCX with this command:
❯ ./autogen.sh && ./configure --prefix=/opt/ucx --enable-mt && make -j12 && make install
Then I built Open MPI with this command:
./autogen.pl && ./configure --prefix=/opt/ompi --with-ucx=/opt/ucx && make -j14 && make install
The resulting installation lists the UCX components:
❯ ompi_info | grep ucx
Configure command line: '--prefix=/opt/ompi' '--with-ucx=/opt/ucx'
MCA osc: ucx (MCA v2.1.0, API v3.0.0, Component v5.0.8)
MCA pml: ucx (MCA v2.1.0, API v2.1.0, Component v5.0.8)
But when I run an MPI application with the UCX PML, I get:
❯ mpirun -np 2 -mca pml ucx ./helloworld
--------------------------------------------------------------------------
No components were able to be opened in the pml framework.
This typically means that either no components of this type were
installed, or none of the installed components can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.
Host: Troy-arch
Framework: pml
--------------------------------------------------------------------------
--------------------------------------------------------------------------
prterun has exited due to process rank 0 with PID 0 on node Troy-arch calling
"abort". This may have caused other processes in the application to be
terminated by signals sent by prterun (as reported here).
--------------------------------------------------------------------------
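The error above does not say why the UCX PML declined to open. One way to narrow it down is to list the devices and transports UCX detects on this machine, then repeat the failing run with component-selection verbosity raised. This is a diagnostic sketch, not a fix. It assumes the install prefixes /opt/ucx and /opt/ompi from the configure lines above, and that the pml_base_verbose and pml_ucx_verbose MCA parameters exist in this build (ompi_info --param pml ucx --level 9 should list them if they do).

```shell
# Diagnostic sketch; both prefixes come from the configure commands above.
UCX_PREFIX=/opt/ucx
OMPI_PREFIX=/opt/ompi

# Verbosity flags for PML selection (assumed to be available in this build).
VERBOSE_FLAGS="--mca pml_base_verbose 100 --mca pml_ucx_verbose 100"

# Show which devices and transports UCX can see on this host.
if [ -x "$UCX_PREFIX/bin/ucx_info" ]; then
    "$UCX_PREFIX/bin/ucx_info" -d | grep -E 'Transport|Device'
fi

# Repeat the failing run with verbose component-selection output.
# VERBOSE_FLAGS is left unquoted on purpose so it splits into separate arguments.
if [ -x "$OMPI_PREFIX/bin/mpirun" ]; then
    "$OMPI_PREFIX/bin/mpirun" -np 2 --mca pml ucx $VERBOSE_FLAGS ./helloworld
fi
```

The verbose output should show whether the ucx component was found and at which step it was rejected, which can then be compared with the transports reported by ucx_info.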
Without UCX, it works:
❯ mpirun -np 2 -mca btl self,sm ./helloworld
Hello from rank 0 of 2 on Troy-arch
Hello from rank 1 of 2 on Troy-arch