Description
For a university project I'm trying to build a Raspberry Pi cluster with Slurm.
I've had quite a few issues trying to run srun with MPI, so I settled on building Open MPI from the git repo with external PMIx, hwloc and libevent for PMIx/Slurm integration.
I'm building Open MPI version 5.1.0a1 on a Raspberry Pi 5 cluster managed with Slurm (the nodes run Raspberry Pi OS Lite).
What I've done so far:
HWLOC (v2.11 git clone)--->
./configure --disable-rsmi --prefix=/hwloc-install-prefix
make
make install
LIBEVENT (latest git clone)--->
./configure --prefix=/libevent-install-prefix
make
make install
OPENPMIX (latest git clone)--->
./configure --with-slurm --with-libevent=/libevent-install-prefix --with-hwloc=/hwloc-install-prefix --prefix=/pmix-install-prefix
make
make install
OPENMPI (latest git clone)--->
./configure --disable-sphinx --with-slurm --with-libevent=/libevent-install-prefix --with-hwloc=/hwloc-install-prefix --with-pmi=/pmix-install-prefix --prefix=/ompi-prefix
make --------> the build fails here
(Note that I'm disabling Sphinx because I haven't yet installed the Python module on the cluster.)
The PMIx configure output correctly indicates Slurm support and the paths to the external libevent and hwloc.
The Open MPI configure output also correctly reports pmi, libevent and hwloc as external.
When I run make for Open MPI, the build fails with this error:
In file included from /clusterfs/apps/openpmix/include/pmix_common.h:2797,
from /clusterfs/apps/openpmix/include/pmix/src/class/pmix_list.h:78,
from /clusterfs/src/ompi/3rd-party/prrte/src/pmix/pmix-internal.h:26,
from prted/pmix/pmix_server_session.c:12:
prted/pmix/pmix_server_session.c: In function 'process_directive':
prted/pmix/pmix_server_session.c:145:50: error: 'PMIX_SESSION_PROVISION' undeclared (first use in this function); did you mean 'PMIX_SESSION_PROVISION_NODES'?
145 | } else if (PMIX_CHECK_KEY(&req->info[n], PMIX_SESSION_PROVISION) ||
| ^~~~~~~~~~~~~~~~~~~~~~
/clusterfs/apps/openpmix/include/pmix_deprecated.h:497:30: note: in definition of macro 'PMIX_CHECK_KEY'
497 | PMIx_Check_key((a)->key, b)
| ^
prted/pmix/pmix_server_session.c:145:50: note: each undeclared identifier is reported only once for each function it appears in
145 | } else if (PMIX_CHECK_KEY(&req->info[n], PMIX_SESSION_PROVISION) ||
| ^~~~~~~~~~~~~~~~~~~~~~
/clusterfs/apps/openpmix/include/pmix_deprecated.h:497:30: note: in definition of macro 'PMIX_CHECK_KEY'
497 | PMIx_Check_key((a)->key, b)
| ^
prted/pmix/pmix_server_session.c: At top level:
prted/pmix/pmix_server_session.c:416:1: fatal error: opening dependency file prted/pmix/.deps/libprrte_la-pmix_server_session.Tpo: Permission denied
416 | }
| ^
compilation terminated.
make[4]: *** [Makefile:1655: prted/pmix/libprrte_la-pmix_server_session.lo] Error 1
make[4]: *** Waiting for unfinished jobs....
make[4]: Leaving directory '/clusterfs/src/ompi/3rd-party/prrte/src'
make[3]: *** [Makefile:1862: all-recursive] Error 1
make[3]: Leaving directory '/clusterfs/src/ompi/3rd-party/prrte/src'
make[2]: *** [Makefile:795: all-recursive] Error 1
make[2]: Leaving directory '/clusterfs/src/ompi/3rd-party/prrte'
make[1]: *** [Makefile:1385: all-recursive] Error 1
make[1]: Leaving directory '/clusterfs/src/ompi/3rd-party'
make: *** [Makefile:1512: all-recursive] Error 1
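Two separate problems show up in the log above: the bundled PRRTE source uses `PMIX_SESSION_PROVISION`, which the installed PMIx headers apparently do not define (the compiler suggests `PMIX_SESSION_PROVISION_NODES` instead), and the compiler could not write into `prted/pmix/.deps` ("Permission denied"). A minimal sketch for checking both, assuming GNU grep and the paths shown in the log; the helper names `header_defines` and `not_owned_by_me` are made up for this example, and ownership is only one possible reason for the permission error:

```shell
#!/bin/sh
# Hypothetical helpers for narrowing down the two errors in the log.

# header_defines DIR NAME
# Succeeds if any *.h file under DIR has a "#define NAME" line
# (exact macro name, so NAME_SUFFIX does not count as a match).
header_defines() {
    grep -rqE --include='*.h' \
        "^[[:space:]]*#[[:space:]]*define[[:space:]]+$2([[:space:]]|\$)" "$1"
}

# not_owned_by_me DIR
# Prints every entry under DIR that the current user does not own,
# e.g. files left behind by an earlier build run as another user.
not_owned_by_me() {
    find "$1" ! -user "$(id -u)"
}

# Paths taken from the compiler output; override via the environment.
PMIX_INC=${PMIX_INC:-/clusterfs/apps/openpmix/include}
PRRTE_SRC=${PRRTE_SRC:-/clusterfs/src/ompi/3rd-party/prrte}

if [ -d "$PMIX_INC" ]; then
    for sym in PMIX_SESSION_PROVISION PMIX_SESSION_PROVISION_NODES; do
        if header_defines "$PMIX_INC" "$sym"; then
            echo "$sym: defined"
        else
            echo "$sym: NOT defined"
        fi
    done
fi

if [ -d "$PRRTE_SRC" ]; then
    # Show at most 20 entries to keep the output readable.
    not_owned_by_me "$PRRTE_SRC" | head -n 20
fi
```

If only the `_NODES` variant turns out to be defined, that would be consistent with the external PMIx checkout and the PRRTE bundled in the Open MPI tree being from mismatched revisions, but that is a guess from the error text rather than something the log confirms.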