diff --git a/docs/installing-open-mpi/configure-cli-options/installation.rst b/docs/installing-open-mpi/configure-cli-options/installation.rst
index 05da73ef917..eed24f7effa 100644
--- a/docs/installing-open-mpi/configure-cli-options/installation.rst
+++ b/docs/installing-open-mpi/configure-cli-options/installation.rst
@@ -174,6 +174,11 @@ be used with ``configure``:
   These two options, along with ``--enable-mca-no-build``, govern the
   behavior of how Open MPI's frameworks and components are built.
 
+  .. tip::
+
+     :ref:`See this section <label-install-packagers-dso-or-not>` for
+     advice to packagers about these CLI options.
+
   The ``--enable-mca-dso`` option specifies which frameworks and/or
   components are built as Dynamic Shared Objects (DSOs).
   Specifically, DSOs are built as "plugins" outside of the core Open
@@ -219,7 +224,7 @@ be used with ``configure``:
   .. note:: As of Open MPI |ompi_ver|, ``configure``'s global default
             is to build all components as static (i.e., part of the
             Open MPI core libraries, not as DSOs). Prior to Open MPI
-            5.0.0, the global default behavior was to build
+            v5.0.0, the global default behavior was to build
             most components as DSOs.
 
 .. important:: If the ``--disable-dlopen`` option is specified, then
@@ -260,11 +265,6 @@ be used with ``configure``:
 
      shell$ ./configure --enable-mca-dso=btl-tcp --enable-mca-static=btl-tcp
 
-  .. tip::
-
-     :ref:`See this section <label-install-packagers-dso-or-not>` for
-     advice to packagers about this CLI option.
-
 * ``--enable-mca-no-build=LIST``: Comma-separated list of
   ``framework-component`` pairs that will not be built. For example,
   ``--enable-mca-no-build=threads-qthreads,pml-monitoring`` will
diff --git a/docs/installing-open-mpi/packagers.rst b/docs/installing-open-mpi/packagers.rst
index e43d52b101a..c0fb13ea23a 100644
--- a/docs/installing-open-mpi/packagers.rst
+++ b/docs/installing-open-mpi/packagers.rst
@@ -80,8 +80,8 @@ running Open MPI's ``configure`` script.
 
 .. _label-install-packagers-dso-or-not:
 
-Components ("plugins"): DSO or no?
-----------------------------------
+Components ("plugins"): static or DSO?
+--------------------------------------
 
 Open MPI contains a large number of components (sometimes called
 "plugins") to effect different types of functionality in MPI. For
@@ -89,6 +89,69 @@ example, some components effect Open MPI's networking functionality:
 they may link against specialized libraries to provide
 highly-optimized network access.
 
+Open MPI can build its components as Dynamic Shared Objects (DSOs) or
+statically include them in its core libraries (regardless of whether
+those libraries are built as shared or static libraries).
+
+.. note:: As of Open MPI |ompi_ver|, ``configure``'s global default is
+          to build all components as static (i.e., part of the Open
+          MPI core libraries, not as DSOs). Prior to Open MPI v5.0.0,
+          the global default behavior was to build most components as
+          DSOs.
+
+Why build components as DSOs?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+There are advantages to building components as DSOs:
+
+* Open MPI's core libraries |mdash| and therefore MPI applications
+  |mdash| will have very few dependencies. For example, if you build
+  Open MPI with support for a specific network stack, the libraries in
+  that network stack will be dependencies of the DSOs, not Open MPI's
+  core libraries (or MPI applications).
+
+* Removing Open MPI functionality that you do not want is as simple as
+  removing a DSO from ``$libdir/openmpi``, as sketched below.
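+
+  For example, dropping a single plugin might look like the following
+  (a sketch only: the plugin directory and the component file name
+  shown are placeholders and vary across Open MPI versions and
+  configurations):
+
+  .. code:: sh
+
+     # List the installed plugins, then remove the one that is not
+     # wanted (hypothetical file name shown)
+     shell$ ls $libdir/openmpi/mca_*.so
+     shell$ rm $libdir/openmpi/mca_btl_tcp.so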
+
+Why build components as part of Open MPI's core libraries?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The biggest advantage to building the components as part of Open MPI's
+core libraries comes when running at (very) large scale with Open MPI
+installed on a network filesystem (vs. being installed on a local
+filesystem).
+
+For example, consider launching a single MPI process on each of 1,000
+nodes. In this scenario, the following is accessed from the network
+filesystem:
+
+#. The MPI application
+#. The core Open MPI libraries and their dependencies (e.g.,
+   ``libmpi``)
+
+   * Depending on your configuration, this is probably on the order of
+     10-20 library files.
+
+#. All DSO component files and their dependencies
+
+   * Depending on your configuration, this can be 200+ component
+     files.
+
+If all components are physically located in the libraries, then the
+third step loads zero DSO component files. When using a networked
+filesystem while launching at scale, this can translate to large
+performance savings.
+
+.. note:: If not using a networked filesystem, or if not launching at
+          scale, loading a large number of DSO files may not consume a
+          noticeable amount of time during MPI process launch. Put
+          simply: loading DSOs as individual files generally only
+          matters when using a networked filesystem while launching at
+          scale.
+
+Direct controls for building components as DSOs or not
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
 Open MPI |ompi_ver| has two ``configure``-time defaults regarding the
 treatment of components that may be of interest to packagers:
 
@@ -135,19 +198,121 @@ using ``--enable-mca-dso`` to selectively build some components as
 DSOs and leave the others included in their respective Open MPI
 libraries.
 
+:ref:`See the section on building accelerator support
+<label-install-packagers-building-accelerator-support-as-dsos>` for a
+practical example where this can be useful.
+
+.. _label-install-packagers-gnu-libtool-dependency-flattening:
+
+GNU Libtool dependency flattening
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When compiling Open MPI's components statically as part of Open MPI's
+core libraries, `GNU Libtool <https://www.gnu.org/software/libtool/>`_
+|mdash| which is used as part of Open MPI's build system |mdash| will
+attempt to "flatten" dependencies.
+
+For example, the :ref:`ompi_info(1) <man1-ompi_info>` command links
+against the Open MPI core library ``libopen-pal``. This library will
+have dependencies on various HPC-class network stack libraries. For
+simplicity, the discussion below assumes that Open MPI was built with
+support for `Libfabric <https://libfabric.org/>`_ and `UCX
+<https://openucx.org/>`_, and therefore ``libopen-pal`` has direct
+dependencies on ``libfabric`` and ``libucx``.
+
+In this scenario, GNU Libtool will automatically attempt to "flatten"
+these dependencies by linking :ref:`ompi_info(1) <man1-ompi_info>`
+directly to ``libfabric`` and ``libucx`` (vs. letting ``libopen-pal``
+pull the dependencies in at run time).
+
+* In some environments (e.g., Ubuntu 22.04), the compiler and/or
+  linker will automatically utilize the linker CLI flag
+  ``-Wl,--as-needed``, which will effectively cause these dependencies
+  to *not* be flattened: :ref:`ompi_info(1) <man1-ompi_info>` will
+  *not* have a direct dependency on either ``libfabric`` or
+  ``libucx``.
+
+* In other environments (e.g., Fedora 38), the compiler and linker
+  will *not* utilize the ``-Wl,--as-needed`` linker CLI flag. As
+  such, :ref:`ompi_info(1) <man1-ompi_info>` will show direct
+  dependencies on ``libfabric`` and ``libucx``.
+
+**Just to be clear:** these flattened dependencies *are not a
+problem*. Open MPI will function correctly with or without the
+flattened dependencies. There is no performance impact associated
+with having |mdash| or not having |mdash| the flattened dependencies.
+We mention this situation here in the documentation simply because it
+surprised some Open MPI downstream package managers to see that
+:ref:`ompi_info(1) <man1-ompi_info>` in Open MPI |ompi_ver| had more
+shared library dependencies than it did in prior Open MPI releases.
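+
+For illustration only, one way to see whether a given build has the
+flattened dependencies is to inspect ``ompi_info`` with ``ldd`` (a
+sketch follows; the installation prefix is a placeholder, and the
+exact set of libraries will differ from system to system):
+
+.. code:: sh
+
+   # If the dependencies were flattened, network stack libraries such
+   # as libfabric appear directly in the output for ompi_info itself;
+   # otherwise they only appear in the output for libopen-pal.
+   shell$ ldd $prefix/bin/ompi_info | grep fabric
+   shell$ ldd $prefix/lib/libopen-pal.so | grep fabric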
+
+If packagers want :ref:`ompi_info(1) <man1-ompi_info>` to not have
+these flattened dependencies, use either of the following mechanisms:
+
+#. Use ``--enable-mca-dso`` to force all components to be built as
+   DSOs (this was actually the default behavior before Open MPI
+   v5.0.0).
+
+#. Add ``LDFLAGS=-Wl,--as-needed`` to the ``configure`` command line
+   when building Open MPI (see the sketch after this list).
+
+   .. note:: The Open MPI community specifically chose not to
+             automatically utilize this linker flag for the following
+             reasons:
+
+             #. Having the flattened dependencies does not cause any
+                correctness or performance problems.
+             #. There are multiple mechanisms (see above) for users or
+                packagers to change this behavior, if desired.
+             #. Certain environments have chosen to have |mdash| or
+                not have |mdash| this flattened dependency behavior.
+                It is not Open MPI's place to override these choices.
+             #. In general, Open MPI's ``configure`` script only
+                utilizes compiler and linker flags if they are
+                *needed*. All other flags should be the user's /
+                packager's choice.
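+
+For example, the second mechanism (adding ``LDFLAGS=-Wl,--as-needed``)
+might look like the following on the ``configure`` command line (a
+sketch only; the installation prefix and any other arguments are
+placeholders):
+
+.. code:: sh
+
+   # Ask the linker to drop dependencies that the executables do not
+   # actually reference, which avoids the flattened dependencies
+   shell$ ./configure LDFLAGS=-Wl,--as-needed --prefix=/opt/openmpi ...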
+
+.. _label-install-packagers-building-accelerator-support-as-dsos:
+
+Building accelerator support as DSOs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you are building a package that includes support for one or more
+accelerators, it may be desirable to build accelerator-related
+components as DSOs (see the :ref:`static or DSO?
+<label-install-packagers-dso-or-not>` section for details).
+
+.. admonition:: Rationale
+   :class: tip
+
+   Accelerator hardware is expensive, and may only be present on some
+   compute nodes in an HPC cluster. Specifically: there may not be
+   any accelerator hardware on "head" or compile nodes in an HPC
+   cluster. As such, invoking Open MPI commands on a "head" node with
+   an MPI that was built with static accelerator support but no
+   accelerator hardware may fail to launch because of run-time linker
+   issues (because the accelerator hardware support libraries are
+   likely not present).
+
+   Building Open MPI's accelerator-related components as DSOs allows
+   Open MPI to *try* opening the accelerator components, but proceed
+   if those DSOs fail to open due to the lack of support libraries.
+
+Using the ``--enable-mca-dso`` command line parameter to Open MPI's
+``configure`` command allows packagers to build all
+accelerator-related components as DSOs. For example:
+
 .. code:: sh
 
-   # Build all the "accelerator" components as DSOs (all other
+   # Build all the accelerator-related components as DSOs (all other
    # components will default to being built in their respective
    # libraries)
-   shell$ ./configure --enable-mca-dso=accelerator ...
-
-This allows packaging ``$libdir`` as part of the "main" Open MPI
-binary package, but then packaging
-``$libdir/openmpi/mca_accelerator_*.so`` as sub-packages. These
-sub-packages may inherit dependencies on the CUDA and/or ROCM
-packages, for example. User can always install the "main" Open MPI
-binary package, and can install the additional "accelerator" Open MPI
-binary sub-package if they actually have accelerator hardware
-installed (which will cause the installation of additional
-dependencies).
+   shell$ ./configure --enable-mca-dso=btl-smcuda,rcache-rgpusm,rcache-gpusm,accelerator
+
+Per the example above, this allows packaging ``$libdir`` as part of
+the "main" Open MPI binary package, but then packaging
+``$libdir/openmpi/mca_accelerator_*.so`` and the other named
+components as sub-packages. These sub-packages may inherit
+dependencies on the CUDA and/or ROCm packages, for example. The
+"main" package can be installed on all nodes, and the
+accelerator-specific sub-package can be installed on only the nodes
+with accelerator hardware and support libraries.
diff --git a/docs/man-openmpi/man1/mpisync.1.rst b/docs/man-openmpi/man1/mpisync.1.rst
index 19c08f3ca25..80195aaaf49 100644
--- a/docs/man-openmpi/man1/mpisync.1.rst
+++ b/docs/man-openmpi/man1/mpisync.1.rst
@@ -1,4 +1,4 @@
-.. _mpisync:
+.. _man1-mpisync:
 
 mpisync
 =======
@@ -6,7 +6,7 @@ mpisync
 
 .. include_body
 
-Open MPI timing tools
+mpisync |mdash| Open MPI timing tools
 
 SYNTAX
 ------
diff --git a/docs/man-openmpi/man1/ompi-wrapper-compiler.1.rst b/docs/man-openmpi/man1/ompi-wrapper-compiler.1.rst
index 6a288b38279..d61ad6085b9 100644
--- a/docs/man-openmpi/man1/ompi-wrapper-compiler.1.rst
+++ b/docs/man-openmpi/man1/ompi-wrapper-compiler.1.rst
@@ -9,7 +9,7 @@ Open MPI Wrapper Compilers
 
 .. include_body
 
-mpicc, mpic++, mpicxx, mpifort, mpijavac -- Open MPI wrapper compilers
+mpicc, mpic++, mpicxx, mpifort, mpijavac |mdash| Open MPI wrapper compilers
 
 SYNTAX
 ------
diff --git a/docs/man-openmpi/man1/ompi_info.1.rst b/docs/man-openmpi/man1/ompi_info.1.rst
index 21313d7bc3b..8b3e11c5761 100644
--- a/docs/man-openmpi/man1/ompi_info.1.rst
+++ b/docs/man-openmpi/man1/ompi_info.1.rst
@@ -6,7 +6,7 @@ ompi_info
 
 .. include_body
 
-ompi_info - Display information about the Open MPI installation
+ompi_info |mdash| Display information about the Open MPI installation
 
 SYNOPSIS
 --------
diff --git a/docs/man-openmpi/man1/opal_wrapper.1.rst b/docs/man-openmpi/man1/opal_wrapper.1.rst
index 80002dbd015..cd4d4513854 100644
--- a/docs/man-openmpi/man1/opal_wrapper.1.rst
+++ b/docs/man-openmpi/man1/opal_wrapper.1.rst
@@ -6,7 +6,7 @@ opal_wrapper
 
 .. include_body
 
-opal_wrapper - Back-end Open MPI wrapper command
+opal_wrapper |mdash| Back-end Open MPI wrapper command
 
 DESCRIPTION
 -----------
diff --git a/docs/news/index.rst b/docs/news/index.rst
index 6e377a4f6fd..c7a2ece2222 100644
--- a/docs/news/index.rst
+++ b/docs/news/index.rst
@@ -10,29 +10,6 @@ This file contains the main features as well as overviews of specific
 bug fixes (and other actions) for each version of Open MPI since
 version 1.0.
 
-.. error:: GP - move elsewhere and refer to software versioning here.
-
-   As more fully described in the "Software Version Number" section in
-   the README file, Open MPI typically releases two separate version
-   series simultaneously. Since these series have different goals and
-   are semi-independent of each other, a single NEWS-worthy item may be
-   introduced into different series at different times. For example,
-   feature F was introduced in the vA.B series at version vA.B.C, and was
-   later introduced into the vX.Y series at vX.Y.Z.
-
-   The first time feature F is released, the item will be listed in the
-   vA.B.C section, denoted as:
-
-   (** also to appear: X.Y.Z) -- indicating that this item is also
-                                 likely to be included in future release
-                                 version vX.Y.Z.
-
-   When vX.Y.Z is later released, the same NEWS-worthy item will also be
-   included in the vX.Y.Z section and be denoted as:
-
-   (** also appeared: A.B.C) -- indicating that this item was previously
-                                included in release version vA.B.C.
-
 :ref:`search`
 
 .. toctree::
diff --git a/docs/news/news-v5.0.x.rst b/docs/news/news-v5.0.x.rst
index d5ebfbc3c19..b6ce469c9cd 100644
--- a/docs/news/news-v5.0.x.rst
+++ b/docs/news/news-v5.0.x.rst
@@ -221,10 +221,19 @@ Open MPI version 5.0.0
   - The default atomics have been changed to be GCC, with C11 as a
     fallback. C11 atomics incurs sequential memory ordering, which in
     most cases is not desired.
+  - The default build mode has changed from building Open MPI's
+    components as Dynamic Shared Objects (DSOs) to being statically
+    included in their respective libraries.
+
+    .. important:: This has consequences for packagers. Be sure to
+                   read the :ref:`GNU Libtool dependency flattening
+                   <label-install-packagers-gnu-libtool-dependency-flattening>`
+                   subsection.
+
   - Various datatype bugfixes and performance improvements.
   - Various pack/unpack bugfixes and performance improvements.
   - Various OSHMEM bugfixes and performance improvements.
-  - Thanks to Jeff Hammond, Pak Lui, Felix Uhl, Naribayashi Akira,
+  - Thanks to Jeff Hammond, Pak Lui, Felix Uhl, Naribayashi Akira,
     Julien Emmanuel, and Yaz Saito for their invaluable contributions.
 
 - Documentation updates and improvements:
@@ -242,9 +251,12 @@ Open MPI version 5.0.0
     directory.
 
   - Many, many people from the Open MPI community contributed to the
-    overall documentation effort |mdash| not only those who are
-    listed in the Git commit logs |mdash| including (but not limited
-    to):
+    overall documentation effort |mdash| not just those who are
+    listed in the Git commit logs. Indeed, many Open MPI core
+    developers contributed their time and effort, as did a fairly
+    large group of non-core developers (e.g., those who participated
+    just to help the documentation revamp), including (but not
+    limited to):
 
     - Lachlan Bell
     - Simon Byrne
@@ -254,7 +266,6 @@ Open MPI version 5.0.0
     - Sophia Fang
     - Rick Gleitz
     - Colton Kammes
-    - Quincey Koziol
     - Robert Langfield
     - Nick Papior
     - Luz Paz
@@ -265,5 +276,3 @@ Open MPI version 5.0.0
     - Fangcong Yin
     - Seth Zegelstein
     - Yixin Zhang
-    - William Zhang
-