6.0.x Feature List

Howard Pritchard edited this page May 19, 2025 · 91 revisions

Time Line

Target date - Q1CY25.

Release Managers

  • Howard Pritchard
  • Edgar Gabriel

List of Features planned for the 6.0.x release stream

MPI 4.0:

  • Big count support
    • API level functions (merged)
    • Collective embiggening (DONE)
    • Changes to datatype engine/combiner support (could be a challenge, but API level function PR works around some of the issues)
    • ROMIO refresh (DONE: decided to remove it instead, and the removal is complete)
    • Embiggen man pages (DONE, probably will do the way MPICH does this if possible, 13179)
    • Embiggen other documentation (which documentation?)
    • Remove hcoll component (no longer built by default; requires an explicit --with-hcoll configure option)
  • MPI_T events (DONE, stub implementation merged into main). 13133
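
As background for the big count items above: MPI 4.0 adds `_c` variants of the API (for example `MPI_Send_c`) whose count arguments are of type `MPI_Count` instead of `int`. The sketch below does not call MPI at all. It uses `int64_t` as a stand-in for `MPI_Count` (an assumption for illustration; the real width is implementation-defined), and the helpers `needs_big_count` and `chunks_needed` are hypothetical names, not Open MPI functions. It only shows when a transfer stops fitting the classic `int` interface and how many `INT_MAX`-sized pieces a manual workaround would need.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stand-in for MPI_Count; assumed to be 64 bits wide for this sketch. */
typedef int64_t count_t;

/* Hypothetical helper: returns 1 when `count` cannot be passed through an
 * int-based interface such as the classic MPI_Send, 0 otherwise. */
static int needs_big_count(count_t count)
{
    assert(count >= 0);
    return count > (count_t)INT_MAX;
}

/* Hypothetical helper: number of pieces needed if the transfer is split
 * into chunks of at most INT_MAX elements (the manual workaround that the
 * big count interfaces make unnecessary). */
static count_t chunks_needed(count_t count)
{
    assert(count >= 0);
    return count / INT_MAX + (count % INT_MAX != 0);
}
```

With a big count interface the caller passes the full count directly, so no chunking logic of this kind is needed in application code.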

MPI 4.1:

  • Memory Kind support: (DONE)
    • Add memory-kind option
    • Return supported memory kinds
  • Other MPI 4.1 items - too embarrassing to think about - see here

MPI 5.0 ABI:

  • If Jake's ABI work is ready, it might help solidify the standard to have our implementation done.
    • Merge ABI work into main, enable it only when requested, and stress in documentation it is experimental.

PRRTE switch Phase 1

  • Resync with upstream PRRTE and decide which branch to use for the 6.0.x branch
  • Documentation Changes (partially DONE UofL)
  • Prefix prte binary names (DONE UofL)
  • Remove --with-prrte configure option from ompi (DONE UofL)
  • Remove unneeded MCA components and frameworks (DONE UofL/rhc54)
  • Need to merge the UofL changes into whatever solution we choose for embedding PRRTE in OMPI for 6.0.x. Note that some UofL changes are in the OMPI source code.

Accelerator support:

  • extended accelerator API functionality (IPC) and conversion of the last components to use accelerator API (DONE for ROCM and CUDA, not ZE).
  • level zero (ze) accelerator component (DONE basic support, IPC not implemented, Howard)
  • support for MPI 4.1 memory kinds info object (DONE)
  • SMSC accelerator (Edgar - DONE, CUDA needs to be tested)
  • Add features to coll accelerator (DONE)
  • Runtime and maybe config time big flag to turn off/on accelerator support (IN PROGRESS Edgar/AMD, PRRTE patches done)

Things to remove:

  • GNI BTL - no longer have access to systems to support this (Howard) (DONE)
  • UDREG Rcache - no longer have access to systems that can use this (Howard) (DONE)
  • FS/PVFS2 and FBTL/PVFS2 - no longer have access to systems to support this (Edgar) (DONE)
  • coll/sm (DONE)
  • Remove TKR version of use mpi module. (Howard) (DONE)
    • This was deferred from 4.0.x in April/May 2018, and deferred again from v5.0.x in October 2018, because it was discovered that:
      1. The RHEL 7.x default gcc (4.8.5) still uses the TKR mpi module
      2. The NAG compiler still uses the TKR mpi module.

Collectives:

  • mca/coll: hierarchical MPI_Alltoall(v), MPI_Gatherv, MPI_Scatterv. (DONE various orgs working on this)
  • might benefit from a json file based parameter file (DONE AWS/Luke)
  • mca/coll: new algorithms (DONE various orgs working on this)

There are quite a few open PRs related to collectives. Can some of these get merged? See the notes from the 2024 F2F Meeting.

Random:

  • Sessions - add support for UCX PML (Howard, 2-3 weeks) (DONE)
  • Sessions - various small fixes (Howard, 1 month) (DONE)
  • Require C11 (DONE)
  • Need fix for LTO (IN PROGRESS)

Likely to miss the 6.0.0 release

  • Phase 2 PRRTE
    • MCA parameters move into ompi namespace.
    • prte_info is gone; move its functionality to ompi_info, perhaps via a prte-mca option?
  • BTL Self accelerator aware (probably defer to later release)
  • What about smart pointers?
  • reduction op (and others) offload support (Joseph estimates 1-2 months to get in)
  • Stream-aware datatype engine.
  • Datatype engine accelerator awareness (e.g. memcpy2d) (George).
  • mca/coll: blocking reduction on accelerator (this is discussed above, Joseph)
  • Atomics - can we just rely on C11 and remove some of this code? We are currently using gcc atomics for performance reasons. Joseph would like to have a wrapper for atomic types and direct load/store access.
  • ZE support for IPC (maybe)
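
To make the atomics item above more concrete, here is a minimal sketch of what a thin wrapper over C11 `<stdatomic.h>` could look like. The type and function names (`sketch_atomic_int32_t`, `sketch_atomic_load`, and so on) are hypothetical and are not part of the Open MPI code base; the choice of relaxed ordering for plain load/store is also an assumption, made because relaxed accesses to an aligned 32-bit value typically compile to ordinary loads and stores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper type around a C11 atomic; not an Open MPI type. */
typedef struct {
    _Atomic int32_t value;
} sketch_atomic_int32_t;

/* Initialize the wrapper before first use (not itself an atomic operation). */
static inline void sketch_atomic_init(sketch_atomic_int32_t *a, int32_t v)
{
    assert(a != NULL);
    atomic_init(&a->value, v);
}

/* "Direct" store: atomic, but with relaxed ordering only. */
static inline void sketch_atomic_store(sketch_atomic_int32_t *a, int32_t v)
{
    atomic_store_explicit(&a->value, v, memory_order_relaxed);
}

/* "Direct" load: atomic, but with relaxed ordering only. */
static inline int32_t sketch_atomic_load(sketch_atomic_int32_t *a)
{
    return atomic_load_explicit(&a->value, memory_order_relaxed);
}

/* Read-modify-write with sequentially consistent ordering;
 * returns the value held before the addition. */
static inline int32_t sketch_atomic_fetch_add(sketch_atomic_int32_t *a,
                                              int32_t delta)
{
    return atomic_fetch_add_explicit(&a->value, delta, memory_order_seq_cst);
}
```

Whether a wrapper like this can match the performance of the gcc atomics currently in use is exactly the open question raised in the item, so this should be read as an illustration of the idea rather than a proposal.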