6.0.x Feature List
Target date - Q1CY25.
- Howard Pritchard
- Edgar Gabriel
- Big count support
- API level functions (merged; a brief usage sketch follows this group)
- Collective embiggening (DONE)
- Changes to datatype engine/combiner support (could be a challenge, but the API-level function PR works around some of the issues)
- ROMIO refresh (DONE; decided to remove it instead, and that has been done)
- Embiggen man pages (DONE; will probably follow MPICH's approach if possible; see 13179)
- Embiggen other documentation (which documentation?)
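As a point of reference for the API-level big-count work, MPI 4.0 defines MPI_Count-typed variants of the communication calls with a `_c` suffix. A minimal sketch of what the embiggened bindings look like to a user (the 2 GiB message size is purely illustrative):

```c
#include <mpi.h>
#include <stdlib.h>

/* Illustrative use of the MPI 4.0 big-count ("_c") bindings:
 * the count argument is an MPI_Count, so it can exceed INT_MAX. */
int main(int argc, char **argv)
{
    int rank;
    MPI_Count n = (MPI_Count)1 << 31;   /* more than 2^31 - 1 elements */
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = malloc((size_t)n);
    if (rank == 0) {
        MPI_Send_c(buf, n, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv_c(buf, n, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    free(buf);

    MPI_Finalize();
    return 0;
}
```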
- Remove hcoll component (drop hcoll from the default build; it now requires an explicit --with-hcoll configure option)
- MPI_T events (DONE; stub implementation merged into main; see 13133). A minimal tool-side sketch follows.
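With only a stub events implementation in place, a tool can at least enumerate the event types the library exposes. A minimal sketch, assuming the MPI 4.0 MPI_T events query routine MPI_T_event_get_num (with a stub it would simply report zero event types):

```c
#include <mpi.h>
#include <stdio.h>

/* Enumerate the MPI_T event types exposed by the library. */
int main(void)
{
    int provided, num_events = 0;

    MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
    MPI_T_event_get_num(&num_events);
    printf("MPI_T event types exposed: %d\n", num_events);
    MPI_T_finalize();
    return 0;
}
```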
- Memory Kind support (DONE; a query sketch follows this group):
- Add memory-kind option
- Return supported memory kinds
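For the memory-kinds items, here is a sketch of the query side, assuming the MPI 4.1 info key name `mpi_memory_alloc_kinds` and that the library reports the supported kinds on the info object of MPI_COMM_WORLD:

```c
#include <mpi.h>
#include <stdio.h>

/* Ask the library which memory allocation kinds it supports
 * (MPI 4.1 "mpi_memory_alloc_kinds" info key). */
int main(int argc, char **argv)
{
    MPI_Info info;
    char kinds[256];
    int buflen = sizeof(kinds), flag = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_info(MPI_COMM_WORLD, &info);
    MPI_Info_get_string(info, "mpi_memory_alloc_kinds", &buflen, kinds, &flag);
    if (flag) {
        printf("supported memory kinds: %s\n", kinds);  /* e.g. "system,mpi,cuda" */
    }
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```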
- Other MPI 4.1 items - too embarrassing to think about - see here
- If Jake's ABI work is ready, it might help solidify the standard to have our implementation done.
- Merge ABI work into main, enable it only when requested, and stress in documentation it is experimental.
- Resync with upstream PRRTE and decide which branch to use for the 6.0.x branch
- Documentation Changes (partially DONE UofL)
- Prefix prte binary names (DONE UofL)
- Remove --with-prrte configure option from ompi (DONE UofL)
- Remove unneeded MCA components and frameworks (DONE UofL/rhc54)
- Need to merge UofL changes into whatever solution we settle on for embedding PRRTE in OMPI for 6.0.x. Note that some UofL changes are in the OMPI source code itself. (DONE: changes are in a new branch in the OMPI fork of PRRTE. TODO: changes for the OMPI code proper)
- extended accelerator API functionality (IPC) and conversion of the last components to use accelerator API (DONE for ROCM and CUDA, not ZE).
- level zero (ze) accelerator component (DONE basic support, IPC not implemented, Howard)
- support for MPI 4.1 memory kinds info object (DONE)
- SMSC accelerator (Edgar; DONE, but CUDA still needs to be tested)
- Add features to coll accelerator (DONE)
- Runtime (and possibly configure-time) flag to turn accelerator support on/off (IN PROGRESS Edgar/AMD; PRRTE patches done)
- GNI BTL - no longer have access to systems to support this (Howard) (DONE)
- UDREG Rcache - no longer have access to systems that can use this (Howard) (DONE)
- FS/PVFS2 and FBTL/PVFS2 - no longer have access to systems to support this (Edgar) (DONE)
- coll/sm (DONE)
- Remove the TKR version of the `use mpi` module. (Howard) (DONE)
- This was deferred from 4.0.x because in April/May 2018 (and then deferred again from v5.0.x in October 2018) it was discovered that the RHEL 7.x default gcc (4.8.5) and the NAG compiler both still used the TKR `mpi` module.
- mca/coll: hierarchical MPI_Alltoall(v), MPI_Gatherv, MPI_Scatterv. (DONE; various orgs worked on this)
- might benefit from a JSON-based parameter file (DONE AWS/Luke)
- mca/coll: new algorithms (DONE; various orgs worked on this)
- new components: xhc, acoll (DONE)
There are quite a few open PRs related to collectives. Can some of these get merged? See the notes from the 2024 F2F meeting.
- Sessions - add support for UCX PML (Howard, 2-3 weeks) (DONE; a minimal Sessions flow is sketched below)
- Sessions - various small fixes (Howard, 1 month) (DONE)
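For context on the Sessions items, the basic MPI 4.0 Sessions flow that the UCX PML now has to support looks roughly like this (error handling elided; the string tag is arbitrary):

```c
#include <mpi.h>

/* Minimal MPI Sessions flow: init a session, derive a group from the
 * predefined "mpi://WORLD" process set, and build a communicator from it. */
int main(void)
{
    MPI_Session session;
    MPI_Group group;
    MPI_Comm comm;

    MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_RETURN, &session);
    MPI_Group_from_session_pset(session, "mpi://WORLD", &group);
    MPI_Comm_create_from_group(group, "example-tag", MPI_INFO_NULL,
                               MPI_ERRORS_RETURN, &comm);

    /* ... communication on comm, e.g. over the UCX PML ... */

    MPI_Comm_free(&comm);
    MPI_Group_free(&group);
    MPI_Session_finalize(&session);
    return 0;
}
```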
- Require C11 (DONE)
- Need fix for LTO (DONE)
- Need to update release notes
- Phase 2 PRRTE
- MCA parameters move into the ompi namespace.
- prte_info is gone; move its functionality into ompi_info, perhaps via a prte-mca option?
- Make the self BTL accelerator-aware (probably defer to a later release)
- What about smart pointers?
- reduction op (and others) offload support (Joseph estimates 1-2 months to get in)
- Stream-aware datatype engine.
- Datatype engine accelerator awareness (e.g., memcpy2d; see the sketch below) (George).
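To illustrate the memcpy2d idea: packing a strided, vector-like datatype that lives in device memory can be done with a single 2D copy instead of a loop of 1D copies, which is the kind of awareness the datatype engine would gain. A sketch using the CUDA runtime (the helper name and layout parameters are illustrative only):

```c
#include <cuda_runtime.h>

/* Pack `count` blocks of `blocklen` bytes, spaced `stride` bytes apart in
 * device memory, into a contiguous device buffer with one cudaMemcpy2D
 * instead of `count` separate copies. */
int pack_strided_on_device(void *packed, const void *src,
                           size_t blocklen, size_t stride, size_t count)
{
    cudaError_t err = cudaMemcpy2D(packed, blocklen,   /* dst, dst pitch */
                                   src, stride,        /* src, src pitch */
                                   blocklen, count,    /* width, height  */
                                   cudaMemcpyDeviceToDevice);
    return (err == cudaSuccess) ? 0 : -1;
}
```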
- mca/coll: blocking reduction on accelerator (this is discussed above, Joseph)
- Atomics - can we just rely on C11 and remove some of this code? We are currently using GCC atomics for performance reasons. Joseph would like a wrapper for atomic types plus direct load/store access; a sketch of that idea follows.
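A minimal sketch of the wrapper idea: hide whether C11 stdatomic or the GCC/Clang builtins back the type, while keeping direct load/store access to the underlying field. The names are illustrative, not the existing opal atomics API:

```c
#include <stdint.h>

/* Illustrative wrapper: one typedef plus inline helpers, so the backend
 * (C11 <stdatomic.h> vs. compiler builtins) can be switched at build time
 * without touching call sites. */
#if defined(EXAMPLE_USE_C11_ATOMICS)
#include <stdatomic.h>
typedef struct { _Atomic int64_t v; } example_atomic_int64_t;

static inline int64_t example_fetch_add(example_atomic_int64_t *a, int64_t x)
{
    return atomic_fetch_add_explicit(&a->v, x, memory_order_relaxed);
}
#else
typedef struct { int64_t v; } example_atomic_int64_t;

static inline int64_t example_fetch_add(example_atomic_int64_t *a, int64_t x)
{
    /* GCC/Clang builtin path, the one currently used for performance. */
    return __atomic_fetch_add(&a->v, x, __ATOMIC_RELAXED);
}
#endif
```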
- ZE support for IPC (maybe)