- Allow for IPC, RO, GDA backends to be selected at runtime
- Added the GDA conduit for different NIC vendors
  - AMD Pensando IONIC
  - Broadcom BNXT_RE (Thor 2)
  - Mellanox MLX5 (IB and RoCE ConnectX-7)
- Added new APIs:
  - `rocshmem_get_device_ctx`
  - `rocshmem_ctx_pe_quiet`
  - `rocshmem_pe_quiet`
- The following APIs have been deprecated:
  - `rocshmem_wg_init`
  - `rocshmem_wg_finalize`
  - `rocshmem_wg_init_thread`
- `rocshmem_ptr` can now return a non-null pointer to a shared memory region when the IPC transport is available to reach that region. Previously, it would return a null pointer.
- `ROCSHMEM_RO_DISABLE_IPC` was renamed to `ROCSHMEM_DISABLE_MIXED_IPC`. This environment variable was not documented in prior releases. It is now documented to inform users who were relying on this undocumented feature.
- rocSHMEM no longer requires rocPRIM and rocThrust as dependencies
- Removed MPI compile-time dependency
- Only a subset of rocSHMEM APIs are implemented for the GDA conduit
- Added the Reverse Offload conduit
- Added new APIs:
  - `rocshmem_ctx_barrier`
  - `rocshmem_ctx_barrier_wave`
  - `rocshmem_ctx_barrier_wg`
  - `rocshmem_barrier_all`
  - `rocshmem_barrier_all_wave`
  - `rocshmem_barrier_all_wg`
  - `rocshmem_ctx_sync`
  - `rocshmem_ctx_sync_wave`
  - `rocshmem_ctx_sync_wg`
  - `rocshmem_sync_all`
  - `rocshmem_sync_all_wave`
  - `rocshmem_sync_all_wg`
  - `rocshmem_init_attr`
  - `rocshmem_get_uniqueid`
  - `rocshmem_set_attr_uniqueid_args`
- Added dlmalloc based allocator
- Added XNACK support
- Added support for initialization with MPI communicators other than `MPI_COMM_WORLD`
- Changed collective APIs to use `_wg` suffix rather than `_wg_` infix
- Resolved segfault in `rocshmem_wg_ctx_create`, now provides nullptr if ctx cannot be created
- Resolved incorrect output for `rocshmem_ctx_my_pe` and `rocshmem_ctx_n_pes`
- Resolved multi-team errors by providing team specific buffers in `rocshmem_ctx_wg_team_sync`
- Resolved missing implementation of `rocshmem_g` for IPC conduit