diff --git a/docs/tuning-apps/networking/cuda.rst b/docs/tuning-apps/accelerators/cuda.rst
similarity index 100%
rename from docs/tuning-apps/networking/cuda.rst
rename to docs/tuning-apps/accelerators/cuda.rst
diff --git a/docs/tuning-apps/accelerators/index.rst b/docs/tuning-apps/accelerators/index.rst
new file mode 100644
index 00000000000..c6b0ecfcd70
--- /dev/null
+++ b/docs/tuning-apps/accelerators/index.rst
@@ -0,0 +1,16 @@
+Accelerator support
+===================
+
+Open MPI supports a variety of different accelerator vendor
+ecosystems. This section provides some generic guidance on tuning MPI
+applications that use device memory, as well as vendor-specific
+options.
+
+
+.. toctree::
+   :maxdepth: 1
+
+   initialize
+   memkind
+   cuda
+   rocm
diff --git a/docs/tuning-apps/accelerators/initialize.rst b/docs/tuning-apps/accelerators/initialize.rst
new file mode 100644
index 00000000000..0bd147f4efb
--- /dev/null
+++ b/docs/tuning-apps/accelerators/initialize.rst
@@ -0,0 +1,39 @@
+Selecting an Accelerator Device before calling MPI_Init
+=======================================================
+
+A common problem when using accelerators arises when selecting which
+GPU should be used by an MPI process. The decision is often based on
+the rank of that process in ``MPI_COMM_WORLD``. However, the rank of a
+process can only be retrieved after the MPI library has been
+initialized. On the other hand, the accelerator resources initialized
+during ``MPI_Init`` can have some associations with the `current`
+device, which is the default device used by a particular ecosystem
+unless set to a different value.
+
+To circumvent this circular dependency, applications are encouraged to
+make use of the environment variable ``OMPI_COMM_WORLD_LOCAL_RANK``,
+which is set by Open MPI at launch time and can be retrieved before
+``MPI_Init``. An example using the HIP programming model looks as
+follows:
+
+.. code-block:: c
+
+    int num_devices;
+    hipGetDeviceCount(&num_devices);
+    assert (num_devices > 0);
+
+    /* Map the node-local rank onto the available devices */
+    char* ompi_local_rank = getenv("OMPI_COMM_WORLD_LOCAL_RANK");
+    if (NULL != ompi_local_rank) {
+        hipSetDevice(atoi(ompi_local_rank) % num_devices);
+    }
+
+    MPI_Init (&argc, &argv);
+    ...
+
+
+.. note:: Open MPI currently assumes that an MPI process uses a
+          single accelerator device. Certain software stacks might be
+          able to support multiple GPUs per rank.
+
+
diff --git a/docs/tuning-apps/accelerators/memkind.rst b/docs/tuning-apps/accelerators/memkind.rst
new file mode 100644
index 00000000000..7567d9f7e82
--- /dev/null
+++ b/docs/tuning-apps/accelerators/memkind.rst
@@ -0,0 +1,64 @@
+Support for Memory-kind Info Objects
+====================================
+
+`MPI version 4.1 `_
+introduced the notion of memory allocation kinds, which allow an
+application to specify what memory types it plans to use, and to query
+what memory types are supported by the MPI library in a portable
+manner. In addition, the application can place restrictions on certain
+objects, such as creating a separate communicator for use with host
+memory and a communicator that will be used with device memory
+only. This approach allows the MPI library to perform certain
+optimizations, such as bypassing memory-type checks on buffer
+pointers. Please refer to the MPI specification as well as the `Memory
+Allocation Kinds Side Document
+`_ for more
+details and examples.
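+
+As an illustration of the query mechanism described above, the
+following sketch reads the reserved ``mpi_memory_alloc_kinds`` info
+key (defined in the side document) from the info object attached to a
+communicator; the value returned by the library describes the memory
+allocation kinds it supports for that communicator:
+
+.. code:: c
+
+   MPI_Info info;
+   char value[256];           /* buffer size chosen arbitrarily */
+   int buflen = sizeof(value);
+   int flag   = 0;
+
+   /* Query the memory allocation kinds supported by the library */
+   MPI_Comm_get_info (MPI_COMM_WORLD, &info);
+   MPI_Info_get_string (info, "mpi_memory_alloc_kinds", &buflen,
+                        value, &flag);
+   if (flag) {
+       printf ("Supported memory allocation kinds: %s\n", value);
+   }
+   MPI_Info_free (&info);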
+
+Open MPI starting from version 6.0.0 supports the following values for the memory allocation kind Info object:
+
+* mpi
+* system
+* cuda:device
+* cuda:host
+* cuda:managed
+* level_zero:device
+* level_zero:host
+* level_zero:shared
+* rocm:device
+* rocm:host
+* rocm:managed
+
+.. note:: Support for accelerator memory allocation kind info objects
+          will depend on the accelerator support compiled into Open
+          MPI.
+
+
+Passing memory-kind info to mpiexec
+===================================
+
+The following example demonstrates how to pass memory allocation kind
+information to Open MPI at application launch:
+
+.. code:: sh
+
+   # Specify that the application will use system, MPI, and CUDA device memory
+   shell$ mpiexec --memory-allocation-kinds system,mpi,cuda:device -n 64 ./
+
+Asserting usage of memory kind when creating a communicator
+============================================================
+
+The following code snippet demonstrates how to assert that a
+communicator will only be used for ROCm device buffers:
+
+.. code:: c
+
+   MPI_Info info_assert;
+   MPI_Info_create (&info_assert);
+   char assert_key[] = "mpi_assert_memory_alloc_kinds";
+   char assert_value[] = "rocm:device";
+   MPI_Info_set (info_assert, assert_key, assert_value);
+
+   MPI_Comm comm_dup;
+   MPI_Comm_dup_with_info (MPI_COMM_WORLD, info_assert, &comm_dup);
+   ...
diff --git a/docs/tuning-apps/accelerators/rocm.rst b/docs/tuning-apps/accelerators/rocm.rst
new file mode 100644
index 00000000000..812734088d3
--- /dev/null
+++ b/docs/tuning-apps/accelerators/rocm.rst
@@ -0,0 +1,269 @@
+ROCm
+====
+
+ROCm is the name of the software stack used by AMD GPUs. It includes
+the ROCm Runtime (ROCr), the HIP programming model, and numerous
+numerical and machine learning libraries tuned for the AMD Instinct
+and Radeon accelerators. More information can be found at the
+following `AMD webpages `_.
+
+
+Building Open MPI with ROCm support
+-----------------------------------
+
+ROCm-aware support means that the MPI library can send and receive
+data from AMD GPU device buffers directly. Starting from Open MPI
+v6.0.0, ROCm support is available directly within Open MPI for
+single-node scenarios, and through UCX or libfabric for multi-node
+scenarios.
+
+
+Compiling Open MPI with ROCm support
+------------------------------------
+
+Compiling Open MPI with ROCm support requires setting the
+``--with-rocm=`` option at configure time:
+
+.. code-block:: sh
+
+   # Configure Open MPI with ROCm support
+   shell$ cd ompi
+   shell$ ./configure --with-rocm=/opt/rocm \
+
+
+
+/////////////////////////////////////////////////////////////////////////
+
+Checking that Open MPI has been built with ROCm support
+-------------------------------------------------------
+
+Verify that Open MPI has been built with ROCm using the
+:ref:`ompi_info(1) ` command:
+
+.. code-block:: sh
+
+   # Use ompi_info to verify ROCm support in Open MPI
+   shell$ ./ompi_info | grep "MPI extensions"
+          MPI extensions: affinity, cuda, ftmpi, rocm
+
+/////////////////////////////////////////////////////////////////////////
+
+Runtime querying of ROCm support in Open MPI
+--------------------------------------------
+
+Querying the availability of ROCm support in Open MPI at runtime is
+possible through the memory allocation kind info object; see the
+:ref:`memkind` page for details.
+
+In addition, starting with Open MPI v5.0.0 :ref:`MPIX_Query_rocm_support(3)
+` is available as an extension to check
+the availability of ROCm support in the library. To use the
+function, the code needs to include ``mpi-ext.h``. Note that
+``mpi-ext.h`` is an Open MPI-specific header file.
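+
+A minimal sketch of such a runtime check is shown below; the
+``OMPI_HAVE_MPI_EXT_ROCM`` guard follows Open MPI's usual
+``OMPI_HAVE_MPI_EXT_<name>`` convention for extensions and should be
+verified against the ``mpi-ext.h`` shipped with your installation:
+
+.. code-block:: c
+
+   #include <stdio.h>
+   #include <mpi.h>
+   #include <mpi-ext.h>   /* Open MPI specific header */
+
+   int rocm_aware = 0;
+   #if defined(OMPI_HAVE_MPI_EXT_ROCM)
+       /* Returns a true (non-zero) value if ROCm support is available */
+       rocm_aware = MPIX_Query_rocm_support();
+   #endif
+   printf ("ROCm-aware support available: %s\n", rocm_aware ? "yes" : "no");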
+
+
+.. _sm-rocm-options-label:
+
+/////////////////////////////////////////////////////////////////////////
+
+Running single node jobs with ROCm support
+------------------------------------------
+
+The user has multiple options for running an Open MPI job with GPU
+support in a single-node scenario:
+
+* The default shared memory component ``btl/sm`` has support for
+  accelerators, but will by default use a bounce buffer on the CPU
+  for data transfers. Hence, while this works, it will not be able to
+  take advantage of the high-speed GPU-to-GPU Infinity Fabric
+  interconnect (if available).
+
+* To use the high-speed GPU-to-GPU interconnect within a node, the
+  user has to enable the accelerator single-copy component
+  (``smsc/accelerator``), e.g.:
+
+.. code-block:: sh
+
+   # Enable the smsc/accelerator component
+   shell$ mpirun --mca smsc_accelerator_priority 80 -n 64 ./
+
+* Alternatively, the user can replace the default shared memory
+  component ``btl/sm`` with the ``btl/smcuda`` component, which has
+  been extended to support ROCm devices. While this approach supports
+  communication over a high-speed GPU-to-GPU interconnect, it does not
+  support single-copy data transfers for host memory through
+  e.g. ``xpmem`` or ``cma``. Hence, the performance of host-memory
+  based data transfers might be lower than with the default ``btl/sm``
+  component. Example:
+
+.. code-block:: sh
+
+   # Use btl/smcuda instead of btl/sm for communication
+   shell$ mpirun --mca btl smcuda,tcp,self -n 64 ./
+
+/////////////////////////////////////////////////////////////////////////
+
+ROCm support in Open MPI with UCX
+---------------------------------
+
+In this configuration, UCX will provide the ROCm support, and hence it
+is important to ensure that UCX itself is built with ROCm support. Both
+inter- and intra-node communication will be executed through UCX.
+
+To see if your UCX library was built with ROCm support, run the
+following command:
+
+.. code-block:: sh
+
+   # Check if ucx was built with ROCm support
+   shell$ ucx_info -v
+
+   # configured with: --with-rocm=/opt/rocm --enable-mt
+
+If you need to build the UCX library yourself to include ROCm support,
+please see the UCX documentation for `building UCX with Open MPI:
+`_
+
+It should look something like:
+
+.. code-block:: sh
+
+   # Configure UCX with ROCm support
+   shell$ cd ucx
+   shell$ ./configure --prefix=/path/to/ucx-rocm-install \
+       --with-rocm=/opt/rocm
+
+   # Configure Open MPI with UCX and ROCm support
+   shell$ cd ompi
+   shell$ ./configure --with-rocm=/opt/rocm \
+       --with-ucx=/path/to/ucx-rocm-install \
+
+
+/////////////////////////////////////////////////////////////////////////
+
+Using ROCm-aware UCX with Open MPI
+----------------------------------
+
+If UCX and Open MPI have been configured with ROCm support, specifying
+the UCX pml component is sufficient to take advantage of the ROCm
+support in the libraries. For example, the command to execute the
+``osu_latency`` benchmark from the `OSU benchmarks
+`_ with ROCm buffers
+using Open MPI and UCX ROCm support is something like this:
+
+.. code-block:: sh
+
+   shell$ mpirun -n 2 --mca pml ucx \
+       ./osu_latency D D
+
+.. note:: Some additional configure flags are required to compile the
+          OSU benchmark to support ROCm buffers. Please refer to the
+          `UCX ROCm instructions
+          `_
+          for details.
+
+/////////////////////////////////////////////////////////////////////////
+
+ROCm support in Open MPI with libfabric
+---------------------------------------
+
+Some network interconnects are supported through the libfabric library.
+Configuring libfabric and Open MPI with ROCm support looks something like:
+
+.. code-block:: sh
+
+   # Configure libfabric with ROCm support
+   shell$ cd libfabric
+   shell$ ./configure --prefix=/path/to/ofi-rocm-install \
+       --with-rocr=/opt/rocm
+
+   # Configure Open MPI with libfabric and ROCm support
+   shell$ cd ompi
+   shell$ ./configure --with-rocm=/opt/rocm \
+       --with-ofi=/path/to/ofi-rocm-install \
+
+
+/////////////////////////////////////////////////////////////////////////
+
+
+Using ROCm-aware libfabric with Open MPI
+----------------------------------------
+
+There are two mechanisms for using libfabric and Open MPI with ROCm
+support:
+
+* Specifying the ``mtl/ofi`` component is sufficient to take advantage
+  of the ROCm support in the libraries. In this case, both intra- and
+  inter-node communication will be performed by the libfabric library.
+  In order to ensure that the application will make use of the shared
+  memory provider for intra-node communication and the
+  interconnect-specific provider for inter-node communication, the
+  user might have to request using the ``linkX`` provider, e.g.:
+
+.. code-block:: sh
+
+   # Force using the ofi mtl component
+   mpirun --mca pml cm --mca mtl ofi \
+       --mca opal_common_ofi_provider_include "shm+cxi:lnx" \
+       -n 64 ./
+
+* Alternatively, the user can use the ``btl/ofi`` component, in which
+  case the intra-node communication will use the Open MPI shared
+  memory mechanisms (see :ref:`sm-rocm-options-label`), and use
+  libfabric only for inter-node scenarios.
+
+.. code-block:: sh
+
+   # Use the ofi btl for inter-node and sm btl
+   # for intra-node communication
+   mpirun --mca pml ob1 --mca btl ofi,sm,tcp,self \
+       --mca smsc_accelerator_priority 80 \
+       -n 64 ./
+
+
+/////////////////////////////////////////////////////////////////////////
+
+Collective component supporting ROCm device memory
+--------------------------------------------------
+
+The ``coll/accelerator`` component supports many commonly used
+collective operations on ROCm device buffers. The component works by
+copying data into a temporary host buffer, executing the collective
+operation on the host buffer, and copying the result back to the
+device buffer at completion. This component will lead to adequate
+performance for short to medium data sizes, but performance is often
+suboptimal, especially for large reduction operations.
+
+The `UCC `_-based collective component
+in Open MPI can be configured and compiled to include ROCm support,
+and will typically lead to significantly better performance for large
+reductions.
+
+An example of configuring UCC and Open MPI with ROCm is shown below:
+
+.. code-block:: sh
+
+   # Configure and compile UCC with ROCm support
+   shell$ cd ucc
+   shell$ ./configure --with-rocm=/opt/rocm \
+       --with-ucx=/path/to/ucx-rocm-install \
+       --prefix=/path/to/ucc-rocm-install
+   shell$ make -j && make install
+
+   # Configure and compile Open MPI with UCX, UCC, and ROCm support
+   shell$ cd ompi
+   shell$ ./configure --with-rocm=/opt/rocm \
+       --with-ucx=/path/to/ucx-rocm-install \
+       --with-ucc=/path/to/ucc-rocm-install
+
+Using the UCC component in an application requires setting some
+additional parameters:
+
+..
code-block:: + + shell$ mpirun --mca pml ucx --mca osc ucx \ + --mca coll_ucc_enable 1 \ + --mca coll_ucc_priority 100 -np 64 ./my_mpi_app + +.. note:: Using the UCC library for collective operations in Open MPI + requires using the UCX library, and hence cannot be deployed + e.g. when using libfabric. diff --git a/docs/tuning-apps/coll-tuned.rst b/docs/tuning-apps/coll-tuned.rst index 1d5549256d8..b71f4d694ef 100644 --- a/docs/tuning-apps/coll-tuned.rst +++ b/docs/tuning-apps/coll-tuned.rst @@ -3,7 +3,7 @@ Tuning Collectives Open MPI's ``coll`` framework provides a number of components implementing collective communication, including: ``han``, ``libnbc``, ``self``, ``ucc`` ``base``, -``sync``, ``xhc``, ``accelerator``, ``basic``, ``ftagree``, ``inter``, ``portals4``, +``sync``, ``xhc``, ``accelerator``, ``basic``, ``ftagree``, ``inter``, ``portals4``, ``acoll``, and ``tuned``. Some of these components may not be available depending on how Open MPI was compiled and what hardware is available on the system. A run-time decision based on each component's self reported priority, selects which diff --git a/docs/tuning-apps/index.rst b/docs/tuning-apps/index.rst index debc86a0e5e..4d77f176e52 100644 --- a/docs/tuning-apps/index.rst +++ b/docs/tuning-apps/index.rst @@ -9,6 +9,7 @@ components that can be tuned to affect behavior at run time. environment-var networking/index + accelerators/index multithreaded dynamic-loading fork-system-popen diff --git a/docs/tuning-apps/mpi-io.rst b/docs/tuning-apps/mpi-io.rst index ddb84d62874..d478536458c 100644 --- a/docs/tuning-apps/mpi-io.rst +++ b/docs/tuning-apps/mpi-io.rst @@ -1,5 +1,5 @@ -Open MPI IO ("OMPIO") -===================== +MPI IO +====== OMPIO is an Open MPI-native implementation of the MPI I/O functions defined in the MPI specification. @@ -23,7 +23,7 @@ OMPIO is fundamentally a component of the ``io`` framework in Open MPI. Upon opening a file, the OMPIO component initializes a number of sub-frameworks and their components, namely: -* ``fs``: responsible for all file management operations +* ``fs``: responsible for all file management operations * ``fbtl``: support for blocking and non-blocking individual I/O operations * ``fcoll``: support for blocking and non-blocking collective I/O @@ -70,8 +70,7 @@ mechanism available in Open MPI to influence a parameter value, e.g.: shell$ mpirun --mca fcoll dynamic -n 64 ./a.out ``fs`` and ``fbtl`` components are typically chosen based on the file -system type utilized (e.g. the ``pvfs2`` component is chosen when the -file is located on an PVFS2/OrangeFS file system, the ``lustre`` +system type utilized (e.g. the ``lustre`` component is chosen for Lustre file systems, etc.). The ``ufs`` ``fs`` component is used if no file system specific component is availabe (e.g. local file systems, NFS, BeefFS, etc.), and the ``posix`` @@ -154,21 +153,11 @@ operation are listed below: Setting stripe size and stripe width on parallel file systems ------------------------------------------------------------- -Many ``fs`` components allow you to manipulate the layout of a new +Some ``fs`` components allow you to manipulate the layout of a new file on a parallel file system. Note, that many file systems only allow changing these setting upon file creation, i.e. modifying these values for an already existing file might not be possible. -#. ``fs_pvfs2_stripe_size``: Sets the number of storage servers for a - new file on a PVFS2/OrangeFS file system. If not set, system default will be - used. 
Note that this parameter can also be set through the - ``stripe_size`` MPI Info value. - -#. ``fs_pvfs2_stripe_width``: Sets the size of an individual block for - a new file on a PVFS2 file system. If not set, system default will - be used. Note that this parameter can also be set through the - ``stripe_width`` MPI Info value. - #. ``fs_lustre_stripe_size``: Sets the number of storage servers for a new file on a Lustre file system. If not set, system default will be used. Note that this parameter can also be set through the @@ -193,6 +182,12 @@ significant influence on the performance of the file I/O operation from device buffers, and can be controlled using the ``io_ompio_pipeline_buffer_size`` MCA parameter. +Furthermore, some collective file I/O components such as +``fcoll/vulcan`` allow the user to influence whether the buffer used +for collective aggregation is located in host or device memory through +the ``io_ompio_use_accelerator_buffers`` MCA parameter. + + .. _label-ompio-individual-sharedfp: Using the ``individual`` ``sharedfp`` component and its limitations diff --git a/docs/tuning-apps/networking/index.rst b/docs/tuning-apps/networking/index.rst index 00aa0f39df5..2be844cb61a 100644 --- a/docs/tuning-apps/networking/index.rst +++ b/docs/tuning-apps/networking/index.rst @@ -24,5 +24,3 @@ build support for that library). shared-memory ib-and-roce iwarp - cuda - rocm diff --git a/docs/tuning-apps/networking/rocm.rst b/docs/tuning-apps/networking/rocm.rst deleted file mode 100644 index 10ee12fe9e2..00000000000 --- a/docs/tuning-apps/networking/rocm.rst +++ /dev/null @@ -1,134 +0,0 @@ -ROCm -==== - -ROCm is the name of the software stack used by AMD GPUs. It includes -the ROCm Runtime (ROCr), the HIP programming model, and numerous -numerical and machine learning libraries tuned for the AMD Instinct -accelerators. More information can be found at the following -`AMD webpages `_ - - -Building Open MPI with ROCm support ------------------------------------ - -ROCm-aware support means that the MPI library can send and receive -data from AMD GPU device buffers directly. As of today, ROCm support -is available through UCX. While other communication transports might -work as well, UCX is the only transport formally supported in Open MPI -|ompi_ver| for ROCm devices. - -Since UCX will be providing the ROCm support, it is important to -ensure that UCX itself is built with ROCm support. - -To see if your UCX library was built with ROCm support, run the -following command: - -.. code-block:: sh - - # Check if ucx was built with ROCm support - shell$ ucx_info -v - - # configured with: --with-rocm=/opt/rocm --without-knem --without-cuda - -If you need to build the UCX library yourself to include ROCm support, -please see the UCX documentation for `building UCX with Open MPI: -`_ - -It should look something like: - -.. code-block:: sh - - # Configure UCX with ROCm support - shell$ cd ucx - shell$ ./configure --prefix=/path/to/ucx-rocm-install \ - --with-rocm=/opt/rocm --without-knem - - # Configure Open MPI with UCX and ROCm support - shell$ cd ompi - shell$ ./configure --with-rocm=/opt/rocm \ - --with-ucx=/path/to/ucx-rocm-install \ - - -///////////////////////////////////////////////////////////////////////// - -Checking that Open MPI has been built with ROCm support -------------------------------------------------------- - -Verify that Open MPI has been built with ROCm using the -:ref:`ompi_info(1) ` command: - -.. 
code-block:: sh - - # Use ompi_info to verify ROCm support in Open MPI - shell$ ./ompi_info | grep "MPI extensions" - MPI extensions: affinity, cuda, ftmpi, rocm - -///////////////////////////////////////////////////////////////////////// - - -Using ROCm-aware UCX with Open MPI --------------------------------------------------------------------------- - -If UCX and Open MPI have been configured with ROCm support, specifying -the UCX pml component is sufficient to take advantage of the ROCm -support in the libraries. For example, the command to execute the -``osu_latency`` benchmark from the `OSU benchmarks -`_ with ROCm buffers -using Open MPI and UCX ROCm support is something like this: - -.. code-block:: - - shell$ mpirun -n 2 --mca pml ucx \ - ./osu_latency D D - -Note: some additional configure flags are required to compile the OSU -benchmark to support ROCm buffers. Please refer to the `UCX ROCm -instructions -`_ -for details. - - -///////////////////////////////////////////////////////////////////////// - -Runtime querying of ROCm support in Open MPI --------------------------------------------- - -Starting with Open MPI v5.0.0 :ref:`MPIX_Query_rocm_support(3) -` is available as an extension to check -the availability of ROCm support in the library. To use the -function, the code needs to include ``mpi-ext.h``. Note that -``mpi-ext.h`` is an Open MPI specific header file. - -///////////////////////////////////////////////////////////////////////// - -Collective component supporting ROCm device memory --------------------------------------------------- - -The `UCC `_ based collective component -in Open MPI can be configured and compiled to include ROCm support. - -An example for configure UCC and Open MPI with ROCm is shown below: - -.. code-block:: - - # Configure and compile UCC with ROCm support - shell$ cd ucc - shell$ ./configure --with-rocm=/opt/rocm \ - --with-ucx=/path/to/ucx-rocm-install \ - --prefix=/path/to/ucc-rocm-install - shell$ make -j && make install - - # Configure and compile Open MPI with UCX, UCC, and ROCm support - shell$ cd ompi - shell$ ./configure --with-rocm=/opt/rocm \ - --with-ucx=/path/to/ucx-rocm-install \ - --with-ucc=/path/to/ucc-rocm-install - -To use the UCC component in an applicatin requires setting some -additional parameters: - -.. code-block:: - - shell$ mpirun --mca pml ucx --mca osc ucx \ - --mca coll_ucc_enable 1 \ - --mca coll_ucc_priority 100 -np 64 ./my_mpi_app diff --git a/docs/tuning-apps/networking/shared-memory.rst b/docs/tuning-apps/networking/shared-memory.rst index 7c40693cd76..0584c554e4f 100644 --- a/docs/tuning-apps/networking/shared-memory.rst +++ b/docs/tuning-apps/networking/shared-memory.rst @@ -13,7 +13,7 @@ can only be used between processes executing on the same node. BTL was named ``vader``. As of Open MPI version 5.0.0, the BTL has been renamed ``sm``. -.. warning:: In Open MPI version 5.0.x, the name ``vader`` is simply +.. warning:: In Open MPI version 6.0.x, the name ``vader`` is simply an alias for the ``sm`` BTL. Similarly, all ``vader_``-prefixed MCA parameters are automatically aliased to their corresponding ``sm_``-prefixed MCA @@ -90,7 +90,7 @@ The ``sm`` BTL supports two modes of shared memory communication: #. **Single copy:** In this mode, the sender or receiver makes a single copy of the message data from the source buffer in one process to the destination buffer in another process. 
Open MPI - supports three flavors of shared memory single-copy transfers: + supports four flavors of shared memory single-copy transfers: * `Linux KNEM `_. This is a standalone Linux kernel module, made specifically for HPC and MPI @@ -118,6 +118,18 @@ The ``sm`` BTL supports two modes of shared memory communication: Open MPI must be built on a Linux system with a recent enough Glibc and kernel version in order to build support for Linux CMA. + * Accelerator IPC mechanism: some accelerator devices support + direct GPU-to-GPU data transfers that can take advantage of + high-speed interconnects between the accelerators. This component + is based on IPC abstractions introduced in the accelerator + framework, which allows the sm btl component to use this + mechanism if requested by the user. For host memory this + component will pass through the operation to another single-copy + component. + + The component is disabled by default. To use this component, the + application has to increase the priority of the component. + Which mechanism is used at run time depends both on how Open MPI was built and how your system is configured. You can check to see which single-copy mechanisms Open MPI was built with via two mechanisms: