
Commit fd365c1

[doc] Brief note about RMM SAM allocator. [skip ci] (dmlc#10712)
1 parent ec3f327 commit fd365c1

File tree: 2 files changed, +22 −1 lines changed


demo/rmm_plugin/README.rst

Lines changed: 17 additions & 1 deletion
@@ -58,4 +58,20 @@ Since with RMM the memory pool is pre-allocated on a specific device, changing the
 device ordinal in XGBoost can result in memory error ``cudaErrorIllegalAddress``. Use the
 ``CUDA_VISIBLE_DEVICES`` environment variable instead of the ``device="cuda:1"`` parameter
 for selecting device. For distributed training, the distributed computing frameworks like
-``dask-cuda`` are responsible for device management.
+``dask-cuda`` are responsible for device management.
+
+************************
+Memory Over-Subscription
+************************
+
+.. warning::
+
+  This feature is still experimental and is under active development.
+
+The newer NVIDIA platforms like `Grace-Hopper
+<https://www.nvidia.com/en-us/data-center/grace-hopper-superchip/>`__ use `NVLink-C2C
+<https://www.nvidia.com/en-us/data-center/nvlink-c2c/>`__, which allows the CPU and GPU to
+have a coherent memory model. Users can use the `SamHeadroomMemoryResource` in the latest
+RMM to utilize system memory for storing data. This can help XGBoost utilize memory from
+the host for GPU computation, but it may reduce performance due to slower CPU memory speed
+and page migration overhead.
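As a rough sketch of how the over-subscription path described in the added section might be wired up (assuming a recent RMM release that exposes ``SamHeadroomMemoryResource`` and an XGBoost build with RMM support; the headroom size and toy data below are placeholders):

.. code-block:: python

   import cupy as cp
   import rmm
   import xgboost as xgb
   from rmm.allocators.cupy import rmm_cupy_allocator

   # Keep roughly 16 GiB of device memory free as headroom; remaining data
   # allocations may be backed by system (host) memory on coherent platforms
   # such as Grace-Hopper. The 16 GiB figure is illustrative only.
   mr = rmm.mr.SamHeadroomMemoryResource(16 * 1024**3)
   rmm.mr.set_current_device_resource(mr)
   cp.cuda.set_allocator(rmm_cupy_allocator)

   # Route XGBoost's own device allocations through RMM as well.
   with xgb.config_context(use_rmm=True):
       X = cp.random.randn(4096, 64)
       y = cp.random.randn(4096)
       xgb.XGBRegressor(device="cuda", tree_method="hist").fit(X, y)

The headroom argument is intended to reserve a slice of device memory for other CUDA allocations, while data buffers can be placed in system memory, which is the performance trade-off the note above calls out.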

doc/gpu/index.rst

Lines changed: 5 additions & 0 deletions
@@ -50,6 +50,11 @@ Multi-node Multi-GPU Training
 
 XGBoost supports fully distributed GPU training using `Dask <https://dask.org/>`_, ``Spark`` and ``PySpark``. For getting started with Dask see our tutorial :doc:`/tutorials/dask` and worked examples :doc:`/python/dask-examples/index`, also Python documentation :ref:`dask_api` for complete reference. For usage with ``Spark`` using Scala see :doc:`/jvm/xgboost4j_spark_gpu_tutorial`. Lastly for distributed GPU training with ``PySpark``, see :doc:`/tutorials/spark_estimator`.
 
+RMM integration
+===============
+
+XGBoost provides optional support for RMM integration. See :doc:`/python/rmm-examples/index` for more info.
+
 
 Memory usage
 ============
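For context, a minimal sketch of the basic RMM integration the added section points to (assuming RMM is installed and XGBoost was built with RMM support; the pool size is a placeholder):

.. code-block:: python

   import rmm
   import xgboost as xgb

   # Create an RMM pool allocator on the current device before any XGBoost
   # objects are constructed; 2 GiB here is only an example size.
   rmm.reinitialize(pool_allocator=True, initial_pool_size=2 * 1024**3)

   # Ask XGBoost to take its device memory from the RMM allocator.
   xgb.set_config(use_rmm=True)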
