uxlfoundation · ethanglaser · Jul 30, 2025 · Sep 26, 2025
@@ -487,6 +487,10 @@ Classification
        - ``criterion`` != `'gini'`
        - ``oob_score`` = `True`
        - ``sample_weight`` != `None`
+
+       **Additional parameters:**
+
+       - ``local_trees_mode`` (bool, default=False): Enables local trees mode for distributed training. In this mode ``n_estimators`` is interpreted per rank: each rank trains its trees in isolation, and the results are then merged into a single model. This mode is experimental but scales better than the default mode. The parameter is specific to the SPMD implementation and is not part of the standard scikit-learn API.
      - Multi-output and sparse data are not supported
    * - :obj:`sklearn.ensemble.ExtraTreesClassifier`
      - All parameters are supported except:
@@ -539,6 +543,10 @@ Regression
        - ``criterion`` != `'mse'`
        - ``oob_score`` = `True`
        - ``sample_weight`` != `None`
+
+       **Additional parameters:**
+
+       - ``local_trees_mode`` (bool, default=False): Enables local trees mode for distributed training. In this mode ``n_estimators`` is interpreted per rank: each rank trains its trees in isolation, and the results are then merged into a single model. This mode is experimental but scales better than the default mode. The parameter is specific to the SPMD implementation and is not part of the standard scikit-learn API.
      - Multi-output and sparse data are not supported
    * - :obj:`sklearn.ensemble.ExtraTreesRegressor`
      - All parameters are supported except:

@@ -73,3 +73,10 @@ times, especially for larger data sets. However, due to the reduced fidelity of
 the data, the resulting model can present worse performance metrics compared to
 a model trained on the original data. In such cases, the number of bins can be
 increased with the ``max_bins`` parameter.
+
+For Random Forest, the ``local_trees_mode`` parameter of the
+``sklearnex.spmd.ensemble`` classes ``RandomForestClassifier`` and
+``RandomForestRegressor`` can also improve performance at large scale. It uses
+an alternative backend that scales better as the number of GPUs grows. The
+default is ``False``; set it to ``True`` to enable this mode. The parameter is
+available only in the ``spmd`` module, which targets multi-GPU use.
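
Because ``n_estimators`` is per rank in local trees mode, a caller who wants to keep the overall forest size roughly constant may need to scale it down by the number of ranks. The sketch below is illustrative only and is not part of this change: ``per_rank_estimators`` and ``build_local_trees_forest`` are hypothetical helper names, and the assumption that the merged model holds roughly ``n_estimators`` times the number of ranks trees is inferred from the description above rather than confirmed by it.

```python
def per_rank_estimators(total_trees: int, n_ranks: int) -> int:
    """Return the per-rank tree count for a target overall forest size.

    Assumption (not confirmed by the docs): with local_trees_mode=True the
    merged model contains about n_estimators * n_ranks trees.
    """
    if total_trees < 1 or n_ranks < 1:
        raise ValueError("total_trees and n_ranks must be positive")
    # Ceiling division, so the merged forest is never smaller than requested.
    return -(-total_trees // n_ranks)


def build_local_trees_forest(total_trees: int, n_ranks: int):
    """Hypothetical helper: construct an SPMD classifier in local trees mode."""
    # Imported lazily because the spmd module needs a multi-GPU SPMD setup.
    from sklearnex.spmd.ensemble import RandomForestClassifier

    return RandomForestClassifier(
        n_estimators=per_rank_estimators(total_trees, n_ranks),
        local_trees_mode=True,  # experimental; available only in the spmd module
    )
```

In an MPI launch, ``n_ranks`` would typically come from the communicator size (for example ``MPI.COMM_WORLD.Get_size()`` with ``mpi4py``), and fitting still requires the usual SPMD inputs such as a device queue.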