doc: local trees parameter documentation #2636

ethanglaser · 2025-07-30T19:10:52Z

Description

Follow-up to #2615 (and uxlfoundation/oneDAL#3139). Adds documentation for the additional ``local_trees_mode`` parameter of the SPMD forest estimators. Open to discussion on the best way to do this, since I don't believe we have any prior examples of documenting such parameters.

Checklist to comply with before moving PR from draft:

PR completeness and readability

- I have reviewed my changes thoroughly before submitting this pull request.
- I have commented my code, particularly in hard-to-understand areas.
- I have updated the documentation to reflect the changes or created a separate PR with update and provided its number in the description, if necessary.
- Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
- I have added a respective label(s) to PR if I have a permission for that.
- I have resolved any merge conflicts that might occur with the base branch.

doc/sources/algorithms.rst

codecov · 2025-07-30T20:01:07Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

| Flag | Coverage Δ | |
|---|---|---|
| azure | `?` | |
| github | `73.19% <ø> (+0.01%)` | ⬆️ |

Flags with carried forward coverage won't be shown.
see 30 files with indirect coverage changes


david-cortes-intel · 2025-07-31T05:53:29Z

@ethanglaser The only section that I'm aware of where extra parameters are documented is here:
https://uxlfoundation.github.io/scikit-learn-intelex/2025.7/guide/acceleration.html#random-forest

The title of the doc section doesn't match at all with the contents, but perhaps you could put it there for now next to the other extra parameters of decision trees, and then later we can revisit the structuring of the docs.

david-cortes-intel · 2025-09-29T08:42:19Z

doc/sources/guide/acceleration.rst

 increased with the ``max_bins`` parameter.
+
+Another parameter that can improve performance at large scale for Random Forest, 
+specifically the ``sklearnex.spmd.ensemble`` ``RandomForestClassifier`` and 


Could use links to the sklearn docs of the classes here, as done elsewhere - e.g. :obj:`sklearn.ensemble.RandomForestClassifier`

david-cortes-intel · 2025-09-29T08:46:00Z

doc/sources/algorithms.rst

+
+       **Additional parameters:**
+
+       - ``local_trees_mode`` (bool, default=False): Enables local trees mode for distributed training. ``n_estimators`` is per rank, with isolated learning occurring on each processor before merging into a single model. This mode is experimental but scales better than default. This parameter is specific to the SPMD implementation and is not present in the standard scikit-learn API.


I'd say this is not very descriptive.

Does it mean that the result has n_estimators*n_ranks trees?

Does the data get moved across ranks, or does each rank use the data that it owns?

Maybe could also refer to them as 'rank/nodes' as otherwise it might not be immediately clear what a 'rank' here refers to.

Ideally we could point to oneDAL docs, where this functionality was implemented. @Alexandr-Solovev can we get this documented in oneDAL?
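
To make the first two questions concrete, here is a minimal pure-Python sketch of one possible reading of the description: each rank trains ``n_estimators`` trees on the data it owns, and the per-rank forests are then concatenated. This reading is an assumption and is not confirmed anywhere in this PR or in the oneDAL docs. The helpers ``train_local_trees`` and ``merge_forests`` are hypothetical and do not correspond to any sklearnex or oneDAL API, and the "trees" are toy stand-ins.

```python
import random
from collections import Counter


def train_local_trees(shard, n_estimators, seed):
    """Toy stand-in for one rank's isolated training (hypothetical helper).

    Each "tree" is only the majority label of a bootstrap sample drawn
    from the rank's own shard, so no data leaves the rank.
    """
    rng = random.Random(seed)
    trees = []
    for _ in range(n_estimators):
        bootstrap = [rng.choice(shard) for _ in shard]
        # most_common(1) returns [(label, count)] for the most frequent label
        trees.append(Counter(bootstrap).most_common(1)[0][0])
    return trees


def merge_forests(per_rank_trees):
    """Concatenate the per-rank tree lists into a single model (hypothetical helper)."""
    return [tree for trees in per_rank_trees for tree in trees]


# Assumed setup: 4 ranks, each owning 50 binary labels, n_estimators=10 per rank.
n_ranks, n_estimators = 4, 10
data_rng = random.Random(0)
shards = [[data_rng.randint(0, 1) for _ in range(50)] for _ in range(n_ranks)]

per_rank = [
    train_local_trees(shard, n_estimators, seed=rank)
    for rank, shard in enumerate(shards)
]
merged = merge_forests(per_rank)

# Under this reading the merged model holds n_estimators * n_ranks trees.
print(len(merged))
```

If this reading is right, the docs should say explicitly that the final model size grows with the number of ranks and that no data is exchanged during training. If it is wrong, the description needs to explain what "merging" actually does.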

doc: local trees parameter documentation

19051b9

ethanglaser added the documentation label Jul 30, 2025

ethanglaser marked this pull request as ready for review July 30, 2025 19:11

ethanglaser requested review from Alexsandruss, Vika-F, david-cortes-intel, icfaust, maria-Petrova, syakov-intel and yuejiaointel as code owners July 30, 2025 19:11

ethanglaser requested a review from Alexandr-Solovev July 30, 2025 19:11

ethanglaser commented Jul 30, 2025

doc/sources/algorithms.rst (outdated)

Update doc/sources/algorithms.rst

1242c79

ethanglaser commented Jul 30, 2025

doc/sources/algorithms.rst (outdated)

Update doc/sources/algorithms.rst

f279089

ethanglaser added 2 commits September 26, 2025 10:49

Merge branch 'main' into dev/eglaser-local-trees-doc

6f0363a

add local_trees_mode details to tuning guide

1d075e5

david-cortes-intel reviewed Sep 29, 2025
