Conversation

ethanglaser
Contributor

@ethanglaser ethanglaser commented Jun 17, 2025

Description

Adds spmd NearestNeighbors to the API, with the necessary modifications to sklearnex/spmd and onedal/spmd, along with minor revisions in onedal4py and onedal itself (uxlfoundation/oneDAL#3262, which is a prerequisite for merging this). A test is also added for validation.

Full list of changes:

  • Support raw inputs for kneighbors function
  • Remove weights from NearestNeighbors class (sklearn does not support this nor does it logically make sense)
  • Add NearestNeighbors to API for sklearnex and onedal spmd modules
  • Enable spmd usage of kneighbors in all knn classes (added storage of queue from fit to use if X is None in kneighbors())
  • Revert incorrect usage of _assert_unordered_allclose for _spmd_assert_allclose in neighbors comparisons in spmd test
  • Add gold and synthetic tests for spmd NearestNeighbors, including a large test that revealed a sycl event issue in oneDAL, which has since been addressed in uxlfoundation/oneDAL#3262 (Support spmd knn search)
  • Added test scope for kneighbors with X=None (this would have failed previously)
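
For readers unfamiliar with the queue-storage item in the list above, here is a minimal sketch of the general pattern, not the code in this PR. The class name `QueueAwareNeighbors`, the `_fit_X` and `_fit_queue` attributes, and the string queues are all hypothetical stand-ins; the real estimators work with SYCL queues and device arrays, and the return value here is only a placeholder for the actual neighbor search.

```python
class QueueAwareNeighbors:
    """Hypothetical stand-in; not the onedal or sklearnex estimator."""

    def fit(self, X, queue=None):
        # Keep the training data and remember which queue it was fitted on.
        self._fit_X = list(X)
        self._fit_queue = queue
        return self

    def kneighbors(self, X=None, queue=None):
        if X is None:
            # No query data was passed, so there is nothing to infer a queue
            # from. Fall back to the training data and the queue seen in fit().
            X = self._fit_X
            if queue is None:
                queue = self._fit_queue
        # Placeholder: a real implementation would run the search on `queue`.
        return X, queue


model = QueueAwareNeighbors().fit([[0.0], [1.0]], queue="gpu:0")
queries, used_queue = model.kneighbors()  # X=None reuses the fit-time queue
```

Without the stored queue, a call with X=None would have no input from which to pick a device, which is presumably why that case needed its own test scope.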

PR completeness and readability

  • I have reviewed my changes thoroughly before submitting this pull request.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with update and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have added a respective label(s) to PR if I have a permission for that.
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least summary table with measured data, if performance change is expected.
  • I have provided justification why performance has changed or why changes are not expected.
  • I have provided justification why quality metrics have changed or why changes are not expected.
  • I have extended benchmarking suite and provided corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

codecov bot commented Jun 17, 2025

Codecov Report

❌ Patch coverage is 50.00000% with 12 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| onedal/spmd/neighbors/neighbors.py | 35.29% | 11 Missing ⚠️ |
| onedal/neighbors/neighbors.py | 80.00% | 0 Missing and 1 partial ⚠️ |

| Flag | Coverage Δ |
|---|---|
| azure | 80.55% <50.00%> (-0.13%) ⬇️ |
| github | 73.17% <80.00%> (?) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
|---|---|
| onedal/spmd/neighbors/__init__.py | 100.00% <100.00%> (ø) |
| onedal/neighbors/neighbors.py | 82.58% <80.00%> (-0.16%) ⬇️ |
| onedal/spmd/neighbors/neighbors.py | 54.71% <35.29%> (-9.18%) ⬇️ |

... and 41 files with indirect coverage changes

@ethanglaser
Contributor Author

ethanglaser commented Jun 17, 2025

@ethanglaser ethanglaser marked this pull request as ready for review September 12, 2025 14:01
@ethanglaser
Contributor Author

ethanglaser commented Sep 26, 2025

Contributor

@icfaust icfaust left a comment

Due to time constraints I have tried to push most of my changes into follow-on PRs. There are some nagging issues we should talk through, but I see no reason to hold off getting this in.

)

# Run each estimator without an input to kneighbors() and ensure functionality and equivalence
for CurrentEstimator in [KNeighborsClassifier, KNeighborsRegressor, NearestNeighbors]:
Contributor

I see why this was done, but it is a bit painful to analyze if there is a failure. Ideally it would be parametrized, but that really isn't possible given the way the estimators are imported. It would be worth adding some sort of message to identify which CurrentEstimator failed (rather than having to dig through the pytest log for the current value of CurrentEstimator).

Contributor Author

Yeah - I am pretty open to ideas on this one. The loop is great because I run the exact same test on all 3 classes, but you are correct that analysis on a fail is trickier. I think scikit-learn may do things like this, I could check how they do it.

Contributor Author

I guess it's easier there because in sklearn they import at the top of the file.
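
One possible way to keep the loop while making failures easier to attribute, sketched here as an illustration rather than what the PR does: run the same check for every class, collect failures, and raise a single error that names each failing class. The helper `run_for_each_estimator`, the check function, and the two toy classes are hypothetical stand-ins so the sketch runs without sklearnex installed.

```python
def run_for_each_estimator(estimators, check):
    """Run check(estimator_cls) for each class and report failures by name."""
    failures = []
    for estimator_cls in estimators:
        try:
            check(estimator_cls)
        except AssertionError as exc:
            # Prefix the original message with the class that triggered it.
            failures.append(f"{estimator_cls.__name__}: {exc}")
    if failures:
        raise AssertionError("kneighbors() check failed for:\n" + "\n".join(failures))


# Hypothetical stand-ins for the three estimator classes.
class GoodEstimator:
    def kneighbors(self):
        return [0.0, 1.0], [0, 1]


class BadEstimator:
    def kneighbors(self):
        return [0.0, 2.0], [0, 1]


def check_kneighbors(estimator_cls):
    dists, indcs = estimator_cls().kneighbors()
    assert dists == [0.0, 1.0], f"unexpected distances {dists}"
    assert indcs == [0, 1], f"unexpected indices {indcs}"


run_for_each_estimator([GoodEstimator], check_kneighbors)  # passes silently
```

Because failures are collected instead of raised immediately, a single run also shows whether all estimators fail or only one, which the plain loop would hide after the first failing assert.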

spmd_dists, spmd_indcs = spmd_model.kneighbors(local_dpt_X_train)
batch_dists, batch_indcs = batch_model.kneighbors(X_train)

tol = 0.005 if dtype == np.float32 else 1e-6
Contributor

Yikes on this float32 setting. Any info on it? Especially because there is a skip associated with it above (meaning an even worse value occurs?)

Contributor Author

It's true, and good observation. It's pretty tricky because the assert-allclose check fails if even a single element is outside the threshold, which is why the tolerance is so loose. It would be nice if there was some way to customize that.

It's possible that we could still run the indices check for this case, but distances are more fragile.

Contributor Author

This is not the only place in the spmd test scope where drastically loosened thresholds are needed for float32 tests to pass, though.
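
As an illustration of the kind of customization discussed in this thread, here is a minimal sketch of a comparison that keeps a tight per-element tolerance but allows a small fraction of elements to miss it. The function `assert_mostly_close` and its `max_mismatch_frac` parameter are hypothetical and not part of the test suite. It uses only the standard library, and note that `math.isclose` applies a symmetric relative tolerance, which differs slightly from NumPy's allclose formula.

```python
import math


def assert_mostly_close(actual, desired, rtol=1e-6, atol=0.0, max_mismatch_frac=0.0):
    """Fail only if more than max_mismatch_frac of the elements are out of tolerance."""
    if len(actual) != len(desired):
        raise AssertionError(f"length mismatch: {len(actual)} vs {len(desired)}")
    # Indices of elements that are not within the per-element tolerance.
    mismatches = [
        i
        for i, (a, d) in enumerate(zip(actual, desired))
        if not math.isclose(a, d, rel_tol=rtol, abs_tol=atol)
    ]
    frac = len(mismatches) / len(actual) if actual else 0.0
    if frac > max_mismatch_frac:
        raise AssertionError(
            f"{len(mismatches)} of {len(actual)} elements differ beyond tolerance "
            f"(first indices: {mismatches[:5]})"
        )


# One of four distances is off; allow up to 25% mismatches at a tight tolerance.
assert_mostly_close([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.1], max_mismatch_frac=0.25)
```

The trade-off is that an allowed mismatch can be arbitrarily large, so a check like this would probably need to be paired with a loose global bound or with the separate indices check mentioned above.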

@ethanglaser
Contributor Author

@ethanglaser ethanglaser merged commit bae4afb into uxlfoundation:main Sep 30, 2025
30 checks passed
Labels

distributed enhancement New feature or request