Skip to content

Conversation

@jwa7
Copy link
Collaborator

@jwa7 jwa7 commented Mar 26, 2024

Attempting to fix #206

Hello!

I have encountered the issue raised by Alex in the above issue, whilst working with the equisolve wrapper for TensorMap-based sample/feature selection (i.e. in https://github.com/lab-cosmo/equisolve/blob/main/src/equisolve/numpy/sample_selection.py and co).

I have adapted Alex's example into a few unit tests for both FPS and CUR sample/feature selection, and attempted to fix it. However, there is something I'm not understanding. While the FPS tests now pass, there are a couple of (different) CUR ones that do not.

With this PR I was hoping to get some feedback/help from the skmatter dev team on this. Thanks! :)

Contributor (creator of PR) checklist

  • Tests updated (for new features and bugfixes)?
  • Documentation updated (for new features)?
  • Issue referenced (for PRs that solve an issue)?

For Reviewer

  • CHANGELOG updated if important change?

📚 Documentation preview 📚: https://scikit-matter--224.org.readthedocs.build/en/224/

@rosecers
Copy link
Collaborator

Are you still looking for feedback on this? Good measure to tag one of us for input.

@jwa7
Copy link
Collaborator Author

jwa7 commented Jun 20, 2024

Hey! This isn't something I'm actively working on at the moment, but I'll be sure to ping you for feedback when I / we get back round to it :)

@mhellstr
Copy link

I had the problem that CUR sample selection would give duplicate selected indices, for example X.selected_idx_ ==
[ 801 936 1308 253 1000 480 183 486 303 977 43 366 734 243 363 88 859 440 798 398 709 263 796
383 214 654 854 508 865 384 413 643 413 643 413 643 413 643 413 643 413 643 413 643 413 643
413 643 413 643 413 643 413 643 413 643 413 643 413 643 413 643 413 643 413 643 413 643 413
643 413 643 413 643 413 643 413 643 413 643 413 643 413 643 413 643 413 643 413 643 413 643
413 643 413 643 413 643 413 643]

I cannot comment on the validity of the code, but this branch solves this problem to instead give unique predictions like this, X.selected_idx_ ==
[ 801 936 1308 253 1000 480 183 486 303 977 43 366 734 243 363 88 859 440 798 398 709 263 796
383 214 654 854 508 865 384 413 643 216 555 408 564 892 716 673 99 107 386 137 55 1 499
368 390 359 218 237 130 530 661 439 311 318 542 669 830 268 208 215 903 418 855 994 664 631
879 199 306 869 258 884 1418 123 700 266 1282 608 519 351 389 5 652 553 257 934 24 1455 270
825 744 1114 470 757 572 1259 15]

@ceriottm
Copy link
Collaborator

Moved over to #265

@ceriottm ceriottm closed this Jul 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Zero scores result in repeated selection and wrong scores at least for FPS

4 participants