[FIX] Avoid slowdowns by caching Domain.__eq__#6764
Merged
janezd merged 5 commits intobiolab:masterfrom Aug 23, 2024
Merged
Conversation
591f299 to
c0b0139
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #6764 +/- ##
=======================================
Coverage 88.26% 88.27%
=======================================
Files 326 326
Lines 71142 71178 +36
=======================================
+ Hits 62797 62830 +33
- Misses 8345 8348 +3
|
88ec566 to
ea23d7f
Compare
975fa5e to
b87741e
Compare
Member
Author
|
This is now ready for review. |
b87741e to
eee8618
Compare
Member
Author
|
This PR is ready for review. :) |
Contributor
|
Your modifications look OK (I like the dict with max 10 items!), and tests pass, so I merged the PR. But I had my fingers tightly crossed: with a change so deep down, there may be problems we haven't thought about. On the other hand, if something was wrong in such important part of the code, it would for sure trip some test (beyond those where the test itself was incorrect). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue
Per Quasars/orange-spectroscopy#713 domain comparison can be so slow that it seems that Orange crashed. They are slow especially when multiple domain transformations are chained and if there are many attributes.
A demo to reproduce the issue follows. This
__eq__implementation follows how we implemented__eq__for PCA and PLS, but there, fortunately, because scikit-learn does not properly implement__eq__s we do not see this issue.What follows is the simplest
SharedComputeValuewith__eq__that ensures that the input domain is the same (as all our domain transformations).On this branch, I see:
While on master I get (each increase in depth makes it 100 times slower):
Description of changes
I implemented caching in
Domain.__eq__. We deem domains read-only and already cache__hash__, so why not this one too?The first commit refactors id-based caching used in
table.pyfor use elsewhere.Includes