[ENH] t-SNE: Add Normalize data checkbox #3570
Merged
lanzagar merged 2 commits into biolab:master on Feb 4, 2019
pavlin-policar commented Feb 1, 2019
    def pca_preprocessing(self):
        if self.pca_data is not None and \
                self.pca_data.X.shape[1] == self.pca_components:
I removed this check because I changed line 146 to invalidate the PCA projection, so it is now set to None whenever the number of components changes.
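To illustrate the idea behind this comment, here is a minimal sketch of invalidating a cached projection when the component count changes. The class `ProjectionCache`, the counter `n_computed`, and the truncation used as a placeholder for the real projection are all hypothetical and are not the widget's code; only the names `pca_data`, `pca_components`, and `pca_preprocessing` come from the diff above.

```python
class ProjectionCache:
    """Hypothetical stand-in for the widget's PCA state (not Orange code)."""

    def __init__(self, pca_components=20):
        self._pca_components = pca_components
        self.pca_data = None   # cached projection; None means "recompute"
        self.n_computed = 0    # illustration only: counts recomputations

    @property
    def pca_components(self):
        return self._pca_components

    @pca_components.setter
    def pca_components(self, value):
        # Invalidate the cache at the point where the setting changes,
        # so readers of the cache no longer need a shape check.
        if value != self._pca_components:
            self._pca_components = value
            self.pca_data = None

    def pca_preprocessing(self, rows):
        if self.pca_data is not None:
            return self.pca_data
        # Placeholder for the real projection: keep the first k values.
        self.pca_data = [row[: self._pca_components] for row in rows]
        self.n_computed += 1
        return self.pca_data
```

With the invalidation done in the setter, a non-None `pca_data` is guaranteed to match the current number of components, which is why the shape comparison becomes redundant.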
5221d78 to ada884e
Codecov Report
@@            Coverage Diff            @@
##           master    #3570    +/-   ##
==========================================
+ Coverage   83.97%   83.98%   +<.01%
==========================================
  Files         370      370
  Lines       66941    66973     +32
==========================================
+ Hits        56215    56244     +29
- Misses      10726    10729      +3
83e30bb to 77a938a
77a938a to 86ed479
lanzagar reviewed Feb 4, 2019
Fine with me.
Issue
Fixes #3448
Description of changes
The widget was still using SVD on sparse data. I have changed this to PCA, because our PCA has supported sparse data since #3532 ([ENH] Implement better randomized PCA).
I basically copied over the functionality from the PCA widget. PCA currently doesn't support normalization on sparse data, so this option is disabled on sparse data, just like in the PCA widget.
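A rough sketch of the behavior described above, under stated assumptions: `prepare_for_pca` and `standardize_columns` are hypothetical helpers, the `is_sparse` flag stands in for whatever sparsity check the widget performs, and "normalize" is assumed here to mean per-column standardization. None of this is the widget's actual code.

```python
from statistics import mean, pstdev


def standardize_columns(rows):
    """Scale every column to zero mean and unit variance (dense data only)."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # Constant columns have zero deviation; divide by 1 to avoid an error.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


def prepare_for_pca(rows, is_sparse, normalize):
    """Return (data, normalize_enabled) for a hypothetical PCA step."""
    if is_sparse:
        # Normalization is unavailable on sparse input, so the option is
        # reported as disabled and the data is passed through untouched.
        return rows, False
    return (standardize_columns(rows) if normalize else rows), True
```

The second return value is what a checkbox could bind its enabled state to, so the user's stored preference is kept while sparse data is loaded.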
Lastly, I increased the error margin on one test. The test embeds a data set using t-SNE, then embeds the same data onto the existing embedding (transform). The transformed points are bound to be jittered around their corresponding original points, and how far they land is also determined by the neighborhoods, so there's no clean way to check for this. But if we visualize what this actually produces, we can see that the result is still correct. Given that the space spans from -20 to 20, increasing the error margin from 1 to 3 shouldn't impact results much.
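For reference, a margin of 3 on an axis that spans 40 units is 7.5% of the extent. The kind of check involved could be sketched as below; `max_displacement` and `assert_embeddings_close` are hypothetical helpers, and measuring the largest per-coordinate difference is an assumption, since the actual test may compare points differently.

```python
def max_displacement(original, transformed):
    """Largest absolute per-coordinate difference between paired points."""
    return max(
        abs(a - b)
        for p, q in zip(original, transformed)
        for a, b in zip(p, q)
    )


def assert_embeddings_close(original, transformed, margin=3.0):
    """Fail if any transformed point strays further than `margin`."""
    worst = max_displacement(original, transformed)
    assert worst <= margin, (
        f"largest displacement {worst:.2f} exceeds margin {margin}"
    )


# Made-up coordinates, for illustration only.
embedding = [(0.0, 0.0), (10.0, -5.0)]
re_embedded = [(1.5, -0.5), (8.0, -4.0)]
assert_embeddings_close(embedding, re_embedded, margin=3.0)
```

With these made-up points the largest displacement is 2.0, which passes at a margin of 3 but would fail at the previous margin of 1.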

Includes