[FIX] Select Rows: Removing Unused Values for Discrete Variables in Sparse Data #2452

Merged
janezd merged 3 commits into biolab:master from nikicc:fix-remove-unused-values-sparse on Jul 21, 2017

@nikicc commented Jul 5, 2017

Issue

When the "remove unused features" option is checked in Select Rows and a discrete variable comes from sparse data, Orange crashes with AttributeError: ravel not found.

The problem is in the remove_unused_values method (Orange/preprocess/remove.py), which cannot handle sparse matrices.

Issue: https://sentry.io/biolab/orange3/issues/269413491/
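
The failure mode described above can be illustrated outside Orange. This is only a minimal sketch, assuming the crash comes from calling ravel on a scipy sparse column; the variable names are hypothetical and this is not the code from remove.py.

```python
import numpy as np
import scipy.sparse as sp

# A discrete column with one missing value, in dense and sparse form.
dense_col = np.array([0.0, 1.0, np.nan, 1.0])
sparse_col = sp.csc_matrix(dense_col.reshape(-1, 1))

# Dense arrays can be flattened directly.
flat = dense_col.ravel()

# scipy sparse matrices do not provide ravel, so the same call raises
# AttributeError.
try:
    sparse_col.ravel()
    failed = False
except AttributeError:
    failed = True
```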

Description of changes
  • Implement the nanunique function, which returns the unique values, excluding missing (np.nan) values, and works on both sparse and dense matrices.
  • Fix remove_unused_values to use nanunique and hence support sparse discrete columns.
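
For readers who want a feel for what such a helper does, here is a rough sketch of a nanunique-like function. It is an illustration under assumptions, not the implementation merged in this PR: the name nanunique_sketch is hypothetical, and the handling of implicit zeros is one plausible choice.

```python
import numpy as np
import scipy.sparse as sp


def nanunique_sketch(x):
    """Return sorted unique non-NaN values of a dense array or sparse matrix."""
    if sp.issparse(x):
        x = x.tocsr()  # normalise the format so that .data and .nnz are available
        values = np.asarray(x.data, dtype=float)  # explicitly stored entries
        # Cells that are not stored are implicit zeros and count as a value.
        if x.nnz < x.shape[0] * x.shape[1]:
            values = np.append(values, 0.0)
    else:
        values = np.asarray(x, dtype=float).ravel()
    values = np.unique(values)
    return values[~np.isnan(values)]  # drop missing values


dense = np.array([[0.0, 2.0], [np.nan, 2.0], [1.0, 0.0]])
sparse = sp.csr_matrix(dense)

# Both calls yield the values 0, 1 and 2; the NaN is dropped.
dense_result = nanunique_sketch(dense)
sparse_result = nanunique_sketch(sparse)
```

The sparse branch never densifies the matrix; it only inspects the stored entries and adds a single zero when at least one cell is not stored.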
Includes
  • Code changes
  • Tests
  • Documentation

@nikicc added the DH2017 and bug (A bug confirmed by the core team) labels Jul 5, 2017
@nikicc force-pushed the fix-remove-unused-values-sparse branch from b68b8c0 to a85dddb July 7, 2017 11:54

codecov-io commented Jul 7, 2017

Codecov Report

Merging #2452 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2452      +/-   ##
==========================================
- Coverage   74.51%   74.49%   -0.02%     
==========================================
  Files         321      321              
  Lines       56056    56055       -1     
==========================================
- Hits        41769    41760       -9     
- Misses      14287    14295       +8

@nikicc force-pushed the fix-remove-unused-values-sparse branch 4 times, most recently from 56aee9e to a2b2e92 July 10, 2017 12:35
@nikicc force-pushed the fix-remove-unused-values-sparse branch from a2b2e92 to 103197d July 18, 2017 14:00
@janezd merged commit ae61acb into biolab:master Jul 21, 2017
@nikicc deleted the fix-remove-unused-values-sparse branch July 21, 2017 11:18
Labels

bug A bug confirmed by the core team

3 participants