[FIX] Continuize: Disable normalizing sparse data #4379
markotoplak merged 1 commit into biolab:master
Codecov Report
@@            Coverage Diff             @@
##           master    #4379      +/-   ##
==========================================
+ Coverage   87.13%   87.14%   +<.01%
==========================================
  Files         399      399
  Lines       72901    72936      +35
==========================================
+ Hits        63521    63557      +36
+ Misses       9380     9379       -1
|
What kind of normalization can we still offer for sparse data, then? Normalization that only divides or multiplies is still possible; we just have to avoid shifts (adding or subtracting), because those turn the implicit zeros into non-zero values. Another thing: I never associated sparse data with discrete values, but yes, why not... Where do we get discrete sparse data in Orange? Is it read directly from a file, or generated by some text-mining widget?
|
🤔 I think that if you have some discrete variables in the corpus before bag of words, they would inevitably get converted to sparse alongside the words.
|
The example is described in the issue and, yes, it does not make sense, but it still should not crash.
|
Yes, but using operations appropriate for sparse data would be better than disabling options. Normalization by span could still be done for sparse data; it just should not be centered.
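To make the point above concrete, here is a minimal sketch of scale-only normalization, assuming the data is a SciPy sparse matrix. It is not the code from this PR or from Orange; `scale_by_span` is a hypothetical helper written for illustration. It divides each column by its span without shifting, so the implicit zeros stay zero, and then shows that centering the same data fills in every cell.

```python
import numpy as np
import scipy.sparse as sp


def scale_by_span(X):
    """Divide each column of a sparse matrix by its span (max - min).

    Hypothetical helper: it only multiplies, never shifts,
    so implicit zeros remain zero and sparsity is preserved.
    """
    X = sp.csc_matrix(X, dtype=float)
    # Sparse min/max take the implicit zeros into account.
    col_max = X.max(axis=0).toarray().ravel()
    col_min = X.min(axis=0).toarray().ravel()
    span = col_max - col_min
    span[span == 0] = 1.0  # leave constant columns unchanged
    # Right-multiplying by a diagonal matrix scales the columns.
    return X @ sp.diags(1.0 / span)


X = sp.csr_matrix(np.array([[0, 2, 0],
                            [4, 0, 0],
                            [0, 6, 3]]))

scaled = scale_by_span(X)

# Centering subtracts the column mean from every cell, including the zeros,
# so the result has to be dense.
dense = X.toarray().astype(float)
centered = dense - dense.mean(axis=0)

print("stored values before scaling:", X.nnz)
print("stored values after scaling: ", scaled.nnz)
print("non-zero cells after centering:", np.count_nonzero(centered))
```

In this toy matrix the scaled result keeps the same number of stored values as the input, while the centered version has no zero cells left, which is why a shift is the part to avoid on sparse data.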
|
Some types of normalization can be done using a Preprocess widget. Should these be added to the Continuize widget as well? |
Issue
Fixes #4378
Description of changes
Disable the Normalize by span and Normalize by standard deviation radio buttons for sparse datasets.
Includes