[ENH] Impute: Allow setting a default value for all numeric and time variables#5102
[ENH] Impute: Allow setting a default value for all numeric and time variables#5102lanzagar merged 5 commits intobiolab:masterfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5102 +/- ##
=======================================
Coverage 84.77% 84.77%
=======================================
Files 286 286
Lines 60074 60130 +56
=======================================
+ Hits 50927 50977 +50
- Misses 9147 9153 +6 |
a129d16 to
d07b43c
Compare
|
Try entering a value with a decimal comma. It breaks the widget. |
|
|
||
| class Default(BaseImputeMethod): | ||
| name = "Value" | ||
| name = "Fixed value" |
There was a problem hiding this comment.
A new class was introduced below with name = "Fixed value". Was this one intentionally changed to the same name too (short names remain different)?
There was a problem hiding this comment.
Yes. It's more informative, looks better in the widget.
Orange/widgets/data/owimpute.py
Outdated
| hlayout.addWidget(button) | ||
|
|
||
| le = gui.lineEdit( | ||
| None, self, "default_numeric", valueType=float, |
There was a problem hiding this comment.
self.default_numeric is a string by default, but this changes it to a float. This is inconsistent and probably helped with the below bug on line 308 (and others?).
There was a problem hiding this comment.
I initially planned to have a float, but then decided for str and forgot to remove this argument.
Orange/widgets/data/owimpute.py
Outdated
| return m | ||
| elif method == Method.Default and not args: # global default values | ||
| return impute.FixedValueByType( | ||
| default_continuous=float(self.default_numeric or np.nan), |
There was a problem hiding this comment.
self.default_numeric can be 0 which evaluates to False so np.nan is used instead!
d07b43c to
b240aae
Compare
|
Not imputing anything: it works for me. Which data did you try and how? Could it be that you tried to impute 0 (the bug discovered by @lanzagar) or you used a wrong delimiter (see next point)? Decimal comma: I now changed the widget to use the current locale. Select Rows does the same. But this is inconsistent: I now need to enter Discrete values: they should come from a combo, I guess. Which values would it contain? For, say, heart disease data? Also, this should then be a context setting that would depend upon whether the selected value is present or not. Using a new value for all discrete variables is also problematic. What if this particular value name already exist for some variables? I found this too messy. |
You mean a bug discovered by me and reported by Mr @lanzagar? :D Yes, it was the case of 0. Discrete values: this is why I deleted a comment, because I realized it is too difficult to implement. So, my use case is usually having N discrete variables with values 0, 1. Some values are missing (sometimes there are just 1, no 0). And I want an easy way to impute in Orange, just fill all the missing values with 0. This is why I asked about discrete and this is why I tested with 0. I was told it is difficult to provide a reasonable solution for discrete. |
|
This data set still breaks impute. Well, I should probably say it breaks Data Table badly and other widgets less badly. Seems like TimeVariable is not formatted properly? |
73e3578 to
52792c2
Compare
52792c2 to
a7abf4f
Compare
a7abf4f to
8286474
Compare
|
I believe this to be fixed now. (It worked at some point, but now it was broken for any file. Shame on me - I apparently changed something and haven't tested it again.) It can be merged now, but has to wait for #5117 (fix for xlsx), because it' s based on it. |
|
I managed to crash something while testing. But for the life of me I can't reproduce now, so... |
Issue
Fixes #5065.
Includes