[FIX] table_from_frame: replace nan with String.Unknown for string variable #5795
Conversation
8b5b505 to 3c317f8
Could you explain the reasoning here in more detail? Doesn't Orange use the empty string OR nan? I'd lean towards using nan wherever possible, and this moves further away from what I'd like to see. :)
3c317f8 to 4d8953d
I added a better explanation of the issue.
I am not sure whether nan is supported by all widgets (at least it does not work in the Orange3-text add-on). I can start working on supporting it there. I changed my implementation so that it uses StringVariable.Unknown instead of a hardcoded "", so it will still work if we decide to use nan instead of "".
4d8953d to 5e57278
5e57278 to fefdb46
fefdb46 to 34a2777
Codecov Report
@@           Coverage Diff           @@
##           master    #5795   +/-   ##
=======================================
  Coverage   86.12%   86.12%
=======================================
  Files         316      316
  Lines       66386    66386
=======================================
  Hits        57173    57173
  Misses       9213     9213
34a2777 to 3e6f5bf
Issue
When a pandas data frame is transformed to a table, string columns that contain nan values
keep them after the transformation, but Orange uses an empty string as the unknown value for StringVariable.
For example:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [["a", "b"], ["c", "d"], ["e", "f"], [np.nan, np.nan]],
)
will be transformed into a table with two string variables, and the nan values will be kept even
though StringVariable.Unknown = "".
When a column is recognized as a string column and has an object dtype, it can still contain some values that are not strings. Cast the column to string.
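The situation described above can be illustrated with pandas and NumPy alone. The following is only a sketch under stated assumptions, not the code from this PR: `UNKNOWN` is a local stand-in for the unknown value (the empty string mentioned above), `to_string_column` is a hypothetical helper name, and the data frame is a variant of the example above with one non-string value (`1`) added to show the cast.

```python
import numpy as np
import pandas as pd

# Local stand-in for the unknown value of a string variable (an empty string,
# as described above); defined here so the sketch runs without Orange.
UNKNOWN = ""


def to_string_column(column: pd.Series, unknown: str = UNKNOWN) -> np.ndarray:
    """Hypothetical helper: cast every value to str, then mark missing ones as `unknown`."""
    # Remember which entries were missing before casting.
    missing = column.isna().to_numpy()
    # Cast each value explicitly so non-string objects (e.g. 1) become strings.
    values = np.array([str(value) for value in column], dtype=object)
    # Overwrite the entries that were nan with the unknown marker.
    values[missing] = unknown
    return values


# Variant of the example above: the second column holds a non-string value.
df = pd.DataFrame([["a", "b"], ["c", 1], ["e", "f"], [np.nan, np.nan]])
converted = {name: to_string_column(df[name]) for name in df.columns}
```

Recording the missing-value mask before the cast matters here, because the explicit `str(...)` call would otherwise leave the text `"nan"` in place of the missing entries.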
Description of changes
nan values are now replaced with StringVariable.Unknown in columns that will be transformed to string variables, and the remaining values are cast to strings.
Includes