Commit 2ef7b8d

Fix NaN handling for strings (ref biolab#6670)
In fixing this, switched string handling from fixed-length to variable-length strings; see https://docs.h5py.org/en/stable/special.html#variable-length-strings
1 parent 9a52609 commit 2ef7b8d
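
The switch described in the commit message can be sketched outside Orange roughly as follows. This is a minimal illustration under stated assumptions, not the code in Orange/data/io.py: the helper names `to_vlen_strings` and `roundtrip`, the file name `sketch.h5`, and the sample data are all hypothetical, and reading back with `asstr()` assumes h5py 3.x. It uses a plain Python check for missing values instead of pandas.

```python
import h5py
import numpy as np


def to_vlen_strings(column):
    """Return an object array of str, with missing values (None/NaN) as ""."""
    out = np.empty(len(column), dtype=object)
    for k, value in enumerate(column):
        missing = value is None or (isinstance(value, float) and np.isnan(value))
        out[k] = "" if missing else str(value)
    return out


def roundtrip(column):
    """Write the column as a variable-length string dataset and read it back."""
    str_dtype = h5py.string_dtype()  # variable-length UTF-8 strings
    # In-memory file (hypothetical name), so nothing is written to disk.
    with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
        f.create_dataset("metas/0", data=to_vlen_strings(column), dtype=str_dtype)
        return list(f["metas/0"].asstr()[:])


metas = np.array(["apple", np.nan, "kiwi"], dtype=object)

# Fixed-length cast, as before the commit: NaN goes through str(),
# so it ends up stored as the text "nan".
fixed = metas.astype("S")
print(fixed)

# Variable-length strings with missing values blanked out first.
print(roundtrip(metas))
```

The point of the sketch is the ordering: missing values are replaced before the data reaches `create_dataset`, so no NaN object has to be encoded as a string.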

1 file changed, +5 -2 lines changed

Orange/data/io.py

Lines changed: 5 additions & 2 deletions
@@ -21,6 +21,7 @@
 
 import h5py
 import numpy as np
+import pandas as pd
 
 import xlrd
 import xlsxwriter
@@ -601,6 +602,8 @@ def parse(attr):
 f.create_dataset("Y", data=data.Y)
 if data.metas.size:
     for i, attr in enumerate(data.domain.metas):
-        col_type = 'S' if isinstance(attr, StringVariable) else 'f'
+        col_type = h5py.string_dtype() if isinstance(attr, StringVariable) else 'f'
         col_data = data.metas[:, [i]].astype(col_type)
-        f.create_dataset(f'metas/{i}', data=col_data)
+        if col_type is not 'f':
+            col_data[pd.isnull(col_data)] = ""
+        f.create_dataset(f'metas/{i}', data=col_data, dtype=col_type)
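
One detail in the added lines may be worth a second look: `col_type is not 'f'` compares object identity rather than value, and recent Python versions emit a SyntaxWarning for `is not` against a literal. A possible alternative, sketched here with a hypothetical helper `prepare_meta_column` that is not part of Orange, is to branch on an explicit boolean. The sketch uses only NumPy and an elementwise missing-value check, which is an assumption made for self-containment; the commit itself uses `pd.isnull`.

```python
import numpy as np


def prepare_meta_column(column, is_string):
    """Return (array, kind) ready for writing; `kind` is "str" or "float".

    Hypothetical helper: it branches on an explicit boolean instead of
    testing `col_type is not 'f'`.
    """
    if not is_string:
        return np.asarray(column, dtype="f"), "float"
    col = np.array(column, dtype=object)  # copy, so the caller's data is untouched
    # NaN is the only value not equal to itself; None is checked separately.
    missing = np.array([v is None or v != v for v in col.ravel()], dtype=bool)
    col[missing.reshape(col.shape)] = ""
    return col, "str"


# Column-shaped inputs, like data.metas[:, [i]] in the diff.
strings, kind = prepare_meta_column([["a"], [np.nan], [None]], is_string=True)
numbers, _ = prepare_meta_column([[1.0], [np.nan]], is_string=False)
print(strings.tolist(), kind, numbers.dtype)
```

In the loop from the diff, the flag would presumably come from `isinstance(attr, StringVariable)`, computed once and reused for both the dtype choice and the NaN replacement.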
