Commit 82d8ce0

Stats: Fix counting nans for non-numeric data
stats returned negative NaN counts for non-numeric data (metas), so for metas the table widget always showed 'no missing values'. The cause was operator precedence: the boolean array has to be negated first and summed afterwards. Without parentheses, the .sum() call bound more tightly than ~, so the array was summed first and ~ was then applied to the integer column sums. On integers, ~x evaluates to -(x+1), which led to the negative values.
1 parent 9730c7a commit 82d8ce0
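
The precedence issue described in the commit message can be reproduced outside Orange. The following is a minimal sketch that uses only NumPy and the same object array as the new test; the variable names `present`, `buggy`, and `fixed` are illustrative and do not appear in the commit.

```python
import numpy as np

# Same non-numeric data as in test_stats_non_numeric:
# each column contains exactly one empty (missing) string.
X = np.array([
    ['', 'a', 'b'],
    ['a', '', 'b'],
    ['a', 'b', ''],
], dtype=object)

# For object arrays, astype(bool) uses Python truthiness: '' -> False.
present = X.astype(bool)

# Old expression: .sum() runs first, then ~ is applied to the integer
# column sums, giving -(x + 1) for each sum x.
buggy = ~present.sum(axis=0)

# Fixed expression: negate the boolean mask first, then count the Trues.
fixed = (~present).sum(axis=0)

print(buggy)  # [-3 -3 -3]
print(fixed)  # [1 1 1]
```

Each column has two non-empty entries, so the old expression yields ~2 = -3, while the fixed one counts the single missing entry per column.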

2 files changed: 11 additions, 1 deletion

Orange/statistics/util.py

Lines changed: 1 addition & 1 deletion
@@ -195,7 +195,7 @@ def stats(X, weights=None, compute_variance=False):
             X.shape[1] - non_zero,
             non_zero))
     else:
-        nans = ~X.astype(bool).sum(axis=0) if X.size else np.zeros(X.shape[1])
+        nans = (~X.astype(bool)).sum(axis=0) if X.size else np.zeros(X.shape[1])
         return np.column_stack((
             np.tile(np.inf, X.shape[1]),
             np.tile(-np.inf, X.shape[1]),
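
The hunk shows only the beginning of the non-numeric branch. As a rough sketch of what that branch appears to compute, the standalone function below mirrors the fixed line and fills in the remaining columns. The function name `non_numeric_stats` is hypothetical, and the column layout (min, max, mean, variance, missing count, non-missing count) is an assumption inferred from the expected rows in `test_stats_non_numeric`, not copied from the Orange source.

```python
import numpy as np


def non_numeric_stats(X):
    """Hypothetical stand-in for the non-numeric branch of stats()."""
    n_rows, n_cols = X.shape
    # Falsy entries (such as empty strings) count as missing.
    # Negate the boolean mask before summing; for an empty array
    # fall back to zero counts, as the original line does.
    nans = (~X.astype(bool)).sum(axis=0) if X.size else np.zeros(n_cols)
    return np.column_stack((
        np.tile(np.inf, n_cols),   # min is undefined for non-numeric data
        np.tile(-np.inf, n_cols),  # max is undefined for non-numeric data
        np.zeros(n_cols),          # assumed placeholder for the mean
        np.zeros(n_cols),          # assumed placeholder for the variance
        nans,                      # missing values per column
        n_rows - nans,             # non-missing values per column
    ))


X = np.array([['', 'a'], ['', 'b'], ['c', 'd']], dtype=object)
result = non_numeric_stats(X)

empty = np.empty((0, 3), dtype=object)
empty_result = non_numeric_stats(empty)
```

Here the first column has two missing entries and the second has none, so the last two columns of `result` are `[2, 1]` and `[0, 3]`. The empty input takes the `np.zeros` fallback and reports zero missing values for each of its three columns.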

Orange/tests/test_statistics.py

Lines changed: 10 additions & 0 deletions
@@ -69,3 +69,13 @@ def test_stats_weights_sparse(self):
         weights = np.array([1, 3])
         np.testing.assert_equal(stats(X, weights), [[0, 2, 1.5, 0, 1, 1],
                                                     [1, 3, 2.5, 0, 0, 2]])
+
+    def test_stats_non_numeric(self):
+        X = np.array([
+            ['', 'a', 'b'],
+            ['a', '', 'b'],
+            ['a', 'b', ''],
+        ], dtype=object)
+        np.testing.assert_equal(stats(X), [[np.inf, -np.inf, 0, 0, 1, 2],
+                                           [np.inf, -np.inf, 0, 0, 1, 2],
+                                           [np.inf, -np.inf, 0, 0, 1, 2]])
