Skip to content

[MNT] Fix warnings in widget tests#3502

Merged
lanzagar merged 40 commits intobiolab:masterfrom
janezd:fix-test-warnings
Jan 18, 2019
Merged

[MNT] Fix warnings in widget tests#3502
lanzagar merged 40 commits intobiolab:masterfrom
janezd:fix-test-warnings

Conversation

@janezd
Copy link
Contributor

@janezd janezd commented Dec 24, 2018

Issue

Logs are polluted with warnings. Some are relevant and are fixed, and some are expected by tests and should be silenced. Logs are also printed out instead of being tested.

Description of changes
Includes
  • Code changes

@janezd janezd changed the title [MNT] Fix warnings in widget tests [WIP] [MNT] Fix warnings in widget tests Dec 24, 2018
@janezd janezd force-pushed the fix-test-warnings branch 4 times, most recently from d14f153 to d52122c Compare December 25, 2018 13:43
@codecov
Copy link

codecov bot commented Dec 25, 2018

Codecov Report

Merging #3502 into master will increase coverage by <.01%.
The diff coverage is 95.55%.

@@            Coverage Diff             @@
##           master    #3502      +/-   ##
==========================================
+ Coverage   83.67%   83.68%   +<.01%     
==========================================
  Files         370      370              
  Lines       66176    66293     +117     
==========================================
+ Hits        55372    55476     +104     
- Misses      10804    10817      +13

@codecov
Copy link

codecov bot commented Dec 25, 2018

Codecov Report

Merging #3502 into master will decrease coverage by 0.04%.
The diff coverage is 99.1%.

@@            Coverage Diff             @@
##           master    #3502      +/-   ##
==========================================
- Coverage   83.59%   83.55%   -0.05%     
==========================================
  Files         367      367              
  Lines       65570    65751     +181     
==========================================
+ Hits        54814    54935     +121     
- Misses      10756    10816      +60

@janezd janezd force-pushed the fix-test-warnings branch 4 times, most recently from 515bdc6 to 399bc33 Compare December 25, 2018 22:11
@janezd janezd changed the title [WIP] [MNT] Fix warnings in widget tests [MNT] Fix warnings in widget tests Dec 25, 2018
@janezd janezd force-pushed the fix-test-warnings branch 6 times, most recently from 612c6a8 to c5de535 Compare December 26, 2018 13:25
def test_report_widgets_evaluate(self):
rep = OWReport.get_instance()
data = Table("zoo")
data = Table("titanic")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is changed to avoid warning about <4 instances in each fold of CV.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zoo has a class value that appears only 4 times. If CV is run with k=3, everything should be fine.
Running CV with less folds would probably be good practice even when not needed, since it is just as good in almost all tests and 3x faster.
Changing to titanic instead is probably quite a slowdown since it has 20x as many instances.
(not sure this makes any difference here)

@janezd janezd force-pushed the fix-test-warnings branch from c85d987 to 26d3022 Compare January 11, 2019 15:20
def test_report_widgets_evaluate(self):
rep = OWReport.get_instance()
data = Table("zoo")
data = Table("titanic")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zoo has a class value that appears only 4 times. If CV is run with k=3, everything should be fine.
Running CV with less folds would probably be good practice even when not needed, since it is just as good in almost all tests and 3x faster.
Changing to titanic instead is probably quite a slowdown since it has 20x as many instances.
(not sure this makes any difference here)

@janezd janezd force-pushed the fix-test-warnings branch from 19a4816 to 583e87a Compare January 11, 2019 20:44
is_enabled = self.data is not None and \
not self.data.is_sparse() and \
len(self.xy_model) > 2 and len(self.data[self.valid_data]) > 1 \
and np.all(np.nan_to_num(np.nanstd(self.data.X, 0)) != 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, I am starting to dislike filterwarnings more and more.
It may be the simplest solution and acceptable in some tests, but in actual code I think in most cases it points to some problems that we should either fix or explicitly check for and not trigger warnings. Filtering them out just makes it too easy to end up with bad code and make it harder to notice problems.

This line specifically is really interesting to take a closer look at.
Why would we trigger warnings in nanstd when the data is all-nan, just to change those nans to zeros and check that there are no zeros (together with checking that no std is zero). Instead of adding 4 lines to filter these warnings (in the whole function), we can add 1 line with an explicit check that no column should be all-nan before computing std and make it easier to read too.

And then looking at that line, we ask ourselves - why should a single feature with std=0 or all-nan prevent vizrank to be used at all? It would still be useful to score other, normal features.
Also, the check tests the whole X, which is now wrong for 2 reasons: meta features have been added to the list and should be treated as other features; categorical vars are no longer used in scatter plot and should be excluded from this check

Conclusion: I think we should remove this change and fix this in a separate PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I removed this from the commit.

(self.observed - self.expected) / np.sqrt(self.expected)
with np.errstate(divide="ignore", invalid="ignore"):
self.residuals = \
(self.observed - self.expected) / np.sqrt(self.expected)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not divide by zero instead of ignoring warnings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's a vector. I could create a mask, but it would be annoying. It's simpler this way.

janezd and others added 22 commits January 17, 2019 19:41
…t; replace zoo with titanic to ensure enough instances in each class
@janezd janezd force-pushed the fix-test-warnings branch from 5f5847a to 23d674d Compare January 17, 2019 20:23
def _nan_min_max(x, func, axis=0):
if not sp.issparse(x):
return func(x, axis=axis)
with warnings.catch_warnings():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This currently still changes the behaviour of nanmin/nanmax compared to numpy's version.
I thought we agreed that for numpy arrays our functions should be equivalent to np.nanmin/max

@lanzagar lanzagar merged commit 0637c96 into biolab:master Jan 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants