[ENH] ROC analysis: show thresholds by matejklemen · Pull Request #3172 · biolab/orange3

matejklemen · 2018-07-30T23:45:47Z

Issue

Implements #3057.

Description of changes

When the mouse hovers over a point on the ROC curve, the threshold used to compute that point is now shown. If the threshold value is nan, nothing is shown. Displaying thresholds is not yet implemented when showing individual curves.
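
The nan rule described above can be sketched as a small helper. This is only an illustration, not the widget's code: `format_threshold` is a hypothetical name, and the three-decimal format is borrowed from the `"{:.3f}"` call shown in the review diff of this PR.

```python
import math


def format_threshold(thresh):
    """Return tooltip text for a threshold, or None when it is nan."""
    if math.isnan(thresh):
        return None  # nothing is shown for undefined thresholds
    return "{:.3f}".format(thresh)


print(format_threshold(0.4))           # 0.400
print(format_threshold(float("nan")))  # None
```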

My questions:

currently thresholds are displayed when mouse goes 'perfectly' over a point - should this be changed so that they get displayed if mouse is in some very small radius of the point?
what kind of a test do I write for this functionality (i. e. mouse events) - if any?

General comments are also welcome.

Includes

Code changes
Tests
Documentation

ales-erjavec

If I enable "Show convex ROC curves" and hover over the graph I get:

-------------------------- AttributeError Exception ---------------------------
Traceback (most recent call last):
  File "/Users/aleserjavec/workspace/orange3/Orange/widgets/evaluate/owrocanalysis.py", line 614, in _on_mouse_moved
    pts = sp.pointsAt(act_pos)
AttributeError: 'PlotCurveItem' object has no attribute 'pointsAt'
-------------------------------------------------------------------------------

ales-erjavec · 2018-07-31T06:34:25Z

Orange/widgets/evaluate/owrocanalysis.py

                graphics = curve.avg_threshold()
                self.plot.addItem(graphics.curve_item)
                self.plot.addItem(graphics.confint_item)
+                graphics.curve_item.scene().sigMouseMoved.connect(self._on_mouse_moved)


Connect the signal once in the __init__ method:
self.plotview.scene().sigMouseMoved.connect(self._on_mouse_moved)

ales-erjavec · 2018-07-31T06:39:40Z

Orange/widgets/evaluate/owrocanalysis.py

+
+                thresh = curve_pts.thresholds[idx_closest]
+                if not numpy.isnan(thresh):
+                    item = pg.TextItem(text="{:.3f}".format(thresh), color=(0, 0, 0))


Why not QToolTip.showText?

Mainly because I am new to Qt and was just following a few resources online (I've changed it now).
I've fixed most of the things that you've mentioned and some other bugs.
Thank you for the helpful comments!

codecov-io · 2018-07-31T22:37:04Z

Codecov Report

Merging #3172 into master will increase coverage by 0.01%.
The diff coverage is 90.47%.

@@            Coverage Diff             @@
##           master    #3172      +/-   ##
==========================================
+ Coverage   82.74%   82.76%   +0.01%     
==========================================
  Files         344      344              
  Lines       59304    59397      +93     
==========================================
+ Hits        49073    49158      +85     
- Misses      10231    10239       +8

ales-erjavec · 2018-08-02T06:42:41Z

Orange/widgets/evaluate/owrocanalysis.py

+    def _on_mouse_moved(self, pos):
+        curves = [crv for crv in self.plot.curves if isinstance(crv, pg.PlotCurveItem)]
+        for i in range(len(self.selected_classifiers)):
+            sp = curves[i].childItems()[0]


This assumes implicitly that the order of the curves in the self.plot.curves corresponds to the self.selected_classifiers. This seems fragile.

You can use:

curves = [
    (clsf_idx,
     self.plot_curves(self.target_index, clsf_idx),
     self.curve_data(self.target_index, clsf_idx))
    for clsf_idx in self.selected_classifiers
]  # type: List[Tuple[int, plot_curves, ROCData]]

to retrieve triples of (classifier_index, plot_curves, curve_data). Use plot_curves similarly as curve_data in ROC averaging dispatch below, but use

plot_curves.merge().curve_item
plot_curves.avg_vertical().curve_item
plot_curves.avg_threshold().curve_item

to get the pg.PlotCurveItem instances
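
To illustrate why positional lookup is fragile, here is a minimal sketch with plain strings standing in for the plot items. All names and data below are hypothetical; only the idea of keying by (target, classifier) comes from the suggestion above.

```python
target_index = 0
selected_classifiers = [2, 0]  # the user selected classifiers 2 and 0

# Items in the order they happened to be added to the plot.
items_in_plot_order = ["item-clf0", "item-clf2"]

# The same items keyed explicitly by (target, classifier).
items_by_key = {(0, 0): "item-clf0", (0, 1): "item-clf1", (0, 2): "item-clf2"}

# Positional lookup silently pairs each classifier with the wrong item.
positional = [(clf, items_in_plot_order[i])
              for i, clf in enumerate(selected_classifiers)]

# Keyed lookup does not depend on insertion order.
keyed = [(clf, items_by_key[target_index, clf])
         for clf in selected_classifiers]

print(positional)  # [(2, 'item-clf0'), (0, 'item-clf2')]
print(keyed)       # [(2, 'item-clf2'), (0, 'item-clf0')]
```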

My bad - should be fixed now. I've also fixed some other bugs:

if the mouse hovered over a point on some classifier's curve, that curve was then hidden, and the mouse hovered over the same point again, the cache would not reset, so an invalid threshold would be shown,

if no tooltip was being shown and the cached point was "valid" (i.e. the same point as previously shown), a tooltip with the message "" would be created, which resulted in the tooltip being hidden.

I have also encountered the following exception (just once), which I can no longer reproduce:

--------------------------- RuntimeError Exception ----------------------------
Traceback (most recent call last):
  File "/home/matej/.local/lib/python3.5/site-packages/pyqtgraph/GraphicsScene/GraphicsScene.py", line 162, in mouseMoveEvent
    self.sendHoverEvents(ev)
  File "/home/matej/.local/lib/python3.5/site-packages/pyqtgraph/GraphicsScene/GraphicsScene.py", line 228, in sendHoverEvents
    items = self.itemsNearEvent(event, hoverable=True)
  File "/home/matej/.local/lib/python3.5/site-packages/pyqtgraph/GraphicsScene/GraphicsScene.py", line 424, in itemsNearEvent
    if hoverable and not hasattr(item, 'hoverEvent'):
RuntimeError: wrapped C/C++ object of type ScatterPlotItem has been deleted

While browsing through old issues, I found some instances of similar problems (#1316), though I can't say I completely understand your fix for them.

ales-erjavec · 2018-08-06T08:10:04Z

Orange/widgets/evaluate/owrocanalysis.py

+                # Find closest point on curve and display it
+                idx_closest = numpy.argmin(
+                    [numpy.linalg.norm(mouse_pt - [curve_pts.fpr[idx], curve_pts.tpr[idx]])
+                     for idx in range(len(curve_pts.thresholds))])


This loop could be vectorized

roc_points = numpy.column_stack((curve_pts.fpr, curve_pts.tpr))
query_pt = numpy.array([[mouse_pt.x(), mouse_pt.y()]])
diff = roc_points - query_pt
idx_closest = numpy.argmin(numpy.linalg.norm(diff, axis=1))

(... although a slightly moot point as the sp.pointsAt call above already runs through all the points in a python loop)

True - fixed.
I've also made another minor change: QCursor.pos() is now called directly everywhere, instead of being called directly in some places and saved to a variable in others (just a consistency fix).

ales-erjavec · 2018-08-09T07:37:19Z

currently thresholds are displayed when mouse goes 'perfectly' over a point - should this be changed so that they get displayed if mouse is in some very small radius of the point?

I think the hit radius is sufficient as it is. However, it might be better to collect points from all the curves under the mouse and display a list of all thresholds with classifier names. Right now, if two or more curves are close to each other, it is not clear which curve the shown threshold belongs to.
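
The idea of collecting hits from every curve can be sketched without Qt. This is a rough illustration under stated assumptions, not the widget's implementation: `thresholds_near`, `tooltip_text`, the data layout, and the "name: value" tooltip format are all hypothetical.

```python
import math


def thresholds_near(cursor, curves, radius):
    """Return (name, threshold) for each curve whose closest point lies within radius.

    curves maps a classifier name to a list of (fpr, tpr, threshold) tuples.
    """
    cx, cy = cursor
    hits = []
    for name, points in curves.items():
        fpr, tpr, thresh = min(
            points, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
        within = math.hypot(fpr - cx, tpr - cy) <= radius
        if within and not math.isnan(thresh):
            hits.append((name, thresh))
    return hits


def tooltip_text(hits):
    """One line per classifier, so overlapping curves stay distinguishable."""
    return "\n".join("{}: {:.3f}".format(name, t) for name, t in hits)


curves = {
    "kNN": [(0.0, 0.0, 1.0), (0.2, 0.6, 0.7), (1.0, 1.0, 0.0)],
    "Logistic Regression": [(0.0, 0.0, 1.0), (0.21, 0.61, 0.55), (1.0, 1.0, 0.0)],
}
print(tooltip_text(thresholds_near((0.2, 0.6), curves, radius=0.05)))
```

With the sample data, both curves have a point within the radius, so the tooltip lists both thresholds with their classifier names.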

what kind of a test do I write for this functionality (i. e. mouse events) - if any?

I would use QTest.mouseMove targeting the QGraphicsView's viewport (plotview.viewport() in the OWROCAnalysis). Try to map one of the curve's points to scene coordinates (QGraphicsItem.mapToScene) and those to the viewport (QGraphicsView.mapFromScene).

matejklemen · 2018-08-10T12:23:10Z

I've improved the tooltips so that it is clear which classifier a threshold belongs to. Also, the tooltip now shows thresholds for all overlapping points.

I'll see what I can do about the tests.

Edit: I'm having a bit of trouble creating the tests - so far this is what I've got:

# setup
data_in = Orange.data.Table("titanic")
res = Orange.evaluation.TestOnTrainingData(
    data=data_in,
    learners=[Orange.classification.KNNLearner(),
              Orange.classification.LogisticRegressionLearner()],
    store_data=True
)

self.send_signal(self.widget.Inputs.evaluation_results, res)
self.widget.roc_averaging = OWROCAnalysis.Merge
self.widget.target_index = 0
self.widget.selected_classifiers = [0, 1]
self.widget._replot()

# curve data for classifier 0
curve = self.widget.plot_curves(self.widget.target_index, 0)
curve_merge = curve.merge()

QTest.mouseMove(self.widget.plotview.viewport(), curve_merge.curve_item.mapToScene(0, 0))
# example assertion
self.assertTrue(QToolTip.isVisible())

The QTest.mouseMove(...) seems to work ok when manually testing - it throws the cursor to (0, 0) (in plot coordinates), but does not trigger the mouse callback in the test.

ales-erjavec · 2018-08-13T12:31:21Z

The QTest.mouseMove(...) seems to work ok when manually testing - it throws the cursor to (0, 0) (in plot coordinates), but does not trigger the mouse callback in the test.

Maybe QTBUG-5232. I swear I saw this when writing https://github.com/biolab/orange3/pull/3014/files#diff-da80849ceff100f955c9ec05920ba089R138.

If I use that function and modify your test like this:

        ...
        # curve data for classifier 0
        curve = self.widget.plot_curves(self.widget.target_index, 0)
        curve_merge = curve.merge()

        view = self.widget.plotview
        item = curve_merge.curve_item  # type: pg.PlotCurveItem
        xs, ys = item.getData()
        x, y = xs[len(xs) // 2], ys[len(ys) // 2]
        pos = item.mapToScene(x, y)
        pos = view.mapFromScene(pos)
        mouseMove(view.viewport(), pos)
        # example assertion
        self.assertTrue(QToolTip.isVisible())

then the test passes. Can you rebase on current master and use the mouseMove function from test_combobox.py (move it to e.g. Orange/widgets/tests/utils.py and import it from there)?

matejklemen · 2018-08-16T23:44:39Z

Thank you, that does fix the issue.

I've stumbled upon another issue though - the points which I'm trying to move the mouse to are not being mapped correctly. For example, if I try to simulate a mouse move to (x=0.0, y=0.0) and (x=1.0, y=1.0), it produces the same threshold/tooltip.

I've noticed that these two points get mapped to very similar positions inside the test (QPoint(56, 1) and QPoint(57, 2) in my case), while during "manual" testing they get mapped far apart and correctly.
My guess is that this is happening because the widget, created inside the test, is tiny, and therefore everything is getting mapped to pretty much the same coordinates - but I'm not sure how to fix it.

ales-erjavec · 2018-08-17T12:01:17Z

Try adding

vb = self.widget.plot.getViewBox()
vb.childTransform()  # Force pyqtgraph to update transforms

somewhere before mapping the points.

matejklemen · 2018-08-20T00:37:03Z

Thanks once again. 😃
I've moved the mouseMove(...) function and added tests for tooltips - hopefully I didn't mess anything up.

ales-erjavec · 2018-08-23T06:43:36Z

Orange/widgets/evaluate/owrocanalysis.py

        self._plot_curves = {}
        self._rocch = None
        self._perf_line = None
+        self._tooltip_cache = None


I do not understand why the tooltip cache is necessary. Can you add a comment explaining the reasoning for its use?

My reasoning behind it was that instead of checking all points of a curve to find the closest one, it would first check the tooltip cache, which contains at most 1 threshold (point) per (selected) classifier.

It's hard to say if it's worth the additional 20-ish lines of code though.
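
For readers unfamiliar with the idea, a cache of this kind might look roughly like the sketch below. It is an assumption-laden illustration, not the code in this PR: the class name `TooltipCache`, its methods, and the radius check are all hypothetical.

```python
import math


class TooltipCache:
    """Holds at most one cached (point, threshold) pair per classifier index."""

    def __init__(self):
        self._entries = {}

    def store(self, clf_idx, point, threshold):
        self._entries[clf_idx] = (point, threshold)

    def lookup(self, clf_idx, cursor, radius):
        """Return the cached threshold if the cursor is still on the cached point."""
        entry = self._entries.get(clf_idx)
        if entry is None:
            return None
        (x, y), threshold = entry
        if math.hypot(x - cursor[0], y - cursor[1]) <= radius:
            return threshold
        return None

    def discard(self, clf_idx):
        """Forget a classifier's entry, e.g. when its curve is hidden."""
        self._entries.pop(clf_idx, None)


cache = TooltipCache()
cache.store(0, (0.2, 0.6), 0.7)
print(cache.lookup(0, (0.2, 0.6), 0.01))  # 0.7
cache.discard(0)
print(cache.lookup(0, (0.2, 0.6), 0.01))  # None
```

Discarding the entry when a curve is hidden corresponds to the stale-cache bug described earlier in this thread.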

But it already checks all points in pts = sp.pointsAt(act_pos)

Right, but it also does it again later (although it is vectorized): idx_closest = numpy.argmin(numpy.linalg.norm(diff, axis=1)).

Considering your point, it does seem that the performance gain from tooltip cache might not be as big as I thought. Should I remove it and make the code clearer that way?

Edit: also, the tooltip cache provides a convenient way to test the contents of tooltips - the first alternative way I can think of would be to test the entire tooltip text, e.g. assertEqual("Tooltips:\n(#1) 0.400", QToolTip.text()).

ales-erjavec · 2018-08-31T10:33:10Z

Should I remove it and make the code clearer that way?

It does not matter really, you can leave it.

ales-erjavec reviewed Jul 31, 2018

ales-erjavec reviewed Aug 2, 2018

lanzagar added this to the 3.16 milestone Aug 3, 2018

ales-erjavec reviewed Aug 6, 2018

matejklemen added 6 commits August 16, 2018 01:11

owrocanalysis: show thresholds on mouse-over

f53deb1

owrocanalysis: fixes for showing ROC thresholds

05fd573

* fix crash when 'Show convex ROC curves' is enabled
* register mouse event only once
* use QToolTip instead of TextItem for displaying thresholds
* fix newly caused pylint warning

owrocanalysis: fix index mixup

fb1591d

owrocanalysis: batch 2 of fixes for showing ROC thresholds

8b2efa7

* removed dangerous retrieval of ROC curve data
* changed tooltip caching so that it works smoother
* fixed bug where cache value was not correct and would not reset properly

owrocanalysis: vectorize calculation of closest point to cursor and minor consistency fix

57f6b85

owrocanalysis: improve tooltips for overlapping points

ddddded

matejklemen added 2 commits August 20, 2018 02:14

tests: move mouseMove from test_combobox.py to utils

835f902

tests: test tooltips for ROC analysis

099ee74

matejklemen force-pushed the enh_roc_thresholds branch from 7438477 to 099ee74 on August 20, 2018 00:28

ales-erjavec reviewed Aug 23, 2018

lanzagar assigned ales-erjavec Sep 12, 2018

ales-erjavec changed the title ~~[RFC] ROC analysis: show thresholds~~ [ENH] ROC analysis: show thresholds Sep 12, 2018

ales-erjavec merged commit eb9c1e0 into biolab:master Sep 12, 2018

matejklemen deleted the enh_roc_thresholds branch September 17, 2018 18:18

ales-erjavec mentioned this pull request Jan 31, 2019

ROC Analysis widget: show thresholds #3057

Closed
