source: orange/docs/widgets/rst/evaluate/rocanalysis.rst @ 11778:ecd4beec2099

Revision 11778:ecd4beec2099, 4.4 KB checked in by Ales Erjavec <ales.erjavec@…>, 5 months ago

Use new SVG icons in the widget documentation.

.. _ROC Analysis:

ROC Analysis
============

.. image:: ../../../../Orange/OrangeWidgets/Evaluate/icons/ROCAnalysis.svg

Shows the ROC curves and analyzes them.

Signals
-------

Inputs:


   - Evaluation Results (orngTest.ExperimentResults)
      Results of classifiers' tests on data


Outputs:

None

Description
-----------

The widget shows ROC curves for the tested models and the corresponding convex
hull. Given the costs of false positives and false negatives, it can also
determine the optimal classifier and threshold.

.. image:: images/ROCAnalysis.png

Option :obj:`Target class` chooses the positive class. If there are more than
two classes, the widget treats all other classes as a single negative class.
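
The following is a minimal sketch of how ROC points can be computed for a
chosen target class when the data has more than two classes. It is not the
widget's own code; the function name ``roc_points`` and the sample data are
assumptions made for illustration.

```python
def roc_points(labels, scores, target):
    """Return ROC points (FPR, TPR) for `target` versus all other classes.

    labels: actual class of each example
    scores: predicted probability of `target` for each example
    (hypothetical helper, not part of Orange)
    """
    # Every class other than the target is folded into one negative class.
    positives = [lab == target for lab in labels]
    n_pos = sum(positives)
    n_neg = len(positives) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Sweep the threshold from high to low; tied scores are processed together.
    ranked = sorted(zip(scores, positives), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, is_pos) in enumerate(ranked):
        if is_pos:
            tp += 1
        else:
            fp += 1
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie:
            points.append((fp / n_neg, tp / n_pos))
    return points


# Made-up three-class data; scores are P(class == "a").
labels = ["a", "b", "c", "a", "b", "c"]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
points = roc_points(labels, scores, target="a")
```

With target ``"a"``, classes ``"b"`` and ``"c"`` both count as negatives, so
the curve runs from (0, 0) to (1, 1) over two positive and four negative
examples.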

If the test results contain more than one classifier, the user can choose
which curves to plot.

.. image:: images/ROCAnalysis-Convex.png

Option :obj:`Show convex curves` refers to convex curves over each individual
classifier (the thin lines on the cutout on the left). :obj:`Show convex hull`
plots a convex hull over ROC curves for all classifiers (the thick yellow
line). Plotting both types of convex curves makes sense, since selecting a
threshold in a concave part of the curve cannot yield optimal results,
regardless of the cost matrix. Besides, it is possible to reach any point
on the convex curve by combining the classifiers represented by the points
at the border of the concave region.
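
As an illustration of the convex hull described above, here is a short
sketch that computes the upper convex hull of a set of ROC points. The
function name ``roc_convex_hull`` and the two sample curves are hypothetical;
the widget's actual implementation may differ.

```python
def roc_convex_hull(points):
    """Upper convex hull of ROC points given as (FPR, TPR) pairs.

    Hypothetical helper for illustration, not part of Orange.
    """
    # The trivial classifiers (0, 0) and (1, 1) are always attainable.
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last hull point while it lies on or below the segment
        # from its predecessor to the new point p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


# Made-up ROC points of two classifiers.
curve_a = [(0.0, 0.0), (0.1, 0.6), (0.4, 0.7), (1.0, 1.0)]
curve_b = [(0.0, 0.0), (0.3, 0.8), (0.6, 0.85), (1.0, 1.0)]
hull = roc_convex_hull(curve_a + curve_b)
```

Here the points (0.4, 0.7) and (0.6, 0.85) fall in concave regions and are
left off the hull. Any point on a hull segment can be reached by choosing at
random, with suitable probabilities, between the two classifiers at the
segment's ends.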

The diagonal line represents the behaviour of a random classifier.

When the data comes from multiple iterations of training and testing, such
as k-fold cross-validation, the results can be (and usually are) averaged.
The averaging options are:

   - :obj:`Merge (expected ROC perf.)` treats all the test data as if it
     came from a single iteration
   - :obj:`Vertical` averages the curves vertically, showing the corresponding
     confidence intervals
   - :obj:`Threshold` traverses the thresholds, averages the curves' positions
     at each of them, and shows horizontal and vertical confidence intervals
   - :obj:`None` does not average but plots all the curves instead
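
The difference between merging and vertical averaging can be sketched as
follows. This is one common way to do it, assumed here for illustration; the
helper names and the two made-up folds are hypothetical, and the confidence
intervals and threshold averaging are left out.

```python
def roc_curve(positives, scores):
    """ROC points (FPR, TPR) for boolean labels, sweeping the threshold down."""
    n_pos = sum(positives)
    n_neg = len(positives) - n_pos
    ranked = sorted(zip(scores, positives), key=lambda pair: -pair[0])
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for i, (score, is_pos) in enumerate(ranked):
        tp += is_pos
        fp += not is_pos
        # Emit a point only after the last example of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def tpr_at(points, fpr):
    """Highest TPR the curve reaches at the given FPR (linear interpolation)."""
    best = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        if x1 <= fpr <= x2:
            if x1 == x2:
                y = max(y1, y2)
            else:
                y = y1 + (y2 - y1) * (fpr - x1) / (x2 - x1)
            best = max(best, y)
    return best


def merge_average(folds):
    """Pool the test data of all folds and draw a single curve."""
    pooled_pos = [p for positives, _ in folds for p in positives]
    pooled_scores = [s for _, scores in folds for s in scores]
    return roc_curve(pooled_pos, pooled_scores)


def vertical_average(folds, grid):
    """Average the folds' TPR at each FPR value in `grid`."""
    curves = [roc_curve(positives, scores) for positives, scores in folds]
    return [(fpr, sum(tpr_at(c, fpr) for c in curves) / len(curves))
            for fpr in grid]


# Two made-up folds: (is-target flags, predicted target probabilities).
folds = [
    ([True, False, True, False], [0.9, 0.8, 0.4, 0.2]),
    ([True, True, False, False], [0.7, 0.6, 0.5, 0.1]),
]
merged = merge_average(folds)
vertical = vertical_average(folds, grid=[0.0, 0.5, 1.0])
```

Merging yields one curve over all eight examples, while vertical averaging
keeps one curve per fold and averages their heights at fixed false positive
rates.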

.. image:: images/ROCAnalysis-Vertical.png

.. image:: images/ROCAnalysis-Threshold.png

.. image:: images/ROCAnalysis-None.png

.. image:: images/ROCAnalysis-Analysis.png

The second sheet of settings is dedicated to the analysis of the curve. The
user can specify the cost of false positives and false negatives, and the
prior target class probability. :obj:`Compute from Data` sets the prior to the
proportion of examples of this class in the data.

An iso-performance line is a line in the ROC space such that all points on the
line give the same profit/loss. Lines to the upper left are better than those
down and to the right. The direction of the line depends on the above costs
and probabilities. Put together, this gives a recipe for finding the optimal
threshold for the given costs: it is the point where the tangent with the
given slope touches the curve. If we go higher or further to the left,
the points on the iso-performance line cannot be reached by the learner.
Going down or to the right decreases the performance.
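
The recipe above can be sketched in a few lines. The sketch assumes the
standard expected-cost formulation, in which the iso-performance slope is
``fp_cost * (1 - p) / (fn_cost * p)`` for prior target class probability
``p``; the function names and the sample curve are hypothetical and are not
taken from the widget's code.

```python
def iso_performance_slope(fp_cost, fn_cost, p_target):
    """Slope of iso-performance lines in ROC space (assumed standard formula)."""
    return (fp_cost * (1.0 - p_target)) / (fn_cost * p_target)


def expected_cost(point, fp_cost, fn_cost, p_target):
    """Expected misclassification cost per example at an ROC point."""
    fpr, tpr = point
    return (fp_cost * (1.0 - p_target) * fpr
            + fn_cost * p_target * (1.0 - tpr))


def optimal_point(points, fp_cost, fn_cost, p_target):
    """ROC point with the lowest expected cost, i.e. where the line touches."""
    return min(points,
               key=lambda pt: expected_cost(pt, fp_cost, fn_cost, p_target))


# Made-up ROC points of a single classifier.
curve = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.8), (1.0, 1.0)]

# Equal costs with a 44% prior, then the second setting from the screenshots.
best_equal = optimal_point(curve, fp_cost=500, fn_cost=500, p_target=0.44)
best_skewed = optimal_point(curve, fp_cost=838, fn_cost=650, p_target=0.73)
```

With equal costs and a 44% prior the slope is steeper than 1 and the more
conservative point (0.1, 0.6) wins; with the second setting the slope drops
below 1 and the optimum moves to (0.3, 0.8).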

The widget can show the performance line, which changes as the user
changes the parameters. The points where the line touches the curves, that
is, the optimal points for the given classifiers, are also marked, and the
corresponding threshold (the probability of the target class required for
the example to be classified into that class) is shown beside each.

The widget allows setting costs from 1 to 1000. Neither the units nor the
magnitudes matter; only the ratio of the two costs does, so setting them to
100 and 200 gives the same result as 400 and 800.
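
A quick check of why only the ratio matters, again assuming the standard
slope formula ``fp_cost * (1 - p) / (fn_cost * p)``; the function name is
hypothetical.

```python
import math


def iso_performance_slope(fp_cost, fn_cost, p_target):
    """Slope of iso-performance lines in ROC space (assumed standard formula)."""
    return (fp_cost * (1.0 - p_target)) / (fn_cost * p_target)


# Scaling both costs by the same factor leaves the slope unchanged.
slope_small = iso_performance_slope(100, 200, p_target=0.5)
slope_large = iso_performance_slope(400, 800, p_target=0.5)
same_slope = math.isclose(slope_small, slope_large)
```

Since the slope is the same, the tangent touches the curve at the same point
and the same threshold is selected.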

.. image:: images/ROCAnalysis-Performance2.png

Defaults: both costs equal (500), prior target class probability 44%
(from the data)

.. image:: images/ROCAnalysis-Performance1.png

False positive cost: 838, false negative cost: 650, prior target class
probability: 73%

:obj:`Default threshold (0.5) point` shows the point on the ROC curve
achieved by the classifier if it predicts the target class whenever its
probability equals or exceeds 0.5.
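
The position of such a point can be computed directly from the predictions,
as in this minimal sketch. The function name ``threshold_point`` and the
sample data are assumptions for illustration.

```python
def threshold_point(positives, scores, threshold=0.5):
    """ROC point (FPR, TPR) when the target class is predicted
    for every example whose target probability is >= threshold."""
    tp = sum(1 for pos, s in zip(positives, scores) if pos and s >= threshold)
    fp = sum(1 for pos, s in zip(positives, scores)
             if not pos and s >= threshold)
    n_pos = sum(positives)
    n_neg = len(positives) - n_pos
    return fp / n_neg, tp / n_pos


# Made-up data: is-target flags and predicted target probabilities.
positives = [True, False, True, False, True]
scores = [0.9, 0.6, 0.5, 0.3, 0.2]
fpr, tpr = threshold_point(positives, scores)
```

Note that the example with probability exactly 0.5 is classified into the
target class, matching the "equals or exceeds" rule.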

Example
-------

At the moment, the only widget that provides the type of signal needed by
ROC Analysis is :ref:`Test Learners`. ROC Analysis will hence always follow
Test Learners and, since it has no outputs, no other widget follows it.
Here is a typical example.

.. image:: images/ROCLiftCalibration-Schema.png