source: orange/docs/widgets/rst/evaluate/rocanalysis.rst @ 11050:e3c4699ca155

Revision 11050:e3c4699ca155, 4.4 KB checked in by Miha Stajdohar <miha.stajdohar@…>, 16 months ago

Widget docs From HTML to Sphinx.

.. _ROC Analysis:

ROC Analysis
============

.. image:: ../icons/ROCAnalysis.png

Shows the ROC curves and analyzes them.

Signals
-------

Inputs:

   - Evaluation Results (orngTest.ExperimentResults)
      Results of classifiers' tests on data

Outputs:

None

Description
-----------

The widget shows ROC curves for the tested models and the corresponding convex hull. Given the costs of false positives and false negatives, it can also determine the optimal classifier and threshold.

.. image:: images/ROCAnalysis.png

The :obj:`Target class` option selects the positive class. If there are more than two classes, the widget treats all other classes as a single negative class.
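
This one-against-all treatment can be sketched in a few lines. It is only an illustration, not the widget's own code; the helper name ``binarize`` and the class names are made up for the example.

```python
def binarize(labels, target):
    """Return True for examples of the target class and False for all others."""
    return [label == target for label in labels]


# Hypothetical three-class labels; "versicolor" plays the role of the target class.
labels = ["setosa", "versicolor", "virginica", "versicolor"]
positives = binarize(labels, "versicolor")
```

After this step the problem is binary, so a single ROC curve can be drawn for the chosen target class.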

If the test results contain more than one classifier, the user can choose which curves to plot.

.. image:: images/ROCAnalysis-Convex.png

The :obj:`Show convex curves` option refers to convex curves over each individual classifier (the thin lines on the cutout on the left). :obj:`Show convex hull` plots a convex hull over the ROC curves of all classifiers (the thick yellow line). Plotting both types of convex curves makes sense, since selecting a threshold in a concave part of the curve cannot yield optimal results, regardless of the cost matrix. Besides, any point on the convex curve can be reached by combining the classifiers represented by the points at the border of the concave region.
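
The convex hull over a set of ROC points can be sketched as below. This is an illustrative monotone-chain computation, not necessarily how the widget does it; the function names and the sample points are assumptions made for the example.

```python
def _cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points):
    """Upper convex hull of (fpr, tpr) points, running from (0, 0) to (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last hull point while it lies on or below the segment
        # joining its predecessor with the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# Hypothetical ROC points of one or more classifiers.
points = [(0.1, 0.5), (0.3, 0.6), (0.4, 0.9), (0.7, 0.8)]
hull = roc_convex_hull(points)
```

With these sample points, (0.3, 0.6) and (0.7, 0.8) fall below the hull, so they cannot be optimal under any cost setting. A point on the hull segment between two vertices corresponds to choosing at random between the two classifiers at its ends.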

The diagonal line represents the behaviour of a random classifier.

When the data comes from multiple iterations of training and testing, such as k-fold cross-validation, the results can be (and usually are) averaged. The averaging options are:

   - :obj:`Merge (expected ROC perf.)` treats all the test data as if it came from a single iteration
   - :obj:`Vertical` averages the curves vertically, showing the corresponding confidence intervals
   - :obj:`Threshold` traverses the thresholds, averages the curves' positions at each of them, and shows horizontal and vertical confidence intervals
   - :obj:`None` does not average but plots all the curves instead
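
The difference between merging and vertical averaging can be sketched as follows. This is a simplified illustration under stated assumptions, not the widget's implementation: the helper names ``roc_points`` and ``tpr_at``, the two folds, and the grid of false positive rates are all invented for the example, and confidence intervals are left out.

```python
def roc_points(scores, labels):
    """ROC points (fpr, tpr) obtained by lowering the threshold over the scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda sl: -sl[0])
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last example in a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def tpr_at(points, fpr):
    """Highest TPR of the curve at the given FPR, using linear interpolation."""
    best = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= fpr <= x1:
            y = y1 if x1 == x0 else y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)
            best = max(best, y)
    return best


# Two hypothetical folds: (predicted scores, 1 for target class and 0 otherwise).
folds = [
    ([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]),
    ([0.7, 0.6, 0.4, 0.1], [1, 1, 0, 0]),
]

# Merge: pool the examples of all folds and draw a single curve.
all_scores = [s for scores, _ in folds for s in scores]
all_labels = [y for _, labels in folds for y in labels]
merged = roc_points(all_scores, all_labels)

# Vertical: average the TPR of the per-fold curves at fixed FPR values.
curves = [roc_points(scores, labels) for scores, labels in folds]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
vertical = [sum(tpr_at(c, x) for c in curves) / len(curves) for x in grid]
```

Merging compares scores across folds directly, so it implicitly assumes the scores of different folds are on the same scale. Vertical averaging avoids that assumption by averaging the curves themselves.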

.. image:: images/ROCAnalysis-Vertical.png

.. image:: images/ROCAnalysis-Threshold.png

.. image:: images/ROCAnalysis-None.png

.. image:: images/ROCAnalysis-Analysis.png

The second sheet of settings is dedicated to the analysis of the curve. The user can specify the costs of false positives and false negatives, and the prior probability of the target class. :obj:`Compute from Data` sets the prior to the proportion of examples of this class in the data.

An iso-performance line is a line in the ROC space such that all points on it give the same profit or loss. Lines further to the upper left are better than those further down and to the right. The slope of the line depends on the above costs and probabilities. Put together, this gives a recipe for finding the optimal threshold for the given costs: it is the point where the tangent with the given slope touches the curve. If we move the line higher or further to the left, its points cannot be reached by the learner. Moving it down or to the right decreases the performance.
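
The dependence of the slope on the costs and the prior can be made explicit with the usual expected-cost argument. The text does not give the widget's exact formula, so the derivation below is the standard one and rests on two assumptions: correct classifications cost nothing, and the symbols are chosen here for illustration (:math:`p` is the prior probability of the target class, :math:`c_{FP}` and :math:`c_{FN}` are the two costs, and :math:`C` is the expected cost per example).

```latex
C = p\,(1 - \mathrm{TPR})\,c_{FN} + (1 - p)\,\mathrm{FPR}\,c_{FP}
\quad\Longrightarrow\quad
\mathrm{TPR} = \frac{(1 - p)\,c_{FP}}{p\,c_{FN}}\,\mathrm{FPR}
             + \left(1 - \frac{C}{p\,c_{FN}}\right)
```

For a fixed :math:`C` this is a straight line with slope :math:`(1 - p)\,c_{FP} / (p\,c_{FN})`. A lower cost gives a larger intercept, which is why lines closer to the upper left corner are better.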

The widget can show the performance line, which changes as the user changes the parameters. The points where the line touches the curves, that is, the optimal point for each of the given classifiers, are also marked, and the corresponding threshold (the probability of the target class needed for an example to be classified into that class) is shown beside them.
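
Finding the touching point for one classifier amounts to picking the ROC point through which the iso-performance line with the given slope has the largest intercept. The sketch below assumes the slope has already been computed; the function name ``optimal_point`` and the sample curve are hypothetical.

```python
def optimal_point(curve, slope):
    """Return the (threshold, fpr, tpr) triple touched by the best iso-performance line.

    The line through a point has intercept tpr - slope * fpr; the best line
    is the one with the largest intercept.
    """
    return max(curve, key=lambda p: p[2] - slope * p[1])


# Hypothetical ROC curve of one classifier as (threshold, fpr, tpr) triples.
curve = [
    (1.0, 0.0, 0.0),
    (0.8, 0.1, 0.5),
    (0.5, 0.3, 0.8),
    (0.2, 0.6, 0.95),
    (0.0, 1.0, 1.0),
]

balanced = optimal_point(curve, slope=1.0)
costly_false_positives = optimal_point(curve, slope=3.0)
costly_false_negatives = optimal_point(curve, slope=0.2)
```

A steep slope (expensive false positives or a rare target class) pushes the optimum towards a higher threshold, and a shallow slope pushes it towards a lower one.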

The widget allows setting costs from 1 to 1000. Neither the units nor the magnitudes matter. What matters is the ratio between the two costs, so setting them to 100 and 200 gives the same result as 400 and 800.
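
A quick check of this invariance, assuming the standard cost-weighted slope ``(1 - p) * fp_cost / (p * fn_cost)`` (the text does not state the widget's exact formula, and the function name is made up for the example):

```python
import math


def iso_performance_slope(fp_cost, fn_cost, p_target):
    """Slope of the iso-performance line under the assumed standard formula."""
    return ((1.0 - p_target) * fp_cost) / (p_target * fn_cost)


# Same cost ratio (1:2), different magnitudes, same prior.
a = iso_performance_slope(100, 200, 0.44)
b = iso_performance_slope(400, 800, 0.44)
same = math.isclose(a, b)
```

Only the ratio ``fp_cost / fn_cost`` enters the slope, so scaling both costs by the same factor leaves the line, and hence the optimal point, unchanged.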

.. image:: images/ROCAnalysis-Performance2.png

Defaults: both costs equal (500), prior target class probability 44% (from the data).

.. image:: images/ROCAnalysis-Performance1.png

False positive cost: 838, false negative cost: 650, prior target class probability: 73%.

:obj:`Default threshold (0.5) point` shows the point on the ROC curve that the classifier achieves when it predicts the target class whenever its probability equals or exceeds 0.5.
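
The position of such a point can be computed directly from the predicted probabilities. The sketch below is illustrative only; the function name ``roc_point_at`` and the sample data are assumptions.

```python
def roc_point_at(probabilities, labels, threshold=0.5):
    """(fpr, tpr) when the target class is predicted iff probability >= threshold."""
    pairs = list(zip(probabilities, labels))
    tp = sum(1 for p, is_target in pairs if is_target and p >= threshold)
    fp = sum(1 for p, is_target in pairs if not is_target and p >= threshold)
    pos = sum(1 for is_target in labels if is_target)
    neg = len(labels) - pos
    return fp / neg, tp / pos


# Hypothetical target-class probabilities and true class membership.
probabilities = [0.9, 0.5, 0.4, 0.7, 0.2, 0.1]
labels = [True, True, True, False, False, False]
default_point = roc_point_at(probabilities, labels)
```

Note that a probability of exactly 0.5 counts as a prediction of the target class, matching the "equals or exceeds" rule above.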

Example
-------

At the moment, the only widget that gives the right type of signal for ROC Analysis is `Test Learners <TestLearners.htm>`_. ROC Analysis will hence always follow Test Learners and, since it has no outputs, no other widgets follow it. Here is a typical example.

.. image:: images/ROCLiftCalibration-Schema.png