.. source: orange/docs/widgets/rst/evaluate/rocanalysis.rst
   @ 11778:ecd4beec2099
   Revision 11778:ecd4beec2099, 4.4 KB, checked in by Ales Erjavec
   <ales.erjavec@…>, 5 months ago

.. _ROC Analysis:

ROC Analysis
============

.. image:: ../../../../Orange/OrangeWidgets/Evaluate/icons/ROCAnalysis.svg

Shows the ROC curves and analyzes them.

Signals
-------

Inputs:


- Evaluation Results (orngTest.ExperimentResults)
      Results of classifiers' tests on data


Outputs:

None

Description
-----------

The widget shows ROC curves for the tested models and the corresponding convex
hull. Given the costs of false positives and false negatives, it can also
determine the optimal classifier and threshold.

.. image:: images/ROCAnalysis.png

Option :obj:`Target class` chooses the positive class. In case there are
more than two classes, the widget considers all other classes as a single,
negative class.

If the test results contain more than one classifier, the user can choose
which curves she or he wants to see plotted.

.. image:: images/ROCAnalysis-Convex.png

Option :obj:`Show convex curves` refers to convex curves over each individual
classifier (the thin lines on the cutout on the left). :obj:`Show convex hull`
plots a convex hull over the ROC curves of all classifiers (the thick yellow
line). Plotting both types of convex curves makes sense since selecting a
threshold in a concave part of the curve cannot yield optimal results,
regardless of the cost matrix. Besides, it is possible to reach any point
on the convex curve by combining the classifiers represented by the points
at the border of the concave region.
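The two steps above can be sketched in plain Python (this is an illustrative sketch, not the widget's internal code): ROC points are obtained by sweeping the decision threshold over the sorted predicted probabilities, and the upper convex hull is then extracted by dropping every point that lies on or below the segment between its neighbours.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points, sweeping the threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), reverse=True)
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for score, label in pairs:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points


def upper_convex_hull(points):
    """Keep only the points on the upper convex hull of an ROC curve."""
    hull = []
    for p in points:
        # Pop the last hull point while it lies on or below the segment
        # from its predecessor to the new point (non-negative cross product).
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull
```

Any point on a hull segment between two ROC points is achievable by randomly choosing, with the corresponding mixing weights, between the two classifiers (or thresholds) at the segment's endpoints.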

The diagonal line represents the behaviour of a random classifier.

When the data comes from multiple iterations of training and testing, such
as k-fold cross validation, the results can be (and usually are) averaged.
The averaging options are:

- :obj:`Merge (expected ROC perf.)` treats all the test data as if it
  came from a single iteration
- :obj:`Vertical` averages the curves vertically, showing the corresponding
  confidence intervals
- :obj:`Threshold` traverses over thresholds, averages the curve positions
  at them and shows horizontal and vertical confidence intervals
- :obj:`None` does not average but plots all the curves instead
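Vertical averaging, for instance, can be sketched as follows (an assumed illustration, not the widget's own implementation): the TPR of each fold's curve is read off at a common grid of FPR values, interpolating linearly between ROC points, and the readings are averaged per grid point.

```python
def interp_tpr(curve, fpr):
    """Linearly interpolate the TPR of one ROC curve at a given FPR."""
    for (x1, y1), (x2, y2) in zip(curve, curve[1:]):
        if x1 <= fpr <= x2:
            if x1 == x2:          # vertical segment: take the higher TPR
                return max(y1, y2)
            t = (fpr - x1) / (x2 - x1)
            return y1 + t * (y2 - y1)
    return curve[-1][1]           # FPR past the last point


def vertical_average(curves, grid):
    """Mean TPR over all per-fold curves at each FPR in the grid."""
    return [sum(interp_tpr(c, f) for c in curves) / len(curves) for f in grid]
```

The per-grid-point spread of the fold TPRs is what the widget draws as vertical confidence intervals.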

.. image:: images/ROCAnalysis-Vertical.png

.. image:: images/ROCAnalysis-Threshold.png

.. image:: images/ROCAnalysis-None.png

.. image:: images/ROCAnalysis-Analysis.png

The second sheet of settings is dedicated to analysis of the curve. The user
can specify the cost of false positives and false negatives, and the prior
target class probability. :obj:`Compute from Data` sets it to the proportion
of examples of this class in the data.

An iso-performance line is a line in the ROC space such that all points on
the line give the same profit/loss. Lines to the upper left are better than
those to the lower right. The direction of the line depends upon the above
costs and probabilities. Put together, this gives a recipe for finding the
optimal threshold for the given costs: it is the point where the tangent
with the given inclination touches the curve. If we go higher or more to
the left, the points on the iso-performance line cannot be reached by the
learner. Going down or to the right decreases the performance.

The widget can show the performance line, which changes as the user
changes the parameters. The points where the line touches any of the
curves - that is, the optimal points for the given classifiers -
are also marked, and the corresponding thresholds (the probability of the
target class needed for an example to be classified into that class) are
shown beside them.
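The tangent construction has a simple computational counterpart (a hedged sketch under the standard ROC cost model, not the widget's code): the slope of the iso-performance line is ``cost_FP * (1 - p) / (cost_FN * p)``, where ``p`` is the prior target class probability, and the optimal ROC point is the one maximizing ``TPR - slope * FPR``.

```python
def optimal_point(points, cost_fp, cost_fn, p_target):
    """Return the (FPR, TPR) point touched by the iso-performance tangent.

    points   -- (FPR, TPR) pairs of an ROC curve or its convex hull
    cost_fp  -- cost of a false positive
    cost_fn  -- cost of a false negative
    p_target -- prior probability of the target (positive) class
    """
    slope = cost_fp * (1 - p_target) / (cost_fn * p_target)
    # The point with the largest intercept TPR - slope * FPR is the one the
    # tangent of the given inclination touches first.
    return max(points, key=lambda pt: pt[1] - slope * pt[0])
```

Note that scaling both costs by the same factor leaves the slope, and hence the chosen point, unchanged.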

The widget allows setting costs from 1 to 1000. Neither the units nor the
magnitudes matter in themselves; what matters is the ratio between the two
costs, so setting them to 100 and 200 gives the same result as 400 and 800.

.. image:: images/ROCAnalysis-Performance2.png

Defaults: both costs equal (500), prior target class probability 44%
(from the data)

.. image:: images/ROCAnalysis-Performance1.png

False positive cost: 838, false negative cost: 650, prior target class
probability 73%

:obj:`Default threshold (0.5) point` shows the point on the ROC curve
achieved by the classifier if it predicts the target class whenever its
probability equals or exceeds 0.5.
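That point can be computed directly from the predicted probabilities (an illustrative sketch; the function name and signature are not from the widget):

```python
def threshold_point(labels, probs, threshold=0.5):
    """(FPR, TPR) when the target class is predicted iff prob >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for l, s in zip(labels, probs) if l and s >= threshold)
    fp = sum(1 for l, s in zip(labels, probs) if not l and s >= threshold)
    return fp / neg, tp / pos
```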

Example
-------

At the moment, the only widget which gives the right type of signal
needed by ROC Analysis is :ref:`Test Learners`. ROC Analysis will hence
always follow Test Learners and, since it has no outputs, no other widgets
follow it. Here is a typical example.

.. image:: images/ROCLiftCalibration-Schema.png
