.. automodule:: Orange.evaluation.scoring

############################
Method scoring (``scoring``)
############################

.. index: scoring

Scoring plays an integral role in the evaluation of any prediction model.
Orange implements various scores for the evaluation of classification,
regression and multi-label models. Most of the methods need to be called
with an instance of :obj:`~Orange.evaluation.testing.ExperimentResults`.

.. literalinclude:: code/scoring-example.py

==============
Classification
==============

Calibration scores
==================
Many scores for the evaluation of classification models measure whether the
model assigns the correct class value to the test instances. Most of these
scores can be computed solely from the confusion matrix, which can be
constructed manually with the :obj:`confusion_matrices` function. If the class
variable has more than two values, the index of the value for which the
confusion matrix is computed should be passed as well.
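
For instance, a minimal sketch of scoring a three-class problem (an
illustration only, assuming the bundled `iris` data set and that the value
index is passed via the ``class_index`` argument)::

    import Orange

    iris = Orange.data.Table("iris")
    learner = Orange.classification.bayes.NaiveLearner()
    res = Orange.evaluation.testing.cross_validation([learner], iris, folds=5)

    # iris has three class values, so one confusion matrix is built per value
    for i, value in enumerate(iris.domain.class_var.values):
        cm = Orange.evaluation.scoring.confusion_matrices(res, class_index=i)[0]
        print value, "sensitivity:", Orange.evaluation.scoring.sens(cm)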

.. autoclass:: CA
.. autofunction:: sens
.. autofunction:: spec
.. autofunction:: PPV
.. autofunction:: NPV
.. autofunction:: precision
.. autofunction:: recall
.. autofunction:: F1
.. autofunction:: Falpha
.. autofunction:: MCC
.. autofunction:: AP
.. autofunction:: IS
.. autofunction:: confusion_chi_square

Discriminatory scores
=====================
Scores that measure how well the prediction model can separate instances
with different classes are called discriminatory scores.
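
A minimal sketch of their use (a sketch, not taken from the Orange
documentation; it assumes the bundled binary `voting` data set, and that both
functions accept :obj:`~Orange.evaluation.testing.ExperimentResults` and
return one score per learner)::

    import Orange

    voting = Orange.data.Table("voting")
    learner = Orange.classification.bayes.NaiveLearner()
    res = Orange.evaluation.testing.cross_validation([learner], voting, folds=5)

    # one score per learner; we tested a single learner, hence index 0
    print "AUC:   %.3f" % Orange.evaluation.scoring.AUC(res)[0]
    print "Brier: %.3f" % Orange.evaluation.scoring.Brier_score(res)[0]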

.. autofunction:: Brier_score

.. autoclass:: AUC
    :members: by_weighted_pairs, by_pairs,
              weighted_one_against_all, one_against_all, single_class, pair,
              matrix

.. autofunction:: AUCWilcoxon

.. autofunction:: compute_ROC

.. autofunction:: confusion_matrices

.. autoclass:: ConfusionMatrix


Comparison of Algorithms
========================

.. autofunction:: McNemar

.. autofunction:: McNemar_of_two

==========
Regression
==========

Several alternative measures, as given below, can be used to evaluate
the success of numeric prediction:

.. image:: files/statRegression.png
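
For reference, the standard definitions of these measures, restated here
(with :math:`y_i` the true values, :math:`\hat{y}_i` the predictions and
:math:`\bar{y}` the mean of the true values; the image above is assumed to
show the same formulas):

.. math::

    \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad
    \mathrm{RMSE} = \sqrt{\mathrm{MSE}}, \qquad
    \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|

.. math::

    \mathrm{RSE} = \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
    \qquad \mathrm{RRSE} = \sqrt{\mathrm{RSE}}, \qquad
    \mathrm{RAE} = \frac{\sum_i |y_i - \hat{y}_i|}{\sum_i |y_i - \bar{y}|},
    \qquad R^2 = 1 - \mathrm{RSE}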

.. autofunction:: MSE

.. autofunction:: RMSE

.. autofunction:: MAE

.. autofunction:: RSE

.. autofunction:: RRSE

.. autofunction:: RAE

.. autofunction:: R2

The following code (:download:`statExamples.py <code/statExamples.py>`) uses
most of the above measures to score several regression methods and produces
the following output::

    Learner   MSE     RMSE    MAE     RSE     RRSE    RAE     R2
    maj       84.585  9.197   6.653   1.002   1.001   1.001  -0.002
    rt        40.015  6.326   4.592   0.474   0.688   0.691   0.526
    knn       21.248  4.610   2.870   0.252   0.502   0.432   0.748
    lr        24.092  4.908   3.425   0.285   0.534   0.515   0.715
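
A minimal sketch of the same workflow (a sketch only, not the
:download:`statExamples.py <code/statExamples.py>` script itself; it assumes
the bundled `housing` regression data set)::

    import Orange

    housing = Orange.data.Table("housing")
    learners = [Orange.classification.knn.kNNLearner(name="knn"),
                Orange.regression.linear.LinearRegressionLearner(name="lr")]
    res = Orange.evaluation.testing.cross_validation(learners, housing, folds=5)

    # each scoring function returns one value per learner
    mse = Orange.evaluation.scoring.MSE(res)
    rmse = Orange.evaluation.scoring.RMSE(res)
    for i, learner in enumerate(learners):
        print "%-4s MSE: %7.3f  RMSE: %6.3f" % (learner.name, mse[i], rmse[i])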

==================
Plotting functions
==================

.. autofunction:: graph_ranks

The following script (:download:`statExamplesGraphRanks.py <code/statExamplesGraphRanks.py>`)
shows how to plot a graph:

.. literalinclude:: code/statExamplesGraphRanks.py

The code produces the following graph:

.. image:: files/statExamplesGraphRanks1.png

.. autofunction:: compute_CD

.. autofunction:: compute_friedman

=================
Utility Functions
=================

.. autofunction:: split_by_iterations


.. _mt-scoring:

============
Multi-target
============

:doc:`Multi-target <Orange.multitarget>` classifiers predict values for
multiple target classes. They can be used with standard
:obj:`~Orange.evaluation.testing` procedures (e.g.
:obj:`~Orange.evaluation.testing.Evaluation.cross_validation`), but require
special scoring functions to compute a single score from the obtained
:obj:`~Orange.evaluation.testing.ExperimentResults`.

.. autofunction:: mt_flattened_score
.. autofunction:: mt_average_score
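
A minimal sketch of how these combine an ordinary score with multi-target
results (an illustration only, assuming the `multitarget-synthetic` data set
and the multi-target tree learner described in :doc:`Orange.multitarget`)::

    import Orange

    data = Orange.data.Table("multitarget-synthetic")
    learners = [Orange.multitarget.tree.MultiTreeLearner()]
    res = Orange.evaluation.testing.cross_validation(learners, data, folds=5)

    scoring = Orange.evaluation.scoring
    # average the per-target RMSEs into a single score per learner ...
    print scoring.mt_average_score(res, scoring.RMSE)
    # ... or treat all target values as one flat sequence before scoring
    print scoring.mt_flattened_score(res, scoring.RMSE)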

==========================
Multi-label classification
==========================

Multi-label classification requires different metrics than those used in
traditional single-label classification. This module presents the various
metrics that have been proposed in the literature. Let :math:`D` be a
multi-label evaluation data set, consisting of :math:`|D|` multi-label
examples :math:`(x_i, Y_i)`, :math:`i=1..|D|`, :math:`Y_i \subseteq L`. Let
:math:`H` be a multi-label classifier and :math:`Z_i = H(x_i)` be the set of
labels predicted by :math:`H` for example :math:`x_i`.

.. autofunction:: mlc_hamming_loss
.. autofunction:: mlc_accuracy
.. autofunction:: mlc_precision
.. autofunction:: mlc_recall

The following script demonstrates the use of those evaluation measures:

.. literalinclude:: code/mlc-evaluate.py

The output should look like this::

    loss= [0.9375]
    accuracy= [0.875]
    precision= [1.0]
    recall= [0.875]

References
==========

Boutell, M.R., Luo, J., Shen, X. & Brown, C.M. (2004), 'Learning multi-label
scene classification', Pattern Recognition, vol. 37, no. 9, pp. 1757-1771.

Godbole, S. & Sarawagi, S. (2004), 'Discriminative Methods for Multi-labeled
Classification', Proceedings of the 8th Pacific-Asia Conference on Knowledge
Discovery and Data Mining (PAKDD 2004).

Schapire, R.E. & Singer, Y. (2000), 'BoosTexter: a boosting-based system for
text categorization', Machine Learning, vol. 39, no. 2/3, pp. 135-168.