.. automodule:: Orange.evaluation.scoring

############################
Method scoring (``scoring``)
############################

.. index:: scoring

Scoring plays an integral role in the evaluation of any prediction model.
Orange implements various scores for the evaluation of classification,
regression and multi-label models. Most of the methods need to be called
with an instance of :obj:`~Orange.evaluation.testing.ExperimentResults`.

.. literalinclude:: code/scoring-example.py

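Such results are typically produced by one of the sampling procedures in
:obj:`Orange.evaluation.testing`. A minimal sketch (the data set and the
choice of learners here are just for illustration)::

    import Orange

    data = Orange.data.Table("voting")
    learners = [Orange.classification.bayes.NaiveLearner(),
                Orange.classification.majority.MajorityLearner()]

    # cross-validate both learners and score the collected results;
    # each scoring function returns one score per learner
    res = Orange.evaluation.testing.cross_validation(learners, data, folds=5)
    print "CA: ", Orange.evaluation.scoring.CA(res)
    print "AUC:", Orange.evaluation.scoring.AUC(res)
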
==============
Classification
==============

Calibration scores
==================
Many scores for evaluation of the classification models measure whether the
model assigns the correct class value to the test instances. Many of these
scores can be computed solely from the confusion matrix constructed manually
with the :obj:`confusion_matrices` function. If the class variable has more
than two values, the index of the value for which to compute the confusion
matrix should be passed as well.

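For example, sensitivity and specificity can be computed from such a matrix;
a minimal sketch, reusing the cross-validation results ``res`` from the
example above::

    # one confusion matrix per learner; take the first learner's
    cm = Orange.evaluation.scoring.confusion_matrices(res)[0]
    print "sensitivity:", Orange.evaluation.scoring.sens(cm)
    print "specificity:", Orange.evaluation.scoring.spec(cm)
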
.. autoclass:: CA
.. autofunction:: sens
.. autofunction:: spec
.. autofunction:: PPV
.. autofunction:: NPV
.. autofunction:: precision
.. autofunction:: recall
.. autofunction:: F1
.. autofunction:: Falpha
.. autofunction:: MCC
.. autofunction:: AP
.. autofunction:: IS
.. autofunction:: confusion_chi_square

Discriminatory scores
=====================
Scores that measure how well the prediction model separates instances of
different classes are called discriminatory scores.

.. autofunction:: Brier_score

.. autoclass:: AUC
    :members: by_weighted_pairs, by_pairs,
              weighted_one_against_all, one_against_all, single_class, pair,
              matrix

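A minimal sketch: on the (binary) cross-validation results ``res`` from the
example above, the area under the ROC curve for each learner is obtained
directly::

    print "AUC:", Orange.evaluation.scoring.AUC(res)
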
.. autofunction:: AUCWilcoxon

.. autofunction:: compute_ROC

.. autofunction:: confusion_matrices

.. autoclass:: ConfusionMatrix


Comparison of Algorithms
========================

.. autofunction:: McNemar

.. autofunction:: McNemar_of_two

==========
Regression
==========

Several alternative measures, as given below, can be used to evaluate
the success of numeric prediction:

.. image:: files/statRegression.png

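For reference, these are the standard definitions: with :math:`y_i` the true
and :math:`\hat{y}_i` the predicted value of the :math:`i`-th of :math:`n`
test instances, and :math:`\bar{y}` the mean of the true values,

.. math::

    \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad
    \mathrm{RMSE} = \sqrt{\mathrm{MSE}}, \qquad
    \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|,

.. math::

    \mathrm{RSE} = \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
    \qquad \mathrm{RRSE} = \sqrt{\mathrm{RSE}}, \qquad
    \mathrm{RAE} = \frac{\sum_i |y_i - \hat{y}_i|}{\sum_i |y_i - \bar{y}|},
    \qquad R^2 = 1 - \mathrm{RSE}.
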
.. autofunction:: MSE

.. autofunction:: RMSE

.. autofunction:: MAE

.. autofunction:: RSE

.. autofunction:: RRSE

.. autofunction:: RAE

.. autofunction:: R2

The following code (:download:`statExamples.py <code/statExamples.py>`) uses most of the above measures to
score several regression methods.

Running the script produces the following output::

    Learner   MSE     RMSE    MAE     RSE     RRSE    RAE     R2
    maj       84.585  9.197   6.653   1.002   1.001   1.001  -0.002
    rt        40.015  6.326   4.592   0.474   0.688   0.691   0.526
    knn       21.248  4.610   2.870   0.252   0.502   0.432   0.748
    lr        24.092  4.908   3.425   0.285   0.534   0.515   0.715

==================
Plotting functions
==================

.. autofunction:: graph_ranks

The following script (:download:`statExamplesGraphRanks.py <code/statExamplesGraphRanks.py>`) shows how to plot a graph:

.. literalinclude:: code/statExamplesGraphRanks.py

The code produces the following graph:

.. image:: files/statExamplesGraphRanks1.png

.. autofunction:: compute_CD

.. autofunction:: compute_friedman

=================
Utility Functions
=================

.. autofunction:: split_by_iterations

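A minimal sketch: per-fold scores can be obtained by splitting
cross-validation results into one result set per iteration, reusing the
results ``res`` from the example above::

    for fold_res in Orange.evaluation.scoring.split_by_iterations(res):
        print Orange.evaluation.scoring.CA(fold_res)
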
=====================================
Scoring for multilabel classification
=====================================

Multi-label classification requires different metrics than those used in
traditional single-label classification. This module presents the various
metrics that have been proposed in the literature. Let :math:`D` be a
multi-label evaluation data set, consisting of :math:`|D|` multi-label examples
:math:`(x_i,Y_i)`, :math:`i=1..|D|`, :math:`Y_i \subseteq L`. Let :math:`H`
be a multi-label classifier and :math:`Z_i=H(x_i)` be the set of labels
predicted by :math:`H` for example :math:`x_i`.

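With this notation, the standard definitions of the measures below (cf.
Godbole & Sarawagi, 2004; Schapire & Singer, 2000), where :math:`\triangle`
denotes the symmetric difference of two sets, are:

.. math::

    \mathrm{HammingLoss}(H,D) = \frac{1}{|D|} \sum_{i=1}^{|D|}
        \frac{|Y_i \triangle Z_i|}{|L|}, \qquad
    \mathrm{Accuracy}(H,D) = \frac{1}{|D|} \sum_{i=1}^{|D|}
        \frac{|Y_i \cap Z_i|}{|Y_i \cup Z_i|},

.. math::

    \mathrm{Precision}(H,D) = \frac{1}{|D|} \sum_{i=1}^{|D|}
        \frac{|Y_i \cap Z_i|}{|Z_i|}, \qquad
    \mathrm{Recall}(H,D) = \frac{1}{|D|} \sum_{i=1}^{|D|}
        \frac{|Y_i \cap Z_i|}{|Y_i|}.
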
.. autofunction:: mlc_hamming_loss
.. autofunction:: mlc_accuracy
.. autofunction:: mlc_precision
.. autofunction:: mlc_recall

The following script demonstrates the use of those evaluation measures:

.. literalinclude:: code/mlc-evaluate.py

The output should look like this::

    loss= [0.9375]
    accuracy= [0.875]
    precision= [1.0]
    recall= [0.875]

References
==========

Boutell, M.R., Luo, J., Shen, X. & Brown, C.M. (2004), 'Learning multi-label scene classification',
Pattern Recognition, vol. 37, no. 9, pp. 1757-1771.

Godbole, S. & Sarawagi, S. (2004), 'Discriminative Methods for Multi-labeled Classification',
Proceedings of the 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining
(PAKDD 2004).

Schapire, R.E. & Singer, Y. (2000), 'BoosTexter: a boosting-based system for text categorization',
Machine Learning, vol. 39, no. 2/3, pp. 135-168.