Files:
1 added
1 deleted
10 edited

  • Orange/evaluation/scoring.py

    r9725 r9892  
    1 """ 
    2 ############################ 
    3 Method scoring (``scoring``) 
    4 ############################ 
    5  
    6 .. index: scoring 
    7  
    8 This module contains various measures of quality for classification and 
    9 regression. Most functions require an argument named :obj:`res`, an instance of 
    10 :class:`Orange.evaluation.testing.ExperimentResults` as computed by 
    11 functions from :mod:`Orange.evaluation.testing`, containing 
    12 predictions obtained through cross-validation, 
    13 leave-one-out, testing on training data or on a separate test set. 
    14  
    15 ============== 
    16 Classification 
    17 ============== 
    18  
    19 To prepare some data for examples on this page, we shall load the voting data 
    20 set (the problem of predicting a congressman's party, Republican or Democrat, 
    21 based on a selection of votes) and evaluate a naive Bayesian learner, 
    22 a classification tree and a majority classifier using cross-validation. 
    23 For examples requiring a multivalued class problem, we shall do the same 
    24 with the vehicle data set (telling whether a vehicle described by the features 
    25 extracted from a picture is a van, bus, or Opel or Saab car). 
    26  
    27 A basic cross-validation example is shown in the following part of 
    28 :download:`statExamples.py <code/statExamples.py>` (uses :download:`voting.tab <code/voting.tab>` and :download:`vehicle.tab <code/vehicle.tab>`): 
    29  
    30 .. literalinclude:: code/statExample0.py 
    31  
    32 If instances are weighted, weights are taken into account. This can be 
    33 disabled by giving :obj:`unweighted=1` as a keyword argument. Another way of 
    34 disabling weights is to clear the :obj:`weights` flag of the 
    35 :class:`Orange.evaluation.testing.ExperimentResults` object. 
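
For instance, assuming weighted results in :obj:`res` as prepared above, a minimal
sketch of both options (:obj:`CA` is used here only as an example)::

    # ignore instance weights for this call only
    CAs = Orange.evaluation.scoring.CA(res, unweighted=1)

    # or clear the weights flag on the results object itself
    res.weights = False
    CAs = Orange.evaluation.scoring.CA(res)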
    36  
    37 General Measures of Quality 
    38 =========================== 
    39  
    40 .. autofunction:: CA 
    41  
    42 .. autofunction:: AP 
    43  
    44 .. autofunction:: Brier_score 
    45  
    46 .. autofunction:: IS 
    47  
    48 So, let's compute all of this and print it out (part of 
    49 :download:`statExamples.py <code/statExamples.py>`, which uses :download:`voting.tab <code/voting.tab>` and :download:`vehicle.tab <code/vehicle.tab>`): 
    50  
    51 .. literalinclude:: code/statExample1.py 
    52    :lines: 13- 
    53  
    54 The output should look like this:: 
    55  
    56     method  CA      AP      Brier    IS 
    57     bayes   0.903   0.902   0.175    0.759 
    58     tree    0.846   0.845   0.286    0.641 
    59     majrty  0.614   0.526   0.474   -0.000 
    60  
    61 Script :download:`statExamples.py <code/statExamples.py>` contains another example that also prints out  
    62 the standard errors. 
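
A minimal sketch of the same idea: the scoring functions accept a ``report_se``
keyword (``reportSE`` in older versions), in which case each score is returned
as a ``(value, standard_error)`` pair::

    CAs = Orange.evaluation.scoring.CA(res, report_se=True)
    for l, (ca, se) in zip(learners, CAs):
        print "%-8s %5.3f +- %5.3f" % (l.name, ca, se)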
    63  
    64 Confusion Matrix 
    65 ================ 
    66  
    67 .. autofunction:: confusion_matrices 
    68  
    69    **A positive-negative confusion matrix** is computed (a) if the class is 
    70    binary, unless the :obj:`classIndex` argument is -2, or (b) if the class is 
    71    multivalued and :obj:`classIndex` is non-negative. The argument 
    72    :obj:`classIndex` then tells which class is positive. In case (a), 
    73    :obj:`classIndex` may be omitted; the first class 
    74    is then negative and the second is positive, unless the :obj:`baseClass` 
    75    attribute of the results object has a non-negative value. In that case, 
    76    :obj:`baseClass` is the index of the target class; the :obj:`baseClass` 
    77    attribute of the results object should be set manually. The result of the 
    78    function is a list of instances of class :class:`ConfusionMatrix`, 
    79    containing the (weighted) number of true positives (TP), false 
    80    negatives (FN), false positives (FP) and true negatives (TN). 
    81     
    82    We can also add the keyword argument :obj:`cutoff` 
    83    (e.g. ``confusion_matrices(results, cutoff=0.3)``); if we do, :obj:`confusion_matrices` 
    84    will disregard the classifiers' class predictions and observe the predicted 
    85    probabilities, and consider the prediction "positive" if the predicted 
    86    probability of the positive class is higher than the :obj:`cutoff`. 
    87  
    88    The example (part of :download:`statExamples.py <code/statExamples.py>`) below shows how setting the 
    89    cutoff threshold from the default 0.5 to 0.2 affects the confusion matrices 
    90    for the naive Bayesian classifier:: 
    91     
    92        cm = Orange.evaluation.scoring.confusion_matrices(res)[0] 
    93        print "Confusion matrix for naive Bayes:" 
    94        print "TP: %i, FP: %i, FN: %s, TN: %i" % (cm.TP, cm.FP, cm.FN, cm.TN) 
    95         
    96        cm = Orange.evaluation.scoring.confusion_matrices(res, cutoff=0.2)[0] 
    97        print "Confusion matrix for naive Bayes:" 
    98        print "TP: %i, FP: %i, FN: %s, TN: %i" % (cm.TP, cm.FP, cm.FN, cm.TN) 
    99  
    100    The output:: 
    101     
    102        Confusion matrix for naive Bayes: 
    103        TP: 238, FP: 13, FN: 29.0, TN: 155 
    104        Confusion matrix for naive Bayes: 
    105        TP: 239, FP: 18, FN: 28.0, TN: 150 
    106     
    107    shows that the number of true positives increases (and hence the number of 
    108    false negatives decreases) by only a single instance, while five instances 
    109    that were originally true negatives become false positives due to the 
    110    lower threshold. 
    111     
    112    To observe how good the classifiers are at detecting vans in the vehicle 
    113    data set, we would compute the matrix like this:: 
    114     
    115       cm = Orange.evaluation.scoring.confusion_matrices(resVeh, \ 
    116 vehicle.domain.classVar.values.index("van")) 
    117     
    118    and get the results like these:: 
    119     
    120        TP: 189, FP: 241, FN: 10.0, TN: 406 
    121     
    122    while the same for class "opel" would give:: 
    123     
    124        TP: 86, FP: 112, FN: 126.0, TN: 522 
    125         
    126    The main difference is that there are only a few false negatives for the 
    127    van, meaning that the classifier seldom misses it (if it says it's not a 
    128    van, it's almost certainly not a van). Not so for the Opel car, where the 
    129    classifier missed 126 of them and correctly detected only 86. 
    130     
    131    **A general confusion matrix** is computed (a) in the case of a binary class, 
    132    when :obj:`classIndex` is set to -2, or (b) when we have a multivalued class and 
    133    the caller doesn't specify the :obj:`classIndex` of the positive class. 
    134    When called in this manner, the function cannot use the argument 
    135    :obj:`cutoff`. 
    136     
    137    The function then returns a three-dimensional matrix, where the element 
    138    A[:obj:`learner`][:obj:`actual_class`][:obj:`predictedClass`] 
    139    gives the number of instances belonging to 'actual_class' for which the 
    140    'learner' predicted 'predictedClass'. We shall compute and print out 
    141    the matrix for the naive Bayesian classifier. 
    142     
    143    Here we see another example from :download:`statExamples.py <code/statExamples.py>`:: 
    144     
    145        cm = Orange.evaluation.scoring.confusion_matrices(resVeh)[0] 
    146        classes = vehicle.domain.classVar.values 
    147        print "\t"+"\t".join(classes) 
    148        for className, classConfusions in zip(classes, cm): 
    149            print ("%s" + ("\t%i" * len(classes))) % ((className, ) + tuple(classConfusions)) 
    150     
    151    So, here's what this nice piece of code gives:: 
    152     
    153               bus   van  saab opel 
    154        bus     56   95   21   46 
    155        van     6    189  4    0 
    156        saab    3    75   73   66 
    157        opel    4    71   51   86 
    158         
    159    Vans are clearly the easy case: 189 vans were classified as vans (we know this 
    160    already, we've printed it out above), and the 10 misclassified pictures 
    161    were classified as buses (6) and Saab cars (4). In all other classes, 
    162    there were more instances misclassified as vans than correctly classified 
    163    instances. The classifier is obviously quite biased towards vans. 
    164     
    165    .. method:: sens(confm)  
    166    .. method:: spec(confm) 
    167    .. method:: PPV(confm) 
    168    .. method:: NPV(confm) 
    169    .. method:: precision(confm) 
    170    .. method:: recall(confm) 
    171    .. method:: F2(confm) 
    172    .. method:: Falpha(confm, alpha=2.0) 
    173    .. method:: MCC(conf) 
    174  
    175    With the confusion matrix defined in terms of positive and negative 
    176    classes, you can also compute the  
    177    `sensitivity <http://en.wikipedia.org/wiki/Sensitivity_(tests)>`_ 
    178    [TP/(TP+FN)], `specificity \ 
    179 <http://en.wikipedia.org/wiki/Specificity_%28tests%29>`_ 
    180    [TN/(TN+FP)], `positive predictive value \ 
    181 <http://en.wikipedia.org/wiki/Positive_predictive_value>`_ 
    182    [TP/(TP+FP)] and `negative predictive value \ 
    183 <http://en.wikipedia.org/wiki/Negative_predictive_value>`_ [TN/(TN+FN)].  
    184    In information retrieval, positive predictive value is called precision 
    185    (the ratio of the number of relevant records retrieved to the total number 
    186    of irrelevant and relevant records retrieved), and sensitivity is called 
    187    `recall <http://en.wikipedia.org/wiki/Information_retrieval>`_  
    188    (the ratio of the number of relevant records retrieved to the total number 
    189    of relevant records in the database). The  
    190    `harmonic mean <http://en.wikipedia.org/wiki/Harmonic_mean>`_ of precision 
    191    and recall is called an  
    192    `F-measure <http://en.wikipedia.org/wiki/F-measure>`_, which, depending 
    193    on the relative weight given to precision and recall, is implemented 
    194    as F1 [2*precision*recall/(precision+recall)] or, for the general case, as 
    195    Falpha [(1+alpha)*precision*recall / (alpha*precision + recall)]. 
    196    The `Matthews correlation coefficient \ 
    197 <http://en.wikipedia.org/wiki/Matthews_correlation_coefficient>`_ 
    198    is in essence a correlation coefficient between 
    199    the observed and predicted binary classifications; it returns a value 
    200    between -1 and +1. A coefficient of +1 represents a perfect prediction, 
    201    0 an average random prediction and -1 an inverse prediction. 
    202     
    203    If the argument :obj:`confm` is a single confusion matrix, a single 
    204    result (a number) is returned. If confm is a list of confusion matrices, 
    205    a list of scores is returned, one for each confusion matrix. 
    206     
    207    Note that weights are taken into account when computing the matrix, so 
    208    these functions don't check the 'weighted' keyword argument. 
    209     
    210    Let us print out sensitivities and specificities of our classifiers in 
    211    part of :download:`statExamples.py <code/statExamples.py>`:: 
    212     
    213        cm = Orange.evaluation.scoring.confusion_matrices(res) 
    214        print 
    215        print "method\tsens\tspec" 
    216        for l in range(len(learners)): 
    217            print "%s\t%5.3f\t%5.3f" % (learners[l].name, Orange.evaluation.scoring.sens(cm[l]), Orange.evaluation.scoring.spec(cm[l])) 
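
   In the same way, the precision, recall and F-measure functions listed above
   can be applied to the same matrices; a hedged sketch::

       cm = Orange.evaluation.scoring.confusion_matrices(res)
       print "method\tprec\trecall\tF1"
       for l in range(len(learners)):
           # alpha=1 reduces Falpha to F1 by the formula above
           print "%s\t%5.3f\t%5.3f\t%5.3f" % (learners[l].name,
               Orange.evaluation.scoring.precision(cm[l]),
               Orange.evaluation.scoring.recall(cm[l]),
               Orange.evaluation.scoring.Falpha(cm[l], alpha=1.0))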
    218     
    219 ROC Analysis 
    220 ============ 
    221  
    222 `Receiver Operating Characteristic \ 
    223 <http://en.wikipedia.org/wiki/Receiver_operating_characteristic>`_  
    224 (ROC) analysis was initially developed for 
    225 binary class problems and there is no consensus on how to apply it to 
    226 multi-class problems, nor do we know for sure how to do ROC analysis after 
    227 cross validation and similar multiple sampling techniques. If you are 
    228 interested in the area under the curve, function AUC will deal with those 
    229 problems as specifically described below. 
    230  
    231 .. autofunction:: AUC 
    232     
    233    .. attribute:: AUC.ByWeightedPairs (or 0) 
    234        
    235       Computes AUC for each pair of classes (ignoring instances of all other 
    236       classes) and averages the results, weighting them by the number of 
    237       pairs of instances from these two classes (e.g. by the product of 
    238       probabilities of the two classes). AUC computed in this way still 
    239       behaves as a concordance index, i.e., it gives the probability that two 
    240       randomly chosen instances from different classes will be correctly 
    241       recognized (this is of course true only if the classifier knows 
    242       from which two classes the instances came). 
    243     
    244    .. attribute:: AUC.ByPairs (or 1) 
    245     
    246       Similar to the above, except that the average over class pairs is not 
    247       weighted. This AUC is, like the binary AUC, independent of class 
    248       distributions, but it is no longer related to the concordance index. 
    249        
    250    .. attribute:: AUC.WeightedOneAgainstAll (or 2) 
    251        
    252       For each class, it computes AUC for this class against all others (that 
    253       is, treating the other classes as one class). The AUCs are then averaged, 
    254       weighted by the class probabilities. This is related to the concordance 
    255       index, in which we test the classifier's (average) capability of distinguishing 
    256       instances of a specified class from those of the other classes. 
    257       Unlike the binary AUC, the measure is not independent of class 
    258       distributions. 
    259        
    260    .. attribute:: AUC.OneAgainstAll (or 3) 
    261     
    262       As above, except that the average is not weighted. 
    263     
    264    In case of multiple folds (for instance if the data comes from cross 
    265    validation), the computation goes like this. When computing the partial 
    266    AUCs for individual pairs of classes or singled-out classes, AUC is 
    267    computed for each fold separately and then averaged (ignoring the number 
    268    of instances in each fold, it's just a simple average). However, if a 
    269    certain fold doesn't contain any instances of a certain class (from the 
    270    pair), the partial AUC is computed treating the results as if they came 
    271    from a single fold. This is not really correct since the class 
    272    probabilities from different folds are not necessarily comparable, 
    273    yet since this will most often occur in leave-one-out experiments, 
    274    comparability shouldn't be a problem. 
    275     
    276    Computing and printing out the AUCs looks just like printing out 
    277    classification accuracies (except that we call AUC instead of 
    278    CA, of course):: 
    279     
    280        AUCs = Orange.evaluation.scoring.AUC(res) 
    281        for l in range(len(learners)): 
    282            print "%10s: %5.3f" % (learners[l].name, AUCs[l]) 
    283             
    284    For the vehicle data, you can run exactly the same code; it will compute AUCs 
    285    for all pairs of classes and return the average weighted by probabilities 
    286    of pairs. Or, you can specify the averaging method yourself, like this:: 
    287     
    288        AUCs = Orange.evaluation.scoring.AUC(resVeh, Orange.evaluation.scoring.AUC.WeightedOneAgainstAll) 
    289     
    290    The following snippet tries out all four. (We don't claim that this is 
    291    how the function needs to be used; it's better to stay with the default.):: 
    292     
    293        methods = ["by pairs, weighted", "by pairs", "one vs. all, weighted", "one vs. all"] 
    294        print " " *25 + "  \tbayes\ttree\tmajority" 
    295        for i in range(4): 
    296            AUCs = Orange.evaluation.scoring.AUC(resVeh, i) 
    297            print "%25s: \t%5.3f\t%5.3f\t%5.3f" % ((methods[i], ) + tuple(AUCs)) 
    298     
    299    As you can see from the output:: 
    300     
    301                                    bayes   tree    majority 
    302               by pairs, weighted:  0.789   0.871   0.500 
    303                         by pairs:  0.791   0.872   0.500 
    304            one vs. all, weighted:  0.783   0.800   0.500 
    305                      one vs. all:  0.783   0.800   0.500 
    306  
    307 .. autofunction:: AUC_single 
    308  
    309 .. autofunction:: AUC_pair 
    310  
    311 .. autofunction:: AUC_matrix 
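
For instance, to single out the class "van" in the vehicle data (following the
confusion matrix example above), a sketch::

    AUCs = Orange.evaluation.scoring.AUC_single(resVeh,
        vehicle.domain.classVar.values.index("van"))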
    312  
    313 The remaining functions, which plot the curves and statistically compare 
    314 them, require that the results come from a test with a single iteration, 
    315 and they always compare one chosen class against all others. If you have 
    316 cross validation results, you can either use split_by_iterations to split the 
    317 results by folds, call the function for each fold separately and then sum 
    318 the results up however you see fit, or you can set the ExperimentResults' 
    319 attribute number_of_iterations to 1, to cheat the function - at your own 
    320 responsibility for the statistical correctness. Regarding multi-class 
    321 problems, if you don't choose a specific class, Orange.evaluation.scoring will use the class 
    322 attribute's baseValue at the time when results were computed. If baseValue 
    323 was not given at that time, 1 (that is, the second class) is used as the default. 
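
For example, a minimal sketch of the first option, evaluating each fold
separately with :obj:`split_by_iterations`::

    folds = Orange.evaluation.scoring.split_by_iterations(res)
    per_fold = [Orange.evaluation.scoring.AUCWilcoxon(fold) for fold in folds]
    # aggregate the per-fold results however you see fit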
    324  
    325 We shall use the following code to prepare suitable experimental results:: 
    326  
    327     ri2 = Orange.core.MakeRandomIndices2(voting, 0.6) 
    328     train = voting.selectref(ri2, 0) 
    329     test = voting.selectref(ri2, 1) 
    330     res1 = Orange.evaluation.testing.learnAndTestOnTestData(learners, train, test) 
    331  
    332  
    333 .. autofunction:: AUCWilcoxon 
    334  
    335 .. autofunction:: compute_ROC 
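
With the single-split results ``res1`` prepared above, a sketch of obtaining the
ROC points (one curve per learner, as (1-specificity, sensitivity) pairs)::

    curves = Orange.evaluation.scoring.compute_ROC(res1)
    bayes_curve = curves[0]   # the first learner's curve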
    336  
    337 Comparison of Algorithms 
    338 ------------------------ 
    339  
    340 .. autofunction:: McNemar 
    341  
    342 .. autofunction:: McNemar_of_two 
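
A sketch of comparing two of the learners evaluated above; :obj:`McNemar_of_two`
takes the results object and the indices of the two learners::

    mcnemar_01 = Orange.evaluation.scoring.McNemar_of_two(res1, 0, 1)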
    343  
    344 ========== 
    345 Regression 
    346 ========== 
    347  
    348 General Measures of Quality 
    349 =========================== 
    350  
    351 Several alternative measures, as given below, can be used to evaluate 
    352 the success of numeric prediction: 
    353  
    354 .. image:: files/statRegression.png 
    355  
    356 .. autofunction:: MSE 
    357  
    358 .. autofunction:: RMSE 
    359  
    360 .. autofunction:: MAE 
    361  
    362 .. autofunction:: RSE 
    363  
    364 .. autofunction:: RRSE 
    365  
    366 .. autofunction:: RAE 
    367  
    368 .. autofunction:: R2 
    369  
    370 The following code (:download:`statExamples.py <code/statExamples.py>`) uses most of the above measures to 
    371 score several regression methods. 
    372  
    373 .. literalinclude:: code/statExamplesRegression.py 
    374  
    375 The code above produces the following output:: 
    376  
    377     Learner   MSE     RMSE    MAE     RSE     RRSE    RAE     R2 
    378     maj       84.585  9.197   6.653   1.002   1.001   1.001  -0.002 
    379     rt        40.015  6.326   4.592   0.474   0.688   0.691   0.526 
    380     knn       21.248  4.610   2.870   0.252   0.502   0.432   0.748 
    381     lr        24.092  4.908   3.425   0.285   0.534   0.515   0.715 
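
The individual measures can also be called directly; a sketch, assuming the
script stores its evaluation results in ``res`` and its learners in ``learners``
(names assumed here), with one score per learner returned by each function::

    RMSEs = Orange.evaluation.scoring.RMSE(res)
    MAEs = Orange.evaluation.scoring.MAE(res)
    for l, rmse, mae in zip(learners, RMSEs, MAEs):
        print "%-8s RMSE: %6.3f  MAE: %6.3f" % (l.name, rmse, mae)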
    382      
    383 ================== 
    384 Plotting functions 
    385 ================== 
    386  
    387 .. autofunction:: graph_ranks 
    388  
    389 The following script (:download:`statExamplesGraphRanks.py <code/statExamplesGraphRanks.py>`) shows how to plot a graph: 
    390  
    391 .. literalinclude:: code/statExamplesGraphRanks.py 
    392  
    393 The code produces the following graph: 
    394  
    395 .. image:: files/statExamplesGraphRanks1.png 
    396  
    397 .. autofunction:: compute_CD 
    398  
    399 .. autofunction:: compute_friedman 
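
A hedged sketch of how the two might be combined with average ranks collected
over several data sets (the argument order, average ranks followed by the number
of data sets, is assumed from the plotting script above)::

    avranks = [1.9, 3.2, 2.8, 3.3]   # average ranks of four methods
    cd = Orange.evaluation.scoring.compute_CD(avranks, 30)             # 30 data sets
    friedman = Orange.evaluation.scoring.compute_friedman(avranks, 30)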
    400  
    401 ================= 
    402 Utility Functions 
    403 ================= 
    404  
    405 .. autofunction:: split_by_iterations 
    406  
    407 ===================================== 
    408 Scoring for multilabel classification 
    409 ===================================== 
    410  
    411 Multi-label classification requires different metrics than those used in traditional single-label 
    412 classification. This module presents the various metrics that have been proposed in the literature. 
    413 Let :math:`D` be a multi-label evaluation data set, consisting of :math:`|D|` multi-label examples 
    414 :math:`(x_i,Y_i)`, :math:`i=1..|D|`, :math:`Y_i \\subseteq L`. Let :math:`H` be a multi-label classifier  
    415 and :math:`Z_i=H(x_i)` be the set of labels predicted by :math:`H` for example :math:`x_i`. 
    416  
    417 .. autofunction:: mlc_hamming_loss  
    418 .. autofunction:: mlc_accuracy 
    419 .. autofunction:: mlc_precision 
    420 .. autofunction:: mlc_recall 
    421  
    422 So, let's compute all this and print it out (part of 
    423 :download:`mlc-evaluate.py <code/mlc-evaluate.py>`, uses 
    424 :download:`emotions.tab <code/emotions.tab>`): 
    425  
    426 .. literalinclude:: code/mlc-evaluate.py 
    427    :lines: 1-15 
    428  
    429 The output should look like this:: 
    430  
    431     loss= [0.9375] 
    432     accuracy= [0.875] 
    433     precision= [1.0] 
    434     recall= [0.875] 
    435  
    436 References 
    437 ========== 
    438  
    439 Boutell, M.R., Luo, J., Shen, X. & Brown, C.M. (2004), 'Learning multi-label scene classification', 
    440 Pattern Recognition, vol. 37, no. 9, pp. 1757-1771. 
    441  
    442 Godbole, S. & Sarawagi, S. (2004), 'Discriminative Methods for Multi-labeled Classification', in 
    443 Proceedings of the 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining 
    444 (PAKDD 2004). 
    445   
    446 Schapire, R.E. & Singer, Y. (2000), 'BoosTexter: a boosting-based system for text categorization', 
    447 Machine Learning, vol. 39, no. 2/3, pp. 135-168. 
    448  
    449 """ 
    450  
    4511import operator, math 
    4522from operator import add 
     
    4555import Orange 
    4566from Orange import statc 
    457  
     7from Orange.misc import deprecated_keywords 
    4588 
    4599#### Private stuff 
     
    53383 
    53484 
    535 def statistics_by_folds(stats, foldN, reportSE, iterationIsOuter): 
     85@deprecated_keywords({ 
     86    "foldN": "fold_n", 
     87    "reportSE": "report_se", 
     88    "iterationIsOuter": "iteration_is_outer"}) 
     89def statistics_by_folds(stats, fold_n, report_se, iteration_is_outer): 
    53690    # remove empty folds, turn the matrix so that learner is outer 
    537     if iterationIsOuter: 
     91    if iteration_is_outer: 
    53892        if not stats: 
    53993            raise ValueError, "Cannot compute the score: no examples or sum of weights is 0.0." 
    54094        number_of_learners = len(stats[0]) 
    541         stats = filter(lambda (x, fN): fN>0.0, zip(stats,foldN)) 
     95        stats = filter(lambda (x, fN): fN>0.0, zip(stats,fold_n)) 
    54296        stats = [ [x[lrn]/fN for x, fN in stats] for lrn in range(number_of_learners)] 
    54397    else: 
    544         stats = [ [x/Fn for x, Fn in filter(lambda (x, Fn): Fn > 0.0, zip(lrnD, foldN))] for lrnD in stats] 
     98        stats = [ [x/Fn for x, Fn in filter(lambda (x, Fn): Fn > 0.0, zip(lrnD, fold_n))] for lrnD in stats] 
    54599 
    546100    if not stats: 
     
    549103        raise ValueError, "Cannot compute the score: no examples or sum of weights is 0.0." 
    550104     
    551     if reportSE: 
     105    if report_se: 
    552106        return [(statc.mean(x), statc.sterr(x)) for x in stats] 
    553107    else: 
     
    751305# Scores for evaluation of classifiers 
    752306 
    753 def CA(res, reportSE = False, **argkw): 
     307@deprecated_keywords({"reportSE": "report_se"}) 
     308def CA(res, report_se = False, **argkw): 
    754309    """ Computes classification accuracy, i.e. percentage of matches between 
    755310    predicted and actual class. The function returns a list of classification 
     
    793348            ca = [x/totweight for x in CAs] 
    794349             
    795         if reportSE: 
     350        if report_se: 
    796351            return [(x, x*(1-x)/math.sqrt(totweight)) for x in ca] 
    797352        else: 
     
    813368                foldN[tex.iteration_number] += tex.weight 
    814369 
    815         return statistics_by_folds(CAsByFold, foldN, reportSE, False) 
     370        return statistics_by_folds(CAsByFold, foldN, report_se, False) 
    816371 
    817372 
     
    820375    return CA(res, True, **argkw) 
    821376 
    822  
    823 def AP(res, reportSE = False, **argkw): 
     377@deprecated_keywords({"reportSE": "report_se"}) 
     378def AP(res, report_se = False, **argkw): 
    824379    """ Computes the average probability assigned to the correct class. """ 
    825380    if res.number_of_iterations == 1: 
     
    848403            foldN[tex.iteration_number] += tex.weight 
    849404 
    850     return statistics_by_folds(APsByFold, foldN, reportSE, True) 
    851  
    852  
    853 def Brier_score(res, reportSE = False, **argkw): 
     405    return statistics_by_folds(APsByFold, foldN, report_se, True) 
     406 
     407 
     408@deprecated_keywords({"reportSE": "report_se"}) 
     409def Brier_score(res, report_se = False, **argkw): 
    854410    """ Computes the Brier score, defined as the average (over test examples) 
    855411    of sum_x (t(x) - p(x))^2, where x is a class, t(x) is 1 for the correct class 
     
    881437            totweight = gettotweight(res) 
    882438        check_non_zero(totweight) 
    883         if reportSE: 
     439        if report_se: 
    884440            return [(max(x/totweight+1.0, 0), 0) for x in MSEs]  ## change this, not zero!!! 
    885441        else: 
     
    900456            foldN[tex.iteration_number] += tex.weight 
    901457 
    902     stats = statistics_by_folds(BSs, foldN, reportSE, True) 
    903     if reportSE: 
     458    stats = statistics_by_folds(BSs, foldN, report_se, True) 
     459    if report_se: 
    904460        return [(x+1.0, y) for x, y in stats] 
    905461    else: 
     
    915471    else: 
    916472        return -(-log2(1-P)+log2(1-Pc)) 
    917      
    918 def IS(res, apriori=None, reportSE = False, **argkw): 
     473 
     474 
     475@deprecated_keywords({"reportSE": "report_se"}) 
     476def IS(res, apriori=None, report_se = False, **argkw): 
    919477    """ Computes the information score as defined by  
    920478    `Kononenko and Bratko (1991) \ 
     
    941499                    ISs[i] += IS_ex(tex.probabilities[i][cls], apriori[cls]) * tex.weight 
    942500            totweight = gettotweight(res) 
    943         if reportSE: 
     501        if report_se: 
    944502            return [(IS/totweight,0) for IS in ISs] 
    945503        else: 
     
    964522            foldN[tex.iteration_number] += tex.weight 
    965523 
    966     return statistics_by_folds(ISs, foldN, reportSE, False) 
     524    return statistics_by_folds(ISs, foldN, report_se, False) 
    967525 
    968526 
     
    1026584 
    1027585 
    1028 def confusion_matrices(res, classIndex=-1, **argkw): 
     586@deprecated_keywords({"classIndex": "class_index"}) 
     587def confusion_matrices(res, class_index=-1, **argkw): 
    1029588    """ This function can compute two different forms of confusion matrix: 
    1030589    one in which a certain class is marked as positive and the other(s) 
     
    1035594    tfpns = [ConfusionMatrix() for i in range(res.number_of_learners)] 
    1036595     
    1037     if classIndex<0: 
     596    if class_index<0: 
    1038597        numberOfClasses = len(res.class_values) 
    1039         if classIndex < -1 or numberOfClasses > 2: 
     598        if class_index < -1 or numberOfClasses > 2: 
    1040599            cm = [[[0.0] * numberOfClasses for i in range(numberOfClasses)] for l in range(res.number_of_learners)] 
    1041600            if argkw.get("unweighted", 0) or not res.weights: 
     
    1056615             
    1057616        elif res.baseClass>=0: 
    1058             classIndex = res.baseClass 
    1059         else: 
    1060             classIndex = 1 
     617            class_index = res.baseClass 
     618        else: 
     619            class_index = 1 
    1061620             
    1062621    cutoff = argkw.get("cutoff") 
     
    1064623        if argkw.get("unweighted", 0) or not res.weights: 
    1065624            for lr in res.results: 
    1066                 isPositive=(lr.actual_class==classIndex) 
     625                isPositive=(lr.actual_class==class_index) 
    1067626                for i in range(res.number_of_learners): 
    1068                     tfpns[i].addTFPosNeg(lr.probabilities[i][classIndex]>cutoff, isPositive) 
     627                    tfpns[i].addTFPosNeg(lr.probabilities[i][class_index]>cutoff, isPositive) 
    1069628        else: 
    1070629            for lr in res.results: 
    1071                 isPositive=(lr.actual_class==classIndex) 
     630                isPositive=(lr.actual_class==class_index) 
    1072631                for i in range(res.number_of_learners): 
    1073                     tfpns[i].addTFPosNeg(lr.probabilities[i][classIndex]>cutoff, isPositive, lr.weight) 
     632                    tfpns[i].addTFPosNeg(lr.probabilities[i][class_index]>cutoff, isPositive, lr.weight) 
    1074633    else: 
    1075634        if argkw.get("unweighted", 0) or not res.weights: 
    1076635            for lr in res.results: 
    1077                 isPositive=(lr.actual_class==classIndex) 
     636                isPositive=(lr.actual_class==class_index) 
    1078637                for i in range(res.number_of_learners): 
    1079                     tfpns[i].addTFPosNeg(lr.classes[i]==classIndex, isPositive) 
     638                    tfpns[i].addTFPosNeg(lr.classes[i]==class_index, isPositive) 
    1080639        else: 
    1081640            for lr in res.results: 
    1082                 isPositive=(lr.actual_class==classIndex) 
     641                isPositive=(lr.actual_class==class_index) 
    1083642                for i in range(res.number_of_learners): 
    1084                     tfpns[i].addTFPosNeg(lr.classes[i]==classIndex, isPositive, lr.weight) 
     643                    tfpns[i].addTFPosNeg(lr.classes[i]==class_index, isPositive, lr.weight) 
    1085644    return tfpns 
    1086645 
     
    1090649 
    1091650 
    1092 def confusion_chi_square(confusionMatrix): 
    1093     dim = len(confusionMatrix) 
    1094     rowPriors = [sum(r) for r in confusionMatrix] 
    1095     colPriors = [sum([r[i] for r in confusionMatrix]) for i in range(dim)] 
     651@deprecated_keywords({"confusionMatrix": "confusion_matrix"}) 
     652def confusion_chi_square(confusion_matrix): 
     653    dim = len(confusion_matrix) 
     654    rowPriors = [sum(r) for r in confusion_matrix] 
     655    colPriors = [sum([r[i] for r in confusion_matrix]) for i in range(dim)] 
    1096656    total = sum(rowPriors) 
    1097657    rowPriors = [r/total for r in rowPriors] 
    1098658    colPriors = [r/total for r in colPriors] 
    1099659    ss = 0 
    1100     for ri, row in enumerate(confusionMatrix): 
     660    for ri, row in enumerate(confusion_matrix): 
    1101661        for ci, o in enumerate(row): 
    1102662            e = total * rowPriors[ri] * colPriors[ci] 
     
    1229789    return r 
    1230790 
    1231 def scotts_pi(confm, bIsListOfMatrices=True): 
     791 
     792@deprecated_keywords({"bIsListOfMatrices": "b_is_list_of_matrices"}) 
     793def scotts_pi(confm, b_is_list_of_matrices=True): 
    1232794   """Compute Scott's Pi for measuring inter-rater agreement for nominal data 
    1233795 
     
    1240802                           Orange.evaluation.scoring.compute_confusion_matrices and set the 
    1241803                           classIndex parameter to -2. 
    1242    @param bIsListOfMatrices: specifies whether confm is list of matrices. 
     804   @param b_is_list_of_matrices: specifies whether confm is list of matrices. 
    1243805                           This function needs to operate on non-binary 
    1244806                           confusion matrices, which are represented by python 
     
    1247809   """ 
    1248810 
    1249    if bIsListOfMatrices: 
     811   if b_is_list_of_matrices: 
    1250812       try: 
    1251            return [scotts_pi(cm, bIsListOfMatrices=False) for cm in confm] 
     813           return [scotts_pi(cm, b_is_list_of_matrices=False) for cm in confm] 
    1252814       except TypeError: 
    1253815           # Nevermind the parameter, maybe this is a "conventional" binary 
     
    1276838       return ret 
    1277839 
    1278 def AUCWilcoxon(res, classIndex=-1, **argkw): 
     840@deprecated_keywords({"classIndex": "class_index"}) 
     841def AUCWilcoxon(res, class_index=-1, **argkw): 
    1279842    """ Computes the area under ROC (AUC) and its standard error using 
    1280843    Wilcoxon's approach proposed by Hanley and McNeil (1982). If  
     
    1285848    import corn 
    1286849    useweights = res.weights and not argkw.get("unweighted", 0) 
    1287     problists, tots = corn.computeROCCumulative(res, classIndex, useweights) 
     850    problists, tots = corn.computeROCCumulative(res, class_index, useweights) 
    1288851 
    1289852    results=[] 
     
    1313876AROC = AUCWilcoxon # for backward compatibility, AROC is obsolete 
    1314877 
    1315 def compare_2_AUCs(res, lrn1, lrn2, classIndex=-1, **argkw): 
     878 
     879@deprecated_keywords({"classIndex": "class_index"}) 
     880def compare_2_AUCs(res, lrn1, lrn2, class_index=-1, **argkw): 
    1316881    import corn 
    1317     return corn.compare2ROCs(res, lrn1, lrn2, classIndex, res.weights and not argkw.get("unweighted")) 
     882    return corn.compare2ROCs(res, lrn1, lrn2, class_index, res.weights and not argkw.get("unweighted")) 
    1318883 
    1319884compare_2_AROCs = compare_2_AUCs # for backward compatibility, compare_2_AROCs is obsolete 
    1320885 
    1321      
    1322 def compute_ROC(res, classIndex=-1): 
     886 
     887@deprecated_keywords({"classIndex": "class_index"}) 
     888def compute_ROC(res, class_index=-1): 
    1323889    """ Computes a ROC curve as a list of (x, y) tuples, where x is  
    1324890    1-specificity and y is sensitivity. 
    1325891    """ 
    1326892    import corn 
    1327     problists, tots = corn.computeROCCumulative(res, classIndex) 
     893    problists, tots = corn.computeROCCumulative(res, class_index) 
    1328894 
    1329895    results = [] 
     
    1357923    return (P1y - P2y) / (P1x - P2x) 
    1358924 
    1359 def ROC_add_point(P, R, keepConcavities=1): 
     925 
     926@deprecated_keywords({"keepConcavities": "keep_concavities"}) 
     927def ROC_add_point(P, R, keep_concavities=1): 
    1360     if keepConcavities: 
     928    if keep_concavities: 
    1361929        R.append(P) 
     
    1374942    return R 
    1375943 
    1376 def TC_compute_ROC(res, classIndex=-1, keepConcavities=1): 
     944 
     945@deprecated_keywords({"classIndex": "class_index", 
     946                      "keepConcavities": "keep_concavities"}) 
     947def TC_compute_ROC(res, class_index=-1, keep_concavities=1): 
    1377948    import corn 
    1378     problists, tots = corn.computeROCCumulative(res, classIndex) 
     949    problists, tots = corn.computeROCCumulative(res, class_index) 
    1379950 
    1380951    results = [] 
     
    1399970                else: 
    1400971                    fpr = 0.0 
    1401                 curve = ROC_add_point((fpr, tpr, fPrev), curve, keepConcavities) 
     972                curve = ROC_add_point((fpr, tpr, fPrev), curve, keep_concavities) 
    1402973                fPrev = f 
    1403974            thisPos, thisNeg = prob[1][1], prob[1][0] 
     
    1412983        else: 
    1413984            fpr = 0.0 
    1414         curve = ROC_add_point((fpr, tpr, f), curve, keepConcavities) ## ugly 
     985        curve = ROC_add_point((fpr, tpr, f), curve, keep_concavities) ## ugly 
    1415986        results.append(curve) 
    1416987 
     
    14721043## for each (sub)set of input ROC curves 
    14731044## returns the average ROC curve and an array of (vertical) standard deviations 
    1474 def TC_vertical_average_ROC(ROCcurves, samples = 10): 
     1045@deprecated_keywords({"ROCcurves": "roc_curves"}) 
     1046def TC_vertical_average_ROC(roc_curves, samples = 10): 
    14751047    def INTERPOLATE((P1x, P1y, P1fscore), (P2x, P2y, P2fscore), X): 
    14761048        if (P1x == P2x) or ((X > P1x) and (X > P2x)) or ((X < P1x) and (X < P2x)): 
     
    15011073    average = [] 
    15021074    stdev = [] 
    1503     for ROCS in ROCcurves: 
     1075    for ROCS in roc_curves: 
    15041076        npts = [] 
    15051077        for c in ROCS: 
     
    15311103## for each (sub)set of input ROC curves 
    15321104## returns the average ROC curve, an array of vertical standard deviations and an array of horizontal standard deviations 
    1533 def TC_threshold_average_ROC(ROCcurves, samples = 10): 
     1105@deprecated_keywords({"ROCcurves": "roc_curves"}) 
     1106def TC_threshold_average_ROC(roc_curves, samples = 10): 
    15341107    def POINT_AT_THRESH(ROC, npts, thresh): 
    15351108        i = 0 
     
    15451118    stdevV = [] 
    15461119    stdevH = [] 
    1547     for ROCS in ROCcurves: 
     1120    for ROCS in roc_curves: 
    15481121        npts = [] 
    15491122        for c in ROCS: 
     
    15961169##  - yesClassRugPoints is an array of (x, 1) points 
    15971170##  - noClassRugPoints is an array of (x, 0) points 
    1598 def compute_calibration_curve(res, classIndex=-1): 
     1171@deprecated_keywords({"classIndex": "class_index"}) 
     1172def compute_calibration_curve(res, class_index=-1): 
    15991173    import corn 
    16001174    ## merge multiple iterations into one 
     
    16031177        mres.results.append( te ) 
    16041178 
    1605     problists, tots = corn.computeROCCumulative(mres, classIndex) 
     1179    problists, tots = corn.computeROCCumulative(mres, class_index) 
    16061180 
    16071181    results = [] 
     
    16581232## returns an array of curve elements, where: 
    16591233##  - curve is an array of points ((TP+FP)/(P + N), TP/P, (th, FP/N)) on the Lift Curve 
    1660 def compute_lift_curve(res, classIndex=-1): 
     1234@deprecated_keywords({"classIndex": "class_index"}) 
     1235def compute_lift_curve(res, class_index=-1): 
    16611236    import corn 
    16621237    ## merge multiple iterations into one 
     
    16651240        mres.results.append( te ) 
    16661241 
    1667     problists, tots = corn.computeROCCumulative(mres, classIndex) 
     1242    problists, tots = corn.computeROCCumulative(mres, class_index) 
    16681243 
    16691244    results = [] 
     
    16931268 
    16941269 
    1695 def compute_CDT(res, classIndex=-1, **argkw): 
     1270@deprecated_keywords({"classIndex": "class_index"}) 
     1271def compute_CDT(res, class_index=-1, **argkw): 
    16961272    """Obsolete, don't use""" 
    16971273    import corn 
    1698     if classIndex<0: 
     1274    if class_index<0: 
    16991275        if res.baseClass>=0: 
    1700             classIndex = res.baseClass 
    1701         else: 
    1702             classIndex = 1 
     1276            class_index = res.baseClass 
     1277        else: 
     1278            class_index = 1 
    17031279             
    17041280    useweights = res.weights and not argkw.get("unweighted", 0) 
     
    17091285        iterationExperiments = split_by_iterations(res) 
    17101286        for exp in iterationExperiments: 
    1711             expCDTs = corn.computeCDT(exp, classIndex, useweights) 
     1287            expCDTs = corn.computeCDT(exp, class_index, useweights) 
    17121288            for i in range(len(CDTs)): 
    17131289                CDTs[i].C += expCDTs[i].C 
     
    17161292        for i in range(res.number_of_learners): 
    17171293            if is_CDT_empty(CDTs[0]): 
    1718                 return corn.computeCDT(res, classIndex, useweights) 
     1294                return corn.computeCDT(res, class_index, useweights) 
    17191295         
    17201296        return CDTs 
    17211297    else: 
    1722         return corn.computeCDT(res, classIndex, useweights) 
     1298        return corn.computeCDT(res, class_index, useweights) 
    17231299 
    17241300## THIS FUNCTION IS OBSOLETE AND ITS AVERAGING OVER FOLDS IS QUESTIONABLE 
     
    17641340# are divided by 'divideByIfIte'. Additional flag is returned which is True in 
    17651341# the former case, or False in the latter. 
    1766 def AUC_x(cdtComputer, ite, all_ite, divideByIfIte, computerArgs): 
    1767     cdts = cdtComputer(*(ite, ) + computerArgs) 
     1342@deprecated_keywords({"divideByIfIte": "divide_by_if_ite", 
     1343                      "computerArgs": "computer_args"}) 
     1344def AUC_x(cdtComputer, ite, all_ite, divide_by_if_ite, computer_args): 
     1345    cdts = cdtComputer(*(ite, ) + computer_args) 
    17681346    if not is_CDT_empty(cdts[0]): 
    1769         return [(cdt.C+cdt.T/2)/(cdt.C+cdt.D+cdt.T)/divideByIfIte for cdt in cdts], True 
     1347        return [(cdt.C+cdt.T/2)/(cdt.C+cdt.D+cdt.T)/divide_by_if_ite for cdt in cdts], True 
    17701348         
    17711349    if all_ite: 
    1772         cdts = cdtComputer(*(all_ite, ) + computerArgs) 
     1350        cdts = cdtComputer(*(all_ite, ) + computer_args) 
    17731351        if not is_CDT_empty(cdts[0]): 
    17741352            return [(cdt.C+cdt.T/2)/(cdt.C+cdt.D+cdt.T) for cdt in cdts], False 
     
    17781356     
    17791357# computes AUC between classes i and j as if there we no other classes 
    1780 def AUC_ij(ite, classIndex1, classIndex2, useWeights = True, all_ite = None, divideByIfIte = 1.0): 
     1358@deprecated_keywords({"classIndex1": "class_index1", 
     1359                      "classIndex2": "class_index2", 
     1360                      "useWeights": "use_weights", 
     1361                      "divideByIfIte": "divide_by_if_ite"}) 
     1362def AUC_ij(ite, class_index1, class_index2, use_weights = True, all_ite = None, divide_by_if_ite = 1.0): 
    17811363    import corn 
    1782     return AUC_x(corn.computeCDTPair, ite, all_ite, divideByIfIte, (classIndex1, classIndex2, useWeights)) 
     1364    return AUC_x(corn.computeCDTPair, ite, all_ite, divide_by_if_ite, (class_index1, class_index2, use_weights)) 
    17831365 
    17841366 
    17851367# computes AUC between class i and the other classes (treating them as the same class) 
    1786 def AUC_i(ite, classIndex, useWeights = True, all_ite = None, divideByIfIte = 1.0): 
     1368@deprecated_keywords({"classIndex": "class_index", 
     1369                      "useWeights": "use_weights", 
     1370                      "divideByIfIte": "divide_by_if_ite"}) 
     1371def AUC_i(ite, class_index, use_weights = True, all_ite = None, divide_by_if_ite = 1.0): 
    17871372    import corn 
    1788     return AUC_x(corn.computeCDT, ite, all_ite, divideByIfIte, (classIndex, useWeights)) 
    1789     
     1373    return AUC_x(corn.computeCDT, ite, all_ite, divide_by_if_ite, (class_index, use_weights)) 
     1374 
    17901375 
    17911376# computes the average AUC over folds using a "AUCcomputer" (AUC_i or AUC_ij) 
     
    17931378# fold the computer has to resort to computing over all folds or even this failed; 
    17941379# in these cases the result is returned immediately 
    1795 def AUC_iterations(AUCcomputer, iterations, computerArgs): 
     1380 
     1381@deprecated_keywords({"AUCcomputer": "auc_computer", 
     1382                      "computerArgs": "computer_args"}) 
     1383def AUC_iterations(auc_computer, iterations, computer_args): 
    17961384    subsum_aucs = [0.] * iterations[0].number_of_learners 
    17971385    for ite in iterations: 
    1798         aucs, foldsUsed = AUCcomputer(*(ite, ) + computerArgs) 
     1386        aucs, foldsUsed = auc_computer(*(ite, ) + computer_args) 
    17991387        if not aucs: 
    18001388            return None 
     
    18061394 
    18071395# AUC for binary classification problems 
    1808 def AUC_binary(res, useWeights = True): 
     1396@deprecated_keywords({"useWeights": "use_weights"}) 
     1397def AUC_binary(res, use_weights = True): 
    18091398    if res.number_of_iterations > 1: 
    1810         return AUC_iterations(AUC_i, split_by_iterations(res), (-1, useWeights, res, res.number_of_iterations)) 
    1811     else: 
    1812         return AUC_i(res, -1, useWeights)[0] 
     1399        return AUC_iterations(AUC_i, split_by_iterations(res), (-1, use_weights, res, res.number_of_iterations)) 
     1400    else: 
     1401        return AUC_i(res, -1, use_weights)[0] 
    18131402 
    18141403# AUC for multiclass problems 
    1815 def AUC_multi(res, useWeights = True, method = 0): 
     1404@deprecated_keywords({"useWeights": "use_weights"}) 
     1405def AUC_multi(res, use_weights = True, method = 0): 
    18161406    numberOfClasses = len(res.class_values) 
    18171407     
     
    18331423        for classIndex1 in range(numberOfClasses): 
    18341424            for classIndex2 in range(classIndex1): 
    1835                 subsum_aucs = AUC_iterations(AUC_ij, iterations, (classIndex1, classIndex2, useWeights, all_ite, res.number_of_iterations)) 
     1425                subsum_aucs = AUC_iterations(AUC_ij, iterations, (classIndex1, classIndex2, use_weights, all_ite, res.number_of_iterations)) 
    18361426                if subsum_aucs: 
    18371427                    if method == 0: 
     
    18441434    else: 
    18451435        for classIndex in range(numberOfClasses): 
    1846             subsum_aucs = AUC_iterations(AUC_i, iterations, (classIndex, useWeights, all_ite, res.number_of_iterations)) 
     1436            subsum_aucs = AUC_iterations(AUC_i, iterations, (classIndex, use_weights, all_ite, res.number_of_iterations)) 
    18471437            if subsum_aucs: 
    18481438                if method == 0: 
     
    18661456# Computes AUC, possibly for multiple classes (the averaging method can be specified) 
    18671457# Results over folds are averages; if some folds examples from one class only, the folds are merged 
    1868 def AUC(res, method = AUC.ByWeightedPairs, useWeights = True): 
     1458@deprecated_keywords({"useWeights": "use_weights"}) 
     1459def AUC(res, method = AUC.ByWeightedPairs, use_weights = True): 
    18691460    """ Returns the area under ROC curve (AUC) given a set of experimental 
    18701461    results. For multivalued class problems, it will compute some sort of 
     
    18741465        raise ValueError("Cannot compute AUC on a single-class problem") 
    18751466    elif len(res.class_values) == 2: 
    1876         return AUC_binary(res, useWeights) 
    1877     else: 
    1878         return AUC_multi(res, useWeights, method) 
     1467        return AUC_binary(res, use_weights) 
     1468    else: 
     1469        return AUC_multi(res, use_weights, method) 
    18791470 
    18801471AUC.ByWeightedPairs = 0 
     
    18861477# Computes AUC; in multivalued class problem, AUC is computed as one against all 
    18871478# Results over folds are averages; if some folds examples from one class only, the folds are merged 
    1888 def AUC_single(res, classIndex = -1, useWeights = True): 
     1479@deprecated_keywords({"classIndex": "class_index", 
     1480                      "useWeights": "use_weights"}) 
     1481def AUC_single(res, class_index = -1, use_weights = True): 
    18891482    """ Computes AUC where the class given classIndex is singled out, and 
    18901483    all other classes are treated as a single class. To find how good our 
     
    18951488classIndex = vehicle.domain.classVar.values.index("van")) 
    18961489    """ 
    1897     if classIndex<0: 
     1490    if class_index<0: 
    18981491        if res.baseClass>=0: 
    1899             classIndex = res.baseClass 
    1900         else: 
    1901             classIndex = 1 
     1492            class_index = res.baseClass 
     1493        else: 
     1494            class_index = 1 
    19021495 
    19031496    if res.number_of_iterations > 1: 
    1904         return AUC_iterations(AUC_i, split_by_iterations(res), (classIndex, useWeights, res, res.number_of_iterations)) 
    1905     else: 
    1906         return AUC_i( res, classIndex, useWeights)[0] 
     1497        return AUC_iterations(AUC_i, split_by_iterations(res), (class_index, use_weights, res, res.number_of_iterations)) 
     1498    else: 
     1499        return AUC_i( res, class_index, use_weights)[0] 
    19071500 
    19081501# Computes AUC for a pair of classes (as if there were no other classes) 
    19091502# Results over folds are averages; if some folds have examples from one class only, the folds are merged 
    1910 def AUC_pair(res, classIndex1, classIndex2, useWeights = True): 
     1503@deprecated_keywords({"classIndex1": "class_index1", 
     1504                      "classIndex2": "class_index2", 
     1505                      "useWeights": "use_weights"}) 
     1506def AUC_pair(res, class_index1, class_index2, use_weights = True): 
    19111507    """ Computes AUC between a pair of classes, ignoring instances from all 
    19121508    other classes. 
    19131509    """ 
    19141510    if res.number_of_iterations > 1: 
    1915         return AUC_iterations(AUC_ij, split_by_iterations(res), (classIndex1, classIndex2, useWeights, res, res.number_of_iterations)) 
    1916     else: 
    1917         return AUC_ij(res, classIndex1, classIndex2, useWeights) 
     1511        return AUC_iterations(AUC_ij, split_by_iterations(res), (class_index1, class_index2, use_weights, res, res.number_of_iterations)) 
     1512    else: 
     1513        return AUC_ij(res, class_index1, class_index2, use_weights) 
    19181514   
    19191515 
    19201516# AUC for multiclass problems 
    1921 def AUC_matrix(res, useWeights = True): 
     1517@deprecated_keywords({"useWeights": "use_weights"}) 
     1518def AUC_matrix(res, use_weights = True): 
    19221519    """ Computes a (lower diagonal) matrix with AUCs for all pairs of classes. 
    19231520    If there are empty classes, the corresponding elements in the matrix 
     
    19441541    for classIndex1 in range(numberOfClasses): 
    19451542        for classIndex2 in range(classIndex1): 
    1946             pair_aucs = AUC_iterations(AUC_ij, iterations, (classIndex1, classIndex2, useWeights, all_ite, res.number_of_iterations)) 
     1543            pair_aucs = AUC_iterations(AUC_ij, iterations, (classIndex1, classIndex2, use_weights, all_ite, res.number_of_iterations)) 
    19471544            if pair_aucs: 
    19481545                for lrn in range(number_of_learners): 
     
    20801677 
    20811678 
    2082 def plot_learning_curve_learners(file, allResults, proportions, learners, noConfidence=0): 
    2083     plot_learning_curve(file, allResults, proportions, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))], noConfidence) 
    2084      
    2085 def plot_learning_curve(file, allResults, proportions, legend, noConfidence=0): 
     1679@deprecated_keywords({"allResults": "all_results", 
     1680                      "noConfidence": "no_confidence"}) 
     1681def plot_learning_curve_learners(file, all_results, proportions, learners, no_confidence=0): 
     1682    plot_learning_curve(file, all_results, proportions, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))], no_confidence) 
     1683 
     1684 
     1685@deprecated_keywords({"allResults": "all_results", 
     1686                      "noConfidence": "no_confidence"}) 
     1687def plot_learning_curve(file, all_results, proportions, legend, no_confidence=0): 
    20861688    import types 
    20871689    fopened=0 
    2088     if (type(file)==types.StringType): 
     1690    if type(file)==types.StringType: 
    20891691        file=open(file, "wt") 
    20901692        fopened=1 
     
    20931695    file.write("set xrange [%f:%f]\n" % (proportions[0], proportions[-1])) 
    20941696    file.write("set multiplot\n\n") 
    2095     CAs = [CA_dev(x) for x in allResults] 
     1697    CAs = [CA_dev(x) for x in all_results] 
    20961698 
    20971699    file.write("plot \\\n") 
    20981700    for i in range(len(legend)-1): 
    2099         if not noConfidence: 
     1701        if not no_confidence: 
    21001702            file.write("'-' title '' with yerrorbars pointtype %i,\\\n" % (i+1)) 
    21011703        file.write("'-' title '%s' with linespoints pointtype %i,\\\n" % (legend[i], i+1)) 
    2102     if not noConfidence: 
     1704    if not no_confidence: 
    21031705        file.write("'-' title '' with yerrorbars pointtype %i,\\\n" % (len(legend))) 
    21041706    file.write("'-' title '%s' with linespoints pointtype %i\n" % (legend[-1], len(legend))) 
    21051707 
    21061708    for i in range(len(legend)): 
    2107         if not noConfidence: 
     1709        if not no_confidence: 
    21081710            for p in range(len(proportions)): 
    21091711                file.write("%f\t%f\t%f\n" % (proportions[p], CAs[p][i][0], 1.96*CAs[p][i][1])) 
     
    21621764 
    21631765 
    2164  
    2165 def plot_McNemar_curve_learners(file, allResults, proportions, learners, reference=-1): 
    2166     plot_McNemar_curve(file, allResults, proportions, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))], reference) 
    2167  
    2168 def plot_McNemar_curve(file, allResults, proportions, legend, reference=-1): 
     1766@deprecated_keywords({"allResults": "all_results"}) 
     1767def plot_McNemar_curve_learners(file, all_results, proportions, learners, reference=-1): 
     1768    plot_McNemar_curve(file, all_results, proportions, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))], reference) 
     1769 
     1770 
     1771@deprecated_keywords({"allResults": "all_results"}) 
     1772def plot_McNemar_curve(file, all_results, proportions, legend, reference=-1): 
    21691773    if reference<0: 
    21701774        reference=len(legend)-1 
     
    21881792    for i in tmap: 
    21891793        for p in range(len(proportions)): 
    2190             file.write("%f\t%f\n" % (proportions[p], McNemar_of_two(allResults[p], i, reference))) 
     1794            file.write("%f\t%f\n" % (proportions[p], McNemar_of_two(all_results[p], i, reference))) 
    21911795        file.write("e\n\n") 
    21921796 
     
    21971801default_line_types=("\\setsolid", "\\setdashpattern <4pt, 2pt>", "\\setdashpattern <8pt, 2pt>", "\\setdashes", "\\setdots") 
    21981802 
    2199 def learning_curve_learners_to_PiCTeX(file, allResults, proportions, **options): 
    2200     return apply(learning_curve_to_PiCTeX, (file, allResults, proportions), options) 
    2201      
    2202 def learning_curve_to_PiCTeX(file, allResults, proportions, **options): 
     1803@deprecated_keywords({"allResults": "all_results"}) 
     1804def learning_curve_learners_to_PiCTeX(file, all_results, proportions, **options): 
     1805    return apply(learning_curve_to_PiCTeX, (file, all_results, proportions), options) 
     1806 
     1807 
     1808@deprecated_keywords({"allResults": "all_results"}) 
     1809def learning_curve_to_PiCTeX(file, all_results, proportions, **options): 
    22031810    import types 
    22041811    fopened=0 
     
    22071814        fopened=1 
    22081815 
    2209     nexamples=len(allResults[0].results) 
    2210     CAs = [CA_dev(x) for x in allResults] 
     1816    nexamples=len(all_results[0].results) 
     1817    CAs = [CA_dev(x) for x in all_results] 
    22111818 
    22121819    graphsize=float(options.get("graphsize", 10.0)) #cm 
  • Orange/feature/__init__.py

    r9671 r9895  
    1010import imputation 
    1111 
     12from Orange.core import Variable as Descriptor 
     13from Orange.core import EnumVariable as Discrete 
     14from Orange.core import FloatVariable as Continuous 
     15from Orange.core import PythonVariable as Python 
     16from Orange.core import StringVariable as String 
     17 
     18from Orange.core import VarList as Descriptors 
     19 
     20from Orange.core import newmetaid as new_meta_id 
     21 
     22from Orange.core import Variable as V 
     23make = V.make 
     24retrieve = V.get_existing 
     25MakeStatus = V.MakeStatus 
     26del V 
     27 
    1228__docformat__ = 'restructuredtext' 
  • docs/reference/rst/Orange.data.rst

    r9900 r9901  
    55.. toctree:: 
    66 
    7     Orange.data.variable 
    87    Orange.data.domain 
    98    Orange.data.value 
  • docs/reference/rst/Orange.evaluation.scoring.rst

    r9372 r9892  
    11.. automodule:: Orange.evaluation.scoring 
     2 
     3############################ 
     4Method scoring (``scoring``) 
     5############################ 
     6 
     7.. index: scoring 
     8 
     9This module contains various measures of quality for classification and 
     10regression. Most functions require an argument named :obj:`res`, an instance of 
     11:class:`Orange.evaluation.testing.ExperimentResults` as computed by 
     12functions from :mod:`Orange.evaluation.testing` and which contains 
     13predictions obtained through cross-validation, 
     14leave one-out, testing on training data or test set instances. 
     15 
     16============== 
     17Classification 
     18============== 
     19 
     20To prepare some data for examples on this page, we shall load the voting data 
     21set (problem of predicting the congressman's party (republican, democrat) 
     22based on a selection of votes) and evaluate naive Bayesian learner, 
     23classification trees and majority classifier using cross-validation. 
     24For examples requiring a multivalued class problem, we shall do the same 
     25with the vehicle data set (telling whether a vehicle described by the features 
     26extracted from a picture is a van, bus, or Opel or Saab car). 
     27 
     28Basic cross validation example is shown in the following part of 
     29(:download:`statExamples.py <code/statExamples.py>`, uses :download:`voting.tab <code/voting.tab>` and :download:`vehicle.tab <code/vehicle.tab>`): 
     30 
     31.. literalinclude:: code/statExample0.py 
     32 
     33If instances are weighted, weights are taken into account. This can be 
     34disabled by giving :obj:`unweighted=1` as a keyword argument. Another way of 
     35disabling weights is to clear the 
     36:class:`Orange.evaluation.testing.ExperimentResults`' flag weights. 
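
For instance, assuming ``res`` holds the cross-validation results computed above, both routes look like this (a minimal sketch; ``unweighted`` and the ``weights`` flag are exactly the options described in this paragraph)::

    CAs = Orange.evaluation.scoring.CA(res, unweighted=1)

    # equivalently, clear the weights flag on the results object
    res.weights = 0
    CAs = Orange.evaluation.scoring.CA(res)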
     37 
     38General Measures of Quality 
     39=========================== 
     40 
     41.. autofunction:: CA 
     42 
     43.. autofunction:: AP 
     44 
     45.. autofunction:: Brier_score 
     46 
     47.. autofunction:: IS 
     48 
     49So, let's compute all this in part of 
     50(:download:`statExamples.py <code/statExamples.py>`, uses :download:`voting.tab <code/voting.tab>` and :download:`vehicle.tab <code/vehicle.tab>`) and print it out: 
     51 
     52.. literalinclude:: code/statExample1.py 
     53   :lines: 13- 
     54 
     55The output should look like this:: 
     56 
     57    method  CA      AP      Brier    IS 
     58    bayes   0.903   0.902   0.175    0.759 
     59    tree    0.846   0.845   0.286    0.641 
      60    majrty  0.614   0.526   0.474   -0.000 
     61 
     62Script :download:`statExamples.py <code/statExamples.py>` contains another example that also prints out 
     63the standard errors. 
     64 
     65Confusion Matrix 
     66================ 
     67 
     68.. autofunction:: confusion_matrices 
     69 
     70   **A positive-negative confusion matrix** is computed (a) if the class is 
     71   binary unless :obj:`classIndex` argument is -2, (b) if the class is 
     72   multivalued and the :obj:`classIndex` is non-negative. Argument 
     73   :obj:`classIndex` then tells which class is positive. In case (a), 
     74   :obj:`classIndex` may be omitted; the first class 
     75   is then negative and the second is positive, unless the :obj:`baseClass` 
     76   attribute in the object with results has non-negative value. In that case, 
     77   :obj:`baseClass` is an index of the target class. :obj:`baseClass` 
     78   attribute of results object should be set manually. The result of a 
     79   function is a list of instances of class :class:`ConfusionMatrix`, 
     80   containing the (weighted) number of true positives (TP), false 
     81   negatives (FN), false positives (FP) and true negatives (TN). 
     82 
     83   We can also add the keyword argument :obj:`cutoff` 
      84   (e.g. confusion_matrices(results, cutoff=0.3)); if we do, :obj:`confusion_matrices` 
     85   will disregard the classifiers' class predictions and observe the predicted 
     86   probabilities, and consider the prediction "positive" if the predicted 
     87   probability of the positive class is higher than the :obj:`cutoff`. 
     88 
      89   The example (part of :download:`statExamples.py <code/statExamples.py>`) below shows how lowering the 
      90   cutoff threshold from the default 0.5 to 0.2 affects the confusion matrices 
      91   for the naive Bayesian classifier:: 
     92 
     93       cm = Orange.evaluation.scoring.confusion_matrices(res)[0] 
     94       print "Confusion matrix for naive Bayes:" 
     95       print "TP: %i, FP: %i, FN: %s, TN: %i" % (cm.TP, cm.FP, cm.FN, cm.TN) 
     96 
     97       cm = Orange.evaluation.scoring.confusion_matrices(res, cutoff=0.2)[0] 
     98       print "Confusion matrix for naive Bayes:" 
     99       print "TP: %i, FP: %i, FN: %s, TN: %i" % (cm.TP, cm.FP, cm.FN, cm.TN) 
     100 
     101   The output:: 
     102 
     103       Confusion matrix for naive Bayes: 
     104       TP: 238, FP: 13, FN: 29.0, TN: 155 
     105       Confusion matrix for naive Bayes: 
     106       TP: 239, FP: 18, FN: 28.0, TN: 150 
     107 
     108   shows that the number of true positives increases (and hence the number of 
     109   false negatives decreases) by only a single instance, while five instances 
     110   that were originally true negatives become false positives due to the 
     111   lower threshold. 
     112 
      113   To observe how good the classifiers are at detecting vans in the vehicle 
     114   data set, we would compute the matrix like this:: 
     115 
      116      cm = Orange.evaluation.scoring.confusion_matrices(resVeh, 
      117          vehicle.domain.classVar.values.index("van")) 
     118 
     119   and get the results like these:: 
     120 
     121       TP: 189, FP: 241, FN: 10.0, TN: 406 
     122 
     123   while the same for class "opel" would give:: 
     124 
     125       TP: 86, FP: 112, FN: 126.0, TN: 522 
     126 
     127   The main difference is that there are only a few false negatives for the 
     128   van, meaning that the classifier seldom misses it (if it says it's not a 
     129   van, it's almost certainly not a van). Not so for the Opel car, where the 
     130   classifier missed 126 of them and correctly detected only 86. 
     131 
     132   **General confusion matrix** is computed (a) in case of a binary class, 
      133   when :obj:`classIndex` is set to -2, (b) when we have a multivalued class and 
     134   the caller doesn't specify the :obj:`classIndex` of the positive class. 
     135   When called in this manner, the function cannot use the argument 
     136   :obj:`cutoff`. 
     137 
     138   The function then returns a three-dimensional matrix, where the element 
     139   A[:obj:`learner`][:obj:`actual_class`][:obj:`predictedClass`] 
     140   gives the number of instances belonging to 'actual_class' for which the 
     141   'learner' predicted 'predictedClass'. We shall compute and print out 
     142   the matrix for naive Bayesian classifier. 
     143 
     144   Here we see another example from :download:`statExamples.py <code/statExamples.py>`:: 
     145 
     146       cm = Orange.evaluation.scoring.confusion_matrices(resVeh)[0] 
     147       classes = vehicle.domain.classVar.values 
     148       print "\t"+"\t".join(classes) 
     149       for className, classConfusions in zip(classes, cm): 
     150           print ("%s" + ("\t%i" * len(classes))) % ((className, ) + tuple(classConfusions)) 
     151 
     152   So, here's what this nice piece of code gives:: 
     153 
     154              bus   van  saab opel 
     155       bus     56   95   21   46 
     156       van     6    189  4    0 
     157       saab    3    75   73   66 
     158       opel    4    71   51   86 
     159 
      160   Vans are clearly simple: 189 vans were classified as vans (we know this 
     161   already, we've printed it out above), and the 10 misclassified pictures 
     162   were classified as buses (6) and Saab cars (4). In all other classes, 
     163   there were more instances misclassified as vans than correctly classified 
     164   instances. The classifier is obviously quite biased to vans. 
     165 
     166   .. method:: sens(confm) 
     167   .. method:: spec(confm) 
     168   .. method:: PPV(confm) 
     169   .. method:: NPV(confm) 
     170   .. method:: precision(confm) 
     171   .. method:: recall(confm) 
      172   .. method:: F1(confm) 
     173   .. method:: Falpha(confm, alpha=2.0) 
     174   .. method:: MCC(conf) 
     175 
     176   With the confusion matrix defined in terms of positive and negative 
     177   classes, you can also compute the 
     178   `sensitivity <http://en.wikipedia.org/wiki/Sensitivity_(tests)>`_ 
     179   [TP/(TP+FN)], `specificity \ 
     180<http://en.wikipedia.org/wiki/Specificity_%28tests%29>`_ 
     181   [TN/(TN+FP)], `positive predictive value \ 
     182<http://en.wikipedia.org/wiki/Positive_predictive_value>`_ 
     183   [TP/(TP+FP)] and `negative predictive value \ 
     184<http://en.wikipedia.org/wiki/Negative_predictive_value>`_ [TN/(TN+FN)]. 
     185   In information retrieval, positive predictive value is called precision 
     186   (the ratio of the number of relevant records retrieved to the total number 
     187   of irrelevant and relevant records retrieved), and sensitivity is called 
     188   `recall <http://en.wikipedia.org/wiki/Information_retrieval>`_ 
     189   (the ratio of the number of relevant records retrieved to the total number 
     190   of relevant records in the database). The 
     191   `harmonic mean <http://en.wikipedia.org/wiki/Harmonic_mean>`_ of precision 
     192   and recall is called an 
      193   `F-measure <http://en.wikipedia.org/wiki/F-measure>`_, which, depending 
      194   on the relative weight given to precision and recall, is implemented 
      195   as F1 [2*precision*recall/(precision+recall)] or, in the general case, 
      196   as Falpha [(1+alpha)*precision*recall / (alpha*precision + recall)]. 
     197   The `Matthews correlation coefficient \ 
     198<http://en.wikipedia.org/wiki/Matthews_correlation_coefficient>`_ 
      199   is in essence a correlation coefficient between 
     200   the observed and predicted binary classifications; it returns a value 
     201   between -1 and +1. A coefficient of +1 represents a perfect prediction, 
     202   0 an average random prediction and -1 an inverse prediction. 
     203 
     204   If the argument :obj:`confm` is a single confusion matrix, a single 
     205   result (a number) is returned. If confm is a list of confusion matrices, 
     206   a list of scores is returned, one for each confusion matrix. 
     207 
     208   Note that weights are taken into account when computing the matrix, so 
     209   these functions don't check the 'weighted' keyword argument. 
     210 
     211   Let us print out sensitivities and specificities of our classifiers in 
     212   part of :download:`statExamples.py <code/statExamples.py>`:: 
     213 
     214       cm = Orange.evaluation.scoring.confusion_matrices(res) 
     215       print 
     216       print "method\tsens\tspec" 
     217       for l in range(len(learners)): 
     218           print "%s\t%5.3f\t%5.3f" % (learners[l].name, Orange.evaluation.scoring.sens(cm[l]), Orange.evaluation.scoring.spec(cm[l])) 
     219 
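   To make these formulas concrete, here is a minimal sketch that recomputes
   F1 and MCC directly from the TP, FP, FN and TN counts of the naive Bayesian
   confusion matrix printed earlier (plain Python arithmetic, not a call into
   the module's own functions)::

       import math

       def f1(TP, FP, FN):
           precision = TP / float(TP + FP)
           recall = TP / float(TP + FN)
           return 2 * precision * recall / (precision + recall)

       def mcc(TP, FP, FN, TN):
           denom = math.sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
           return (TP * TN - FP * FN) / denom if denom else 0.0

       # counts from the cutoff=0.5 confusion matrix above
       print "F1:  %5.3f" % f1(238, 13, 29.0)
       print "MCC: %5.3f" % mcc(238, 13, 29.0, 155)
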
     220ROC Analysis 
     221============ 
     222 
     223`Receiver Operating Characteristic \ 
     224<http://en.wikipedia.org/wiki/Receiver_operating_characteristic>`_ 
     225(ROC) analysis was initially developed for 
      226binary-like problems, and there is no consensus on how to apply it in 
     227multi-class problems, nor do we know for sure how to do ROC analysis after 
     228cross validation and similar multiple sampling techniques. If you are 
     229interested in the area under the curve, function AUC will deal with those 
     230problems as specifically described below. 
     231 
     232.. autofunction:: AUC 
     233 
     234   .. attribute:: AUC.ByWeightedPairs (or 0) 
     235 
     236      Computes AUC for each pair of classes (ignoring instances of all other 
     237      classes) and averages the results, weighting them by the number of 
     238      pairs of instances from these two classes (e.g. by the product of 
     239      probabilities of the two classes). AUC computed in this way still 
      240      behaves as a concordance index, i.e., it gives the probability that two 
     241      randomly chosen instances from different classes will be correctly 
     242      recognized (this is of course true only if the classifier knows 
     243      from which two classes the instances came). 
     244 
     245   .. attribute:: AUC.ByPairs (or 1) 
     246 
      247      Similar to the above, except that the average over class pairs is not 
      248      weighted. This AUC is, like the binary AUC, independent of class 
      249      distributions, but it is no longer related to the concordance index. 
     250 
     251   .. attribute:: AUC.WeightedOneAgainstAll (or 2) 
     252 
     253      For each class, it computes AUC for this class against all others (that 
     254      is, treating other classes as one class). The AUCs are then averaged by 
      255      the class probabilities. This is related to the concordance index, in which 
     256      we test the classifier's (average) capability for distinguishing the 
     257      instances from a specified class from those that come from other classes. 
     258      Unlike the binary AUC, the measure is not independent of class 
     259      distributions. 
     260 
     261   .. attribute:: AUC.OneAgainstAll (or 3) 
     262 
     263      As above, except that the average is not weighted. 
     264 
     265   In case of multiple folds (for instance if the data comes from cross 
     266   validation), the computation goes like this. When computing the partial 
     267   AUCs for individual pairs of classes or singled-out classes, AUC is 
     268   computed for each fold separately and then averaged (ignoring the number 
     269   of instances in each fold, it's just a simple average). However, if a 
     270   certain fold doesn't contain any instances of a certain class (from the 
     271   pair), the partial AUC is computed treating the results as if they came 
      272   from a single fold. This is not really correct since the class 
      273   probabilities from different folds are not necessarily comparable; 
      274   however, since this will most often occur in leave-one-out experiments, 
      275   comparability shouldn't be a problem. 
     276 
      277   Computing and printing out the AUCs looks just like printing out 
     278   classification accuracies (except that we call AUC instead of 
     279   CA, of course):: 
     280 
     281       AUCs = Orange.evaluation.scoring.AUC(res) 
     282       for l in range(len(learners)): 
     283           print "%10s: %5.3f" % (learners[l].name, AUCs[l]) 
     284 
      285   For the vehicle data set, you can run exactly the same code; it will compute AUCs 
     286   for all pairs of classes and return the average weighted by probabilities 
     287   of pairs. Or, you can specify the averaging method yourself, like this:: 
     288 
     289       AUCs = Orange.evaluation.scoring.AUC(resVeh, Orange.evaluation.scoring.AUC.WeightedOneAgainstAll) 
     290 
     291   The following snippet tries out all four. (We don't claim that this is 
     292   how the function needs to be used; it's better to stay with the default.):: 
     293 
     294       methods = ["by pairs, weighted", "by pairs", "one vs. all, weighted", "one vs. all"] 
     295       print " " *25 + "  \tbayes\ttree\tmajority" 
     296       for i in range(4): 
     297           AUCs = Orange.evaluation.scoring.AUC(resVeh, i) 
     298           print "%25s: \t%5.3f\t%5.3f\t%5.3f" % ((methods[i], ) + tuple(AUCs)) 
     299 
     300   As you can see from the output:: 
     301 
     302                                   bayes   tree    majority 
     303              by pairs, weighted:  0.789   0.871   0.500 
     304                        by pairs:  0.791   0.872   0.500 
     305           one vs. all, weighted:  0.783   0.800   0.500 
     306                     one vs. all:  0.783   0.800   0.500 
     307 
     308.. autofunction:: AUC_single 
     309 
     310.. autofunction:: AUC_pair 
     311 
     312.. autofunction:: AUC_matrix 
     313 
     314The remaining functions, which plot the curves and statistically compare 
     315them, require that the results come from a test with a single iteration, 
     316and they always compare one chosen class against all others. If you have 
     317cross validation results, you can either use split_by_iterations to split the 
     318results by folds, call the function for each fold separately and then sum 
     319the results up however you see fit, or you can set the ExperimentResults' 
      320attribute number_of_iterations to 1, to cheat the function - at your own 
      321risk regarding statistical correctness. Regarding multi-class 
      322problems, if you don't choose a specific class, Orange.evaluation.scoring will use the class 
      323attribute's baseValue at the time when results were computed. If baseValue 
      324was not given at that time, 1 (that is, the second class) is used as the default. 
     325 
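For example, the per-fold route could look like this (a minimal sketch, assuming ``res`` contains cross-validation results; how to combine the per-fold scores is left to you)::

    folds = Orange.evaluation.scoring.split_by_iterations(res)
    for fold in folds:
        print Orange.evaluation.scoring.AUCWilcoxon(fold)
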
     326We shall use the following code to prepare suitable experimental results:: 
     327 
     328    ri2 = Orange.core.MakeRandomIndices2(voting, 0.6) 
     329    train = voting.selectref(ri2, 0) 
     330    test = voting.selectref(ri2, 1) 
     331    res1 = Orange.evaluation.testing.learnAndTestOnTestData(learners, train, test) 
     332 
     333 
     334.. autofunction:: AUCWilcoxon 
     335 
     336.. autofunction:: compute_ROC 
     337 
     338Comparison of Algorithms 
     339------------------------ 
     340 
     341.. autofunction:: McNemar 
     342 
     343.. autofunction:: McNemar_of_two 
     344 
     345========== 
     346Regression 
     347========== 
     348 
     349General Measure of Quality 
     350========================== 
     351 
     352Several alternative measures, as given below, can be used to evaluate 
      353the success of numeric prediction: 
     354 
     355.. image:: files/statRegression.png 
     356 
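The first few measures follow the usual definitions; a minimal sketch in plain Python (``y`` are the true values, ``f`` the predictions; this only illustrates the formulas, while the functions below work on :obj:`res` objects)::

    def mse(y, f):
        return sum((yi - fi) ** 2 for yi, fi in zip(y, f)) / float(len(y))

    def rmse(y, f):
        return mse(y, f) ** 0.5

    def mae(y, f):
        return sum(abs(yi - fi) for yi, fi in zip(y, f)) / float(len(y))

    def r2(y, f):
        mean_y = sum(y) / float(len(y))
        ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, f))
        ss_tot = sum((yi - mean_y) ** 2 for yi in y)
        return 1.0 - ss_res / ss_tot
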
     357.. autofunction:: MSE 
     358 
     359.. autofunction:: RMSE 
     360 
     361.. autofunction:: MAE 
     362 
     363.. autofunction:: RSE 
     364 
     365.. autofunction:: RRSE 
     366 
     367.. autofunction:: RAE 
     368 
     369.. autofunction:: R2 
     370 
     371The following code (:download:`statExamples.py <code/statExamples.py>`) uses most of the above measures to 
     372score several regression methods. 
     373 
     374.. literalinclude:: code/statExamplesRegression.py 
     375 
     376The code above produces the following output:: 
     377 
     378    Learner   MSE     RMSE    MAE     RSE     RRSE    RAE     R2 
     379    maj       84.585  9.197   6.653   1.002   1.001   1.001  -0.002 
     380    rt        40.015  6.326   4.592   0.474   0.688   0.691   0.526 
     381    knn       21.248  4.610   2.870   0.252   0.502   0.432   0.748 
     382    lr        24.092  4.908   3.425   0.285   0.534   0.515   0.715 
     383 
      384================== 
      385Plotting functions 
      386================== 
     387 
     388.. autofunction:: graph_ranks 
     389 
      390The following script (:download:`statExamplesGraphRanks.py <code/statExamplesGraphRanks.py>`) shows how to plot a graph: 
     391 
     392.. literalinclude:: code/statExamplesGraphRanks.py 
     393 
      394The code produces the following graph: 
     395 
     396.. image:: files/statExamplesGraphRanks1.png 
     397 
     398.. autofunction:: compute_CD 
     399 
     400.. autofunction:: compute_friedman 
     401 
     402================= 
     403Utility Functions 
     404================= 
     405 
     406.. autofunction:: split_by_iterations 
     407 
     408===================================== 
     409Scoring for multilabel classification 
     410===================================== 
     411 
      412Multi-label classification requires different metrics than those used in traditional single-label 
      413classification. This module presents the various metrics that have been proposed in the literature. 
      414Let :math:`D` be a multi-label evaluation data set, consisting of :math:`|D|` multi-label examples 
      415:math:`(x_i,Y_i)`, :math:`i=1..|D|`, :math:`Y_i \subseteq L`. Let :math:`H` be a multi-label classifier 
     416and :math:`Z_i=H(x_i)` be the set of labels predicted by :math:`H` for example :math:`x_i`. 
     417 
     418.. autofunction:: mlc_hamming_loss 
     419.. autofunction:: mlc_accuracy 
     420.. autofunction:: mlc_precision 
     421.. autofunction:: mlc_recall 
     422 
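For intuition, Hamming loss and the set-based accuracy can be computed directly from the definitions above; a minimal sketch with label sets represented as plain Python sets (this mirrors the formulas, not the module's implementation)::

    def hamming_loss(Y, Z, n_labels):
        # mean size of the symmetric difference between true and predicted
        # label sets, normalized by the number of labels |L|
        return sum(len(y ^ z) for y, z in zip(Y, Z)) / float(len(Y) * n_labels)

    def accuracy(Y, Z):
        # mean Jaccard overlap |Y_i & Z_i| / |Y_i | Z_i|
        return sum(len(y & z) / float(len(y | z)) for y, z in zip(Y, Z)) / float(len(Y))

    Y = [set(["happy", "quiet"]), set(["sad"])]    # true label sets
    Z = [set(["happy"]), set(["sad", "quiet"])]    # predicted label sets
    print hamming_loss(Y, Z, n_labels=6)
    print accuracy(Y, Z)
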
     423So, let's compute all this and print it out (part of 
     424:download:`mlc-evaluate.py <code/mlc-evaluate.py>`, uses 
     425:download:`emotions.tab <code/emotions.tab>`): 
     426 
     427.. literalinclude:: code/mlc-evaluate.py 
     428   :lines: 1-15 
     429 
     430The output should look like this:: 
     431 
     432    loss= [0.9375] 
     433    accuracy= [0.875] 
     434    precision= [1.0] 
     435    recall= [0.875] 
     436 
     437References 
     438========== 
     439 
     440Boutell, M.R., Luo, J., Shen, X. & Brown, C.M. (2004), 'Learning multi-label scene classification', 
      441Pattern Recognition, vol.37, no.9, pp:1757-71 
     442 
      443Godbole, S. & Sarawagi, S. (2004), 'Discriminative Methods for Multi-labeled Classification', in 
      444Proceedings of the 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining 
      445(PAKDD 2004) 
     446 
      447Schapire, R.E. & Singer, Y. (2000), 'BoosTexter: a boosting-based system for text categorization', 
     448Machine Learning, vol.39, no.2/3, pp:135-68. 
  • docs/reference/rst/Orange.feature.rst

    r9372 r9896  
    88   :maxdepth: 2 
    99 
     10   Orange.feature.descriptor 
    1011   Orange.feature.scoring 
    1112   Orange.feature.selection 
  • docs/reference/rst/code/majority-classification.py

    r9823 r9894  
    1515 
    1616res = Orange.evaluation.testing.cross_validation(learners, monks) 
    17 CAs = Orange.evaluation.scoring.CA(res, reportSE=True) 
     17CAs = Orange.evaluation.scoring.CA(res, report_se=True) 
    1818 
    1919print "Tree:    %5.3f+-%5.3f" % CAs[0] 
  • docs/reference/rst/code/testing-test.py

    r9823 r9894  
    1212 
    1313def printResults(res): 
    14     CAs = Orange.evaluation.scoring.CA(res, reportSE=1) 
     14    CAs = Orange.evaluation.scoring.CA(res, report_se=1) 
    1515    for name, ca in zip(res.classifierNames, CAs): 
    1616        print "%s: %5.3f+-%5.3f" % (name, ca[0], 1.96 * ca[1]), 
  • docs/reference/rst/code/variable-get_value_from.py

    r9823 r9897  
    22# Category:    core 
    33# Uses:        monks-1 
    4 # Referenced:  Orange.data.variable 
    5 # Classes:     Orange.data.variable.Discrete 
     4# Referenced:  Orange.feature 
     5# Classes:     Orange.feature.Discrete 
    66 
    77import Orange 
     
    1414 
    1515monks = Orange.data.Table("monks-1") 
    16 e2 = Orange.data.variable.Discrete("e2", values=["not 1", "1"])     
     16e2 = Orange.feature.Discrete("e2", values=["not 1", "1"])     
    1717e2.get_value_from = checkE  
    1818 
    19 print Orange.core.MeasureAttribute_info(e2, monks) 
     19print Orange.feature.scoring.InfoGain(e2, monks) 
    2020 
    2121dist = Orange.core.Distribution(e2, monks) 
  • docs/reference/rst/index.rst

    r9729 r9897  
    77 
    88   Orange.data 
     9 
     10   Orange.feature 
    911 
    1012   Orange.associate 
     
    1921 
    2022   Orange.evaluation 
    21  
    22    Orange.feature 
    2323 
    2424   Orange.multilabel 
  • setup.py

    r9879 r9893  
    391391        install.run(self) 
    392392         
    393         # Create a .pth file wiht a path inside the Orange/orng directory 
     393        # Create a .pth file with a path inside the Orange/orng directory 
    394394        # so the old modules are importable 
    395395        self.path_file, self.extra_dirs = ("orange-orng-modules", "Orange/orng") 