Changeset 7582:e2114f229e5e in orange


Timestamp:
02/04/11 23:52:33
Author:
mocnik <mocnik@…>
Branch:
default
Convert:
e5d3c728f7a39f86dd2744ce4413a2cc8cc1d168
Message:

Modifying Orange.evaluation.scoring module documentation.

File:
1 edited

  • orange/Orange/evaluation/scoring.py

--- orange/Orange/evaluation/scoring.py (r7548)
+++ orange/Orange/evaluation/scoring.py (r7582)
 
 This module contains various measures of quality for classification and
-regression. Most functions require an argument named res, an instance of
+regression. Most functions require an argument named :obj:`res`, an instance of
 :class:`Orange.evaluation.testing.ExperimentResults` as computed by
-functions from Orange.evaluation.testing and which contains predictions
-obtained through
-cross-validation, leave one-out, testing on training data or test set examples.
+functions from :mod:`Orange.evaluation.testing` and which contains
+predictions obtained through cross-validation,
+leave one-out, testing on training data or test set instances.
 
 ==============
     
 The output should look like this::
 
-    method  CA  AP  Brier   IS
+    method  CA      AP      Brier    IS
     bayes   0.903   0.902   0.175    0.759
     tree    0.846   0.845   0.286    0.641
     
 
    **A positive-negative confusion matrix** is computed (a) if the class is
-   binary unless classIndex argument is -2, (b) if the class is multivalued
-   and the classIndex is non-negative. Argument classIndex then tells which
-   class is positive. In case (a), classIndex may be omited; the first class
-   is then negative and the second is positive, unless the baseClass attribute
-   in the object with results has non-negative value. In that case, baseClass
-   is an index of the traget class. baseClass attribute of results object
-   should be set manually. The result of a function is a list of instances
-   of class ConfusionMatrix, containing the (weighted) number of true
-   positives (TP), false negatives (FN), false positives (FP) and true
-   negatives (TN).
-
-   We can also add the keyword argument cutoff
-   (e.g. confusionMatrices(results, cutoff=0.3); if we do, confusionMatrices
+   binary unless the :obj:`classIndex` argument is -2, (b) if the class is
+   multivalued and the :obj:`classIndex` is non-negative. Argument
+   :obj:`classIndex` then tells which class is positive. In case (a),
+   :obj:`classIndex` may be omitted; the first class
+   is then negative and the second is positive, unless the :obj:`baseClass`
+   attribute in the object with results has a non-negative value. In that case,
+   :obj:`baseClass` is an index of the target class. The :obj:`baseClass`
+   attribute of the results object should be set manually. The result of a
+   function is a list of instances of class :class:`ConfusionMatrix`,
+   containing the (weighted) number of true positives (TP), false
+   negatives (FN), false positives (FP) and true negatives (TN).
+
+   We can also add the keyword argument :obj:`cutoff`
+   (e.g. ``confusionMatrices(results, cutoff=0.3)``); if we do, :obj:`confusionMatrices`
    will disregard the classifiers' class predictions and observe the predicted
    probabilities, and consider the prediction "positive" if the predicted
-   probability of the positive class is higher than the cutoff.
+   probability of the positive class is higher than the :obj:`cutoff`.
 
    The example (part of `statExamples.py`_) below shows how setting the
     
    for naive Bayesian classifier::
 
-       cm = orngStat.confusionMatrices(res)[0]
+       cm = Orange.evaluation.scoring.confusionMatrices(res)[0]
        print "Confusion matrix for naive Bayes:"
        print "TP: %i, FP: %i, FN: %s, TN: %i" % (cm.TP, cm.FP, cm.FN, cm.TN)
 
-       cm = orngStat.confusionMatrices(res, cutoff=0.2)[0]
+       cm = Orange.evaluation.scoring.confusionMatrices(res, cutoff=0.2)[0]
        print "Confusion matrix for naive Bayes:"
        print "TP: %i, FP: %i, FN: %s, TN: %i" % (cm.TP, cm.FP, cm.FN, cm.TN)
     
 
    shows that the number of true positives increases (and hence the number of
-   false negatives decreases) by only a single example, while five examples
+   false negatives decreases) by only a single instance, while five instances
    that were originally true negatives become false positives due to the
    lower threshold.
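The cutoff behaviour described above boils down to thresholding the predicted probability of the positive class. A minimal plain-Python sketch (the function name is invented for illustration; this is not Orange's implementation):

```python
def confusion_from_probs(probs, actual, cutoff=0.5):
    """Count TP/FP/FN/TN, calling a case positive when the predicted
    probability of the positive class exceeds `cutoff`."""
    tp = fp = fn = tn = 0
    for p, a in zip(probs, actual):
        predicted_positive = p > cutoff
        if predicted_positive and a:
            tp += 1
        elif predicted_positive and not a:
            fp += 1
        elif not predicted_positive and a:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn
```

Lowering the cutoff can only move cases toward the "positive" column: some false negatives become true positives, and some true negatives become false positives, exactly as in the example above.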
     
    data set, we would compute the matrix like this::
 
-      cm = orngStat.confusionMatrices(resVeh, \
+      cm = Orange.evaluation.scoring.confusionMatrices(resVeh, \
 vehicle.domain.classVar.values.index("van"))
 
     
    The function then returns a three-dimensional matrix, where the element
    A[:obj:`learner`][:obj:`actualClass`][:obj:`predictedClass`]
-   gives the number of examples belonging to 'actualClass' for which the
+   gives the number of instances belonging to 'actualClass' for which the
    'learner' predicted 'predictedClass'. We shall compute and print out
    the matrix for the naive Bayesian classifier.
     
    Here we see another example from `statExamples.py`_::
 
-       cm = orngStat.confusionMatrices(resVeh)[0]
+       cm = Orange.evaluation.scoring.confusionMatrices(resVeh)[0]
        classes = vehicle.domain.classVar.values
        print "\t"+"\t".join(classes)
     
    already, we've printed it out above), and the 10 misclassified pictures
    were classified as buses (6) and Saab cars (4). In all other classes,
-   there were more examples misclassified as vans than correctly classified
-   examples. The classifier is obviously quite biased to vans.
+   there were more instances misclassified as vans than correctly classified
+   instances. The classifier is obviously quite biased towards vans.
 
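The per-learner counting behind such a multi-class matrix (rows for actual classes, columns for predictions) can be sketched in a few lines of plain Python (illustrative only, not Orange's implementation):

```python
def confusion_matrix(actual, predicted, n_classes):
    """Build an n_classes x n_classes matrix where m[a][p] counts
    instances of actual class `a` predicted as class `p`."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        m[a][p] += 1
    return m
```

The diagonal holds correctly classified instances; a column with many off-diagonal counts (like the "van" column above) signals a bias towards that class.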
    .. method:: sens(confm)
     
    part of `statExamples.py`_::
 
-       cm = orngStat.confusionMatrices(res)
+       cm = Orange.evaluation.scoring.confusionMatrices(res)
        print
        print "method\tsens\tspec"
        for l in range(len(learners)):
-           print "%s\t%5.3f\t%5.3f" % (learners[l].name, orngStat.sens(cm[l]), orngStat.spec(cm[l]))
+           print "%s\t%5.3f\t%5.3f" % (learners[l].name, Orange.evaluation.scoring.sens(cm[l]), Orange.evaluation.scoring.spec(cm[l]))
 
    .. _statExamples.py: code/statExamples.py
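Both scores printed above are simple ratios over confusion-matrix counts; a self-contained sketch (taking raw counts instead of Orange's ConfusionMatrix object):

```python
def sens(tp, fn):
    """Sensitivity (true positive rate): TP / (TP + FN)."""
    return tp / float(tp + fn)

def spec(tn, fp):
    """Specificity (true negative rate): TN / (TN + FP)."""
    return tn / float(tn + fp)
```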
     
    .. attribute:: AUC.ByWeightedPairs (or 0)
 
-      Computes AUC for each pair of classes (ignoring examples of all other
+      Computes AUC for each pair of classes (ignoring instances of all other
       classes) and averages the results, weighting them by the number of
-      pairs of examples from these two classes (e.g. by the product of
+      pairs of instances from these two classes (e.g. by the product of
       probabilities of the two classes). AUC computed in this way still
       behaves as a concordance index, i.e., gives the probability that two
-      randomly chosen examples from different classes will be correctly
+      randomly chosen instances from different classes will be correctly
       recognized (this is of course true only if the classifier knows
-      from which two classes the examples came).
+      from which two classes the instances came).
 
    .. attribute:: AUC.ByPairs (or 1)
     
       the class probabilities. This is related to the concordance index in which
       we test the classifier's (average) capability for distinguishing the
-      examples from a specified class from those that come from other classes.
+      instances from a specified class from those that come from other classes.
       Unlike the binary AUC, the measure is not independent of class
       distributions.
     
       As above, except that the average is not weighted.
 
-   In case of :obj:`multiple folds` (for instance if the data comes from cross
+   In case of multiple folds (for instance if the data comes from cross
    validation), the computation goes like this. When computing the partial
    AUCs for individual pairs of classes or singled-out classes, AUC is
    computed for each fold separately and then averaged (ignoring the number
-   of examples in each fold, it's just a simple average). However, if a
-   certain fold doesn't contain any examples of a certain class (from the
+   of instances in each fold, it's just a simple average). However, if a
+   certain fold doesn't contain any instances of a certain class (from the
    pair), the partial AUC is computed treating the results as if they came
    from a single fold. This is not really correct since the class
     
    CA, of course)::
 
-       AUCs = orngStat.AUC(res)
+       AUCs = Orange.evaluation.scoring.AUC(res)
        for l in range(len(learners)):
            print "%10s: %5.3f" % (learners[l].name, AUCs[l])
     
    of pairs. Or, you can specify the averaging method yourself, like this::
 
-       AUCs = orngStat.AUC(resVeh, orngStat.AUC.WeightedOneAgainstAll)
+       AUCs = Orange.evaluation.scoring.AUC(resVeh, Orange.evaluation.scoring.AUC.WeightedOneAgainstAll)
 
    The following snippet tries out all four. (We don't claim that this is
     
        print " " *25 + "  \tbayes\ttree\tmajority"
        for i in range(4):
-           AUCs = orngStat.AUC(resVeh, i)
+           AUCs = Orange.evaluation.scoring.AUC(resVeh, i)
            print "%25s: \t%5.3f\t%5.3f\t%5.3f" % ((methods[i], ) + tuple(AUCs))
 
     
 We shall use the following code to prepare suitable experimental results::
 
-    ri2 = orange.MakeRandomIndices2(voting, 0.6)
+    ri2 = Orange.core.MakeRandomIndices2(voting, 0.6)
     train = voting.selectref(ri2, 0)
     test = voting.selectref(ri2, 1)
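The split performed here assigns each instance a fold label 0 or 1, with roughly 60% labelled 0. A plain-Python sketch of the same idea (the function name is invented; Orange's own indexing classes handle stratification and more):

```python
import random

def make_random_indices2(n, p0, seed=0):
    """Return a 0/1 fold vector: about p0*n zeros (e.g. the training
    part) and the rest ones (the test part), in random order."""
    rng = random.Random(seed)
    n0 = int(round(p0 * n))
    indices = [0] * n0 + [1] * (n - n0)
    rng.shuffle(indices)
    return indices
```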
     
 
     If results are from a single repetition, we assume independence of
-    examples and treat the classification accuracy as distributed according
+    instances and treat the classification accuracy as distributed according
     to a binomial distribution. This can be approximated by a normal distribution,
     so we report the SE of sqrt(CA*(1-CA)/N), where CA is classification
-    accuracy and N is number of test examples.
+    accuracy and N is the number of test instances.
 
     Instead of ExperimentResults, this function can be given a list of
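The standard-error formula quoted in this docstring is the usual normal approximation to the binomial, which is a one-liner:

```python
import math

def ca_standard_error(ca, n):
    """SE of classification accuracy under the normal approximation
    to the binomial: sqrt(CA * (1 - CA) / N)."""
    return math.sqrt(ca * (1.0 - ca) / n)
```

For example, an accuracy of 0.5 measured on 100 test instances has a standard error of 0.05; the SE shrinks with sqrt(N) and vanishes as CA approaches 0 or 1.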
     
    `Kononenko and Bratko (1991) \
    <http://www.springerlink.com/content/g5p7473160476612/>`_.
-    Argument 'apriori' gives the apriori class
+    Argument :obj:`apriori` gives the a priori class
     distribution; if it is omitted, the class distribution is computed from
-    the actual classes of examples in res.
+    the actual classes of instances in :obj:`res`.
     """
     if not apriori:
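As a sketch of the measure this docstring refers to: the Kononenko-Bratko information score rewards a classifier (in bits) for raising the probability of the correct class above its prior, and penalises it for lowering it. This is my reading of the 1991 paper; Orange's implementation may differ in details such as weighting and averaging:

```python
import math

def information_score(p_correct, prior):
    """Per-instance Kononenko-Bratko information score, in bits.
    Positive when the classifier assigns the correct class more
    probability than the prior, negative otherwise."""
    if p_correct >= prior:
        return -math.log(prior, 2) + math.log(p_correct, 2)
    return math.log(1.0 - prior, 2) - math.log(1.0 - p_correct, 2)
```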
     
 def AUCWilcoxon(res, classIndex=-1, **argkw):
     """ Computes the area under ROC (AUC) and its standard error using
-    Wilcoxon's approach proposed by Hanley and McNeal (1982). If classIndex
-    is not specified, the first class is used as "the positive" and others
-    are negative. The result is a list of tuples (aROC, standard error).
+    Wilcoxon's approach proposed by Hanley and McNeil (1982). If
+    :obj:`classIndex` is not specified, the first class is used as
+    "the positive" and others are negative. The result is a list of
+    tuples (aROC, standard error).
     """
     import corn
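The standard error referred to here is, as I recall the Hanley & McNeil (1982) paper, computed from the AUC and the positive/negative sample sizes alone; a sketch of that formula (not Orange's `corn`-based implementation):

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC estimate per Hanley & McNeil (1982),
    using Q1 = A/(2-A) and Q2 = 2A^2/(1+A)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc * auc)
           + (n_neg - 1) * (q2 - auc * auc)) / (n_pos * n_neg)
    return math.sqrt(var)
```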
     
     the function like this::
 
-        orngStat.AUC_single(resVeh, \
+        Orange.evaluation.scoring.AUC_single(resVeh, \
 classIndex = vehicle.domain.classVar.values.index("van"))
     """
     
 # Results over folds are averages; if some folds have examples from one class only, the folds are merged
 def AUC_pair(res, classIndex1, classIndex2, useWeights = True):
-    """ Computes AUC between a pair of examples, ignoring examples from all
+    """ Computes AUC for a pair of classes, ignoring instances from all
     other classes.
     """
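The pairwise AUC described here is the concordance index restricted to two classes: the probability that a randomly chosen instance of one class is scored above a randomly chosen instance of the other. A self-contained sketch on raw scores (not Orange's implementation, which works on ExperimentResults and weights):

```python
def auc_pair(pos_scores, neg_scores):
    """AUC as a concordance index: fraction of (positive, negative)
    pairs in which the positive instance gets the higher score,
    counting ties as one half."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))
```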