Changeset 7840:69dc3fd3891a in orange

Timestamp: 04/14/11 09:14:06
Branch: default
Convert: 145119a52590fa17f6552c72dfb7c019285b9e0e
File: 1 edited
Legend:
 	Unmodified
 +	Added
 -	Removed

orange/Orange/evaluation/scoring.py
r7784 → r7840:

 .. autofunction:: AP

-.. autofunction:: BrierScore
+.. autofunction:: Brier_score

 .. autofunction:: IS
…
 ================

-.. autofunction:: confusionMatrices
+.. autofunction:: confusion_matrices

 **A positive-negative confusion matrix** is computed (a) if the class is
…
 We can also add the keyword argument :obj:`cutoff`
-(e.g. confusionMatrices(results, cutoff=0.3); if we do, :obj:`confusionMatrices`
+(e.g. confusion_matrices(results, cutoff=0.3); if we do, :obj:`confusion_matrices`
 will disregard the classifiers' class predictions and observe the predicted
 probabilities, and consider the prediction "positive" if the predicted
…
 for naive Bayesian classifier::

-    cm = Orange.evaluation.scoring.confusionMatrices(res)[0]
+    cm = Orange.evaluation.scoring.confusion_matrices(res)[0]
     print "Confusion matrix for naive Bayes:"
     print "TP: %i, FP: %i, FN: %s, TN: %i" % (cm.TP, cm.FP, cm.FN, cm.TN)

-    cm = Orange.evaluation.scoring.confusionMatrices(res, cutoff=0.2)[0]
+    cm = Orange.evaluation.scoring.confusion_matrices(res, cutoff=0.2)[0]
     print "Confusion matrix for naive Bayes:"
     print "TP: %i, FP: %i, FN: %s, TN: %i" % (cm.TP, cm.FP, cm.FN, cm.TN)
…
 data set, we would compute the matrix like this::

-    cm = Orange.evaluation.scoring.confusionMatrices(resVeh, \
+    cm = Orange.evaluation.scoring.confusion_matrices(resVeh, \
         vehicle.domain.classVar.values.index("van"))
…
 Here we see another example from `statExamples.py`_::

-    cm = Orange.evaluation.scoring.confusionMatrices(resVeh)[0]
+    cm = Orange.evaluation.scoring.confusion_matrices(resVeh)[0]
     classes = vehicle.domain.classVar.values
     print "\t"+"\t".join(classes)
…
 part of `statExamples.py`_::

-    cm = Orange.evaluation.scoring.confusionMatrices(res)
+    cm = Orange.evaluation.scoring.confusion_matrices(res)
     print
     print "method\tsens\tspec"
…
 of pairs. Or, you can specify the averaging method yourself, like this::

-    AUCs = Orange.evaluation.scoring.AUC(resVeh, orngStat.AUC.WeightedOneAgainstAll)
+    AUCs = Orange.evaluation.scoring.AUC(resVeh, Orange.evaluation.scoring.AUC.WeightedOneAgainstAll)

 The following snippet tries out all four. (We don't claim that this is
…
 them, require that the results come from a test with a single iteration,
 and they always compare one chosen class against all others. If you have
-cross validation results, you can either use splitByIterations to split the
+cross validation results, you can either use split_by_iterations to split the
 results by folds, call the function for each fold separately and then sum
 the results up however you see fit, or you can set the ExperimentResults'
 attribute numberOfIterations to 1, to cheat the function - at your own
 responsibility for the statistical correctness. Regarding the multiclass
-problems, if you don't chose a specific class, orngStat will use the class
+problems, if you don't chose a specific class, Orange.evaluation.scoring will use the class
 attribute's baseValue at the time when results were computed. If baseValue
 was not given at that time, 1 (that is, the second class) is used as default.
…
 .. autofunction:: AUCWilcoxon

-.. autofunction:: computeROC
+.. autofunction:: compute_ROC

 Comparison of Algorithms
…
 .. autofunction:: McNemar

-.. autofunction:: McNemarOfTwo
+.. autofunction:: McNemar_of_two

 ==========
…
 =================

-.. autofunction:: splitByIterations
+.. autofunction:: split_by_iterations

 """
…
     return math.log(x)/math.log(2)

-def checkNonZero(x):
+def check_non_zero(x):
     """Throw Value Error when x = 0.0."""
     if x==0.0:
…
-def splitByIterations(res):
+def split_by_iterations(res):
     """ Splits ExperimentResults of multiple iteratation test into a list
     of ExperimentResults, one for each iteration.
…
-def classProbabilitiesFromRes(res, **argkw):
+def class_probabilities_from_res(res, **argkw):
     """Calculate class probabilities"""
     probs = [0.0] * len(res.classValues)
…
         probs[tex.actualClass] += tex.weight
         totweight += tex.weight
-    checkNonZero(totweight)
+    check_non_zero(totweight)
     return [prob/totweight for prob in probs]


-def statisticsByFolds(stats, foldN, reportSE, iterationIsOuter):
+def statistics_by_folds(stats, foldN, reportSE, iterationIsOuter):
     # remove empty folds, turn the matrix so that learner is outer
     if iterationIsOuter:
…
 # Scores for evaluation of numeric predictions

-def checkArgkw(dct, lst):
-    """checkArgkw(dct, lst) -> returns true if any items have non-zero value in dct"""
+def check_argkw(dct, lst):
+    """check_argkw(dct, lst) -> returns true if any items have non-zero value in dct"""
     return reduce(lambda x,y: x or y, [dct.get(k, 0) for k in lst])

-def regressionError(res, **argkw):
-    """regressionError(res) -> regression error (default: MSE)"""
+def regression_error(res, **argkw):
+    """regression_error(res) -> regression error (default: MSE)"""
     if argkw.get("SE", 0) and res.numberOfIterations > 1:
         # computes the scores for each iteration, then averages
…
 def MSE(res, **argkw):
     """ Computes mean-squared error. """
-    return regressionError(res, **argkw)
+    return regression_error(res, **argkw)

 def RMSE(res, **argkw):
     """ Computes root mean-squared error. """
     argkw.setdefault("sqrt", True)
-    return regressionError(res, **argkw)
+    return regression_error(res, **argkw)

 def MAE(res, **argkw):
     """ Computes mean absolute error. """
     argkw.setdefault("abs", True)
-    return regressionError(res, **argkw)
+    return regression_error(res, **argkw)

 def RSE(res, **argkw):
     """ Computes relative squared error. """
     argkw.setdefault("norm-sqr", True)
-    return regressionError(res, **argkw)
+    return regression_error(res, **argkw)

 def RRSE(res, **argkw):
…
     argkw.setdefault("norm-sqr", True)
     argkw.setdefault("sqrt", True)
-    return regressionError(res, **argkw)
+    return regression_error(res, **argkw)

 def RAE(res, **argkw):
…
     argkw.setdefault("abs", True)
     argkw.setdefault("norm-abs", True)
-    return regressionError(res, **argkw)
+    return regression_error(res, **argkw)

 def R2(res, **argkw):
…
     argkw.setdefault("norm-sqr", True)
     argkw.setdefault("R2", True)
-    return regressionError(res, **argkw)
+    return regression_error(res, **argkw)

 def MSE_old(res, **argkw):
…
     if type(res)==ConfusionMatrix:
         div = nm.TP+nm.FN+nm.FP+nm.TN
-        checkNonZero(div)
+        check_non_zero(div)
         ca = [(nm.TP+nm.TN)/div]
     else:
…
             CAs = map(lambda res, cls: res+(cls==tex.actualClass and tex.weight), CAs, tex.classes)
             totweight += tex.weight
-    checkNonZero(totweight)
+    check_non_zero(totweight)
     ca = [x/totweight for x in CAs]
…
         foldN[tex.iterationNumber] += tex.weight

-    return statisticsByFolds(CAsByFold, foldN, reportSE, False)
+    return statistics_by_folds(CAsByFold, foldN, reportSE, False)
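The MSE/RMSE/MAE/RSE family renamed above all dispatches through regression_error, which selects the variant via keyword flags such as "sqrt", "abs" and "norm-sqr". As a standalone sketch of what the unweighted, single-iteration variants compute (illustrative only; the module's implementation operates on ExperimentResults and supports weights and per-fold standard errors):

```python
import math

def regression_scores(actual, predicted):
    """Compute MSE, RMSE and MAE for paired lists of numbers.

    A standalone sketch of the quantities the regression_error-based
    scores report; names here are illustrative, not the module's API.
    """
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n           # mean squared error
    return {"MSE": mse,
            "RMSE": math.sqrt(mse),                # root of the MSE
            "MAE": sum(abs(e) for e in errors) / n}  # mean absolute error

scores = regression_scores([1.0, 2.0, 3.0], [1.0, 2.5, 2.5])
# for this toy data: MSE = 1/6, MAE = 1/3
```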
…
         APs = map(lambda res, probs: res + probs[tex.actualClass]*tex.weight, APs, tex.probabilities)
         totweight += tex.weight
-    checkNonZero(totweight)
+    check_non_zero(totweight)
     return [AP/totweight for AP in APs]

…
         foldN[tex.iterationNumber] += tex.weight

-    return statisticsByFolds(APsByFold, foldN, reportSE, True)
+    return statistics_by_folds(APsByFold, foldN, reportSE, True)


-def BrierScore(res, reportSE = False, **argkw):
+def Brier_score(res, reportSE = False, **argkw):
     """ Computes the Brier's score, defined as the average (over test examples)
     of sumx(t(x)-p(x))2, where x is a class, t(x) is 1 for the correct class
…
             res + tex.weight*reduce(lambda s, pi: s+pi**2, probs, 0) - 2*probs[tex.actualClass], MSEs, tex.probabilities)
     totweight = gettotweight(res)
-    checkNonZero(totweight)
+    check_non_zero(totweight)
     if reportSE:
         return [(max(x/totweight+1.0, 0), 0) for x in MSEs]  ## change this, not zero!!!
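Brier_score's docstring defines the score as the average, over test examples, of sum over classes x of (t(x) - p(x))^2, with t(x) = 1 for the example's actual class and 0 otherwise. A minimal standalone sketch of that definition (without the weighting, iteration splitting and standard-error handling in the renamed function):

```python
def brier_score(probabilities, actual_classes):
    """Average over examples of sum_x (t(x) - p(x))**2.

    `probabilities` is a list of per-example class-probability lists;
    `actual_classes` holds the index of each example's true class.
    A standalone sketch of the definition in Brier_score's docstring.
    """
    total = 0.0
    for probs, actual in zip(probabilities, actual_classes):
        # t(x) is the one-hot indicator of the actual class
        total += sum(((1.0 if i == actual else 0.0) - p) ** 2
                     for i, p in enumerate(probs))
    return total / len(actual_classes)

# a perfect prediction scores 0; [0.8, 0.2] against class 0 scores 0.08
```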
…
         foldN[tex.iterationNumber] += tex.weight

-    stats = statisticsByFolds(BSs, foldN, reportSE, True)
+    stats = statistics_by_folds(BSs, foldN, reportSE, True)
     if reportSE:
         return [(x+1.0, y) for x, y in stats]
…
 def BSS(res, **argkw):
-    return [1-x/2 for x in apply(BrierScore, (res, ), argkw)]
+    return [1-x/2 for x in apply(Brier_score, (res, ), argkw)]

 def IS_ex(Pc, P):
…
     """
     if not apriori:
-        apriori = classProbabilitiesFromRes(res)
+        apriori = class_probabilities_from_res(res)

     if res.numberOfIterations==1:
…
         foldN[tex.iterationNumber] += tex.weight

-    return statisticsByFolds(ISs, foldN, reportSE, False)
+    return statistics_by_folds(ISs, foldN, reportSE, False)


 def Friedman(res, statistics, **argkw):
     sums = None
-    for ri in splitByIterations(res):
+    for ri in split_by_iterations(res):
         ranks = statc.rankdata(apply(statistics, (ri,), argkw))
         if sums:
…
 def Wilcoxon(res, statistics, **argkw):
     res1, res2 = [], []
-    for ri in splitByIterations(res):
+    for ri in split_by_iterations(res):
         stats = apply(statistics, (ri,), argkw)
         if (len(stats) != 2):
…
     return statc.wilcoxont(res1, res2)

-def rankDifference(res, statistics, **argkw):
+def rank_difference(res, statistics, **argkw):
     if not res.results:
         raise TypeError, "no experiments"
…
 class ConfusionMatrix:
+    """ Class ConfusionMatrix stores data about false and true
+    predictions compared to real class. It stores the number of
+    True Negatives, False Positive, False Negatives and True Positives.
+    """
     def __init__(self):
         self.TP = self.FN = self.FP = self.TN = 0.0
…
-def confusionMatrices(res, classIndex=-1, **argkw):
+def confusion_matrices(res, classIndex=-1, **argkw):
     """ This function can compute two different forms of confusion matrix:
     one in which a certain class is marked as positive and the other(s)
…
 # obsolete (renamed)
-computeConfusionMatrices = confusionMatrices
+compute_confusion_matrices = confusion_matrices


-def confusionChiSquare(confusionMatrix):
+def confusion_chi_square(confusionMatrix):
     dim = len(confusionMatrix)
     rowPriors = [sum(r) for r in confusionMatrix]
…
     return r

-def scottsPi(confm, bIsListOfMatrices=True):
+def scotts_pi(confm, bIsListOfMatrices=True):
     """Compute Scott's Pi for measuring inter-rater agreement for nominal data
…
     @param confm: confusion matrix, or list of confusion matrices. To obtain
        non-binary confusion matrix, call
-       orngStat.computeConfusionMatrices and set the
+       Orange.evaluation.scoring.compute_confusion_matrices and set the
        classIndex parameter to -2.
     @param bIsListOfMatrices: specifies whether confm is list of matrices.
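The ConfusionMatrix class and confusion_matrices function in the hunks above collect TP/FP/FN/TN counts; with the cutoff keyword, the predicted probability of the positive class is thresholded instead of using the classifiers' crisp predictions. A standalone sketch of that binary case (names here are illustrative, not part of the module):

```python
class BinaryConfusion(object):
    """TP/FP/FN/TN counters, mirroring the ConfusionMatrix fields."""
    def __init__(self):
        self.TP = self.FP = self.FN = self.TN = 0

def confusion_from_probabilities(pos_probs, actual, cutoff=0.5):
    """Threshold the positive-class probability at `cutoff` and tally
    the four confusion-matrix cells.  A standalone sketch of the
    `cutoff` behaviour described for confusion_matrices; the module's
    version works on ExperimentResults and supports weights.
    """
    cm = BinaryConfusion()
    for p, is_pos in zip(pos_probs, actual):
        if p > cutoff:            # predicted positive
            if is_pos:
                cm.TP += 1
            else:
                cm.FP += 1
        else:                     # predicted negative
            if is_pos:
                cm.FN += 1
            else:
                cm.TN += 1
    return cm
```

Lowering the cutoff (e.g. from 0.5 to 0.3) turns some FN cells into TP and some TN cells into FP, which is exactly the trade-off the cutoff=0.2 example above explores.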
…
     if bIsListOfMatrices:
         try:
-            return [scottsPi(cm, bIsListOfMatrices=False) for cm in confm]
+            return [scotts_pi(cm, bIsListOfMatrices=False) for cm in confm]
         except TypeError:
             # Nevermind the parameter, maybe this is a "conventional" binary
…
 AROC = AUCWilcoxon # for backward compatibility, AROC is obsolote

-def compare2AUCs(res, lrn1, lrn2, classIndex=-1, **argkw):
+def compare_2_AUCs(res, lrn1, lrn2, classIndex=-1, **argkw):
     import corn
     return corn.compare2ROCs(res, lrn1, lrn2, classIndex, res.weights and not argkw.get("unweighted"))

-compare2AROCs = compare2AUCs # for backward compatibility, compare2AROCs is obsolote
+compare_2_AROCs = compare_2_AUCs # for backward compatibility, compare_2_AROCs is obsolote


-def computeROC(res, classIndex=-1):
+def compute_ROC(res, classIndex=-1):
     """ Computes a ROC curve as a list of (x, y) tuples, where x is
     1-specificity and y is sensitivity.
…
 ## TC's implementation of algorithms, taken from:
 ## T Fawcett: ROC Graphs: Notes and Practical Considerations for Data Mining Researchers, submitted to KDD Journal.
-def ROCslope((P1x, P1y, P1fscore), (P2x, P2y, P2fscore)):
+def ROC_slope((P1x, P1y, P1fscore), (P2x, P2y, P2fscore)):
     if (P1x == P2x):
         return 1e300
     return (P1y - P2y) / (P1x - P2x)

-def ROCaddPoint(P, R, keepConcavities=1):
+def ROC_add_point(P, R, keepConcavities=1):
     if keepConcavities:
         R.append(P)
…
             T = R.pop()
             T2 = R[-1]
-            if ROCslope(T2, T) > ROCslope(T, P):
+            if ROC_slope(T2, T) > ROC_slope(T, P):
                 R.append(T)
                 R.append(P)
…
     return R

-def TCcomputeROC(res, classIndex=-1, keepConcavities=1):
+def TC_compute_ROC(res, classIndex=-1, keepConcavities=1):
     import corn
     problists, tots = corn.computeROCCumulative(res, classIndex)
…
         else:
             fpr = 0.0
-        curve = ROCaddPoint((fpr, tpr, fPrev), curve, keepConcavities)
+        curve = ROC_add_point((fpr, tpr, fPrev), curve, keepConcavities)
         fPrev = f
         thisPos, thisNeg = prob[1][1], prob[1][0]
…
         else:
             fpr = 0.0
-        curve = ROCaddPoint((fpr, tpr, f), curve, keepConcavities) ## ugly
+        curve = ROC_add_point((fpr, tpr, f), curve, keepConcavities) ## ugly
     results.append(curve)
…
 ## returns a list of points at the intersection of the tangential iso-performance line and the given ROC curve
 ## for given values of FPcost, FNcost and pval
-def TCbestThresholdsOnROCcurve(FPcost, FNcost, pval, curve):
+def TC_best_thresholds_on_ROC_curve(FPcost, FNcost, pval, curve):
     m = (FPcost*(1.0 - pval)) / (FNcost*pval)
…
 ## for each (sub)set of input ROC curves
 ## returns the average ROC curve and an array of (vertical) standard deviations
-def TCverticalAverageROC(ROCcurves, samples = 10):
+def TC_vertical_average_ROC(ROCcurves, samples = 10):
     def INTERPOLATE((P1x, P1y, P1fscore), (P2x, P2y, P2fscore), X):
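TC_compute_ROC and ROC_add_point above build a ROC curve by sweeping a threshold over the scores in descending order, following Fawcett's construction cited in the comments. A minimal standalone sketch of that sweep, without the tie handling and concavity removal of the functions above (illustrative names, not the module's API):

```python
def roc_points(scores, labels):
    """Emit (FPR, TPR) points by lowering a threshold over descending
    scores.  Each example passed adds a vertical step (if positive) or
    a horizontal step (if negative).  A standalone sketch only: no
    score ties, iso-performance lines or concavity handling.
    """
    pairs = sorted(zip(scores, labels), reverse=True)
    P = sum(1 for _, y in pairs if y)       # number of positives
    N = len(pairs) - P                      # number of negatives
    points = [(0.0, 0.0)]
    tp = fp = 0
    for score, y in pairs:
        if y:
            tp += 1
        else:
            fp += 1
        points.append((fp / float(N), tp / float(P)))
    return points
```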
         if (P1x == P2x) or ((X > P1x) and (X > P2x)) or ((X < P1x) and (X < P2x)):
…
         elif fp < FPsample and i + 1 == len(ROC): # return the last
             return ROC[i][1]
-        raise ValueError, "cannot compute: TP_FOR_FP in TCverticalAverageROC"
+        raise ValueError, "cannot compute: TP_FOR_FP in TC_vertical_average_ROC"
         #return 0.0
…
 ## for each (sub)set of input ROC curves
 ## returns the average ROC curve, an array of vertical standard deviations and an array of horizontal standard deviations
-def TCthresholdlAverageROC(ROCcurves, samples = 10):
+def TC_threshold_average_ROC(ROCcurves, samples = 10):
     def POINT_AT_THRESH(ROC, npts, thresh):
         i = 0
…
 ## - yesClassRugPoints is an array of (x, 1) points
 ## - noClassRugPoints is an array of (x, 0) points
-def computeCalibrationCurve(res, classIndex=-1):
+def compute_calibration_curve(res, classIndex=-1):
     import corn
     ## merge multiple iterations into one
…
 ## returns an array of curve elements, where:
 ## - curve is an array of points ((TP+FP)/(P + N), TP/P, (th, FP/N)) on the Lift Curve
-def computeLiftCurve(res, classIndex=-1):
+def compute_lift_curve(res, classIndex=-1):
     import corn
     ## merge multiple iterations into one
…
         self.C, self.D, self.T = C, D, T

-def isCDTEmpty(cdt):
+def is_CDT_empty(cdt):
     return cdt.C + cdt.D + cdt.T < 1e-20


-def computeCDT(res, classIndex=-1, **argkw):
+def compute_CDT(res, classIndex=-1, **argkw):
     """Obsolete, don't use"""
     import corn
…
     if (res.numberOfIterations>1):
         CDTs = [CDT() for i in range(res.numberOfLearners)]
-        iterationExperiments = splitByIterations(res)
+        iterationExperiments = split_by_iterations(res)
         for exp in iterationExperiments:
             expCDTs = corn.computeCDT(exp, classIndex, useweights)
…
             CDTs[i].T += expCDTs[i].T
     for i in range(res.numberOfLearners):
-        if isCDTEmpty(CDTs[0]):
+        if is_CDT_empty(CDTs[0]):
             return corn.computeCDT(res, classIndex, useweights)
…
 ## THIS FUNCTION IS OBSOLETE AND ITS AVERAGING OVER FOLDS IS QUESTIONABLE
 ## DON'T USE IT
-def ROCsFromCDT(cdt, **argkw):
+def ROCs_from_CDT(cdt, **argkw):
     """Obsolete, don't use"""
     if type(cdt) == list:
-        return [ROCsFromCDT(c) for c in cdt]
+        return [ROCs_from_CDT(c) for c in cdt]

     C, D, T = cdt.C, cdt.D, cdt.T
…
     return res

-AROCFromCDT = ROCsFromCDT # for backward compatibility, AROCFromCDT is obsolote
+AROC_from_CDT = ROCs_from_CDT # for backward compatibility, AROC_from_CDT is obsolote
…
 def AUC_x(cdtComputer, ite, all_ite, divideByIfIte, computerArgs):
     cdts = cdtComputer(*(ite, ) + computerArgs)
-    if not isCDTEmpty(cdts[0]):
+    if not is_CDT_empty(cdts[0]):
         return [(cdt.C+cdt.T/2)/(cdt.C+cdt.D+cdt.T)/divideByIfIte for cdt in cdts], True

     if all_ite:
         cdts = cdtComputer(*(all_ite, ) + computerArgs)
-        if not isCDTEmpty(cdts[0]):
+        if not is_CDT_empty(cdts[0]):
             return [(cdt.C+cdt.T/2)/(cdt.C+cdt.D+cdt.T) for cdt in cdts], False
…
 def AUC_binary(res, useWeights = True):
     if res.numberOfIterations > 1:
-        return AUC_iterations(AUC_i, splitByIterations(res), (-1, useWeights, res, res.numberOfIterations))
+        return AUC_iterations(AUC_i, split_by_iterations(res), (-1, useWeights, res, res.numberOfIterations))
     else:
         return AUC_i(res, -1, useWeights)[0]
…
     if res.numberOfIterations > 1:
-        iterations = splitByIterations(res)
+        iterations = split_by_iterations(res)
         all_ite = res
     else:
…
     if method in [0, 2]:
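compute_CDT above tabulates concordant (C), discordant (D) and tied (T) positive/negative score pairs, from which AUC_x derives AUC as (C + T/2) / (C + D + T). A standalone brute-force sketch of those counts (the module computes them in the corn extension; names here are illustrative):

```python
def cdt_auc(scores, labels):
    """Count concordant, discordant and tied pairs of one positive and
    one negative example, and derive AUC as (C + T/2)/(C + D + T).
    A quadratic standalone sketch of what compute_CDT tabulates.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    C = D = T = 0
    for p in pos:
        for n in neg:
            if p > n:       # positive ranked above negative
                C += 1
            elif p < n:     # negative ranked above positive
                D += 1
            else:           # tied scores
                T += 1
    return C, D, T, (C + T / 2.0) / (C + D + T)
```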
-        prob = classProbabilitiesFromRes(res)
+        prob = class_probabilities_from_res(res)

     if method <= 1:
…
     if res.numberOfIterations > 1:
-        return AUC_iterations(AUC_i, splitByIterations(res), (classIndex, useWeights, res, res.numberOfIterations))
+        return AUC_iterations(AUC_i, split_by_iterations(res), (classIndex, useWeights, res, res.numberOfIterations))
     else:
         return AUC_i( res, classIndex, useWeights)[0]
…
     """
     if res.numberOfIterations > 1:
-        return AUC_iterations(AUC_ij, splitByIterations(res), (classIndex1, classIndex2, useWeights, res, res.numberOfIterations))
+        return AUC_iterations(AUC_ij, split_by_iterations(res), (classIndex1, classIndex2, useWeights, res, res.numberOfIterations))
     else:
         return AUC_ij(res, classIndex1, classIndex2, useWeights)
…
     classes = vehicle.domain.classVar.values
-    AUCmatrix = orngStat.AUC_matrix(resVeh)[0]
+    AUCmatrix = Orange.evaluation.scoring.AUC_matrix(resVeh)[0]
     print "\t"+"\t".join(classes[:-1])
     for className, AUCrow in zip(classes[1:], AUCmatrix[1:]):
…
     if res.numberOfIterations > 1:
-        iterations, all_ite = splitByIterations(res), res
+        iterations, all_ite = split_by_iterations(res), res
     else:
         iterations, all_ite = [res], None

     aucs = [[[] for i in range(numberOfClasses)] for i in range(numberOfLearners)]
-    prob = classProbabilitiesFromRes(res)
+    prob = class_probabilities_from_res(res)

     for classIndex1 in range(numberOfClasses):
…
-def McNemarOfTwo(res, lrn1, lrn2):
-    """ McNemarOfTwo computes a McNemar statistics for a pair of classifier,
+def McNemar_of_two(res, lrn1, lrn2):
+    """ McNemar_of_two computes a McNemar statistics for a pair of classifier,
     specified by indices learner1 and learner2.
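McNemar_of_two above compares a pair of classifiers through the examples on which exactly one of them is correct. A standalone sketch using the standard continuity-corrected McNemar chi-square (the module's exact statistic may differ in details such as the correction term):

```python
def mcnemar_statistic(correct1, correct2):
    """Continuity-corrected McNemar chi-square from two classifiers'
    per-example correctness vectors.  Only examples on which the two
    classifiers disagree (counts b and c) enter the statistic.
    A standalone sketch, not the module's implementation.
    """
    b = sum(1 for a, d in zip(correct1, correct2) if a and not d)
    c = sum(1 for a, d in zip(correct1, correct2) if d and not a)
    if b + c == 0:
        return 0.0  # the classifiers never disagree
    return (abs(b - c) - 1.0) ** 2 / (b + c)
```

The resulting value is compared against a chi-square distribution with one degree of freedom to obtain the p-value.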
     """
…
     Returns F, p and average ranks
     """
-    res_split = splitByIterations(res)
+    res_split = split_by_iterations(res)
     res = [stat(r) for r in res_split]
…
     for r in res:
         ranks = [k-x+1 for x in statc.rankdata(r)]
-        if stat==BrierScore: # reverse ranks for BrierScore (lower better)
+        if stat==Brier_score: # reverse ranks for Brier_score (lower better)
             ranks = [k+1-x for x in ranks]
         sums = [ranks[i]+sums[i] for i in range(k)]
…
-def WilcoxonPairs(res, avgranks, stat=CA):
+def Wilcoxon_pairs(res, avgranks, stat=CA):
     """ Returns a triangular matrix, where element[i][j] stores significance of difference
     between i-th and j-th classifier, as computed by Wilcoxon test. The element is positive
…
     and, optionally, a statistics; greater values should mean better results.append
-    res_split = splitByIterations(res)
+    res_split = split_by_iterations(res)
     res = [stat(r) for r in res_split]
…
-def plotLearningCurveLearners(file, allResults, proportions, learners, noConfidence=0):
-    plotLearningCurve(file, allResults, proportions, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))], noConfidence)
+def plot_learning_curve_learners(file, allResults, proportions, learners, noConfidence=0):
+    plot_learning_curve(file, allResults, proportions, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))], noConfidence)

-def plotLearningCurve(file, allResults, proportions, legend, noConfidence=0):
+def plot_learning_curve(file, allResults, proportions, legend, noConfidence=0):
     import types
     fopened=0
…
-def printSingleROCCurveCoordinates(file, curve):
+def print_single_ROC_curve_coordinates(file, curve):
     import types
     fopened=0
…
-def plotROCLearners(file, curves, learners):
-    plotROC(file, curves, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))])
+def plot_ROC_learners(file, curves, learners):
+    plot_ROC(file, curves, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))])

-def plotROC(file, curves, legend):
+def plot_ROC(file, curves, legend):
     import types
     fopened=0
…
-def plotMcNemarCurveLearners(file, allResults, proportions, learners, reference=-1):
-    plotMcNemarCurve(file, allResults, proportions, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))], reference)
+def plot_McNemar_curve_learners(file, allResults, proportions, learners, reference=-1):
+    plot_McNemar_curve(file, allResults, proportions, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))], reference)

-def plotMcNemarCurve(file, allResults, proportions, legend, reference=-1):
+def plot_McNemar_curve(file, allResults, proportions, legend, reference=-1):
     if reference<0:
         reference=len(legend)-1
…
     for i in tmap:
         for p in range(len(proportions)):
-            file.write("%f\t%f\n" % (proportions[p], McNemarOfTwo(allResults[p], i, reference)))
+            file.write("%f\t%f\n" % (proportions[p], McNemar_of_two(allResults[p], i, reference)))
         file.write("e\n\n")
…
         file.close()

-defaultPointTypes=("{$\\circ$}", "{$\\diamond$}", "{$+$}", "{$\\times$}", "{$-$}")+tuple([chr(x) for x in range(97, 122)])
-defaultLineTypes=("\\setsolid", "\\setdashpattern <4pt, 2pt>", "\\setdashpattern <8pt, 2pt>", "\\setdashes", "\\setdots")
+default_point_types=("{$\\circ$}", "{$\\diamond$}", "{$+$}", "{$\\times$}", "{$-$}")+tuple([chr(x) for x in range(97, 122)])
+default_line_types=("\\setsolid", "\\setdashpattern <4pt, 2pt>", "\\setdashpattern <8pt, 2pt>", "\\setdashes", "\\setdots")

-def learningCurveLearners2PiCTeX(file, allResults, proportions, **options):
-    return apply(learningCurve2PiCTeX, (file, allResults, proportions), options)
+def learning_curve_learners_to_PiCTeX(file, allResults, proportions, **options):
+    return apply(learning_curve_to_PiCTeX, (file, allResults, proportions), options)

-def learningCurve2PiCTeX(file, allResults, proportions, **options):
+def learning_curve_to_PiCTeX(file, allResults, proportions, **options):
     import types
     fopened=0
…
     yshift=float(options.get("yshift", -ntestexamples/20.))

-    pointtypes=options.get("pointtypes", defaultPointTypes)
-    linetypes=options.get("linetypes", defaultLineTypes)
+    pointtypes=options.get("pointtypes", default_point_types)
+    linetypes=options.get("linetypes", default_line_types)

     if options.has_key("numberedx"):
…
     del file

-def legendLearners2PiCTeX(file, learners, **options):
-    return apply(legend2PiCTeX, (file, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))]), options)
+def legend_learners_to_PiCTeX(file, learners, **options):
+    return apply(legend_to_PiCTeX, (file, [Orange.misc.getobjectname(learners[i], "Learner %i" % i) for i in range(len(learners))]), options)

-def legend2PiCTeX(file, legend, **options):
+def legend_to_PiCTeX(file, legend, **options):
     import types
     fopened=0
…
     fopened=1

-    pointtypes=options.get("pointtypes", defaultPointTypes)
-    linetypes=options.get("linetypes", defaultLineTypes)
+    pointtypes=options.get("pointtypes", default_point_types)
+    linetypes=options.get("linetypes", default_line_types)

     file.write("\\mbox{\n")
…
     return

-def printFigure(fig, *args, **kwargs):
+def print_figure(fig, *args, **kwargs):
     canvas = FigureCanvasAgg(fig)
     canvas.print_figure(*args, **kwargs)
…
     #get pairs of non significant methods

-    def getLines(sums, hsd):
+    def get_lines(sums, hsd):

         #get all pairs
…
         #keep only longest

-        def noLonger((i,j), notSig):
+        def no_longer((i,j), notSig):
             for i1,j1 in notSig:
                 if (i1 <= i and j1 > j) or (i1 < i and j1 >= j):
…
                 return True

-        longest = [ (i,j) for i,j in notSig if noLonger((i,j),notSig) ]
+        longest = [ (i,j) for i,j in notSig if no_longer((i,j),notSig) ]

         return longest

-    lines = getLines(ssums, cd)
+    lines = get_lines(ssums, cd)
     linesblank = 0.2 + 0.2 + (len(lines)-1)*0.1
…
     #non significance lines
-    def drawLines(lines, side=0.05, height=0.1):
+    def draw_lines(lines, side=0.05, height=0.1):
         start = cline + 0.2
         for l,r in lines:
…
             start += height

-        drawLines(lines)
+        draw_lines(lines)

     elif cd:
…
         line([(end, cline + bigtick/2), (end, cline - bigtick/2)], linewidth=2.5)

-    printFigure(fig, filename, **kwargs)
+    print_figure(fig, filename, **kwargs)

 if __name__ == "__main__":