Changeset 42:75bf74617e81 in orange-reliability for orangecontrib/reliability/__init__.py


Timestamp:
10/03/13 16:29:59
Author:
markotoplak
Branch:
default
Message:

Updates to the documentation.

File:
1 edited

  • orangecontrib/reliability/__init__.py

    r40 r42  
    364364    :rtype: :class:`Orange.evaluation.reliability.ReferenceExpectedErrorClassifier` 
    365365 
    366     Reference reliability estimation method for classification as used in Evaluating Reliability of Single 
    367     Classifications of Neural Networks, Darko Pevec, 2011. 
    368  
    369     :math:`O_{ref} = 2 (\hat y - \hat y ^2) = 2 \hat y (1-\hat y)` 
     366    Reference reliability estimation method for classification [Pevec2011]_: 
     367 
     368    :math:`O_{ref} = 2 (\hat y - \hat y ^2) = 2 \hat y (1-\hat y)`, 
    370369 
    371370    where :math:`\hat y` is the estimated probability of the predicted class. 
    372371 
    373     Note that for this method, in contrast with all others, a greater estimate means lower reliability (greater 
    374     expected error). 
     372    Note that for this method, in contrast with all others, a greater estimate means lower reliability (greater expected error). 
    375373 
    376374    """ 
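
As a rough illustration of the formula documented in this hunk, the sketch below evaluates :math:`O_{ref}` for a few probabilities. The function name `reference_estimate` is hypothetical and is not part of the orange-reliability API; it only mirrors the expression `2 * y_hat * (1 - y_hat)` from the docstring.

```python
def reference_estimate(p_hat):
    """Return O_ref = 2 * p_hat * (1 - p_hat).

    p_hat is the estimated probability of the predicted class.
    Hypothetical helper for illustration only.
    """
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be a probability in [0, 1]")
    return 2.0 * p_hat * (1.0 - p_hat)


# A greater estimate means lower reliability: the value peaks at
# p_hat = 0.5 and falls to 0 as the prediction becomes certain.
for p in (0.5, 0.75, 1.0):
    print(p, reference_estimate(p))
```

The estimate is largest (0.5) when the predicted class has probability 0.5 and drops to 0 at probability 1, which matches the docstring's note that a greater value means greater expected error.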
     
    685683     
    686684    Mahalanobis distance reliability estimate is defined as 
    687     `mahalanobis distance <http://en.wikipedia.org/wiki/Mahalanobis_distance>`_ 
     685    `Mahalanobis distance <http://en.wikipedia.org/wiki/Mahalanobis_distance>`_ 
    688686    to the evaluated instance's :math:`k` nearest neighbours. 
    689687 
     
    720718     
    721719    Mahalanobis distance to center reliability estimate is defined as a 
    722     `mahalanobis distance <http://en.wikipedia.org/wiki/Mahalanobis_distance>`_ 
     720    `Mahalanobis distance <http://en.wikipedia.org/wiki/Mahalanobis_distance>`_ 
    723721    between the predicted instance and the centroid of the data. 
    724722 
     
    773771    :rtype: :class:`Orange.evaluation.reliability.BaggingVarianceCNeighboursClassifier` 
    774772     
    775     BVCK is a combination (average) of Bagging variance and local modeling of 
     773    BVCK is an average of Bagging variance and local modeling of 
    776774    prediction error. 
    777775     
     
    884882        return [Estimate(DENS, ABSOLUTE, DENS_ABSOLUTE)] 
    885883 
     884 
     885def _normalize(data): 
     886    dc = Orange.core.DomainContinuizer() 
     887    dc.classTreatment = Orange.core.DomainContinuizer.Ignore 
     888    dc.continuousTreatment = Orange.core.DomainContinuizer.NormalizeByVariance 
     889    domain = dc(data) 
     890    data = data.translate(domain) 
     891    return data 
     892 
     893class _NormalizedLearner(Orange.classification.Learner): 
     894    """ 
     895    Wrapper for normalization. 
     896    """ 
     897    def __init__(self, learner): 
     898        self.learner = learner 
     899 
     900    def __call__(self, data, *args, **kwargs): 
     901        return self.learner(_normalize(data), *args, **kwargs) 
     902 
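
The wrapper added above follows a simple pattern: normalize the training data, then hand it to the wrapped learner. The sketch below shows that pattern in plain Python, without Orange. All names in it (`normalize_columns`, `NormalizedLearner`) are hypothetical. It assumes that normalizing by variance means subtracting the column mean and dividing by the population standard deviation, and it operates on plain lists of numeric feature values. Unlike the Orange code, it has no class column to ignore and does not carry the transformation over to instances seen at prediction time.

```python
import statistics


def normalize_columns(rows):
    """Scale each column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    # A constant column has deviation 0; use 1.0 to avoid dividing by zero.
    devs = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / d for v, m, d in zip(row, means, devs)]
            for row in rows]


class NormalizedLearner:
    """Hypothetical stand-in for the wrapper: normalize, then delegate."""

    def __init__(self, learner):
        self.learner = learner

    def __call__(self, rows, *args, **kwargs):
        return self.learner(normalize_columns(rows), *args, **kwargs)


# Toy "learner" that simply returns the data it was trained on,
# so the effect of the wrapper is visible.
seen = NormalizedLearner(lambda rows: rows)([[1.0, 10.0], [3.0, 30.0]])
print(seen)
```

Keeping normalization in a wrapper lets the default stacking learner stay a plain linear regression, while callers who pass their own `stack_learner` are not forced to normalize.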
    886903class Stacking: 
    887  
    888     def __init__(self, stack_learner, estimators=None, folds=10, save_data=False): 
     904    """ 
     905 
     906    This method develops a model that integrates reliability estimates 
     907    from all available reliability scoring techniques. To develop such a 
     908    model, it needs to perform internal cross-validation, similarly to :class:`ICV`. 
     909 
     910    :param stack_learner: a data modelling method. Default (if None): unregularized linear regression with prior normalization. 
     911    :type stack_learner: :obj:`Orange.classification.Learner`  
     912 
     913    :param estimators: Reliability estimation methods to choose from. Default (if None): :class:`SensitivityAnalysis`, :class:`LocalCrossValidation`, :class:`BaggingVarianceCNeighbours`, :class:`Mahalanobis`, :class:`MahalanobisToCenter`. 
     914    :type estimators: :obj:`list` of reliability estimators 
     915  
     916    :param folds: The number of folds for cross-validation (default 10). 
     917    :type folds: :obj:`int` 
     918 
     919    :param save_data: If True, save the data used for training the 
     920        integration model into the resulting classifier's .data attribute (default False). 
     921    :type save_data: :obj:`bool` 
     922  
     923    """ 
     924  
     925    def __init__(self,  
     926        stack_learner=None,  
     927        estimators=None,  
     928        folds=10,  
     929        save_data=False): 
    889930        self.stack_learner = stack_learner 
    890931        self.estimators = estimators 
    891932        self.folds = folds 
    892933        self.save_data = save_data 
     934        if self.stack_learner is None: 
     935            self.stack_learner=_NormalizedLearner(Orange.regression.linear.LinearRegressionLearner(ridge_lambda=0.0)) 
    893936        if self.estimators is None: 
    894937             self.estimators = [SensitivityAnalysis(), 
     
    947990                data_cv.append(estimates + [ abs(error) ]) 
    948991 
    949             print "DCV", len(data_cv) 
    950  
    951992        lf = None 
    952993 
     
    9601001        lf = Learner(learner, estimators=self.estimators)(data) 
    9611002 
    962         if self.save_data: 
    963             self.classifier_data = classifier_data 
    964  
    965         return StackingClassifier(stack_classifier, lf, newdomain) 
     1003        return StackingClassifier(stack_classifier, lf, newdomain, data=classifier_data if self.save_data else None) 
    9661004 
    9671005 
    9681006class StackingClassifier: 
    9691007 
    970     def __init__(self, stacking_classifier, reliability_classifier, domain): 
     1008    def __init__(self, stacking_classifier, reliability_classifier, domain, data=None): 
    9711009        self.stacking_classifier = stacking_classifier 
    972         print self.stacking_classifier 
    9731010        self.domain = domain 
    9741011        self.reliability_classifier = reliability_classifier 
     1012        self.data = data 
    9751013 
    9761014    def convert(self, instance): 
     
    9901028 
    9911029class ICV: 
    992     """ Perform internal cross validation (as in Automatic selection of 
    993     reliability estimates for individual regression predictions, 
    994     Zoran Bosnic, 2010) and return id of the method 
    995     that scored best on this data. 
     1030    """ Selects the best reliability estimator for 
     1031    the given data with internal cross validation [Bosnic2010]_. 
     1032 
     1033    :param estimators: reliability estimation methods to choose from. Default (if None): :class:`SensitivityAnalysis`, :class:`LocalCrossValidation`, :class:`BaggingVarianceCNeighbours`, :class:`Mahalanobis`, :class:`MahalanobisToCenter`. 
     1034    :type estimators: :obj:`list` of reliability estimators 
     1035  
     1036    :param folds: The number of folds for cross-validation (default 10). 
     1037    :type folds: :obj:`int` 
     1038  
    9961039    """ 
    9971040   
     
    10521095class Learner: 
    10531096    """ 
    1054     Reliability estimation wrapper around a learner we want to test. 
    1055     Different reliability estimation algorithms can be used on the 
    1056     chosen learner. This learner works as any other and can be used as one, 
    1057     but it returns the classifier, wrapped into an instance of 
     1097    Adds reliability estimation to any learner: multiple reliability estimation  
     1098    algorithms can be used simultaneously. 
     1099    This learner can be used as any other learner, 
     1100    but returns the classifier wrapped into an instance of 
    10581101    :class:`Orange.evaluation.reliability.Classifier`. 
    10591102     
    1060     :param box_learner: Learner we want to wrap into a reliability estimation 
     1103    :param box_learner: Learner to wrap into a reliability estimation 
    10611104        classifier. 
    10621105    :type box_learner: :obj:`~Orange.classification.Learner` 
    10631106     
    1064     :param estimators: List of different reliability estimation methods we 
    1065                        want to use on the chosen learner. 
     1107    :param estimators: List of reliability estimation methods. Default (if None): :class:`SensitivityAnalysis`, :class:`LocalCrossValidation`, :class:`BaggingVarianceCNeighbours`, :class:`Mahalanobis`, :class:`MahalanobisToCenter`. 
    10661108    :type estimators: :obj:`list` of reliability estimators 
    10671109     
    1068     :param name: Name of this reliability learner 
     1110    :param name: Name of this reliability learner. 
    10691111    :type name: string 
    10701112     
     
    10911133        """Learn from the given table of data instances. 
    10921134         
    1093         :param instances: Data instances to learn from. 
     1135        :param instances: Data to learn from. 
    10941136        :type instances: Orange.data.Table 
    10951137        :param weight: Id of meta attribute with weights of instances 
    10961138        :type weight: int 
     1139 
    10971140        :rtype: :class:`Orange.evaluation.reliability.Classifier` 
    10981141        """ 
     
    11081151class Classifier: 
    11091152    """ 
    1110     A reliability estimation wrapper for classifiers. 
    1111  
    1112     What distinguishes this classifier is that the returned probabilities (if 
    1113     :obj:`Orange.classification.Classifier.GetProbabilities` or 
    1114     :obj:`Orange.classification.Classifier.GetBoth` is passed) contain an 
    1115     additional attribute :obj:`reliability_estimate`, which is an instance of 
    1116     :class:`~Orange.evaluation.reliability.Estimate`. 
    1117  
     1153    A reliability estimation wrapper for classifiers.  
     1154    The returned probabilities contain an 
     1155    additional attribute :obj:`reliability_estimate`, which is a list of 
     1156    :class:`~Orange.evaluation.reliability.Estimate` (see :obj:`~Classifier.__call__`). 
    11181157    """ 
    11191158 
     
    11391178        :obj:`Orange.classification.Classifier.GetBoth` or 
    11401179        :obj:`Orange.classification.Classifier.GetProbabilities`, 
    1141         an additional attribute :obj:`reliability_estimate`, 
    1142         which is an instance of 
    1143         :class:`~Orange.evaluation.reliability.Estimate`, 
     1180        an additional attribute :obj:`reliability_estimate` 
     1181        (a list of :class:`~Orange.evaluation.reliability.Estimate`) 
    11441182        is added to the distribution object. 
    11451183         