Changeset 8862:8870cc76ab29 in orange


Timestamp: 08/31/11 13:50:43 (3 years ago)
Author: anze <anze.staric@…>
Branch: default
Convert: e4e5eda5768b1accf93c942de9989e069de508b2
Message:

Fixed links in documentation and did some minor corrections.
Closes #933

Location: orange
Files: 3 edited

  • orange/Orange/classification/bayes.py

    r8107 r8862
        1     1  import Orange
              2  import Orange.core
        2     3  from Orange.core import BayesClassifier as _BayesClassifier
        3     4  from Orange.core import BayesLearner as _BayesLearner

       10    11      If data instances are provided to the constructor, the learning algorithm
       11    12      is called and the resulting classifier is returned instead of the learner.
       12
       13            ..
       14                :param adjust_threshold: sets the corresponding attribute
       15                :type adjust_threshold: boolean
       16                :param m: sets the :obj:`estimatorConstructor` to
       17                    :class:`orange.ProbabilityEstimatorConstructor_m` with specified m
       18                :type m: integer
       19                :param estimator_constructor: sets the corresponding attribute
       20                :type estimator_constructor: orange.ProbabilityEstimatorConstructor
       21                :param conditional_estimator_constructor: sets the corresponding attribute
       22                :type conditional_estimator_constructor:
       23                        :class:`orange.ConditionalProbabilityEstimatorConstructor`
       24                :param conditional_estimator_constructor_continuous: sets the corresponding
       25                        attribute
       26                :type conditional_estimator_constructor_continuous:
       27                        :class:`orange.ConditionalProbabilityEstimatorConstructor`
       28    13
       29    14      :rtype: :class:`Orange.classification.bayes.NaiveLearner` or

       43    28
       44    29          m for m-estimate. If set, m-estimation of probabilities
       45                will be used using :class:`orange.ProbabilityEstimatorConstructor_m`.
       46                This attribute is ignored if you also set estimatorConstructor.
             30          will be used using :class:`Orange.statistics.estimate.ProbabilityEstimatorConstructor_m`.
             31          This attribute is ignored if you also set estimator_constructor.
       47    32
       48    33      .. attribute:: estimator_constructor

       50    35          Probability estimator constructor for
       51    36          prior class probabilities. Defaults to
       52                :class:`orange.ProbabilityEstimatorConstructor_relative`.
             37          :class:`Orange.statistics.estimate.ProbabilityEstimatorConstructor_relative`.
       53    38          Setting this attribute disables the above described attribute m.
       54    39

       63    48          Probability estimator constructor for conditional probabilities for
       64    49          continuous features. Defaults to
       65                :class:`orange.ConditionalProbabilityEstimatorConstructor_loess`.
       66            """
       67
       68            def __new__(cls, instances = None, weight_id = 0, **argkw):
             50          :class:`Orange.statistics.estimate.ProbabilityEstimatorConstructor_loess`.
             51      """
             52
             53      def __new__(cls, data = None, weight_id = 0, **argkw):
       69    54          self = Orange.classification.Learner.__new__(cls, **argkw)
       70                if instances:
             55          if data:
       71    56              self.__init__(**argkw)
       72                    return self.__call__(instances, weight_id)
             57              return self.__call__(data, weight_id)
       73    58          else:
       74    59              return self

       84    69          self.__dict__.update(argkw)
       85    70
       86            def __call__(self, instances, weight=0):
             71      def __call__(self, data, weight=0):
       87    72          """Learn from the given table of data instances.
       88    73
       89                :param instances: Data instances to learn from.
       90                :type instances: Orange.data.Table
             74          :param data: Data instances to learn from.
             75          :type data: Orange.data.Table
       91    76          :param weight: Id of meta attribute with weights of instances
       92                :type weight: integer
       93                :rtype: :class:`Orange.classification.bayes.NaiveBayesClassifier`
             77          :type weight: int
             78          :rtype: :class:`Orange.classification.bayes.NaiveClassifier`
       94    79          """
       95    80          bayes = _BayesLearner()

      112    97          if self.adjust_threshold:
      113    98              bayes.adjust_threshold = self.adjust_threshold
      114                return NaiveClassifier(bayes(instances, weight))
             99          return NaiveClassifier(bayes(data, weight))
      115   100  NaiveLearner = Orange.misc.deprecated_members(
      116   101  {     "adjustThreshold": "adjust_threshold",

      172   157
      173   158          :rtype: :class:`Orange.data.Value`,
      174                      :class:`Orange.statistics.Distribution` or a tuple with both
            159                :class:`Orange.statistics.distribution.Distribution` or a tuple with both
      175   160          """
      176   161          return self.native_bayes_classifier(instance, result_type, *args, **kwdargs)

      190   175          returned from __call__.
      191   176
      192                :param class_: class variable for which the probability should be
            177          :param class_: class value for which the probability should be
      193   178                  output.
      194                :type class_: :class:`Orange.data.Variable`
            179          :type class_: :class:`Orange.data.Value`
      195   180          :param instance: instance to be classified.
      196   181          :type instance: :class:`Orange.data.Instance`
  • orange/doc/Orange/rst/Orange.classification.bayes.rst

    r8170 r8862
       10    10  **********************************
       11    11
       12        The most primitive Bayesian classifier is :obj:`NaiveLearner`.
       13        `Naive Bayes classification algorithm <http://en.wikipedia.org/wiki/Naive_Bayes_classifier>`_
       14        estimates conditional probabilities from training data and uses them
       15        for classification of new data instances. The algorithm learns very fast if all features
       16        in the training data set are discrete. If a number of features are continues, though, the
       17        algorithm runs slower due to time spent to estimate continuous conditional distributions.
             12  A `Naive Bayes classifier <http://en.wikipedia.org/wiki/Naive_Bayes_classifier>`_
             13  is a simple probabilistic classifier that estimates conditional probabilities of the dependant variable
             14  from training data and uses them for classification of new data instances. The algorithm is very
             15  fast if all features in the training data set are discrete. If a number of features are continuous,
             16  though, the algorithm runs slower due to time spent to estimate continuous conditional distributions.
       18    17
       19    18  The following example demonstrates a straightforward invocation of

       58    57
       59    58  Probabilities for continuous features are estimated with \
       60        :class:`ProbabilityEstimatorConstructor_loess`.
             59  :class:`Orange.statistics.estimate.ProbabilityEstimatorConstructor_loess`.
       61    60  (`bayes-plot-iris.py`_, uses `iris.tab`_):
       62    61

       81    80  .. _titanic.tab: code/iris.tab
       82    81  .. _lenses.tab: code/lenses.tab
       83
       84        Implementation details
       85        ======================
       86
       87        The following two classes are implemented in C++ (*bayes.cpp*). They are not
       88        intended to be used directly. Here we provide implementation details for those
       89        interested.
       90
       91        Orange.core.BayesLearner
       92        ------------------------
       93        Fields estimatorConstructor, conditionalEstimatorConstructor and
       94        conditionalEstimatorConstructorContinuous are empty (None) by default.
       95
       96        If estimatorConstructor is left undefined, p(C) will be estimated by relative
       97        frequencies of examples (see ProbabilityEstimatorConstructor_relative).
       98        When conditionalEstimatorConstructor is left undefined, it will use the same
       99        constructor as for estimating unconditional probabilities (estimatorConstructor
      100        is used as an estimator in ConditionalProbabilityEstimatorConstructor_ByRows).
      101        That is, by default, both will use relative frequencies. But when
      102        estimatorConstructor is set to, for instance, estimate probabilities by
      103        m-estimate with m=2.0, the same estimator will be used for estimation of
      104        conditional probabilities, too.
      105        P(c|vi) for continuous attributes are, by default, estimated with loess (a
      106        variant of locally weighted linear regression), using
      107        ConditionalProbabilityEstimatorConstructor_loess.
      108        The learner first constructs an estimator for p(C). It tries to get a
      109        precomputed distribution of probabilities; if the estimator is capable of
      110        returning it, the distribution is stored in the classifier's field distribution
      111        and the just constructed estimator is disposed. Otherwise, the estimator is
      112        stored in the classifier's field estimator, while the distribution is left
      113        empty.
      114
      115        The same is then done for conditional probabilities. Different constructors are
      116        used for discrete and continuous attributes. If the constructed estimator can
      117        return all conditional probabilities in form of Contingency, the contingency is
      118        stored and the estimator disposed. If not, the estimator is stored. If there
      119        are no contingencies when the learning is finished, the resulting classifier's
      120        conditionalDistributions is None. Alternatively, if all probabilities are
      121        stored as contingencies, the conditionalEstimators fields is None.
      122
      123        Field normalizePredictions is copied to the resulting classifier.
      124
      125        Orange.core.BayesClassifier
      126        ---------------------------
      127        Class NaiveClassifier represents a naive bayesian classifier. Probability of
      128        class C, knowing that values of features :math:`F_1, F_2, ..., F_n` are
      129        :math:`v_1, v_2, ..., v_n`, is computed as :math:`p(C|v_1, v_2, ..., v_n) = \
      130        p(C) \\cdot \\frac{p(C|v_1)}{p(C)} \\cdot \\frac{p(C|v_2)}{p(C)} \\cdot ... \
      131        \\cdot \\frac{p(C|v_n)}{p(C)}`.
      132
      133        Note that when relative frequencies are used to estimate probabilities, the
      134        more usual formula (with factors of form :math:`\\frac{p(v_i|C)}{p(v_i)}`) and
      135        the above formula are exactly equivalent (without any additional assumptions of
      136        independency, as one could think at a first glance). The difference becomes
      137        important when using other ways to estimate probabilities, like, for instance,
      138        m-estimate. In this case, the above formula is much more appropriate.
      139
      140        When computing the formula, probabilities p(C) are read from distribution, which
      141        is of type Distribution, and stores a (normalized) probability of each class.
      142        When distribution is None, BayesClassifier calls estimator to assess the
      143        probability. The former method is faster and is actually used by all existing
      144        methods of probability estimation. The latter is more flexible.
      145
      146        Conditional probabilities are computed similarly. Field conditionalDistribution
      147        is of type DomainContingency which is basically a list of instances of
      148        Contingency, one for each attribute; the outer variable of the contingency is
      149        the attribute and the inner is the class. Contingency can be seen as a list of
      150        normalized probability distributions. For attributes for which there is no
      151        contingency in conditionalDistribution a corresponding estimator in
      152        conditionalEstimators is used. The estimator is given the attribute value and
      153        returns distributions of classes.
      154
      155        If neither, nor pre-computed contingency nor conditional estimator exist, the
      156        attribute is ignored without issuing any warning. The attribute is also ignored
      157        if its value is undefined; this cannot be overriden by estimators.
      158
      159        Any field (distribution, estimator, conditionalDistributions,
      160        conditionalEstimators) can be None. For instance, BayesLearner normally
      161        constructs a classifier which has either distribution or estimator defined.
      162        While it is not an error to have both, only distribution will be used in that
      163        case. As for the other two fields, they can be both defined and used
      164        complementarily; the elements which are missing in one are defined in the
      165        other. However, if there is no need for estimators, BayesLearner will not
      166        construct an empty list; it will not construct a list at all, but leave the
      167        field conditionalEstimators empty.
      168
      169        If you only need probabilities of individual class call BayesClassifier's
      170        method p(class, example) to compute the probability of this class only. Note
      171        that this probability will not be normalized and will thus, in general, not
      172        equal the probability returned by the call operator.
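The "Implementation details" text removed from this file gives the classifier's formula, p(C|v_1, ..., v_n) = p(C) · Π p(C|v_i)/p(C), and notes that the per-class value is not normalized. A minimal sketch of that computation, assuming discrete features and relative-frequency estimates only; the function names and the toy data are hypothetical, and this is not Orange's C++ implementation (no loess, no m-estimate):

```python
from collections import Counter, defaultdict


def fit(rows):
    """Estimate p(C) and p(C|v_i) by relative frequencies.

    rows is a list of (values, class) pairs with discrete feature values.
    """
    class_counts = Counter(c for _, c in rows)
    total = float(len(rows))
    prior = {c: n / total for c, n in class_counts.items()}
    # counts[(i, v)][c] = rows with value v at feature i and class c
    counts = defaultdict(Counter)
    for values, c in rows:
        for i, v in enumerate(values):
            counts[(i, v)][c] += 1
    conditional = {}
    for key, by_class in counts.items():
        n = float(sum(by_class.values()))
        conditional[key] = {c: by_class[c] / n for c in prior}
    return prior, conditional


def unnormalized_p(prior, conditional, c, values):
    """p(C) * prod_i p(C|v_i) / p(C) for a single class C."""
    p = prior[c]
    for i, v in enumerate(values):
        dist = conditional.get((i, v))
        # Simplification: a value never seen in training is skipped.
        if dist is not None:
            p *= dist[c] / prior[c]
    return p


def class_distribution(prior, conditional, values):
    """Normalize the per-class values so that they sum to 1."""
    scores = {c: unnormalized_p(prior, conditional, c, values) for c in prior}
    z = sum(scores.values())  # assumed non-zero in this sketch
    return {c: s / z for c, s in scores.items()}


# Hypothetical toy data: (outlook, temperature) -> play
rows = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "hot"), "yes"),
    (("sunny", "mild"), "yes"),
]
prior, conditional = fit(rows)
query = ("sunny", "mild")
raw = {c: unnormalized_p(prior, conditional, c, query) for c in prior}
dist = class_distribution(prior, conditional, query)
```

On this data the values in `raw` do not sum to 1, while those in `dist` do, which is the difference the removed text draws between `p(class, example)` and the call operator.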
  • orange/doc/Orange/rst/code/bayes-run.py

    r8042 r8862
        6     6
        7     7  import Orange
        8        table = Orange.data.Table("titanic.tab")
              8  titanic = Orange.data.Table("titanic.tab")
        9     9
       10    10  learner = Orange.classification.bayes.NaiveLearner()
       11        classifier = learner(table)
             11  classifier = learner(titanic)
       12    12
       13    13  for ex in table[:5]: