Changeset 7252:3a737f288391 in orange
 Timestamp: 02/02/11 22:05:52
 Branch: default
 Convert: 2a2511d74e3fa22469386886c75e8fea8f4aa8f2
 Location: orange
 Files: 2 edited

orange/Orange/classification/bayes.py
(diff r7225 → r7252)

 .. index:: Naive Bayesian Learner

-.. autoclass:: Orange.classification.bayes.NaiveBayes
+.. autoclass:: Orange.classification.bayes.NaiveBayesLearner
    :members:

+.. autoclass:: Orange.classification.bayes.NaiveBayesClassifier
+   :members:
+
+Examples
+========
+Let us load the data, induce a classifier and see how it performs on the
+first five examples.
+
+>>> data = orange.ExampleTable("lenses")
+>>> bayes = orange.BayesLearner(data)
+>>>
+>>> for ex in data[:5]:
+...     print ex.getclass(), bayes(ex)
+no no
+no no
+soft soft
+no no
+hard hard
+
+The classifier is correct in all five cases. Interested in probabilities,
+maybe?
+
+>>> for ex in data[:5]:
+...     print ex.getclass(), bayes(ex, orange.Classifier.GetProbabilities)
+no <0.423, 0.000, 0.577>
+no <0.000, 0.000, 1.000>
+soft <0.000, 0.668, 0.332>
+no <0.000, 0.000, 1.000>
+hard <0.715, 0.000, 0.285>
+
+While very confident about the second and the fourth example, the classifier
+guessed the correct class of the first one only by a small margin, 42 vs. 58
+percent.
+
+Now, let us peek into the classifier.
+
+>>> print bayes.estimator
+None
+>>> print bayes.distribution
+<0.167, 0.208, 0.625>
+>>> print bayes.conditionalEstimators
+None
+>>> print bayes.conditionalDistributions[0]
+<'young': <0.250, 0.250, 0.500>, 'p_psby': <0.125, 0.250, 0.625>, (...)
+>>> bayes.conditionalDistributions[0]["young"]
+<0.250, 0.250, 0.500>
+
+The classifier has no estimator, since the probabilities are stored in
+distribution: the probability of the first class is 0.167, of the second
+0.208, and of the third 0.625. Nor does it have conditionalEstimators; the
+conditional probabilities are stored in conditionalDistributions.
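The relative-frequency estimation shown in the walkthrough can be sketched in a few lines of standalone Python 3 (the walkthrough itself uses Orange's legacy Python 2 API). The toy rows below are hypothetical, not the lenses data; the sketch follows the same scheme: class priors and per-attribute conditional distributions from counts, then a normalized product for prediction.

```python
from collections import Counter, defaultdict

def train_nb(rows):
    """rows: list of (features_tuple, class_label) pairs.
    Estimates priors and per-attribute conditional probabilities
    by relative frequencies, as in the example above."""
    n = len(rows)
    class_counts = Counter(y for _, y in rows)
    prior = {c: cnt / n for c, cnt in class_counts.items()}
    # cond[i][(value, class)] = how often attribute i took `value` within `class`
    cond = defaultdict(Counter)
    for x, y in rows:
        for i, v in enumerate(x):
            cond[i][(v, y)] += 1
    return prior, cond, class_counts

def predict_proba(model, x):
    """P(c | x) proportional to P(c) * prod_i P(x_i | c), normalized."""
    prior, cond, class_counts = model
    scores = {}
    for c, p in prior.items():
        s = p
        for i, v in enumerate(x):
            s *= cond[i][(v, c)] / class_counts[c]
        scores[c] = s
    z = sum(scores.values()) or 1.0
    return {c: s / z for c, s in scores.items()}

# Hypothetical mini-dataset with two discrete attributes:
rows = [(("young", "myope"), "no"),
        (("young", "myope"), "soft"),
        (("old", "hyper"), "no"),
        (("old", "myope"), "no")]
model = train_nb(rows)
print(predict_proba(model, ("young", "myope")))  # roughly {'no': 0.4, 'soft': 0.6}
```

This is only the relative-frequency variant; zero counts yield zero probabilities, which is exactly what the m-estimation discussed next avoids.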
+We printed the contingency matrix for the first attribute and, in the last
+line, the conditional probabilities of the three classes when the value of
+the first attribute is "young".
+
+Let us now use m-estimation instead of relative frequencies.
+
+>>> bayesl = orange.BayesLearner()
+>>> bayesl.estimatorConstructor = orange.ProbabilityEstimatorConstructor_m(m=2.0)
+>>> bayes = bayesl(data)
+
+The classifier is still correct for all examples.
+
+>>> for ex in data[:5]:
+...     print ex.getclass(), bayes(ex, orange.Classifier.GetProbabilities)
+no <0.375, 0.063, 0.562>
+no <0.016, 0.003, 0.981>
+soft <0.021, 0.607, 0.372>
+no <0.001, 0.039, 0.960>
+hard <0.632, 0.030, 0.338>
+
+Observing the probabilities shows a shift towards the third, more frequent
+class, compared to the probabilities above, where relative frequencies were
+used.
+
+>>> print bayes.conditionalDistributions[0]
+<'young': <0.233, 0.242, 0.525>, 'p_psby': <0.133, 0.242, 0.625>, (...)
+
+Note that the change of the estimation method did not affect the a priori
+probabilities:
+
+>>> print bayes.distribution
+<0.167, 0.208, 0.625>
+
+The reason is that this same distribution was used as the a priori
+distribution for m-estimation. (How to enforce another a priori
+distribution? While the orange C++ core supports it, this feature has not
+been exported to Python yet.)
+
+Finally, let us show an example with continuous attributes. We will take the
+iris dataset, which contains four continuous and no discrete attributes.
+
+>>> data = orange.ExampleTable("iris")
+>>> bayes = orange.BayesLearner(data)
+>>> for exi in range(0, len(data), 20):
+...     print data[exi].getclass(), bayes(data[exi], orange.Classifier.GetBoth)
+
+The classifier works well. To see a glimpse of how it works, let us observe
+the conditional distributions for the first attribute.
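The m-estimate used above blends the relative frequency with a prior probability, as if m extra samples distributed according to the prior had been observed: p = (n + m * p_prior) / (N + m). A minimal standalone Python 3 sketch (the counts are illustrative, not the lenses contingency counts):

```python
def m_estimate(count, total, prior, m=2.0):
    """m-estimate of a probability: relative frequency count/total
    pulled towards `prior` with the strength of m virtual samples."""
    return (count + m * prior) / (total + m)

# Relative frequency alone would give 2/8 = 0.25; with m = 2 and an
# a priori probability of 0.625 the estimate moves towards the prior:
print(m_estimate(2, 8, 0.625, m=2.0))  # (2 + 2*0.625) / (8 + 2) = 0.325
```

With m = 0 the formula reduces to the plain relative frequency, which matches the observation above that m-estimation shifts probabilities towards the more frequent class without changing the a priori distribution itself.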
+They are stored in conditionalDistributions, as before, except that the
+distribution now behaves as a dictionary, not as a list (see the
+information on distributions).
+
+>>> print bayes.conditionalDistributions[0]
+<4.300: <0.837, 0.137, 0.026>, 4.333: <0.834, 0.140, 0.026>, 4.367: <0.830, (...)
+
+For a nicer picture, we can print out the probabilities, copy and paste
+them into some graph drawing program ... and get something like the figure
+below.
+
+>>> for x, probs in bayes.conditionalDistributions[0].items():
+...     print "%5.3f\t%5.3f\t%5.3f\t%5.3f" % (x, probs[0], probs[1], probs[2])
+4.300 0.837 0.137 0.026
+4.333 0.834 0.140 0.026
+4.367 0.830 0.144 0.026
+4.400 0.826 0.147 0.027
+4.433 0.823 0.150 0.027
+(...)
+
+If petal lengths are shorter, the most probable class is "setosa". Irises
+with middle petal lengths belong to "versicolor", while longer petal
+lengths indicate "virginica". The critical values, where the decision
+would change, are at about 5.4 and 6.3.
+
+It is important to stress that the curves are relatively smooth, although
+no fitting of parameters (either manual or automatic) took place.
 """

 ...
 from Orange.core import BayesLearner as _BayesLearner

-class NaiveBayes(Orange.core.Learner):
-    """
-    Naive bayes learner
+class NaiveBayesLearner(Orange.core.Learner):
+    """
+    Probabilistic classifier based on applying Bayes' theorem (from
+    Bayesian statistics) with strong (naive) independence assumptions.
+    If data instances are provided to the constructor, the learning
+    algorithm is called and the resulting classifier is returned instead
+    of the learner.
+
+    :param adjustTreshold: If set and the class is binary, the classifier's
+        threshold will be set so as to optimize the classification
+        accuracy. The threshold is tuned by observing the probabilities
+        predicted on learning data. Setting it to True can increase the
+        accuracy considerably.
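Orange estimates these smooth conditional distributions for continuous attributes with a LOESS-based estimator. As a rough standalone Python 3 stand-in for that idea, the sketch below computes P(class | x) for one continuous attribute from per-class Gaussian densities; the priors and the (mean, stddev) parameters are hypothetical, chosen only to reproduce the qualitative short/middle/long decision pattern described above.

```python
import math

def gauss_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def class_probs_given_x(x, priors, params):
    """P(c | x) for a single continuous attribute: prior times a per-class
    Gaussian density, normalized over the classes (a stand-in for the
    LOESS estimator Orange actually uses)."""
    scores = {c: priors[c] * gauss_pdf(x, *params[c]) for c in priors}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

# Hypothetical per-class (mean, stddev) for one attribute:
priors = {"setosa": 1/3, "versicolor": 1/3, "virginica": 1/3}
params = {"setosa": (5.0, 0.35), "versicolor": (5.9, 0.5), "virginica": (6.6, 0.64)}
for x in (4.5, 5.5, 6.5, 7.5):
    probs = class_probs_given_x(x, priors, params)
    print(x, max(probs, key=probs.get))
# 4.5 setosa, 5.5 versicolor, 6.5 virginica, 7.5 virginica
```

The resulting curves over x are smooth for the same reason as in the figure described above: each class's conditional density varies continuously with the attribute value.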
+    :type adjustTreshold: boolean
+    :param m: m for the m-estimate. If set, m-estimation of probabilities
+        will be used, via :class:`orange.ProbabilityEstimatorConstructor_m`.
+        This attribute is ignored if you also set estimatorConstructor.
+    :type m: integer
+    :param estimatorConstructor: Probability estimator constructor for
+        prior class probabilities. Defaults to
+        :class:`orange.ProbabilityEstimatorConstructor_relative`.
+        Setting this attribute disables the attribute m described above.
+    :type estimatorConstructor: orange.ProbabilityEstimatorConstructor
+    :param conditionalEstimatorConstructor: Probability estimator
+        constructor for conditional probabilities of discrete features.
+        If omitted, the estimator for prior probabilities will be used.
+    :type conditionalEstimatorConstructor: orange.ConditionalProbabilityEstimatorConstructor
+    :param conditionalEstimatorConstructorContinuous: Probability estimator
+        constructor for conditional probabilities of continuous features.
+        Defaults to
+        :class:`orange.ConditionalProbabilityEstimatorConstructor_loess`.
+    :type conditionalEstimatorConstructorContinuous: orange.ConditionalProbabilityEstimatorConstructor
+    :rtype: :class:`Orange.classification.bayes.NaiveBayesLearner` or
+        :class:`Orange.classification.bayes.NaiveBayesClassifier`
     """

 ...
         return self

-    def __init__(self, normalizePredictions=True, adjustTreshold=False,
-                 m=0, estimatorConstructor=None, conditionalEstimatorConstructor=None,
+    def __init__(self, adjustTreshold=False, m=0, estimatorConstructor=None,
+                 conditionalEstimatorConstructor=None,
                  conditionalEstimatorConstructorContinuous=None, **argkw):
-        """
-        :param adjustTreshold: If set and the class is binary, the
-            classifier's threshold will be set so as to optimize the
-            classification accuracy. The threshold is tuned by observing
-            the probabilities predicted on learning data.
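The docstring describes threshold adjustment only informally. A minimal sketch of the idea in standalone Python 3 (not Orange's implementation): for a binary class, try each probability predicted on the learning data as the decision threshold and keep the one with the highest training accuracy. The probabilities and labels below are hypothetical.

```python
def best_threshold(probs, labels):
    """Pick the positive-class probability threshold that maximizes
    accuracy on training predictions. Candidate thresholds are the
    observed probabilities themselves."""
    best_t, best_acc = 0.5, -1.0
    for t in sorted(set(probs)):
        acc = sum((p >= t) == y for p, y in zip(probs, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Hypothetical predicted probabilities and true binary labels:
probs  = [0.30, 0.42, 0.46, 0.61, 0.70, 0.85]
labels = [False, False, True, True, True, True]
print(best_threshold(probs, labels))  # -> (0.46, 1.0)
```

Here the default cut at 0.5 would misclassify the 0.46 instance, while the tuned threshold separates the classes perfectly, which is why the docstring notes that enabling the option can increase accuracy considerably.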
-            Default is False (to conform with the usual naive Bayesian
-            classifiers), but setting it to True can increase the accuracy
-            considerably.
-        :type adjustTreshold: boolean
-        :param m: m for the m-estimate. (...)
-        :param estimatorConstructor: Probability estimator constructor
-            for prior class probabilities. (...)
-        :param conditionalEstimatorConstructor: (...)
-        :param conditionalEstimatorConstructorContinuous: (...)
-        """
         self.adjustThreshold = adjustTreshold
         self.m = m
 ...
         self.__dict__.update(argkw)

-    def __call__(self, examples, weight=0):
+    def __call__(self, instances, weight=0):
+        """Learn from the given table of data instances.
+
+        :param instances: Data instances to learn from.
+        :type instances: Orange.data.Table
+        :param weight: Id of the meta attribute with instance weights
+        :type weight: integer
+        :rtype: :class:`Orange.classification.bayes.NaiveBayesClassifier`
+        """
         bayes = _BayesLearner()
         if self.estimatorConstructor:
 ...
             bayes.conditionalEstimatorConstructorContinuous = self.conditionalEstimatorConstructorContinuous

-        return NaiveBayesClassifier(bayes(examples, weight))
+        return NaiveBayesClassifier(bayes(instances, weight))

 class NaiveBayesClassifier(Orange.core.Classifier):
-    def __init__(self, nbc):
-        self.nativeBayesClassifier = nbc
+    """
+    Wraps a native BayesClassifier to add a print method.
+    """
+
+    def __init__(self, nativeBayesClassifier):
+        self.nativeBayesClassifier = nativeBayesClassifier
         for k, v in self.nativeBayesClassifier.__dict__.items():
             self.__dict__[k] = v

-    def __call__(self, *args, **kwdargs):
-        self.nativeBayesClassifier(*args, **kwdargs)
+    def __call__(self, instance, *args, **kwdargs):
+        """Classify a new instance.
+
+        :param instance: instance to be classified
+        :type instance: :class:`Orange.data.Instance`
+        """
+        return self.nativeBayesClassifier(instance, *args, **kwdargs)

     def __setattr__(self, name, value):
 ...
         self.__dict__[name] = value

+    def p(self, class_, instance):
+        """Return the probability of a single class.
+
+        The probability is not normalized and can differ from the
+        probability returned by __call__.
+        """
+        return self.nativeBayesClassifier.p(class_, instance)

     def printModel(self):
+        """Print the classifier in a human-friendly format."""
         nValues = len(self.classVar.values)
         frmtStr = ' %10.3f' * nValues
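NaiveBayesClassifier copies the native classifier's attributes into its own dict and forwards calls and attribute writes to the wrapped object. A minimal standalone Python 3 sketch of that delegation pattern (all names here are hypothetical, not the Orange classes):

```python
class Wrapper:
    """Sketch of the delegation pattern used by the wrapper class:
    calls, unknown attribute reads, and attribute writes are forwarded
    to the wrapped native object."""

    def __init__(self, native):
        # Bypass our own __setattr__ for this one bootstrap attribute,
        # otherwise __setattr__ would look up self.native before it exists.
        object.__setattr__(self, "native", native)

    def __call__(self, *args, **kwargs):
        return self.native(*args, **kwargs)

    def __getattr__(self, name):
        # Only reached when normal lookup fails; fall back to the native object.
        return getattr(self.native, name)

    def __setattr__(self, name, value):
        # Forward writes the wrapped object already knows about; keep the rest local.
        if hasattr(self.native, name):
            setattr(self.native, name, value)
        else:
            object.__setattr__(self, name, value)

class Native:
    """Toy native object: a callable that scales its input by self.m."""
    def __init__(self):
        self.m = 2
    def __call__(self, x):
        return self.m * x

w = Wrapper(Native())
print(w(21), w.m)  # call and attribute read are both forwarded: 42 2
```

Note that, unlike this sketch, the original `__call__` in the diff dropped its return value, so the wrapped classifier always returned None; the reconstruction above adds the missing `return`.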
orange/doc/Orange/rst/code/bayesrun.py
(diff r7225 → r7252)

-# Description: Self Organizing Maps on iris data set
-# Category: modelling
-# Uses: iris
-# Referenced: orngSOM.htm
-# Classes: orngSOM.SOMLearner
-
 import Orange
 som = Orange.projection.som.SOMLearner(map_shape=(10, 20), initialize=Orange.projection.som.InitializeRandom)