Changeset 7513:e5b5b1372de2 in orange
Timestamp: 02/04/11 18:58:56 (3 years ago)
Branch: default
Convert: 8b0a41b8557514b5a9f489689b0f66612385e941
File: 1 edited

--- orange/Orange/probability/distributions.py (r7452)
+++ orange/Orange/probability/distributions.py (r7513)
@@ -1,16 +1,20 @@
 """
 
+Orange has several classes for computing and storing basic statistics about
+features, distributions and contingencies.
+
 =================
 Basic Statistics for Continuous Features
 =================
 
-
-.. class:: Orange.probability.distribution.BasicAttrStat
-
-Orange contains two simple classes for computing basic statistics
-for continuous features, such as their minimal and maximal value
-or average: BasicAttrStat holds the statistics for a single feature
-and DomainBasicAttrStat holds the statistics for all features in the domain.
-
+The are two simple classes for computing basic statistics
+for continuous features, such as their minimal and maximal value
+or average: :class:`BasicAttrStat` holds the statistics for a single feature
+and :class:`DomainBasicAttrStat` is a container storing a list of instances of
+the above class for all features in the domain.
+
+.. class:: BasicAttrStat
+
+`BasicAttrStat` computes onthe fly statistics.
 
 .. attribute:: variable
@@ -20,5 +24,5 @@
 .. attribute:: min, max
 
-Minimal and maximal feature value that was encountered
+Minimal and maximal feature value encountered
 in the data table.
 
@@ -31,5 +35,5 @@
 Number of instances for which the value was defined
 (and used in the statistics). If instances were weighted,
-n is the sum of weights of those instances.
+`n` is the sum of weights of those instances.
 
 .. attribute:: sum, sum2
@@ -38,64 +42,51 @@
 squared values of this feature.
 
-.. attribute:: holdRecomputation
-
-Holds recomputation of the average and standard deviation.
-
-.. method:: add(value[, weight])
-
-Adds a value to the statistics. Both arguments should be numbers;
-weight is optional, default is 1.0.
-
-.. method:: recompute()
-
-Recomputes the average and deviation.
-
-
-You most probably won't construct the class yourself, but instead call
-DomainBasicAttrStat to compute statistics for all continuous
-features in the dataset.
-
-Nevertheless, here's how the class works. Values are fed into add;
-this is usually done by DomainBasicAttrStat, but you can traverse the
-instances and feed the values in Python, if you want to. For each value
-it checks and, if necessary, adjusts min and max, adds the value to
-sum and its square to sum2. The weight is added to n. If holdRecomputation
-is false, it also computes the average and the deviation.
-If true, this gets postponed until recompute is called.
-It makes sense to postpone recomputation when using the class from C++,
-while when using it from Python, the recomputation will take much much
-less time than the Python interpreter, so you can leave it on.
-
-You can see that the statistics does not include the median or,
-more generally, any quantiles. That's because it only collects
-statistics that can be computed on the fly, without remembering the data.
-If you need quantiles, you will need to construct a ContDistribution.
-
-
-.. _distributionsbasicstat: code/distributionsbasicstat.py
-part of `distributionsbasicstat`_ (uses monks1.tab)
-
-.. literalinclude:: code/distributionsbasicstat.py
-:lines: 110
-
-This code prints out::
-
-feature min max avg
-sepal length 4.300 7.900 5.843
-sepal width 2.000 4.400 3.054
-petal length 1.000 6.900 3.759
-petal width 0.100 2.500 1.199
-
-.. class:: Orange.probability.distribution.DomainBasicAttrStat
-
-DomainBasicAttrStat behaves as a list of BasicAttrStat except
-for a few details.
-
-Constructor expects an instance generator;
-if instances are weighted, the second (otherwise optional)
-arguments should give the id of the metaattribute with weights.
-
+..
+.. attribute:: holdRecomputation
+
+Holds recomputation of the average and standard deviation.
+
+.. method:: add(value[, weight=1.0])
+
+Adds a value to the statistics. Both arguments should be numbers.
+
+..
+.. method:: recompute()
+
+Recomputes the average and deviation.
+
+The class works as follows. Values are added by :obj:`add`, for each value
+it checks and, if necessary, adjusts :obj:`min` and :obj:`max`, adds the value to
+:obj:`sum` and its square to :obj:`sum2`. The weight is added to :obj:`n`.
+
+The statistics does not include the median or any other statistics that can be computed on the fly, without remembering the data. Quantiles can be computed
+by :obj:`ContDistribution`. !!!TODO
+
+
+.. _distributionsbasicstat: code/distributionsbasicstat.py
+part of `distributionsbasicstat`_ (uses monks1.tab)
+
+.. literalinclude:: code/distributionsbasicstat.py
+:lines: 110
+
+This code prints out::
+
+feature min max avg
+sepal length 4.300 7.900 5.843
+sepal width 2.000 4.400 3.054
+petal length 1.000 6.900 3.759
+petal width 0.100 2.500 1.199
+
+Instances of this class are seldom constructed manually; they are more often
+returned as elements of the class :class:`DomainBasicAttrStat` described below.
+
+.. class:: DomainBasicAttrStat
+:param data: A table of instances
+:type data: Orange.data.Table
+:param weight: The id of the metaattribute with weights
+:type data: `int` or none
+
 DomainBasicAttrStat behaves like a ordinary list, except that its
-elements can also be indexed by feature descriptors or fea ure names.
+elements can also be indexed by feature descriptors or feature names.
 
 .. method:: purge()
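The docstring in this changeset describes an accumulator that updates `min`, `max`, `sum`, `sum2` and `n` as each value arrives, and derives the average and deviation from those totals. The following is a minimal standalone sketch of that idea, not Orange's implementation. The class name `RunningStat` and the `hold_recomputation` flag are hypothetical stand-ins for `BasicAttrStat` and `holdRecomputation`. Two details are assumptions the docstring does not state: values are multiplied by their weight before being summed, and the deviation is the population (not sample) standard deviation.

```python
import math


class RunningStat:
    """Hypothetical stand-in for the on-the-fly accumulator in the docstring."""

    def __init__(self, hold_recomputation=False):
        self.min = math.inf
        self.max = -math.inf
        self.n = 0.0     # sum of weights of the values seen so far
        self.sum = 0.0   # weighted sum of values (weighting is an assumption)
        self.sum2 = 0.0  # weighted sum of squared values (same assumption)
        self.avg = 0.0
        self.dev = 0.0
        self.hold_recomputation = hold_recomputation

    def add(self, value, weight=1.0):
        """Fold one value into the totals without storing it."""
        self.min = min(self.min, value)
        self.max = max(self.max, value)
        self.n += weight
        self.sum += weight * value
        self.sum2 += weight * value * value
        if not self.hold_recomputation:
            self.recompute()

    def recompute(self):
        """Derive the average and deviation from the running totals."""
        if self.n > 0:
            self.avg = self.sum / self.n
            # Population variance (an assumption); clamp tiny negative
            # values that floating-point rounding can produce.
            variance = max(self.sum2 / self.n - self.avg ** 2, 0.0)
            self.dev = math.sqrt(variance)


stat = RunningStat()
for v in (4.3, 5.8, 7.9):
    stat.add(v)
print(stat.min, stat.max, round(stat.avg, 3))
```

Because only five running totals are kept, memory use is constant no matter how many values are added. This is also why the median and other quantiles are out of reach for such a class: they need the data itself, or a distribution built from it.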