Changeset 7513:e5b5b1372de2 in orange


Timestamp:
02/04/11 18:58:56 (3 years ago)
Author:
janezd <janez.demsar@…>
Branch:
default
Convert:
8b0a41b8557514b5a9f489689b0f66612385e941
Message:

Commiting a half-done files; needed for internal conflict resolution

File:
1 edited

  • orange/Orange/probability/distributions.py

    r7452 r7513  
    11""" 
    22 
     3Orange has several classes for computing and storing basic statistics about 
     4features, distributions and contingencies. 
     5  
    36========================================
    47Basic Statistics for Continuous Features
    58========================================
    69 
    7  
    8 .. class:: Orange.probability.distribution.BasicAttrStat 
    9  
    10     Orange contains two simple classes for computing basic statistics 
    11     for continuous features, such as their minimal and maximal value 
    12     or average: BasicAttrStat holds the statistics for a single feature 
    13     and DomainBasicAttrStat holds the statistics for all features in the domain. 
    14  
     10There are two simple classes for computing basic statistics
     11for continuous features, such as their minimal and maximal value 
     12or average: :class:`BasicAttrStat` holds the statistics for a single feature 
     13and :class:`DomainBasicAttrStat` is a container storing a list of instances of 
     14the above class for all features in the domain. 
     15 
     16.. class:: BasicAttrStat 
     17 
     18    `BasicAttrStat` computes statistics on the fly.
    1519 
    1620    .. attribute:: variable 
     
    2024    .. attribute:: min, max 
    2125 
    22         Minimal and maximal feature value that was encountered 
     26        Minimal and maximal feature value encountered 
    2327        in the data table. 
    2428 
     
    3135        Number of instances for which the value was defined 
    3236        (and used in the statistics). If instances were weighted, 
    33         n is the sum of weights of those instances. 
     37        `n` is the sum of weights of those instances. 
    3438 
    3539    .. attribute:: sum, sum2 
     
    3842        squared values of this feature. 
    3943 
    40     .. attribute:: holdRecomputation 
    41  
    42         Holds recomputation of the average and standard deviation. 
    43  
    44     .. method:: add(value[, weight]) 
    45  
    46         Adds a value to the statistics. Both arguments should be numbers; 
    47         weight is optional, default is 1.0. 
    48          
    49     .. method:: recompute() 
    50  
    51         Recomputes the average and deviation. 
    52  
    53  
    54 You most probably won't construct the class yourself, but instead call 
    55 DomainBasicAttrStat to compute statistics for all continuous 
    56 features in the dataset. 
    57  
    58 Nevertheless, here's how the class works. Values are fed into add; 
    59 this is usually done by DomainBasicAttrStat, but you can traverse the 
    60 instances and feed the values in Python, if you want to. For each value 
    61 it checks and, if necessary, adjusts min and max, adds the value to 
    62 sum and its square to sum2. The weight is added to n. If holdRecomputation 
    63 is false, it also computes the average and the deviation. 
    64 If true, this gets postponed until recompute is called. 
    65 It makes sense to postpone recomputation when using the class from C++, 
    66 while when using it from Python, the recomputation will take much much 
    67 less time than the Python interpreter, so you can leave it on. 
    68  
    69 You can see that the statistics does not include the median or, 
    70 more generally, any quantiles. That's because it only collects 
    71 statistics that can be computed on the fly, without remembering the data. 
    72 If you need quantiles, you will need to construct a ContDistribution. 
    73  
    74  
    75 .. _distributions-basic-stat: code/distributions-basic-stat.py 
    76 part of `distributions-basic-stat`_ (uses monks-1.tab) 
    77  
    78 .. literalinclude:: code/distributions-basic-stat.py 
    79     :lines: 1-10 
    80  
    81 This code prints out:: 
    82  
    83              feature   min   max   avg 
    84         sepal length 4.300 7.900 5.843 
    85          sepal width 2.000 4.400 3.054 
    86         petal length 1.000 6.900 3.759 
    87          petal width 0.100 2.500 1.199 
    88  
    89 .. class:: Orange.probability.distribution.DomainBasicAttrStat 
    90  
    91     DomainBasicAttrStat behaves as a list of BasicAttrStat except 
    92     for a few details. 
    93  
    94     Constructor expects an instance generator; 
    95     if instances are weighted, the second (otherwise optional) 
    96     arguments should give the id of the meta-attribute with weights. 
    97  
     44    .. 
     45        .. attribute:: holdRecomputation 
     46     
     47            Holds recomputation of the average and standard deviation. 
     48 
     49    .. method:: add(value[, weight=1.0]) 
     50 
     51        Adds a value to the statistics. Both arguments should be numbers. 
     52 
     53    .. 
     54        .. method:: recompute() 
     55 
     56            Recomputes the average and deviation. 
     57 
     58    The class works as follows. Values are added by :obj:`add`. For each value,
     59    it checks and, if necessary, adjusts :obj:`min` and :obj:`max`, and adds the value to
     60    :obj:`sum` and its square to :obj:`sum2`. The weight is added to :obj:`n`.
     61 
     62    The statistics do not include the median or any other quantiles, because the class only collects statistics that can be computed on the fly, without remembering the data. Quantiles can be computed
     63    by :obj:`ContDistribution`. !!!TODO
     64 
     65 
     66    .. _distributions-basic-stat: code/distributions-basic-stat.py 
     67    part of `distributions-basic-stat`_ (uses monks-1.tab) 
     68 
     69    .. literalinclude:: code/distributions-basic-stat.py 
     70        :lines: 1-10 
     71 
     72    This code prints out:: 
     73 
     74                 feature   min   max   avg 
     75            sepal length 4.300 7.900 5.843 
     76             sepal width 2.000 4.400 3.054 
     77            petal length 1.000 6.900 3.759 
     78             petal width 0.100 2.500 1.199 
     79 
     80    Instances of this class are seldom constructed manually; they are more often 
     81    returned as elements of the class :class:`DomainBasicAttrStat`  described below. 
     82 
     83.. class:: DomainBasicAttrStat 
     84    :param data: A table of instances 
     85    :type data: Orange.data.Table 
     86    :param weight: The id of the meta-attribute with weights 
     87    :type weight: `int` or None
     88     
    9889    DomainBasicAttrStat behaves like an ordinary list, except that its
    99     elements can also be indexed by feature descriptors or feaure names.     
     90    elements can also be indexed by feature descriptors or feature names.     
    10091 
    10192    .. method:: purge() 
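
The accumulation scheme described in the revised docstring (adjust ``min`` and ``max``, add the value to ``sum`` and its square to ``sum2``, add the weight to ``n``) can be illustrated with a small pure-Python sketch. This is not Orange's implementation: the class name ``RunningStat`` is hypothetical, and two details are assumptions rather than facts taken from the changeset, namely that the weight multiplies the value in ``sum`` and ``sum2``, and that the deviation is the population form derived from those sums.

```python
import math


class RunningStat:
    """Hypothetical stand-in for the accumulator the docstring describes."""

    def __init__(self):
        self.min = math.inf    # smallest value seen so far
        self.max = -math.inf   # largest value seen so far
        self.n = 0.0           # sum of weights of the added values
        self.sum = 0.0         # weighted sum of values (assumed weighting)
        self.sum2 = 0.0        # weighted sum of squared values (assumed weighting)

    def add(self, value, weight=1.0):
        """Update all running statistics with one value; nothing is stored."""
        self.min = min(self.min, value)
        self.max = max(self.max, value)
        self.n += weight
        self.sum += weight * value
        self.sum2 += weight * value * value

    @property
    def avg(self):
        """Weighted average, or NaN when no values were added."""
        return self.sum / self.n if self.n else math.nan

    @property
    def dev(self):
        """Population standard deviation derived from the sums (assumed form)."""
        if not self.n:
            return math.nan
        variance = self.sum2 / self.n - self.avg ** 2
        # Guard against tiny negative values caused by rounding.
        return math.sqrt(max(variance, 0.0))


stat = RunningStat()
for v in [2.0, 4.0, 4.0, 6.0]:
    stat.add(v)

print(stat.min, stat.max, stat.avg)
```

Because only a handful of running totals are kept, a median or any other quantile cannot be recovered from such an object, which is the reason the docstring points to ``ContDistribution`` for quantiles.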