Changeset 7622:2c815d70f149 in orange


Timestamp:
02/08/11 19:23:29 (3 years ago)
Author:
janezd <janez.demsar@…>
Branch:
default
Convert:
fe1c1ca8485b1daaf7ee48b7af6334d598e7a021
Message:

Documentation for distributions.py is done, but needs further checking.

File:
1 edited

  • orange/Orange/statistics/distributions.py

    r7621 r7622  
    55 
    66     
    7 ======================================== 
    8 Basic Statistics for Continuous Features 
    9 ======================================== 
     7========================================= 
     8Basic Statistics for Continuous Variables 
     9========================================= 
    1010 
    1111There are two simple classes for computing basic statistics 
    1212for continuous features, such as their minimal and maximal value 
    13 or average: :class:`BasicStatistics` holds the statistics for a single feature 
    14 and :class:`DomainBasicStatistics` is a container storing a list of instances of 
    15 the above class for all features in the domain. 
     13or average: :class:`BasicStatistics` holds the statistics for a single variable 
     14and :class:`DomainBasicStatistics` behaves like a list of instances of 
     15the above class for all variables in the domain. 
    1616 
    1717.. class:: BasicStatistics 
    1818 
    19     `DomainBasicStatistics` computes on-the fly statistics.  
     19    ``BasicStatistics`` computes and stores minimal, maximal, average and 
     20    standard deviation of a variable. It does not include the median or any 
     21    other statistics that cannot be computed on the fly, without remembering the 
     22    data; such statistics can be obtained using :obj:`ContDistribution`. !!!TODO 
     23 
     24    Instances of this class are seldom constructed manually; they are more often 
     25    returned by :obj:`DomainBasicStatistics` described below. 
    2026 
    2127    .. attribute:: variable 
    2228     
    23         Descriptor for the variable to which the data applies. 
    24  
    25     .. attribute:: min, max 
    26  
    27         Minimal and maximal variable value encountered. 
    28  
    29     .. attribute:: avg, dev 
    30  
    31         Average value and standard deviation. 
     29        The variable to which the data applies. 
     30 
     31    .. attribute:: min 
     32 
     33        Minimal value encountered 
     34 
     35    .. attribute:: max 
     36 
     37        Maximal value encountered 
     38 
     39    .. attribute:: avg 
     40 
     41        Average value 
     42 
     43    .. attribute:: dev 
     44 
     45        Standard deviation 
    3246 
    3347    .. attribute:: n 
    3448 
    35         Number of instances for which the value was defined 
    36         (and used in the statistics). If instances were weighted, 
    37         ``n`` is the sum of weights of those instances. 
    38  
    39     .. attribute:: sum, sum2 
    40  
    41         Weighted sum of values and weighted sum of 
    42         squared values of this feature. 
     49        Number of instances for which the value was defined. 
     50        If instances were weighted, :obj:`n` holds the sum of weights. 
     51         
     52    .. attribute:: sum 
     53 
     54        Weighted sum of values 
     55 
     56    .. attribute:: sum2 
     57 
     58        Weighted sum of squared values 
    4359 
    4460    .. 
     
    4965    .. method:: add(value[, weight=1]) 
    5066     
    51         Add a value to the statistics. 
     67        Add a value to the statistics: adjust :obj:`min` and :obj:`max` if 
     68        necessary, increase :obj:`n` and recompute :obj:`sum`, :obj:`sum2`, 
     69        :obj:`avg` and :obj:`dev`. 
    5270 
    5371        :param value: Value to be added to the statistics 
     
    6179            Recompute the average and deviation. 
    6280 
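The bookkeeping described for ``add`` can be sketched in plain Python. This is only an illustration, not the Orange implementation: the class name ``RunningStats`` is hypothetical, and the use of the population (not sample) standard deviation is an assumption.

```python
import math


class RunningStats:
    """Hypothetical stand-in illustrating the bookkeeping described for add()."""

    def __init__(self):
        self.min = None
        self.max = None
        self.n = 0.0      # sum of weights
        self.sum = 0.0    # weighted sum of values
        self.sum2 = 0.0   # weighted sum of squared values
        self.avg = 0.0
        self.dev = 0.0

    def add(self, value, weight=1.0):
        # Adjust the extremes if necessary.
        if self.min is None or value < self.min:
            self.min = value
        if self.max is None or value > self.max:
            self.max = value
        # Update the accumulators, then the derived statistics.
        self.n += weight
        self.sum += weight * value
        self.sum2 += weight * value * value
        self.recompute()

    def recompute(self):
        if self.n > 0:
            self.avg = self.sum / self.n
            # Assumption: population variance, E[x^2] - E[x]^2.
            variance = self.sum2 / self.n - self.avg ** 2
            self.dev = math.sqrt(max(variance, 0.0))


stats = RunningStats()
for value, weight in [(1.0, 1.0), (2.0, 1.0), (3.0, 2.0)]:
    stats.add(value, weight)
print(stats.min, stats.max, stats.n, stats.avg, stats.dev)
```

Only the accumulators are kept, so no data needs to be remembered; this is also why order statistics such as the median cannot be obtained this way.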
    63     The class works as follows. Values are added by :obj:`add`, for each value 
    64     it checks and, if necessary, adjusts :obj:`min` and :obj:`max`, adds the value to 
    65     :obj:`sum` and its square to :obj:`sum2`. The weight is added to :obj:`n`. 
    66  
    67  
    68     The statistics does not include the median or any other statistics that can be computed on the fly, without remembering the data. Quantiles can be computed 
    69     by :obj:`ContDistribution`. !!!TODO 
    70  
    71     Instances of this class are seldom constructed manually; they are more often 
    72     returned by :obj:`DomainBasicStatistics` described below. 
    73  
    7481.. class:: DomainBasicStatistics 
    7582 
    7683    ``DomainBasicStatistics`` behaves like an ordinary list, except that its 
    77     elements can also be indexed by feature descriptors or feature names. 
     84    elements can also be indexed by variable names or descriptors. 
    7885 
    7986    .. method:: __init__(data[, weight=None]) 
    8087 
    81         Compute the statistics for all continuous features in the 
    82         give data, and put `None` to the places corresponding to features of other types. 
     88        Compute the statistics for all continuous features in the data, and put 
     89        :obj:`None` in the places corresponding to variables of other types. 
    8390 
    8491        :param data: A table of instances 
     
    8996    .. method:: purge() 
    9097     
    91         Remove the ``None``'s corresponding to non-continuous features. 
     98        Remove the :obj:`None`'s corresponding to non-continuous features; this 
     99        truncates the list, so the indices no longer correspond to indices of 
     100        variables in the domain. 
    92101     
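The effect of ``purge`` on indices can be illustrated with a small stand-in. The class ``NamedStatsList`` and its data are hypothetical, and keeping name-based lookup working after purging is an assumption of this sketch rather than documented behaviour.

```python
class NamedStatsList(list):
    """Hypothetical list of per-variable statistics, indexable by name."""

    def __init__(self, names, stats):
        super().__init__(stats)
        self.names = list(names)

    def __getitem__(self, key):
        # Names are translated to positions; integers are used directly.
        if isinstance(key, str):
            key = self.names.index(key)
        return super().__getitem__(key)

    def purge(self):
        # Drop the None placeholders of non-continuous variables.
        kept = [(n, s) for n, s in zip(self.names, self) if s is not None]
        self.names = [n for n, _ in kept]
        self[:] = [s for _, s in kept]


# Variables "a" and "c" are assumed to be discrete, so they have no statistics.
stats_list = NamedStatsList(["a", "b", "c"], [None, {"avg": 1.5}, None])
before = stats_list[1]   # "b" sits at its position in the domain
stats_list.purge()
after = stats_list[0]    # after purging, "b" has moved to position 0
print(len(stats_list), before is after)
```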
    93102    part of `distributions-basic-stat.py`_ (uses monks-1.tab) 
     
    118127 
    119128 
     129================== 
    120130Contingency Matrix 
    121131================== 
    122132 
    123 Contingency matrix contains conditional distributions. When initialized, they 
    124 will typically contain absolute frequencies, that is, the number of instances 
    125 with a particular combination of two variables' values. If they are normalized 
    126 by dividing each cell by the row sum, the represent conditional probabilities 
    127 of the column variable (here denoted as ``innerVariable``) conditioned by the 
    128 row variable (``outerVariable``).  
    129  
    130 Contingencies work with both, discrete and continuous variables. 
     133A contingency matrix contains conditional distributions. Unless explicitly 
     134'normalized', it contains absolute frequencies, that is, the number of 
     135instances with a particular combination of two variables' values. If it is 
     136normalized by dividing each cell by the row sum, the rows represent conditional 
     137probabilities of the column variable (here denoted as ``innerVariable``) 
     138conditioned by the row variable (``outerVariable``). 
     139 
     140Contingency matrices are usually constructed for discrete variables. Matrices 
     141for continuous variables have certain limitations described in a :ref:`separate 
     142section <contcont>`. 
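The relation between absolute frequencies and conditional probabilities can be sketched without Orange. The helper names ``build_contingency`` and ``normalize_rows`` and the toy data are hypothetical; rows play the role of the outer variable and columns the role of the inner variable.

```python
def build_contingency(pairs):
    """Count (outer_value, inner_value) pairs into a nested dict."""
    counts = {}
    for outer, inner in pairs:
        row = counts.setdefault(outer, {})
        row[inner] = row.get(inner, 0.0) + 1.0
    return counts


def normalize_rows(counts):
    """Divide each cell by its row sum, giving p(inner | outer)."""
    normalized = {}
    for outer, row in counts.items():
        total = sum(row.values())
        normalized[outer] = {inner: c / total for inner, c in row.items()}
    return normalized


pairs = [("a", "yes"), ("a", "no"), ("a", "yes"), ("b", "no")]
counts = build_contingency(pairs)
probs = normalize_rows(counts)
print(counts)
print(probs)
```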
     143 
     144The example below loads the monks-1 data set and prints out the conditional 
     145class distribution given the value of `e`. 
    131146 
    132147.. _distributions-contingency: code/distributions-contingency.py 
     
    144159    4 <72.000, 36.000>  
    145160 
    146 Contingencies behave like lists of distributions (in this case, class distributions) indexed by values (of `e`, in this example). Distributions are, in turn indexed 
    147 by values (class values, here). The variable `e` from the above example is called 
    148 the outer variable, and the class is the inner. This can also be reversed, and it 
    149 is also possible to use features for both, outer and inner variable, so the 
    150 matrix shows distributions of one variable's values given the value of another. 
    151 There is a corresponding hierarchy of classes for handling hierarchies: :obj:`Contingency` is a base class for :obj:`ContingencyVarVar` (both variables 
    152 are attribtes) and :obj:`ContingencyClass` (one variable is the class). 
    153 The latter is the base class for :obj:`ContingencyVarClass` and :obj:`ContingencyClassVar`. 
     161Contingencies behave like lists of distributions (in this case, class 
     162distributions) indexed by values (of `e`, in this 
     163example). Distributions are, in turn, indexed by values (class values, 
     164here). The variable `e` from the above example is called the outer 
     165variable, and the class is the inner. This can also be reversed. It is 
     166also possible to use features for both the outer and the inner variable, so 
     167the matrix shows distributions of one variable's values given the 
     168value of another.  There is a corresponding hierarchy of classes: 
     169:obj:`Contingency` is a base class for :obj:`ContingencyVarVar` (both 
     170variables are attributes) and :obj:`ContingencyClass` (one variable is 
     171the class).  The latter is the base class for 
     172:obj:`ContingencyVarClass` and :obj:`ContingencyClassVar`. 
    154173 
    155174The most commonly used of the above classes is :obj:`ContingencyVarClass` which 
    156175can compute and store conditional probabilities of classes given the feature value. 
    157176 
    158 .. class:: Orange.statistics.distribution.Contingency 
     177Classes for storing contingency matrices 
     178======================================== 
     179 
     180.. class:: Contingency 
     181 
     182    Provides a base class for storing and manipulating contingency 
     183    matrices. Although it is not abstract, it is seldom used directly but rather 
     184    through more convenient derived classes described below. 
    159185 
    160186    .. attribute:: outerVariable 
    161187 
    162        Descriptor (:class:`Orange.data.feature.Feature`) of the outer variable. 
     188       Outer variable (:class:`Orange.data.feature.Feature`) whose values are 
     189       used as the first, outer index. 
    163190 
    164191    .. attribute:: innerVariable 
    165192 
    166         Descriptor (:class:`Orange.data.feature.Feature`) of the inner variable. 
     193       Inner variable (:class:`Orange.data.feature.Feature`), whose values are 
     194       used as the second, inner index. 
    167195  
    168196    .. attribute:: outerDistribution 
     
    176204    .. attribute:: innerDistributionUnknown 
    177205 
    178         The distribution (:class:`Distribution`) of the inner variable for  
    179         instances for which the outer variable was undefined. 
    180         This is the difference between the ``innerDistribution`` 
    181         and unconditional distribution of inner variable. 
     206        The distribution (:class:`Distribution`) of the inner variable for 
     207        instances for which the outer variable was undefined. This is the 
     208        difference between the ``innerDistribution`` and (unconditional) 
     209        distribution of inner variable. 
    182210       
    183211    .. attribute:: varType 
    184212 
    185         The type of the outer feature (:obj:`Orange.data.Type`, usually 
    186         :obj:`Orange.data.feature.Discrete` or  
    187         :obj:`Orange.data.feature.Continuous`). ``varType`` equals ``outerVariable.varType`` and ``outerDistribution.varType``. 
    188  
    189     .. method:: __init__(outerVariable, innerVariable) 
     213        The type of the outer variable (:obj:`Orange.data.Type`, usually 
     214        :obj:`Orange.data.feature.Discrete` or 
     215        :obj:`Orange.data.feature.Continuous`); equals 
     216        ``outerVariable.varType`` and ``outerDistribution.varType``. 
     217 
     218    .. method:: __init__(outer_variable, inner_variable) 
    190219      
    191220        Construct an instance of ``Contingency`` for the given pair of 
    192221        variables. 
    193222      
    194         :param outerVariable: Descriptor of the outer variable 
    195         :type outerVariable: Orange.data.feature.Feature 
    196         :param outerVariable: Descriptor of the inner variable 
    197         :type innerVariable: Orange.data.feature.Feature 
     223        :param outer_variable: Descriptor of the outer variable 
     224        :type outer_variable: Orange.data.feature.Feature 
     225        :param inner_variable: Descriptor of the inner variable 
     226        :type inner_variable: Orange.data.feature.Feature 
    198227         
    199228    .. method:: add(outer_value, inner_value[, weight=1]) 
    200229     
    201         Add an element to the contingency matrix by adding 
    202         ``weight`` to the corresponding cell. 
     230        Add an element to the contingency matrix by adding ``weight`` to the 
     231        corresponding cell. 
    203232 
    204233        :param outer_value: The value for the outer variable 
     
    211240    .. method:: normalize() 
    212241 
    213         Normalize all distributions (rows) in the contingency to sum to ``1``:: 
     242        Normalize all distributions (rows) in the matrix to sum to ``1``:: 
    214243         
    215244            >>> cont.normalize() 
     
    226255        .. note:: 
    227256        
    228             This method doesn't change the ``innerDistribution`` or 
     257            This method does not change the ``innerDistribution`` or 
    229258            ``outerDistribution``. 
    230259         
    231260    With respect to indexing, a contingency matrix is a cross between a dictionary 
    232261    and a list. It supports standard dictionary methods ``keys``, ``values`` and 
    233     ``items``.:: 
     262    ``items``. :: 
    234263 
    235264        >>> print cont.keys() 
     
    241270        ('3', <72.000, 36.000>), ('4', <72.000, 36.000>)]  
    242271 
    243     Although keys returned by the above functions are strings, contingency 
    244     can be indexed with anything that converts into values 
    245     of the outer variable: strings, numbers or instances of ``Orange.data.Value``.:: 
     272    Although keys returned by the above functions are strings, contingency can 
     273    be indexed by anything that can be converted into values of the outer 
     274    variable: strings, numbers or instances of ``Orange.data.Value``. :: 
    246275 
    247276        >>> print cont[0] 
     
    253282    The length of ``Contingency`` equals the number of values of the outer 
    254283    variable. However, iterating through contingency 
    255     doesn't return keys, as with dictionaries, but distributions.:: 
     284    does not return keys, as with dictionaries, but distributions. :: 
    256285 
    257286        >>> for i in cont: 
     
    264293 
    265294 
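The indexing behaviour described above (dictionary-style ``keys``, ``values`` and ``items``, list-style length and iteration, and lookup by either position or value name) can be mimicked in a few lines. ``MiniContingency`` is a hypothetical illustration, not the Orange class, and the row for the value ``'1'`` is an assumed figure.

```python
class MiniContingency:
    """Hypothetical cross between a dictionary and a list."""

    def __init__(self, outer_values, rows):
        self._values = [str(v) for v in outer_values]
        self._rows = list(rows)

    def _index(self, key):
        # Integers are positions; anything else is converted to a value name.
        if isinstance(key, int):
            return key
        return self._values.index(str(key))

    def __getitem__(self, key):
        return self._rows[self._index(key)]

    def __len__(self):
        return len(self._rows)

    def __iter__(self):
        # Unlike a dictionary, iteration yields distributions, not keys.
        return iter(self._rows)

    def keys(self):
        return list(self._values)

    def values(self):
        return list(self._rows)

    def items(self):
        return list(zip(self._values, self._rows))


cont = MiniContingency(
    ["1", "2", "3", "4"],
    [[0.0, 108.0], [72.0, 36.0], [72.0, 36.0], [72.0, 36.0]],
)
print(cont.keys())
print(cont[0] is cont["1"])
print([row for row in cont])
```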
    266 .. class:: Orange.statistics.distribution.ContingencyClass 
    267  
    268     ``ContingencyClass`` is an abstract base class for contingency matrices 
    269     that contain the class, either as the inner or the outer 
    270     variable. 
     295.. class:: ContingencyClass 
     296 
     297    An abstract base class for contingency matrices that contain the class, 
     298    either as the inner or the outer variable. 
    271299 
    272300    .. attribute:: classVar (read only) 
    273301     
    274         The class attribute descriptor. 
    275         This is always equal either to :obj:`Contingency.innerVariable` or 
    276         ``outerVariable``. 
     302        The class attribute descriptor; always equal to either 
     303        :obj:`Contingency.innerVariable` or :obj:`Contingency.outerVariable`. 
    277304 
    278305    .. attribute:: variable 
    279306     
    280         The class attribute descriptor. 
    281         This is always equal either to innerVariable or outerVariable 
    282  
    283     .. method:: add_attrclass(variable_value, class_value[, weight]) 
    284  
    285         Adds an element to contingency. The difference between this and 
    286         Contigency.add is that the variable value is always the first 
    287         argument and class value the second, regardless of what is inner and 
    288         outer.  
     307        Variable; always equal to either innerVariable or outerVariable 
     308 
     309    .. method:: add_attrclass(variable_value, class_value[, weight=1]) 
     310 
     311        Add an element to contingency by increasing the corresponding count. The 
     312        difference between this and :obj:`Contingency.add` is that the variable 
     313        value is always the first argument and class value the second, 
     314        regardless of which one is inner and which one is outer. 
    289315 
    290316        :param variable_value: Variable value 
     
    296322 
    297323 
    298  
    299 .. class:: Orange.statistics.distribution.ContingencyVarClass 
    300  
    301     A class derived from :obj:`ContingencyVarClass`, which uses a given feature 
    302     as the :obj:`Contingency.outerVariable` and class as the 
    303     :obj:`Contingency.innerVariable` to provide a form suitable for computation 
    304     of conditional class probabilities given the variable value. 
    305      
    306     Calling :obj:`ContingencyVarClass.add_attrclass(v, c)` is equivalent 
    307     to calling :obj:`Contingency.add(v, c)`. Similar as :obj:`Contingency`, 
     324.. class:: ContingencyVarClass 
     325 
     326    A class derived from :obj:`ContingencyClass` in which the variable is 
     327    used as :obj:`Contingency.outerVariable` and class as the 
     328    :obj:`Contingency.innerVariable`. This form is suitable for 
     329    computation of conditional class probabilities given the variable value. 
     330     
     331    Calling :obj:`ContingencyVarClass.add_attrclass(v, c)` is equivalent to 
     332    :obj:`Contingency.add(v, c)`. Like :obj:`Contingency`, 
    308333    :obj:`ContingencyVarClass` can compute contingency from instances. 
    309334 
    310     .. method:: __init__(feature, class_attribute) 
     335    .. method:: __init__(feature, class_variable) 
    311336 
    312337        Construct an instance of :obj:`ContingencyVarClass` for the given pair of 
    313338        variables. Inherited from :obj:`Contingency`. 
    314339 
    315         :param outerVariable: Descriptor of the outer variable 
    316         :type outerVariable: Orange.data.feature.Feature 
    317         :param outerVariable: Descriptor of the inner variable 
    318         :type innerVariable: Orange.data.feature.Feature 
     340        :param feature: Outer variable 
     341        :type feature: Orange.data.feature.Feature 
     342        :param class_variable: Class variable; used as ``innerVariable`` 
     343        :type class_variable: Orange.data.feature.Feature 
    319344         
    320345    .. method:: __init__(feature, data[, weightId]) 
    321346 
    322         Compute the contingency from the given instances.      
     347        Compute the contingency from data. 
     348 
     349        :param feature: Outer variable 
     350        :type feature: Orange.data.feature.Feature 
     351        :param data: A set of instances 
     352        :type data: Orange.data.Table 
     353        :param weightId: meta attribute with weights of instances 
     354        :type weightId: int 
     355 
     356    .. method:: p_class(value) 
     357 
     358        Return the probability distribution of classes given the value of the 
     359        variable. 
     360 
     361        :param value: The value of the variable 
     362        :type value: int, float, string or :obj:`Orange.data.Value` 
     363        :rtype: Orange.statistics.distribution.Distribution 
     364 
     365 
     366    .. method:: p_class(value, class_value) 
     367 
     368        Returns the conditional probability of the class_value given the 
     369        feature value, p(class_value|value) (note the order of arguments!) 
     370         
     371        :param value: The value of the variable 
     372        :type value: int, float, string or :obj:`Orange.data.Value` 
     373        :param class_value: The class value 
     374        :type class_value: int, float, string or :obj:`Orange.data.Value` 
     375        :rtype: float 
     376 
     377    .. _distributions-contingency3.py: code/distributions-contingency3.py 
     378 
     379    part of `distributions-contingency3.py`_ (uses monks-1.tab) 
     380 
     381    .. literalinclude:: code/distributions-contingency3.py 
     382        :lines: 1-25 
     383 
     384    The inner and the outer variable and their relations to the class are 
     385    as follows:: 
     386 
     387        Inner variable:  y 
     388        Outer variable:  e 
     389     
     390        Class variable:  y 
     391        Feature:         e 
     392 
     393    Distributions are normalized, and probabilities are elements from the 
     394    normalized distributions. Knowing that the target concept is 
     395    y := (e=1) or (a=b), distributions are as expected: when e equals 1, class 1 
     396    has a 100% probability, while for the rest, probability is one third, which 
     397    agrees with a probability that two three-valued independent features 
     398    have the same value. :: 
     399 
     400        Distributions: 
     401          p(.|1) = <0.000, 1.000> 
     402          p(.|2) = <0.662, 0.338> 
     403          p(.|3) = <0.659, 0.341> 
     404          p(.|4) = <0.669, 0.331> 
     405     
     406        Probabilities of class '1' 
     407          p(1|1) = 1.000 
     408          p(1|2) = 0.338 
     409          p(1|3) = 0.341 
     410          p(1|4) = 0.331 
     411     
     412        Distributions from a matrix computed manually: 
     413          p(.|1) = <0.000, 1.000> 
     414          p(.|2) = <0.662, 0.338> 
     415          p(.|3) = <0.659, 0.341> 
     416          p(.|4) = <0.669, 0.331> 
     417 
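The reasoning about the target concept can be checked by exhaustive enumeration. The sketch below does not read monks-1.tab; it assumes the usual monks-1 attribute space (``a``, ``b`` and ``d`` with three values, ``c`` and ``f`` with two, ``e`` with four) and the concept y := (e=1) or (a=b) stated above.

```python
from itertools import product

# counts[e] = [instances with y=0, instances with y=1]
counts = {e: [0, 0] for e in (1, 2, 3, 4)}

# Assumed attribute ranges: a, b, d in 1..3; c, f in 1..2; e in 1..4.
for a, b, c, d, e, f in product(
    range(1, 4), range(1, 4), range(1, 3), range(1, 4), range(1, 5), range(1, 3)
):
    y = 1 if (e == 1 or a == b) else 0
    counts[e][y] += 1

# Row normalization gives p(class | e).
p_class = {e: [n / sum(row) for n in row] for e, row in counts.items()}

for e in sorted(p_class):
    print("p(.|%d) = <%.3f, %.3f>" % (e, p_class[e][0], p_class[e][1]))
```

Under these assumptions class 1 is certain when ``e`` is 1 and has probability of exactly one third otherwise; the figures in the listing above are close to these values.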
     418 
     419.. class:: ContingencyClassVar 
     420 
     421    :obj:`ContingencyClassVar` is similar to :obj:`ContingencyVarClass` except 
     422    that the class is outside and the variable is inside. This form of 
     423    contingency matrix is suitable for computing conditional probabilities of 
     424    variable given the class. All methods get the two arguments in the same 
     425    order as :obj:`ContingencyVarClass`. 
     426 
     427    .. method:: __init__(feature, class_variable) 
     428 
     429        Construct an instance of :obj:`ContingencyClassVar` for the given pair of 
     430        variables. Inherited from :obj:`Contingency`, except for the reversed 
     431        order of arguments. 
     432 
     433        :param feature: Inner variable 
     434        :type feature: Orange.data.feature.Feature 
     435        :param class_variable: Class variable 
     436        :type class_variable: Orange.data.feature.Feature 
     437         
     438    .. method:: __init__(feature, data[, weightId]) 
     439 
     440        Compute contingency from the data. 
    323441 
    324442        :param feature: Descriptor of the outer variable 
     
    329447        :type weightId: int 
    330448 
    331     .. method:: p_class(value) 
    332  
    333         Return the probability distribution of classes given the value of the 
    334         variable. Equivalent to `self[value]`, except for normalization. 
    335  
    336         :param value: The value of the variable 
    337         :type value: int, float, string or :obj:`Orange.data.Value` 
    338  
    339     .. method:: p_class(value, class_value) 
    340  
    341         Returns the conditional probability of the class_value given the 
    342         feature value, p(class_value|value) (note the order of arguments!) 
    343         Equivalent to `self[values][class_value]`, except for normalization. 
    344          
    345         :param value: The value of the variable 
    346         :type value: int, float, string or :obj:`Orange.data.Value` 
    347         :param class_value: The class value 
    348         :type value: int, float, string or :obj:`Orange.data.Value` 
    349  
    350     .. _distributions-contingency3.py: code/distributions-contingency3.py 
    351  
    352     part of `distributions-contingency3.py`_ (uses monks-1.tab) 
    353  
    354     .. literalinclude:: code/distributions-contingency3.py 
    355         :lines: 1-25 
    356  
    357     The inner and the outer variable and their relations to the class are 
    358     as follows:: 
    359  
    360         Inner variable:  y 
    361         Outer variable:  e 
    362      
    363         Class variable:  y 
    364         Feature:         e 
    365  
    366     Distributions are normalized and probabilities are elements from the 
    367     normalized distributions. Knowing that the target concept is 
    368     y := (e=1) or (a=b), distributions are as expected: when e equals 1, class 1 
    369     has a 100% probability, while for the rest, probability is one third, which 
    370     agrees with a probability that two three-valued independent features 
    371     have the same value. :: 
    372  
    373         Distributions: 
    374           p(.|1) = <0.000, 1.000> 
    375           p(.|2) = <0.662, 0.338> 
    376           p(.|3) = <0.659, 0.341> 
    377           p(.|4) = <0.669, 0.331> 
    378      
    379         Probabilities of class '1' 
    380           p(1|1) = 1.000 
    381           p(1|2) = 0.338 
    382           p(1|3) = 0.341 
    383           p(1|4) = 0.331 
    384      
    385         Distributions from a matrix computed manually: 
    386           p(.|1) = <0.000, 1.000> 
    387           p(.|2) = <0.662, 0.338> 
    388           p(.|3) = <0.659, 0.341> 
    389           p(.|4) = <0.669, 0.331> 
    390  
    391  
    392 .. class:: Orange.statistics.distribution.ContingencyClassVar 
    393  
    394     :obj:`ContingencyClassVar` is similar to :obj:`ContingencyVarClass` except 
    395     that here the class is outside and the variable is inside. This form of 
    396     contingency matrix is suitable for computing conditional probabilities of 
    397     variable given the class. All methods get the two arguments in the same 
    398     order as in :obj:`ContingencyVarClass`. 
    399  
    400     .. method:: __init__(feature, class_attribute) 
    401  
    402         Construct an instance of :obj:`ContingencyVarClass` for the given pair of 
    403         variables. Inherited from :obj:`Contingency`, except for the reversed 
    404         order. 
    405  
    406         :param outerVariable: Descriptor of the outer variable 
    407         :type outerVariable: Orange.data.feature.Feature 
    408         :param outerVariable: Descriptor of the inner variable 
    409         :type innerVariable: Orange.data.feature.Feature 
    410          
    411     .. method:: __init__(feature, instances[, weightId]) 
    412  
    413         Compute the contingency from the given instances.      
    414  
    415         :param feature: Descriptor of the outer variable 
    416         :type feature: Orange.data.feature.Feature 
    417         :param data: A set of instances 
    418         :type data: Orange.data.Table 
    419         :param weightId: meta attribute with weights of instances 
    420         :type weightId: int 
    421  
    422449    .. method:: p_attr(class_value) 
    423450 
    424451        Return the probability distribution of variable given the class. 
    425         Equivalent to `self[class_value]`, except for normalization. 
    426452 
    427454        :param class_value: The class value 
    428454        :type class_value: int, float, string or :obj:`Orange.data.Value` 
     455        :rtype: Orange.statistics.distribution.Distribution 
    429456 
    430457    .. method:: p_attr(value, class_value) 
     
    434461        Equivalent to `self[class][value]`, except for normalization. 
    435462 
    436         :param value: The value of the variable 
     463        :param value: Value of the variable 
    437464        :type value: int, float, string or :obj:`Orange.data.Value` 
    438         :param class_value: The class value 
     465        :param class_value: Class value 
    439466        :type class_value: int, float, string or :obj:`Orange.data.Value` 
     467        :rtype: float 
    440468 
    441469    .. _distributions-contingency4.py: code/distributions-contingency4.py 
     
    443471    part of the output from `distributions-contingency4.py`_ (uses monks-1.tab) 
    444472     
    445     The inner and the outer variable and their relations to the class 
    446     and the features are exactly the reverse from :obj:`ContingencyClassVar`:: 
     473    The roles of the feature and the class are reversed compared to 
     474    :obj:`ContingencyVarClass`:: 
    447475     
    448476        Inner variable:  e 
     
    463491        p(.|1) = <0.500, 0.167, 0.167, 0.167> 
    464492     
    465     If the class value is '0', the attribute e cannot be '1' (the first value), 
    466     while distribution across other values is uniform. 
    467     If the class value is '1', e is '1' in exactly half of examples, and 
    468     distribution of other values is again uniform. 
    469  
    470 .. class:: Orange.statistics.distribution.ContingencyVarVar 
    471  
    472     Contingency matrices in which none of the variables is the class.  
    473     The class is similar to the parent class :obj:`Contingency`, except for 
    474     an additional constructor and method for getting conditional probabilities. 
     493    If the class value is `0`, the attribute `e` cannot be `1` (the first 
     494    value), while distribution across other values is uniform.  If the class 
     495    value is `1`, `e` is `1` for exactly half of the instances, and distribution of 
     496    other values is again uniform. 
     497 
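Swapping the roles of the two variables amounts to transposing the count matrix before normalizing its rows. The sketch below is plain Python, not the Orange API; the counts in ``var_class`` are assumed figures obtained by enumerating the monks-1 concept y := (e=1) or (a=b), not values read from the data file.

```python
# Assumed counts indexed as var_class[e_value] = [count for class 0, count for class 1].
var_class = {
    "1": [0, 108],
    "2": [72, 36],
    "3": [72, 36],
    "4": [72, 36],
}

values = sorted(var_class)   # values of e, in order
n_classes = 2

# Transpose: class_var[class] = counts over the values of e.
class_var = {c: [var_class[v][c] for v in values] for c in range(n_classes)}

# Row normalization now gives p(e | class).
p_attr = {c: [n / sum(row) for n in row] for c, row in class_var.items()}

for c in sorted(p_attr):
    print("p(.|%d) = <%s>" % (c, ", ".join("%.3f" % p for p in p_attr[c])))
```

With these counts, `e` is never `1` for class `0`, and is `1` for half of the instances of class `1`, as described above.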
     498.. class:: ContingencyVarVar 
     499 
     500    Contingency matrices in which none of the variables is the class.  The class 
     501    is derived from :obj:`Contingency`, and adds an additional constructor and 
     502    method for getting conditional probabilities. 
    475503 
    476504    .. method:: ContingencyVarVar(outer_variable, inner_variable) 
     
    482510        Compute the contingency from the given instances. 
    483511 
    484         :param outer_variable: Descriptor of the outer variable 
     512        :param outer_variable: Outer variable 
    485513        :type outer_variable: Orange.data.feature.Feature 
    486         :param inner_variable: Descriptor of the inner variable 
     514        :param inner_variable: Inner variable 
    487515        :type inner_variable: Orange.data.feature.Feature 
    488516        :param data: A set of instances 
     
    493521    .. method:: p_attr(outer_value) 
    494522 
    495         Return the probability distribution of the inner 
    496         variable given the outer variable value. 
     523        Return the probability distribution of the inner variable given the 
     524        outer variable value. 
    497525 
    498526        :param outer_value: The value of the outer variable 
    499527        :type outer_value: int, float, string or :obj:`Orange.data.Value` 
     528        :rtype: Orange.statistics.distribution.Distribution 
    500529  
    501530    .. method:: p_attr(outer_value, inner_value) 
     
    508537        :param inner_value: The value of the inner variable 
    509538        :type inner_value: int, float, string or :obj:`Orange.data.Value` 
     539        :rtype: float 
    510540 
    511541    The following example investigates which material is used for 
     
    519549        :lines: 1-19 
    520550 
    521     Short bridges are mostly wooden or iron, 
    522     and the longer (and the most of middle sized) are made from steel:: 
     551    Short bridges are mostly wooden or iron, and the longer ones (and most of
     552    the middle-sized ones) are made of steel::
    523553     
    524554        SHORT: 
     
    540570 
    541571 
    542 Contingency matrices for continuous variables 
    543 --------------------------------------------- 
    544  
    545 The described classes can also be used for continuous values. 
    546  
    547 If the outer feature is continuous, the index must be one 
    548 of the values that do exist in the contingency matrix. Using other values 
    549 triggers an exception:: 
    550  
    551     .. _distributions-contingency6: code/distributions-contingency6.py 
    552      
    553     part of `distributions-contingency6`_ (uses monks-1.tab) 
    554      
    555     .. literalinclude:: code/distributions-contingency6.py 
    556         :lines: 1-5,18,19 
    557  
    558 Since even rounding is a problem, the keys should generally come from the 
    559 contingencies `keys`. 
    560  
    561 Contingencies with discrete outer variable continuous inner variables are 
    562 more useful, since methods :obj:`ContingencyClassVar.p_class` and  
    563 :obj:`ContingencyVarClass.p_attr` use the primitive density estimation 
    564 provided by :obj:`Distribution`. 
    565  
    566 For example, :obj:`ContingencyClassVar` on the iris dataset, 
    567 you can enquire about the probability of the sepal length 5.5:: 
    568  
    569     .. _distributions-contingency7: code/distributions-contingency7.py 
    570      
    571     part of `distributions-contingency7`_ (uses iris.tab) 
    572      
    573     .. literalinclude:: code/distributions-contingency7.py 
    574  
    575 The script outputs:: 
    576  
    577     Estimated frequencies for e=5.5 
    578       f(5.5|Iris-setosa) = 2.000 
    579       f(5.5|Iris-versicolor) = 5.000 
    580       f(5.5|Iris-virginica) = 1.000 
    581  
    582  
    583572Contingency matrices for the entire domain 
    584 ------------------------------------------ 
    585  
    586 DomainContingency is basically a list of contingencies, 
    587 either :obj:`ContingencyVarClass` or :obj:`ContingencyClassVar`. 
     573========================================== 
     574 
     575A list of contingencies, either :obj:`ContingencyVarClass` or 
     576:obj:`ContingencyClassVar`. 
    588577 
    589578.. class:: DomainContingency 
     
    613602        Contains the distribution of class values on the entire dataset. 
    614603 
    615     .. method:: normalize 
     604    .. method:: normalize() 
    616605 
    617606        Call normalize for all contingencies. 
     
    636625    .. literalinclude:: code/distributions-contingency8.py 
    637626        :lines: 13-  
     627 
     628 
     629.. _contcont: 
     630 
     631Contingency matrices for continuous variables 
     632============================================= 
     633 
     634If the outer variable is continuous, the index must be one of the values that do 
     635exist in the contingency matrix. Using other values raises an exception:: 
     636 
     637    .. _distributions-contingency6: code/distributions-contingency6.py 
     638     
     639    part of `distributions-contingency6`_ (uses monks-1.tab) 
     640     
     641    .. literalinclude:: code/distributions-contingency6.py 
     642        :lines: 1-5,18,19 
     643 
     644Since even rounding can be a problem, the only safe way to get the key is to
     645take it from the contingencies' ``keys``.
     646 
     647Contingencies with a discrete outer variable and a continuous inner variable are
     648more useful, since methods :obj:`ContingencyClassVar.p_class` and  
     649:obj:`ContingencyVarClass.p_attr` use the primitive density estimation 
     650provided by :obj:`Orange.statistics.distribution.Distribution`. 
     651 
     652For example, :obj:`ContingencyClassVar` on the iris dataset can return the 
     653probability of the sepal length 5.5 for different classes:: 
     654 
     655    .. _distributions-contingency7: code/distributions-contingency7.py 
     656     
     657    part of `distributions-contingency7`_ (uses iris.tab) 
     658     
     659    .. literalinclude:: code/distributions-contingency7.py 
     660 
     661The script outputs:: 
     662 
     663    Estimated frequencies for e=5.5 
     664      f(5.5|Iris-setosa) = 2.000 
     665      f(5.5|Iris-versicolor) = 5.000 
     666      f(5.5|Iris-virginica) = 1.000 
    638667 
    639668""" 
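The warning about continuous outer variables (new lines 634-645) can be illustrated without Orange at all. The following is only a minimal sketch: a plain ``dict`` keyed by floats stands in for a contingency, and ``nearest_key`` is a hypothetical helper, not part of the Orange API. It shows why a computed value may miss a key that looks identical, and why the key is best taken from the stored keys themselves.

```python
# Plain-Python stand-in for a contingency with a continuous outer variable:
# a dict keyed by the exact float values seen in the data (made-up counts).
contingency = {0.3: [2, 5, 1], 5.5: [1, 0, 4]}

# A value obtained by arithmetic is not bit-for-bit equal to the stored key.
computed = 0.1 + 0.2  # 0.30000000000000004, not 0.3

try:
    contingency[computed]
    found = True
except KeyError:
    # Analogous to the exception raised for a value missing from the matrix.
    found = False


def nearest_key(cont, value):
    """Hypothetical helper: return the stored key closest to `value`."""
    return min(cont.keys(), key=lambda k: abs(k - value))


# Safe lookup: the index comes from the container's own keys.
safe = contingency[nearest_key(contingency, computed)]
```

Whether picking the nearest key is appropriate depends on the data; the point is only that the index used for lookup should originate from the container's own keys.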