Timestamp:
02/08/12 18:17:05 (2 years ago)
Author:
Matija Polajnar <matija.polajnar@…>
Branch:
default
rebase_source:
e1c57397b9f546f4ad8f3ccb9e05cb89ad67e639
Message:

Move Orange.statistics.estimate documentation to rst. Closes #1070.

File:
1 edited

  • docs/reference/rst/Orange.statistics.estimate.rst

    r9372 r10102  
    11.. automodule:: Orange.statistics.estimate 
    22 
     3.. index:: Probability Estimation 
     4 
     5======================================= 
     6Probability Estimation (``estimate``) 
     7======================================= 
     8 
     9Probability estimators compute probabilities of values of the class variable. 
     10They come in two flavours: 
     11 
     12#. for unconditional probabilities (:math:`p(C=c)`, where :math:`c` is a 
     13   class) and 
     14 
     15#. for conditional probabilities (:math:`p(C=c|V=v)`, 
     16   where :math:`v` is a feature value). 
     17 
     18A duality much like the one between learners and classifiers exists between 
     19probability estimator constructors and probability estimators: when a 
     20probability estimator constructor is called with data, it constructs a 
     21probability estimator that can then be called with a value of the class variable 
     22to obtain the probability of that value. This duality is mainly needed to 
     23enable probability estimation for continuous variables, 
     24where it is not possible to generate a list of probabilities of all possible 
     25values in advance. 
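
The constructor/estimator duality described above can be illustrated with a minimal, self-contained sketch. It does not use Orange; the class names ``FrequencyEstimatorConstructor`` and ``FrequencyEstimator`` are hypothetical and only mimic the calling convention (constructor called with data, estimator called with a value).

```python
from collections import Counter


class FrequencyEstimator:
    """Estimator: maps a class value to its estimated probability."""

    def __init__(self, probabilities):
        self.probabilities = probabilities

    def __call__(self, value=None):
        if value is None:
            # No value given: return a copy of the whole distribution.
            return dict(self.probabilities)
        # Assumption of this sketch: unseen values get probability 0.
        return self.probabilities.get(value, 0.0)


class FrequencyEstimatorConstructor:
    """Constructor: called with data, it returns an estimator."""

    def __call__(self, class_values):
        counts = Counter(class_values)
        total = sum(counts.values())
        return FrequencyEstimator({c: n / total for c, n in counts.items()})


constructor = FrequencyEstimatorConstructor()
estimator = constructor(["yes", "yes", "no", "yes"])  # "learning" step
p_yes = estimator("yes")                              # "prediction" step
```

For discrete variables the estimator could simply be a precomputed table; the separate callable matters for continuous variables, where a probability has to be computed on demand.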
     26 
     27First, probability estimator constructors for common probability estimation 
     28techniques are enumerated. Base classes, knowledge of which is needed to 
     29develop new techniques, are described later in this document. 
     30 
     31Probability Estimation Constructors 
     32=================================== 
     33 
     34.. class:: RelativeFrequency 
     35 
     36    Bases: :class:`EstimatorConstructor` 
     37 
     38    Compute distribution using relative frequencies of classes. 
     39 
     40    :rtype: :class:`EstimatorFromDistribution` 
     41 
     42.. class:: Laplace 
     43 
     44    Bases: :class:`EstimatorConstructor` 
     45 
     46    Use Laplace estimation to compute distribution from frequencies of classes: 
     47 
     48    .. math:: 
     49 
     50        p(c) = \frac{N_c+1}{N+n} 
     51 
     52    where :math:`N_c` is the number of occurrences of an event (e.g. the number of 
     53    instances in class :math:`c`), :math:`N` is the total number of events (instances) 
     54    and :math:`n` is the number of different events (classes). 
     55 
     56    :rtype: :class:`EstimatorFromDistribution` 
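
As a worked illustration of the Laplace formula above, the following sketch computes the estimate from a plain list of class labels. It is a stand-alone example rather than Orange code; the function name ``laplace_estimate`` and its arguments are assumptions made for illustration.

```python
from collections import Counter


def laplace_estimate(class_values, classes):
    """Return Laplace-corrected probabilities for every class in `classes`."""
    counts = Counter(class_values)
    n_events = len(class_values)   # N: total number of instances
    n_classes = len(classes)       # n: number of different classes
    # Each class count is incremented by one, so unseen classes get a
    # non-zero probability and the probabilities still sum to 1.
    return {c: (counts[c] + 1) / (n_events + n_classes) for c in classes}


probs = laplace_estimate(["a", "a", "b"], ["a", "b", "c"])
```

With three instances and three classes, class ``a`` gets (2+1)/(3+3), class ``b`` gets (1+1)/(3+3), and the unseen class ``c`` gets (0+1)/(3+3).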
     57 
     58.. class:: M 
     59 
     60    Bases: :class:`EstimatorConstructor` 
     61 
     62    .. method:: __init__(m) 
     63 
     64        :param m: Parameter for m-estimation. 
     65        :type m: int 
     66 
     67    Use m-estimation to compute distribution from frequencies of classes: 
     68 
     69    .. math:: 
     70 
     71        p(c) = \frac{N_c + m \cdot ap(c)}{N+m} 
     72 
     73    where :math:`N_c` is the number of occurrences of an event (e.g. the number of 
     74    instances in class :math:`c`), :math:`N` is the total number of events (instances) 
     75    and :math:`ap(c)` is the prior probability of event (class) :math:`c`. 
     76 
     77    :rtype: :class:`EstimatorFromDistribution` 
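
The m-estimate formula above can be checked with a small stand-alone function. The name ``m_estimate`` and its argument names are hypothetical; the sketch only evaluates the formula and is not how Orange implements it.

```python
def m_estimate(count, total, prior, m):
    """m-estimate of a probability.

    count -- number of occurrences of the event (instances in class c)
    total -- total number of events (instances)
    prior -- prior probability of the event, ap(c)
    m     -- weight given to the prior, in units of "virtual" instances
    """
    return (count + m * prior) / (total + m)


# 3 of 4 instances belong to the class, the prior is 0.5 and m is 2.
p = m_estimate(3, 4, 0.5, 2)

# With m = 0 the estimate reduces to the relative frequency.
p_relative = m_estimate(3, 4, 0.5, 0)
```

Larger values of ``m`` pull the estimate towards the prior; as ``m`` grows without bound the estimate approaches ``prior``.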
     78 
     79.. class:: Kernel 
     80 
     81    Bases: :class:`EstimatorConstructor` 
     82 
     83    .. method:: __init__(min_impact, smoothing, n_points) 
     84 
     85        :param min_impact: A requested minimal weight of a point (default: 
     86            0.01); points with lower weights won't be taken into account. 
     87        :type min_impact: float 
     88 
     89        :param smoothing: Smoothing factor (default: 1.144). 
     90        :type smoothing: float 
     91 
     92        :param n_points: Number of points for the interpolating curve. If 
     93            negative, say -3 (default), 3 points will be inserted between each 
     94            two data points. 
     95        :type n_points: int 
     96 
     97    Compute probabilities for a continuous variable at a certain number of points 
     98    using Gaussian kernels. The resulting point-wise continuous distribution is 
     99    stored as :class:`~Orange.statistics.distribution.Continuous`. 
     100 
     101    Probabilities are always computed at all points that 
     102    are present in the data (i.e. the existing values of the continuous 
     103    feature). If :obj:`n_points` is positive and greater than the 
     104    number of existing data points, additional points are inserted 
     105    between the existing points to achieve the required number of 
     106    points. Approximately equal numbers of new points are inserted between 
     107    each pair of adjacent existing points. If :obj:`n_points` is 
     108    negative, its absolute value determines the number of points to be added 
     109    between each two data points. 
     110 
     111    :rtype: :class:`EstimatorFromDistribution` 
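
The effect of a negative ``n_points`` can be sketched as follows. The helper ``insert_points`` is hypothetical, and the even spacing of the inserted points is an assumption of this sketch; the text above only fixes how many points are added between each two data points.

```python
def insert_points(data_points, n_between):
    """Return the sorted data points with `n_between` extra points
    inserted between each pair of adjacent distinct values."""
    points = sorted(set(data_points))
    result = []
    for left, right in zip(points, points[1:]):
        result.append(left)
        # Assumption: inserted points are evenly spaced in the gap.
        step = (right - left) / (n_between + 1)
        result.extend(left + step * i for i in range(1, n_between + 1))
    if points:
        result.append(points[-1])
    return result


# n_points = -3 corresponds to three inserted points per gap.
grid = insert_points([0.0, 4.0, 6.0], 3)
```

The existing data values are always kept in the resulting grid, which matches the statement that probabilities are computed at all points present in the data.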
     112 
     113.. class:: Loess 
     114 
     115    Bases: :class:`EstimatorConstructor` 
     116 
     117    .. method:: __init__(window_proportion, n_points) 
     118 
     119        :param window_proportion: A proportion of points in a window. 
     120        :type window_proportion: float 
     121 
     122        :param n_points: Number of points for the interpolating curve. If 
     123            negative, say -3 (default), 3 points will be inserted between each 
     124            two data points. 
     125        :type n_points: int 
     126 
     127    Prepare a probability estimator that computes probability at point ``x`` 
     128    as weighted local regression of probabilities for points in the window 
     129    around this point. 
     130 
     131    The window contains a prescribed proportion of original data points. The 
     132    window is as symmetric as possible in the sense that the leftmost point in 
     133    the window is approximately as far from ``x`` as the rightmost. The 
     134    number of points to the left of ``x`` might thus differ from the number 
     135    of points to the right. 
     136 
     137    Points are weighted by a bi-cubic weight function; the weight of a point 
     138    at ``x'`` is :math:`(1-|t|^3)^3`, where :math:`t` is 
     139    :math:`(x-x')/h` and :math:`h` is the distance to the farther 
     140    of the two window edge points. 
     141 
     142    :rtype: :class:`EstimatorFromDistribution` 
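
The weight function described above is easy to evaluate directly. The sketch below is illustrative only: the name ``window_weight`` is hypothetical, and returning zero for points outside the window is an assumption (inside the window ``|t|`` never exceeds 1 by the definition of ``h``).

```python
def window_weight(x, x_prime, h):
    """Weight of a point at `x_prime` when estimating at `x`.

    h is the distance from x to the farther of the two window edges.
    """
    t = (x - x_prime) / h
    if abs(t) >= 1:
        # Assumption: points on or beyond the window edge get no weight.
        return 0.0
    return (1 - abs(t) ** 3) ** 3


w_center = window_weight(0.0, 0.0, 1.0)   # point at x itself
w_half = window_weight(0.0, 0.5, 1.0)     # point halfway to the edge
w_edge = window_weight(0.0, 1.0, 1.0)     # point on the farther edge
```

The weight is 1 at ``x`` and falls smoothly to 0 at the farther window edge, so nearby points dominate the local regression.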
     143 
     144 
     145.. class:: ConditionalLoess 
     146 
     147    Bases: :class:`ConditionalEstimatorConstructor` 
     148 
     149    .. method:: __init__(window_proportion, n_points) 
     150 
     151        :param window_proportion: A proportion of points in a window. 
     152        :type window_proportion: float 
     153 
     154        :param n_points: Number of points for the interpolating curve. If 
     155            negative, say -3 (default), 3 points will be inserted between each 
     156            two data points. 
     157        :type n_points: int 
     158 
     159    Construct a conditional probability estimator, in other aspects 
     160    similar to the one constructed by :class:`Loess`. 
     161 
     162    :rtype: :class:`ConditionalEstimatorFromDistribution`. 
     163 
     164 
     165Base classes 
     166============= 
     167 
     168All probability estimators are derived from two base classes: one for 
     169unconditional and the other for conditional probability estimation. The same 
     170is true for probability estimator constructors. 
     171 
     172.. class:: EstimatorConstructor 
     173 
     174    Constructor of an unconditional probability estimator. 
     175 
     176    .. method:: __call__([distribution[, apriori]], [instances[, weight_id]]) 
     177 
     178        :param distribution: input distribution. 
     179        :type distribution: :class:`~Orange.statistics.distribution.Distribution` 
     180 
     181        :param apriori: prior distribution. 
     182        :type apriori: :class:`~Orange.statistics.distribution.Distribution` 
     183 
     184        :param instances: input data. 
     185        :type instances: :class:`Orange.data.Table` 
     186 
     187        :param weight_id: ID of the weight attribute. 
     188        :type weight_id: int 
     189 
     190        If ``distribution`` is given, it can be followed by the prior class 
     191        distribution. Similarly, ``instances`` can be followed by 
     192        the ID of the meta attribute with instance weights. (Hint: to pass a 
     193        prior distribution and instances, but no distribution, 
     194        just pass :obj:`None` for ``distribution``.) When both 
     195        distribution and instances are given, it is up to the constructor to 
     196        decide what to use. 
     197 
     198.. class:: Estimator 
     199 
     200    .. attribute:: supports_discrete 
     201 
     202        Tells whether the estimator can handle discrete attributes. 
     203 
     204    .. attribute:: supports_continuous 
     205 
     206        Tells whether the estimator can handle continuous attributes. 
     207 
     208    .. method:: __call__([value]) 
     209 
     210        If value is given, return the probability of the value. 
     211 
     212        :rtype: float 
     213 
     214        If the value is omitted, an attempt is made 
     215        to return a distribution of probabilities for all values. 
     216 
     217        :rtype: :class:`~Orange.statistics.distribution.Distribution` 
     218            (usually :class:`~Orange.statistics.distribution.Discrete` for 
     219            discrete and :class:`~Orange.statistics.distribution.Continuous` 
     220            for continuous) or :obj:`NoneType` 
     221 
     222.. class:: ConditionalEstimatorConstructor 
     223 
     224    Constructor of a conditional probability estimator. 
     225 
     226    .. method:: __call__([table[, apriori]], [instances[, weight_id]]) 
     227 
     228        :param table: input contingency table. 
     229        :type table: :class:`Orange.statistics.contingency.Table` 
     230 
     231        :param apriori: prior distribution. 
     232        :type apriori: :class:`~Orange.statistics.distribution.Distribution` 
     233 
     234        :param instances: input data. 
     235        :type instances: :class:`Orange.data.Table` 
     236 
     237        :param weight_id: ID of the weight attribute. 
     238        :type weight_id: int 
     239 
     240        If ``table`` is given, it can be followed by the prior class 
     241        distribution. Similarly, ``instances`` can be followed by 
     242        the ID of the meta attribute with instance weights. (Hint: to pass a 
     243        prior distribution and instances, but no table, 
     244        just pass :obj:`None` for ``table``.) When both 
     245        table and instances are given, it is up to the constructor to 
     246        decide what to use. 
     247 
     248.. class:: ConditionalEstimator 
     249 
     250    As a counterpart of :class:`Estimator`, this estimator can return 
     251    conditional probabilities. 
     252 
     253    .. method:: __call__([[value,] condition_value]) 
     254 
     255        When given two values, it returns the probability :math:`p(value|condition)`. 
     256 
     257        :rtype: float 
     258 
     259        When given only one value, it is interpreted as condition; the estimator 
     260        attempts to return a distribution of conditional probabilities for all 
     261        values. 
     262 
     263        :rtype: :class:`~Orange.statistics.distribution.Distribution` 
     264            (usually :class:`~Orange.statistics.distribution.Discrete` for 
     265            discrete and :class:`~Orange.statistics.distribution.Continuous` 
     266            for continuous) or :obj:`NoneType` 
     267 
     268        When called without arguments, it returns a 
     269        matrix containing probabilities :math:`p(value|condition)` for each 
     270        possible :math:`value` and :math:`condition` (a contingency table); 
     271        condition is used as outer 
     272        variable. 
     273 
     274        :rtype: :class:`Orange.statistics.contingency.Table` or :obj:`NoneType` 
     275 
     276        If the estimator cannot return precomputed distributions and/or 
     277        contingencies, it returns :obj:`None`. 
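
The three calling forms described above (two values, one value, no arguments) can be mimicked with a small table-backed class. This is a sketch of the calling convention only; ``TableConditionalEstimator`` and its nested-dictionary table are hypothetical stand-ins for the Orange classes.

```python
class TableConditionalEstimator:
    """Conditional estimator backed by a precomputed table.

    table[condition][value] holds p(value | condition); the condition
    is the outer variable.
    """

    def __init__(self, table):
        self.table = table

    def __call__(self, *args):
        if len(args) == 2:
            # (value, condition): a single conditional probability.
            value, condition = args
            return self.table[condition][value]
        if len(args) == 1:
            # (condition,): the distribution of values for that condition.
            (condition,) = args
            return dict(self.table[condition])
        if not args:
            # No arguments: a copy of the whole table.
            return {c: dict(row) for c, row in self.table.items()}
        raise TypeError("expected at most two arguments")


estimator = TableConditionalEstimator({
    "sunny": {"yes": 0.25, "no": 0.75},
    "rainy": {"yes": 0.9, "no": 0.1},
})
p = estimator("yes", "sunny")
row = estimator("rainy")
whole = estimator()
```

An estimator that cannot precompute its distributions would return ``None`` for the one-argument and zero-argument forms instead of a copy of the table.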
     278 
     279Common Components 
     280================= 
     281 
     282.. class:: EstimatorFromDistribution 
     283 
     284    Bases: :class:`Estimator` 
     285 
     286    Probability estimator constructors that compute probabilities for all 
     287    values in advance return this estimator with calculated 
     288    quantities in the :obj:`probabilities` attribute. 
     289 
     290    .. attribute:: probabilities 
     291 
     292        A precomputed list of probabilities. 
     293 
     294    .. method:: __call__([value]) 
     295 
     296        If value is given, return the probability of the value. For discrete 
     297        variables, every value has an entry in the :obj:`probabilities` 
     298        attribute. For continuous variables, a linear interpolation between 
     299        two nearest points is used to compute the probability. 
     300 
     301        :rtype: float 
     302 
     303        If the value is omitted, a copy of :obj:`probabilities` is returned. 
     304 
     305        :rtype: :class:`~Orange.statistics.distribution.Distribution` 
     306            (usually :class:`~Orange.statistics.distribution.Discrete` for 
     307            discrete and :class:`~Orange.statistics.distribution.Continuous` 
     308            for continuous). 
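
For continuous variables, the lookup described above amounts to linear interpolation between the two nearest precomputed points. The sketch below shows one way this could work; the function ``interpolate`` is hypothetical, and returning zero outside the range of precomputed points is an assumption of this example, not documented behaviour.

```python
from bisect import bisect_left


def interpolate(points, x):
    """Linearly interpolate a probability at `x`.

    points -- list of (value, probability) pairs sorted by value
    """
    xs = [value for value, _ in points]
    i = bisect_left(xs, x)
    if i < len(xs) and xs[i] == x:
        # x coincides with a precomputed point.
        return points[i][1]
    if i == 0 or i == len(xs):
        # Assumption: no mass outside the precomputed range.
        return 0.0
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


curve = [(0.0, 0.1), (2.0, 0.5), (4.0, 0.3)]
p_mid = interpolate(curve, 1.0)    # halfway between 0.1 and 0.5
p_exact = interpolate(curve, 2.0)  # exactly on a stored point
```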
     309 
     310.. class:: ConditionalEstimatorFromDistribution 
     311 
     312    Bases: :class:`ConditionalEstimator` 
     313 
     314    Probability estimator constructors that compute the whole 
     315    contingency table (:class:`Orange.statistics.contingency.Table`) of 
     316    conditional probabilities in advance 
     317    return this estimator with the table in the :obj:`probabilities` attribute. 
     318 
     319    .. attribute:: probabilities 
     320 
     321        A precomputed contingency table. 
     322 
     323    .. method:: __call__([[value,] condition_value]) 
     324 
     325        For detailed description of handling of different combinations of 
     326        parameters, see the inherited :obj:`ConditionalEstimator.__call__`. 
     327        For behaviour with continuous variable distributions, 
     328        see the unconditional counterpart :obj:`EstimatorFromDistribution.__call__`. 
     329 
     330.. class:: ConditionalByRows 
     331 
     332    Bases: :class:`ConditionalEstimatorConstructor` 
     333 
     334    .. attribute:: estimator_constructor 
     335 
     336        An unconditional probability estimator constructor. 
     337 
     338    Computes a conditional probability estimator using 
     339    an unconditional probability estimator constructor. The result 
     340    can be of type :class:`ConditionalEstimatorFromDistribution` 
     341    or :class:`ConditionalEstimatorByRows`, depending on the type of 
     342    constructor. 
     343 
     344    .. method:: __call__([table[, apriori]], [instances[, weight_id]], estimator) 
     345 
     346        :param table: input contingency table. 
     347        :type table: :class:`Orange.statistics.contingency.Table` 
     348 
     349        :param apriori: prior distribution. 
     350        :type apriori: :class:`~Orange.statistics.distribution.Distribution` 
     351 
     352        :param instances: input data. 
     353        :type instances: :class:`Orange.data.Table` 
     354 
     355        :param weight_id: ID of the weight attribute. 
     356        :type weight_id: int 
     357 
     358        :param estimator: unconditional probability estimator constructor. 
     359        :type estimator: :class:`EstimatorConstructor` 
     360 
     361        Compute the contingency matrix if it has not been computed already. Then 
     362        call :obj:`estimator_constructor` for each value of the condition attribute. 
     363        If all constructed estimators can return a distribution of probabilities 
     364        for all classes (usually either all or none can), the 
     365        :class:`~Orange.statistics.distribution.Distribution` instances are put 
     366        in a contingency table 
     367        and a :class:`ConditionalEstimatorFromDistribution` 
     368        is constructed and returned. If the constructed estimators are 
     369        not capable of returning a distribution of probabilities, 
     370        a :class:`ConditionalEstimatorByRows` is constructed and the 
     371        estimators are stored in its :obj:`estimator_list`. 
     372 
     373        :rtype: :class:`ConditionalEstimatorFromDistribution` or :class:`ConditionalEstimatorByRows` 
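
The row-by-row construction described above can be sketched without Orange. In this example the contingency table is a plain nested dictionary of counts, and ``laplace_constructor`` and ``conditional_by_rows`` are hypothetical helpers; only the idea of calling one unconditional constructor per condition value is taken from the text.

```python
def laplace_constructor(counts):
    """Unconditional constructor: class counts -> estimator (Laplace)."""
    total = sum(counts.values())
    n_classes = len(counts)
    probs = {v: (c + 1) / (total + n_classes) for v, c in counts.items()}
    return lambda value: probs[value]


def conditional_by_rows(contingency, estimator_constructor):
    """Build one unconditional estimator per condition value."""
    estimators = {
        condition: estimator_constructor(row)
        for condition, row in contingency.items()
    }

    def conditional_estimator(value, condition):
        # Pick the estimator for this condition, then ask it for p(value).
        return estimators[condition](value)

    return conditional_estimator


# contingency[condition][class value] = count
contingency = {
    "sunny": {"yes": 1, "no": 3},
    "rainy": {"yes": 2, "no": 0},
}
estimator = conditional_by_rows(contingency, laplace_constructor)
p_yes_sunny = estimator("yes", "sunny")
p_no_rainy = estimator("no", "rainy")
```

This sketch always keeps the per-row estimators, which corresponds to the :class:`ConditionalEstimatorByRows` case; collecting the per-row distributions into a single table would correspond to the :class:`ConditionalEstimatorFromDistribution` case.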
     374 
     375.. class:: ConditionalEstimatorByRows 
     376 
     377    Bases: :class:`ConditionalEstimator` 
     378 
     379    A conditional probability estimator that uses a series 
     380    of estimators, one for each possible condition, 
     381    stored in its :obj:`estimator_list` attribute. 
     382 
     383    .. attribute:: estimator_list 
     384 
     385        A list of estimators; one for each value of :obj:`condition`. 
     386 
     387    .. method:: __call__([[value,] condition_value]) 
     388 
     389        Uses estimators from :obj:`estimator_list`, 
     390        depending on the given ``condition_value``. 
     391        For detailed description of handling of different combinations of 
     392        parameters, see the inherited :obj:`ConditionalEstimator.__call__`. 
     393 