Changeset 8160:b22f1ee57610 in orange


Timestamp: 08/09/11 13:09:07 (3 years ago)
Author: markotoplak
Branch: default
Convert: d7323a465a9964e3e8b1013af46752ba4c04da16
Message: Updates to Orange.feature.scoring. References #882.
File: 1 edited

  • orange/Orange/feature/scoring.py

r8157 r8160
   12    12  prediction of the dependant (class) variable.
   13    13
   14        To compute the information gain of feature "tear_rate" in the Lenses data set (loaded into `data`) use:
         14  To compute the information gain of feature "tear_rate" in the Lenses data set (loaded into ``data``) use:
   15    15
   16    16      >>> meas = Orange.feature.scoring.InfoGain()

   19    19
   20    20  Apart from information gain you could also use other scoring methods;
   21        :ref:`classification` and :ref:`regression`. For various
   22        ways to call them see :ref:`callingscore`.
         21  see :ref:`classification` and :ref:`regression`. Various
         22  ways to call them are described on :ref:`callingscore`.
   23    23
   24    24  It is possible to construct the object and use

   90    90  contingency itself.
   91    91
   92        Not all classes will accept all kinds of arguments. :obj:`Relief`,
         92  Not all classes accept all kinds of arguments. :obj:`Relief`,
   93    93  for instance, only supports the form with instances on the input.
   94    94

  194   194      score can be constructed as follows::
  195   195
  196            .. comment:: opposite error - is this term correct? TODO
  197   196
  198   197          >>> meas = Orange.feature.scoring.Cost()

  203   202      Knowing the value of feature 3 would decrease the
  204   203      classification cost for approximately 0.083 per instance.
        204
        205      .. comment:: opposite error - is this term correct? TODO
  205   206
  206   207  .. index::

  228   229
  229   230          Check if the cached data is changed with data checksum. Slow
  230                on large tables.  Defaults to True. Disable it if you know that
        231          on large tables.  Defaults to :obj:`True`. Disable it if you know that
  231   232          the data will not change.
  232   233

  443   444  .. [Breiman1984] L Breiman et al: Classification and Regression Trees, Chapman and Hall, 1984.
  444   445
        446  .. [Kononenko1995] I Kononenko: On biases in estimating multi-valued attributes, International Joint Conference on Artificial Intelligence, 1995.
  445   447
  446   448  .. _iris.tab: code/iris.tab

  481   483
  482   484          A scoring method derived from :obj:`~Orange.feature.scoring.Score`.
  483                If None, :obj:`Relief` with m=5 and k=10 will be used.
        485          If :obj:`None`, :obj:`Relief` with m=5 and k=10 will be used.
  484   486
  485   487      """

  511   513
  512   514  class Distance(Score):
  513            """The 1-D feature distance score described in [Kononenko2007]_. TODO"""
        515      """The :math:`1-D` distance is defined as information gain divided
        516      by joint entropy :math:`H_{CA}` (:math:`C` is the class variable
        517      and :math:`A` the feature):
        518
        519      .. math::
        520          1-D(C,A) = \\frac{\\mathrm{Gain}(A)}{H_{CA}}
        521      """
  514   522
  515   523      @Orange.misc.deprecated_keywords({"aprioriDist": "apriori_dist"})

  555   563
  556   564  class MDL(Score):
  557            """Score feature based on the minimum description length principle. TODO."""
        565      """Minimum description length principle [Kononenko1995]_. Let
        566      :math:`n` be the number of instances, :math:`n_0` the number of
        567      classes, and :math:`n_{cj}` the number of instances with feature
        568      value :math:`j` and class value :math:`c`. Then MDL score for the
        569      feature A is
        570
        571      .. math::
        572           \mathrm{MDL}(A) = \\frac{1}{n} \\Bigg[
        573           \\log\\binom{n}{n_{1.},\\cdots,n_{n_0 .}} - \\sum_j
        574           \\log \\binom{n_{.j}}{n_{1j},\\cdots,n_{n_0 j}} \\\\
        575           + \\log \\binom{n+n_0-1}{n_0-1} - \\sum_j \\log
        576           \\binom{n_{.j}+n_0-1}{n_0-1}
        577          \\Bigg]
        578      """
  558   579
  559   580      @Orange.misc.deprecated_keywords({"aprioriDist": "apriori_dist"})
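
The two formulas added to the `Distance` and `MDL` docstrings in this changeset can be checked by hand with a small, illustrative sketch in plain Python. It does not use or reproduce Orange's implementation. The function names `distance_score` and `mdl_score`, the `table[j][c]` layout (count of instances with feature value `j` and class `c`) and the choice of base-2 logarithms are assumptions made for this example; the docstrings do not state a logarithm base.

```python
import math


def _entropy(counts):
    """Shannon entropy (base 2) of a list of non-negative counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def _log2_multinomial(counts):
    """log2 of the multinomial coefficient (sum(counts); counts...)."""
    n = sum(counts)
    ln = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return ln / math.log(2)


def _log2_binomial(n, k):
    """log2 of the binomial coefficient C(n, k)."""
    return _log2_multinomial([k, n - k])


def distance_score(table):
    """1-D(C, A) = Gain(A) / H_CA, with Gain(A) = H_C + H_A - H_CA.

    Hypothetical helper: table[j][c] is the number of instances with
    feature value j and class c.
    """
    class_totals = [sum(col) for col in zip(*table)]   # n_{c.}
    value_totals = [sum(row) for row in table]         # n_{.j}
    joint = [count for row in table for count in row]  # n_{cj}
    h_c, h_a, h_ca = _entropy(class_totals), _entropy(value_totals), _entropy(joint)
    gain = h_c + h_a - h_ca
    # Assumption: return 0 when the joint entropy is 0 (a single non-empty cell).
    return gain / h_ca if h_ca > 0 else 0.0


def mdl_score(table):
    """MDL(A) as written in the docstring, using base-2 logarithms (assumed)."""
    class_totals = [sum(col) for col in zip(*table)]
    n = sum(class_totals)
    n0 = len(class_totals)
    # Terms that do not depend on the feature values.
    prior = _log2_multinomial(class_totals) + _log2_binomial(n + n0 - 1, n0 - 1)
    # Terms summed over the feature values j.
    post = sum(
        _log2_multinomial(row) + _log2_binomial(sum(row) + n0 - 1, n0 - 1)
        for row in table
    )
    return (prior - post) / n
```

With a feature that determines the class exactly, `[[5, 0], [0, 5]]`, this sketch gives a 1-D score of 1 and an MDL score of about 0.63. With a feature that is independent of the class, `[[5, 5], [5, 5]]`, the 1-D score is 0 and the MDL score is negative.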