Changeset 8160:b22f1ee57610 in orange
Timestamp: 08/09/11 13:09:07 (3 years ago)
Branch: default
Convert: d7323a465a9964e3e8b1013af46752ba4c04da16
File: 1 edited

--- orange/Orange/feature/scoring.py (r8157)
+++ orange/Orange/feature/scoring.py (r8160)

@@ -12 +12 @@
 prediction of the dependant (class) variable.
 
-To compute the information gain of feature "tear_rate" in the Lenses data set (loaded into `data`) use:
+To compute the information gain of feature "tear_rate" in the Lenses data set (loaded into ``data``) use:
 
 >>> meas = Orange.feature.scoring.InfoGain()
@@ -20 +20 @@
 Apart from information gain you could also use other scoring methods;
-:ref:`classification` and :ref:`regression`. For various
-ways to call them see :ref:`callingscore`.
+see :ref:`classification` and :ref:`regression`. Various
+ways to call them are described on :ref:`callingscore`.
 
 It is possible to construct the object and use
@@ -90 +90 @@
 contingency itself.
 
-Not all classes will accept all kinds of arguments. :obj:`Relief`,
+Not all classes accept all kinds of arguments. :obj:`Relief`,
 for instance, only supports the form with instances on the input.
@@ -194 +194 @@
 score can be constructed as follows::
 
-.. comment:: opposite error - is this term correct? TODO
 
 >>> meas = Orange.feature.scoring.Cost()
@@ -203 +202 @@
 Knowing the value of feature 3 would decrease the
 classification cost for approximately 0.083 per instance.
+
+.. comment:: opposite error - is this term correct? TODO
 
 .. index::
@@ -229 +230 @@
 Check if the cached data is changed with data checksum. Slow
-on large tables. Defaults to True. Disable it if you know that
+on large tables. Defaults to :obj:`True`. Disable it if you know that
 the data will not change.
@@ -443 +444 @@
 .. [Breiman1984] L Breiman et al: Classification and Regression Trees, Chapman and Hall, 1984.
 
+.. [Kononenko1995] I Kononenko: On biases in estimating multivalued attributes, International Joint Conference on Artificial Intelligence, 1995.
+
 .. _iris.tab: code/iris.tab
@@ -482 +484 @@
 A scoring method derived from :obj:`~Orange.feature.scoring.Score`.
-If None, :obj:`Relief` with m=5 and k=10 will be used.
+If :obj:`None`, :obj:`Relief` with m=5 and k=10 will be used.
 
 """
@@ -512 +514 @@
 class Distance(Score):
-    """The 1-D feature distance score described in [Kononenko2007]_. TODO"""
+    """The :math:`1-D` distance is defined as information gain divided
+    by joint entropy :math:`H_{CA}` (:math:`C` is the class variable
+    and :math:`A` the feature):
+
+    .. math::
+        1-D(C,A) = \\frac{\\mathrm{Gain}(A)}{H_{CA}}
+    """
 
     @Orange.misc.deprecated_keywords({"aprioriDist": "apriori_dist"})
@@ -556 +564 @@
 class MDL(Score):
-    """Score feature based on the minimum description length principle. TODO."""
+    """Minimum description length principle [Kononenko1995]_. Let
+    :math:`n` be the number of instances, :math:`n_0` the number of
+    classes, and :math:`n_{cj}` the number of instances with feature
+    value :math:`j` and class value :math:`c`. Then the MDL score for
+    feature A is
+
+    .. math::
+        \\mathrm{MDL}(A) = \\frac{1}{n} \\Bigg[
+        \\log\\binom{n}{n_{1.},\\cdots,n_{n_0 .}} - \\sum_j
+        \\log\\binom{n_{.j}}{n_{1j},\\cdots,n_{n_0 j}} \\\\
+        + \\log\\binom{n+n_0-1}{n_0-1} - \\sum_j \\log
+        \\binom{n_{.j}+n_0-1}{n_0-1}
+        \\Bigg]
+    """
 
     @Orange.misc.deprecated_keywords({"aprioriDist": "apriori_dist"})
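The new `Distance` docstring defines the 1-D distance as information gain divided by the joint entropy H_CA. A minimal plain-Python sketch of that formula (a hypothetical helper, not Orange's implementation; it assumes a contingency table of raw counts with classes as rows and feature values as columns, and a nonzero joint entropy):

```python
from math import log2

def one_d_distance(contingency):
    # contingency[c][a]: count of instances with class value c and
    # feature value a.
    n = sum(sum(row) for row in contingency)

    def entropy(counts):
        return -sum(c / n * log2(c / n) for c in counts if c)

    h_class = entropy([sum(row) for row in contingency])          # H(C)
    h_feature = entropy([sum(col) for col in zip(*contingency)])  # H(A)
    h_joint = entropy([c for row in contingency for c in row])    # H(C,A)
    gain = h_class + h_feature - h_joint   # Gain(A) = H(C) - H(C|A)
    return gain / h_joint                  # 1-D(C,A)
```

A perfectly predictive feature scores 1 (gain equals joint entropy), an independent one scores 0.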
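The MDL formula in the new `MDL` docstring compares the description length of the class labels before and after splitting by the feature, using log multinomial and binomial coefficients. A sketch of that computation (again a hypothetical helper, not Orange's code; `lgamma` gives exact log factorials without overflow):

```python
from math import lgamma, log

def log2_multinomial(total, parts):
    # log2 of total! / (parts[0]! * parts[1]! * ...)
    return (lgamma(total + 1) - sum(lgamma(p + 1) for p in parts)) / log(2)

def log2_binomial(n, k):
    return log2_multinomial(n, [k, n - k])

def mdl_score(contingency):
    # contingency[c][j]: count of instances with class c, feature value j.
    n = sum(sum(row) for row in contingency)
    n0 = len(contingency)                              # number of classes
    class_totals = [sum(row) for row in contingency]   # n_{c.}
    columns = list(zip(*contingency))                  # per value j: n_{cj}

    # Description length of the labels without the feature...
    prior = (log2_multinomial(n, class_totals)
             + log2_binomial(n + n0 - 1, n0 - 1))
    # ...and with the labels partitioned by feature value j.
    post = sum(log2_multinomial(sum(col), list(col))
               + log2_binomial(sum(col) + n0 - 1, n0 - 1)
               for col in columns)
    return (prior - post) / n
```

Informative features shorten the per-instance description and score positive; useless ones can score slightly negative due to the model-encoding terms.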