Changeset 10170:c99ba9ecf870 in orange


Timestamp:
02/12/12 00:35:25
Author:
janezd <janez.demsar@…>
Branch:
default
Message:

Documentation for Orange.feature.scoring: minor changes in the first part

File:
1 edited

  • docs/reference/rst/Orange.feature.scoring.rst

--- docs/reference/rst/Orange.feature.scoring.rst (r9988)
+++ docs/reference/rst/Orange.feature.scoring.rst (r10170)
@@ -11,34 +11,34 @@
 
 Feature score is an assessment of the usefulness of the feature for
-prediction of the dependant (class) variable.
-
-To compute the information gain of feature "tear_rate" in the Lenses data set (loaded into ``data``) use:
-
-    >>> meas = Orange.feature.scoring.InfoGain()
-    >>> print meas("tear_rate", data)
-    0.548794925213
-
-Other scoring methods are listed in :ref:`classification` and
-:ref:`regression`. Various ways to call them are described on
-:ref:`callingscore`.
-
-Instead of first constructing the scoring object (e.g. ``InfoGain``) and
-then using it, it is usually more convenient to do both in a single step::
+prediction of the dependent (class) variable. Orange provides classes
+that compute the common feature scores for :ref:`classification
+<classification>` and :ref:`regression <regression>`.
+
+The script below computes the information gain of feature "tear_rate"
+in the Lenses data set (loaded into ``data``):
 
     >>> print Orange.feature.scoring.InfoGain("tear_rate", data)
-    0.548794925213
-
-This way is much slower for Relief that can efficiently compute scores
-for all features in parallel.
-
-It is also possible to score features that do not appear in the data
-but can be computed from it. A typical case are discretized features:
-
-.. literalinclude:: code/scoring-info-iris.py
-    :lines: 7-11
-
-The following example computes feature scores, both with
-:obj:`score_all` and by scoring each feature individually, and prints out
-the best three features.
+    0.548795044422
+
+Calling the scorer by passing the variable and the data to the
+constructor, like above, is convenient. However, when scoring multiple
+variables, some methods run much faster if the scorer is constructed,
+stored and called for each variable.
+
+    >>> gain = Orange.feature.scoring.InfoGain()
+    >>> for feature in data.domain.features:
+    ...     print feature.name, gain(feature, data)
+    age 0.0393966436386
+    prescription 0.0395109653473
+    astigmatic 0.377005338669
+    tear_rate 0.548795044422
+
+The speed gain is most noticeable in Relief, which computes the scores
+of all features in parallel.
+
+The module also provides a convenience function :obj:`score_all` that
+computes the scores for all attributes. The following example computes
+feature scores, both with :obj:`score_all` and by scoring each feature
+individually, and prints out the best three features.
 
 .. literalinclude:: code/scoring-all.py
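The information gain that ``InfoGain`` computes in the snippets above can be sketched in plain Python: the entropy of the class minus its expected entropy after splitting by the feature's values. This is a standalone illustration of the formula, not Orange's implementation; `entropy`, `info_gain`, and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a sequence of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(values, labels):
    """Class entropy minus the expected class entropy after
    splitting the instances by the feature's values."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy data set: a perfectly informative feature scores 1.0,
# an uninformative one scores 0.0.
labels = ["yes", "yes", "no", "no"]
print(info_gain(["a", "a", "b", "b"], labels))  # prints 1.0
print(info_gain(["a", "b", "a", "b"], labels))  # prints 0.0
```

The scorer object in the doctest plays the same role as `info_gain` here: a callable that maps a feature and the data to a single number.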
     
@@ -72,4 +72,9 @@
         0.166  0.345  adoption-of-the-budget-resolution
 
+It is also possible to score features that do not appear in the data
+but can be computed from it. Typical cases are discretized features:
+
+.. literalinclude:: code/scoring-info-iris.py
+    :lines: 7-11
 
 .. _callingscore:
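Scoring a feature that is only derived from the data, as the hunk above describes, can be sketched without Orange: bin a continuous column, then score the binned values. Equal-width binning here is a hypothetical stand-in for Orange's discretization classes, and the iris-like numbers are made up.

```python
from collections import Counter
from math import log2

def equal_width_bins(values, n_bins):
    """Map continuous values to bin indices (hypothetical discretizer,
    standing in for Orange's discretization classes)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def info_gain(values, labels):
    """Class entropy minus expected class entropy after the split."""
    n = len(labels)
    def h(seq):
        return -sum(c / len(seq) * log2(c / len(seq))
                    for c in Counter(seq).values())
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return h(labels) - sum(len(g) / n * h(g) for g in groups.values())

petal_length = [1.4, 1.3, 4.7, 4.5, 6.0, 5.9]
species = ["setosa", "setosa", "versicolor", "versicolor",
           "virginica", "virginica"]
# The binned feature never appears in the data; it is derived from it.
score = info_gain(equal_width_bins(petal_length, 3), species)
```

The literalinclude'd `scoring-info-iris.py` presumably does the Orange equivalent: construct a discretized variable and pass it, with the original data, to a scorer.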
     
@@ -79,9 +84,8 @@
 =======================
 
-To score a feature use :obj:`Score.__call__`. There are diferent
-function signatures, which enable optimization. For instance,
-most scoring methods first compute contingency tables from the
-data. If these are already computed, they can be passed to the scorer
-instead of the data.
+Scorers can be called with different types of arguments. For instance,
+when given the data, most scoring methods first compute the
+corresponding contingency tables. If these are already known, they can
+be given to the scorer instead of the data to save some time.
 
 Not all classes accept all kinds of arguments. :obj:`Relief`,
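The contingency-table shortcut described above can be sketched in plain Python: build the joint (value, class) counts once, then let the scorer work from that table without rescanning the data. `contingency` and `info_gain_from_contingency` are hypothetical helpers, not Orange's API.

```python
from collections import Counter
from math import log2

def contingency(values, labels):
    """Joint counts of (feature value, class), computed once from the data."""
    return Counter(zip(values, labels))

def info_gain_from_contingency(cont):
    """Information gain computed from a precomputed contingency table,
    without touching the data again."""
    n = sum(cont.values())
    class_counts, value_counts = Counter(), Counter()
    for (v, y), c in cont.items():
        class_counts[y] += c
        value_counts[v] += c
    def h(counts, total):
        return -sum(c / total * log2(c / total) for c in counts.values() if c)
    remainder = 0.0
    for v, nv in value_counts.items():
        cond = Counter({y: c for (w, y), c in cont.items() if w == v})
        remainder += nv / n * h(cond, nv)
    return h(class_counts, n) - remainder

values = ["a", "a", "b", "b"]
labels = ["yes", "no", "no", "no"]
cont = contingency(values, labels)        # computed once ...
score = info_gain_from_contingency(cont)  # ... reused by the scorer
```

When several scorers are applied to the same feature, passing the one table to each of them is what saves the repeated pass over the data.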
     
@@ -97,5 +101,5 @@
     :param weightID: id for meta-feature with weight.
 
-    All scoring methods support the first signature.
+    All scoring methods support this form.
 
 .. method:: Score.__call__(attribute, domain_contingency[, apriori_class_distribution])
     
@@ -125,6 +129,6 @@
     :rtype: float or :obj:`Score.Rejected`.
 
-The code below scores the same feature with :obj:`GainRatio`
-using different calls.
+The code demonstrates using the different call signatures by computing
+the score of the same feature with :obj:`GainRatio`.
 
 .. literalinclude:: code/scoring-calls.py
     
@@ -183,5 +187,5 @@
     .. attribute:: cost
 
-        Cost matrix, see :obj:`Orange.classification.CostMatrix` for details.
+        Cost matrix, an instance of :obj:`Orange.misc.CostMatrix`.
 
     If the cost of predicting the first class of an instance that is actually in
     
@@ -271,5 +275,6 @@
     .. attribute:: unknowns_treatment
 
-        What to do with unknown values. See :obj:`Score.unknowns_treatment`.
+        Decides the treatment of unknown values. See
+        :obj:`Score.unknowns_treatment`.
 
     .. attribute:: m
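For context on the ``m`` attribute at the end of the last hunk: in several Orange scorers, ``m`` parametrizes the m-estimate of probability, which smooths an observed relative frequency towards a prior as if ``m`` prior instances had been seen. A minimal sketch of that arithmetic, assuming this reading of ``m`` (a standalone illustration, not Orange's code):

```python
def m_estimate(n_class, n_total, prior, m):
    """m-estimate of probability: blend the observed relative frequency
    n_class / n_total with a prior, weighted by the pseudo-count m."""
    return (n_class + m * prior) / (n_total + m)

# With m = 2 and a uniform prior of 0.5, three successes out of four
# trials are smoothed from 0.75 towards the prior:
p = m_estimate(3, 4, 0.5, 2)  # (3 + 1) / 6, about 0.667
```

Larger ``m`` pulls the estimate harder towards the prior, which is why it appears as a tunable attribute on the scorer.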