Changeset 7385:3f8f4fa60cea in orange


Timestamp:
02/04/11 09:39:42 (3 years ago)
Author:
tomazc <tomaz.curk@…>
Branch:
default
Convert:
8c5c3391f2b0f1a95f11b64568b0813de83267ea
Message:

Documentation and code refactoring at Bohinj retreat.

Location:
orange/Orange/feature
Files:
3 edited

  • orange/Orange/feature/__init__.py

    r7290 r7385  
    @@ -11,5 +11,4 @@
     
     .. automodule:: Orange.feature.scoring
    -   :members:
     
     ==================
    @@ -28,5 +27,4 @@
     
     .. automodule:: Orange.feature.discretization
    -   :members:
     
     ==================
    @@ -37,5 +35,4 @@
     
     .. automodule:: Orange.feature.continuization
    -   :members:
     
     ==================
    @@ -46,5 +43,4 @@
     
     .. automodule:: Orange.feature.imputation
    -   :members:
     
     """
  • orange/Orange/feature/scoring.py

    r7382 r7385  
    @@ -40,27 +40,4 @@
     
     .. automethod:: Orange.feature.scoring.attMeasure
    -
    -
    -========================
    -Different Score Measures
    -========================
    -
    -.. note: add links to gain ratio, relief and other feature scores
    -
    -The following script reports on gain ratio and relief feature scores.
    -
    -`scoring-relief-gainRatio.py`_ (uses `voting.tab`_):
    -
    -.. literalinclude:: code/scoring-relief-gainRatio.py
    -    :lines: 7-
    -    
    -Notice that on this data the ranks of features match rather well::
    -    
    -    Relief GainRt Feature
    -    0.613  0.752  physician-fee-freeze
    -    0.255  0.444  el-salvador-aid
    -    0.228  0.414  synfuels-corporation-cutback
    -    0.189  0.382  crime
    -    0.166  0.345  adoption-of-the-budget-resolution
     
     ============
    @@ -282,4 +259,20 @@
     ====================================
     
    +This script scores features with gain ratio and relief.
    +
    +`scoring-relief-gainRatio.py`_ (uses `voting.tab`_):
    +
    +.. literalinclude:: code/scoring-relief-gainRatio.py
    +    :lines: 7-
    +
    +Notice that on this data the ranks of features match rather well::
    +    
    +    Relief GainRt Feature
    +    0.613  0.752  physician-fee-freeze
    +    0.255  0.444  el-salvador-aid
    +    0.228  0.414  synfuels-corporation-cutback
    +    0.189  0.382  crime
    +    0.166  0.345  adoption-of-the-budget-resolution
    +
     The following section describes the attribute quality measures suitable for
     discrete features and outcomes.
    @@ -288,23 +281,16 @@
     for more examples on their use.
     
    -----------------
    -Information Gain
    -----------------
     .. index::
        single: feature scoring; information gain
     
    -.. class:: Info
    +.. class:: InfoGain
     
         The most popular measure, information gain :obj:`Info` measures the expected
         decrease of the entropy.
     
    -----------
    -Gain Ratio
    -----------
    -
     .. index::
        single: feature scoring; gain ratio
     
    -.. class:: Gain
    +.. class:: GainRatio
     
         Gain ratio :obj:`GainRatio` was introduced by Quinlan in order to avoid
    @@ -314,8 +300,4 @@
         values.)
     
    -----------
    -Gini index
    -----------
    -
     .. index::
        single: feature scoring; gini index
    @@ -327,8 +309,4 @@
         classes.
     
    ----------
    -Relevance
    ----------
    -
     .. index::
        single: feature scoring; relevance
    @@ -340,7 +318,6 @@
         decision rules.
     
    ------
    -Costs
    ------
    +.. index::
    +   single: feature scoring; cost
     
     .. class:: Cost
    @@ -364,8 +341,4 @@
         This tells that knowing the value of attribute 3 would decrease the
         classification cost for appx 0.083 per example.
    -
    --------
    -ReliefF
    --------
     
     .. index::
    @@ -454,8 +427,4 @@
     Except for ReliefF, the only attribute quality measure available for regression
     problems is based on a mean square error.
    -
    ------------------
    -Mean Square Error
    ------------------
     
     .. index::
  • orange/Orange/feature/selection.py

    r7342 r7385  
    @@ -138,6 +138,7 @@
     all the ten cross-validation tests!
     
    +==========
     References
    -----------
    +==========
     
     * K. Kira and L. Rendell. A practical approach to feature selection. In