Changeset 7405:6ab509c36d14 in orange


Timestamp:
02/04/11 11:20:46 (3 years ago)
Author:
tomazc <tomaz.curk@…>
Branch:
default
Convert:
eaf3feee42283e137e769ff59f13111eb459e056
Message:
 
Location:
orange/Orange/feature
Files:
4 edited

  • orange/Orange/feature/__init__.py

    r7387 r7405  
    11""" 
     2 
    23.. index:: feature 
    34 
     
    67interaction analysis. 
    78 
    8 ================== 
     9======= 
    910Scoring 
    10 ================== 
    11  
    12 .. index:: feature scoring 
     11======= 
    1312 
    1413.. automodule:: Orange.feature.scoring 
    1514 
    16 ================== 
     15========= 
    1716Selection 
    18 ================== 
    19  
    20 .. index:: feature selection 
     17========= 
    2118 
    2219.. automodule:: Orange.feature.selection 
    2320 
    24 ================== 
     21============== 
    2522Discretization 
    26 ================== 
    27  
    28 .. index:: discretization 
     23============== 
    2924 
    3025.. automodule:: Orange.feature.discretization 
    3126 
    32 ================== 
     27============== 
    3328Continuization 
    34 ================== 
     29============== 
    3530 
    3631.. index:: continuization 
     
    3833.. automodule:: Orange.feature.continuization 
    3934 
    40 ================== 
     35========== 
    4136Imputation 
    42 ================== 
    43  
    44 .. index:: imputation 
     37========== 
    4538 
    4639.. automodule:: Orange.feature.imputation 
  • orange/Orange/feature/discretization.py

    r7400 r7405  
    11""" 
     2 
     3.. index:: discretization 
     4 
     5.. index::  
     6   single: feature; discretization 
    27 
    38This module implements some functions and classes that can be used for 
     
    712categorization that can be used for learning. 
    813 
    9 .. method:: entropyDiscretization(table) 
     14.. automethod:: Orange.feature.discretization.entropyDiscretization 
    1015 
    11     Take the classified data set (table) and categorize all continuous 
     16.. autoclass:: Orange.feature.discretization.EntropyDiscretization 
     17 
     18.. autoclass:: Orange.feature.discretization.DiscretizedLearner_Class 
     19 
     20 
     21 
     22======== 
     23Examples 
     24======== 
     25 
     26A chapter on `feature subset selection <../ofb/o_fss.htm>`_ in Orange 
     27for Beginners tutorial shows the use of DiscretizedLearner. Other 
     28discretization classes from core Orange are listed in chapter on 
     29`categorization <../ofb/o_categorization.htm>`_ of the same tutorial. 
     30 
     31.. note:: 
     32    add from reference http://orange.biolab.si/doc/reference/discretization.htm 
     33 
     34========== 
     35References 
     36========== 
     37 
     38* UM Fayyad and KB Irani. Multi-interval discretization of continuous valued 
     39  attributes for classification learning. In Proceedings of the 13th 
     40  International Joint Conference on Artificial Intelligence, pages 
     41  1022--1029, Chambery, France, 1993. 
     42 
     43""" 
     44 
     45import Orange.core as orange 
     46 
     47from Orange.core import \ 
     48    Discrete2Continuous, \ 
     49    Discretizer, \ 
     50        BiModalDiscretizer, \ 
     51        EquiDistDiscretizer, \ 
     52        IntervalDiscretizer, \ 
     53        ThresholdDiscretizer 
     54 
     55###### 
     56# from orngDics.py 
     57def entropyDiscretization(table): 
     58    """Take the classified data set (table) and categorize all continuous 
    1259    features using the entropy based discretization  
    13     :obj:`EntropyDiscretization`. After categorization,  
    14     features that were categorized to a single interval (to a constant value)  
    15     are removed from table. Returns the data set that includes all categorical 
    16     and discretized continuous features from the original data table. 
     60    :obj:`EntropyDiscretization`. 
     61     
     62    :param table: data to discretize. 
     63    :type table: Orange.data.Table 
     64     
     65    After categorization, features that were categorized to a single interval 
     66    (to a constant value) are removed from the table. 
     67    Returns a table that includes all categorical and discretized 
     68    continuous features from the original data table. 
    1769 
    18 .. class:: EntropyDiscretization 
     70    """ 
     71    orange.setrandseed(0) 
     72    tablen=orange.Preprocessor_discretize(table, method=orange.EntropyDiscretization()) 
     73     
     74    attrlist=[] 
     75    nrem=0 
     76    for i in tablen.domain.attributes: 
     77        if (len(i.values)>1): 
     78            attrlist.append(i) 
     79        else: 
     80            nrem=nrem+1 
     81    attrlist.append(tablen.domain.classVar) 
     82    return tablen.select(attrlist) 
    1983 
    20     This is simple wrapper class around the function 
    21     :obj:`entropyDiscretization`. Once invoked it would either create an 
    22     object that can be passed a data set for discretization, or if invoked 
    23     with the data set, would return a discretized data set:: 
     84 
     85class EntropyDiscretization: 
     86    """This is a simple wrapper class around the function  
     87    :obj:`entropyDiscretization`.  
     88     
     89    :param data: data to discretize. 
     90    :type data: Orange.data.Table 
     91     
     92    Once invoked it would either create an object that can be passed a data 
     93    set for discretization, or if invoked with the data set, would return a 
     94    discretized data set:: 
    2495 
    2596        discretizer = Orange.feature.discretization.EntropyDiscretization() 
     
    2798        another_disc_data = Orange.feature.discretization.EntropyDiscretization(table) 
    2899 
    29 .. class:: DiscretizedLearner([baseLearner[, table[, discretizer[, name]]]]) 
     100    """ 
     101    def __call__(self, data): 
     102        return entropyDiscretization(data) 
    30103 
    31     This class allows to set an learner object, such that before learning a 
     104def DiscretizedLearner(baseLearner, examples=None, weight=0, **kwds): 
     105  learner = apply(DiscretizedLearner_Class, [baseLearner], kwds) 
     106  if examples: return learner(examples, weight) 
     107  else: return learner 
     108 
     109class DiscretizedLearner_Class: 
     110    """This class wraps a learner object so that, before learning, the 
    32111    data passed to the learner is discretized. In this way we can prepare an  
    33112    object that learns without giving it the data, and, for instance, use it in 
     
    58137        classifier = Orange.feature.discretization.DiscretizedLearner(bayes, examples=table) 
    59138 
    60 ======== 
    61 Examples 
    62 ======== 
    63  
    64 A chapter on `feature subset selection <../ofb/o_fss.htm>`_ in Orange 
    65 for Beginners tutorial shows the use of DiscretizedLearner. Other 
    66 discretization classes from core Orange are listed in chapter on 
    67 `categorization <../ofb/o_categorization.htm>`_ of the same tutorial. 
    68  
    69  
    70 .. note:: 
    71     add from reference http://orange.biolab.si/doc/reference/discretization.htm 
    72  
    73 ========== 
    74 References 
    75 ========== 
    76  
    77 * UM Fayyad and KB Irani. Multi-interval discretization of continuous valued 
    78   attributes for classification learning. In Proceedings of the 13th 
    79   International Joint Conference on Artificial Intelligence, pages 
    80   1022--1029, Chambery, France, 1993. 
    81  
    82 """ 
    83  
    84 import Orange.core as orange 
    85  
    86 from Orange.core import \ 
    87     Discrete2Continuous, \ 
    88     Discretizer, \ 
    89         BiModalDiscretizer, \ 
    90         EquiDistDiscretizer, \ 
    91         IntervalDiscretizer, \ 
    92         ThresholdDiscretizer 
    93  
    94 ###### 
    95 # from orngDics.py 
    96 def entropyDiscretization(data): 
    97   """ 
    98   Discretizes continuous attributes using the entropy based discretization. 
    99   It removes the attributes discretized to a single interval and prints their names. 
    100   Arguments: data 
    101   Returns:   table of examples with discretized atributes. Attributes that are 
    102              categorized to a single value (constant) are removed. 
    103  
    104   """ 
    105   orange.setrandseed(0) 
    106   tablen=orange.Preprocessor_discretize(data, method=orange.EntropyDiscretization()) 
    107  
    108   attrlist=[] 
    109   nrem=0 
    110   for i in tablen.domain.attributes: 
    111     if (len(i.values)>1): 
    112       attrlist.append(i) 
    113     else: 
    114       nrem=nrem+1 
    115  
    116   attrlist.append(tablen.domain.classVar) 
    117   return tablen.select(attrlist) 
    118  
    119  
    120 class EntropyDiscretization: 
    121   def __call__(self, data): 
    122     return entropyDiscretization(data) 
    123  
    124  
    125 def DiscretizedLearner(baseLearner, examples=None, weight=0, **kwds): 
    126   learner = apply(DiscretizedLearner_Class, [baseLearner], kwds) 
    127   if examples: return learner(examples, weight) 
    128   else: return learner 
    129  
    130 class DiscretizedLearner_Class: 
    131   def __init__(self, baseLearner, discretizer=EntropyDiscretization(), **kwds): 
    132     self.baseLearner = baseLearner 
    133     if hasattr(baseLearner, "name"): 
    134       self.name = baseLearner.name 
    135     self.discretizer = discretizer 
    136     self.__dict__.update(kwds) 
    137   def __call__(self, data, weight=None): 
    138     # filter the data and then learn 
    139     ddata = self.discretizer(data) 
    140     if weight<>None: 
    141       model = self.baseLearner(ddata, weight) 
    142     else: 
    143       model = self.baseLearner(ddata) 
    144     dcl = DiscretizedClassifier(classifier = model) 
    145     if hasattr(model, "domain"): 
    146       dcl.domain = model.domain 
    147     if hasattr(model, "name"): 
    148       dcl.name = model.name 
    149     return dcl 
     139    """ 
     140    def __init__(self, baseLearner, discretizer=EntropyDiscretization(), **kwds): 
     141        self.baseLearner = baseLearner 
     142        if hasattr(baseLearner, "name"): 
     143            self.name = baseLearner.name 
     144        self.discretizer = discretizer 
     145        self.__dict__.update(kwds) 
     146    def __call__(self, data, weight=None): 
     147        # filter the data and then learn 
     148        ddata = self.discretizer(data) 
     149        if weight<>None: 
     150            model = self.baseLearner(ddata, weight) 
     151        else: 
     152            model = self.baseLearner(ddata) 
     153        dcl = DiscretizedClassifier(classifier = model) 
     154        if hasattr(model, "domain"): 
     155            dcl.domain = model.domain 
     156        if hasattr(model, "name"): 
     157            dcl.name = model.name 
     158        return dcl 
    150159 
    151160class DiscretizedClassifier: 
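
The pattern in discretization.py above (preprocess the data, drop features that collapse to a single value, then hand the result to a base learner) can be sketched without Orange. This is a minimal illustration in plain Python 3, not the code from this changeset: `drop_constant_columns`, `MajorityLearner`, `PreprocessedLearner` and `make_learner` are hypothetical stand-ins, and rows are assumed to be plain lists whose last element is the class label.

```python
# Hypothetical stand-ins for illustration; none of these names come from Orange.
from collections import Counter


def drop_constant_columns(rows, n_features):
    """Keep only feature columns that take more than one distinct value.

    Each row is a list of n_features values followed by the class label.
    Returns the filtered rows and the indices of the columns that were kept.
    """
    keep = [j for j in range(n_features)
            if len({row[j] for row in rows}) > 1]
    filtered = [[row[j] for j in keep] + [row[-1]] for row in rows]
    return filtered, keep


class MajorityLearner:
    """Toy base learner: the classifier always predicts the most common label."""

    def __call__(self, rows):
        label = Counter(row[-1] for row in rows).most_common(1)[0][0]
        return lambda example: label


class PreprocessedLearner:
    """Apply a preprocessing step to the data, then call the base learner."""

    def __init__(self, base_learner, preprocess):
        self.base_learner = base_learner
        self.preprocess = preprocess

    def __call__(self, rows):
        return self.base_learner(self.preprocess(rows))


def make_learner(base_learner, rows=None, preprocess=lambda rows: rows):
    """Return a learner, or a trained classifier if rows are given."""
    learner = PreprocessedLearner(base_learner, preprocess)
    return learner(rows) if rows is not None else learner
```

The factory mirrors the two calling styles of `DiscretizedLearner`: without data it returns an object that can learn later, and with data it returns a classifier directly. The sketch tests `rows is not None`, whereas the changeset tests the truth value of `examples`; that difference is a choice made here, not something the source prescribes.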
  • orange/Orange/feature/scoring.py

    r7395 r7405  
    11""" 
     2 
     3.. index:: feature scoring 
     4 
     5.. index::  
     6   single: feature; feature scoring 
     7 
    28Feature scoring is used in feature subset selection for classification 
    39problems. The goal is to find "good" features that are relevant for the given 
  • orange/Orange/feature/selection.py

    r7385 r7405  
    11""" 
     2 
    23.. index:: feature selection 
     4 
     5.. index::  
     6   single: feature; feature selection 
    37 
    48Some machine learning methods may perform better if they learn only from a  