Changeset 10172:2ab492979b00 in orange


Timestamp: 02/12/12 02:11:46
Author: janezd <janez.demsar@…>
Branch: default
Message: Documentation for Orange.feature.scoring: more cleaning up
Files: 2 edited

  • Orange/feature/selection.py

    --- Orange/feature/selection.py (r10171)
    +++ Orange/feature/selection.py (r10172)
    @@ -6,17 +6,16 @@
     
     
    -def best_n(scores, N):
    -    """Return the best N features (without scores) from the list returned
    -    by :obj:`Orange.feature.scoring.score_all`.
    +def best_n(scores, n):
    +    """Return the best features (without scores) from the list
     
         :param scores: a list such as the one returned by
    -      :obj:`Orange.feature.scoring.score_all`
    -    :type scores: list
    -    :param N: number of features to select.
    -    :type N: int
    +      :obj:`~Orange.feature.scoring.score_all`
    +    :type scores: list
    +    :param n: number of features to select.
    +    :type n: int
         :rtype: :obj:`list`
     
         """
    -    return [x[0] for x in sorted(scores)[:N]]
    +    return [x[0] for x in sorted(scores)[:n]]
     
     bestNAtts = best_n
    @@ -24,12 +23,11 @@
     
     def above_threshold(scores, threshold=0.0):
    -    """Return features (without scores) from the list returned by
    -    :obj:`Orange.feature.scoring.score_all` with score above or
    +    """Return features (without scores) with scores above or
         equal to a specified threshold.
     
         :param scores: a list such as one returned by
    -      :obj:`Orange.feature.scoring.score_all`
    -    :type scores: list
    -    :param threshold: threshold for selection. Defaults to 0.
    +      :obj:`~Orange.feature.scoring.score_all`
    +    :type scores: list
    +    :param threshold: threshold for selection
         :type threshold: float
         :rtype: :obj:`list`
    @@ -42,19 +40,18 @@
     
     
    -def select_best_n(data, scores, N):
    +def select_best_n(data, scores, n):
         """Construct and return a new data table that includes a
    -    class and only N best features from a list scores.
    -
    -    :param data: an example table
    -    :type data: Orange.data.table
    +    class and only the best features from a list scores.
    +
    +    :param data: a data table
    +    :type data: :obj:`Orange.data.Table`
         :param scores: a list such as the one returned by
    -      :obj:`Orange.feature.scoring.score_all`
    -    :type scores: list
    -    :param N: number of features to select
    -    :type N: int
    -    :rtype: new data table
    -
    -    """
    -    return data.select(best_n(scores, N) + [data.domain.classVar.name])
    +      :obj:`~Orange.feature.scoring.score_all`
    +    :type scores: list
    +    :param n: number of features to select
    +    :type n: int
    +    :rtype: :obj:`Orange.data.Table`
    +    """
    +    return data.select(best_n(scores, n) + [data.domain.classVar.name])
     
     selectBestNAtts = select_best_n
    @@ -64,16 +61,15 @@
         """Construct and return a new data table that includes a class and
         features from the list returned by
    -    :obj:`Orange.feature.scoring.score_all` that have the score above or
    +    :obj:`~Orange.feature.scoring.score_all` that have the score above or
         equal to a specified threshold.
     
    -    :param data: an example table
    -    :type data: Orange.data.table
    +    :param data: a data table
    +    :type data: :obj:`Orange.data.Table`
         :param scores: a list such as the one returned by
    -      :obj:`Orange.feature.scoring.score_all`
    -    :type scores: list
    -    :param threshold: threshold for selection. Defaults to 0.
    +      :obj:`~Orange.feature.scoring.score_all`
    +    :type scores: list
    +    :param threshold: threshold for selection
         :type threshold: float
    -    :rtype: new data table
    -
    +    :rtype: :obj:`Orange.data.Table`
         """
         return data.select(above_threshold(scores, threshold) + \
    @@ -90,9 +86,9 @@
         features. The score is thus recomputed in each iteration.
     
    -    :param data: an data table
    -    :type data: Orange.data.table
    -    :param measure: a feature scorer (derived from
    -      :obj:`Orange.feature.scoring.Measure`)
    -    :param margin: margin for removal. Defaults to 0.
    +    :param data: a data table
    +    :type data: :obj:`Orange.data.Table`
    +    :param measure: a feature scorer
    +    :type measure: :obj:`Orange.feature.scoring.Score`
    +    :param margin: margin for removal
         :type margin: float
     
    @@ -108,16 +104,14 @@
     
     class FilterAboveThreshold(object):
    -    """Store filter parameters that are later called with the data to
    -    return the data table with only selected features.
    -
    -    This class uses :obj:`select_above_threshold`.
    -
    -    :param measure: an attribute measure (derived from
    -      :obj:`Orange.feature.scoring.Measure`). Defaults to
    -      :obj:`Orange.feature.scoring.Relief` for k=20 and m=50.
    -    :param threshold: score threshold for attribute selection. Defaults to 0.
    +    """A class wrapper around :obj:`select_above_threshold`; the
    +    constructor stores the filter parameters that are applied when the
    +    function is called.
    +
    +    :param measure: a feature scorer
    +    :type measure: :obj:`Orange.feature.scoring.Score`
    +    :param threshold: threshold for selection. Defaults to 0.
         :type threshold: float
     
    -    Some examples of how to use this class are::
    +    Some examples of how to use this class::
     
             filter = Orange.feature.selection.FilterAboveThreshold(threshold=.15)
    @@ -132,5 +126,4 @@
                     measure=orange.MeasureAttribute_relief(k=20, m=50),
                     threshold=0.0):
    -
             if data is None:
                 self = object.__new__(cls)
    @@ -142,13 +135,13 @@
         def __init__(self, measure=orange.MeasureAttribute_relief(k=20, m=50), \
                      threshold=0.0):
    -
             self.measure = measure
             self.threshold = threshold
     
         def __call__(self, data):
    -        """Return data table features with scores above given threshold.
    +        """Return data table features that have scores above given
    +        threshold.
     
             :param data: data table
    -        :type data: Orange.data.table
    +        :type data: Orange.data.Table
     
             """
    @@ -161,11 +154,11 @@
     
     class FilterBestN(object):
    -    """Store filter parameters that are later called with the data to
    -    return the data table with only selected features.
    -
    -    :param measure: an attribute measure (derived from
    -      :obj:`Orange.feature.scoring.Measure`). Defaults to
    -      :obj:`Orange.feature.scoring.Relief` for k=20 and m=50.
    -    :param n: number of best features to return. Defaults to 5.
    +    """A class wrapper around :obj:`select_best_n`; the
    +    constructor stores the filter parameters that are applied when the
    +    function is called.
    +
    +    :param measure: a feature scorer
    +    :type measure: :obj:`Orange.feature.scoring.Score`
    +    :param n: number of features to select
         :type n: int
     
    @@ -197,11 +190,11 @@
     
     class FilterRelief(object):
    -    """Store filter parameters that are later called with the data to
    -    return the data table with only selected features.
    -
    -    :param measure: an attribute measure (derived from
    -      :obj:`Orange.feature.scoring.Measure`). Defaults to
    -      :obj:`Orange.feature.scoring.Relief` for k=20 and m=50.
    -    :param margin: margin for Relief scoring. Defaults to 0.
    +    """A class wrapper around :obj:`select_best_n`; the
    +    constructor stores the filter parameters that are applied when the
    +    function is called.
    +
    +    :param measure: a feature scorer
    +    :type measure: :obj:`Orange.feature.scoring.Score`
    +    :param margin: margin for Relief scoring
         :type margin: float
     
    @@ -233,15 +226,6 @@
     
     class FilteredLearner(object):
    -    """Return the learner that wraps :obj:`Orange.classification.baseLearner`
    -    and a data selection method.
    -
    -    When calling the learner with a data table, data is first filtered and
    -    then passed to :obj:`Orange.classification.baseLearner`. This comes handy
    -    when one wants to test the schema of feature-subset-selection-and-learning
    -    by some repetitive evaluation method, e.g., cross validation.
    -
    -    :param filter: defatuls to
    -      :obj:`Orange.feature.selection.FilterAboveThreshold`
    -    :type filter: Orange.feature.selection.FilterAboveThreshold
    +    """A learner that applies the given features selection method and
    +    then calls the base learner. This learner is needed to properly cross-validate a combination of feature selection and learning.
     
         Here is an example of how to build a wrapper around naive Bayesian learner
  • docs/reference/rst/Orange.feature.selection.rst

    --- docs/reference/rst/Orange.feature.selection.rst (r10171)
    +++ docs/reference/rst/Orange.feature.selection.rst (r10172)
    @@ -22,4 +22,36 @@
             synfuels-corporation-cutback
     
    +The module also includes a learner that incorporates feature subset
    +selection.
    +
    +--------------------------------------
    +Functions for feature subset selection
    +--------------------------------------
    +
    +.. automethod:: Orange.feature.selection.best_n
    +
    +.. automethod:: Orange.feature.selection.above_threshold
    +
    +.. automethod:: Orange.feature.selection.select_best_n
    +
    +.. automethod:: Orange.feature.selection.select_above_threshold
    +
    +.. automethod:: Orange.feature.selection.select_relief(data, measure=Orange.feature.scoring.Relief(k=20, m=10), margin=0)
    +
    +--------------------------------------
    +Learning with feature subset selection
    +--------------------------------------
    +
    +.. autoclass:: Orange.feature.selection.FilteredLearner(base_learner, filter=FilterAboveThreshold(), name=filtered)
    +   :members:
    +
    +.. autoclass:: Orange.feature.selection.FilteredClassifier
    +   :members:
    +
    +
    +--------------------------------------
    +Class wrappers for selection functions
    +--------------------------------------
    +
     .. autoclass:: Orange.feature.selection.FilterAboveThreshold(data=None, measure=Orange.feature.scoring.Relief(k=20, m=50), threshold=0.0)
        :members:
    @@ -31,22 +63,5 @@
        :members:
     
    -.. autoclass:: Orange.feature.selection.FilteredLearner(baseLerner, filter=FilterAboveThreshold(), name=filtered)
    -   :members:
     
    -.. autoclass:: Orange.feature.selection.FilteredClassifier
    -   :members:
    -
    -These functions support the design of feature subset selection for
    -classification problems.
    -
    -.. automethod:: Orange.feature.selection.best_n
    -
    -.. automethod:: Orange.feature.selection.above_threshold
    -
    -.. automethod:: Orange.feature.selection.select_best_n
    -
    -.. automethod:: Orange.feature.selection.select_above_threshold
    -
    -.. automethod:: Orange.feature.selection.select_relief
     
     .. rubric:: Examples
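
The docstrings touched by this changeset describe two selection rules (keep the best n features, or keep features scoring at or above a threshold) and class wrappers whose constructor stores the parameters that are applied on call. The following is a minimal, Orange-free sketch of that behaviour, not the module's implementation. It assumes `scores` is a list of `(feature_name, score)` pairs; the `*_sketch` names and the sample scores are hypothetical. The sketch sorts explicitly by score in descending order, which is an assumption about the intent and differs from the `sorted(scores)[:n]` expression in the diff (plain `sorted` on pairs orders by the first element of each pair).

```python
# Illustrative stand-ins only; none of these names exist in Orange.
# Assumption: `scores` is a list of (feature_name, score) pairs.


def best_n_sketch(scores, n):
    """Return the names of the n highest-scoring features."""
    # Sort by the score (second element), highest first.
    ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:n]]


def above_threshold_sketch(scores, threshold=0.0):
    """Return the names of features whose score is >= threshold."""
    return [name for name, score in scores if score >= threshold]


class FilterBestNSketch(object):
    """Class wrapper: the constructor stores n, the call applies it."""

    def __init__(self, n=5):
        self.n = n

    def __call__(self, scores):
        return best_n_sketch(scores, self.n)


# Made-up scores, purely for illustration.
example_scores = [("f1", 0.05), ("f2", 0.40), ("f3", 0.15)]

top_two = best_n_sketch(example_scores, 2)
kept = above_threshold_sketch(example_scores, threshold=0.15)
wrapped = FilterBestNSketch(n=1)(example_scores)
```

Storing the parameters in the constructor lets the same configured filter be reapplied to different data, which is what a wrapping learner needs when the selection has to be repeated inside each cross-validation fold.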