Changeset 7291:00460f0b1ad2 in orange


Timestamp: 02/03/11 10:12:20 (3 years ago)
Author:    jzbontar <jure.zbontar@…>
Branch:    default
Convert:   7e35c40185cfac6df70ca302db3df3032fe38a26
Message:   checkpoint

File: 1 edited

Legend: unchanged lines are prefixed with a space, removed lines with '-', added lines with '+'.
  • orange/Orange/classification/logreg.py

r7270 → r7291

 
 Module :obj:`Orange.classification.logreg` is a set of wrappers around
-classes LogisticLearner and LogisticClassifier, that are implemented
+the classes LogisticLearner and LogisticClassifier, that are implemented
 in core Orange. This module extends the use of logistic regression
-to discrete attributes, it helps handling various anomalies in
-attributes, such as constant variables and singularities, that make
+to discrete features, it can handle various anomalies in
+features, such as constant variables and singularities, that make
 fitting logistic regression almost impossible. It also implements a
-function for constructing a stepwise logistic regression, which is a
+function for constructing stepwise logistic regression, which is a
 good technique for preventing overfitting, and is a good feature subset
 selection technique as well.
 
-Functions
+Useful Functions
 ---------
 
     
 --------
 
-First example shows a very simple induction of a logistic regression
+The first example shows a very simple induction of a logistic regression
 classifier (`logreg-run.py`_, uses `titanic.tab`_).
 
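For orientation, a minimal sketch of such an induction; the exact contents of `logreg-run.py`_ are not shown in this changeset, so the loading call and the use of printOUT below are assumptions based on the module's documented API::

    import orange
    from Orange.classification.logreg import LogRegLearner, printOUT

    # load the Titanic data (assumes titanic.tab is on Orange's data path)
    table = orange.ExampleTable("titanic")

    # induce a logistic regression classifier and print its coefficients
    classifier = LogRegLearner(table)
    printOUT(classifier)

    # classify a few examples
    for i in range(5):
        example = table[i]
        print example.getclass(), classifier(example)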
     
                           hours-per-week      -0.04       0.00       -inf       0.00       0.96
 
-In case we set removeSingular to 0, inducing logistic regression
+In case we set :obj:`removeSingular` to 0, inducing a logistic regression
 classifier would return an error::
 
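As a hedged illustration of the behaviour described above (the table `data` and its problematic features are assumed, not taken from the changeset)::

    from Orange.classification.logreg import LogRegLearner

    # removeSingular=1: constants and singular features are dropped
    # automatically before fitting
    classifier = LogRegLearner(data, removeSingular=1)

    # removeSingular=0: the same call is documented to stop with an error
    # classifier = LogRegLearner(data, removeSingular=0)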
     
 
 def printOUT(classifier):
-    """ Formatted print to console of all major attributes in logistic
+    """ Formatted print to console of all major features in logistic
     regression classifier. 
 
     
     # print out the head
     formatstr = "%"+str(longest)+"s %10s %10s %10s %10s %10s"
-    print formatstr % ("Attribute", "beta", "st. error", "wald Z", "P", "OR=exp(beta)")
+    print formatstr % ("Feature", "beta", "st. error", "wald Z", "P", "OR=exp(beta)")
     print
     formatstr = "%"+str(longest)+"s %10.2f %10.2f %10.2f %10.2f"
     
 
 def LogRegLearner(table=None, weightID=0, **kwds):
-    """ Logistic regression learner
-
-    :obj:`LogRegLearner` implements logistic regression. If data
-    instances are provided to the constructor, the learning algorithm
-    is called and the resulting classifier is returned instead of the
-    learner.
-
-    :param table: data set with either discrete or continuous features
+    """ Logistic regression learner.
+
+    Implements logistic regression. If data instances are provided to
+    the constructor, the learning algorithm is called and the resulting
+    classifier is returned instead of the learner.
+
+    :param table: data table with either discrete or continuous features
     :type table: Orange.table.data
     :param weightID: the ID of the weight meta attribute
-    :param removeSingular: set to 1 if you want automatic removal of disturbing attributes, such as constants and singularities
+    :type weightID: int
+    :param removeSingular: set to 1 if you want automatic removal of disturbing features, such as constants and singularities
+    :type removeSingular: bool
     :param fitter: alternate the fitting algorithm (currently the Newton-Raphson fitting algorithm is used)
+    :type fitter: type???
     :param stepwiseLR: set to 1 if you wish to use stepwise logistic regression
-    :param addCrit: parameter for stepwise attribute selection
-    :param deleteCrit: parameter for stepwise attribute selection
-    :param numAttr: parameter for stepwise attribute selection
-
+    :type stepwiseLR: bool
+    :param addCrit: parameter for stepwise feature selection
+    :type addCrit: float
+    :param deleteCrit: parameter for stepwise feature selection
+    :type deleteCrit: float
+    :param numFeatures: parameter for stepwise feature selection
+    :type numFeatures: int
+    :rtype: :obj:`LogRegLearnerClass` or :obj:`LogRegClassifier`
+
     """
     lr = LogRegLearnerClass(**kwds)
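A hedged usage sketch of the parameters documented in this docstring; the parameter values are arbitrary and `data` is assumed to be a previously loaded Orange data table::

    from Orange.classification.logreg import LogRegLearner

    # a learner that performs stepwise feature selection before fitting
    learner = LogRegLearner(stepwiseLR=1, addCrit=0.2, deleteCrit=0.3,
                            numFeatures=5, removeSingular=1)

    # applying the learner to data returns the fitted classifier
    classifier = learner(data)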
     
             addCrit = getattr(self, "addCrit", 0.2)
             removeCrit = getattr(self, "removeCrit", 0.3)
-            numAttr = getattr(self, "numAttr", -1)
-            attributes = StepWiseFSS(examples, addCrit = addCrit, deleteCrit = removeCrit, imputer = imputer, numAttr = numAttr)
+            numFeatures = getattr(self, "numFeatures", -1)
+            attributes = StepWiseFSS(examples, addCrit = addCrit, deleteCrit = removeCrit, imputer = imputer, numFeatures = numFeatures)
             tmpDomain = orange.Domain(attributes, examples.domain.classVar)
             tmpDomain.addmetas(examples.domain.getmetas())
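The reduction that these lines perform inside the learner can be sketched on its own, mirroring the calls shown above; `examples` is assumed to be an orange.ExampleTable loaded elsewhere::

    import orange
    from Orange.classification.logreg import StepWiseFSS

    # choose features stepwise, then build a reduced domain with the class
    selected = StepWiseFSS(examples, addCrit=0.2, deleteCrit=0.3, numFeatures=5)
    tmpDomain = orange.Domain(selected, examples.domain.classVar)
    tmpDomain.addmetas(examples.domain.getmetas())

    # restrict the data to the reduced domain
    reduced = examples.select(tmpDomain)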
     
 
     If :obj:`table` is specified, stepwise logistic regression implemented
-    in :obj:`stepWiseFSS_class` is performed and a list of chosen attributes
+    in :obj:`StepWiseFSS_class` is performed and a list of chosen features
     is returned. If :obj:`table` is not specified an instance of
-    :obj:`stepWiseFSS_class` with all parameters set is returned.
+    :obj:`StepWiseFSS_class` with all parameters set is returned.
 
     :param table: data set
     :type table: Orange.data.table
 
-    :param addCrit: "Alpha" level to judge if variable has enough importance to be added in the new set. (e.g. if addCrit is 0.2, then attribute is added if its P is lower than 0.2)
+    :param addCrit: "Alpha" level to judge if variable has enough importance to be added in the new set. (e.g. if addCrit is 0.2, then a feature is added if its P is lower than 0.2)
     :type addCrit: float
 
     
     :type deleteCrit: float
 
-    :param numAttr: maximum number of selected attributes, use -1 for infinity
-    :type numAttr: int
+    :param numFeatures: maximum number of selected features, use -1 for infinity.
+    :type numFeatures: int
+    :rtype: :obj:`StepWiseFSS_class` or list of features
 
     """
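A hedged example of both calling conventions described above; `data` and the parameter values are assumptions::

    from Orange.classification.logreg import StepWiseFSS

    # with a data table, the selection is run and the chosen features returned
    selected = StepWiseFSS(data, addCrit=0.2, deleteCrit=0.3, numFeatures=5)
    print [feature.name for feature in selected]

    # without data, an instance of StepWiseFSS_class with the parameters set
    # is returned and can be applied later
    fss = StepWiseFSS(addCrit=0.2, deleteCrit=0.3, numFeatures=5)
    selected = fss(data)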
     
     """
       Constructs and returns a new data table that includes a
-      class and attributes selected by stepwise logistic regression. This is an
+      class and features selected by stepwise logistic regression. This is an
       implementation of the algorithm described in [Hosmer and Lemeshow, Applied Logistic Regression, 2000]
 
     
       addCrit: "Alpha" level to judge if variable has enough importance to be added in the new set. (e.g. if addCrit is 0.2, then an attribute is added if its P is lower than 0.2)
       deleteCrit: Similar to addCrit, just that it is used at backward elimination. It should be higher than addCrit!
-      numAttr: maximum number of selected attributes, use -1 for infinity
+      numFeatures: maximum number of selected features, use -1 for infinity
 
     """
     
 
 class StepWiseFSS_class(orange.Learner):
-  """ Performs stepwise logistic regression and returns a list of "most"
-  informative attributes. Each step of this algorithm is composed
-  of two parts. First is backward elimination, where each already
-  chosen attribute is tested for significant contribution to overall
-  model. If worst among all tested attributes has higher significance
-  that is specified in deleteCrit, this attribute is removed from
-  the model. The second step is forward selection, which is similar
-  to backward elimination. It loops through all attributes that are
-  not in model and tests whether they contribute to common model with
-  significance lower that addCrit. Algorithm stops when no attribute
-  in model is to be removed and no attribute out of the model is to
-  be added. By setting numAttr larger than -1, algorithm will stop its
-  execution when number of attributes in model will exceed that number.
+  """ Performs stepwise logistic regression and returns a list of the
+  most "informative" features. Each step of the algorithm is composed
+  of two parts. The first is backward elimination, where each already
+  chosen feature is tested for a significant contribution to the overall
+  model. If the worst among all tested features has higher significance
+  than is specified in :obj:`deleteCrit`, the feature is removed from
+  the model. The second step is forward selection, which is similar to
+  backward elimination. It loops through all the features that are not
+  in the model and tests whether they contribute to the common model
+  with significance lower than :obj:`addCrit`. The algorithm stops when
+  no feature in the model is to be removed and no feature not in the
+  model is to be added. By setting :obj:`numFeatures` larger than -1,
+  the algorithm will stop its execution when the number of features in
+  the model exceeds that number.
 
   Significances are assessed via the likelihood ratio chi-square
     
   """
 
-  def __init__(self, addCrit=0.2, deleteCrit=0.3, numAttr = -1, **kwds):
+  def __init__(self, addCrit=0.2, deleteCrit=0.3, numFeatures = -1, **kwds):
 
     self.__dict__.update(kwds)
     self.addCrit = addCrit
     self.deleteCrit = deleteCrit
-    self.numAttr = numAttr
+    self.numFeatures = numFeatures
   def __call__(self, examples):
     if getattr(self, "imputer", 0):
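The add/remove loop described in the docstring above can be sketched roughly as follows. This is an illustrative reconstruction only: `significance(model, f)` is a hypothetical helper returning the P-value of feature `f` in `model`, standing in for the likelihood ratio test; it is not the module's actual implementation::

    def stepwise_sketch(features, addCrit, deleteCrit, numFeatures, significance):
        chosen, remaining = [], list(features)
        changed = True
        while changed:
            changed = False
            # backward elimination: drop the worst chosen feature if its
            # P-value is above deleteCrit
            if chosen:
                worst = max(chosen, key=lambda f: significance(chosen, f))
                if significance(chosen, worst) > deleteCrit:
                    chosen.remove(worst)
                    remaining.append(worst)
                    changed = True
            # forward selection: add the best remaining feature if its
            # P-value is below addCrit
            if remaining and (numFeatures < 0 or len(chosen) < numFeatures):
                best = min(remaining, key=lambda f: significance(chosen + [f], f))
                if significance(chosen + [best], best) < addCrit:
                    remaining.remove(best)
                    chosen.append(best)
                    changed = True
        return chosen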
     
 
         # if enough attributes have been chosen, stop the procedure
-        if self.numAttr>-1 and len(attr)>=self.numAttr:
+        if self.numFeatures>-1 and len(attr)>=self.numFeatures:
             remain_attr=[]
 
     
 
 class StepWiseFSS_Filter_class(object):
-    def __init__(self, addCrit=0.2, deleteCrit=0.3, numAttr = -1):
+    def __init__(self, addCrit=0.2, deleteCrit=0.3, numFeatures = -1):
         self.addCrit = addCrit
         self.deleteCrit = deleteCrit
-        self.numAttr = numAttr
+        self.numFeatures = numFeatures
     def __call__(self, examples):
-        attr = StepWiseFSS(examples, addCrit=self.addCrit, deleteCrit = self.deleteCrit, numAttr = self.numAttr)
+        attr = StepWiseFSS(examples, addCrit=self.addCrit, deleteCrit = self.deleteCrit, numFeatures = self.numFeatures)
         return examples.select(orange.Domain(attr, examples.domain.classVar))
 
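Finally, a hedged sketch of using the filter class above to obtain a table restricted to the selected features; `data` is again an assumed, previously loaded table::

    from Orange.classification.logreg import StepWiseFSS_Filter_class

    fss = StepWiseFSS_Filter_class(addCrit=0.2, deleteCrit=0.3, numFeatures=5)
    filtered = fss(data)   # new table with the selected features and the class

    print [feature.name for feature in filtered.domain.attributes]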