Changeset 7291:00460f0b1ad2 in orange
 Timestamp:
 02/03/11 10:12:20 (3 years ago)
 Branch:
 default
 Convert:
 7e35c40185cfac6df70ca302db3df3032fe38a26
 File:
 1 edited

orange/Orange/classification/logreg.py
--- orange/Orange/classification/logreg.py (r7270)
+++ orange/Orange/classification/logreg.py (r7291)
@@ -7 +7 @@
 
 Module :obj:`Orange.classification.logreg` is a set of wrappers around
-classes LogisticLearner and LogisticClassifier, that are implemented
+the classes LogisticLearner and LogisticClassifier, that are implemented
 in core Orange. This module extends the use of logistic regression
-to discrete attributes, it helps handling various anomalies in
-attributes, such as constant variables and singularities, that make
+to discrete features, it can handle various anomalies in
+features, such as constant variables and singularities, that make
 fitting logistic regression almost impossible. It also implements a
-function for constructing a stepwise logistic regression, which is a
+function for constructing stepwise logistic regression, which is a
 good technique for prevent overfitting, and is a good feature subset
 selection technique as well.
 
-Functions
+Useful Functions
@@ -32 +32 @@
 
 
-First example shows a very simple induction of a logistic regression
+The first example shows a very simple induction of a logistic regression
 classifier (`logregrun.py`_, uses `titanic.tab`_).
 
@@ -108 +108 @@
 hoursperweek      0.04       0.00        inf       0.00       0.96
 
-In case we set removeSingular to 0, inducing logistic regression
+In case we set :obj:`removeSingular` to 0, inducing a logistic regression
 classifier would return an error::
 
@@ -197 +197 @@
 
 def printOUT(classifier):
-    """ Formatted print to console of all major attributes in logistic
+    """ Formatted print to console of all major features in logistic
     regression classifier.
 
@@ -217 +217 @@
     # print out the head
     formatstr = "%"+str(longest)+"s %10s %10s %10s %10s %10s"
-    print formatstr % ("Attribute", "beta", "st. error", "wald Z", "P", "OR=exp(beta)")
+    print formatstr % ("Feature", "beta", "st. error", "wald Z", "P", "OR=exp(beta)")
     print
     formatstr = "%"+str(longest)+"s %10.2f %10.2f %10.2f %10.2f"
@@ -233 +233 @@
 
 
 def LogRegLearner(table=None, weightID=0, **kwds):
-    """ Logistic regression learner
-
-    :obj:`LogRegLearner` implements logistic regression. If data
-    instances are provided to the constructor, the learning algorithm
-    is called and the resulting classifier is returned instead of the
-    learner.
-
-    :param table: data set with either discrete or continuous features
+    """ Logistic regression learner.
+
+    Implements logistic regression. If data instances are provided to
+    the constructor, the learning algorithm is called and the resulting
+    classifier is returned instead of the learner.
+
+    :param table: data table with either discrete or continuous features
     :type table: Orange.table.data
     :param weightID: the ID of the weight meta attribute
-    :param removeSingular: set to 1 if you want automatic removal of disturbing attributes, such as constants and singularities
+    :type weightID: int
+    :param removeSingular: set to 1 if you want automatic removal of disturbing features, such as constants and singularities
+    :type removeSingular: bool
     :param fitter: alternate the fitting algorithm (currently the NewtonRaphson fitting algorithm is used)
+    :type fitter: type???
     :param stepwiseLR: set to 1 if you wish to use stepwise logistic regression
-    :param addCrit: parameter for stepwise attribute selection
-    :param deleteCrit: parameter for stepwise attribute selection
-    :param numAttr: parameter for stepwise attribute selection
-
+    :type stepwiseLR: bool
+    :param addCrit: parameter for stepwise feature selection
+    :type addCrit: float
+    :param deleteCrit: parameter for stepwise feature selection
+    :type deleteCrit: float
+    :param numFeatures: parameter for stepwise feature selection
+    :type numFeatures: int
+    :rtype: :obj:`LogRegLearnerClass` or :obj:`LogRegClassifier`
+
     """
     lr = LogRegLearnerClass(**kwds)
@@ -274 +281 @@
             addCrit = getattr(self, "addCrit", 0.2)
             removeCrit = getattr(self, "removeCrit", 0.3)
-            numAttr = getattr(self, "numAttr", -1)
-            attributes = StepWiseFSS(examples, addCrit = addCrit, deleteCrit = removeCrit, imputer = imputer, numAttr = numAttr)
+            numFeatures = getattr(self, "numFeatures", -1)
+            attributes = StepWiseFSS(examples, addCrit = addCrit, deleteCrit = removeCrit, imputer = imputer, numFeatures = numFeatures)
             tmpDomain = orange.Domain(attributes, examples.domain.classVar)
             tmpDomain.addmetas(examples.domain.getmetas())
@@ -803 +810 @@
 
 
     If :obj:`table` is specified, stepwise logistic regression implemented
-    in :obj:`stepWiseFSS_class` is performed and a list of chosen attributes
+    in :obj:`StepWiseFSS_class` is performed and a list of chosen features
     is returned. If :obj:`table` is not specified an instance of
-    :obj:`stepWiseFSS_class` with all parameters set is returned.
+    :obj:`StepWiseFSS_class` with all parameters set is returned.
 
     :param table: data set
     :type table: Orange.data.table
 
-    :param addCrit: "Alpha" level to judge if variable has enough importance to be added in the new set. (e.g. if addCrit is 0.2, then attribute is added if its P is lower than 0.2)
+    :param addCrit: "Alpha" level to judge if variable has enough importance to be added in the new set. (e.g. if addCrit is 0.2, then features is added if its P is lower than 0.2)
     :type addCrit: float
 
@@ -816 +823 @@
     :type deleteCrit: float
 
-    :param numAttr: maximum number of selected attributes, use -1 for infinity
-    :type numAttr: int
+    :param numFeatures: maximum number of selected features, use -1 for infinity.
+    :type numFeatures: int
+    :rtype: :obj:`StepWiseFSS_class` or list of features
 
     """
@@ -823 +831 @@
     """
     Constructs and returns a new set of table that includes a
-    class and attributes selected by stepwise logistic regression. This is an
+    class and features selected by stepwise logistic regression. This is an
     implementation of algorithm described in [Hosmer and Lemeshow, Applied Logistic Regression, 2000]
 
@@ -829 +837 @@
     addCrit: "Alpha" level to judge if variable has enough importance to be added in the new set. (e.g. if addCrit is 0.2, then attribute is added if its P is lower than 0.2)
     deleteCrit: Similar to addCrit, just that it is used at backward elimination. It should be higher than addCrit!
-    numAttr: maximum number of selected attributes, use -1 for infinity
+    numFeatures: maximum number of selected features, use -1 for infinity
 
     """
@@ -852 +860 @@
 
 
 class StepWiseFSS_class(orange.Learner):
-    """ Performs stepwise logistic regression and returns a list of "most"
-    informative attributes. Each step of this algorithm is composed
-    of two parts. First is backward elimination, where each already
-    chosen attribute is tested for significant contribution to overall
-    model. If worst among all tested attributes has higher significance
-    that is specified in deleteCrit, this attribute is removed from
-    the model. The second step is forward selection, which is similar
-    to backward elimination. It loops through all attributes that are
-    not in model and tests whether they contribute to common model with
-    significance lower that addCrit. Algorithm stops when no attribute
-    in model is to be removed and no attribute out of the model is to
-    be added. By setting numAttr larger than -1, algorithm will stop its
-    execution when number of attributes in model will exceed that number.
+    """ Performs stepwise logistic regression and returns a list of the
+    most "informative" features. Each step of the algorithm is composed
+    of two parts. The first is backward elimination, where each already
+    chosen feature is tested for a significant contribution to the overall
+    model. If the worst among all tested features has higher significance
+    than is specified in :obj:`deleteCrit`, the feature is removed from
+    the model. The second step is forward selection, which is similar to
+    backward elimination. It loops through all the features that are not
+    in the model and tests whether they contribute to the common model
+    with significance lower that :obj:`addCrit`. The algorithm stops when
+    no feature in the model is to be removed and no feature not in the
+    model is to be added. By setting :obj:`numFeatures` larger than -1,
+    the algorithm will stop its execution when the number of features in model
+    exceeds that number.
 
     Significances are assesed via the likelihood ration chisquare
@@ -872 +881 @@
     """
 
-    def __init__(self, addCrit=0.2, deleteCrit=0.3, numAttr = -1, **kwds):
+    def __init__(self, addCrit=0.2, deleteCrit=0.3, numFeatures = -1, **kwds):
 
         self.__dict__.update(kwds)
         self.addCrit = addCrit
         self.deleteCrit = deleteCrit
-        self.numAttr = numAttr
+        self.numFeatures = numFeatures
     def __call__(self, examples):
         if getattr(self, "imputer", 0):
@@ -949 +958 @@
 
             # if enough attributes has been chosen, stop the procedure
-            if self.numAttr>-1 and len(attr)>=self.numAttr:
+            if self.numFeatures>-1 and len(attr)>=self.numFeatures:
                 remain_attr=[]
 
@@ -1010 +1019 @@
 
 
 class StepWiseFSS_Filter_class(object):
-    def __init__(self, addCrit=0.2, deleteCrit=0.3, numAttr = -1):
+    def __init__(self, addCrit=0.2, deleteCrit=0.3, numFeatures = -1):
        self.addCrit = addCrit
        self.deleteCrit = deleteCrit
-       self.numAttr = numAttr
+       self.numFeatures = numFeatures
     def __call__(self, examples):
-        attr = StepWiseFSS(examples, addCrit=self.addCrit, deleteCrit = self.deleteCrit, numAttr = self.numAttr)
+        attr = StepWiseFSS(examples, addCrit=self.addCrit, deleteCrit = self.deleteCrit, numFeatures = self.numFeatures)
         return examples.select(orange.Domain(attr, examples.domain.classVar))
 
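For readers skimming the changeset, the stepwise procedure described in the rewritten StepWiseFSS_class docstring (backward elimination followed by forward selection, bounded by numFeatures) can be sketched in plain, self-contained Python. This is a hypothetical illustration, not Orange code: the p_value scorer below stands in for the likelihood-ratio chi-square test the real implementation uses, and the names stepwise_fss, score, and pvals are invented for the example.

```python
def stepwise_fss(candidates, p_value, add_crit=0.2, delete_crit=0.3,
                 num_features=-1):
    """Return the subset of `candidates` chosen by stepwise selection.

    p_value(feature, model) gives the significance of `feature` with
    respect to the current model; lower means more informative.
    num_features = -1 means no limit, mirroring the changeset's default.
    """
    model = []
    while True:
        changed = False
        # Backward elimination: drop the least significant feature
        # already in the model if its p-value exceeds delete_crit.
        if model:
            worst = max(model, key=lambda f: p_value(f, model))
            if p_value(worst, model) > delete_crit:
                model.remove(worst)
                changed = True
        # Forward selection: add the most significant remaining
        # feature if its p-value is below add_crit.
        remaining = [f for f in candidates if f not in model]
        if remaining:
            best = min(remaining, key=lambda f: p_value(f, model))
            if p_value(best, model) < add_crit:
                model.append(best)
                changed = True
        if not changed:
            break            # nothing added or removed: converged
        if num_features > -1 and len(model) >= num_features:
            break            # feature limit reached

    return model

# Toy scorer with fixed p-values, independent of the current model.
pvals = {"age": 0.01, "sex": 0.05, "fare": 0.15, "deck": 0.9}
score = lambda f, model: pvals[f]

stepwise_fss(["age", "sex", "fare", "deck"], score)
# -> ['age', 'sex', 'fare']  ("deck" never clears add_crit)
```

Note that delete_crit should be higher than add_crit, as the docstring warns: otherwise a feature could be removed in one half-step and re-added in the next, and the loop would never converge.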