Timestamp:
02/06/12 20:00:44 (2 years ago)
Author:
Matija Polajnar <matija.polajnar@…>
Branch:
default
rebase_source:
50b865d3d6764767b1ca538019c5b08631aee272
Message:

Finish the logreg refactoring, along with documentation improvement.

File:
1 edited

Legend:

Unmodified (both r9671 and r9818 line numbers)
Added (r9818 line number only)
Removed (r9671 line number only)
  • Orange/classification/logreg.py

    r9671 r9818  
    1 """ 
    2 .. index: logistic regression 
    3 .. index: 
    4    single: classification; logistic regression 
    5  
    6 ******************************** 
    7 Logistic regression (``logreg``) 
    8 ******************************** 
    9  
    10 Implements `logistic regression 
    11 <http://en.wikipedia.org/wiki/Logistic_regression>`_ with an extension for 
    12 proper treatment of discrete features. The algorithm can handle various 
    13 anomalies in features, such as constant variables and singularities, which 
    14 could otherwise make fitting of logistic regression almost impossible. 
    15 Stepwise logistic regression, which iteratively selects the most 
    16 informative features, is also supported. 
    17  
    18 Logistic regression is a popular classification method that comes 
    19 from statistics. The model is described by a linear combination of 
    20 coefficients, 
    21  
    22 .. math:: 
    23      
    24     F = \\beta_0 + \\beta_1 X_1 + \\beta_2 X_2 + \\dots + \\beta_k X_k 
    25  
    26 and the probability (p) of a class value is computed as: 
    27  
    28 .. math:: 
    29  
    30     p = \\frac{\exp(F)}{1 + \exp(F)} 
    31  
    32  
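For example, plugging concrete numbers into the two formulas is simple; a
minimal sketch in plain Python (the beta and feature values below are made
up for illustration)::

    import math

    beta = [-1.23, 0.86, 2.42]           # beta_0 (intercept), beta_1, beta_2
    x = [1.0, 0.0]                       # feature values X_1, X_2

    F = beta[0] + sum(b * x_i for b, x_i in zip(beta[1:], x))
    p = math.exp(F) / (1 + math.exp(F))  # probability of the class value
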
    33 .. class :: LogRegClassifier 
    34  
    35     :obj:`LogRegClassifier` stores estimated values of regression 
    36     coefficients and their significances, and uses them to predict 
    37     classes and class probabilities using the equations described above. 
    38  
    39     .. attribute :: beta 
    40  
    41         Estimated regression coefficients. 
    42  
    43     .. attribute :: beta_se 
    44  
    45         Estimated standard errors for regression coefficients. 
    46  
    47     .. attribute :: wald_Z 
    48  
    49         Wald Z statistics for beta coefficients. Wald Z is computed 
    50         as beta/beta_se. 
    51  
    52     .. attribute :: P 
    53  
    54         List of P-values for the beta coefficients, that is, the 
    55         significance of each coefficient's difference from 0.0. The values 
    56         are computed from the squared Wald Z statistics, which follow the 
    57         chi-square distribution. 
    58  
    59     .. attribute :: likelihood 
    60  
    61         The probability of the sample (i.e. the learning examples) observed on 
    62         the basis of the derived model, as a function of the regression 
    63         parameters. 
    64  
    65     .. attribute :: fitStatus 
    66  
    67         Tells how the model fitting ended - either regularly 
    68         (:obj:`LogRegFitter.OK`), or it was interrupted because one of the 
    69         beta coefficients escaped towards infinity (:obj:`LogRegFitter.Infinity`) 
    70         or because the values did not converge (:obj:`LogRegFitter.Divergence`). 
    71         The value indicates the classifier's reliability; the classifier 
    72         itself is usable in either case. 
    73  
    74 .. autoclass:: LogRegLearner 
    75  
    76 .. class:: LogRegFitter 
    77  
    78     :obj:`LogRegFitter` is the abstract base class for logistic fitters. It 
    79     defines the form of the call operator and the constants denoting its 
    80     (un)success: 
    81  
    82     .. attribute:: OK 
    83  
    84         The fitter converged to the optimal fit. 
    85  
    86     .. attribute:: Infinity 
    87  
    88         Fitter failed due to one or more beta coefficients escaping towards infinity. 
    89  
    90     .. attribute:: Divergence 
    91  
    92         Beta coefficients failed to converge, but none of them escaped towards infinity. 
    93  
    94     .. attribute:: Constant 
    95  
    96         There is a constant attribute that causes the matrix to be singular. 
    97  
    98     .. attribute:: Singularity 
    99  
    100         The matrix is singular. 
    101  
    102  
    103     .. method:: __call__(examples, weightID) 
    104  
    105         Performs the fitting. There are two possible cases: either 
    106         the fitting succeeded in finding a set of beta coefficients (although 
    107         possibly with difficulties) or the fitting failed altogether. The 
    108         two cases return different results. 
    109  
    110         `(status, beta, beta_se, likelihood)` 
    111             The fitter managed to fit the model. The first element of 
    112             the tuple, status, tells about any problems that occurred; it 
    113             can be either :obj:`OK`, :obj:`Infinity` or :obj:`Divergence`. In 
    114             the latter two cases, the returned values may still be useful 
    115             for making predictions, but it is recommended that you inspect 
    116             the coefficients and their errors before deciding whether 
    117             to use the model. 
    118  
    119         `(status, attribute)` 
    120             The fitter failed and the returned attribute is responsible 
    121             for it. The type of failure is reported in status, which 
    122             can be either :obj:`Constant` or :obj:`Singularity`. 
    123  
    124         The proper way of calling the fitter is to expect and handle all 
    125         the situations described. For instance, if fitter is a fitter 
    126         instance and examples contains a suitable data set, 
    127         a script should look like this:: 
    128  
    129             res = fitter(examples) 
    130             if res[0] in [fitter.OK, fitter.Infinity, fitter.Divergence]: 
    131                status, beta, beta_se, likelihood = res 
    132                < proceed by doing something with what you got > 
    133             else: 
    134                status, attr = res 
    135                < remove the attribute or complain to the user or ... > 
    136  
    137  
    138 .. class :: LogRegFitter_Cholesky 
    139  
    140     :obj:`LogRegFitter_Cholesky` is the sole fitter available at the 
    141     moment. It is a C++ translation of `Alan Miller's logistic regression 
    142     code <http://users.bigpond.net.au/amiller/>`_. It uses the Newton-Raphson 
    143     algorithm to iteratively minimize the least squares error computed from 
    144     the learning examples. 
    145  
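In outline, each Newton-Raphson step is the usual iteratively reweighted
least squares update; the following numpy sketch illustrates the general
technique (an illustration only, not this fitter's actual C++ code)::

    import numpy as np

    def irls_step(X, y, beta):
        # X: design matrix with an intercept column; y: 0/1 class values
        p = 1.0 / (1.0 + np.exp(-X.dot(beta)))   # current model predictions
        W = np.diag(p * (1.0 - p))               # weights p*(1-p)
        # beta_new = beta + (X^T W X)^-1 X^T (y - p)
        return beta + np.linalg.inv(X.T.dot(W).dot(X)).dot(X.T).dot(y - p)
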
    146  
    147 .. autoclass:: StepWiseFSS 
    148 .. autofunction:: dump 
    149  
    150  
    151  
    152 Examples 
    153 -------- 
    154  
    155 The first example shows a very simple induction of a logistic regression 
    156 classifier (:download:`logreg-run.py <code/logreg-run.py>`, uses :download:`titanic.tab <code/titanic.tab>`). 
    157  
    158 .. literalinclude:: code/logreg-run.py 
    159  
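If the linked file is not at hand, its content is roughly the following (a
sketch based on the API documented above; the actual logreg-run.py may
differ in details)::

    import Orange

    titanic = Orange.data.Table("titanic")
    classifier = Orange.classification.logreg.LogRegLearner(titanic)

    # training-set classification accuracy, then the model itself
    correct = sum(classifier(ex) == ex.getclass() for ex in titanic)
    print "Classification accuracy:", correct / float(len(titanic))
    print Orange.classification.logreg.dump(classifier)
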
    160 Result:: 
    161  
    162     Classification accuracy: 0.778282598819 
    163  
    164     class attribute = survived 
    165     class values = <no, yes> 
    166  
    167         Attribute       beta  st. error     wald Z          P OR=exp(beta) 
    168  
    169         Intercept      -1.23       0.08     -15.15      -0.00 
    170      status=first       0.86       0.16       5.39       0.00       2.36 
    171     status=second      -0.16       0.18      -0.91       0.36       0.85 
    172      status=third      -0.92       0.15      -6.12       0.00       0.40 
    173         age=child       1.06       0.25       4.30       0.00       2.89 
    174        sex=female       2.42       0.14      17.04       0.00      11.25 
    175  
    176 The next example shows how to handle singularities in data sets 
    177 (:download:`logreg-singularities.py <code/logreg-singularities.py>`, uses :download:`adult_sample.tab <code/adult_sample.tab>`). 
    178  
    179 .. literalinclude:: code/logreg-singularities.py 
    180  
    181 The first few lines of the output of this script are:: 
    182  
    183     <=50K <=50K 
    184     <=50K <=50K 
    185     <=50K <=50K 
    186     >50K >50K 
    187     <=50K >50K 
    188  
    189     class attribute = y 
    190     class values = <>50K, <=50K> 
    191  
    192                                Attribute       beta  st. error     wald Z          P OR=exp(beta) 
    193  
    194                                Intercept       6.62      -0.00       -inf       0.00 
    195                                      age      -0.04       0.00       -inf       0.00       0.96 
    196                                   fnlwgt      -0.00       0.00       -inf       0.00       1.00 
    197                            education-num      -0.28       0.00       -inf       0.00       0.76 
    198                  marital-status=Divorced       4.29       0.00        inf       0.00      72.62 
    199             marital-status=Never-married       3.79       0.00        inf       0.00      44.45 
    200                 marital-status=Separated       3.46       0.00        inf       0.00      31.95 
    201                   marital-status=Widowed       3.85       0.00        inf       0.00      46.96 
    202     marital-status=Married-spouse-absent       3.98       0.00        inf       0.00      53.63 
    203         marital-status=Married-AF-spouse       4.01       0.00        inf       0.00      55.19 
    204                  occupation=Tech-support      -0.32       0.00       -inf       0.00       0.72 
    205  
    206 If :obj:`removeSingular` is set to 0, inducing a logistic regression 
    207 classifier raises an error:: 
    208  
    209     Traceback (most recent call last): 
    210       File "logreg-singularities.py", line 4, in <module> 
    211         lr = classification.logreg.LogRegLearner(table, removeSingular=0) 
    212       File "/home/jure/devel/orange/Orange/classification/logreg.py", line 255, in LogRegLearner 
    213         return lr(examples, weightID) 
    214       File "/home/jure/devel/orange/Orange/classification/logreg.py", line 291, in __call__ 
    215         lr = learner(examples, weight) 
    216     orange.KernelException: 'orange.LogRegLearner': singularity in workclass=Never-worked 
    217  
    218 We can see that the attribute workclass is causing a singularity. 
    219  
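The essence of the script is the :obj:`removeSingular` flag; schematically
(a sketch, not the verbatim logreg-singularities.py)::

    import Orange

    table = Orange.data.Table("adult_sample")

    # removeSingular=1 drops constant and singular features automatically,
    lr = Orange.classification.logreg.LogRegLearner(table, removeSingular=1)

    # while removeSingular=0 raises the KernelException shown above
    #lr = Orange.classification.logreg.LogRegLearner(table, removeSingular=0)
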
    220 The example below shows how stepwise logistic regression can improve 
    221 classification performance (:download:`logreg-stepwise.py <code/logreg-stepwise.py>`, uses :download:`ionosphere.tab <code/ionosphere.tab>`): 
    222  
    223 .. literalinclude:: code/logreg-stepwise.py 
    224  
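In outline, the script compares plain logistic regression with one learned
on a stepwise-selected subset of features, along these lines (a sketch; the
full script, including the attribute-usage counts, is in the linked file)::

    import Orange

    table = Orange.data.Table("ionosphere")

    # plain logistic regression on all features
    lr = Orange.classification.logreg.LogRegLearner(table)

    # stepwise selection first, then logistic regression on the subset
    features = Orange.classification.logreg.StepWiseFSS(table, addCrit=0.05)
    domain = Orange.data.Domain(features, table.domain.classVar)
    filtered_lr = Orange.classification.logreg.LogRegLearner(table.select(domain))
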
    225 The output of this script is:: 
    226  
    227     Learner      CA 
    228     logistic     0.841 
    229     filtered     0.846 
    230  
    231     Number of times attributes were used in cross-validation: 
    232      1 x a21 
    233     10 x a22 
    234      8 x a23 
    235      7 x a24 
    236      1 x a25 
    237     10 x a26 
    238     10 x a27 
    239      3 x a28 
    240      7 x a29 
    241      9 x a31 
    242      2 x a16 
    243      7 x a12 
    244      1 x a32 
    245      8 x a15 
    246     10 x a14 
    247      4 x a17 
    248      7 x a30 
    249     10 x a11 
    250      1 x a10 
    251      1 x a13 
    252     10 x a34 
    253      2 x a19 
    254      1 x a18 
    255     10 x a3 
    256     10 x a5 
    257      4 x a4 
    258      4 x a7 
    259      8 x a6 
    260     10 x a9 
    261     10 x a8 
    262  
    263 """ 
    264  
    265 from Orange.core import LogRegLearner, LogRegClassifier, LogRegFitter, LogRegFitter_Cholesky 
    266  
    267 1 import Orange 
    268 import math, os 
    269 import warnings 
    270 from numpy import * 
    271 from numpy.linalg import * 
    272  
    273  
    274 ########################################################################## 
    275 ## Print out methods 
     2from Orange.misc import deprecated_keywords, deprecated_members 
     3import math 
     4from numpy import dot, array, identity, reshape, diagonal, \ 
     5    transpose, concatenate, sqrt, sign 
     6from numpy.linalg import inv 
     7from Orange.core import LogRegClassifier, LogRegFitter, LogRegFitter_Cholesky 
    276 8 
    277 9 def dump(classifier): 
    278     """ Formatted string of all major features in logistic 
    279     regression classifier.  
    280  
    281     :param classifier: logistic regression classifier 
     10    """ Return a formatted string of all major features in logistic regression 
     11    classifier. 
     12 
     13    :param classifier: logistic regression classifier. 
    282 14    """ 
    283 15 
    284 16    # print out class values 
    285 17    out = [''] 
    286     out.append("class attribute = " + classifier.domain.classVar.name) 
    287     out.append("class values = " + str(classifier.domain.classVar.values)) 
     18    out.append("class attribute = " + classifier.domain.class_var.name) 
     19    out.append("class values = " + str(classifier.domain.class_var.values)) 
    288 20    out.append('') 
    289 21     
    290 22    # get the longest attribute name 
    291 23    longest=0 
    292     for at in classifier.continuizedDomain.attributes: 
     24    for at in classifier.continuized_domain.features: 
    293 25        if len(at.name)>longest: 
    294             longest=len(at.name); 
     26            longest=len(at.name) 
    295 27 
    296 28    # print out the head 
    … 
    301 33    out.append(formatstr % ("Intercept", classifier.beta[0], classifier.beta_se[0], classifier.wald_Z[0], classifier.P[0])) 
    302 34    formatstr = "%"+str(longest)+"s %10.2f %10.2f %10.2f %10.2f %10.2f"     
    303     for i in range(len(classifier.continuizedDomain.attributes)): 
    304         out.append(formatstr % (classifier.continuizedDomain.attributes[i].name, classifier.beta[i+1], classifier.beta_se[i+1], classifier.wald_Z[i+1], abs(classifier.P[i+1]), math.exp(classifier.beta[i+1]))) 
     35    for i in range(len(classifier.continuized_domain.features)): 
     36        out.append(formatstr % (classifier.continuized_domain.features[i].name, classifier.beta[i+1], classifier.beta_se[i+1], classifier.wald_Z[i+1], abs(classifier.P[i+1]), math.exp(classifier.beta[i+1]))) 
    305 37 
    306 38    return '\n'.join(out) 
    … 
    308 40 
    309 41 def has_discrete_values(domain): 
    310     for at in domain.attributes: 
    311         if at.varType == Orange.core.VarTypes.Discrete: 
    312             return 1 
    313     return 0 
     42    """ 
     43    Return 1 if the given domain contains any discrete features, else 0. 
     44 
     45    :param domain: domain. 
     46    :type domain: :class:`Orange.data.Domain` 
     47    """ 
     48    return any(at.var_type == Orange.data.Type.Discrete 
     49               for at in domain.features) 
     50 
    314 51 
    315 52 class LogRegLearner(Orange.classification.Learner): 
    316 53    """ Logistic regression learner. 
    317 54 
    318     Implements logistic regression. If data instances are provided to 
     55    If data instances are provided to 
    319 56    the constructor, the learning algorithm is called and the resulting 
    320 57    classifier is returned instead of the learner. 
    321 58 
    322     :param table: data table with either discrete or continuous features 
    323     :type table: Orange.data.Table 
    324     :param weightID: the ID of the weight meta attribute 
    325     :type weightID: int 
    326     :param removeSingular: set to 1 if you want automatic removal of disturbing features, such as constants and singularities 
    327     :type removeSingular: bool 
    328     :param fitter: the fitting algorithm (by default the Newton-Raphson fitting algorithm is used) 
    329     :param stepwiseLR: set to 1 if you wish to use stepwise logistic regression 
    330     :type stepwiseLR: bool 
    331     :param addCrit: parameter for stepwise feature selection 
    332     :type addCrit: float 
    333     :param deleteCrit: parameter for stepwise feature selection 
    334     :type deleteCrit: float 
    335     :param numFeatures: parameter for stepwise feature selection 
    336     :type numFeatures: int 
     59    :param instances: data table with either discrete or continuous features 
     60    :type instances: Orange.data.Table 
     61    :param weight_id: the ID of the weight meta attribute 
     62    :type weight_id: int 
     63    :param remove_singular: set to 1 if you want automatic removal of 
     64        disturbing features, such as constants and singularities 
     65    :type remove_singular: bool 
     66    :param fitter: the fitting algorithm (by default the Newton-Raphson 
     67        fitting algorithm is used) 
     68    :param stepwise_lr: set to 1 if you wish to use stepwise logistic 
     69        regression 
     70    :type stepwise_lr: bool 
     71    :param add_crit: parameter for stepwise feature selection 
     72    :type add_crit: float 
     73    :param delete_crit: parameter for stepwise feature selection 
     74    :type delete_crit: float 
     75    :param num_features: parameter for stepwise feature selection 
     76    :type num_features: int 
    337 77    :rtype: :obj:`LogRegLearner` or :obj:`LogRegClassifier` 
    338 78 
    339 79    """ 
    340     def __new__(cls, instances=None, weightID=0, **argkw): 
     80 
     81    @deprecated_keywords({"weightID": "weight_id"}) 
     82    def __new__(cls, instances=None, weight_id=0, **argkw): 
    341 83        self = Orange.classification.Learner.__new__(cls, **argkw) 
    342 84        if instances: 
    343 85            self.__init__(**argkw) 
    344             return self.__call__(instances, weightID) 
     86            return self.__call__(instances, weight_id) 
    345 87        else: 
    346 88            return self 
    347 89 
    348     def __init__(self, removeSingular=0, fitter = None, **kwds): 
     90    @deprecated_keywords({"removeSingular": "remove_singular"}) 
     91    def __init__(self, remove_singular=0, fitter = None, **kwds): 
    349 92        self.__dict__.update(kwds) 
    350         self.removeSingular = removeSingular 
     93        self.remove_singular = remove_singular 
    351 94        self.fitter = None 
    352 95 
    353     def __call__(self, examples, weight=0): 
     96    @deprecated_keywords({"examples": "instances"}) 
     97    def __call__(self, instances, weight=0): 
     98        """Learn from the given table of data instances. 
     99 
     100        :param instances: Data instances to learn from. 
     101        :type instances: :class:`~Orange.data.Table` 
     102        :param weight: Id of meta attribute with weights of instances 
     103        :type weight: int 
     104        :rtype: :class:`~Orange.classification.logreg.LogRegClassifier` 
     105        """ 
    354 106        imputer = getattr(self, "imputer", None) or None 
    355         if getattr(self, "removeMissing", 0): 
    356             examples = Orange.core.Preprocessor_dropMissing(examples) 
     107        if getattr(self, "remove_missing", 0): 
     108            instances = Orange.core.Preprocessor_dropMissing(instances) 
    357 109 ##        if hasDiscreteValues(examples.domain): 
    358 110 ##            examples = createNoDiscTable(examples) 
    359         if not len(examples): 
     111        if not len(instances): 
    360 112            return None 
    361         if getattr(self, "stepwiseLR", 0): 
    362             addCrit = getattr(self, "addCrit", 0.2) 
    363             removeCrit = getattr(self, "removeCrit", 0.3) 
    364             numFeatures = getattr(self, "numFeatures", -1) 
    365             attributes = StepWiseFSS(examples, addCrit = addCrit, deleteCrit = removeCrit, imputer = imputer, numFeatures = numFeatures) 
    366             tmpDomain = Orange.core.Domain(attributes, examples.domain.classVar) 
    367             tmpDomain.addmetas(examples.domain.getmetas()) 
    368             examples = examples.select(tmpDomain) 
    369         learner = Orange.core.LogRegLearner() 
    370         learner.imputerConstructor = imputer 
     113        if getattr(self, "stepwise_lr", 0): 
     114            add_crit = getattr(self, "add_crit", 0.2) 
     115            delete_crit = getattr(self, "delete_crit", 0.3) 
     116            num_features = getattr(self, "num_features", -1) 
     117            attributes = StepWiseFSS(instances, add_crit= add_crit, 
     118                delete_crit=delete_crit, imputer = imputer, num_features= num_features) 
     119            tmp_domain = Orange.data.Domain(attributes, 
     120                instances.domain.class_var) 
     121            tmp_domain.addmetas(instances.domain.getmetas()) 
     122            instances = instances.select(tmp_domain) 
     123        learner = Orange.core.LogRegLearner() # Yes, it has to be from core. 
     124        learner.imputer_constructor = imputer 
    371 125        if imputer: 
    372             examples = self.imputer(examples)(examples) 
    373         examples = Orange.core.Preprocessor_dropMissing(examples) 
     126            instances = self.imputer(instances)(instances) 
     127        instances = Orange.core.Preprocessor_dropMissing(instances) 
    374 128        if self.fitter: 
    375 129            learner.fitter = self.fitter 
    376         if self.removeSingular: 
    377             lr = learner.fitModel(examples, weight) 
     130        if self.remove_singular: 
     131            lr = learner.fit_model(instances, weight) 
    378 132        else: 
    379             lr = learner(examples, weight) 
    380         while isinstance(lr, Orange.core.Variable): 
     133            lr = learner(instances, weight) 
     134        while isinstance(lr, Orange.data.variable.Variable): 
    381 135            if isinstance(lr.getValueFrom, Orange.core.ClassifierFromVar) and isinstance(lr.getValueFrom.transformer, Orange.core.Discrete2Continuous): 
    382 136                lr = lr.getValueFrom.variable 
    383             attributes = examples.domain.attributes[:] 
     137            attributes = instances.domain.features[:] 
    384 138            if lr in attributes: 
    385 139                attributes.remove(lr) 
    386 140            else: 
    387 141                attributes.remove(lr.getValueFrom.variable) 
    388             newDomain = Orange.core.Domain(attributes, examples.domain.classVar) 
    389             newDomain.addmetas(examples.domain.getmetas()) 
    390             examples = examples.select(newDomain) 
    391             lr = learner.fitModel(examples, weight) 
     142            new_domain = Orange.data.Domain(attributes,  
     143                instances.domain.class_var) 
     144            new_domain.addmetas(instances.domain.getmetas()) 
     145            instances = instances.select(new_domain) 
     146            lr = learner.fit_model(instances, weight) 
    392 147        return lr 
    393 148 
    394  
     149LogRegLearner = deprecated_members({"removeSingular": "remove_singular", 
     150                                    "weightID": "weight_id", 
     151                                    "stepwiseLR": "stepwise_lr", 
     152                                    "addCrit": "add_crit", 
     153                                    "deleteCrit": "delete_crit", 
     154                                    "numFeatures": "num_features", 
     155                                    "removeMissing": "remove_missing" 
     156                                    })(LogRegLearner) 
    395 157 
    396 158 class UnivariateLogRegLearner(Orange.classification.Learner): 
    … 
    406 168        self.__dict__.update(kwds) 
    407 169 
    408     def __call__(self, examples): 
    409         examples = createFullNoDiscTable(examples) 
    410         classifiers = map(lambda x: LogRegLearner(Orange.core.Preprocessor_dropMissing(examples.select(Orange.core.Domain(x, examples.domain.classVar)))), examples.domain.attributes) 
    411         maj_classifier = LogRegLearner(Orange.core.Preprocessor_dropMissing(examples.select(Orange.core.Domain(examples.domain.classVar)))) 
     170    @deprecated_keywords({"examples": "instances"}) 
     171    def __call__(self, instances): 
     172        instances = createFullNoDiscTable(instances) 
     173        classifiers = map(lambda x: LogRegLearner(Orange.core.Preprocessor_dropMissing( 
     174            instances.select(Orange.data.Domain(x,  
     175            instances.domain.class_var)))), instances.domain.features) 
     176        maj_classifier = LogRegLearner(Orange.core.Preprocessor_dropMissing 
     177            (instances.select(Orange.data.Domain(instances.domain.class_var)))) 
    412 178        beta = [maj_classifier.beta[0]] + [x.beta[1] for x in classifiers] 
    413 179        beta_se = [maj_classifier.beta_se[0]] + [x.beta_se[1] for x in classifiers] 
    414 180        P = [maj_classifier.P[0]] + [x.P[1] for x in classifiers] 
    415 181        wald_Z = [maj_classifier.wald_Z[0]] + [x.wald_Z[1] for x in classifiers] 
    416         domain = examples.domain 
     182        domain = instances.domain 
    417 183 
    418 184        return Univariate_LogRegClassifier(beta = beta, beta_se = beta_se, P = P, wald_Z = wald_Z, domain = domain) 
    419 185 
    420 class UnivariateLogRegClassifier(Orange.core.Classifier): 
     186class UnivariateLogRegClassifier(Orange.classification.Classifier): 
    421 187    def __init__(self, **kwds): 
    422 188        self.__dict__.update(kwds) 
    423 189 
    424     def __call__(self, example, resultType = Orange.core.GetValue): 
     190    def __call__(self, instance, resultType = Orange.classification.Classifier.GetValue): 
    425 191        # classification not implemented yet. For now its use is only to provide regression coefficients and their statistics 
    426 192        pass 
    … 
    436 202            return self 
    437 203 
    438     def __init__(self, removeSingular=0, **kwds): 
     204    @deprecated_keywords({"removeSingular": "remove_singular"}) 
     205    def __init__(self, remove_singular=0, **kwds): 
    439 206        self.__dict__.update(kwds) 
    440         self.removeSingular = removeSingular 
    441     def __call__(self, examples, weight=0): 
     207        self.remove_singular = remove_singular 
     208 
     209    @deprecated_keywords({"examples": "instances"}) 
     210    def __call__(self, instances, weight=0): 
    442 211        # the next function changes the data set to one extended with unknown values 
    443         def createLogRegExampleTable(data, weightID): 
    444             setsOfData = [] 
    445             for at in data.domain.attributes: 
    446                 # for each attribute create a new ExampleTable newData 
    447                 # to dataOrig, dataFinal and newData add a new attribute -- a continuous variable 
    448                 if at.varType == Orange.core.VarTypes.Continuous: 
    449                     atDisc = Orange.core.FloatVariable(at.name + "Disc") 
    450                     newDomain = Orange.core.Domain(data.domain.attributes+[atDisc,data.domain.classVar]) 
    451                     newDomain.addmetas(data.domain.getmetas()) 
    452                     newData = Orange.core.ExampleTable(newDomain,data) 
    453                     altData = Orange.core.ExampleTable(newDomain,data) 
    454                     for i,d in enumerate(newData): 
    455                         d[atDisc] = 0 
    456                         d[weightID] = 1*data[i][weightID] 
    457                     for i,d in enumerate(altData): 
    458                         d[atDisc] = 1 
     212        def createLogRegExampleTable(data, weight_id): 
     213            sets_of_data = [] 
     214            for at in data.domain.features: 
     215                # for each attribute create a new ExampleTable new_data 
     216                # to dataOrig, dataFinal and new_data add a new attribute -- a continuous variable 
     217                if at.var_type == Orange.data.Type.Continuous: 
     218                    at_disc = Orange.data.variable.Continuous(at.name+ "Disc") 
     219                    new_domain = Orange.data.Domain(data.domain.features+[at_disc,data.domain.class_var]) 
     220                    new_domain.addmetas(data.domain.getmetas()) 
     221                    new_data = Orange.data.Table(new_domain,data) 
     222                    alt_data = Orange.data.Table(new_domain,data) 
     223                    for i,d in enumerate(new_data): 
     224                        d[at_disc] = 0 
     225                        d[weight_id] = 1*data[i][weight_id] 
     226                    for i,d in enumerate(alt_data): 
     227                        d[at_disc] = 1 
    459 228                        d[at] = 0 
    460                         d[weightID] = 0.000001*data[i][weightID] 
    461                 elif at.varType == Orange.core.VarTypes.Discrete: 
    462                 # to the attribute "at" in dataOrig, dataFinal and newData add one more value, which is the attribute's name + "X" 
    463                     atNew = Orange.core.EnumVariable(at.name, values = at.values + [at.name+"X"]) 
    464                     newDomain = Orange.core.Domain(filter(lambda x: x!=at, data.domain.attributes)+[atNew,data.domain.classVar]) 
    465                     newDomain.addmetas(data.domain.getmetas()) 
    466                     newData = Orange.core.ExampleTable(newDomain,data) 
    467                     altData = Orange.core.ExampleTable(newDomain,data) 
    468                     for i,d in enumerate(newData): 
    469                         d[atNew] = data[i][at] 
    470                         d[weightID] = 1*data[i][weightID] 
    471                     for i,d in enumerate(altData): 
    472                         d[atNew] = at.name+"X" 
    473                         d[weightID] = 0.000001*data[i][weightID] 
    474                 newData.extend(altData) 
    475                 setsOfData.append(newData) 
    476             return setsOfData 
     229                        d[weight_id] = 0.000001*data[i][weight_id] 
     230                elif at.var_type == Orange.data.Type.Discrete: 
     231                # to the attribute "at" in dataOrig, dataFinal and new_data add one more value, which is the attribute's name + "X" 
     232                    at_new = Orange.data.variable.Discrete(at.name, values = at.values + [at.name+"X"]) 
     233                    new_domain = Orange.data.Domain(filter(lambda x: x!=at, data.domain.features)+[at_new,data.domain.class_var]) 
     234                    new_domain.addmetas(data.domain.getmetas()) 
     235                    new_data = Orange.data.Table(new_domain,data) 
     236                    alt_data = Orange.data.Table(new_domain,data) 
     237                    for i,d in enumerate(new_data): 
     238                        d[at_new] = data[i][at] 
     239                        d[weight_id] = 1*data[i][weight_id] 
     240                    for i,d in enumerate(alt_data): 
     241                        d[at_new] = at.name+"X" 
     242                        d[weight_id] = 0.000001*data[i][weight_id] 
     243                new_data.extend(alt_data) 
     244                sets_of_data.append(new_data) 
     245            return sets_of_data 
    477 246 
    478         learner = LogRegLearner(imputer = Orange.core.ImputerConstructor_average(), removeSingular = self.removeSingular) 
     247        learner = LogRegLearner(imputer=Orange.feature.imputation.ImputerConstructor_average(), 
     248            remove_singular = self.remove_singular) 
    479 249        # get Original Model 
    480         orig_model = learner(examples,weight) 
     250        orig_model = learner(instances,weight) 
    481 251        if orig_model.fit_status: 
    482 252            print "Warning: model did not converge" 
    … 
    485 255        if weight == 0: 
    486 256            weight = Orange.data.new_meta_id() 
    487             examples.addMetaAttribute(weight, 1.0) 
    488         extended_set_of_examples = createLogRegExampleTable(examples, weight) 
     257            instances.addMetaAttribute(weight, 1.0) 
     258        extended_set_of_examples = createLogRegExampleTable(instances, weight) 
    489 259        extended_models = [learner(extended_examples, weight) \ 
    490 260                           for extended_examples in extended_set_of_examples] 
    … 
    494 264 ##        print orig_model.domain 
    495 265 ##        print orig_model.beta 
    496 ##        print orig_model.beta[orig_model.continuizedDomain.attributes[-1]] 
     266##        print orig_model.beta[orig_model.continuized_domain.features[-1]] 
    497 267 ##        for i,m in enumerate(extended_models): 
    498 ##            print examples.domain.attributes[i] 
     268##            print examples.domain.features[i] 
    499 269 ##            printOUT(m) 
    500 270 
    … 
    505 275        betas_ap = [] 
    506 276        for m in extended_models: 
    507             beta_add = m.beta[m.continuizedDomain.attributes[-1]] 
     277            beta_add = m.beta[m.continuized_domain.features[-1]] 
    508 278            betas_ap.append(beta_add) 
    509 279            beta = beta + beta_add 
    … 
    514 284 
    515 285        # compare it to bayes prior 
    516         bayes = Orange.core.BayesLearner(examples) 
     286        bayes = Orange.classification.bayes.NaiveLearner(instances) 
    517 287        bayes_prior = math.log(bayes.distribution[1]/bayes.distribution[0]) 
    518 288 
    … 
    521 291 ##        print "lr", orig_model.beta[0] 
    522 292 ##        print "lr2", logistic_prior 
    523 ##        print "dist", Orange.core.Distribution(examples.domain.classVar,examples) 
     293##        print "dist", Orange.statistics.distribution.Distribution(examples.domain.class_var,examples) 
    524 294 ##        print "before", betas_ap 
    525 295 
    … 
    544 314        # return the original model and the corresponding a priori zeros 
    545 315        return (orig_model, betas_ap) 
    546         #return (bayes_prior,orig_model.beta[examples.domain.classVar],logistic_prior) 
     316        #return (bayes_prior,orig_model.beta[examples.domain.class_var],logistic_prior) 
     317 
     318LogRegLearnerGetPriors = deprecated_members({"removeSingular": 
     319                                                 "remove_singular"} 
     320)(LogRegLearnerGetPriors) 
    547 321 
    548 322 class LogRegLearnerGetPriorsOneTable: 
    549     def __init__(self, removeSingular=0, **kwds): 
     323    @deprecated_keywords({"removeSingular": "remove_singular"}) 
     324    def __init__(self, remove_singular=0, **kwds): 
    550 325        self.__dict__.update(kwds) 
    551         self.removeSingular = removeSingular 
    552     def __call__(self, examples, weight=0): 
     326        self.remove_singular = remove_singular 
     327 
     328    @deprecated_keywords({"examples": "instances"}) 
     329    def __call__(self, instances, weight=0): 
    553 330        # the next function changes the data set to one extended with unknown values 
    554 331        def createLogRegExampleTable(data, weightID): 
    555             finalData = Orange.core.ExampleTable(data) 
    556             origData = Orange.core.ExampleTable(data) 
    557             for at in data.domain.attributes: 
     332            finalData = Orange.data.Table(data) 
     333            orig_data = Orange.data.Table(data) 
     334            for at in data.domain.features: 
    558 335                # for each attribute create a new ExampleTable newData 
    559 336                # to dataOrig, dataFinal and newData add a new attribute -- a continuous variable 
    560                 if at.varType == Orange.core.VarTypes.Continuous: 
    561                     atDisc = Orange.core.FloatVariable(at.name + "Disc") 
    562                     newDomain = Orange.core.Domain(origData.domain.attributes+[atDisc,data.domain.classVar]) 
     337                if at.var_type == Orange.data.Type.Continuous: 
     338                    atDisc = Orange.data.variable.Continuous(at.name + "Disc") 
     339                    newDomain = Orange.data.Domain(orig_data.domain.features+[atDisc,data.domain.class_var]) 
    563 340                    newDomain.addmetas(newData.domain.getmetas()) 
    564                     finalData = Orange.core.ExampleTable(newDomain,finalData) 
    565                     newData = Orange.core.ExampleTable(newDomain,origData) 
    566                     origData = Orange.core.ExampleTable(newDomain,origData) 
    567                     for d in origData: 
     341                    finalData = Orange.data.Table(newDomain,finalData) 
     342                    newData = Orange.data.Table(newDomain,orig_data) 
     343                    orig_data = Orange.data.Table(newDomain,orig_data) 
     344                    for d in orig_data: 
    568 345                        d[atDisc] = 0 
    569 346                    for d in finalData: 
    … 
    574 351                        d[weightID] = 100*data[i][weightID] 
    575 352 
    576                 elif at.varType == Orange.core.VarTypes.Discrete: 
     353                elif at.var_type == Orange.data.Type.Discrete: 
    577 354                # to the attribute "at" in dataOrig, dataFinal and newData add one more value, which is the attribute's name + "X" 
    578                     atNew = Orange.core.EnumVariable(at.name, values = at.values + [at.name+"X"]) 
    579                     newDomain = Orange.core.Domain(filter(lambda x: x!=at, origData.domain.attributes)+[atNew,origData.domain.classVar]) 
    580                     newDomain.addmetas(origData.domain.getmetas()) 
    581                     temp_finalData = Orange.core.ExampleTable(finalData) 
    582                     finalData = Orange.core.ExampleTable(newDomain,finalData) 
    583                     newData = Orange.core.ExampleTable(newDomain,origData) 
    584                     temp_origData = Orange.core.ExampleTable(origData) 
    585                     origData = Orange.core.ExampleTable(newDomain,origData) 
    586                     for i,d in enumerate(origData): 
    587                         d[atNew] = temp_origData[i][at] 
     355                    at_new = Orange.data.variable.Discrete(at.name, values = at.values + [at.name+"X"]) 
     356                    newDomain = Orange.data.Domain(filter(lambda x: x!=at, orig_data.domain.features)+[at_new,orig_data.domain.class_var]) 
     357                    newDomain.addmetas(orig_data.domain.getmetas()) 
     358                    temp_finalData = Orange.data.Table(finalData) 
     359                    finalData = Orange.data.Table(newDomain,finalData) 
     360                    newData = Orange.data.Table(newDomain,orig_data) 
     361                    temp_origData = Orange.data.Table(orig_data) 
     362                    orig_data = Orange.data.Table(newDomain,orig_data) 
     363                    for i,d in enumerate(orig_data): 
     364                        d[at_new] = temp_origData[i][at] 
    588 365                    for i,d in enumerate(finalData): 
    589                         d[atNew] = temp_finalData[i][at] 
     366                        d[at_new] = temp_finalData[i][at] 
    590 367                    for i,d in enumerate(newData): 
    591                         d[atNew] = at.name+"X" 
     368                        d[at_new] = at.name+"X" 
    592 369                        d[weightID] = 10*data[i][weightID] 
    593 370                finalData.extend(newData) 
    594 371            return finalData 
    595 372 
    596         learner = LogRegLearner(imputer = Orange.core.ImputerConstructor_average(), removeSingular = self.removeSingular) 
     373        learner = LogRegLearner(imputer = Orange.feature.imputation.ImputerConstructor_average(), removeSingular = self.remove_singular) 
    597 374        # get Original Model 
    598         orig_model = learner(examples,weight) 
     375        orig_model = learner(instances,weight) 
    599 376 
    600 377        # get extended Model (you should not change data) 
    601 378        if weight == 0: 
    602 379            weight = Orange.data.new_meta_id() 
    603             examples.addMetaAttribute(weight, 1.0) 
    604         extended_examples = createLogRegExampleTable(examples, weight) 
     380            instances.addMetaAttribute(weight, 1.0) 
     381        extended_examples = createLogRegExampleTable(instances, weight) 
    605 382        extended_model = learner(extended_examples, weight) 
    606 383 
    … 
    616 393        betas_ap = [] 
    617 394        for m in extended_models: 
    618             beta_add = m.beta[m.continuizedDomain.attributes[-1]] 
     395            beta_add = m.beta[m.continuized_domain.features[-1]] 
    619 396            betas_ap.append(beta_add) 
    620 397            beta = beta + beta_add 
    … 
    625 402 
    626 403        # compare it to bayes prior 
    627         bayes = Orange.core.BayesLearner(examples) 
     404        bayes = Orange.classification.bayes.NaiveLearner(instances) 
    628 405        bayes_prior = math.log(bayes.distribution[1]/bayes.distribution[0]) 
    629 406 
    … 
    632 409        #print "lr", orig_model.beta[0] 
    633 410        #print "lr2", logistic_prior 
    634         #print "dist", Orange.core.Distribution(examples.domain.classVar,examples) 
     411        #print "dist", Orange.statistics.distribution.Distribution(examples.domain.class_var,examples) 
    635 412        k = (bayes_prior-orig_model.beta[0])/(logistic_prior-orig_model.beta[0]) 
    636 413        #print "before", betas_ap 
    … 
    640 417        # return the original model and the corresponding a priori zeros 
    641 418        return (orig_model, betas_ap) 
    642         #return (bayes_prior,orig_model.beta[data.domain.classVar],logistic_prior) 
     419        #return (bayes_prior,orig_model.beta[data.domain.class_var],logistic_prior) 
     420 
     421LogRegLearnerGetPriorsOneTable = deprecated_members({"removeSingular": 
     422                                                         "remove_singular"} 
     423)(LogRegLearnerGetPriorsOneTable) 
    643 424 
    644 425 
    … 
    655 436    for i,x_i in enumerate(x): 
    656 437        pr = pr(x_i,betas) 
    657         llh += y[i]*log(max(pr,1e-6)) + (1-y[i])*log(max(1-pr,1e-6)) 
     438        llh += y[i]*math.log(max(pr,1e-6)) + (1-y[i])*log(max(1-pr,1e-6)) 
    658 439    return llh 
    659 440 
    660 441 
    661 442 def diag(vector): 
    662     mat = identity(len(vector), Float) 
     443    mat = identity(len(vector)) 
    663 444    for i,v in enumerate(vector): 
    664 445        mat[i][i] = v 
    665 446    return mat 
    666 447     
    667 class SimpleFitter(Orange.core.LogRegFitter): 
     448class SimpleFitter(LogRegFitter): 
    668 449    def __init__(self, penalty=0, se_penalty = False): 
    669 450        self.penalty = penalty 
    670 451        self.se_penalty = se_penalty 
     452 
    671 453    def __call__(self, data, weight=0): 
    672 454        ml = data.native(0) 
    673         for i in range(len(data.domain.attributes)): 
    674           a = data.domain.attributes[i] 
    675           if a.varType == Orange.core.VarTypes.Discrete: 
     455        for i in range(len(data.domain.features)): 
     456          a = data.domain.features[i] 
     457          if a.var_type == Orange.data.Type.Discrete: 
    676 458            for m in ml: 
    677 459              m[i] = a.values.index(m[i]) 
    678 460        for m in ml: 
    679           m[-1] = data.domain.classVar.values.index(m[-1]) 
     461          m[-1] = data.domain.class_var.values.index(m[-1]) 
    680 462        Xtmp = array(ml) 
    681 463        y = Xtmp[:,-1]   # true probabilities (1's or 0's) 
    … 
    683 465        X=concatenate((one, Xtmp[:,:-1]),1)  # intercept first, then data 
    684 466 
    685         betas = array([0.0] * (len(data.domain.attributes)+1)) 
    686         oldBetas = array([1.0] * (len(data.domain.attributes)+1)) 
     467        betas = array([0.0] * (len(data.domain.features)+1)) 
     468        oldBetas = array([1.0] * (len(data.domain.features)+1)) 
    687 469        N = len(data) 
    688 470 
    689         pen_matrix = array([self.penalty] * (len(data.domain.attributes)+1)) 
     471        pen_matrix = array([self.penalty] * (len(data.domain.features)+1)) 
    690 472        if self.se_penalty: 
    691 473            p = array([pr(X[i], betas) for i in range(len(data))]) 
    692             W = identity(len(data), Float) 
     474            W = identity(len(data)) 
    693 475            pp = p * (1.0-p) 
    694 476            for i in range(N): 
    695 477                W[i,i] = pp[i] 
    696             se = sqrt(diagonal(inverse(matrixmultiply(transpose(X), matrixmultiply(W, X))))) 
     478            se = sqrt(diagonal(inv(dot(transpose(X), dot(W, X))))) 
    697 479            for i,p in enumerate(pen_matrix): 
    698 480                pen_matrix[i] *= se[i] 
    … 
    706 488            p = array([pr(X[i], betas) for i in range(len(data))]) 
    707 489 
    708             W = identity(len(data), Float) 
     490            W = identity(len(data)) 
    709 491            pp = p * (1.0-p) 
    710 492            for i in range(N): 
    711 493                W[i,i] = pp[i] 
    712 494 
    713             WI = inverse(W) 
    714             z = matrixmultiply(X, betas) + matrixmultiply(WI, y - p) 
    715  
    716             tmpA = inverse(matrixmultiply(transpose(X), matrixmultiply(W, X))+diag(pen_matrix)) 
    717             tmpB = matrixmultiply(transpose(X), y-p) 
    718             betas = oldBetas + matrixmultiply(tmpA,tmpB) 
    719 #            betaTemp = matrixmultiply(matrixmultiply(matrixmultiply(matrixmultiply(tmpA,transpose(X)),W),X),oldBetas) 
     495            WI = inv(W) 
     496            z = dot(X, betas) + dot(WI, y - p) 
     497 
     498            tmpA = inv(dot(transpose(X), dot(W, X))+diag(pen_matrix)) 
     499            tmpB = dot(transpose(X), y-p) 
     500            betas = oldBetas + dot(tmpA,tmpB) 
     501#            betaTemp = dot(dot(dot(dot(tmpA,transpose(X)),W),X),oldBetas) 
    720 502 #            print betaTemp 
    721 #            tmpB = matrixmultiply(transpose(X), matrixmultiply(W, z)) 
    722 #            betas = matrixmultiply(tmpA, tmpB) 
     503#            tmpB = dot(transpose(X), dot(W, z)) 
     504#            betas = dot(tmpA, tmpB) 
    723 505            likelihood_new = lh(X,y,betas)-self.penalty*sum([b*b for b in betas]) 
    724 506            print likelihood_new 
    … 
    726 508 
    727 509 
    728 ##        XX = sqrt(diagonal(inverse(matrixmultiply(transpose(X),X)))) 
     510##        XX = sqrt(diagonal(inv(dot(transpose(X),X)))) 
    729 511 ##        yhat = array([pr(X[i], betas) for i in range(len(data))]) 
    730 ##        ss = sum((y - yhat) ** 2) / (N - len(data.domain.attributes) - 1) 
     512##        ss = sum((y - yhat) ** 2) / (N - len(data.domain.features) - 1) 
    731 513 ##        sigma = math.sqrt(ss) 
    732 514        p = array([pr(X[i], betas) for i in range(len(data))]) 
    733         W = identity(len(data), Float) 
     515        W = identity(len(data)) 
    734 516        pp = p * (1.0-p) 
    735 517        for i in range(N): 
    736 518            W[i,i] = pp[i] 
    737         diXWX = sqrt(diagonal(inverse(matrixmultiply(transpose(X), matrixmultiply(W, X))))) 
    738         xTemp = matrixmultiply(matrixmultiply(inverse(matrixmultiply(transpose(X), matrixmultiply(W, X))),transpose(X)),y) 
     519        diXWX = sqrt(diagonal(inv(dot(transpose(X), dot(W, X))))) 
     520        xTemp = dot(dot(inv(dot(transpose(X), dot(W, X))),transpose(X)),y) 
    739 521        beta = [] 
    740 522        beta_se = [] 
    … 
    752 534    return exp(bx)/(1+exp(bx)) 
    753 535 
    754 class BayesianFitter(Orange.core.LogRegFitter): 
     536class BayesianFitter(LogRegFitter): 
    755 537    def __init__(self, penalty=0, anch_examples=[], tau = 0): 
    756 538        self.penalty = penalty 
    … 
    763 545        # convert data to numeric 
    764 546        ml = data.native(0) 
    765         for i,a in enumerate(data.domain.attributes): 
    766           if a.varType == Orange.core.VarTypes.Discrete: 
     547        for i,a in enumerate(data.domain.features): 
     548          if a.var_type == Orange.data.Type.Discrete: 
    767 549            for m in ml: 
    768 550              m[i] = a.values.index(m[i]) 
    769 551        for m in ml: 
    770           m[-1] = data.domain.classVar.values.index(m[-1]) 
     552          m[-1] = data.domain.class_var.values.index(m[-1]) 
    771 553        Xtmp = array(ml) 
    772 554        y = Xtmp[:,-1]   # true probabilities (1's or 0's) 
    … 
    778 560        (X,y)=self.create_array_data(data) 
    779 561 
    780         exTable = Orange.core.ExampleTable(data.domain) 
     562        exTable = Orange.data.Table(data.domain) 
    781 563        for id,ex in self.anch_examples: 
    782             exTable.extend(Orange.core.ExampleTable(ex,data.domain)) 
     564            exTable.extend(Orange.data.Table(ex,data.domain)) 
    783 565        (X_anch,y_anch)=self.create_array_data(exTable) 
    784 566 
    785         betas = array([0.0] * (len(data.domain.attributes)+1)) 
     567        betas = array([0.0] * (len(data.domain.features)+1)) 
    786 568 
    787 569        likelihood,betas = self.estimate_beta(X,y,betas,[0]*(len(betas)),X_anch,y_anch) 
    788 570 
    789 571        # get attribute groups atGroup = [(startIndex, number of values), ...) 
    790         ats = data.domain.attributes 
     572        ats = data.domain.features 
    791 573        atVec=reduce(lambda x,y: x+[(y,not y==x[-1][0])], [a.getValueFrom and a.getValueFrom.whichVar or a for a in ats],[(ats[0].getValueFrom and ats[0].getValueFrom.whichVar or ats[0],0)])[1:] 
    792 574        atGroup=[[0,0]] 
    … 
    808 590            print "betas", betas[0], betas_temp[0] 
    809 591            sumB += betas[0]-betas_temp[0] 
    810         apriori = Orange.core.Distribution(data.domain.classVar, data) 
     592        apriori = Orange.statistics.distribution.Distribution(data.domain.class_var, data) 
    811 593        aprioriProb = apriori[0]/apriori.abs 
    812 594 
    … 
    839 621            for j in range(len(betas)): 
    840 622                if const_betas[j]: continue 
    841                 dl = matrixmultiply(X[:,j],transpose(y-p)) 
     623                dl = dot(X[:,j], transpose(y-p)) 
    842 624                for xi,x in enumerate(X_anch): 
    843 625                    dl += self.penalty*x[j]*(y_anch[xi] - pr_bx(r_anch[xi]*self.penalty)) 
    844 626 
    845                 ddl = matrixmultiply(X_sq[:,j],transpose(p*(1-p))) 
     627                ddl = dot(X_sq[:,j], transpose(p*(1-p))) 
    846 628                for xi,x in enumerate(X_anch): 
    847 629                    ddl += self.penalty*x[j]*pr_bx(r[xi]*self.penalty)*(1-pr_bx(r[xi]*self.penalty)) 
    … 
    887 669 #  Feature subset selection for logistic regression 
    888 670 
    889 def get_likelihood(fitter, examples): 
    890     res = fitter(examples) 
     671@deprecated_keywords({"examples": "instances"}) 
     672def get_likelihood(fitter, instances): 
     673    res = fitter(instances) 
    891 674    if res[0] in [fitter.OK]: #, fitter.Infinity, fitter.Divergence]: 
    892 675       status, beta, beta_se, likelihood = res 
    893 676       if sum([abs(b) for b in beta])<sum([abs(b) for b in beta_se]): 
    894            return -100*len(examples) 
     677           return -100*len(instances) 
    895 678       return likelihood 
    896 679    else: 
    897        return -100*len(examples) 
     680       return -100*len(instances) 
    898 681 
    899 682 
    900 683 
    901 684 class StepWiseFSS(Orange.classification.Learner): 
    902   """Implementation of algorithm described in [Hosmer and Lemeshow, Applied Logistic Regression, 2000]. 
     685  """ 
     686  Algorithm described in Hosmer and Lemeshow, 
     687  Applied Logistic Regression, 2000. 
    903 688 
    904 689  Perform stepwise logistic regression and return a list of the 
    … 
    907 692  chosen feature is tested for a significant contribution to the overall 
    908 693  model. If the worst among all tested features has higher significance 
    909   than is specified in :obj:`deleteCrit`, the feature is removed from 
     694  than is specified in :obj:`delete_crit`, the feature is removed from 
    910 695  the model. The second step is forward selection, which is similar to 
    911 696  backward elimination. It loops through all the features that are not 
    912 697  in the model and tests whether they contribute to the common model 
    913   with significance lower that :obj:`addCrit`. The algorithm stops when 
     698  with significance lower than :obj:`add_crit`. The algorithm stops when 
    914 699  no feature in the model is to be removed and no feature not in the 
    915   model is to be added. By setting :obj:`numFeatures` larger than -1, 
     700  model is to be added. By setting :obj:`num_features` larger than -1, 
    916 701  the algorithm will stop its execution when the number of features in model 
    917 702  exceeds that number. 
    … 
    923 708  If :obj:`table` is specified, stepwise logistic regression implemented 
    924 709  in :obj:`StepWiseFSS` is performed and a list of chosen features 
    925   is returned. If :obj:`table` is not specified an instance of 
    926   :obj:`StepWiseFSS` with all parameters set is returned. 
    927  
    928   :param table: data set 
     710  is returned. If :obj:`table` is not specified, an instance of 
     711  :obj:`StepWiseFSS` with all parameters set is returned and can be called 
     712  with data later. 
     713 
     714  :param table: data set. 
    929 715  :type table: Orange.data.Table 
    930 716 
    931   :param addCrit: "Alpha" level to judge if variable has enough importance to be added in the new set. (e.g. if addCrit is 0.2, then features is added if its P is lower than 0.2) 
    932   :type addCrit: float 
    933  
    934   :param deleteCrit: Similar to addCrit, just that it is used at backward elimination. It should be higher than addCrit! 
    935   :type deleteCrit: float 
    936  
    937   :param numFeatures: maximum number of selected features, use -1 for infinity. 
    938   :type numFeatures: int 
     717  :param add_crit: "Alpha" level to judge if variable has enough importance to 
     718       be added in the new set. (e.g. if add_crit is 0.2, 
     719       then a feature is added if its P is lower than 0.2). 
     720  :type add_crit: float 
     721 
     722  :param delete_crit: Similar to add_crit, just that it is used at backward 
     723      elimination. It should be higher than add_crit! 
     724  :type delete_crit: float 
     725 
     726  :param num_features: maximum number of selected features, 
     727      use -1 for infinity. 
     728  :type num_features: int 
    939 729  :rtype: :obj:`StepWiseFSS` or list of features 
    940 730 
    … 
    949 739          return self 
    950 740 
    951  
    952   def __init__(self, addCrit=0.2, deleteCrit=0.3, numFeatures = -1, **kwds): 
     741  @deprecated_keywords({"addCrit": "add_crit", "deleteCrit": "delete_crit", 
     742                        "numFeatures": "num_features"}) 
     743  def __init__(self, add_crit=0.2, delete_crit=0.3, num_features = -1, **kwds): 
    953 744    self.__dict__.update(kwds) 
    954     self.addCrit = addCrit 
    955     self.deleteCrit = deleteCrit 
    956     self.numFeatures = numFeatures 
     745    self.add_crit = add_crit 
     746    self.delete_crit = delete_crit 
     747    self.num_features = num_features 
     748 
    957 749  def __call__(self, examples): 
    958 750    if getattr(self, "imputer", 0): 
    … 
    960 752    if getattr(self, "removeMissing", 0): 
    961 753        examples = Orange.core.Preprocessor_dropMissing(examples) 
    962     continuizer = Orange.core.DomainContinuizer(zeroBased=1,continuousTreatment=Orange.core.DomainContinuizer.Leave, 
    963                                            multinomialTreatment = Orange.core.DomainContinuizer.FrequentIsBase, 
    964                                            classTreatment = Orange.core.DomainContinuizer.Ignore) 
     754    continuizer = Orange.preprocess.DomainContinuizer(zeroBased=1, 
     755        continuousTreatment=Orange.preprocess.DomainContinuizer.Leave, 
     756        multinomialTreatment=Orange.preprocess.DomainContinuizer.FrequentIsBase, 
     757        classTreatment=Orange.preprocess.DomainContinuizer.Ignore) 
    965758    attr = [] 
    966     remain_attr = examples.domain.attributes[:] 
     759    remain_attr = examples.domain.features[:] 
    967760 
    968761    # get log-likelihood of the majority (intercept-only) model 
    969     tempDomain = Orange.core.Domain(attr,examples.domain.classVar) 
     762    tempDomain = Orange.data.Domain(attr,examples.domain.class_var) 
    971764    tempData  = Orange.core.Preprocessor_dropMissing(examples.select(tempDomain)) 
    972765 
    973     ll_Old = get_likelihood(Orange.core.LogRegFitter_Cholesky(), tempData) 
     766    ll_Old = get_likelihood(LogRegFitter_Cholesky(), tempData) 
    974767    ll_Best = -1000000 
    975768    length_Old = float(len(tempData)) 
     
    989782 
    990783                tempAttr = filter(lambda x: x!=at, attr) 
    991                 tempDomain = Orange.core.Domain(tempAttr,examples.domain.classVar) 
     784                tempDomain = Orange.data.Domain(tempAttr,examples.domain.class_var) 
    992785                tempDomain.addmetas(examples.domain.getmetas()) 
    993786                # domain, calculate P for LL improvement. 
     
    995788                tempData = Orange.core.Preprocessor_dropMissing(examples.select(tempDomain)) 
    996789 
    997                 ll_Delete = get_likelihood(Orange.core.LogRegFitter_Cholesky(), tempData) 
     790                ll_Delete = get_likelihood(LogRegFitter_Cholesky(), tempData) 
    998791                length_Delete = float(len(tempData)) 
    999792                length_Avg = (length_Delete + length_Old)/2.0 
     
    1001794                G=-2*length_Avg*(ll_Delete/length_Delete-ll_Old/length_Old) 
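                # G is the likelihood-ratio statistic -2*(lnL_reduced - lnL_current):
                # the log-likelihoods are normalized per instance, because
                # missing-value removal can leave the two tables with different
                # row counts, and then rescaled by the average sample size.
                # Under the null hypothesis G is approximately chi-square
                # distributed, with 1 degree of freedom for a continuous feature
                # and len(values)-1 for a discrete one, which is how P is
                # computed below.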
    1002795 
    1003                 # set new worst attribute                 
     796                # set new worst attribute 
    1004797                if G<minG: 
    1005798                    worstAt = at 
     
    1008801                    length_Best = length_Delete 
    1009802            # deletion of attribute 
    1010              
    1011             if worstAt.varType==Orange.core.VarTypes.Continuous: 
     803 
     804            if worstAt.var_type==Orange.data.Type.Continuous: 
    1012805                P = lchisqprob(minG, 1) 
    1013806            else: 
    1014807                P = lchisqprob(minG, len(worstAt.values)-1) 
    1015             if P>=self.deleteCrit: 
     808            if P>=self.delete_crit: 
    1016809                attr.remove(worstAt) 
    1017810                remain_attr.append(worstAt) 
     
    1024817            nodeletion = 1 
    1025818            # END OF DELETION PART 
    1026              
     819 
    1027820        # if enough attributes have been chosen, stop the procedure 
    1028         if self.numFeatures>-1 and len(attr)>=self.numFeatures: 
     821        if self.num_features>-1 and len(attr)>=self.num_features: 
    1029822            remain_attr=[] 
    1030           
     823 
    1031824        # for each attribute remaining outside the model 
    1032825        maxG=-1 
     
    1036829        for at in remain_attr: 
    1037830            tempAttr = attr + [at] 
    1038             tempDomain = Orange.core.Domain(tempAttr,examples.domain.classVar) 
     831            tempDomain = Orange.data.Domain(tempAttr,examples.domain.class_var) 
    1039832            tempDomain.addmetas(examples.domain.getmetas()) 
    1040833            # domain, calculate P for LL improvement. 
    1041834            tempDomain  = continuizer(Orange.core.Preprocessor_dropMissing(examples.select(tempDomain))) 
    1042835            tempData = Orange.core.Preprocessor_dropMissing(examples.select(tempDomain)) 
    1043             ll_New = get_likelihood(Orange.core.LogRegFitter_Cholesky(), tempData) 
     836            ll_New = get_likelihood(LogRegFitter_Cholesky(), tempData) 
    1044837 
    1045838            length_New = float(len(tempData)) # get number of examples in tempData to normalize likelihood 
     
    1056849            stop = 1 
    1057850            continue 
    1058          
    1059         if bestAt.varType==Orange.core.VarTypes.Continuous: 
     851 
     852        if bestAt.var_type==Orange.data.Type.Continuous: 
    1060853            P = lchisqprob(maxG, 1) 
    1061854        else: 
    1062855            P = lchisqprob(maxG, len(bestAt.values)-1) 
    1063856        # add the attribute with the smallest P to the selected set (attr) 
    1064         if P<=self.addCrit: 
     857        if P<=self.add_crit: 
    1065858            attr.append(bestAt) 
    1066859            remain_attr.remove(bestAt) 
     
    1068861            length_Old = length_Best 
    1069862 
    1070         if (P>self.addCrit and nodeletion) or (bestAt == worstAt): 
     863        if (P>self.add_crit and nodeletion) or (bestAt == worstAt): 
    1071864            stop = 1 
    1072865 
    1073866    return attr 
     867 
     868StepWiseFSS = deprecated_members({"addCrit": "add_crit", 
     869                                   "deleteCrit": "delete_crit", 
     870                                   "numFeatures": "num_features"})(StepWiseFSS) 
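
A small compatibility sketch (illustrative): the deprecation wrappers above map
the old camelCase spellings to the new names, so older scripts keep working:

    fss = StepWiseFSS(addCrit=0.25)   # accepted and mapped to add_crit=0.25
    limit = fss.numFeatures           # resolved to fss.num_features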
    1074871 
    1075872 
     
    1082879        else: 
    1083880            return self 
    1084      
    1085     def __init__(self, addCrit=0.2, deleteCrit=0.3, numFeatures = -1): 
    1086         self.addCrit = addCrit 
    1087         self.deleteCrit = deleteCrit 
    1088         self.numFeatures = numFeatures 
    1089  
    1090     def __call__(self, examples): 
    1091         attr = StepWiseFSS(examples, addCrit=self.addCrit, deleteCrit = self.deleteCrit, numFeatures = self.numFeatures) 
    1092         return examples.select(Orange.core.Domain(attr, examples.domain.classVar)) 
    1093                  
     881 
     882    @deprecated_keywords({"addCrit": "add_crit", "deleteCrit": "delete_crit", 
     883                          "numFeatures": "num_features"}) 
     884    def __init__(self, add_crit=0.2, delete_crit=0.3, num_features = -1): 
     885        self.add_crit = add_crit 
     886        self.delete_crit = delete_crit 
     887        self.num_features = num_features 
     888 
     889    @deprecated_keywords({"examples": "instances"}) 
     890    def __call__(self, instances): 
     891        attr = StepWiseFSS(instances, add_crit=self.add_crit, 
     892            delete_crit= self.delete_crit, num_features= self.num_features) 
     893        return instances.select(Orange.data.Domain(attr, instances.domain.class_var)) 
     894 
     895StepWiseFSSFilter = deprecated_members({"addCrit": "add_crit", 
     896                                        "deleteCrit": "delete_crit", 
     897                                        "numFeatures": "num_features"})\ 
     898    (StepWiseFSSFilter) 
     899 
    1094900 
    1095901#################################### 