02/23/12 22:47:51 (2 years ago)
janezd <janez.demsar@…>

Polished documentation for logistic regression

1 edited


  • Orange/classification/logreg.py

    r10246 r10346  
    99def dump(classifier): 
    10     """ Return a formatted string of all major features in logistic regression 
    11     classifier. 
     10    """ Return a formatted string describing the logistic regression model 
    1312    :param classifier: logistic regression classifier. 
    5352    """ Logistic regression learner. 
    55     If data instances are provided to 
    56     the constructor, the learning algorithm is called and the resulting 
    57     classifier is returned instead of the learner. 
    59     :param data: data table with either discrete or continuous features 
     54    Returns either a learning algorithm (instance of 
     55    :obj:`LogRegLearner`) or, if data is provided, a fitted model 
     56    (instance of :obj:`LogRegClassifier`). 
     58    :param data: data table; it may contain discrete and continuous features 
    6059    :type data: Orange.data.Table 
    6160    :param weight_id: the ID of the weight meta attribute 
    6261    :type weight_id: int 
    63     :param remove_singular: set to 1 if you want automatic removal of 
    64         disturbing features, such as constants and singularities 
     62    :param remove_singular: automated removal of constant 
     63        features and singularities (default: `False`) 
    6564    :type remove_singular: bool 
    66     :param fitter: the fitting algorithm (by default the Newton-Raphson 
    67         fitting algorithm is used) 
    68     :param stepwise_lr: set to 1 if you wish to use stepwise logistic 
    69         regression 
     65    :param fitter: the fitting algorithm (default: :obj:`LogRegFitter_Cholesky`) 
     66    :param stepwise_lr: enables stepwise feature selection (default: `False`) 
    7067    :type stepwise_lr: bool 
    71     :param add_crit: parameter for stepwise feature selection 
     68    :param add_crit: threshold for adding a feature in stepwise 
     69        selection (default: 0.2) 
    7270    :type add_crit: float 
    73     :param delete_crit: parameter for stepwise feature selection 
     71    :param delete_crit: threshold for removing a feature in stepwise 
     72        selection (default: 0.3) 
    7473    :type delete_crit: float 
    75     :param num_features: parameter for stepwise feature selection 
     74    :param num_features: number of features in stepwise selection 
     75        (default: -1, no limit) 
    7676    :type num_features: int 
    7777    :rtype: :obj:`LogRegLearner` or :obj:`LogRegClassifier` 
    9696    @deprecated_keywords({"examples": "data"}) 
    9797    def __call__(self, data, weight=0): 
    98         """Learn from the given table of data instances. 
    100         :param data: Data instances to learn from. 
     98        """Fit a model to the given data. 
     100        :param data: Data instances. 
    101101        :type data: :class:`~Orange.data.Table` 
    102         :param weight: Id of meta attribute with weights of instances 
     102        :param weight: Id of meta attribute with instance weights 
    103103        :type weight: int 
    104104        :rtype: :class:`~Orange.classification.logreg.LogRegClassifier` 
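
The revised docstring says the constructor returns either a learner or, when data is supplied, a fitted model. A minimal sketch of one way to get that behaviour in plain Python follows. `SketchLearner` and `SketchClassifier` are hypothetical names, and the "model" is only a majority-class predictor; this is not Orange's implementation.

```python
from collections import Counter


class SketchClassifier:
    """Hypothetical stand-in for a fitted model: predicts the majority class."""

    def __init__(self, majority):
        self.majority = majority

    def __call__(self, instance):
        return self.majority


class SketchLearner:
    """Hypothetical learner whose constructor may return a fitted model."""

    def __new__(cls, data=None, **kwargs):
        learner = super().__new__(cls)
        if data is None:
            # No data: Python calls __init__ on the returned learner itself.
            return learner
        # Data given: initialise manually, fit, and return the classifier.
        # The result is not an instance of cls, so __init__ is not run again.
        learner.__init__(**kwargs)
        return learner(data)

    def __init__(self, data=None, remove_singular=False):
        self.remove_singular = remove_singular

    def __call__(self, data):
        # data is assumed to be a list of (features, label) pairs.
        labels = [label for _, label in data]
        return SketchClassifier(Counter(labels).most_common(1)[0][0])
```

Calling `SketchLearner()` yields a learner that can be applied to data later, while `SketchLearner(data)` yields the fitted model directly, which matches the two return types listed under `:rtype:`.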
    685685class StepWiseFSS(Orange.classification.Learner): 
    686686  """ 
    687   Algorithm described in Hosmer and Lemeshow, 
    688   Applied Logistic Regression, 2000. 
    690   Perform stepwise logistic regression and return a list of the 
    691   most "informative" features. Each step of the algorithm is composed 
    692   of two parts. The first is backward elimination, where each already 
    693   chosen feature is tested for a significant contribution to the overall 
    694   model. If the worst among all tested features has higher significance 
    695   than is specified in :obj:`delete_crit`, the feature is removed from 
    696   the model. The second step is forward selection, which is similar to 
    697   backward elimination. It loops through all the features that are not 
    698   in the model and tests whether they contribute to the common model 
    699   with significance lower that :obj:`add_crit`. The algorithm stops when 
    700   no feature in the model is to be removed and no feature not in the 
    701   model is to be added. By setting :obj:`num_features` larger than -1, 
    702   the algorithm will stop its execution when the number of features in model 
    703   exceeds that number. 
    705   Significances are assesed via the likelihood ration chi-square 
    706   test. Normal F test is not appropriate, because errors are assumed to 
    707   follow a binomial distribution. 
    709   If :obj:`table` is specified, stepwise logistic regression implemented 
    710   in :obj:`StepWiseFSS` is performed and a list of chosen features 
    711   is returned. If :obj:`table` is not specified, an instance of 
    712   :obj:`StepWiseFSS` with all parameters set is returned and can be called 
    713   with data later. 
    715   :param table: data set. 
     687  A learning algorithm for logistic regression that implements a 
     688  stepwise feature subset selection as described in Applied Logistic 
     689  Regression (Hosmer and Lemeshow, 2000). 
     691  Each step of the algorithm is composed of two parts. The first is 
     692  backward elimination in which the least significant variable in the 
     693  model is removed if its p-value is above the prescribed threshold 
     694  :obj:`delete_crit`. The second step is forward selection in which 
     695  all variables are tested for addition to the model, and the one with 
     696  the most significant contribution is added if the corresponding 
     697  p-value is smaller than the prescribed :obj:`add_crit`. The 
     698  algorithm stops when no more variables can be added or removed. 
     700  The model can be additionally constrained by setting 
     701  :obj:`num_features` to a non-negative value. The algorithm will then 
     702  stop when the number of variables exceeds the given limit. 
     704  Significances are assessed by the likelihood ratio chi-square 
     705  test. Normal F test is not appropriate since the errors are assumed 
     706  to follow a binomial distribution. 
     708  The class constructor returns an instance of learning algorithm or, 
     709  if given training data, a list of selected variables. 
     711  :param table: training data. 
    716712  :type table: Orange.data.Table 
    718   :param add_crit: "Alpha" level to judge if variable has enough importance to 
    719        be added in the new set. (e.g. if add_crit is 0.2, 
    720        then features is added if its P is lower than 0.2). 
     714  :param add_crit: threshold for adding a variable (default: 0.2) 
    721715  :type add_crit: float 
    723   :param delete_crit: Similar to add_crit, just that it is used at backward 
    724       elimination. It should be higher than add_crit! 
     717  :param delete_crit: threshold for removing a variable 
     718      (default: 0.3); should be higher than :obj:`add_crit`. 
    725719  :type delete_crit: float 
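
The stepwise procedure described in the `StepWiseFSS` docstring can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the code in `logreg.py`: `log_likelihood` is a hypothetical callback that returns the log-likelihood of a model fitted on a given feature set, every feature is assumed to contribute a single parameter (so the likelihood ratio test has one degree of freedom), and the `num_features` check follows the docstring wording literally.

```python
import math


def lr_p_value(ll_reduced, ll_full):
    """P-value of the likelihood ratio test for one extra parameter (1 df)."""
    g = max(0.0, 2.0 * (ll_full - ll_reduced))
    # With 1 degree of freedom the chi-square survival function is erfc(sqrt(g/2)).
    return math.erfc(math.sqrt(g / 2.0))


def stepwise_select(features, log_likelihood, add_crit=0.2, delete_crit=0.3,
                    num_features=-1):
    """Return the selected features; log_likelihood is a hypothetical callback."""
    model = []
    while True:
        changed = False
        # Backward elimination: drop the least significant feature, if any.
        if model:
            current = frozenset(model)
            ll_full = log_likelihood(current)
            p_values = {f: lr_p_value(log_likelihood(current - {f}), ll_full)
                        for f in model}
            worst = max(model, key=p_values.get)
            if p_values[worst] > delete_crit:
                model.remove(worst)
                changed = True
        # Forward selection: add the most significant remaining feature, if any.
        candidates = [f for f in features if f not in model]
        if candidates:
            current = frozenset(model)
            ll_reduced = log_likelihood(current)
            p_values = {f: lr_p_value(ll_reduced, log_likelihood(current | {f}))
                        for f in candidates}
            best = min(candidates, key=p_values.get)
            if p_values[best] < add_crit:
                model.append(best)
                changed = True
        # Stop when nothing changed or the feature limit has been exceeded.
        if not changed or (num_features > -1 and len(model) > num_features):
            return model
```

Keeping `delete_crit` above `add_crit`, as the docstring recommends, matters here: a feature that was just removed has a p-value above `delete_crit`, so it cannot pass the stricter `add_crit` test in the same pass and the loop cannot oscillate on it.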