Index: Orange/classification/logreg.py
===================================================================
--- Orange/classification/logreg.py (revision 10246)
+++ Orange/classification/logreg.py (revision 10346)
@@ -8,6 +8,5 @@
def dump(classifier):
- """ Return a formatted string of all major features in logistic regression
- classifier.
+ """ Return a formatted string describing the logistic regression model
:param classifier: logistic regression classifier.
@@ -53,25 +52,26 @@
""" Logistic regression learner.
- If data instances are provided to
- the constructor, the learning algorithm is called and the resulting
- classifier is returned instead of the learner.
-
- :param data: data table with either discrete or continuous features
+ Returns either a learning algorithm (instance of
+ :obj:`LogRegLearner`) or, if data is provided, a fitted model
+ (instance of :obj:`LogRegClassifier`).
+
+ :param data: data table; it may contain discrete and continuous features
:type data: Orange.data.Table
:param weight_id: the ID of the weight meta attribute
:type weight_id: int
- :param remove_singular: set to 1 if you want automatic removal of
- disturbing features, such as constants and singularities
+ :param remove_singular: automated removal of constant
+ features and singularities (default: `False`)
:type remove_singular: bool
- :param fitter: the fitting algorithm (by default the Newton-Raphson
- fitting algorithm is used)
- :param stepwise_lr: set to 1 if you wish to use stepwise logistic
- regression
+ :param fitter: the fitting algorithm (default: :obj:`LogRegFitter_Cholesky`)
+ :param stepwise_lr: enables stepwise feature selection (default: `False`)
:type stepwise_lr: bool
- :param add_crit: parameter for stepwise feature selection
+ :param add_crit: threshold for adding a feature in stepwise
+ selection (default: 0.2)
:type add_crit: float
- :param delete_crit: parameter for stepwise feature selection
+ :param delete_crit: threshold for removing a feature in stepwise
+ selection (default: 0.3)
:type delete_crit: float
- :param num_features: parameter for stepwise feature selection
+ :param num_features: the maximum number of features in stepwise
+ selection (default: -1, no limit)
:type num_features: int
:rtype: :obj:`LogRegLearner` or :obj:`LogRegClassifier`
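The constructor convention described above (return the learner itself or, when data is given, fit immediately and return the model) can be sketched in plain Python. `ToyLearner` and its members are hypothetical stand-ins for illustration, not Orange classes:

```python
class ToyLearner(object):
    # Sketch of the learner/classifier convention: constructing the
    # learner with data fits immediately and returns the model instead
    # of the learner object itself.
    def __new__(cls, data=None, **kwargs):
        self = object.__new__(cls)
        if data is not None:
            self.__init__(**kwargs)   # configure the learner, then
            return self(data)         # fit at once and return a model
        return self                   # no data: return the learner

    def __init__(self, data=None, remove_singular=False):
        self.remove_singular = remove_singular

    def __call__(self, data):
        # Stand-in for the actual fitting step.
        return {"fitted_on": len(data)}
```

Note that returning a non-instance from `__new__` skips `__init__`, which is what allows the same constructor to produce either object.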
@@ -96,9 +96,9 @@
@deprecated_keywords({"examples": "data"})
def __call__(self, data, weight=0):
- """Learn from the given table of data instances.
-
- :param data: Data instances to learn from.
+ """Fit a model to the given data.
+
+ :param data: Data instances.
:type data: :class:`~Orange.data.Table`
- :param weight: Id of meta attribute with weights of instances
+ :param weight: Id of meta attribute with instance weights
:type weight: int
:rtype: :class:`~Orange.classification.logreg.LogRegClassifier`
@@ -685,42 +685,36 @@
class StepWiseFSS(Orange.classification.Learner):
"""
- Algorithm described in Hosmer and Lemeshow,
- Applied Logistic Regression, 2000.
-
- Perform stepwise logistic regression and return a list of the
- most "informative" features. Each step of the algorithm is composed
- of two parts. The first is backward elimination, where each already
- chosen feature is tested for a significant contribution to the overall
- model. If the worst among all tested features has higher significance
- than is specified in :obj:`delete_crit`, the feature is removed from
- the model. The second step is forward selection, which is similar to
- backward elimination. It loops through all the features that are not
- in the model and tests whether they contribute to the common model
- with significance lower that :obj:`add_crit`. The algorithm stops when
- no feature in the model is to be removed and no feature not in the
- model is to be added. By setting :obj:`num_features` larger than -1,
- the algorithm will stop its execution when the number of features in model
- exceeds that number.
-
- Significances are assesed via the likelihood ration chi-square
- test. Normal F test is not appropriate, because errors are assumed to
- follow a binomial distribution.
-
- If :obj:`table` is specified, stepwise logistic regression implemented
- in :obj:`StepWiseFSS` is performed and a list of chosen features
- is returned. If :obj:`table` is not specified, an instance of
- :obj:`StepWiseFSS` with all parameters set is returned and can be called
- with data later.
-
- :param table: data set.
+ A learning algorithm for logistic regression that implements
+ stepwise feature subset selection as described in Applied Logistic
+ Regression (Hosmer and Lemeshow, 2000).
+
+ Each step of the algorithm is composed of two parts. The first is
+ backward elimination in which the least significant variable in the
+ model is removed if its p-value is above the prescribed threshold
+ :obj:`delete_crit`. The second step is forward selection in which
+ all variables are tested for addition to the model, and the one with
+ the most significant contribution is added if the corresponding
+ p-value is smaller than the prescribed :obj:`add_crit`. The
+ algorithm stops when no more variables can be added or removed.
+
+ The model can be additionally constrained by setting
+ :obj:`num_features` to a nonnegative value. The algorithm will then
+ stop once the number of variables reaches the given limit.
+
+ Significances are assessed by the likelihood ratio chi-square
+ test. The normal F test is not appropriate since the errors are
+ assumed to follow a binomial distribution.
+
+ The class constructor returns an instance of the learning algorithm
+ or, if given training data, a list of selected variables.
+
+ :param table: training data.
:type table: Orange.data.Table
- :param add_crit: "Alpha" level to judge if variable has enough importance to
- be added in the new set. (e.g. if add_crit is 0.2,
- then features is added if its P is lower than 0.2).
+ :param add_crit: threshold for adding a variable (default: 0.2)
:type add_crit: float
- :param delete_crit: Similar to add_crit, just that it is used at backward
- elimination. It should be higher than add_crit!
+ :param delete_crit: threshold for removing a variable
+ (default: 0.3); should be higher than :obj:`add_crit`.
:type delete_crit: float
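The likelihood-ratio test that drives the stepwise criteria above can be sketched as follows. This is a generic stdlib-only illustration with made-up log-likelihood values, not Orange's implementation; it uses the fact that for one added variable the statistic follows a chi-square distribution with one degree of freedom, whose survival function is `erfc(sqrt(x/2))`:

```python
import math

def chi2_sf_1df(x):
    # Survival function of the chi-square distribution with 1 degree
    # of freedom: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

def lr_test(loglik_small, loglik_big):
    # Likelihood-ratio statistic: twice the gain in log-likelihood
    # from adding a single variable to the model.
    g = 2.0 * (loglik_big - loglik_small)
    return chi2_sf_1df(g)

# A variable is added when its p-value falls below add_crit (0.2 by
# default) and removed when it rises above delete_crit (0.3 by default).
# The log-likelihoods below are arbitrary illustrative numbers.
p = lr_test(-45.2, -41.8)
```

This also shows why `delete_crit` should exceed `add_crit`: a gap between the two thresholds prevents a variable from being added and removed in an endless cycle.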
Index: docs/reference/rst/Orange.classification.logreg.rst
===================================================================
--- docs/reference/rst/Orange.classification.logreg.rst (revision 10246)
+++ docs/reference/rst/Orange.classification.logreg.rst (revision 10346)
@@ -9,11 +9,9 @@
********************************
-`Logistic regression `_
-is a statistical classification methods that fits data to a logistic
-function. Orange's implementation of algorithm
-can handle various anomalies in features, such as constant variables and
-singularities, that could make direct fitting of logistic regression almost
-impossible. Stepwise logistic regression, which iteratively selects the most
-informative features, is also supported.
+`Logistic regression
+`_ is a statistical
+classification method that fits data to a logistic function. Orange
+provides various enhancements of the method, such as stepwise selection
+of variables and handling of constant variables and singularities.
.. autoclass:: LogRegLearner
@@ -44,20 +42,22 @@
that beta coefficients differ from 0.0. The probability is
computed from squared Wald Z statistics that is distributed with
- Chi-Square distribution.
+ chi-squared distribution.
.. attribute :: likelihood
- The probability of the sample (ie. learning examples) observed on
- the basis of the derived model, as a function of the regression
- parameters.
+ The likelihood of the sample (i.e. learning data) given the
+ fitted model.
.. attribute :: fit_status
- Tells how the model fitting ended - either regularly
- (:obj:`LogRegFitter.OK`), or it was interrupted due to one of beta
- coefficients escaping towards infinity (:obj:`LogRegFitter.Infinity`)
- or since the values didn't converge (:obj:`LogRegFitter.Divergence`). The
- value tells about the classifier's "reliability"; the classifier
- itself is useful in either case.
+ Tells how the model fitting ended: either regularly
+ (:obj:`LogRegFitter.OK`), or it was interrupted because one of
+ the beta coefficients escaped towards infinity
+ (:obj:`LogRegFitter.Infinity`) or because the values did not
+ converge (:obj:`LogRegFitter.Divergence`).
+
+ Although the model is functional in all cases, it is
+ recommended to inspect the coefficients of the model
+ if the fitting did not end normally.
.. method:: __call__(instance, result_type)
@@ -78,54 +78,53 @@
.. class:: LogRegFitter
- :obj:`LogRegFitter` is the abstract base class for logistic fitters. It
- defines the form of call operator and the constants denoting its
- (un)success:
-
- .. attribute:: OK
-
- Fitter succeeded to converge to the optimal fit.
-
- .. attribute:: Infinity
-
- Fitter failed due to one or more beta coefficients escaping towards infinity.
-
- .. attribute:: Divergence
-
- Beta coefficients failed to converge, but none of beta coefficients escaped.
-
- .. attribute:: Constant
-
- There is a constant attribute that causes the matrix to be singular.
-
- .. attribute:: Singularity
-
- The matrix is singular.
+ :obj:`LogRegFitter` is the abstract base class for logistic
+ fitters. Fitters can be called with a data table and return a
+ vector of coefficients and the corresponding statistics, or a
+ status signifying an error. The possible statuses are:
+
+ .. attribute:: OK
+
+ Optimization converged.
+
+ .. attribute:: Infinity
+
+ Optimization failed due to one or more beta coefficients
+ escaping towards infinity.
+
+ .. attribute:: Divergence
+
+ Beta coefficients failed to converge, but without any of the
+ beta coefficients escaping toward infinity.
+
+ .. attribute:: Constant
+
+ The data is singular due to a constant variable.
+
+ .. attribute:: Singularity
+
+ The data is singular.
.. method:: __call__(data, weight_id)
- Performs the fitting. There can be two different cases: either
- the fitting succeeded to find a set of beta coefficients (although
- possibly with difficulties) or the fitting failed altogether. The
- two cases return different results.
-
- `(status, beta, beta_se, likelihood)`
- The fitter managed to fit the model. The first element of
- the tuple, result, tells about the problems occurred; it can
- be either :obj:`OK`, :obj:`Infinity` or :obj:`Divergence`. In
- the latter cases, returned values may still be useful for
- making predictions, but it's recommended that you inspect
- the coefficients and their errors and make your decision
- whether to use the model or not.
-
- `(status, attribute)`
- The fitter failed and the returned attribute is responsible
- for it. The type of failure is reported in status, which
- can be either :obj:`Constant` or :obj:`Singularity`.
-
- The proper way of calling the fitter is to expect and handle all
- the situations described. For instance, if fitter is an instance
- of some fitter and examples contain a set of suitable examples,
- a script should look like this::
+ Fit the model and return a tuple with the fitted values and
+ the corresponding statistics or an error indicator. The two
+ cases differ by the tuple length and the status (the first
+ tuple element).
+
+ ``(status, beta, beta_se, likelihood)``
+ Fitting succeeded. The first element, ``status``, is either
+ :obj:`OK`, :obj:`Infinity` or :obj:`Divergence`. In the latter
+ cases, returned values may still be useful for making
+ predictions, but it is recommended to inspect the
+ coefficients and their errors and decide whether to use
+ the model or not.
+
+ ``(status, variable)``
+ The fitter failed due to the indicated
+ ``variable``. ``status`` is either :obj:`Constant` or
+ :obj:`Singularity`.
+
+ The proper way of calling the fitter is to handle both scenarios::
res = fitter(examples)
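The two return shapes described above can be handled as in the following stdlib-only sketch; the status constants are plain strings standing in for the actual :obj:`LogRegFitter` constants, and `handle_fit` is a hypothetical helper, not part of Orange:

```python
# Stand-ins for the LogRegFitter status constants (OK, Infinity,
# Divergence, Constant, Singularity); plain strings for illustration.
OK, INFINITY, DIVERGENCE = "OK", "Infinity", "Divergence"
CONSTANT, SINGULARITY = "Constant", "Singularity"

def handle_fit(res):
    status = res[0]
    if status in (OK, INFINITY, DIVERGENCE):
        # Success (possibly with numeric trouble): unpack the
        # coefficients, their standard errors and the likelihood.
        status, beta, beta_se, likelihood = res
        return beta
    # Failure: the second element names the offending variable.
    status, variable = res
    raise ValueError("%s caused by %s" % (status, variable))
```

Dispatching on the tuple length would work as well; checking the status first, as here, mirrors the documentation's emphasis on inspecting it.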
@@ -141,8 +140,8 @@
The sole fitter available at the
- moment. It is a C++ translation of `Alan Miller's logistic regression
- code `_. It uses Newton-Raphson
+ moment. This is a C++ translation of `Alan Miller's logistic regression
+ code `_ that uses Newton-Raphson
algorithm to iteratively minimize least squares error computed from
- learning examples.
+ training data.
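As a rough illustration of the Newton-Raphson iteration the fitter relies on, here is a toy one-coefficient logistic regression in plain Python; a sketch only, with made-up data, not the C++ fitter referenced above:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def newton_raphson_logreg(xs, ys, iters=25):
    # One-coefficient logistic regression fitted by Newton-Raphson:
    # repeatedly move beta by (gradient / curvature) of the
    # log-likelihood until the update is negligible.
    beta = 0.0
    for _ in range(iters):
        grad = sum((y - sigmoid(beta * x)) * x for x, y in zip(xs, ys))
        hess = sum(sigmoid(beta * x) * (1.0 - sigmoid(beta * x)) * x * x
                   for x in xs)
        if hess == 0.0:
            break  # flat curvature (e.g. constant feature): no update
        beta += grad / hess
    return beta

# Toy data: positive x tends to go with class 1, so beta comes out positive.
beta = newton_raphson_logreg([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0],
                             [0, 0, 1, 0, 1, 1])
```

With perfectly separable data the coefficient grows without bound instead of converging, which is exactly the :obj:`Infinity` condition the fitter reports.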
@@ -158,6 +157,5 @@

-The first example shows a very simple induction of a logistic regression
-classifier (:download:`logregrun.py `).
+The first example shows a straightforward use of logistic regression (:download:`logregrun.py `).
.. literalinclude:: code/logregrun.py
@@ -210,5 +208,5 @@
If :obj:`remove_singular` is set to 0, inducing a logistic regression
-classifier would return an error::
+classifier returns an error::
Traceback (most recent call last):
@@ -221,5 +219,5 @@
orange.KernelException: 'orange.LogRegLearner': singularity in workclass=Never-worked
-We can see that the attribute workclass is causing a singularity.
+The variable which causes the singularity is ``workclass``.
The example below shows how the use of stepwise logistic regression can help to