Lasso regression (lasso)

The lasso (least absolute shrinkage and selection operator) is a regularized version of least squares regression. It minimizes the sum of squared errors while also penalizing the L_1 norm (sum of absolute values) of the coefficients.

Concretely, the function that is minimized in Orange is:

\frac{1}{n}\|Xw - y\|_2^2 + \frac{\lambda}{m} \|w\|_1

where X is an n \times m data matrix (n instances, m features), y is the vector of class values, w is the vector of regression coefficients to be estimated, and \lambda is the regularization parameter.
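To see how the two terms combine, the objective can be evaluated directly. The sketch below is plain Python on a made-up toy data set; `lasso_objective` and its argument names are illustrative and not part of Orange.

```python
def lasso_objective(X, y, w, lam):
    """Evaluate (1/n)*||Xw - y||_2^2 + (lam/m)*||w||_1 for list-based inputs."""
    n = len(X)  # number of instances (rows of X)
    m = len(w)  # number of features (columns of X)
    # Residuals Xw - y, one per instance.
    residuals = [sum(x_ij * w_j for x_ij, w_j in zip(row, w)) - y_i
                 for row, y_i in zip(X, y)]
    squared_error = sum(r * r for r in residuals) / float(n)
    l1_penalty = lam / float(m) * sum(abs(w_j) for w_j in w)
    return squared_error + l1_penalty


# Toy data, invented for illustration: 3 instances, 2 features.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]

# w = [1, 2] fits the toy data exactly, so only the penalty term remains.
value = lasso_objective(X, y, [1.0, 2.0], lam=0.1)
```

With a perfect fit the squared-error term vanishes and the value is just the scaled L_1 penalty, which is why larger values of \lambda favour smaller coefficients.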

class Orange.regression.lasso.LassoRegressionLearner(lasso_lambda=0.1, max_iter=20000, eps=1e-06, n_boot=0, n_perm=0, imputer=None, continuizer=None, name='Lasso')

Bases: Orange.regression.base.BaseRegressionLearner

Fits the lasso regression model using FISTA (Fast Iterative Shrinkage-Thresholding Algorithm).

__call__(data, weight=None)
  • data – Training data.
  • weight – Weights for instances. Not implemented yet.
__init__(lasso_lambda=0.1, max_iter=20000, eps=1e-06, n_boot=0, n_perm=0, imputer=None, continuizer=None, name='Lasso')
  • lasso_lambda (float) – Regularization parameter.
  • max_iter (int) – Maximum number of iterations for the optimization method.
  • eps (float) – Stop optimization when improvements are lower than eps.
  • n_boot (int) – Number of bootstrap samples used for non-parametric estimation of standard errors.
  • n_perm (int) – Number of permutations used for non-parametric estimation of p-values.
  • name (str) – Learner name.
fista(X, y, l, lipschitz, w_init=None)

Fast Iterative Shrinkage-Thresholding Algorithm (FISTA).
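The iteration can be sketched in plain Python as follows. This is an illustration of the general FISTA scheme, not Orange's implementation: `fista_sketch` and `soft_threshold` are hypothetical names, the objective is assumed to be \frac{1}{2}\|Xw - y\|_2^2 + l\|w\|_1, and the stopping rule (largest coefficient change below `eps`) is an assumption.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward 0 by t, clipping at 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fista_sketch(X, y, l, lipschitz, w_init=None, max_iter=1000, eps=1e-9):
    """Minimize 0.5*||Xw - y||^2 + l*||w||_1 (illustrative helper)."""
    m = len(X[0])
    w = list(w_init) if w_init is not None else [0.0] * m
    z = list(w)  # extrapolated point at which the gradient is evaluated
    t = 1.0      # momentum parameter
    for _ in range(max_iter):
        # Gradient of the smooth part at z: X^T (Xz - y).
        residuals = [sum(a * b for a, b in zip(row, z)) - y_i
                     for row, y_i in zip(X, y)]
        grad = [sum(row[j] * r for row, r in zip(X, residuals))
                for j in range(m)]
        # Gradient step of size 1/lipschitz, then soft-thresholding.
        w_new = [soft_threshold(z_j - g_j / lipschitz, l / lipschitz)
                 for z_j, g_j in zip(z, grad)]
        # Momentum update that distinguishes FISTA from plain ISTA.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = [wn + (t - 1.0) / t_new * (wn - wo) for wn, wo in zip(w_new, w)]
        change = max(abs(wn - wo) for wn, wo in zip(w_new, w))
        w, t = w_new, t_new
        if change < eps:
            break
    return w


# Toy problem where X is the identity, so X^T X = I and lipschitz = 1.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
w = fista_sketch(X, y, l=1.0, lipschitz=1.0)
```

On this toy problem the solution is the soft-thresholded response, so the second coefficient is exactly 0 rather than merely small.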


Return the Lipschitz constant of \nabla f, where f(w) = \frac{1}{2}\|Xw - y\|_2^2.
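Since \nabla f(w) = X^T(Xw - y), the smallest Lipschitz constant of \nabla f is the largest eigenvalue of X^T X. One way to estimate it is power iteration; the sketch below is plain Python, `lipschitz_sketch` is a hypothetical name, and Orange may compute the constant differently.

```python
import math


def lipschitz_sketch(X, n_iter=200):
    """Estimate the largest eigenvalue of X^T X by power iteration."""
    m = len(X[0])
    v = [1.0 / math.sqrt(m)] * m  # unit-norm start vector
    eigenvalue = 0.0
    for _ in range(n_iter):
        # u = X^T (X v)
        Xv = [sum(a * b for a, b in zip(row, v)) for row in X]
        u = [sum(row[j] * r for row, r in zip(X, Xv)) for j in range(m)]
        norm = math.sqrt(sum(c * c for c in u))
        if norm == 0.0:
            return 0.0
        # With ||v|| = 1, ||X^T X v|| converges to the largest eigenvalue,
        # provided the start vector is not orthogonal to its eigenvector.
        eigenvalue = norm
        v = [c / norm for c in u]
    return eigenvalue


# Toy matrix: X^T X = diag(4, 1), so the largest eigenvalue is 4.
X = [[2.0, 0.0], [0.0, 1.0]]
L = lipschitz_sketch(X)
```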

class Orange.regression.lasso.LassoRegression(domain=None, class_var=None, coef0=None, coefficients=None, std_errors=None, p_vals=None, model=None, mu_x=None)

Bases: Orange.classification.Classifier

Lasso regression predicts the value of the response variable based on the values of independent variables.


  • coef0 – Intercept (sample mean of the response variable).
  • coefficients – Regression coefficients.
  • std_errors – Standard errors of the coefficient estimates for a fixed regularization parameter, estimated by bootstrapping.
  • p_vals – List of p-values for the null hypotheses that the regression coefficients equal 0, based on a non-parametric permutation test.
  • model – Dictionary with the statistical properties of the model. Keys are the names of the independent variables; values are tuples (coefficient, standard error, p-value).
  • mu_x – Sample mean of the independent variables.

__call__(instance, result_type=0)
Parameters: instance – Data instance for which the value of the response variable will be predicted.

Pretty-prints a lasso regression model, i.e. the estimated regression coefficients with their standard errors and significances. Standard errors are obtained by bootstrapping and significances by a permutation test.

Parameters: skip_zero (bool) – If True, variables with estimated coefficient equal to 0 are omitted.

Utility functions


Generate a bootstrap sample of a given data set.

Parameters: data – The original data sample.
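Bootstrapping can be sketched as follows. A bootstrap sample draws as many rows as the original data, uniformly at random and with replacement; the spread of an estimate across many such samples then serves as its standard error. The helper names below are hypothetical, the data is a made-up list of (feature, response) pairs, and the mean response stands in for the statistic of interest (in the learner it would be a refitted coefficient).

```python
import math
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows uniformly at random, with replacement."""
    n = len(rows)
    return [rows[rng.randrange(n)] for _ in range(n)]


def bootstrap_std_error(rows, estimator, n_boot, rng):
    """Sample standard deviation of `estimator` over n_boot bootstrap samples."""
    estimates = [estimator(bootstrap_sample(rows, rng)) for _ in range(n_boot)]
    mean = sum(estimates) / float(n_boot)
    variance = sum((e - mean) ** 2 for e in estimates) / float(n_boot - 1)
    return math.sqrt(variance)


def mean_response(rows):
    """Stand-in statistic: the mean of the response column."""
    return sum(response for _, response in rows) / float(len(rows))


rng = random.Random(0)  # fixed seed so the run is repeatable
rows = [(float(x), 2.0 * x + 1.0) for x in range(10)]  # toy data
sample = bootstrap_sample(rows, rng)
se = bootstrap_std_error(rows, mean_response, n_boot=200, rng=rng)
```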

Permute values of the class (response) variable. This makes the response independent of the independent variables while preserving the distribution of the response variable.

Parameters: data – Original data.
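A permutation test built on this idea can be sketched as follows: shuffle the responses, recompute a statistic, and report how often the permuted statistic is at least as extreme as the observed one. The function names are hypothetical, the data is a made-up list of (feature, response) pairs, and the covariance stands in for a fitted coefficient; how Orange computes its p-values in detail is not shown here.

```python
import random


def permute_responses_sketch(rows, rng):
    """Return (feature, response) pairs with the responses shuffled."""
    responses = [response for _, response in rows]
    rng.shuffle(responses)
    return [(feature, response)
            for (feature, _), response in zip(rows, responses)]


def permutation_p_value(rows, statistic, n_perm, rng):
    """Fraction of permuted data sets whose |statistic| reaches the observed one."""
    observed = abs(statistic(rows))
    hits = sum(1 for _ in range(n_perm)
               if abs(statistic(permute_responses_sketch(rows, rng))) >= observed)
    return hits / float(n_perm)


def covariance(rows):
    """Stand-in statistic: covariance between feature and response."""
    n = float(len(rows))
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in rows) / n


rng = random.Random(0)  # fixed seed so the run is repeatable
rows = [(float(x), 2.0 * x + 1.0) for x in range(10)]  # toy data
permuted = permute_responses_sketch(rows, rng)
p = permutation_p_value(rows, covariance, n_perm=100, rng=rng)
```

The permuted data keeps every feature value and every response value; only their pairing changes, which is what preserves the distribution of the response.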


To fit the regression parameters on the housing data set, use the following code:

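import Orange

housing = Orange.data.Table("housing")
learner = Orange.regression.lasso.LassoRegressionLearner(
    lasso_lambda=1, n_boot=100, n_perm=100)
classifier = learner(housing)  # fit the model on the training data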

To predict values of the response for the first five instances:

for ins in housing[:5]:
    print "Actual: %3.2f, predicted: %3.2f" % (
        ins.get_class(), classifier(ins))

Actual: 24.00, predicted: 30.45
Actual: 21.60, predicted: 25.60
Actual: 34.70, predicted: 31.48
Actual: 33.40, predicted: 30.18
Actual: 36.20, predicted: 29.59

To see the fitted regression coefficients, print the model:

print classifier


  Variable  Coeff Est  Std Error          p
 Intercept     22.533
      CRIM     -0.023      0.024      0.050     .
      CHAS      1.970      1.331      0.040     *
       NOX     -4.226      2.944      0.010     *
        RM      4.270      0.934      0.000   ***
       DIS     -0.373      0.170      0.010     *
   PTRATIO     -0.798      0.117      0.000   ***
         B      0.007      0.003      0.020     *
     LSTAT     -0.519      0.102      0.000   ***
Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1 empty 1

For 5 variables the regression coefficient equals 0:

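Note that some of the regression coefficients are exactly 0: the L_1 penalty shrinks them all the way to zero, which is how the lasso performs variable selection.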
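The significance markers in the printed table can be read with a small helper. The sketch below assumes strict cut-offs (p < 0.001, p < 0.01, and so on), which is consistent with the rows shown above, where p = 0.010 is marked `*` and p = 0.050 is marked `.`; `significance_code` is a hypothetical name and not part of Orange.

```python
def significance_code(p):
    """Map a p-value to its table marker, assuming strict cut-offs."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    if p < 0.1:
        return "."
    return ""  # "empty" in the legend


# p-values taken from the RM, NOX, CHAS and CRIM rows of the table.
codes = [significance_code(p) for p in (0.0, 0.01, 0.04, 0.05)]
```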