Changeset 9776:600ac31393ee in orange for Orange/regression/linear.py
Timestamp: 02/06/12 17:39:53
Branch: default
rebase_source: aeea356a746ae91cb578766fe56376f160e20257
File: 1 edited

Orange/regression/linear.py
r9725 r9776
    6     6   .. index:: regression, linear model
    7     7
    8       - .. _`Linear regression`: http://en.wikipedia.org/wiki/Linear_regression
    9       -
   10       - Example ::
   11       -
   12       -     >>> from Orange.regression import linear
   13       -     >>> table = Orange.data.Table("housing")
   14       -     >>> c = linear.LinearRegressionLearner(table)
   15       -     >>> print c
   16       -
   17       -     Variable   Coeff Est  Std Error  t-value  p
          8 + .. `Linear regression`: http://en.wikipedia.org/wiki/Linear_regression
          9 +
         10 +
         11 + `Linear regression <http://en.wikipedia.org/wiki/Linear_regression>`_ is a statistical regression method
         12 + which tries to predict a value of a continuous response (class) variable based on the values of several predictors.
         13 + The model assumes that the response variable is a linear combination of the predictors, the task of linear regression
         14 + is therefore to fit the unknown coefficients.
         15 +
         16 + To fit the regression parameters on housing data set use the following code:
         17 +
         18 + .. literalinclude:: code/linearexample.py
         19 +    :lines: 7,9,10,11
         20 +
         21 +
         22 + .. autoclass:: LinearRegressionLearner
         23 +     :members:
         24 +
         25 + .. autoclass:: LinearRegression
         26 +     :members:
         27 +
         28 + Utility functions
         29 + -----------------
         30 +
         31 + .. autofunction:: stepwise
         32 +
         33 +
         34 + ========
         35 + Examples
         36 + ========
         37 +
         38 + ==========
         39 + Prediction
         40 + ==========
         41 +
         42 + Predict values of the first 5 data instances
         43 +
         44 + .. literalinclude:: code/linearexample.py
         45 +    :lines: 13-15
         46 +
         47 + The output of this code is
         48 +
         49 + ::
         50 +
         51 +     Actual: 24.000, predicted: 30.004
         52 +     Actual: 21.600, predicted: 25.026
         53 +     Actual: 34.700, predicted: 30.568
         54 +     Actual: 33.400, predicted: 28.607
         55 +     Actual: 36.200, predicted: 27.944
         56 +
         57 + =========================
         58 + Poperties of fitted model
         59 + =========================
         60 +
         61 +
         62 + Print regression coefficients with standard errors, t-scores, p-values
         63 + and significances
         64 +
         65 + .. literalinclude:: code/linearexample.py
         66 +    :lines: 17
         67 +
         68 + The code output is
         69 +
         70 + ::
         71 +
         72 +     Variable   Coeff Est  Std Error  t-value  p
   18    73       Intercept  36.459     5.103      7.144    0.000  ***
   19    74       CRIM       0.108      0.033      3.287    0.001  **
    …     …
   31    86       LSTAT      0.525      0.051      10.347   0.000  ***
   32    87       Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 empty 1
   33       -
   34       -     >>>
   35       -
   36       -
   37       - .. autoclass:: LinearRegressionLearner
   38       -     :members:
   39       -
   40       - .. autoclass:: LinearRegression
   41       -     :members:
   42       -
   43       - Utility functions
   44       - -----------------
   45       -
   46       - .. autofunction:: stepwise
         88 +
         89 +
         90 +
         91 + ===================
         92 + Stepwise regression
         93 + ===================
         94 +
         95 + To use stepwise regression initialize learner with stepwise=True.
         96 + The upper and lower bound for significance are contolled with
         97 + add_sig and remove_sig.
         98 +
         99 + .. literalinclude:: code/linearexample.py
        100 +    :lines: 20-23,25
        101 +
        102 + As you can see from the output the nonsignificant coefficients
        103 + have been removed from the output
        104 +
        105 + ::
        106 +
        107 +     Variable   Coeff Est  Std Error  t-value  p
        108 +     Intercept  36.341     5.067      7.171    0.000  ***
        109 +     LSTAT      0.523      0.047      11.019   0.000  ***
        110 +     RM         3.802      0.406      9.356    0.000  ***
        111 +     PTRATIO    0.947      0.129      7.334    0.000  ***
        112 +     DIS        1.493      0.186      8.037    0.000  ***
        113 +     NOX        17.376     3.535      4.915    0.000  ***
        114 +     CHAS       2.719      0.854      3.183    0.002  **
        115 +     B          0.009      0.003      3.475    0.001  ***
        116 +     ZN         0.046      0.014      3.390    0.001  ***
        117 +     CRIM       0.108      0.033      3.307    0.001  **
        118 +     RAD        0.300      0.063      4.726    0.000  ***
        119 +     TAX        0.012      0.003      3.493    0.001  ***
        120 +     Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 empty 1
        121 +
   47   122
   48   123
    …     …
   57   132       from scipy import stats
   58   133   except ImportError:
   59       -     import Orange.statc as stats
        134 +     import statc as stats
   60   135
   61   136   from numpy import dot, sqrt
    …     …
   97   172           regression model. If None (default) all variables are used
   98   173       :type use_vars: list of Orange.data.variable or None
   99       -     :param stepwise: if True, _`stepwise regression`:
  100       -         http://en.wikipedia.org/wiki/Stepwise_regression
        174 +     :param stepwise: if True, `stepwise regression
        175 +         <http://en.wikipedia.org/wiki/Stepwise_regression>`_
  101   176           based on F-test is performed. The significance parameters are
  102   177           add_sig and remove_sig
    …     …
  268   343       .. attribute:: coefficients
  269   344
  270       -         list of regression coefficients. If the intercept is included
  271       -         the first item corresponds to the estimated intercept
        345 +         Regression coefficients stored in list. If the intercept is included
        346 +         the first item corresponds to the estimated intercept.
  272   347
  273   348       .. attribute:: std_error
  274   349
  275       -         list of standard errors of the coefficient estimator.
        350 +         Standard errors of the coefficient estimator, stored in list.
  276   351
  277   352       .. attribute:: t_scores
  278   353
  279       -         list of t-scores for the estimated regression coefficients
        354 +         List of t-scores for the estimated regression coefficients.
  280   355
  281   356       .. attribute:: p_vals
  282   357
  283       -         list of p-values for the null hypothesis that the regression
        358 +         List of p-values for the null hypothesis that the regression
  284   359           coefficients equal 0 based on t-scores and two sided
  285       -         alternative hypothesis
        360 +         alternative hypothesis.
  286   361
  287   362       .. attribute:: dict_model
  288   363
  289       -         dictionary of statistical properties of the model.
        364 +         Statistical properties of the model in a dictionary:
  290   365           Keys - names of the independent variables (or "Intercept")
  291   366           Values - tuples (coefficient, standard error,
    …     …
  294   369       .. attribute:: fitted
  295   370
  296       -         estimated values of the dependent variable for all instances
  297       -         from the training table
        371 +         Estimated values of the dependent variable for all instances
        372 +         from the training table.
  298   373
  299   374       .. attribute:: residuals
  300   375
  301       -         differences between estimated and actual values of the
  302       -         dependent variable for all instances from the training table
        376 +         Differences between estimated and actual values of the
        377 +         dependent variable for all instances from the training table.
  303   378
  304   379       .. attribute:: m
  305   380
  306       -         number of independent variables
        381 +         Number of independent (predictor) variables.
  307   382
  308   383       .. attribute:: n
  309   384
  310       -         number of instances
        385 +         Number of instances.
  311   386
  312   387       .. attribute:: mu_y
  313   388
  314       -         the sample mean of the dependent variable
        389 +         Sample mean of the dependent variable.
  315   390
  316   391       .. attribute:: r2
  317   392
  318       -         _`coefficient of determination`:
  319       -         http://en.wikipedia.org/wiki/Coefficient_of_determination
        393 +         `Coefficient of determination
        394 +         <http://en.wikipedia.org/wiki/Coefficient_of_determination>`_.
        395 +
  320   396
  321   397       .. attribute:: r2adj
  322   398
  323       -         adjusted coefficient of determination
        399 +         Adjusted coefficient of determination.
  324   400
  325   401       .. attribute:: sst, sse, ssr
  326   402
  327       -         total sum of squares, explained sum of squares and
  328       -         residual sum of squares respectively
        403 +         Total sum of squares, explained sum of squares and
        404 +         residual sum of squares respectively.
  329   405
  330   406       .. attribute:: std_coefficients
  331   407
  332       -         standardized regression coefficients
        408 +         Standardized regression coefficients.
  333   409
  334   410       """
    …     …
  467   543   @deprecated_keywords({"addSig": "add_sig", "removeSig": "remove_sig"})
  468   544   def stepwise(table, weight, add_sig=0.05, remove_sig=0.2):
  469       -     """ Performs _`stepwise linear regression`:
  470       -     http://en.wikipedia.org/wiki/Stepwise_regression
        545 +     """ Performs `stepwise linear regression
        546 +     <http://en.wikipedia.org/wiki/Stepwise_regression>`_:
  471   547       on table and returns the list of remaing independent variables
  472   548       which fit a significant linear regression model.coefficients
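
For orientation, the quantities named in the attribute documentation of this changeset (coefficients, std_error, t_scores, r2, r2adj, sst, sse, ssr) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in Orange/regression/linear.py: the helpers `solve` and `fit_ols` are hypothetical names, the sketch targets Python 3 rather than the Python 2 of the changeset, it always includes an intercept, and it omits p-values to avoid a dependency on scipy. Following the docstring, `sse` is taken to be the explained sum of squares and `ssr` the residual sum of squares.

```python
# Illustrative sketch only: ordinary least squares in plain Python.
# Not Orange code; solve() and fit_ols() are hypothetical helper names.
from math import sqrt


def solve(a, b):
    """Solve the square linear system a x = b (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Fit y on the predictors in rows (intercept included); return statistics."""
    X = [[1.0] + list(r) for r in rows]      # prepend the intercept column
    n, k = len(X), len(X[0])                 # k = number of coefficients
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    coefficients = solve(xtx, xty)           # normal equations X'X b = X'y

    fitted = [sum(c * x for c, x in zip(coefficients, row)) for row in X]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]  # actual minus fitted
    mu_y = sum(y) / n
    sst = sum((yi - mu_y) ** 2 for yi in y)  # total sum of squares
    ssr = sum(e ** 2 for e in residuals)     # residual sum of squares
    sse = sst - ssr                          # explained sum of squares
    r2 = sse / sst
    r2adj = 1 - (1 - r2) * (n - 1) / (n - k)

    # Standard errors: sqrt(sigma^2 * diag((X'X)^-1)); each diagonal entry is
    # obtained by solving against the matching unit vector.
    sigma2 = ssr / (n - k)
    inv_diag = [solve(xtx, [1.0 if i == j else 0.0 for i in range(k)])[j]
                for j in range(k)]
    std_error = [sqrt(sigma2 * d) for d in inv_diag]
    t_scores = [c / s for c, s in zip(coefficients, std_error)]

    return {"coefficients": coefficients, "std_error": std_error,
            "t_scores": t_scores, "fitted": fitted, "residuals": residuals,
            "mu_y": mu_y, "sst": sst, "sse": sse, "ssr": ssr,
            "r2": r2, "r2adj": r2adj}


# Usage on a tiny made-up data set (one predictor, five instances).
model = fit_ols([[1], [2], [3], [4], [5]], [2.1, 3.9, 6.2, 7.8, 10.1])
for name in ("coefficients", "std_error", "t_scores", "r2", "r2adj"):
    print(name, model[name])
```

The first entry of each per-coefficient list corresponds to the intercept, which matches the ordering described for `coefficients` in the docstring.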