.. Source: orange/docs/tutorial/rst/regression.rst @ 11090:418ccce5b803, checked in by blaz <blaz.zupan@…>

Regression
==========

.. index:: regression

From the interface point of view, regression methods in Orange are very similar to classification methods. Both are intended for supervised data mining and require class-labeled data. Just as in classification, regression is implemented with learners and regression models (regressors). Regression learners are objects that accept data and return regressors. Regressors are given data items and predict the value of a continuous class:

.. literalinclude:: code/regression.py


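The learner and regressor roles described above can be illustrated without Orange at all. The following is a minimal plain-Python sketch; ``MeanLearner``, ``MeanRegressor`` and the tiny data set are hypothetical stand-ins written for this illustration, not Orange classes or Orange data:

```python
class MeanRegressor:
    """Regression model: returns the stored mean for any data item."""

    def __init__(self, mean):
        self.mean = mean

    def __call__(self, features):
        # The features are ignored; this model always predicts the mean.
        return self.mean


class MeanLearner:
    """Learner: accepts data and returns a regressor."""

    def __call__(self, data):
        targets = [target for _, target in data]
        return MeanRegressor(sum(targets) / len(targets))


# Made-up (features, continuous class) pairs, for illustration only.
data = [((6.5,), 20.0), ((7.0,), 30.0), ((8.1,), 40.0)]

learner = MeanLearner()
regressor = learner(data)       # the learner returns a regressor
prediction = regressor((7.2,))  # the regressor predicts a continuous value
print(prediction)
```

The point of the split is that any learner can be swapped for another without changing the code that calls the resulting regressor.
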
Handful of Regressors
---------------------

.. index::
   single: regression; tree

Let us start with regression trees. The example script below builds a tree from data on housing prices and prints it in textual form:

.. literalinclude:: code/regression-tree.py
   :lines: 3-

The script outputs the tree::

   RM<=6.941: 19.9
   RM>6.941
   |    RM<=7.437
   |    |    CRIM>7.393: 14.4
   |    |    CRIM<=7.393
   |    |    |    DIS<=1.886: 45.7
   |    |    |    DIS>1.886: 32.7
   |    RM>7.437
   |    |    TAX<=534.500: 45.9
   |    |    TAX>534.500: 21.9

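One way to read the printed tree: each line tests an attribute, indentation marks nesting, and the number after the colon is the value predicted at that leaf. The sketch below hand-codes the same tree as nested conditions. The function name ``predict_price`` and its lowercase argument names are chosen here for illustration; this is not how Orange itself evaluates a tree:

```python
def predict_price(rm, crim, dis, tax):
    """Follow the printed tree from the root to a leaf and return its value."""
    if rm <= 6.941:
        return 19.9
    # From here on, RM > 6.941.
    if rm <= 7.437:
        if crim > 7.393:
            return 14.4
        # CRIM <= 7.393: the leaf depends on DIS.
        if dis <= 1.886:
            return 45.7
        return 32.7
    # RM > 7.437: the leaf depends on TAX.
    if tax <= 534.5:
        return 45.9
    return 21.9


# Made-up attribute values, for illustration only.
print(predict_price(rm=6.0, crim=0.1, dis=4.0, tax=300.0))
print(predict_price(rm=7.0, crim=0.1, dis=4.0, tax=300.0))
print(predict_price(rm=8.0, crim=0.1, dis=4.0, tax=300.0))
```
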
The following script initializes a few other regressors and prints their predictions for the first five data instances in the housing price data set:

.. index::
   single: regression; mars
   single: regression; linear

.. literalinclude:: code/regression-other.py
   :lines: 3-

It looks like the housing prices are not that hard to predict::

   y    lin  mars tree
   21.4 24.8 23.0 20.1
   15.7 14.4 19.0 17.3
   36.5 35.7 35.6 33.8

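To put a rough number on "not that hard to predict", the mean absolute error of each model can be computed from the rows printed above. This is a plain-Python sketch; the helper ``mean_absolute_error`` is written here for illustration and is not an Orange function:

```python
# Values copied from the printed predictions above.
observed = [21.4, 15.7, 36.5]
predictions = {
    "lin": [24.8, 14.4, 35.7],
    "mars": [23.0, 19.0, 35.6],
    "tree": [20.1, 17.3, 33.8],
}


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all instances."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


errors = {
    name: mean_absolute_error(observed, values)
    for name, values in predictions.items()
}
for name, error in errors.items():
    print("%-5s %.2f" % (name, error))
```

On these rows every model is off by a little under 2 on average, in the units of ``y``. A handful of instances is far too few to rank the models, which is where cross-validation comes in.
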
Cross Validation
----------------

Just like for classification, the same evaluation module (``Orange.evaluation``) is available for regression. Its testing submodule includes procedures such as cross-validation and leave-one-out testing, and functions in its scoring submodule assess accuracy from the test results:

.. literalinclude:: code/regression-cv.py
   :lines: 3-

.. index::
   single: regression; root mean squared error

`MARS <http://en.wikipedia.org/wiki/Multivariate_adaptive_regression_splines>`_ has the lowest root mean squared error::

   Learner  RMSE
   lin      4.83
   mars     3.84
   tree     5.10

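For readers who want to see what such a procedure does, here is a minimal plain-Python sketch of k-fold cross-validation scored with root mean squared error. All names (``rmse``, ``mean_learner``, ``cross_validation``) and the data are assumptions made for this illustration. The sketch does not use ``Orange.evaluation``, and Orange's own fold assignment and score aggregation may differ:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_learner(train):
    """Toy learner: returns a model that always predicts the training mean."""
    mean = sum(target for _, target in train) / len(train)
    return lambda features: mean


def cross_validation(learner, data, folds):
    """Train on folds-1 parts, predict the held-out part, pool all predictions."""
    actual, predicted = [], []
    for k in range(folds):
        # Instance i belongs to fold i % folds (no shuffling in this sketch).
        test = [item for i, item in enumerate(data) if i % folds == k]
        train = [item for i, item in enumerate(data) if i % folds != k]
        model = learner(train)
        for features, target in test:
            actual.append(target)
            predicted.append(model(features))
    return rmse(actual, predicted)


# Made-up data: the target is twice the single feature.
data = [((float(i),), 2.0 * i) for i in range(10)]
score = cross_validation(mean_learner, data, folds=5)
print("mean learner RMSE: %.2f" % score)
```

Every instance is predicted exactly once, by a model that never saw it during training, so the resulting error estimates performance on unseen data.
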