# source:orange/docs/tutorial/rst/regression.rst@11692:356c325c0efb

Revision 11692:356c325c0efb, 2.0 KB checked in by Ales Erjavec <ales.erjavec@…>, 7 months ago (diff)

Removed 'Earth' code from Orange (moved to 'orangecontrib.earth' package).

Regression
==========

.. index:: regression

From the interface point of view, regression methods in Orange are very similar to classification. Both are intended for supervised data mining and require class-labeled data. Just as in classification, regression is implemented with learners and regression models (regressors). Regression learners are objects that accept data and return regressors. Regressors are then given data items and predict the value of a continuous class:

.. literalinclude:: code/regression.py

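The learner/regressor split described above can be illustrated without Orange at all. The following is a minimal plain-Python sketch of the pattern only; ``MeanLearner`` and ``MeanRegressor`` are hypothetical names invented for this illustration and are not Orange classes, and the toy data is made up:

```python
# Plain-Python illustration of the learner -> regressor pattern.
# MeanLearner and MeanRegressor are hypothetical names, not part of Orange.


class MeanRegressor:
    """A fitted model: predicts the same value for every data item."""

    def __init__(self, mean):
        self.mean = mean

    def __call__(self, instance):
        # The instance is ignored; a real regressor would use its features.
        return self.mean


class MeanLearner:
    """A learner: accepts data and returns a regressor."""

    def __call__(self, data):
        targets = [y for _, y in data]
        return MeanRegressor(sum(targets) / len(targets))


# Toy data: (features, continuous class value) pairs, made up for illustration.
data = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0)]

learner = MeanLearner()
regressor = learner(data)        # learning: data in, regressor out
prediction = regressor((4.0,))   # prediction: data item in, number out
print(prediction)
```

The point of the split is that the same learner can be called repeatedly on different training sets (as cross-validation does), each call yielding an independent regressor.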

Handful of Regressors
---------------------

.. index::
   single: regression; tree

Let us start with regression trees. Below is an example script that builds a tree from data on housing prices and prints it out in textual form:

.. literalinclude:: code/regression-tree.py
   :lines: 3-

The script outputs the tree::

   RM<=6.941: 19.9
   RM>6.941
   |    RM<=7.437
   |    |    CRIM>7.393: 14.4
   |    |    CRIM<=7.393
   |    |    |    DIS<=1.886: 45.7
   |    |    |    DIS>1.886: 32.7
   |    RM>7.437
   |    |    TAX<=534.500: 45.9
   |    |    TAX>534.500: 21.9

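One way to read the printed tree is as a set of nested conditions, where each leaf holds the predicted value. The sketch below transcribes the tree shown above into a plain Python function; the function name ``predict_price`` and its argument names are assumptions made for this illustration, and this is not how Orange stores or evaluates trees internally:

```python
def predict_price(rm, crim, dis, tax):
    """Follow the printed tree from the root to a leaf and return its value.

    Thresholds and leaf values are copied from the textual tree output;
    the argument names mirror the feature names RM, CRIM, DIS and TAX.
    """
    if rm <= 6.941:
        return 19.9
    if rm <= 7.437:
        if crim > 7.393:
            return 14.4
        if dis <= 1.886:
            return 45.7
        return 32.7
    # rm > 7.437
    if tax <= 534.5:
        return 45.9
    return 21.9


# A data item with RM below the first threshold ends in the top leaf.
print(predict_price(rm=6.0, crim=0.1, dis=4.0, tax=300.0))
```

Only the features that appear on the path from the root to the leaf influence a prediction; for ``rm <= 6.941`` the other three values are never inspected.
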
The following script initializes a few other regressors and prints their predictions for the first five data instances of the housing price data set:

.. index::
   single: regression; linear

.. literalinclude:: code/regression-other.py
   :lines: 3-

Looks like the housing prices are not that hard to predict::

   y    lin  rf   tree
   12.7 11.3 15.3 19.1
   13.8 20.2 14.1 13.1
   19.3 20.8 20.7 23.3

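To put a number on how close the predictions are, one can compare each column with the true value ``y``. The sketch below does this in plain Python for the rows printed above only; the choice of mean absolute error is an assumption made for this illustration, not something the tutorial script computes:

```python
# Rows copied from the printed table: (y, lin, rf, tree).
rows = [
    (12.7, 11.3, 15.3, 19.1),
    (13.8, 20.2, 14.1, 13.1),
    (19.3, 20.8, 20.7, 23.3),
]
names = ["lin", "rf", "tree"]

# Mean absolute error of each regressor over the printed rows.
mae = {}
for column, name in enumerate(names, start=1):
    errors = [abs(row[0] - row[column]) for row in rows]
    mae[name] = sum(errors) / len(errors)

for name in names:
    print("%-5s %.2f" % (name, mae[name]))
```

A handful of rows is far too few to rank the methods; a proper comparison needs a testing procedure that covers the whole data set.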

Cross Validation
----------------

Just like for classification, the same evaluation module (``Orange.evaluation``) is available for regression. Its testing submodule includes procedures such as cross-validation and leave-one-out testing, and the functions in its scoring submodule assess the accuracy of the predictions collected during testing:

.. literalinclude:: code/regression-cv.py
   :lines: 3-

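For readers who want to see what such a procedure does, here is a minimal plain-Python sketch of k-fold cross-validation scored with root mean squared error. It does not use ``Orange.evaluation``; the function names, the two toy learners and the synthetic data are all assumptions made for this illustration, and Orange may split the folds and aggregate the scores differently:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_learner(train):
    """Toy learner: the returned model always predicts the training mean."""
    mean = sum(y for _, y in train) / len(train)
    return lambda x: mean


def line_learner(train):
    """Toy learner: least-squares fit of y = a + b*x on a single feature."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    b = sxy / sxx if sxx else 0.0
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def cross_validate(learner, data, folds=5):
    """Train on folds-1 parts, predict the held-out part, pool all predictions."""
    actual, predicted = [], []
    for k in range(folds):
        test = data[k::folds]  # every folds-th item, starting at k
        train = [d for i, d in enumerate(data) if i % folds != k]
        model = learner(train)
        for x, y in test:
            actual.append(y)
            predicted.append(model(x))
    return rmse(actual, predicted)


# Synthetic, exactly linear data: (x, y) pairs with y = 2x + 1.
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]

scores = {
    "mean": cross_validate(mean_learner, data),
    "line": cross_validate(line_learner, data),
}
for name in ("mean", "line"):
    print("%-5s %.2f" % (name, scores[name]))
```

Because every data item is predicted by a model that never saw it during training, the resulting error estimates how the learner behaves on new data rather than how well it memorizes the training set.
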
.. index::
   single: regression; root mean squared error

Random forest has the lowest root mean squared error::

   Learner  RMSE
   lin      4.83
   rf       3.73
   tree     5.10