.. Revision 11090:418ccce5b803, 2.1 KB, checked in by blaz <blaz.zupan@…>,
   15 months ago: Tutorial regression documentation, replaced reference to a file.

Regression
==========

.. index:: regression

From the interface point of view, regression methods in Orange are very similar to classification. Both are intended for supervised data mining and require class-labeled data. Just like in classification, regression is implemented with learners and regression models (regressors). Regression learners are objects that accept data and return regressors. Regression models are given data instances and predict the value of a continuous class:

.. literalinclude:: code/regression.py
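
The learner/model protocol described above can be illustrated with a small stand-alone sketch in plain Python (illustrative names only, not the actual Orange API): a trivial learner that fits the mean of the class values and returns a model that predicts that mean.

```python
# Sketch of the learner/regressor protocol described above.
# MeanLearner and MeanRegressor are illustrative names, not Orange classes.

class MeanRegressor:
    """A 'regressor': given a data instance, predicts a continuous value."""
    def __init__(self, mean):
        self.mean = mean

    def __call__(self, instance):
        # This toy model ignores the features and always predicts the mean.
        return self.mean


class MeanLearner:
    """A 'learner': accepts data and returns a regressor."""
    def __call__(self, data):
        # data is a list of (features, class_value) pairs
        mean = sum(y for _, y in data) / len(data)
        return MeanRegressor(mean)


data = [((1.0, 2.0), 10.0), ((3.0, 4.0), 20.0), ((5.0, 6.0), 30.0)]
model = MeanLearner()(data)   # learner + data  ->  regressor
print(model((7.0, 8.0)))      # regressor + instance  ->  prediction
```

The point of the split is that a learner can be configured once and applied to many data sets, while each returned model is a self-contained predictor.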

Handful of Regressors
---------------------

.. index::
   single: regression; tree

Let us start with regression trees. Below is an example script that builds a tree from the housing price data and prints it out in textual form:

.. literalinclude:: code/regressiontree.py
   :lines: 3

The script outputs the tree::

   RM<=6.941: 19.9
   RM>6.941
    RM<=7.437
     CRIM>7.393: 14.4
     CRIM<=7.393
      DIS<=1.886: 45.7
      DIS>1.886: 32.7
    RM>7.437
     TAX<=534.500: 45.9
     TAX>534.500: 21.9
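
In the textual output, each indentation level is one split deeper and the number after a colon is the predicted class value at that leaf. To make this concrete, here is the same tree transcribed by hand as a plain Python function (a transcription for illustration, not Orange code):

```python
def predict_price(RM, CRIM, DIS, TAX):
    """Hand-transcribed traversal of the regression tree printed above."""
    if RM <= 6.941:
        return 19.9                              # leftmost leaf
    if RM <= 7.437:
        if CRIM > 7.393:
            return 14.4
        return 45.7 if DIS <= 1.886 else 32.7
    return 45.9 if TAX <= 534.500 else 21.9

# An instance with few rooms ends up in the first leaf:
print(predict_price(RM=6.5, CRIM=0.2, DIS=4.0, TAX=300))  # 19.9
```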

Following is the initialization of a few other regressors and their predictions for the first five data instances of the housing price data set:

.. index::
   single: regression; mars
   single: regression; linear

.. literalinclude:: code/regressionother.py
   :lines: 3

Looks like the housing prices are not that hard to predict::

45  

46  y lin mars tree 

47  21.4 24.8 23.0 20.1 

48  15.7 14.4 19.0 17.3 

49  36.5 35.7 35.6 33.8 

Cross Validation
----------------

Just like for classification, the same evaluation module (``Orange.evaluation``) is available for regression. Its testing submodule includes procedures such as cross-validation, leave-one-out testing and similar, and functions in the scoring submodule can assess the accuracy from the testing results:

.. literalinclude:: code/regressioncv.py
   :lines: 3
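
Cross-validation itself is a simple resampling scheme: the data is split into k folds, and each fold takes one turn as the test set while the remaining folds are used for training. The index bookkeeping can be sketched in plain Python (no shuffling, for brevity; an illustration, not the Orange implementation):

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k consecutive folds and yield
    (train, test) index lists; each fold is tested exactly once."""
    base, extra = divmod(n, k)
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = [j for j in range(n) if j < start or j >= start + size]
        yield train, test
        start += size

for train, test in kfold_indices(10, 5):
    print(test)  # every index appears in exactly one test fold
```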

.. index::
   single: regression; root mean squared error

`MARS <http://en.wikipedia.org/wiki/Multivariate_adaptive_regression_splines>`_ has the lowest root mean squared error::

   Learner  RMSE
   lin      4.83
   mars     3.84
   tree     5.10
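
The score itself is easy to state: root mean squared error is the square root of the average squared difference between the true and predicted class values. A stand-alone computation in plain Python (illustration only; in the tutorial the score is computed by Orange's scoring submodule):

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Toy example with made-up true values and predictions:
print(rmse([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0]))
```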
