source: orange/docs/widgets/rst/classify/c45.rst @ 11050:e3c4699ca155

Revision 11050:e3c4699ca155, 4.1 KB checked in by Miha Stajdohar <miha.stajdohar@…>, 16 months ago

Widget docs From HTML to Sphinx.

.. _C4.5:

C4.5 Learner
============

.. image:: ../icons/C4.5.png

C4.5 learner by Ross Quinlan

Signals
-------

Inputs:


   - Examples (ExampleTable)
      A table with training examples


Outputs:

   - Learner
      The C4.5 learning algorithm with settings as specified in the dialog.

   - Classifier
      Trained C4.5 classifier

   - C45 Tree (C45Classifier)
      Induced tree in Quinlan's original format

   - Classification Tree (TreeClassifier)
      Induced tree in Orange's native format


:code:`Classifier`, :code:`C45 Tree` and :code:`Classification Tree` are available only if examples are present on the input. Which of the latter two output signals is active is determined by the setting :obj:`Convert to orange tree structure` (see the description below).

Description
-----------

This widget provides a graphical interface to Quinlan's well-known C4.5 algorithm for constructing classification trees. Orange uses Quinlan's original code, which, due to copyright issues, must be built and linked in separately.

Orange also implements its own classification tree induction algorithm, which is comparable to Quinlan's, though the results may differ due to technical details. It is accessible in the widget :code:`Classification Tree`.

Like all widgets for classification, the C4.5 widget provides a learner and a classifier on the output. The learner is a learning algorithm with settings as specified by the user. It can be fed into widgets for testing learners, namely :code:`Test Learners`. The classifier is a classification tree built from the training examples on the input. If no examples are given, the widget outputs no classifier.

.. image:: images/C4.5.png
   :alt: C4.5 Widget

The learner can be given a name under which it will appear in, say, :code:`Test Learners`. The default name is "C4.5".

The next block of options deals with splitting. C4.5 uses gain ratio by default; to override this, check :obj:`Use information gain instead of ratio`, which is equivalent to C4.5's command line option :code:`-g`. If you enable :obj:`subsetting` (equivalent to :code:`-s`), C4.5 will merge values of multivalued discrete attributes instead of creating one branch for each value. :obj:`Probabilistic threshold for continuous attributes` (:code:`-p`) makes C4.5 compute the lower and upper boundaries for values of continuous attributes for which the number of misclassified examples would be within one standard deviation from the base error.
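
To illustrate the difference between the two splitting criteria, here is a minimal sketch in plain Python. It is not Quinlan's code and does not use Orange; the function names and the toy data are made up for this example, and the additional checks that C4.5 applies when choosing a split are left out. It only shows why gain ratio penalises attributes with many values.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting on a discrete attribute."""
    total = len(labels)
    remainder = 0.0
    for value in set(values):
        subset = [c for v, c in zip(values, labels) if v == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain divided by the entropy of the split itself."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info else 0.0


# Hypothetical toy data: "ids" is unique per example, "windy" has two values.
labels = ["yes", "yes", "no", "no"]
ids = ["a", "b", "c", "d"]
windy = ["f", "f", "t", "t"]

print(information_gain(ids, labels), gain_ratio(ids, labels))
print(information_gain(windy, labels), gain_ratio(windy, labels))
```

Both attributes separate the classes perfectly and so have the same information gain, but the gain ratio of the unique-valued attribute is lower because its split is spread over more branches.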

As for pruning, you can set the :obj:`Minimal number of examples in the leaves` (Quinlan's default is 2, but you may want to disable this for noiseless data), and the :obj:`Post prunning with confidence level`; the default confidence is 25.
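
One common reading of a minimal-leaf-size constraint can be sketched as follows. This is an assumption made for illustration, not a transcription of Quinlan's code: a candidate split is accepted only if at least two of its branches receive the required number of examples. The function name :code:`split_allowed` and its parameter are hypothetical.

```python
from collections import Counter


def split_allowed(values, min_examples=2):
    """Accept a split only if at least two branches hold min_examples or more."""
    branch_sizes = Counter(values).values()
    return sum(size >= min_examples for size in branch_sizes) >= 2


# Branch sizes 2, 2, 1: two branches are large enough.
print(split_allowed(["a", "a", "b", "b", "c"]))
# Branch sizes 3, 1, 1: only one branch is large enough.
print(split_allowed(["a", "a", "a", "b", "c"]))
# With the limit lowered to 1, the same split passes.
print(split_allowed(["a", "a", "a", "b", "c"], min_examples=1))
```

Lowering the limit to 1 effectively switches the check off, which is what you may want for noiseless data.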

Trees can be constructed iteratively, with an ever larger number of examples. If this is enabled, you can set the :obj:`Number of trials`, the :obj:`initial windows size` and the :obj:`window increment`.
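
The general idea of iterative (windowed) construction can be sketched as below. This is a simplified illustration, not the C4.5 implementation: it covers a single trial, and a trivial lookup-table learner (:code:`train_lookup`, hypothetical) stands in for the tree inducer. The names :code:`windowed_training`, :code:`initial_window` and :code:`increment` are chosen for this example; both sizes are assumed to be at least 1.

```python
import random
from collections import Counter


def train_lookup(examples):
    """Hypothetical stand-in learner: majority class per feature tuple."""
    votes = {}
    for features, label in examples:
        votes.setdefault(features, Counter())[label] += 1
    default = Counter(label for _, label in examples).most_common(1)[0][0]
    table = {f: c.most_common(1)[0][0] for f, c in votes.items()}
    return lambda features: table.get(features, default)


def windowed_training(examples, initial_window, increment, train=train_lookup, seed=0):
    """Grow the training window with misclassified examples until none remain."""
    order = list(range(len(examples)))
    random.Random(seed).shuffle(order)
    window, rest = order[:initial_window], order[initial_window:]
    while True:
        classifier = train([examples[i] for i in window])
        # Examples outside the window that the current model gets wrong.
        wrong = [i for i in rest if classifier(examples[i][0]) != examples[i][1]]
        if not wrong:
            return classifier, len(window)
        added = wrong[:increment]
        window.extend(added)
        rest = [i for i in rest if i not in added]


# Toy data: the class is the XOR of two binary features, repeated five times.
examples = [((a, b), a ^ b) for a in (0, 1) for b in (0, 1)] * 5
classifier, window_size = windowed_training(examples, initial_window=4, increment=2)
print(window_size, all(classifier(f) == y for f, y in examples))
```

Each round adds at least one misclassified example to the window, so the loop stops at the latest when the window contains all the data.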

The resulting classifier can be left in Quinlan's original structure, as returned by his underlying code, or, with :obj:`Convert to orange tree structure` enabled, converted to the structure used by Orange's tree induction algorithm. This setting decides which of the two signals that output the tree, :code:`C45 Tree` or :code:`Classification Tree`, will be active. As Orange's structure is more general and can easily accommodate all the data that a C4.5 tree needs for classification, we believe that the converted tree behaves exactly the same as the original tree, so the results should not depend on this setting. You should therefore leave it enabled, since only the converted trees can be shown in the tree displaying widgets.

When you change one or more settings, you need to push :obj:`Apply`; this will put the new learner on the output and, if training examples are given, construct a new classifier and output it as well.


Examples
--------

There are two typical uses of this widget. First, you may want to induce the tree and see what it looks like, as in the schema below.

.. image:: images/C4.5-SchemaClassifier2.png
   :alt: C4.5 - Schema with a Classifier

The second schema shows how to compare the results of the C4.5 learner with another classifier, the naive Bayesian learner.

.. image:: images/C4.5-SchemaLearner.png
   :alt: C4.5 - Schema with a Learner
