source: orange/docs/widgets/rst/classify/classificationtree.rst @ 11359:8d54e79aa135

Revision 11359:8d54e79aa135, 4.0 KB checked in by Ales Erjavec <ales.erjavec@…>, 14 months ago (diff)

Cleanup of 'Widget catalog' documentation.

Fixed rst text formating, replaced dead hardcoded reference links (now using
:ref:), etc.

.. _Classification Tree:

Classification Tree Learner
===========================

.. image:: ../icons/ClassificationTree.png

Classification Tree Learner

Signals
-------

Inputs:

   - Examples (ExampleTable)
      A table with training examples

Outputs:

   - Learner
      The classification tree learning algorithm with settings as specified in
      the dialog.

   - Classification Tree
      Trained classifier (a subtype of Classifier)

Signal :code:`Classification Tree` sends data only if the learning data
(signal :code:`Examples`) is present.

Description
-----------

This widget provides a graphical interface to the classification tree learning
algorithm.

As with all classification widgets, this one provides both a learner and a
classifier on its output. The learner is a learning algorithm with the
settings specified by the user. It can be fed into widgets for testing
learners, for instance :ref:`Test Learners`. The classifier is a
Classification Tree Classifier (a subtype of a general classifier), built
from the training examples on the input. If no examples are given, there is
no classifier on the output.
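The learner/classifier split described above can be illustrated with a
minimal plain-Python sketch (hypothetical class names, not Orange's API):
a learner bundles the settings, and calling it on training data yields a
classifier.

```python
# Hypothetical names illustrating the learner/classifier pattern, with a
# trivial "induction" step (majority class) standing in for tree building.
class TreeStubLearner:
    def __init__(self, name="Classification Tree"):
        self.name = name  # the name shown in, e.g., Test Learners

    def __call__(self, labels):
        # Stand-in for tree induction: predict the majority class.
        majority = max(set(labels), key=labels.count)
        return TreeStubClassifier(majority)


class TreeStubClassifier:
    def __init__(self, majority):
        self.majority = majority

    def __call__(self, example):
        return self.majority


learner = TreeStubLearner()            # what the Learner output carries
classifier = learner(["a", "a", "b"])  # what Classification Tree carries
```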

.. image:: images/ClassificationTree.png
   :alt: Classification Tree Widget

The learner can be given a name under which it will appear in, say,
:ref:`Test Learners`. The default name is "Classification Tree".

The first block of options deals with the :obj:`Attribute selection criterion`,
where you can choose between the information gain, gain ratio, gini index and
ReliefF. For the latter, it is possible to :obj:`Limit the number of reference
examples` (more examples give more accuracy and less speed) and the
:obj:`Number of neighbours` considered in the estimation.
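The first three criteria can be made concrete with a small, self-contained
sketch (plain Python, not Orange's implementation; ReliefF is omitted here
because it requires a nearest-neighbour search over the reference examples):

```python
import math


def entropy(counts):
    """Shannon entropy of a class distribution given as counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def information_gain(parent, children):
    """parent: class counts before the split; children: class counts in
    each branch after the split."""
    n = sum(parent)
    remainder = sum(sum(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - remainder


def gain_ratio(parent, children):
    """Information gain normalized by the split's own entropy, which
    penalizes attributes with many values."""
    split_info = entropy([sum(ch) for ch in children])
    return information_gain(parent, children) / split_info if split_info else 0.0


def gini(counts):
    """Gini impurity of a class distribution."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# A toy node with 8 positive and 8 negative examples, split into two branches:
parent = [8, 8]
children = [[6, 2], [2, 6]]
ig = information_gain(parent, children)   # about 0.189 bits
gr = gain_ratio(parent, children)         # same here, since split_info = 1
g = gini(parent)                          # 0.5, the maximum for two classes
```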

If :code:`Binarization` is checked, the values of multivalued attributes
are split into two groups (based on the statistics in the particular node)
to yield a binary tree. Binarization gets rid of the usual measures'
bias towards attributes with more values and is generally recommended.
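One way to picture binarization is as a search over ways of grouping the
attribute's values into two sets, keeping the grouping with the lowest
impurity. A sketch under that reading (plain Python with illustrative names;
an exhaustive search, with no claim that Orange uses this exact procedure):

```python
from itertools import combinations


def gini(counts):
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts) if total else 0.0


def best_binary_split(dist):
    """dist maps each attribute value to its class counts in this node,
    e.g. {'a': [3, 1], 'b': [0, 4], 'c': [2, 2]}. Returns the grouping of
    values into two sets with the lowest weighted Gini impurity."""
    values = list(dist)
    n = sum(sum(c) for c in dist.values())

    def merged(group):
        # Sum class counts over all values placed in one branch.
        return [sum(col) for col in zip(*(dist[v] for v in group))]

    best = None
    for r in range(1, len(values) // 2 + 1):
        for left in combinations(values, r):
            right = [v for v in values if v not in left]
            score = sum(sum(g) / n * gini(g)
                        for g in (merged(left), merged(right)))
            if best is None or score < best[0]:
                best = (score, set(left), set(right))
    return best


score, left, right = best_binary_split({'a': [3, 1], 'b': [0, 4], 'c': [2, 2]})
# Value 'b' is pure (all of one class), so grouping it alone wins.
```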

Pruning during induction can be based on the :obj:`Minimal number of
instances in leaves`; if checked, the algorithm will never construct a split
that would put fewer than the specified number of training examples into any
of the branches. You can also forbid the algorithm from splitting nodes with
fewer than the given number of instances (:obj:`Stop splitting nodes with
less instances than`) or nodes with a large enough majority class
(:obj:`Stop splitting nodes with a majority class of (%)`).
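The two stopping rules can be sketched as a single pre-pruning check
(the parameter names here are illustrative, not the widget's internals):

```python
def should_stop(class_counts, min_instances=2, max_majority=95.0):
    """Decide whether to stop splitting a node, mirroring the two
    criteria described above: too few examples in the node, or a
    majority class that already covers max_majority percent of them."""
    n = sum(class_counts)
    if n < min_instances:
        return True
    return 100.0 * max(class_counts) / n >= max_majority


should_stop([1, 0])    # True: only one example in the node
should_stop([97, 3])   # True: majority class at 97% >= 95%
should_stop([60, 40])  # False: keep splitting
```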

During induction, the algorithm can produce a tree in which entire subtrees
predict the same class, but with different probabilities. This can increase
probability-based measures of classifier quality, like the Brier score
or AUC, but such trees tend to be much larger and more difficult to grasp.
To avoid this, check :obj:`Recursively merge the leaves with same
majority class`. The widget also supports :obj:`pruning with m-estimate`.

After changing one or more settings, you need to push :obj:`Apply`, which
will put the new learner on the output and, if the training examples are
given, construct a new classifier and output it as well.

The tree can deal with missing data. Orange's tree learner actually
supports quite a few methods for handling it, but when used from the canvas,
it effectively splits the example into multiple examples with different
weights. If you had data with 25% males and 75% females, then when the
gender is unknown, the example splits into two, a male and a female,
with weights 0.25 and 0.75, respectively. This goes for both learning
and classification.
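The weight-splitting described above amounts to simple arithmetic; a sketch
with hypothetical names (not Orange's internal code):

```python
def split_unknown(example, attr, value_distribution, weight=1.0):
    """Replace an example with an unknown value of `attr` by one weighted
    copy per possible value, with weights proportional to each value's
    frequency in the training data."""
    total = sum(value_distribution.values())
    return [
        (dict(example, **{attr: v}), weight * count / total)
        for v, count in value_distribution.items()
    ]


# The gender example from the text: 25% male, 75% female.
copies = split_unknown({'gender': None, 'age': 30}, 'gender',
                       {'male': 25, 'female': 75})
# -> a male copy with weight 0.25 and a female copy with weight 0.75
```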

Examples
--------

There are two typical uses of this widget. First, you may want to induce
the model and check what it looks like. You can do this with the schema
below; to learn more about it, see the documentation on
:ref:`Classification Tree Graph`.

.. image:: images/ClassificationTreeGraph-SimpleSchema-S.gif
   :alt: Classification Trees - Schema with a Classifier

The second schema checks the accuracy of the algorithm.

.. image:: images/ClassificationTree-SchemaLearner.png
   :alt: Classification Tree - Schema with a Learner