Timestamp:
02/27/13 15:02:50 (14 months ago)
Author:
Ales Erjavec <ales.erjavec@…>
Branch:
default
Message:

Cleanup of 'Widget catalog' documentation.

Fixed rst text formatting, replaced dead hardcoded reference links (now using
:ref:), etc.

File:
1 edited

  • docs/widgets/rst/classify/interactivetreebuilder.rst

r11050 → r11359

- Tree Learner (orange.Learner)
    A learner which always returns the same tree - the one constructed in
    the widget


Signal :code:`Examples` sends data only if some tree node is selected and
contains some examples.

Description
-----------

This is a very exciting widget which is useful for teaching induction of
classification trees and also in practice, where a data miner and an area
expert can use it to manually construct a classification tree, aided by
Orange's entire widgetry.

The widget is based on :ref:`Classification Tree Viewer`. It is mostly the
same (so you are encouraged to read the related documentation), except for
the different input/output signals and the addition of a few buttons.

.. image:: images/InteractiveTreeBuilder.png
   :alt: Interactive Tree Builder widget

Button :obj:`Split` splits the selected tree node according to the criterion
above the button. For instance, if we pressed Split in the above widget,
the animals that don't give milk and have no feathers (the picture shows
a tree for the zoo data set) would be split according to whether they are
:code:`aquatic` or not. In the case of continuous attributes, a cut-off
point needs to be specified as well.
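
What Split does to a node's examples can be sketched in plain Python; the
records and the attribute below are hypothetical stand-ins for the zoo data,
not Orange's actual data structures:

```python
from collections import defaultdict

def split_node(examples, attribute):
    """Partition a node's examples into child nodes,
    one per value of the chosen discrete attribute."""
    children = defaultdict(list)
    for example in examples:
        children[example[attribute]].append(example)
    return dict(children)

# Hypothetical milkless, featherless animals from the zoo data.
node_examples = [
    {"name": "frog",     "aquatic": 1},
    {"name": "tortoise", "aquatic": 0},
    {"name": "pitviper", "aquatic": 0},
]

children = split_node(node_examples, "aquatic")
# Two child nodes: aquatic=1 (frog) and aquatic=0 (tortoise, pitviper).
```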

If Split is used on a node which is not a leaf, the criterion at that node
is replaced. If we, for instance, selected the :code:`<root>` node and
pushed Split, the criterion :code:`milk` would be replaced with
:code:`aquatic` and the nodes below (:code:`feathers`) are removed.

Button :obj:`Cut` cuts the tree at the selected node. If we pushed Cut
in the situation in the picture, nothing would happen since the selected
node (:code:`feathers=0`) is already a leaf. If we selected :code:`<root>`
and pushed Cut, the entire tree would be cut off.

Cut is especially useful in combination with :code:`Build`, which builds
a subtree at the current node. So, if we push Build in the situation
depicted above, a subtree would be built for the milkless featherless
animals, leaving the rest of the tree (that is, the existing two nodes)
intact. If Build is pressed at a node which is not a leaf, the entire
subtree at that node is replaced with an automatically induced tree.

Build uses some reasonable default parameters for tree learning (information
gain ratio is used for attribute selection, with a minimum of 2 examples
per leaf, which gives an algorithm equivalent to Quinlan's C4.5). To gain
more control over the tree construction arguments, use a
:ref:`Classification Tree` widget or a :ref:`C4.5` widget, set its
parameters and connect it to the input of Interactive Tree Builder. The
set parameters will then be used for the tree induction. (If you use C4.5,
Quinlan's original algorithm, don't forget to check
:obj:`Convert to orange tree structure`.)
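
The default scoring mentioned above, information gain ratio, can be sketched
for discrete attributes as follows; this is a minimal stand-alone version,
and the `milk`/`mammal` columns are hypothetical:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels):
    """Information gain of a discrete attribute, normalized by the
    attribute's own split information (Quinlan's gain ratio)."""
    n = len(labels)
    remainder = sum(
        (count / n) * entropy([l for v, l in zip(values, labels) if v == val])
        for val, count in Counter(values).items()
    )
    gain = entropy(labels) - remainder
    split_info = entropy(values)          # penalizes many-valued attributes
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical columns: does the animal give milk? / is it a mammal?
milk   = [1, 1, 1, 0, 0, 0, 0, 0]
mammal = [1, 1, 1, 0, 0, 0, 0, 0]
print(gain_ratio(milk, mammal))  # a perfectly informative attribute: 1.0
```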

The widget has several outputs. :obj:`Examples` gives, as in
:ref:`Classification Tree Viewer`, the list of examples from the selected
node. This output can be used to observe the statistical properties or
visualizations of various attributes for a specific node, based on which
we can decide whether and how to split the examples.

Signal :obj:`Classification Tree` can be attached to another tree viewer.
Using a :ref:`Classification Tree Viewer` is not really useful, as it will
show the same picture as Interactive Tree Builder. We can, however, connect
the more colorful :ref:`Classification Tree Graph`.

The last output is :obj:`Tree Learner`. This is a tree learner which always
gives the same tree - the one we constructed in this widget. This can be
used to assess the tree's quality with the :ref:`Test Learners` widget.
This requires some caution, though: you should not test the tree on the
same data you used to induce it. See the Examples section below for the
correct procedure.
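
The idea behind the :obj:`Tree Learner` output - a learner that ignores its
training data and always returns the same, manually constructed model - can
be sketched like this (plain Python, not Orange's actual learner classes):

```python
class ConstantLearner:
    """A 'learner' that ignores the training data and always returns the
    model fixed at construction time - like the widget's Tree Learner
    output, which always returns the manually built tree."""
    def __init__(self, model):
        self.model = model

    def __call__(self, training_examples):
        # A normal learner would induce a model here; we ignore the data.
        return self.model

# A hypothetical hand-built "tree": classify by the milk attribute.
hand_built_tree = lambda example: "mammal" if example["milk"] else "other"

learner = ConstantLearner(hand_built_tree)
model = learner(training_examples=[])   # training data makes no difference
print(model({"milk": 1}))               # prints "mammal"
```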

Examples
--------

The first snapshot shows the typical "environment" of the Interactive
Tree Builder.

.. image:: images/InteractiveTreeBuilder-SchemaInduction.png
   :alt: A schema with Interactive Tree Builder

The learning examples may come from a file. We also use a
:ref:`Classification Tree` widget to be able to set the tree induction
parameters for the parts of the tree we want to induce automatically.

On the right-hand side, we have the :ref:`Rank` widget, which assesses the
quality of attributes through measures like information gain, Gini index
and others. Emulating the induction algorithm by selecting the attributes
having the highest value for one of these measures should give the same
results as using the Classification Tree widget instead of the Interactive
Builder. However, in manual construction we can (and should) also rely on
the visualization widgets. One-dimensional visualizations like
:ref:`Distributions` give us an impression of the properties of a single
attribute, while two- and more-dimensional visualizations like
:ref:`Scatter Plot` and :ref:`Linear Projection` will give us a kind of
lookahead by telling us about useful combinations of attributes. We have
also deployed the :ref:`Data Table` widget, since seeing particular
examples in a tree node may also sometimes help the expert.

Finally, we use the :ref:`Classification Tree Graph` to present the
resulting tree in a fancy-looking picture.

As the widget name suggests, the tree construction should be interactive,
making the best use of Orange's visualization techniques and the help of
the area expert. At the beginning, the widget presents a tree containing
only the root. One way to proceed is to immediately click Build and then
study the resulting tree. Data examples for various nodes can be presented
and visualized to decide which parts of the tree make sense, which don't
and should better be reconstructed manually, and which subtrees should be
cut off. The other way is to start constructing the tree manually, adding
the nodes according to the expert's knowledge, and occasionally use the
Build button to let Orange make a suggestion.

Although the expert's help will usually prevent overfitting the data,
special care still needs to be taken when we are interested in knowing
the performance of the induced tree. Since the widely used cross-validation
is for obvious reasons inapplicable when the model is constructed manually,
we should split the data into training and testing sets prior to building
the tree.

.. image:: images/InteractiveTreeBuilder-SchemaSampling.png
   :alt: A schema with Interactive Tree Builder

We have used the :ref:`Data Sampler` widget for splitting the data; in most
cases we recommend using stratified random sampling with a sample size
of 70% for training. These examples (denoted as "Examples" in the snapshot)
are fed to the Interactive Tree Builder, where we employ Orange's armory
to construct the tree as described above.
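
Stratified random sampling with a 70% training share, as recommended above,
amounts to sampling within each class separately; a minimal sketch, where
the labelled examples are hypothetical:

```python
import random
from collections import defaultdict

def stratified_split(examples, label_of, train_share=0.7, seed=42):
    """Split examples into training and test sets while preserving the
    class distribution (stratified random sampling)."""
    by_class = defaultdict(list)
    for ex in examples:
        by_class[label_of(ex)].append(ex)
    rng = random.Random(seed)
    train, test = [], []
    for group in by_class.values():
        rng.shuffle(group)
        cut = round(len(group) * train_share)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test

# Hypothetical labelled examples: 10 mammals, 10 non-mammals.
data = [("animal%d" % i, "mammal" if i < 10 else "other") for i in range(20)]
train, test = stratified_split(data, label_of=lambda ex: ex[1])
# 7 mammals + 7 others go to training, 3 + 3 to testing.
```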

The tricky part is connecting the :ref:`Test Learners`: Data Sampler's
Examples should be used as Test Learners' Data, and Data Sampler's
Remaining Examples are the Test Learners' Separate Test Data.

.. image:: images/InteractiveTreeBuilder-SchemaSampling-Wiring.png
   :alt: Connecting Data Sampler to Test Learners when using Interactive Tree Builder

In Test Learners, don't forget to set the Sampling type to
:obj:`Test on test data`. Interactive Tree Builder should then give its
Tree Learner to Test Learners. To compare the manually constructed tree
with, say, an automatically constructed one and with a Naive Bayesian
classifier, we can include these two in the schema.

Test Learners will now feed the training data (the 70% sample it gets from
Data Sampler) to all three learning algorithms. While Naive Bayes and
Classification Tree will actually learn, Interactive Tree Builder will
ignore the training examples and return the manually built tree.
All three models will then be tested on the remaining 30% of examples.
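
The final comparison boils down to measuring accuracy on the held-out 30% -
examples none of the models saw during training. A minimal sketch, where the
test examples and the hand-built tree are hypothetical:

```python
def accuracy(model, test_examples):
    """Fraction of held-out examples the model classifies correctly."""
    correct = sum(1 for features, label in test_examples
                  if model(features) == label)
    return correct / len(test_examples)

# Hypothetical held-out examples and a hand-built one-split tree.
test_set = [
    ({"milk": 1}, "mammal"),
    ({"milk": 0}, "other"),
    ({"milk": 0}, "mammal"),   # an exception the one-split tree misses
    ({"milk": 1}, "mammal"),
]
manual_tree = lambda ex: "mammal" if ex["milk"] else "other"
print(accuracy(manual_tree, test_set))  # 3 of 4 correct: 0.75
```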