Changeset 7737:03f50b3da484 in orange

Timestamp:
    03/14/11 16:44:47 (3 years ago)
Author:
    markotoplak
Branch:
    default
Convert:
    c9a03a6a7d35606da31442504c5b0ad0033db5b7
Message:
    tree: documentation updates.
File:
    1 edited

  • orange/Orange/classification/tree.py

r7719 r7737

@@ -202,6 +202,4 @@
 filled temporarily but later cleared again.
 
-FIXME: the following example is not true anymore.
-
 ::
 
@@ -218,11 +216,5 @@
 The stop is trivial. The default is set by
 
-::
-
     >>> learner.stop = Orange.classification.tree.StopCriteria_common()
-
-Well, this is actually done in C++ and it uses a global component
-that is constructed once for all, but apart from that we did
-effectively the same thing.
 
 We can now examine the default stopping parameters.
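The pluggable stop criterion assigned in the doctest above can be sketched in plain Python. This is a toy model under assumed simplifications; `StopCriteriaCommon` and `TreeLearnerSkeleton` below are illustrative names, not Orange's real implementation, which lives in C++:

```python
class StopCriteriaCommon:
    """Toy stop criterion: stop when too few examples remain or the node is pure."""

    def __init__(self, min_examples=2, max_majority=1.0):
        self.min_examples = min_examples   # fewest examples still worth splitting
        self.max_majority = max_majority   # purity threshold (1.0 = single class)

    def __call__(self, labels):
        if len(labels) < self.min_examples:
            return True
        majority = max(labels.count(c) for c in set(labels)) / len(labels)
        return majority >= self.max_majority


class TreeLearnerSkeleton:
    def __init__(self):
        self.stop = StopCriteriaCommon()   # default component, freely replaceable

learner = TreeLearnerSkeleton()
# Analogous to `learner.stop = ...StopCriteria_common()` in the doctest:
learner.stop = StopCriteriaCommon(min_examples=5)
print(learner.stop(["a", "a", "b"]))   # prints True: 3 examples < 5
```

Replacing the component is plain attribute assignment; the learner only relies on the criterion being callable.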
     
@@ -258,5 +250,4 @@
 
 
-=================
 Printing the Tree
 =================
@@ -888,11 +879,13 @@
         node, excluding null-nodes).
 
-==============
-Classification
-==============
+============
+Base classes
+============
 
 .. class:: _TreeClassifier
 
     Classifies examples according to a tree stored in :obj:`tree`.
+    Not meant to be used directly. The :class:`TreeLearner` class
+    constructs :class:`TreeClassifier`.
 
     .. attribute:: tree
     
@@ -952,55 +945,45 @@
     .. method:: classDistribution()
 
-
-The rest of this section is only for those interested in the C++ code.
-======================================================================
-
-If you'd like to understand how the classification works in C++,
-start reading at :obj:`TTreeClassifier::vote`. It gets a
-:obj:`Node`, an :obj:`Orange.data.Instance` and a distribution of
-vote weights. For each node, it calls
-:obj:`TTreeClassifier::classDistribution` and then multiplies
-and sums the distributions. :obj:`vote` returns a normalized
-distribution of predictions.
-
-A new overload of :obj:`TTreeClassifier::classDistribution` gets
-an additional parameter, a :obj:`Node`. This is done
-for the sake of recursion. The normal version of
-:obj:`classDistribution` simply calls the overloaded version with the
-tree root as the additional parameter. :obj:`classDistribution`
-uses :obj:`descender`. If the descender reaches a leaf, it calls
-:obj:`nodeClassifier`; otherwise, it calls :obj:`vote`.
-
-Thus, the :obj:`TreeClassifier`'s :obj:`vote` and
-:obj:`classDistribution` are written in the form of double
-recursion. The recursive calls do not happen at each node of the
-tree but only at nodes where a vote is needed (that is, at nodes
-where the descender halts).
-
-For predicting a class, :obj:`operator()` calls the
-descender. If it reaches a leaf, the class is predicted by the
-leaf's :obj:`nodeClassifier`. Otherwise, it calls
-:obj:`vote`. From there on, :obj:`vote` and
-:obj:`classDistribution` interweave down the tree and return
-a distribution of predictions. :obj:`operator()` then simply
-chooses the most probable class.
-
-========
-Learning
-========
-
-The main learning object is :obj:`TreeLearnerBase`. It is basically
-a skeleton into which the user must plug the components for particular
-functions. For easier use, defaults are provided.
-
-Components that govern the structure of the tree are :obj:`split`
-(of type :obj:`SplitConstructor`), :obj:`stop` (of
-type :obj:`StopCriteria`) and :obj:`exampleSplitter`
-(of type :obj:`ExampleSplitter`).
-
+    If you'd like to understand how the classification works in C++,
+    start reading at :obj:`TTreeClassifier::vote`. It gets a
+    :obj:`Node`, an :obj:`Orange.data.Instance` and a distribution of
+    vote weights. For each node, it calls
+    :obj:`TTreeClassifier::classDistribution` and then multiplies
+    and sums the distributions. :obj:`vote` returns a normalized
+    distribution of predictions.
+
+    A new overload of :obj:`TTreeClassifier::classDistribution` gets
+    an additional parameter, a :obj:`Node`. This is done
+    for the sake of recursion. The normal version of
+    :obj:`classDistribution` simply calls the overloaded version with the
+    tree root as the additional parameter. :obj:`classDistribution`
+    uses :obj:`descender`. If the descender reaches a leaf, it calls
+    :obj:`nodeClassifier`; otherwise, it calls :obj:`vote`.
+
+    Thus, the :obj:`TreeClassifier`'s :obj:`vote` and
+    :obj:`classDistribution` are written in the form of double
+    recursion. The recursive calls do not happen at each node of the
+    tree but only at nodes where a vote is needed (that is, at nodes
+    where the descender halts).
+
+    For predicting a class, :obj:`operator()` calls the
+    descender. If it reaches a leaf, the class is predicted by the
+    leaf's :obj:`nodeClassifier`. Otherwise, it calls
+    :obj:`vote`. From there on, :obj:`vote` and
+    :obj:`classDistribution` interweave down the tree and return
+    a distribution of predictions. :obj:`operator()` then simply
+    chooses the most probable class.
 
 .. class:: TreeLearnerBase
 
-    TreeLearnerBase has a number of components.
+    The main learning object is :obj:`TreeLearnerBase`. It is basically
+    a skeleton into which the user must plug the components for particular
+    functions. This class is not meant to be used directly. You should
+    rather use :class:`TreeLearner`.
+
+    Components that govern the structure of the tree are :obj:`split`
+    (of type :obj:`SplitConstructor`), :obj:`stop` (of
+    type :obj:`StopCriteria`) and :obj:`exampleSplitter`
+    (of type :obj:`ExampleSplitter`).
 
     .. attribute:: split
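The double recursion between `vote` and `classDistribution` described in the changed documentation can be modelled in plain Python. This is a simplified sketch, not Orange's C++ code; the names (`descend`, `class_distribution`, `vote`, `predict`) mirror the documented methods, and the toy descender that halts at every internal node is an assumption made to keep the example short:

```python
class Node:
    def __init__(self, distribution=None, branches=None):
        self.distribution = distribution  # leaf: mapping class -> probability
        self.branches = branches or []    # internal: list of (weight, child)

def descend(node, instance):
    """Toy descender: halts immediately at any internal node."""
    if node.distribution is not None:
        return node, None                          # reached a leaf
    return node, [w for w, _ in node.branches]     # halted: vote weights

def class_distribution(node, instance):
    node, weights = descend(node, instance)
    if weights is None:
        return dict(node.distribution)  # leaf: its node classifier's answer
    return vote(node, instance, weights)

def vote(node, instance, weights):
    # Multiply each branch's distribution by its weight, sum, then normalize.
    total = {}
    for w, (_, child) in zip(weights, node.branches):
        for cls, p in class_distribution(child, instance).items():
            total[cls] = total.get(cls, 0.0) + w * p
    s = sum(total.values())
    return {cls: p / s for cls, p in total.items()}

def predict(node, instance):
    dist = class_distribution(node, instance)
    return max(dist, key=dist.get)      # like operator(): most probable class

tree = Node(branches=[
    (2.0, Node(distribution={"yes": 0.9, "no": 0.1})),
    (1.0, Node(distribution={"yes": 0.2, "no": 0.8})),
])
print(predict(tree, instance=None))     # prints yes
```

As in the documented C++ version, the recursion alternates between the two functions and only branches where the descender halts; at leaves it bottoms out in the node's own distribution.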
     
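The component-plugging design described for :obj:`TreeLearnerBase` can also be sketched directly. The skeleton and toy components below are illustrative assumptions (not Orange's real classes or signatures): the learner only wires together a `split` that builds a selector, a `stop` criterion, and an `example_splitter` that routes examples into subsets:

```python
class LearnerSkeleton:
    """Toy learner skeleton with three pluggable components."""

    def __init__(self, split, stop, example_splitter):
        self.split = split                        # builds a branch selector
        self.stop = stop                          # decides when to make a leaf
        self.example_splitter = example_splitter  # routes examples to branches

    def build(self, examples):
        if self.stop(examples):
            return ("leaf", examples)
        selector = self.split(examples)
        subsets = self.example_splitter(selector, examples)
        return ("node", [self.build(s) for s in subsets])

# Toy components: split numbers below/above their mean; stop at <= 2 examples.
def split(examples):
    mean = sum(examples) / len(examples)
    return lambda x: 0 if x < mean else 1

def example_splitter(selector, examples):
    subsets = [[], []]
    for x in examples:
        subsets[selector(x)].append(x)
    return subsets

learner = LearnerSkeleton(split, lambda ex: len(ex) <= 2, example_splitter)
tree = learner.build([1, 2, 10, 11])
print(tree)   # a root node with two leaves: [1, 2] and [10, 11]
```

Swapping any of the three components changes how the tree is grown without touching the skeleton, which is the point of the design.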
@@ -1139,31 +1122,4 @@
     if so requested. If not, the new weight attributes are removed (if
     any were created).
-
-Pruning
-=======
-
-Tree pruners derived from :obj:`Pruner` can be given either a
-:obj:`Node` (presumably, but not necessarily a root) or a
-:obj:`TreeClassifier`. The result is a new, pruned :obj:`Node`
-or a new :obj:`TreeClassifier` with a pruned tree. The original
-tree remains intact.
-
-Note however that pruners construct only a shallow copy of a tree.
-The pruned tree's :obj:`Node` contain references to the same
-contingency matrices, node classifiers, branch selectors, ...
-as the original tree. Thus, you may modify a pruned tree structure
-(manually cut it, add new nodes, replace components) but modifying,
-for instance, some node's :obj:`nodeClassifier` (a
-:obj:`nodeClassifier` itself, not a reference to it!) would modify
-the node's :obj:`nodeClassifier` in the corresponding node of
-the original tree.
-
-Talking about node classifiers - pruners cannot construct a
-:obj:`nodeClassifier` nor merge :obj:`nodeClassifier` of the pruned
-subtrees into classifiers for new leaves. Thus, if you want to build
-a prunable tree, internal nodes must have their :obj:`nodeClassifier`
-defined. Fortunately, all you need to do is nothing; if you leave
-the :obj:`TreeLearnerBase`'s flags as they are by default, the
-:obj:`nodeClassifier` are created.
 
 =======
     
@@ -1658,13 +1614,31 @@
     weighted according to the selector's proposal.
 
-Pruner and derived classes
-==============================
+Pruning
+=======
 
 .. index::
     pair: classification trees; pruning
 
-Classes derived from :obj:`Pruner` prune the trees as
-described in the section on pruning - make sure you read it
-to understand what the pruners will do to your trees.
+Tree pruners derived from :obj:`Pruner` can be given either a
+:obj:`Node` (presumably, but not necessarily a root) or a
+:obj:`_TreeClassifier`. The result is a new :obj:`Node`
+or a :obj:`_TreeClassifier` with a pruned tree. The original
+tree remains intact.
+
+The pruners construct only a shallow copy of a tree.
+The pruned tree's :obj:`Node` contain references to the same
+contingency matrices, node classifiers, branch selectors, ...
+as the original tree. Thus, you may modify a pruned tree structure
+(manually cut it, add new nodes, replace components) but modifying,
+for instance, some node's :obj:`nodeClassifier` (a
+:obj:`nodeClassifier` itself, not a reference to it!) would modify
+the node's :obj:`nodeClassifier` in the corresponding node of
+the original tree.
+
+Pruners cannot construct a
+:obj:`nodeClassifier` nor merge :obj:`nodeClassifier` of the pruned
+subtrees into classifiers for new leaves. Thus, if you want to build
+a prunable tree, internal nodes must have their :obj:`nodeClassifier`
+defined. Fortunately, this is the default.
 
 .. class:: Pruner
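The shallow-copy caveat in the new Pruning text can be demonstrated with toy structures (assumed for illustration only; `prune_to_depth` is not an Orange pruner): pruning builds fresh nodes, but each new node references the same node classifier object as the original.

```python
class Node:
    def __init__(self, node_classifier, branches=None):
        self.node_classifier = node_classifier  # e.g. a class distribution
        self.branches = branches or []

def prune_to_depth(node, depth):
    """Return a pruned shallow copy: new Node objects, shared classifiers."""
    pruned = Node(node.node_classifier)         # a reference, not a copy
    if depth > 0:
        pruned.branches = [prune_to_depth(b, depth - 1) for b in node.branches]
    return pruned

original = Node({"yes": 0.6, "no": 0.4},
                [Node({"yes": 0.9, "no": 0.1})])
pruned = prune_to_depth(original, 0)            # cut below the root

pruned.branches.append(Node({"maybe": 1.0}))    # structural edit: safe
pruned.node_classifier["yes"] = 0.99            # in-place edit: leaks through
print(original.node_classifier["yes"])          # prints 0.99 -- shared object
```

Editing the pruned tree's structure leaves the original intact, while mutating a shared node classifier in place changes both trees, exactly the distinction the documentation warns about.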