Changeset 7212:a9437dc5ffef in orange


Timestamp:
02/02/11 18:03:20 (3 years ago)
Author:
markotoplak
Branch:
default
Convert:
f5709d9a53a2280c79d9bb06934ebef29b796daa
Message:

Tree documentation progressing.

File:
1 edited

  • orange/Orange/classification/tree.py

    r7197 r7212  
    55This page describes the Orange trees. It first describes the basic components and procedures: it starts with <A href="#structure">the structure</A> that represents the tree, then it defines <A href="#classification">how the tree is used for classification</A>, then <A href="#learning">how it is built</A> and <a href="#pruning">pruned</A>. The order might seem strange, but the things are rather complex and this order is perhaps a bit easier to follow. After you have some idea about what the principal components do, we describe the <a href="#classes">concrete classes</A> that you can use as components for a tree learner. 
    66 
    7 Classification trees are represented as a tree-like hierarchy of TreeNode classes. 
     7Classification trees are represented as a tree-like hierarchy of :obj:`TreeNode` classes. 
    88 
    99 
     
    1717     
    1818        Stores a distribution for learning examples belonging to the node. 
    19         Storing distributions can be disabled by setting the TreeLearners's 
    20         storeDistributions flag to false. 
     19        Storing distributions can be disabled by setting the  
     20        :obj:`TreeLearner`'s storeDistributions flag to false. 
    2121 
    2222    .. attribute:: contingency 
     
    2424        Stores complete contingency matrices for the learning examples  
    2525        belonging to the node. Storing contingencies can be enabled by  
    26         setting <code>TreeLearner</code>'s <code>storeContingencies</code>  
     26        setting :obj:`TreeLearner`'s :obj:`storeContingencies`  
    2727        flag to <CODE>true</CODE>. Note that even when the flag is not  
    2828        set, the contingencies get computed and stored to  
    29         <code>TreeNode</code>, but are removed shortly afterwards.  
     29        :obj:`TreeNode`, but are removed shortly afterwards.  
    3030        The details are given in the  
    31         description of the <code>TreeLearner</code> object. 
     31        description of the :obj:`TreeLearner` object. 
    3232 
    3333    .. attribute:: examples, weightID 
    3434 
    3535        Store a set of learning examples for the node and the 
    36         corresponding ID of weight meta attribute. The root of the 
     36        corresponding ID of the weight meta attribute. The root of the 
    3737        tree stores a "master table" of examples, while other nodes' 
    38         <CODE>ExampleTable</CODE>s contain reference to examples in 
    39         the root's <CODE>ExampleTable</CODE>. Examples are only stored 
    40         if a corresponding flag (<code>storeExamples</code>) has been 
     38        :obj:`orange.ExampleTable` contain references to examples in 
     39        the root's :obj:`orange.ExampleTable`. Examples are only stored 
     40        if a corresponding flag (:obj:`storeExamples`) has been 
    4141        set while building the tree; to conserve the space, storing 
    42         is disabled by default.</DD> 
     42        is disabled by default. 
    4343 
    4444    .. attribute:: nodeClassifier 
    4545 
    4646        A classifier (usually, but not necessarily, a 
    47         <code>DefaultClassifier</code>) that can be used to classify 
     47        :obj:`DefaultClassifier`) that can be used to classify 
    4848        examples coming to the node. If the node is a leaf, this is 
    4949        used to decide the final class (or class distribution) of an 
    5050        example. If it's an internal node, it is stored if 
    51         <code>TreeNode</code>'s flag <code>storeNodeClassifier</code> 
    52         is set. Since the <code>nodeClassifier</code> is needed by 
    53         some <code>TreeDescenders</code> and for pruning (see far below), 
     51        :obj:`TreeNode`'s flag :obj:`storeNodeClassifier` 
     52        is set. Since the :obj:`nodeClassifier` is needed by 
     53        :obj:`TreeDescender` and for pruning (see far below), 
    5454        this is the default behaviour; space consumption of the default 
    55         <code>DefaultClassifier</code> is rather small. You should 
     55        :obj:`DefaultClassifier` is rather small. You should 
    5656        never disable this if you intend to prune the tree later. 
    5757 
     
    6060    .. attribute:: branches 
    6161 
    62         Stores a list of subtrees, given as <code>TreeNode</code>s.  
     62        Stores a list of subtrees, given as :obj:`TreeNode`. 
    6363        An element can be <code>None</code>; in this case the node is empty. 
    6464 
    65     .. attribute:: branchDescriptionsa 
     65    .. attribute:: branchDescriptions 
    6666 
    6767        A list with string descriptions for branches, constructed by 
    68         <code>TreeSplitConstructor</code>. It can contain different kinds 
     68        :obj:`TreeSplitConstructor`. It can contain different kinds 
    6969        of descriptions, but basically, expect things like 'red' or '>12.3'. 
    7070 
     
    7979 
    8080        Gives a branch for each example. The same object is used during 
    81         learning and classifying. The <code>branchSelector</code> is of type <code>Classifier</code>, since its job is similar to that of a classifier: it gets an example and returns discrete <code>Value</code> in range [0, <CODE>len(branches)-1</CODE>]. When an example cannot be classified to any branch, the selector can return a <CODE>Value</CODE> containing a special value (<code>sVal</code>) which should be a discrete distribution (<code>DiscDistribution</code>). This should represent a <code>branchSelector</code>'s opinion of how to divide the example between the branches. Whether the proposition will be used or not depends upon the chosen <code>TreeExampleSplitter</code> (when learning) or <code>TreeDescender</code> (when classifying).</DD> 
    82 </DL> 
    83  
    84 <p>The three lists (<code>branches</code>, <code>branchDescriptions</code> and <code>branchSizes</code>) are of the same length; all of them are defined if the node is internal and none if it is a leaf.</p> 
    85  
    86 <p><code>TreeNode</code> has a method <code>treesize()</code> that returns the number of nodes in the subtrees (including the node, excluding null-nodes).</p> 
     81        learning and classifying. The :obj:`branchSelector` is of 
     82        type :obj:`orange.Classifier`, since its job is similar to that 
     83        of a classifier: it gets an example and returns discrete 
     84        :obj:`orange.Value` in range [0, <CODE>len(branches)-1</CODE>]. 
     85        When an example cannot be classified to any branch, the selector 
     86        can return a :obj:`orange.Value` containing a special value 
     87        (<code>sVal</code>) which should be a discrete distribution 
     88        (<code>DiscDistribution</code>). This should represent a 
     89        :obj:`branchSelector`'s opinion of how to divide the 
     90        example between the branches. Whether the proposition will be 
     91        used or not depends upon the chosen :obj:`TreeExampleSplitter` 
     92        (when learning) or :obj:`TreeDescender` (when classifying). 
     93 
     94    The lists :obj:`branches`, :obj:`branchDescriptions` and :obj:`branchSizes` are of the same length; all of them are defined if the node is internal and none if it is a leaf. 
     95 
     96    .. method:: treeSize() 
     97         
     98        Return the number of nodes in the subtrees (including the node, excluding null-nodes). 
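The counting rule described for the tree-size method can be illustrated with a minimal sketch. It does not use Orange: `Node` and `tree_size` are hypothetical stand-ins introduced only for this example, and the sketch assumes a leaf has no `branches` list while an internal node's list may contain `None` for an empty branch.

```python
class Node:
    """Hypothetical stand-in for TreeNode; only `branches` is modelled."""

    def __init__(self, branches=None):
        # None marks a leaf; a list marks an internal node.
        # Elements of the list may themselves be None (null-nodes).
        self.branches = branches


def tree_size(node):
    """Count the node and all nodes below it, skipping null-nodes."""
    if node is None:
        return 0
    if node.branches is None:
        return 1
    return 1 + sum(tree_size(child) for child in node.branches)


# A root with a leaf, an empty branch, and an internal node with two leaves.
root = Node([Node(), None, Node([Node(), Node()])])
size = tree_size(root)
```

Here the root, its leaf child, the internal child and that child's two leaves are counted, while the empty branch contributes nothing.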
    8799 
    88100<A name="classification"></A> 
    89101<H3>Classification</H3> 
    90102 
    91 <p>A <code><INDEX name="classes/TreeClassifier">TreeClassifier</code> is an object that classifies examples according to a tree stored in a field <code>tree</code>.</p> 
    92  
    93 <p>Classification would be straightforward if there were no unknown values or, in general, examples that cannot be placed into a single branch. The response in such cases is determined by a component <code>descender</code>.</p> 
    94  
    95 <p><code><INDEX name="classes/TreeDescender">TreeDescender</code> is an abstract object which is given an example and whose basic job is to descend as far down the tree as possible, according to the values of example's attributes. The <code>TreeDescender</code> calls the node's <code>branchSelector</code> to get the branch index. If it's a simple index, the corresponding branch is followed. If not, it's up to descender to decide what to do, and that's where descenders differ. A <code>descender</code> can choose a single branch (for instance, the one that is the most recommended by the <code>branchSelector</code>) or it can let the branches vote.</p> 
    96  
    97 <p>In general there are three possible outcomes of a descent.</p> 
    98 <UL> 
    99 <LI>Descender reaches a leaf. This happens when nothing went wrong (there are no unknown or out-of-range values in the example) or when things went wrong, but the descender smoothed them by selecting a single branch and continued the descend. In this case, the descender returns the reached <code>TreeNode</code>.</li> 
    100  
    101 <LI><code>branchSelector</code> returned a distribution and the <code>TreeDescender</code> decided to stop the descend at this (internal) node. Again, descender returns the current <code>TreeNode</code> and nothing else.</LI> 
    102  
    103 <LI><code>branchSelector</code> returned a distribution and the <code>TreeNode</code> wants to split the example (i.e., to decide the class by voting). It returns a <code>TreeNode</code> and the vote-weights for the branches. The weights can correspond to the distribution returned by 
     103.. class:: TreeClassifier 
     104 
     105    Classifies examples according to a tree stored in :obj:`tree`. 
     106 
     107    Classification would be straightforward if there were no unknown  
     108    values or, in general, examples that cannot be placed into a  
     109    single branch. The response in such cases is determined by a 
     110    component :obj:`descender`. 
     111 
     112    :obj:`TreeDescender` is an abstract object which is given an example 
     113    and whose basic job is to descend as far down the tree as possible, 
     114    according to the values of example's attributes. The 
     115    :obj:`TreeDescender` calls the node's :obj:`branchSelector` to get  
     116    the branch index. If it's a simple index, the corresponding branch  
     117    is followed. If not, it's up to descender to decide what to do, and 
     118    that's where descenders differ. A :obj:`descender` can choose  
     119    a single branch (for instance, the one that is the most recommended  
     120    by the :obj:`branchSelector`) or it can let the branches vote. 
     121 
     122    In general there are three possible outcomes of a descent. 
     123 
     124    #. Descender reaches a leaf. This happens when nothing went wrong (there are no unknown or out-of-range values in the example) or when things went wrong, but the descender smoothed them by selecting a single branch and continued the descent. In this case, the descender returns the reached :obj:`TreeNode`. 
     125 
     126    #. :obj:`branchSelector` returned a distribution and the :obj:`TreeDescender` decided to stop the descent at this (internal) node. Again, descender returns the current :obj:`TreeNode` and nothing else. 
     127 
     128    #. :obj:`branchSelector` returned a distribution and the :obj:`TreeNode` wants to split the example (i.e., to decide the class by voting). It returns a :obj:`TreeNode` and the vote-weights for the branches. The weights can correspond to the distribution returned by 
    104129<code>branchSelector</code>, to the number of learning examples that were assigned to each branch, or to something else.</LI> 
    105130</UL> 
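The three outcomes of a descent can be sketched in plain Python. This is an illustration under stated assumptions, not Orange's implementation: `Node`, `descend`, `by_color` and the `policy` names are hypothetical, the selector returns either an integer branch index or a list of weights in place of a discrete distribution, and stopping at a node whose chosen branch is empty is an assumption made only to keep the sketch total.

```python
class Node:
    """Hypothetical stand-in for TreeNode."""

    def __init__(self, label=None, selector=None, branches=None):
        self.label = label        # stands in for nodeClassifier
        self.selector = selector  # stands in for branchSelector
        self.branches = branches  # None for a leaf


def descend(node, example, policy):
    """Return (node, weights); weights is None unless branches should vote."""
    while node.branches is not None:
        choice = node.selector(example)
        if isinstance(choice, int):
            # A simple index: follow the corresponding branch.
            nxt = node.branches[choice]
        elif policy == "single":
            # Smooth the problem by taking the most recommended branch.
            nxt = node.branches[choice.index(max(choice))]
        elif policy == "stop":
            # Outcome 2: stop at this internal node.
            return node, None
        else:
            # Outcome 3 ("vote"): return the node and the vote weights.
            return node, choice
        if nxt is None:
            # Assumption of this sketch: an empty branch ends the descent.
            return node, None
        node = nxt
    # Outcome 1: a leaf was reached.
    return node, None


def by_color(example):
    """Hypothetical selector on a 'color' attribute."""
    color = example.get("color")
    if color == "red":
        return 0
    if color == "blue":
        return 1
    # Unknown value: propose how to divide the example between branches.
    return [0.75, 0.25]


red_leaf = Node(label="yes")
blue_leaf = Node(label="no")
root = Node(label="yes", selector=by_color, branches=[red_leaf, blue_leaf])

reached, weights = descend(root, {"color": "blue"}, "single")
stopped, no_weights = descend(root, {}, "stop")
voting, vote_weights = descend(root, {}, "vote")
```

With a known colour the descent ends in a leaf. With a missing colour, the `"stop"` policy returns the root alone, the `"vote"` policy returns the root together with the proposed weights, and the `"single"` policy follows the branch with the largest weight.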