Changeset 7808:b46d30b4318f in orange


Timestamp:
04/04/11 09:51:07
Author:
markotoplak
Branch:
default
Convert:
b2e1866eac45a4e310409ead7559439778b15ddc
Message:

Classification tree: reformatting of documentation.

File:
1 edited

  • orange/Orange/classification/tree.py

    r7764 → r7808
********************

To build a small tree (:obj:`TreeClassifier`) from the iris data set
(with the depth limited to three levels), use (part of `orngTree1.py`_,
uses `iris.tab`_):

.. literalinclude:: code/orngTree1.py
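
The included script is short; a sketch of what it might contain
(an assumed reconstruction - the actual `orngTree1.py`_ may differ)::

    import Orange

    data = Orange.data.Table("iris")
    # maxDepth is documented with the learner's attributes below
    tree = Orange.classification.tree.TreeLearner(data, maxDepth=3)
    print tree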
     
This page first describes the learner and the classifier, and then
defines the base classes (individual components) of the trees and the
tree-building process.

.. autoclass:: TreeLearner
     
========

For example, here's how to write your own stop function. The example
constructs and prints two trees. For the first one we define the
*defStop* function, which is used by default, and combine it with a
random function so that the stopping criterion will also be met in 20%
of the cases when *defStop* is false. For the second tree the stopping
criterion is random. Note that in the second case the lambda function
still has three parameters, since this is the required number of
parameters for a stop function (:obj:`StopCriteria`). Part of
`tree3.py`_ (uses `iris.tab`_):

.. _tree3.py: code/tree3.py
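
A sketch of the relevant part might look like this (an assumed
reconstruction - the actual `tree3.py`_ may differ in details)::

    import Orange
    from random import randint, seed

    seed(0)
    data = Orange.data.Table("iris")

    # the default stop criterion, reused inside our own function
    defStop = Orange.classification.tree.StopCriteria()

    # stop when defStop says so, or at random in 20% of the calls
    f = lambda examples, weightID, contingency: \
        defStop(examples, weightID, contingency) or randint(1, 5) == 1
    tree = Orange.classification.tree.TreeLearner(data, stop=f)

    # a completely random stop criterion; note the three parameters
    fr = lambda examples, weightID, contingency: randint(1, 5) == 1
    treeRandom = Orange.classification.tree.TreeLearner(data, stop=fr)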
     
==============

To have something to work on, we'll take the data from the lenses data
set and build a tree using the default components (part of
`treestructure.py`_, uses `lenses.tab`_):

.. literalinclude:: code/treestructure.py
     
   :lines: 12-21

If node is None, we have a null-node; null nodes don't count, so we
return 0. Otherwise, the size is 1 (this node) plus the sizes of all
subtrees. The node is an internal node if it has a :obj:`branchSelector`;
if there's no selector, it's a leaf. Don't attempt to skip the if
statement: leaves don't have an empty list of branches, they don't have
a list of branches at all.
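
A sketch consistent with this description (assumed, not the file's
verbatim content)::

    def treeSize(node):
        # null nodes don't count
        if not node:
            return 0
        size = 1  # this node
        # only internal nodes have a branchSelector and thus branches
        if node.branchSelector:
            for branch in node.branches:
                size += treeSize(branch)
        return size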

    >>> treeSize(treeClassifier.tree)
    10

Don't forget that this was only an exercise - :obj:`Node` has a built-in
method :obj:`Node.treeSize` that does exactly the same.

Let us now write a script that prints out a tree. The recursive part of
the function will get a node and its level (part of `treestructure.py`_,
uses `lenses.tab`_).

.. literalinclude:: code/treestructure.py

"<null node>" and return.

After handling null nodes, the remaining nodes are internal nodes and
leaves. For internal nodes, we print a node description consisting
of the attribute's name and distribution of classes. :obj:`Node`'s
branch selector is, for all currently defined splits, an instance
of a class derived from :obj:`orange.Classifier` (in fact, it is
a :obj:`orange.ClassifierFromVarFD`, but a :obj:`orange.Classifier`
would suffice), and its :obj:`classVar` points to the attribute we seek.
So we print its name. We will also assume that storing class distributions
has not been disabled and print them as well. Then we iterate through
branches; for each we print a branch description and recursively call
:obj:`printTree0` with a level increased by 1 (to increase the indent).

Finally, if the node is a leaf, we print out the distribution of learning
examples in the node and the class to which the examples in the node
would be classified. We again assume that the :obj:`nodeClassifier` is
the default one - a :obj:`DefaultClassifier`. A better print function
should be aware of possible alternatives.
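
A sketch of such a function, following the description above (an assumed
reconstruction of the code in `treestructure.py`_)::

    def printTree0(node, level):
        if not node:
            print " " * level + "<null node>"
            return
        if node.branchSelector:
            # internal node: attribute name and class distribution
            print " " * level + "%s (%s)" % (
                node.branchSelector.classVar.name, node.distribution)
            for i, branch in enumerate(node.branches):
                print " " * level + ": %s" % node.branchDescriptions[i]
                printTree0(branch, level + 1)
        else:
            # leaf: distribution and the predicted class
            print " " * level + "--> %s (%s)" % (
                node.nodeClassifier.defaultValue, node.distribution)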

Now, we just need to write a simple function to call our printTree0.
We could write something like...

        printTree0(x.tree, 0)

... but we won't. Let us learn how to handle arguments of
different types. Let's write a function that will accept either a
:obj:`TreeClassifier` or a :obj:`Node`. Part of `treestructure.py`_,
uses `lenses.tab`_.

.. literalinclude:: code/treestructure.py
   :lines: 43-49

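The included lines amount to a type dispatch; a sketch (assumed, not
verbatim)::

    import Orange

    def printTree(x):
        if isinstance(x, Orange.classification.tree.TreeClassifier):
            printTree0(x.tree, 0)
        elif isinstance(x, Orange.classification.tree.Node):
            printTree0(x, 0)
        else:
            raise TypeError("invalid parameter")
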
It's fairly straightforward: if :obj:`x` is of a type derived from
:obj:`TreeClassifier`, we print :obj:`x.tree`; if it's a :obj:`Node` we
just call :obj:`printTree0` with :obj:`x`. If it's of some other type,
we don't know how to handle it and thus raise an exception. The output::

    >>> printTree(treeClassifier)

          : hypermetrope --> none (<2.000, 0.000, 1.000>)

For a final exercise, let us write a simple pruning function. It will
be written entirely in Python, unrelated to any :obj:`Pruner`. It will
limit the maximal tree depth (the number of internal nodes on any path
down the tree) given as an argument. For example, to get a two-level
tree, we would call cutTree(root, 2). The function will be recursive,
with the second argument (level) decreasing at each call; when zero,
the current node will be made a leaf (part of `treestructure.py`_, uses
`lenses.tab`_):

.. literalinclude:: code/treestructure.py
   :lines: 54-62

There's nothing to prune at null-nodes or leaves, so we act only when
:obj:`node` and :obj:`node.branchSelector` are defined. If level is
not zero, we call the function for each branch. Otherwise, we clear the
selector, branches and branch descriptions.
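
A sketch matching this description (an assumed reconstruction)::

    def cutTree(node, level):
        # act only on internal nodes; leaves and null-nodes are left alone
        if node and node.branchSelector:
            if level:
                for branch in node.branches:
                    cutTree(branch, level - 1)
            else:
                # turn the node into a leaf
                node.branchSelector = None
                node.branches = None
                node.branchDescriptions = None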

    >>> cutTree(tree.tree, 2)

========

You could just call :class:`TreeLearner` and let it fill the empty slots
with the default components. This section will teach you three things:
what the missing components are (and how to set them yourself), how to
use alternative components to get a different tree and, finally, how to
write a skeleton for tree induction in Python.

.. _treelearner.py: code/treelearner.py

Let us construct a :obj:`TreeLearner` to play with (`treelearner.py`_,
uses `lenses.tab`_):

.. literalinclude:: code/treelearner.py
   :lines: 7-10

There are three crucial components in learning: the
:obj:`~TreeLearner.split` and :obj:`~TreeLearner.stop` criteria, and the
example :obj:`~TreeLearner.splitter` (there are some others, which become
important during classification; we'll talk about them later). They are
not defined; if you use the learner, the slots are filled temporarily
but later cleared again.

::

    1.0 0.0

Not very restrictive. This keeps splitting the examples until there's
nothing left to split or all the examples are in the same class. Let us
set the minimal subset that we allow to be split to five examples and
see what comes out.

    >>> learner.stop.minExamples = 5.0
     

Several classes described above are already functional and can
(and mostly will) be used as they are: :obj:`Node`, :obj:`_TreeLearner`
and :obj:`TreeClassifier`. Classes :obj:`SplitConstructor`,
:obj:`StopCriteria`, :obj:`ExampleSplitter`, :obj:`Descender` are among
the Orange (C++ implemented) classes that can be subtyped in Python. You
can thus program your own components based on these classes.
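
For illustration, a stop criterion like the random one used earlier could
also be written as a subclass (a sketch; it assumes a Python subtype may
override :obj:`__call__` with the three-parameter signature described
above)::

    import Orange
    from random import random

    class RandomStop(Orange.classification.tree.StopCriteria):
        def __call__(self, examples, weightID, contingency):
            # stop in about 20% of the calls, whatever the examples
            return random() < 0.2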

.. class:: Node

    Node stores information about the learning examples belonging to
    the node, a branch selector, a list of branches (if the node is not
    a leaf) with their descriptions and strengths, and a classifier.

    .. attribute:: distribution

        Stores a distribution for learning examples belonging to the
        node. Storing distributions can be disabled by setting the
        :obj:`_TreeLearner`'s storeDistributions flag to false.

    .. attribute:: contingency

        Stores complete contingency matrices for the learning examples
        belonging to the node. Storing contingencies can be enabled by
        setting :obj:`_TreeLearner`'s :obj:`storeContingencies` flag to
        true. Note that even when the flag is not set, the contingencies
        get computed and stored to :obj:`Node`, but are removed shortly
        afterwards. The details are given in the description of the
        :obj:`_TreeLearner` object.

    .. attribute:: examples, weightID

        corresponding ID of /weight meta attribute. The root of the
        tree stores a "master table" of examples, while other nodes'
        :obj:`Orange.data.Table` contain references to examples in the
        root's :obj:`Orange.data.Table`. Examples are only stored if
        a corresponding flag (:obj:`storeExamples`) has been set while
        building the tree; to conserve space, storing is disabled by
        default.

    .. attribute:: nodeClassifier

        A classifier (usually, but not necessarily, a
        :obj:`DefaultClassifier`) that can be used to classify examples
        coming to the node. If the node is a leaf, this is used to
        decide the final class (or class distribution) of an example. If
        it's an internal node, it is stored if :obj:`Node`'s flag
        :obj:`storeNodeClassifier` is set. Since the :obj:`nodeClassifier`
        is needed by :obj:`Descender` and for pruning (see far below),
        this is the default behaviour; space consumption of the default
        :obj:`DefaultClassifier` is rather small. You should never
        disable this if you intend to prune the tree later.

    If the node is a leaf, the remaining fields are None. If it's an
    internal node, there are several additional fields.

    .. attribute:: branches

        Stores a list of subtrees, given as :obj:`Node`. An element
        can be None; in this case the node is empty.

    .. attribute:: branchDescriptions

        A list with string descriptions for branches, constructed by
        :obj:`SplitConstructor`. It can contain different kinds of
        descriptions, but basically, expect things like 'red' or '>12.3'.

    .. attribute:: branchSizes

        Gives a (weighted) number of training examples that went into
        each branch. This can be used later, for instance, for modeling
        probabilities when classifying examples with unknown values.

    .. attribute:: branchSelector

        Gives a branch for each example. The same object is used
        during learning and classifying. The :obj:`branchSelector`
        is of type :obj:`orange.Classifier`, since its job is
        similar to that of a classifier: it gets an example and
        returns a discrete :obj:`Orange.data.Value` in the range
        :samp:`[0, len(branches)-1]`. When an example cannot be
        classified to any branch, the selector can return a
        :obj:`Orange.data.Value` containing a special value (sVal)
        which should be a discrete distribution (DiscDistribution).
        This should represent the :obj:`branchSelector`'s opinion of
        how to divide the example between the branches. Whether the
        proposition will be used or not depends upon the chosen
        :obj:`ExampleSplitter` (when learning) or :obj:`Descender`
        (when classifying).
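
    For example, the root's selector can be queried directly (a sketch;
    it assumes the lenses tree built earlier on this page)::

        import Orange

        data = Orange.data.Table("lenses")
        treeClassifier = Orange.classification.tree.TreeLearner(data)
        node = treeClassifier.tree              # root Node
        branch = node.branchSelector(data[0])   # a discrete Value
        print node.branchDescriptions[int(branch)]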

    The lists :obj:`branches`, :obj:`branchDescriptions` and
     
    .. method:: treeSize()

        Return the number of nodes in the subtrees (including the node,
        excluding null-nodes).

     
        The root of the tree, represented as a :class:`Node`.

    Classification would be straightforward if there were no unknown
    values or, in general, examples that cannot be placed into a single
    branch. The response in such cases is determined by a component
    :obj:`descender`.

    :obj:`Descender` is an abstract object which is given an example
    and whose basic job is to descend as far down the tree as possible,
    according to the values of the example's attributes. The
    :obj:`Descender` calls the node's :obj:`branchSelector` to get
    the branch index. If it's a simple index, the corresponding branch
    is followed. If not, it's up to the descender to decide what to do,
    and that's where descenders differ. A :obj:`descender` can choose
    a single branch (for instance, the one that is the most recommended
    by the :obj:`branchSelector`) or it can let the branches vote.

    In general there are three possible outcomes of a descent.

    #. Descender reaches a leaf. This happens when nothing went wrong
       (there are no unknown or out-of-range values in the example)
       or when things went wrong, but the descender smoothed them by
       selecting a single branch and continued the descent. In this case,
       the descender returns the reached :obj:`Node`.
    #. :obj:`branchSelector` returned a distribution and the
       :obj:`Descender` decided to stop the descent at this (internal)
       node. Again, the descender returns the current :obj:`Node` and
       nothing else.
    #. :obj:`branchSelector` returned a distribution and the
       :obj:`Descender` wants to split the example (i.e., to decide the
       class by voting). In this case, it returns a :obj:`Node` and the
       vote-weights for the branches.

    The weights can correspond to the distribution returned by
    :obj:`branchSelector`, to the number of learning examples that were
    assigned to each branch, or to something else.

    :obj:`TreeClassifier` uses the descender to descend from the root.
    If it returns only a :obj:`Node` and no distribution, the descent
    stops; it does not matter whether it's a leaf (the first
    case above) or an internal node (the second case). The node's
    :obj:`nodeClassifier` is used to decide the class. If the descender
    returns a :obj:`Node` and a distribution, the :obj:`TreeClassifier`
    recursively calls itself for each of the subtrees and the predictions
    are weighted as requested by the descender.

    When voting, subtrees do not predict the class but probabilities of
    classes. The predictions are multiplied by weights, summed and the
    most probable class is returned.

    .. method:: vote()

        It gets a :obj:`Node`, an :obj:`Orange.data.Instance` and a
        distribution of vote weights. For each node, it calls
        :obj:`classDistribution` and then multiplies and sums the
        distributions. :obj:`vote` returns a normalized distribution
        of predictions.

    .. method:: classDistribution()

        Gets an additional parameter, a :obj:`Node` (default tree root).
        :obj:`classDistribution` uses :obj:`descender`. If the descender
        reaches a leaf, it calls :obj:`nodeClassifier`, otherwise it
        calls :obj:`vote`.

        Thus, :obj:`vote` and :obj:`classDistribution` are written in
        a form of double recursion. The recursive calls do not happen at
        each node of the tree but only at nodes where a vote is needed
        (that is, at nodes where the descender halts).
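
        The interplay can be pictured with a toy, self-contained
        analogue (plain Python stand-ins, not Orange's actual classes)::

            def classDistribution(node):
                if node["branches"] is None:   # descender reached a leaf
                    return node["dist"]
                return vote(node)              # descender halted: vote

            def vote(node):
                # multiply branch predictions by weights, sum, normalize
                summed = None
                for w, branch in zip(node["weights"], node["branches"]):
                    scaled = [w * p for p in classDistribution(branch)]
                    summed = scaled if summed is None else \
                        [a + b for a, b in zip(summed, scaled)]
                total = float(sum(summed))
                return [p / total for p in summed]

            leaf = lambda d: {"branches": None, "dist": d}
            root = {"branches": [leaf([0.9, 0.1]), leaf([0.2, 0.8])],
                    "weights": [0.5, 0.5]}
            print classDistribution(root)   # [0.55, 0.45]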

    .. method:: __call__

        Calls the descender. If it reaches a leaf, the class is
        predicted by the leaf's :obj:`nodeClassifier`. Otherwise, it calls
        :obj:`vote`. From now on, :obj:`vote` and :obj:`classDistribution`
        interweave down the tree and return a distribution of
        predictions. This method simply chooses the most probable class.

.. class:: _TreeLearner

    The main learning object is :obj:`_TreeLearner`. It is basically a
    skeleton into which the user must plug the components for particular
    functions. This class is not meant to be used directly. You should
    rather use :class:`TreeLearner`.

    Components that govern the structure of the tree are
    :obj:`split` (of type :obj:`SplitConstructor`), :obj:`stop`
    (of type :obj:`StopCriteria`) and :obj:`exampleSplitter` (of type
    :obj:`ExampleSplitter`).

    .. attribute:: split

        Object of type :obj:`SplitConstructor`. Default value, provided
        by :obj:`_TreeLearner`, is :obj:`SplitConstructor_Combined` with
        separate constructors for discrete and continuous attributes.
        Discrete attributes are used as they are, while continuous
        attributes are binarized. Gain ratio is used to select attributes.
        A minimum of two examples in a leaf is required for discrete
        attributes and five examples in a leaf for continuous attributes.

    .. attribute:: stop

        Object of type :obj:`StopCriteria`. The default stopping
        criterion stops induction when all examples in a node belong to
        the same class.

    .. attribute:: splitter

        Object of type :obj:`ExampleSplitter`. The default splitter is
        :obj:`ExampleSplitter_UnknownsAsSelector` that splits the learning
        examples according to distributions given by the selector.

    .. attribute:: contingencyComputer

        By default, this slot is left empty and ordinary contingency
        matrices are computed for examples at each node. If need
        arises, one can change the way the matrices are computed. This
        can be used to change the way that unknown values are treated
        when assessing qualities of attributes. As mentioned earlier,
        the computed matrices can be used by split constructor and by
        stopping criteria. On the other hand, they can be (and are)
     
    .. attribute:: nodeLearner

        Induces a classifier from examples belonging to a
        node. The same learner is used for internal nodes
        and for leaves. The default :obj:`nodeLearner` is
        :obj:`Orange.classification.majority.MajorityLearner`.

    .. attribute:: descender

        Descending component that the induced :obj:`TreeClassifier` will
        use. Default descender is :obj:`Descender_UnknownMergeAsSelector`
        which votes using the :obj:`branchSelector`'s distribution for
        vote weights.

    .. attribute:: maxDepth

        Gives maximal tree depth; 0 means that only the root is
        generated. The default is 100 to prevent any infinite tree
        induction due to wrong settings of the stop criteria. If you are
        sure you need larger trees, increase it. If you, on the other
        hand, want to lower this hard limit, you can do so as well.

    .. attribute:: storeDistributions, storeContingencies, storeExamples, storeNodeClassifier

        Decides whether to store class distributions, contingencies and
        examples in :obj:`Node`, and whether the :obj:`nodeClassifier`
        should be built for internal nodes. By default, distributions and
        node classifiers are stored, while contingencies and examples are
        not. You won't save any memory by not storing distributions but
        storing contingencies, since the stored distribution actually
        points to the same distribution that is stored in
        :obj:`contingency.classes`.
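
    For instance, to keep the examples (a sketch; it assumes the
    :class:`TreeLearner` wrapper passes these flags through, as this
    page suggests)::

        import Orange

        learner = Orange.classification.tree.TreeLearner(storeExamples=True)
        tree = learner(Orange.data.Table("lenses"))
        print len(tree.tree.examples)   # examples kept at the root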

    The :obj:`_TreeLearner` first sets the defaults for missing

    fields, they are removed when the induction is finished.

    Then it ensures that examples are stored in a table. This is
    needed because the algorithm juggles with pointers to examples. If
    examples are in a file or are fed through a filter, they are copied
    to a table. Even if they are already in a table, they are copied if
    :obj:`storeExamples` is set. This is to ensure that pointers remain
    pointing to examples even if the user later changes the example
    table. If they are in the table and the :obj:`storeExamples` flag
    is clear, we just use them as they are. This will obviously crash
    in a multi-threaded system if one changes the table during the tree
    induction. Well... don't do it.

    Apriori class probabilities are computed. At this point we check
    the sum of example weights; if it's zero, there are no examples and
    we cannot proceed. A list of candidate attributes is set; in the
    beginning, all attributes are candidates for the split criterion.

    Now comes the recursive part of the :obj:`_TreeLearner`. Its arguments
    are a set of examples, a weight meta-attribute ID (a tricky thing,
    it can be always the same as the original or can change to accommodate
    splitting of examples among branches), apriori class distribution
    and a list of candidates (represented as a vector of Boolean values).

    The contingency matrix is computed next. This happens even if the flag
    :obj:`storeContingencies` is false. If the :obj:`contingencyComputer`
    is given we use it, otherwise we construct just an ordinary
    contingency matrix.

    A :obj:`stop` is called to see whether it's worth continuing. If
    not, a :obj:`nodeClassifier` is built and the :obj:`Node` is
    returned. Otherwise, a :obj:`nodeClassifier` is only built if the
    :obj:`forceNodeClassifier` flag is set.

    To get a :obj:`Node`'s :obj:`nodeClassifier`, the
    :obj:`nodeLearner`'s :obj:`smartLearn` function is called
    with the given examples, weight ID and the just computed
    matrix. If the learner can use the matrix (and the default,
    :obj:`Orange.classification.majority.MajorityLearner`, can), it won't
    touch the examples. Thus, a choice of :obj:`contingencyComputer`
    will, in many cases, affect the :obj:`nodeClassifier`. The
    :obj:`nodeLearner` can return no classifier; if so and
    if the classifier would be needed for classification,
    the :obj:`TreeClassifier`'s function returns DK or an empty
    distribution. If you're writing your own tree classifier - pay
    attention.

    If the induction is to continue, a :obj:`split` component is called.
    If it fails to return a branch selector, induction stops and the
    :obj:`Node` is returned.

    :obj:`_TreeLearner` then uses :obj:`ExampleSplitter` to divide the
    examples as described above.

    The contingency gets removed at this point if it is not to be
    stored. Thus, the :obj:`split`, :obj:`stop` and :obj:`exampleSplitter`
    can use the contingency matrices if they wish.

    The :obj:`_TreeLearner` then recursively calls itself for each of
    the non-empty subsets. If the splitter returns a list of weights,
    a corresponding weight is used for each branch. In addition, the
    attribute spent by the splitter (if any) is removed from the list
    of candidates for the subtree.

    A subset of examples is stored in its corresponding tree node,
    if so requested. If not, the new weight attributes are removed
    (if any were created).

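    The recursion just described can be mirrored in a toy, pure-Python
    skeleton (a simplified analogue of the control flow only; stop and
    split here are plain stand-in functions, not Orange components)::

        def stop(examples):
            # default-like criterion: all examples in the same class
            return len(set(c for _, c in examples)) <= 1

        def split(examples, candidates):
            # pick the first usable candidate attribute; a real
            # SplitConstructor scores them and may veto by returning None
            for attr in candidates:
                values = sorted(set(f[attr] for f, _ in examples))
                if len(values) > 1:
                    return attr, values
            return None, None

        def induce(examples, candidates):
            node = {"branches": None, "examples": examples}
            if stop(examples):
                return node                    # a leaf
            attr, values = split(examples, candidates)
            if attr is None:
                return node                    # split vetoed the induction
            node["branchSelector"] = attr
            node["branchDescriptions"] = values
            remaining = [a for a in candidates if a != attr]  # spent
            node["branches"] = [
                induce([e for e in examples if e[0][attr] == v], remaining)
                for v in values]
            return node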
     

Split construction is almost as exciting as waiting for a delayed flight.
Boring, that is. Split constructors juggle with contingency matrices,
with separate cases for discrete and continuous classes... Most split
constructors work either for discrete or for continuous attributes. We
suggest using a :obj:`SplitConstructor_Combined` that delegates
attributes to specialized split constructors.

     
type (discrete, continuous) do not report an error or a warning but
simply skip the attribute. It is your responsibility to use a correct
split constructor for your dataset. (May we again suggest using
:obj:`SplitConstructor_Combined`?)

The same components can be used for inducing both classification and
regression trees. The only component that needs to be chosen accordingly
is the 'measure' attribute for the :obj:`SplitConstructor_Measure` class
(and derived classes).

.. class:: SplitConstructor

    Finds a suitable criterion for dividing the learning (and later
    testing) examples coming to the node. The data it gets is a set of
    examples (and, optionally, an ID of weight meta-attribute), a domain
    contingency computed from examples, apriori class probabilities, a
    list of candidate attributes it should consider and a node classifier
    (if it was constructed, that is, if :obj:`storeNodeClassifier`
    is left true).

    The :obj:`SplitConstructor` should use the domain contingency

    explained later. There are, however, cases when domain contingency
    does not suffice; for example, when ReliefF is used as a measure
    of quality of attributes. In this case, there's no other way but to
    use the examples and ignore the precomputed contingencies.

    :obj:`SplitConstructor` returns most of the data we talked
    about when describing the :obj:`Node`. It returns a classifier
    to be used as :obj:`Node`'s :obj:`branchSelector`, a list of branch
    descriptions and a list with the number of examples that go into
    each branch. Just what we need for the :obj:`Node`. It can return
    an empty list for the number of examples in branches; in this case,
    the :obj:`_TreeLearner` will find the number itself after splitting
    the example set into subsets. However, if a split constructor can
    provide the numbers at no extra computational cost, it should do so.

    In addition, it returns a quality of the split; a number without
     
    If the constructed splitting criterion uses an attribute in such
    a way that the attribute is 'completely spent' and should not be
    considered as a split criterion in any of the subtrees (the typical
    case of this are discrete attributes that are used as-they-are, that
    is, without any binarization or subsetting), then it should report
    the index of this attribute. Some splits do not spend any attribute;
    this is indicated by returning a negative index.

    A :obj:`SplitConstructor` can veto the further tree induction
    by returning no classifier. This can happen for many reasons.
    A general one is related to the number of examples in the branches.
    :obj:`SplitConstructor` has a field :obj:`minSubset`, which sets
    the minimal number of examples in a branch; null nodes, however,
    are allowed. If there is no split where this condition is met,
    :obj:`SplitConstructor` stops the induction.

    .. attribute:: minSubset
     
    Bases: :class:`SplitConstructor`

    An abstract base class for split constructors that employ
    a :class:`Orange.feature.scoring.Measure` to assess the
    quality of a split. All split constructors except for
    :obj:`SplitConstructor_Combined` are derived from this class.

    .. attribute:: measure

        A component of type :class:`Orange.feature.scoring.Measure`
        used for split evaluation. You must select a
        :class:`Orange.feature.scoring.Measure` capable of
        handling your class type - for example, you cannot use
        :class:`Orange.feature.scoring.GainRatio` for building regression
        trees or :class:`Orange.feature.scoring.MSE` for classification
        trees.

    .. attribute:: worstAcceptable
     
    Bases: :class:`SplitConstructor_Measure`

    Attempts to use a discrete attribute as a split; each value of
    the attribute corresponds to a branch in the tree. Attributes are
    evaluated with the :obj:`measure` and the one with the highest score
    is used for a split. If there is more than one attribute with the
    highest score, one of them is selected at random.

    The constructed :obj:`branchSelector` is an instance of
    :obj:`orange.ClassifierFromVarFD` that returns a value of the selected
    attribute. If the attribute is :obj:`Orange.data.variable.Discrete`,
    the :obj:`branchDescriptions` are the attribute's values. The attribute
    is marked as spent, so that it cannot reappear in the node's subtrees.

.. class:: SplitConstructor_ExhaustiveBinary
     
    Works on discrete attributes. For each attribute, it determines
    which binarization of the attribute gives the split with the
    highest score. If more than one split has the highest score, one
    of them is selected at random. After trying all the attributes,
    it returns one of those with the highest score.

     
    760739    :obj:`orange.ClassifierFromVarFD` that returns a value of the 
    761740    selected attribute. This time, however, its :obj:`transformer` 
    762     contains an instance of :obj:`MapIntValue` that maps the values 
    763     of the attribute into a binary attribute. Branch descriptions are 
    764     of the form "[<val1>, <val2>, ...<valn>]" for branches corresponding to 
    765     more than one value of the attribute. Branches that correspond to 
    766     a single value of the attribute are described with this value. If  
    767     the attribute was originally binary, it is spent and cannot be  
    768     used in the node's subtrees. Otherwise, it can reappear in the  
    769     subtrees. 
     741    contains an instance of :obj:`MapIntValue` that maps the values of 
     742    the attribute into a binary attribute. Branch descriptions are of 
     743    the form "[<val1>, <val2>, ...<valn>]" for branches corresponding to 
     744    more than one value of the attribute. Branches that correspond to a 
     745    single value of the attribute are described with this value. If the 
     746    attribute was originally binary, it is spent and cannot be used in 
     747    the node's subtrees. Otherwise, it can reappear in the subtrees. 
    770748 
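Continuing the sketch above (same assumptions), binarization is
requested by swapping the split constructor::

    # try every two-subset partition of each discrete attribute's values
    learner.split = \
        Orange.classification.tree.SplitConstructor_ExhaustiveBinary()
    print learner(data).dump()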
    771749 
     
    776754    This is currently the only constructor for splits with continuous  
    777755    attributes. It divides the range of attribute values with a threshold 
    778     that maximizes the split's quality. As always, if there is more than  
    779     one split with the highest score, a random threshold is selected.  
     756    that maximizes the split's quality. As always, if there is more than 
     757    one split with the highest score, a random threshold is selected. 
    780758    The attribute that yields the highest-scoring binary split is returned. 
    781759 
    782     The constructed :obj:`branchSelector` is again an instance of  
    783     :obj:`orange.ClassifierFromVarFD` with an attached  
    784     :obj:`transformer`. This time, :obj:`transformer` is of type  
    785     :obj:`orange.ThresholdDiscretizer`. The branch descriptions are  
     760    The constructed :obj:`branchSelector` is again an instance 
     761    of :obj:`orange.ClassifierFromVarFD` with an attached 
     762    :obj:`transformer`. This time, :obj:`transformer` is of type 
     763    :obj:`orange.ThresholdDiscretizer`. The branch descriptions are 
    786764    "<threshold" and ">=threshold". The attribute is not spent. 
    787765 
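A sketch of its use on purely continuous data (same assumptions as in
the sketches above)::

    import Orange

    data = Orange.data.Table("iris")
    learner = Orange.classification.tree.TreeLearner()
    # each split compares one continuous attribute against a threshold
    learner.split = Orange.classification.tree.SplitConstructor_Threshold()
    print learner(data).dump()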
     
    807785    one of them. Now, let us suppose that there is a single continuous 
    808786    attribute with the same score. :obj:`SplitConstructor_Combined` 
    809     would randomly select between the proposed discrete attribute and  
     787    would randomly select between the proposed discrete attribute and 
    810788    the continuous attribute, not aware of the fact that the discrete 
    811     has already competed with eight other discrete attributes. So, 
    812     the probability of selecting (each) discrete attribute would be 1/18 
    813     instead of 1/10. Although not really correct, we doubt that this 
    814     would affect the tree's performance; many other machine learning 
    815     systems simply choose the first attribute with the highest score  
    816     anyway.) 
    817  
    818     The :obj:`branchSelector`, :obj:`branchDescriptions` and whether  
     789    has already competed with eight other discrete attributes. So, the 
     790    probability of selecting (each) discrete attribute would be 1/18 
     791    instead of 1/10. Although not really correct, we doubt that this would 
     792    affect the tree's performance; many other machine learning systems 
     793    simply choose the first attribute with the highest score anyway.) 
     794 
     795    The :obj:`branchSelector`, :obj:`branchDescriptions` and whether 
    819796    the attribute is spent is decided by the winning split constructor. 
    820797 
    821798    .. attribute:: discreteSplitConstructor 
    822799 
    823         Split constructor for discrete attributes; can be, for instance, 
    824         :obj:`SplitConstructor_Attribute` or  
     800        Split constructor for discrete attributes; can be, 
     801        for instance, :obj:`SplitConstructor_Attribute` or 
    825802        :obj:`SplitConstructor_ExhaustiveBinary`. 
    826803 
     
    833810    .. attribute:: continuousSplitConstructor 
    834811     
    835         Split constructor for continuous attributes; at the moment, it  
    836         can be either :obj:`SplitConstructor_Threshold` or a split 
     812        Split constructor for continuous attributes; at the moment, 
     813        it can be either :obj:`SplitConstructor_Threshold` or a split 
    837814        constructor you programmed in Python. 
    838815 
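A sketch of wiring the two sub-constructors together (assumptions as in
the sketches above)::

    import Orange

    combined = Orange.classification.tree.SplitConstructor_Combined()
    # binarize the discrete attributes, threshold the continuous ones
    combined.discreteSplitConstructor = \
        Orange.classification.tree.SplitConstructor_ExhaustiveBinary()
    combined.continuousSplitConstructor = \
        Orange.classification.tree.SplitConstructor_Threshold()
    learner = Orange.classification.tree.TreeLearner()
    learner.split = combined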
     
    882859    .. attribute:: maxMajor 
    883860 
    884         Maximal proportion of the majority class. When this is exceeded, 
     861        Maximal proportion of the majority class. When this is exceeded, 
    885862        induction stops. 
    886863 
    887864    .. attribute:: minExamples 
    888865 
    889         Minimal number of examples in internal leaves. Subsets with fewer 
    890         than :obj:`minExamples` examples are not split any further. 
     866        Minimal number of examples in internal leaves. Subsets with
     867        fewer than :obj:`minExamples` examples are not split any further. 
    891868        The example count is weighted. 
    892869 
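For illustration, a minimal sketch of tightening both criteria (the
class name :obj:`StopCriteria_common` and the :obj:`stop` component on
the learner are assumptions about the Orange 2.x API; attribute names
are as documented above)::

    import Orange

    data = Orange.data.Table("lenses")
    learner = Orange.classification.tree.TreeLearner()
    stop = Orange.classification.tree.StopCriteria_common()
    stop.maxMajor = 0.9    # stop once a node is 90% pure
    stop.minExamples = 5   # do not split subsets of fewer than 5 examples
    learner.stop = stop
    print learner(data).dump()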
     
    908885and, optionally, a list of new weight ID's. 
    909886 
    910 Most 
    911 :obj:`ExampleSplitter` classes simply call the node's 
    912 :obj:`branchSelector` and assign examples to corresponding 
    913 branches. When the value is unknown, they choose a particular 
    914 branch or simply skip the example. 
    915 
    916 Some enhanced splitters can split examples. An example (actually, 
    917 a pointer to it) is copied to more than one subset. To facilitate
    918 real splitting, weights are needed. Each branch is assigned a
    919 weight ID (each would usually have its own ID) and all examples
    920 that are in that branch (either completely or partially) should
    921 have this meta attribute. If an example hasn't been split, it
    922 has only one additional attribute - with the weight ID corresponding
    923 to the subset to which it went. An example that is split among,
    924 say, three subsets has three new meta attributes, one for each
    925 subset. ID's of weight meta attributes are returned by the
    926 :obj:`ExampleSplitter` to be used in the induction of the
    927 corresponding subtrees.
    928 
    929 Note that weights are used only when needed. When no splitting
    930 occurred - because the splitter is not able to do it or because
    931 there was no need for splitting - no weight ID's are returned.
     887Most :obj:`ExampleSplitter` classes simply call the node's
     888:obj:`branchSelector` and assign examples to corresponding branches. When
     889the value is unknown, they choose a particular branch or simply skip
     890the example.
     891 
     892Some enhanced splitters can split examples. An example (actually, a
     893pointer to it) is copied to more than one subset. To facilitate real
     894splitting, weights are needed. Each branch is assigned a weight ID (each
     895would usually have its own ID) and all examples that are in that branch
     896(either completely or partially) should have this meta attribute. If an
     897example hasn't been split, it has only one additional attribute - with
     898the weight ID corresponding to the subset to which it went. An example
     899that is split among, say, three subsets has three new meta attributes,
     900one for each subset. ID's of weight meta attributes are returned by
     901the :obj:`ExampleSplitter` to be used in the induction of the
     902corresponding subtrees.
     903 
     904Note that weights are used only when needed. When no splitting occurred -
     905because the splitter is not able to do it or because there was no need
     906for splitting - no weight ID's are returned.
    932907 
    933908 
    934909.. class:: ExampleSplitter 
    935910 
    936     An abstract base class for objects that split sets of examples into 
    937     subsets. The derived classes differ in how they treat examples which 
    938     cannot be unambiguously placed into a single branch (usually due 
    939     to an unknown value of the crucial attribute). 
     911    An abstract base class for objects that split sets of examples
     912    into subsets. The derived classes differ in how they treat examples
     913    which cannot be unambiguously placed into a single branch (usually
     914    due to an unknown value of the crucial attribute).
    940915 
    941916    .. method:: __call__(node, examples[, weightID]) 
    942917         
    943         Use the information in :obj:`node` (particularly the 
    944         :obj:`branchSelector`) to split the given set of examples into subsets. 
    945         Return a tuple with a list of example generators and a list of weights. 
    946         The list of weights is either an ordinary Python list of integers or 
    947         None when no splitting of examples occurs and thus no weights are 
    948         needed. 
     918        Use the information in :obj:`node` (particularly the
     919        :obj:`branchSelector`) to split the given set of examples into
     920        subsets. Return a tuple with a list of example generators and
     921        a list of weights. The list of weights is either an ordinary
     922        Python list of integers or None when no splitting of examples
     923        occurs and thus no weights are needed.
    949924 
    950925 
     
    953928    Bases: :class:`ExampleSplitter` 
    954929 
    955     Simply ignores the examples for which no single branch can be determined. 
     930    Simply ignores the examples for which no single branch can be 
     931    determined. 
    956932 
    957933.. class:: ExampleSplitter_UnknownsToCommon 
     
    986962    Bases: :class:`ExampleSplitter` 
    987963 
    988     Splits examples with an unknown value of the attribute according to 
     964    Splits examples with an unknown value of the attribute according to
    989965    the proportions of examples in each branch. 
    990966 
     
    993969    Bases: :class:`ExampleSplitter` 
    994970 
    995     Splits examples with an unknown value of the attribute according to 
    996     the distribution proposed by the selector (which is in most cases the same 
     971    Splits examples with an unknown value of the attribute according to
     972    the distribution proposed by the selector (which is in most cases the same
    997973    as the proportions of examples in branches). 
    998974 
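A sketch of plugging in one of the splitters described above (the
:obj:`exampleSplitter` component on the learner and the exact class name
are assumptions about the Orange 2.x API)::

    import Orange

    data = Orange.data.Table("lenses")
    learner = Orange.classification.tree.TreeLearner()
    # distribute examples with unknown values across branches according
    # to the selector's distribution; weight ID's are handled internally
    learner.exampleSplitter = \
        Orange.classification.tree.ExampleSplitter_UnknownsAsSelector()
    tree = learner(data)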
     
    1000976============================= 
    1001977 
    1002 This is the classifier's counterpart of :class:`ExampleSplitter`. It 
     978This is the classifier's counterpart of :class:`ExampleSplitter`. It
    1003979decides the fate of examples that need to be classified and cannot 
    1004980be unambiguously put in a branch. 
     
    1010986    .. method:: __call__(node, example) 
    1011987 
    1012         Descends down the tree until it reaches a leaf or a node in 
    1013         which a vote of subtrees is required. In both cases, a tuple 
    1014         of two elements is returned; in the former, the tuple contains 
    1015         the reached node and None, in the latter it 
    1016         contains a node and weights of votes for subtrees (a list of floats).
    1017 
    1018         :obj:`Descender`'s that never split examples always descend
    1019         to a leaf, but they differ in the treatment of examples with
    1020         unknown values (or, in general, examples for which a branch
    1021         cannot be determined at some node(s) of the tree).
    1022         :obj:`Descender`'s that do split examples differ in returned
    1023         vote weights.
     988        Descends down the tree until it reaches a leaf or a node in
     989        which a vote of subtrees is required. In both cases, a tuple
     990        of two elements is returned; in the former, the tuple contains
     991        the reached node and None, in the latter it contains a node and
     992        weights of votes for subtrees (a list of floats).
     993 
     994        :obj:`Descender`'s that never split examples always descend to a
     995        leaf, but they differ in the treatment of examples with unknown
     996        values (or, in general, examples for which a branch cannot be
     997        determined at some node(s) of the tree). :obj:`Descender`'s that
     998        do split examples differ in returned vote weights.
    1024999 
    10251000.. class:: Descender_UnknownsToNode 
     
    10271002    Bases: :class:`Descender` 
    10281003 
    1029     When an example cannot be classified into a single branch, the 
    1030     current node is returned. Thus, the node's :obj:`NodeClassifier` 
    1031     will be used to make a decision. It is your responsibility to see 
    1032     that even the internal nodes have their :obj:`NodeClassifier` 
    1033     (i.e., do not disable the construction of node classifiers, and do 
    1034     not remove them manually after the induction). 
     1004    When an example cannot be classified into a single branch, the current
     1005    node is returned. Thus, the node's :obj:`NodeClassifier` will be used
     1006    to make a decision. It is your responsibility to see that even the
     1007    internal nodes have their :obj:`NodeClassifier` (i.e., do not disable
     1008    the construction of node classifiers, and do not remove them manually
     1009    after the induction).
    10351010 
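A sketch (that the tree classifier exposes a :obj:`descender` component
is an assumption about the Orange 2.x API)::

    import Orange

    data = Orange.data.Table("lenses")
    tree = Orange.classification.tree.TreeLearner(data)
    # stop at the current node - and use its nodeClassifier - whenever
    # an example cannot be sent down a single branch
    tree.descender = Orange.classification.tree.Descender_UnknownsToNode()
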
    10361011.. class:: Descender_UnknownsToBranch 
     
    10541029    Bases: :class:`Descender` 
    10551030 
    1056     Classifies examples with unknown values to the branch that received 
     1031    Classifies examples with unknown values to the branch that received
    10571032    the highest recommendation from the selector. 
    10581033 
     
    10611036    Bases: :class:`Descender` 
    10621037 
    1063     Makes the subtrees vote for the example's class; the vote is 
    1064     weighted according to the sizes of the branches. 
     1038    Makes the subtrees vote for the example's class; the vote is weighted 
     1039    according to the sizes of the branches. 
    10651040 
    10661041.. class:: Descender_MergeAsSelector 
     
    10681043    Bases: :class:`Descender` 
    10691044 
    1070     Makes the subtrees vote for the example's class; the vote is 
    1071     weighted according to the selector's proposal. 
     1045    Makes the subtrees vote for the example's class; the vote is weighted
     1046    according to the selector's proposal.
    10721047 
    10731048Pruning 
     
    10771052    pair: classification trees; pruning 
    10781053 
    1079 Tree pruners derived from :obj:`Pruner` can be given either a 
    1080 :obj:`Node` (presumably, but not necessarily a root) or a 
    1081 :obj:`_TreeClassifier`. The result is a new :obj:`Node` 
    1082 or a :obj:`_TreeClassifier` with a pruned tree. The original 
    1083 tree remains intact. 
    1084  
    1085 The pruners construct only a shallow copy of a tree. 
    1086 The pruned tree's :obj:`Node` objects contain references to the same 
    1087 contingency matrices, node classifiers, branch selectors, ... 
    1088 as the original tree. Thus, you may modify a pruned tree structure 
    1089 (manually cut it, add new nodes, replace components) but modifying, 
    1090 for instance, some node's :obj:`nodeClassifier` (the 
    1091 :obj:`nodeClassifier` itself, not a reference to it!) would also modify 
    1092 the :obj:`nodeClassifier` of the corresponding node in 
    1093 the original tree. 
    1094 
    1095 Pruners cannot construct a 
    1096 :obj:`nodeClassifier` nor merge the :obj:`nodeClassifier`'s of the pruned 
    1097 subtrees into classifiers for new leaves. Thus, if you want to build 
    1098 a prunable tree, internal nodes must have their :obj:`nodeClassifier` 
    1099 defined. Fortunately, this is the default. 
     1054Tree pruners derived from :obj:`Pruner` can be given either a :obj:`Node` 
     1055(presumably, but not necessarily a root) or a :obj:`_TreeClassifier`. The 
     1056result is a new :obj:`Node` or a :obj:`_TreeClassifier` with a pruned 
     1057tree. The original tree remains intact. 
     1058 
     1059The pruners construct only a shallow copy of a tree. The pruned tree's
     1060:obj:`Node` objects contain references to the same contingency matrices,
     1061node classifiers, branch selectors, ... as the original tree. Thus,
     1062you may modify a pruned tree structure (manually cut it, add new nodes,
     1063replace components) but modifying, for instance, some node's
     1064:obj:`nodeClassifier` (the :obj:`nodeClassifier` itself, not a reference
     1065to it!) would also modify the :obj:`nodeClassifier` of the corresponding
     1066node in the original tree.
     1067 
     1068Pruners cannot construct a :obj:`nodeClassifier` nor merge the
     1069:obj:`nodeClassifier`'s of the pruned subtrees into classifiers for new
     1070leaves. Thus, if you want to build a prunable tree, internal nodes
     1071must have their :obj:`nodeClassifier` defined. Fortunately, this is
     1072the default.
    11001073 
    11011074.. class:: Pruner 
    11021075 
    1103     This is an abstract base class which defines nothing useful, only  
     1076    This is an abstract base class which defines nothing useful, only 
    11041077    a pure virtual call operator. 
    11051078 
    11061079    .. method:: __call__(tree) 
    11071080 
    1108         Prunes a tree. The argument can be either a tree classifier or  
     1081        Prunes a tree. The argument can be either a tree classifier or 
    11091082        a tree node; the result is of the same type as the argument. 
    11101083 
     
    11131086    Bases: :class:`Pruner` 
    11141087 
    1115     In Orange, a tree can have non-trivial subtrees (i.e. subtrees 
    1116     with more than one leaf) in which all the leaves have the same majority 
     1088    In Orange, a tree can have non-trivial subtrees (i.e. subtrees with
     1089    more than one leaf) in which all the leaves have the same majority
    11171090    class. (This is allowed because those leaves can still have different 
    1118     distributions of classes and thus predict different probabilities.) 
    1119     However, this can be undesirable when we're only interested in the 
    1120     class prediction or a simple tree interpretation. The 
     1091    distributions of classes and thus predict different probabilities.)
     1092    However, this can be undesirable when we're only interested
     1093    in the class prediction or a simple tree interpretation. The
    11211094    :obj:`Pruner_SameMajority` prunes the tree so that there is no 
    11221095    subtree in which all the nodes would have the same majority class. 
     
    11261099 
    11271100    Note that the leaves with more than one majority class require some  
    1128     special handling. The pruning goes backwards, from leaves to the root.  
    1129     When siblings are compared, the algorithm checks whether they  
    1130     have (at least one) common majority class. If so, they can be pruned. 
     1101    special handling. The pruning goes backwards, from leaves to the root. 
     1102    When siblings are compared, the algorithm checks whether they have 
     1103    (at least one) common majority class. If so, they can be pruned. 
    11311104 
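To make this concrete, a minimal sketch (the pruner is called as
described above; the data set and learner call are assumed)::

    import Orange

    data = Orange.data.Table("lenses")
    tree = Orange.classification.tree.TreeLearner(data)
    pruner = Orange.classification.tree.Pruner_SameMajority()
    pruned = pruner(tree)   # the original tree remains intact
    print pruned.dump()
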
    11321105.. class:: Pruner_m 
     
    11441117================= 
    11451118 
    1146 The included printing functions can 
    1147 print out practically anything you'd like to 
    1148 know, from the number of examples, the proportion of examples of the 
    1149 majority class in nodes and similar, to more complex statistics like the 
     1119The included printing functions can print out practically anything you'd 
     1120like to know, from the number of examples, the proportion of examples of 
     1121the majority class in nodes and similar, to more complex statistics like the 
    11501122proportion of examples in a particular class divided by the proportion 
    1151 of examples of this class in a parent node. And even more, you can 
    1152 define your own callback functions to be used for printing. 
     1123of examples of this class in a parent node. And even more, you can define 
     1124your own callback functions to be used for printing. 
    11531125 
    11541126Before we go on: you can read all about the function and use it to its 
     
    11661138 
    11671139**<precision>** is in the same format as in Python (or C) string 
    1168 formatting. For instance, :samp:`%N` denotes the number of examples in the node, 
    1169 hence :samp:`%6.2N` would mean output to two decimal digits and six places 
    1170 altogether. If left out, a default format :samp:`5.3` is used, unless you  
    1171 multiply the numbers by 100, in which case the default is :samp:`.0` 
    1172 (no decimals, the number is rounded to the nearest integer). 
     1140formatting. For instance, :samp:`%N` denotes the number of examples in 
     1141the node, hence :samp:`%6.2N` would mean output to two decimal digits 
     1142and six places altogether. If left out, a default format :samp:`5.3` is 
     1143used, unless you multiply the numbers by 100, in which case the default 
     1144is :samp:`.0` (no decimals, the number is rounded to the nearest integer). 
    11731145 
    11741146**<divisor>** tells what to divide the quantity in that node with. 
     
    11761148:samp:`%NbP` will give the number of examples in the node divided by the 
    11771149number of examples in the parent node. You can also use precision formatting, 
    1178 e.g. :samp:`%6.2NbP`. :samp:`bA` is division by the same quantity over the entire data 
    1179 set, so :samp:`%NbA` will tell you the proportion of examples (out of the entire 
    1180 training data set) that fell into that node. If division is impossible 
    1181 since the parent node does not exist or some data is missing, a dot is 
    1182 printed out instead of the quantity. 
     1150e.g. :samp:`%6.2NbP`. :samp:`bA` is division by the same quantity over the entire 
     1151data set, so :samp:`%NbA` will tell you the proportion of examples (out 
     1152of the entire training data set) that fell into that node. If division is 
     1153impossible since the parent node does not exist or some data is missing, 
     1154a dot is printed out instead of the quantity. 
    11831155 
    11841156**<quantity>** is the only required element. It defines what to print. 
     
    12171189    - this will give the number of examples of Iris-virginica divided by the 
    12181190    number of examples of this class in the parent node. If you are 
    1219     interested in examples that are *not* Iris-virginica, say  
     1191    interested in examples that are *not* Iris-virginica, say 
    12201192    :samp:`%5.3CbP!="Iris-virginica"` 
    12211193 
    12221194    For regression trees, you can use operators =, !=, <, <=, >, and >=,  
    1223     as in :samp:`%C<22` - add the precision and divisor if you will. You can also 
    1224     check the number of examples in a certain interval: :samp:`%C[20, 22]` 
    1225     will give you the number of examples between 20 and 22 (inclusive) 
    1226     and :samp:`%C(20, 22)` will give the number of such examples excluding the 
    1227     boundaries. You can of course mix the parentheses, e.g. :samp:`%C(20, 22]`. 
    1228     If you would like the examples outside the interval, add a :samp:`!`, 
    1229     like :samp:`%C!(20, 22]`. 
    1230   
     1195    as in :samp:`%C<22` - add the precision and divisor if you will. You 
     1196    can also check the number of examples in a certain interval: 
     1197    :samp:`%C[20, 22]` will give you the number of examples between 20 
     1198    and 22 (inclusive) and :samp:`%C(20, 22)` will give the number of 
     1199    such examples excluding the boundaries. You can of course mix the 
     1200    parentheses, e.g. :samp:`%C(20, 22]`.  If you would like the examples 
     1201    outside the interval, add a :samp:`!`, like :samp:`%C!(20, 22]`. 
     1202 
    12311203:samp:`c` 
    12321204    Same as above, except that it computes the proportion of the class 
     
    12721244 
    12731245Let's now print out the predicted class at each node, the number 
    1274 of examples in the majority class together with the total number of examples 
    1275 in the node:: 
     1246of examples in the majority class together with the total number of examples in 
     1247the node:: 
    12761248 
    12771249    >>> print tree.dump(leafStr="%V (%M out of %N)") 
     
    12861258 
    12871259Would you like to know how the number of examples declines as 
    1288 compared to the entire data set and to the parent node? We find 
    1289 it with this:: 
     1260compared to the entire data set and to the parent node? We find it 
     1261with this:: 
    12901262 
    12911263    >>> print tree.dump(leafStr="%V (%^MbA%, %^MbP%)") 
     
    13021274examples in the majority class. We want it divided by the number of 
    13031275all examples from this class on the entire data set, hence :samp:`%MbA`. 
    1304 To have it multiplied by 100, we say :samp:`%^MbA`. The percent sign *after* 
    1305 that is just printed out literally, just as the comma and parentheses 
    1306 (see the output). The string for showing the proportion of this class 
    1307 in the parent is the same except that we have :samp:`bP` instead  
    1308 of :samp:`bA`. 
     1276To have it multiplied by 100, we say :samp:`%^MbA`. The percent sign 
     1277*after* that is just printed out literally, just as the comma and 
     1278parentheses (see the output). The string for showing the proportion 
     1279of this class in the parent is the same except that we have :samp:`bP` 
     1280instead of :samp:`bA`. 
    13091281 
    13101282And now for the output: all examples of setosa fell into the first node. 
    13111283For versicolor, we have 98% in one node; the rest is certainly 
    1312 not in the neighbouring node (petal length>=5.350) since all 
    1313 versicolors from the node petal width<1.750 went to petal length<5.350 
    1314 (we know this from the 100% in that line). Virginica is the  
    1315 majority class in the three nodes that together contain 94% of this 
    1316 class (4+4+86). The rest must have gone to the same node as versicolor. 
     1284not in the neighbouring node (petal length>=5.350) since all versicolors 
     1285from the node petal width<1.750 went to petal length<5.350 (we know 
     1286this from the 100% in that line). Virginica is the majority class in 
     1287the three nodes that together contain 94% of this class (4+4+86). The 
     1288rest must have gone to the same node as versicolor. 
    13171289 
    13181290If you find this guesswork annoying - so do I. Let us print out the 
     
    13511323:samp:`data.domain.classVar.values`, you'll learn that the order is setosa, 
    13521324versicolor, virginica; so in the node at petal length<5.350 we have 49 
    1353 versicolors and 3 virginicae. To print out the proportions, we can use 
    1354 :samp:`%.2d` - this gives us the proportions within the node, rounded to 
    1355 two decimals:: 
     1325versicolors and 3 virginicae. To print out the proportions, we can use 
     1326:samp:`%.2d` - this gives us the proportions within the node, rounded to two 
     1327decimals:: 
    13561328 
    13571329    petal width<0.800: [1.00, 0.00, 0.00] 
     
    13651337 
    13661338We haven't tried printing out any information for internal nodes. 
    1367 To start with the most trivial case, we shall print the prediction 
    1368 at each node. 
     1339To start with the most trivial case, we shall print the prediction at 
     1340each node. 
    13691341 
    13701342:: 
     
    13881360 
    13891361Note that the output is somewhat different now: there appeared another 
    1390 node called *root* and the tree looks one level deeper. This is 
    1391 needed to print out the data for that node too. 
     1362node called *root* and the tree looks one level deeper. This is needed 
     1363to print out the data for that node too. 
    13921364 
    13931365Now for something more complicated: let us observe how the number 
     
    14001372of examples in this class. Add :samp:`^.1` and the result will be 
    14011373multiplied and printed with one decimal. The trailing :samp:`%` is printed 
    1402 out. In parentheses we print the same thing except that we divide by the 
    1403 examples in the parent node. Note the use of single quotes, so that we can 
    1404 use double quotes inside the string when we specify the class. 
     1374out. In parentheses we print the same thing except that we divide by 
     1375the examples in the parent node. Note the use of single quotes, so that 
     1376we can use double quotes inside the string when we specify the class. 
    14051377 
    14061378:: 
     
    14181390See what's in the parentheses in the root node? If :meth:`~TreeClassifier.dump` 
    14191391cannot compute something (in this case it's because the root has no parent), 
    1420 it prints out a dot. You can also replace :samp:`=` by :samp:`!=` and it 
     1392it prints out a dot. You can also replace :samp:`=` by :samp:`!=` and it
    14211393will count all classes *except* virginica. 
    14221394 
     
    14641436    |    |    TAX>=534.500: 21.9 
    14651437 
    1466 Let us add the standard error in both internal nodes and leaves, and the 
    1467 90% confidence intervals in the leaves:: 
     1438Let us add the standard error in both internal nodes and leaves, and 
     1439the 90% confidence intervals in the leaves:: 
    14681440 
    14691441    >>> print tree.dump(leafStr="[SE: %E]\t %V %I(90)", nodeStr="[SE: %E]") 
     
    14961468leaf average anyway? Not necessarily: the tree predicts whatever the 
    14971469:attr:`TreeClassifier.nodeClassifier` in a leaf returns. 
    1498 Since :samp:`%V` uses the 
    1499 :obj:`Orange.data.variable.Continuous`' function for printing out the value, 
    1500 the printed number has the same number of decimals 
    1501 as in the data file. 
     1470Since :samp:`%V` uses the :obj:`Orange.data.variable.Continuous`' function 
     1471for printing out the value, the printed number has the same 
     1472number of decimals as in the data file. 
    15021473 
    15031474Regression trees cannot print the distributions in the same way 
     
    15251496 
    15261497The last line, for instance, says that the number of examples with the 
    1527 class below 22 among those with tax above 534 is 30 times higher 
    1528 than the number of such examples in its parent node. 
     1498class below 22 among those with tax above 534 is 30 times higher than 
     1499the number of such examples in its parent node. 
    15291500 
    15301501For another exercise, let's count the same for all examples *outside* 
     
    15361507    >>> print tree.dump(leafStr="%C![20,22] (%^cbP![20,22]%)", nodeStr=".") 
    15371508 
    1538 OK, let's observe the format string for one last time. :samp:`%c![20, 22]` 
    1539 would be the proportion of examples (within the node) whose values are 
    1540 below 20 or above 22. By :samp:`%cbP![20, 22]` we divide this by the same 
    1541 statistic computed on the parent. Add a :samp:`^` and you have the percentages. 
     1509OK, let's observe the format string for one last time. :samp:`%c![20, 
     151022]` would be the proportion of examples (within the node) whose values 
     1511are below 20 or above 22. By :samp:`%cbP![20, 22]` we divide this by 
     1512the same statistic computed on the parent. Add a :samp:`^` and you have 
     1513the percentages. 
    15421514 
    15431515:: 
     
    15651537:meth:`TreeClassifier.dump`'s argument :obj:`userFormats` can be used to print out 
    15661538some other information in the leaves or nodes. If provided, 
    1567 :obj:`userFormats` should contain a list of tuples with a regular expression 
    1568 and a callback function to be called when that expression is found in the 
    1569 format string. Expressions from :obj:`userFormats` are checked before 
    1570 the built-in expressions discussed above, so you can override the built-ins 
    1571 if you want to. 
     1539:obj:`userFormats` should contain a list of tuples with a regular 
     1540expression and a callback function to be called when that expression 
     1541is found in the format string. Expressions from :obj:`userFormats` 
     1542are checked before the built-in expressions discussed above, so you can 
     1543override the built-ins if you want to. 
    15721544 
    15731545The regular expression should describe a string like those we used above, 
    15741546for instance the string :samp:`%.2DbP`. When a leaf or internal node 
    1575 is printed out, the format string (:obj:`leafStr` or :obj:`nodeStr`)  
    1576 is checked for these regular expressions and when the match is found,  
     1547is printed out, the format string (:obj:`leafStr` or :obj:`nodeStr`) 
     1548is checked for these regular expressions and when the match is found, 
    15771549the corresponding callback function is called. 
    15781550 
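For instance, a hypothetical extra format :samp:`%B`, printing the
number of branches of a node, could be sketched like this (the callback
signature and the :func:`insertStr` helper are assumed from the code
discussed below)::

    import re

    def replaceNumBranches(strg, mo, node, parent, tree):
        # leaves have no list of branches at all
        if node.branches:
            return insertStr(strg, mo, str(len(node.branches)))
        return insertStr(strg, mo, "0")

    myFormat = [(re.compile("%B"), replaceNumBranches)]
    print tree.dump(leafStr="%V", nodeStr="%V (%B branches)",
                    userFormats=myFormat)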
     
    16131585        return insertStr(strg, mo, str(node.nodeClassifier.defaultValue)) 
    16141586 
    1615 It therefore takes the value predicted at the node  
     1587It therefore takes the value predicted at the node 
    16161588(:samp:`node.nodeClassifier.defaultValue`), converts it to a string 
    16171589and passes it to *insertStr* to do the replacement. 
    16181590 
    16191591A more complex regular expression is the one for the proportion of 
    1620 majority class, defined as :samp:`"%"+fs+"M"+by`. It uses the 
    1621 two partial expressions defined above. 
     1592majority class, defined as :samp:`"%"+fs+"M"+by`. It uses the two partial 
     1593expressions defined above. 
    16221594 
    16231595Let's say we'd like to print the classification margin for each node, 
     
    16301602   :lines: 7-31 
    16311603 
    1632 We first defined getMargin which gets the distribution and computes the 
    1633 margin. The callback function, replaceB, computes the margin for the node. 
    1634 If we need to divide the quantity by something (that is, if the :data:`by` 
    1635 group is present), we call :func:`byWhom` to get the node with whose margin 
    1636 this node's margin is to be divided. If this node (usually the parent) 
    1637 does not exist or if its margin is zero, we call :func:`insertDot` 
    1638 to insert a dot; otherwise we call :func:`insertNum`, which will insert 
    1639 the number, obeying the format specified by the user. myFormat is a list 
    1640 containing the regular expression and the callback function. 
     1604We first defined getMargin which gets the distribution and computes 
     1605the margin. The callback function, replaceB, computes the margin for 
     1606the node. If we need to divide the quantity by something (that is, 
     1607if the :data:`by` group is present), we call :func:`byWhom` to get the 
     1608node with whose margin this node's margin is to be divided. If this node 
     1609(usually the parent) does not exist or if its margin is zero, we call 
     1610:func:`insertDot` to insert a dot; otherwise we call :func:`insertNum`, 
     1611which will insert the number, obeying the format specified by the 
     1612user. myFormat is a list containing the regular expression and the 
     1613callback function. 
    16411614 
    16421615We can now print out the iris tree: 