Changeset 7439:12b299863b7c in orange


Timestamp: 02/04/11 13:05:03 (3 years ago)
Author: markotoplak
Branch: default
Convert: c32810d8683410894d2e42284326cc2e35dc89e2
Message: Matija's wish: sphinx does not complain about errors in "tree.py".
Location: orange
Files: 2 added, 1 edited

  • orange/Orange/classification/tree.py

    r7414 r7439  
    3535        flag to true. Note that even when the flag is not  
    3636        set, the contingencies get computed and stored to  
    37         :obj:`TreeNone`, but are removed shortly afterwards.  
     37        :obj:`TreeNode`, but are removed shortly afterwards.  
    3838        The details are given in the  
    39         description of the :obj:`TreeLearnerBase`object. 
     39        description of the :obj:`TreeLearnerBase` object. 
    4040 
    4141    .. attribute:: examples, weightID 
     
    873873.. index:: 
    874874    pair: classification trees; pruning 
    875 .. index:: pruning classification trees 
    876  
    877     Classes derived from :obj:`TreePruner` prune the trees as a 
    878     described in the section pruning XXXXXXXX - make sure you read it  
    879     to understand what the pruners will do to your trees. 
     875 
     876Classes derived from :obj:`TreePruner` prune the trees as a 
     877described in the section pruning XXXXXXXX - make sure you read it  
     878to understand what the pruners will do to your trees. 
    880879 
    881880.. class:: TreePruner 
     
    10121011of some other type, we don't know how to handle it and thus raise  
    10131012an exception. (Note that we could also use  
     1013 
    10141014:: 
     1015 
    10151016    if type(x) == orange.TreeClassifier: 
    10161017 
     
    11011102 
    11021103The stop is trivial. The default is set by 
     1104 
    11031105:: 
    11041106    >>> learner.stop = orange.TreeStopCriteria_common() 
     
    13571359 
    13581360.. _tree_c45.py: code/tree_c45.py 
    1359 .. _iris.tac: code/iris.tab 
     1361.. _iris.tab: code/iris.tab 
    13601362 
    13611363The simplest way to use :class:`C45Learner` is to call it. This 
     
    14091411===================== 
    14101412 
    1411 .. autofunction:: c45_printTree 
     1413.. autofunction:: printTreeC45 
    14121414 
    14131415=============== 
     
    18941896.. autodata:: fs 
    18951897 
    1896 <dt>fs</dt> 
    1897  
    1898 <dt>by</dt> 
    1899 <dd>Defines <code>bP</code> or <code>bA</code> or nothing; the result is in groups <code>by</code>.</dd> 
    1900 </dl> 
    1901  
    1902 <P>For a trivial example, "%V" is implemented like this. There is the following tuple in the list of built-in formats: <code>(re.compile("%V"), replaceV)</code>. <code>replaceV</code> is a function defined by:</P> 
    1903 <xmp class="code">def replaceV(strg, mo, node, parent, tree): 
    1904     return insertStr(strg, mo, str(node.nodeClassifier.defaultValue))</xmp> 
    1905 <P>It therefore takes the value predicted at the node (<code>node.nodeClassifier.defaultValue</code>), converts it to a string and passes it to <code>insertStr</code> to do the replacement.</P> 
    1906  
    1907 <P>A more complex regular expression is the one for the proportion of majority class, defined as <code>"%"+fs+"M"+by</code>. It uses the two partial expressions defined above.</P> 
    1908  
    1909 <P>Let's say with like to print the classification margin for each node, that is, the difference between the proportion of the largest and the second largest class in the node.</P> 
    1910  
    1911 <p class="header">part of <a href="orngTree2.py">orngTree2.py</a></p> 
    1912 <xmp class="code">def getMargin(dist): 
    1913     if dist.abs < 1e-30: 
    1914         return 0 
    1915     l = list(dist) 
    1916     l.sort() 
    1917     return (l[-1] - l[-2]) / dist.abs 
    1918  
    1919 def replaceB(strg, mo, node, parent, tree): 
    1920     margin = getMargin(node.distribution) 
    1921  
    1922     by = mo.group("by") 
    1923     if margin and by: 
    1924         whom = orngTree.byWhom(by, parent, tree) 
    1925         if whom and whom.distribution: 
    1926             divMargin = getMargin(whom.distribution) 
    1927             if divMargin > 1e-30: 
    1928                 margin /= divMargin 
    1929             else: 
    1930                 orngTree.insertDot(strg, mo) 
    1931         else: 
    1932             return orngTree.insertDot(strg, mo) 
    1933     return orngTree.insertNum(strg, mo, margin) 
    1934  
    1935  
    1936 myFormat = [(re.compile("%"+orngTree.fs+"B"+orngTree.by), replaceB)]</xmp> 
    1937  
    1938 <P>We first defined <code>getMargin</code> which gets the distribution and computes the margin. The callback replaces, <code>replaceB</code>, computes the margin for the node. If we need to divided the quantity by something (that is, if the <code>by</code> group is present), we call <code>orngTree.byWhom</code> to get the node with whose margin this node's margin is to be divided. If this node (usually the parent) does not exist of if its margin is zero, we call <code>insertDot</code> to insert a dot, otherwise we call <code>insertNum</code> which will insert the number, obeying the format specified by the user.</P> 
    1939  
    1940 <P><code>myFormat</code> is a list containing the regular expression and the callback function.</P> 
    1941  
    1942 <P>We can now print out the iris tree, for instance using the following call.</P> 
    1943 <xmp class="code">orngTree.printTree(tree, leafStr="%V %^B% (%^3.2BbP%)", userFormats = myFormat)</xmp> 
    1944  
    1945 <P>And this is what we get.</P> 
    1946 <xmp class="printout">petal width<0.800: Iris-setosa 100% (100.00%) 
    1947 petal width>=0.800 
    1948 |    petal width<1.750 
    1949 |    |    petal length<5.350: Iris-versicolor 88% (108.57%) 
    1950 |    |    petal length>=5.350: Iris-virginica 100% (122.73%) 
    1951 |    petal width>=1.750 
    1952 |    |    petal length<4.850: Iris-virginica 33% (34.85%) 
    1953 |    |    petal length>=4.850: Iris-virginica 100% (104.55%) 
    1954 </xmp> 
    1955  
    1956  
    1957 <h2>Plotting the Tree using Dot</h2> 
    1958  
    1959 <p>Function <code>printDot</code> prints the tree to a file in a format used by <a 
    1960 href="http://www.research.att.com/sw/tools/graphviz">GraphViz</a>. 
    1961 Uses the same parameters as <code>printTxt</code> defined above, and 
    1962 in addition two parameters which define the shape used for internal 
     1898.. autodata:: by 
     1899 
     1900For a trivial example, :samp:`%V` is implemented like this. There is the 
     1901following tuple in the list of built-in formats:: 
     1902 
     1903    (re.compile("%V"), replaceV) 
     1904 
     1905:obj:`replaceV` is a function defined by:: 
     1906 
     1907    def replaceV(strg, mo, node, parent, tree): 
     1908        return insertStr(strg, mo, str(node.nodeClassifier.defaultValue)) 
     1909 
     1910It therefore takes the value predicted at the node  
     1911(:samp:`node.nodeClassifier.defaultValue` ), converts it to a string 
     1912and passes it to <code>insertStr</code> to do the replacement. 
     1913 
     1914A more complex regular expression is the one for the proportion of 
     1915majority class, defined as :samp:`"%"+fs+"M"+by`. It uses the 
     1916two partial expressions defined above. 
     1917 
     1918Let's say with like to print the classification margin for each node, 
     1919that is, the difference between the proportion of the largest and the 
     1920second largest class in the node (part of `orngTree2.py`_): 
     1921 
     1922.. _orngTree2.py: code/orngTree2.py 
     1923 
     1924.. literalinclude:: code/orngTree2.py 
     1925   :lines: 7-30 
     1926 
     1927We first defined getMargin which gets the distribution and computes the 
     1928margin. The callback replaces, replaceB, computes the margin for the node. 
     1929If we need to divided the quantity by something (that is, if the :data:`by` 
     1930group is present), we call :func:`byWhom` to get the node with whose margin 
     1931this node's margin is to be divided. If this node (usually the parent) 
     1932does not exist of if its margin is zero, we call :func:`insertDot` 
     1933to insert a dot, otherwise we call :func:`insertNum` which will insert  
     1934the number, obeying the format specified by the user. myFormat is a list  
     1935containing the regular expression and the callback function. 
     1936 
     1937 
     1938We can now print out the iris tree: 
     1939 
     1940.. literalinclude:: code/orngTree2.py 
     1941    :lines: 32 
     1942 
     1943And we get:: 
     1944 
     1945    petal width<0.800: Iris-setosa 100% (100.00%) 
     1946    petal width>=0.800 
     1947    |    petal width<1.750 
     1948    |    |    petal length<5.350: Iris-versicolor 88% (108.57%) 
     1949    |    |    petal length>=5.350: Iris-virginica 100% (122.73%) 
     1950    |    petal width>=1.750 
     1951    |    |    petal length<4.850: Iris-virginica 33% (34.85%) 
     1952    |    |    petal length>=4.850: Iris-virginica 100% (104.55%) 
     1953 
     1954 
     1955Plotting the Tree using Dot 
     1956=========================== 
     1957 
     1958Prints the tree to a file in a format used by  
     1959`GraphViz <http://www.research.att.com/sw/tools/graphviz>`_. 
     1960Uses the same parameters as :func:`printTxt` defined above 
     1961plus two parameters which define the shape used for internal 
    19631962nodes and laves of the tree: 
    19641963 
    1965 <p class=section>Arguments</p> 
    1966 <dl class=arguments> 
    1967   <dt>leafShape</dt> 
    1968   <dd>Shape of the outline around leves of the tree. If "plaintext", 
    1969   no outline is used (default: "plaintext")</dd> 
    1970  
    1971   <dt>internalNodeShape</dt> 
    1972   <dd>Shape of the outline around internal nodes of the tree. If "plaintext", 
    1973   no outline is used (default: "box")</dd> 
    1974 </dl> 
     1964:param leafShape: Shape of the outline around leves of the tree.  
     1965    If "plaintext", no outline is used (default: "plaintext"). 
     1966:type leafShape: string 
     1967:param internalNodeShape: Shape of the outline around internal nodes  
     1968    of the tree. If "plaintext", no outline is used (default: "box") 
     1969:type leafShape: string 
    19751970 
    19761971<p>Check <a 
     
    24682463 
    24692464import re 
     2465 
    24702466fs = r"(?P<m100>\^?)(?P<fs>(\d*\.?\d*)?)" 
    24712467""" Defines the multiplier by 100 (:samp:`^`) and the format 
     
    24742470 
    24752471by = r"(?P<by>(b(P|A)))?" 
     2472""" Defines bP or bA or nothing; the result is in groups by. """ 
     2473 
    24762474bysub = r"((?P<bysub>b|s)(?P<by>P|A))?" 
    24772475opc = r"(?P<op>=|<|>|(<=)|(>=)|(!=))(?P<num>\d*\.?\d+)" 
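
To see how the `fs` and `by` fragments documented in this changeset combine into a user format, here is a minimal sketch. The two fragment strings are copied from the diff above, and the `%B` pattern mirrors the `myFormat` entry from the removed HTML. The helper `describe_formats` is hypothetical and not part of orngTree; it only reports what each named group captures.

```python
import re

# Regular-expression fragments as they appear in tree.py in this changeset.
fs = r"(?P<m100>\^?)(?P<fs>(\d*\.?\d*)?)"
by = r"(?P<by>(b(P|A)))?"

# Same construction as myFormat in the documentation: "%" + fs + "B" + by.
margin_pattern = re.compile("%" + fs + "B" + by)


def describe_formats(leaf_str):
    """Hypothetical helper: list (m100, fs, by) for every %B specifier found."""
    return [
        (mo.group("m100"), mo.group("fs"), mo.group("by"))
        for mo in margin_pattern.finditer(leaf_str)
    ]


if __name__ == "__main__":
    # The leafStr used in the documentation's printTree call.
    for groups in describe_formats("%V %^B% (%^3.2BbP%)"):
        print(groups)
```

The `by` group is `None` when no `bP` or `bA` suffix is present, which is the case the `replaceB` callback checks with `if margin and by`.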