source: orange/Orange/doc/widgets/Classify/C4.5.htm @ 9671:a7b056375472

Revision 9671:a7b056375472, 4.9 KB checked in by anze <anze.staric@…>, 2 years ago (diff)

Moved orange to Orange (part 2)

<html>
<head>
<title>C4.5</title>
<link rel=stylesheet href="../../../style.css" type="text/css" media=screen>
<link rel=stylesheet href="../../../style-print.css" type="text/css" media=print>
</head>

<body>

<h1>C4.5 Learner</h1>

<img class="screenshot" src="../icons/C4.5.png">
<p>C4.5 learner by Ross Quinlan</p>

<h2>Channels</h2>

<h3>Inputs</h3>

<DL class=attributes>
<DT>Examples (ExampleTable)</DT>
<DD>A table with training examples</DD>
</dl>

<h3>Outputs</h3>
<DL class=attributes>
<DT>Learner</DT>
<DD>The C4.5 learning algorithm with settings as specified in the dialog.</DD>

<DT>Classifier</DT>
<DD>Trained C4.5 classifier</DD>

<DT>C45 Tree (C45Classifier)</DT>
<DD>Induced tree in Quinlan's original format</DD>

<DT>Classification Tree (TreeClassifier)</DT>
<DD>Induced tree in Orange's native format</DD>
</dl>

<P><code>Classifier</code>, <code>C45 Tree</code> and <code>Classification Tree</code> are available only if examples are present on the input. Which of the latter two output signals is active is determined by the setting <span class="option">Convert to orange tree structure</span> (see the description below).</P>

<h2>Description</h2>

<p>This widget provides a graphical interface to Quinlan's well-known C4.5 algorithm for constructing classification trees. Orange uses Quinlan's original code, which, due to copyright issues, must be built and linked in separately.</p>

<P>Orange also implements its own classification tree induction algorithm, which is comparable to Quinlan's, though the results may differ due to technical details. It is accessible in the widget <code>Classification Tree</code>.</P>

<p>Like all classification widgets, the C4.5 widget provides a learner and a classifier on the output. The learner is a learning algorithm with settings as specified by the user. It can be fed into widgets for testing learners, namely <code>Test Learners</code>. The classifier is a classification tree built from the training examples on the input. If no examples are given, the widget outputs no classifier.</p>

<img class="leftscreenshot" src="C4.5.png" alt="C4.5 Widget" border=0>

<P>The learner can be given a name under which it will appear in, say, <code>Test Learners</code>. The default name is "C4.5".</P>

<P>The next block of options deals with splitting. C4.5 uses gain ratio by default; to override this, check <span class="option">Use information gain instead of ratio</span>, which is equivalent to C4.5's command line option <code>-g</code>. If you enable <span class="option">subsetting</span> (equivalent to <code>-s</code>), C4.5 will merge values of multivalued discrete attributes instead of creating one branch for each value. <span class="option">Probabilistic threshold for continuous attributes</span> (<code>-p</code>) makes C4.5 compute the lower and upper boundaries for values of continuous attributes for which the number of misclassified examples would be within one standard deviation from the base error.</P>
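
<P>To make the difference between the two splitting criteria concrete, here is a minimal Python sketch that computes information gain and gain ratio for a discrete attribute. It is not Quinlan's code and not part of the widget; the function names <code>entropy</code> and <code>split_scores</code> are made up for this illustration, and the additional checks that C4.5 applies when selecting by gain ratio are left out.</P>

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete items."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def split_scores(values, labels):
    """Return (information gain, gain ratio) for splitting on one attribute.

    `values` holds the attribute value of each example and `labels` the
    corresponding class labels.
    """
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected class entropy after the split, weighted by branch size.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the partition induced by the attribute.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


labels = ["yes", "yes", "no", "no"]
print(split_scores([1, 2, 3, 4], labels))          # identifier-like attribute
print(split_scores(["a", "a", "b", "b"], labels))  # two-valued attribute
```

<P>In this toy data, the identifier-like attribute with four distinct values reaches an information gain of 1 bit but a gain ratio of only 0.5, while the two-valued attribute that separates the classes equally well scores 1 on both. This illustrates why gain ratio is less biased toward attributes with many values.</P>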

<P>As for pruning, you can set the <span class="option">Minimal number of examples in the leaves</span> (Quinlan's default is 2, but you may want to disable this for noiseless data), and the <span class="option">Post pruning with confidence level</span>; the default confidence level is 25%.</P>
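
<P>The confidence level controls how pessimistically the error of each leaf is estimated: the observed error rate is replaced by the upper limit of a confidence interval, so a lower confidence level gives larger estimates and heavier pruning. The following Python sketch computes such a limit from the exact binomial distribution by bisection. It is an illustration only: Quinlan's code uses its own approximation, so its numbers can differ, and the names <code>binom_cdf</code> and <code>upper_error_limit</code> are hypothetical.</P>

```python
from math import comb


def binom_cdf(errors, n, p):
    """P(X <= errors) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(errors + 1))


def upper_error_limit(errors, n, confidence=0.25):
    """Upper limit on the true error rate of a leaf with `errors` out of `n`.

    Finds the error rate p at which seeing this few errors has probability
    `confidence`; the CDF decreases as p grows, so bisection applies.
    """
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(errors, n, mid) > confidence:
            lo = mid  # so few errors are still too likely: the limit is higher
        else:
            hi = mid
    return (lo + hi) / 2


print(round(upper_error_limit(0, 6), 3))         # default 25% confidence
print(round(upper_error_limit(0, 6, 0.10), 3))   # stricter setting
```

<P>With the default setting, a leaf covering six examples without any error is assigned an estimated error rate of about 0.206 rather than 0; lowering the confidence level raises this estimate further.</P>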

<P>Trees can be constructed iteratively, with an ever larger number of examples. If this is enabled, you can set the <span class="option">Number of trials</span>, the <span class="option">initial window size</span> and the <span class="option">window increment</span>.</P>
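
<P>The iterative construction can be pictured as follows: a model is built from an initial window of examples, the remaining examples are classified, and some of the misclassified ones are added to the window before the next round. The Python sketch below shows a single such trial under simplifying assumptions. The names <code>windowed_fit</code> and <code>fit_lookup</code> are hypothetical, a toy lookup learner stands in for tree induction, and C4.5's own rules for choosing and growing the window are more elaborate.</P>

```python
import random
from collections import Counter


def windowed_fit(examples, fit, initial_size, increment, seed=0):
    """Grow a training window until all remaining examples are classified correctly.

    `examples` is a list of (features, label) pairs; `fit` is any function that
    takes such a list and returns a predict(features) callable.
    Returns the final predictor and the final window size.
    """
    increment = max(1, increment)  # guarantee progress in every round
    pool = list(examples)
    random.Random(seed).shuffle(pool)
    window, rest = pool[:initial_size], pool[initial_size:]
    while True:
        predict = fit(window)
        missed = [i for i, (x, y) in enumerate(rest) if predict(x) != y]
        if not missed:
            return predict, len(window)
        # Move at most `increment` misclassified examples into the window.
        moved = set(missed[:increment])
        window = window + [rest[i] for i in sorted(moved)]
        rest = [ex for i, ex in enumerate(rest) if i not in moved]


def fit_lookup(train):
    """Toy learner: memorise seen examples, otherwise predict the majority class."""
    table = dict(train)
    default = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: table.get(features, default)


data = [((i,), i % 2) for i in range(20)]
predict, size = windowed_fit(data, fit_lookup, initial_size=4, increment=3)
print(size, all(predict(x) == y for x, y in data))
```

<P>In this sketch the initial size and the increment play the roles of the two window options above; running several trials would repeat the procedure from different initial windows.</P>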

<P>The resulting classifier can be left in Quinlan's original structure, as returned by his underlying code, or <span class="option">converted to orange tree structure</span>, the structure used by Orange's tree induction algorithm. This setting decides which of the two signals that output the tree, <code>C45 Tree</code> or <code>Classification Tree</code>, will be active. As Orange's structure is more general and can easily accommodate all the data that a C4.5 tree needs for classification, we believe that the converted tree behaves exactly the same as the original tree, so the results should not depend on this setting. You should therefore leave it enabled, since only converted trees can be shown in the tree displaying widgets.</P>

<P>When you change one or more settings, you need to push <span class="option">Apply</span>; this will put the new learner on the output and, if training examples are given, construct a new classifier and output it as well.</P>


<h2>Examples</h2>

<P>There are two typical uses of this widget. First, you may want to induce the tree and see what it looks like, as in the schema on the right.</P>

<img class="screenshot" src="C4.5-SchemaClassifier2.png" alt="C4.5 - Schema with a Classifier" border=0>

<P>The second schema shows how to compare the results of the C4.5 learner with those of another learner, the naive Bayesian learner.</P>

<img class="screenshot"
src="C4.5-SchemaLearner.png" alt="C4.5 - Schema with a Learner" border=0>

</body>
</html>