source: orange/Orange/doc/modules/orngWrap.htm @ 9671:a7b056375472

<html><HEAD>
<LINK REL=StyleSheet HREF="../style.css" TYPE="text/css">
</HEAD>
<body>
<h1>orngWrap: Wrappers for Tuning Parameters and Thresholds</h1>

<p>This module contains classes for two very useful purposes: tuning a learning algorithm's parameters using internal validation, and tuning the threshold for classification into the positive class.</p>


<h2>Tuning parameters</h2>
<index name="modules/tuning learning parameters">

<P>The module provides two classes, one for fitting a single parameter and one for fitting multiple parameters at once, trying all possible combinations. When called with examples and, optionally, the id of a meta attribute with weights, they find the optimal setting of arguments using cross-validation. The classes can also be used as ordinary learning algorithms - they are in fact derived from <code>orange.Learner</code>.</P>

<P>Both classes have a common parent, <code><INDEX name="classes/TuneParameters (in orngWrap)">TuneParameters</code>, and a few common attributes.</P>

<p class="section">Attributes</p>
<dl class="attributes">
<dt>object</dt>
<dd>The learning algorithm whose parameters are to be tuned. This can be, for instance, <code>orngTree.TreeLearner</code>. You will usually use the wrapped learners from modules rather than the built-in learners, such as <code>orange.TreeLearner</code>, directly, since the arguments to be fitted are easier to address in the wrapped versions. But in principle it doesn't matter.</dd>

<dt>evaluate</dt>
<dd>The statistics to evaluate. The default is <a href="orngStat.htm"><code>orngStat.CA</code></a>, so the learner will be fit for the optimal classification accuracy. You can replace it with, for instance, <code>orngStat.AUC</code> to optimize the AUC. Statistics can return either a single value (classification accuracy), a list with a single value (this is what <code>orngStat.CA</code> actually does), or arbitrary objects which the <code>compare</code> function below must be able to compare.</dd>

<dt>folds</dt>
<dd>The number of folds used in internal cross-validation. Default is 5.</dd>

<dt>compare</dt>
<dd>The function used to compare the results. The function should accept two arguments (e.g. two classification accuracies, AUCs or whatever the result of <code>evaluate</code> is) and return a positive value if the first argument is better, 0 if they are equal and a negative value if the first is worse than the second. The default <code>compare</code> function is <code>cmp</code>. You don't need to change this if <code>evaluate</code> is such that higher values mean a better classifier.</dd>

<dt>returnWhat</dt>
<dd>Decides what should be the result of tuning. Possible values are
<ul>
<li><b><code>TuneParameters.returnNone</code></b> (or 0): tuning will return nothing,</li>
<li><b><code>TuneParameters.returnParameters</code></b> (or 1): return the optimal value(s) of parameter(s),</li>
<li><b><code>TuneParameters.returnLearner</code></b> (or 2): return the learner set to optimal parameters,</li>
<li><b><code>TuneParameters.returnClassifier</code></b> (or 3): return a classifier trained with the optimal parameters on the entire data set. This is the default setting.</li>
</ul>
Regardless of this, the learner (given as <code>object</code>) is left set to the optimal parameters.
</dd>

<dt>verbose</dt>
<dd>If 0 (default), the class doesn't print anything. If set to 1, it will print out the optimal value found; if set to 2, it will print out all the tried values and the related results.</dd>
</dl>

<P>If the tuner returns a classifier, it behaves like a learning algorithm: as the examples below demonstrate, it can be called with examples, and the result is a "trained" classifier. It can, for instance, be used in cross-validation.</P>

<P>Of these attributes, the only required argument is <code>object</code>. The concrete tuning classes add two more - the attributes that tell which parameter(s) to optimize and which values to try.</P>

<h3>Tune1Parameter</h3>

<P>Class <code><INDEX name="classes/Tune1Parameter (in orngWrap)">Tune1Parameter</code> tunes a single parameter.</P>

<p class="section">Attributes</p>
<dl class="attributes">
<dt>parameter</dt>
<dd>The name of the parameter (or a list of names, if the same parameter is stored in multiple places - see the examples) to be tuned.</dd>

<dt>values</dt>
<dd>A list of the parameter's values to be tried.</dd>
</dl>

<P>To show how it works, we shall tune the minimal number of examples in a leaf for a tree classifier.</P>
<p class="header">part of <a href="tuning1.py">tuning1.py</a></p>
<xmp class="code">import orange, orngTree, orngWrap, orngStat

learner = orngTree.TreeLearner()
data = orange.ExampleTable("voting")
tuner = orngWrap.Tune1Parameter(object=learner,
                                parameter="minSubset",
                                values=[1, 2, 3, 4, 5, 10, 15, 20],
                                evaluate = orngStat.AUC)
classifier = tuner(data)

print learner.minSubset
</xmp>

<P>Set up like this, when the tuner is called it will set <code>learner.minSubset</code> to 1, 2, 3, 4, 5, 10, 15 and 20, and measure the AUC in 5-fold cross validation for each value. It will then set <code>learner.minSubset</code> to the optimal value found and, since we left <code>returnWhat</code> at the default (<code>returnClassifier</code>), construct and return a classifier trained on the entire data set. So what we get is a classifier, but if we'd also like to know what the optimal value was, we can get it from <code>learner.minSubset</code>.</P>
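
<P>If we only needed the optimal value and not a classifier, we could ask the tuner for it through <code>returnWhat</code>. A minimal sketch, reusing the <code>learner</code> and <code>data</code> from above (the shape of the returned value follows the description of <code>returnWhat</code>):</P>

<xmp class="code">tuner = orngWrap.Tune1Parameter(object=learner,
                                parameter="minSubset",
                                values=[1, 2, 3, 4, 5, 10, 15, 20],
                                evaluate=orngStat.AUC,
                                returnWhat=orngWrap.TuneParameters.returnParameters)
best = tuner(data)  # the optimal value(s) of the tuned parameter, not a classifier
print best
</xmp>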

<P>Tuning is, of course, not limited to numeric parameters. You can, for instance, try to find the optimal criterion for assessing the quality of attributes by tuning <code>parameter="measure"</code>, trying settings like <code>values=[orange.MeasureAttribute_gainRatio(), orange.MeasureAttribute_gini()]</code>.</P>
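
<P>A sketch of such a setup, assuming the same <code>learner</code> and <code>data</code> as above (the two measures are just the ones mentioned in the previous paragraph):</P>

<xmp class="code">tuner = orngWrap.Tune1Parameter(object=learner,
                                parameter="measure",
                                values=[orange.MeasureAttribute_gainRatio(),
                                        orange.MeasureAttribute_gini()],
                                evaluate=orngStat.AUC)
classifier = tuner(data)
print learner.measure  # the measure that gave the best AUC
</xmp>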

<P>Since the tuner returns a classifier and thus behaves like a learner, we can use it in cross-validation. Let us see whether tuning the tree indeed improves the AUC or not. We shall reuse the <code>tuner</code> from above, add another, untuned tree learner, and test them both.</P>

<xmp class="code">import orngTest
untuned = orngTree.TreeLearner()
res = orngTest.crossValidation([untuned, tuner], data)
AUCs = orngStat.AUC(res)

print "Untuned tree: %5.3f" % AUCs[0]
print "Tuned tree: %5.3f" % AUCs[1]
</xmp>

<P>This will take some time: for each of the 8 values of <code>minSubset</code> it will perform 5-fold cross validation inside a 10-fold cross validation - altogether 400 trees. Plus, it will build the tree with the optimal setting afterwards, for each fold. Add a tree without tuning, and you get 420 trees built.</P>

<P>Well, not that long, and the results are good:</P>
<xmp class="printout">Untuned tree: 0.930
Tuned tree: 0.986</xmp>

<P>We mentioned that we will normally use wrapped learners from the orng modules, not directly from Orange. Why is this? Couldn't we use <code>orange.TreeLearner</code> instead of <code>orngTree.TreeLearner</code>? Well, we can, except that <code>minSubset</code> is not only buried deeper, but also appears in two places. To optimize it, we'd do the same as above, except that we'd specify the parameter to optimize with</P>
<xmp class="code">parameter=["split.continuousSplitConstructor.minSubset", "split.discreteSplitConstructor.minSubset"]
</xmp>
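
<P>Put together, a sketch of how this might look (assuming, as the paragraph above suggests, that the nested components are addressable through these paths; everything else mirrors the earlier example):</P>

<xmp class="code">learner = orange.TreeLearner()
tuner = orngWrap.Tune1Parameter(object=learner,
                                parameter=["split.continuousSplitConstructor.minSubset",
                                           "split.discreteSplitConstructor.minSubset"],
                                values=[1, 2, 3, 4, 5, 10, 15, 20],
                                evaluate=orngStat.AUC)
classifier = tuner(data)
</xmp>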


<h3>TuneMParameters</h3>

<P>The use of <code><INDEX name="classes/TuneMParameters (in orngWrap)">TuneMParameters</code> differs from <code>Tune1Parameter</code> only in the specification of the tuning parameters.</P>

<p class="section">Attributes</p>
<dl class="attributes">
<dt>parameters</dt>
<dd>A list of two-element tuples, each containing the name of a parameter and its possible values.</dd>
</dl>

<P>As an exercise, we can try to tune both settings mentioned above, the minimal number of examples in leaves and the attribute quality measure, by setting the tuner as follows:</P>
<p class="header"><a href="tuningm.py">tuningm.py</a></p>
<xmp class="code">tuner = orngWrap.TuneMParameters(object=learner,
                                 parameters=[("minSubset", [2, 5, 10, 20]),
                                             ("measure", [orange.MeasureAttribute_gainRatio(),
                                                          orange.MeasureAttribute_gini()])],
                                 evaluate = orngStat.AUC)
</xmp>

<P>Everything else stays as in the examples for <code>Tune1Parameter</code> above.</P>
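
<P>For instance, a minimal sketch of running it, reusing the <code>learner</code> and <code>data</code> from the <code>Tune1Parameter</code> example:</P>

<xmp class="code">classifier = tuner(data)
print learner.minSubset  # optimal minimal number of examples in a leaf
print learner.measure    # optimal attribute quality measure
</xmp>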


<h2>Setting Optimal Thresholds</h2>
<index name="modules/tuning thresholds">
<index name="modules/optimal thresholds">
<index name="modules/threshold setting">

<P>Some models may perform well in terms of AUC, which measures the ability to distinguish between examples of the two classes, but still have a low classification accuracy. The reason may lie in the threshold: in binary problems, classifiers usually classify into the more probable class, but when class distributions are highly skewed, a modified threshold can give better accuracy. Here are two classes that can help.</P>

<h3>Threshold Learner and Classifier</h3>
<index name="classifiers/with threshold tuning">

<P><code><INDEX name="classes/ThresholdLearner (in orngWrap)">ThresholdLearner</code> is a class that wraps another learner. When given the data, it calls the wrapped learner to build a classifier, then uses the classifier to predict class probabilities on the training examples. From the stored probabilities, it computes the threshold that gives the optimal classification accuracy. Then it wraps the classifier and the threshold into an instance of <code>ThresholdClassifier</code>.</P>

<P>Note that the learner doesn't perform internal cross-validation. Also, the learner doesn't work for multivalued classes. If you don't understand why, think harder. If you still don't, try to program it yourself; this should help. :)</P>

<P><code>ThresholdLearner</code> has the same interface as any learner: if the constructor is given examples, it returns a classifier, otherwise it returns a learner. It has two attributes.</P>
<dl class="attributes">
<dt>learner</dt>
<dd>The wrapped learner, for example an instance of <code>orange.BayesLearner</code>.</dd>

<dt>storeCurve</dt>
<dd>If set, the resulting classifier will contain an attribute <code>curve</code>, with a list of tuples containing thresholds and the classification accuracies at those thresholds (see the sketch below).</dd>
</dl>
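
<P>A minimal sketch of using <code>storeCurve</code>, assuming (as described above) that it can be passed to the constructor and that <code>curve</code> holds (threshold, classification accuracy) tuples:</P>

<xmp class="code">import orange, orngWrap

data = orange.ExampleTable("bupa")
thresh = orngWrap.ThresholdLearner(learner=orange.BayesLearner(), storeCurve=1)
classifier = thresh(data)
for threshold, CA in classifier.curve:
    print "%5.3f: %5.3f" % (threshold, CA)
</xmp>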

<P>There's also a dumb variant of <code>ThresholdLearner</code>, a class called <code>ThresholdLearner_fixed</code>. Instead of finding the optimal threshold, it uses a prescribed one. So, it has the following two attributes.</P>
<dl class="attributes">
<dt>learner</dt>
<dd>The wrapped learner, for example an instance of <code>orange.BayesLearner</code>.</dd>

<dt>threshold</dt>
<dd>The threshold to use in classification.</dd>
</dl>

<P>What this guy does is therefore simple: to learn, it calls the learner and puts the resulting classifier, together with the threshold, into an instance of <code>ThresholdClassifier</code>.</P>

<P><code><INDEX name="classes/ThresholdClassifier (in orngWrap)">ThresholdClassifier</code>, used by both <code>ThresholdLearner</code> and <code>ThresholdLearner_fixed</code>, is therefore another wrapper class, containing a classifier and a threshold. When it needs to classify an example, it calls the wrapped classifier to predict the probabilities. The example is classified into the second class only if the probability of that class is above the threshold.</P>
<dl class="attributes">
<dt>classifier</dt>
<dd>The wrapped classifier, normally the one related to the <code>ThresholdLearner</code>'s <code>learner</code>, e.g. an instance of <code>orange.BayesClassifier</code>.</dd>

<dt>threshold</dt>
<dd>The threshold for classification into the second class.</dd>
</dl>
<P>The two attributes can either be set directly or given to the constructor as ordinary arguments.</P>

<P>This is how you use the learner.</P>

<p class="header"><a href="thresholding1.py">thresholding1.py</a></p>
<xmp class="code">import orange, orngWrap, orngTest, orngStat

data = orange.ExampleTable("bupa")

learner = orange.BayesLearner()
thresh = orngWrap.ThresholdLearner(learner = learner)
thresh80 = orngWrap.ThresholdLearner_fixed(learner = learner, threshold = .8)
res = orngTest.crossValidation([learner, thresh, thresh80], data)
CAs = orngStat.CA(res)

print "W/out threshold adjustment: %5.3f" % CAs[0]
print "With adjusted threshold: %5.3f" % CAs[1]
print "With threshold at 0.80: %5.3f" % CAs[2]
</xmp>

<P>The output,
<xmp class="printout">W/out threshold adjustment: 0.633
With adjusted threshold: 0.659
With threshold at 0.80: 0.449
</xmp>
shows that fitting the threshold is good (well, although a 2.5 percent increase in accuracy absolutely guarantees you a publication at ICML, the difference is still unimportant), while setting it at 80% is a bad idea. Or is it?</P>

<p class="header"><a href="thresholding2.py">thresholding2.py</a></p>
<xmp class="code">
import orange, orngWrap, orngTest, orngStat

data = orange.ExampleTable("bupa")
ri2 = orange.MakeRandomIndices2(data, 0.7)
train = data.select(ri2, 0)
test = data.select(ri2, 1)

bayes = orange.BayesLearner(train)

thresholds = [.2, .5, .8]
models = [orngWrap.ThresholdClassifier(bayes, thr) for thr in thresholds]

res = orngTest.testOnData(models, test)
cm = orngStat.confusionMatrices(res)

print
for i, thr in enumerate(thresholds):
    print "%1.2f: TP %5.3f, TN %5.3f" % (thr, cm[i].TP, cm[i].TN)
</xmp>

<P>The script first divides the data into training and testing examples. It trains a naive Bayesian classifier and then wraps it into <code>ThresholdClassifier</code>s with thresholds of .2, .5 and .8. The three models are tested on the left-out examples, and we compute the confusion matrices from the results. The printout,
<xmp class="printout">0.20: TP 60.000, TN 1.000
0.50: TP 42.000, TN 24.000
0.80: TP 2.000, TN 43.000</xmp>
shows how varying the threshold changes the balance between the numbers of true positives and true negatives.</P>
</body>
</html>