orange/Orange/optimization/__init__.py
r7528 r7561
 Tuning parameters
 =================
+
+Two classes support tuning parameters:
+:obj:`Orange.optimization.Tune1Parameter` fits a single parameter, and
+:obj:`Orange.optimization.TuneMParameters` fits multiple parameters at once,
+trying all possible combinations. When called with examples and, optionally,
+the id of a meta attribute with weights, they find the optimal setting of
+arguments using cross-validation. The classes can also be used as ordinary
+learning algorithms; they are in fact derived from
+:obj:`Orange.classification.Learner`.
+
+Both classes have a common parent, :obj:`Orange.optimization.TuneParameters`,
+and a few common attributes.
 
 .. autoclass:: Orange.optimization.TuneParameters
… …
 import Orange.core
 
-# The class needs to be given
-#   object     - the learning algorithm to be fitted
-#   evaluate   - statistics to evaluate (default: orngStat.CA)
-#   folds      - the number of folds for internal cross validation
-#   compare    - function to compare (default: cmp - the bigger the better)
-#   returnWhat - tells whether to return values of parameters, a fitted
-#                learner, the best classifier or None. "object" is left
-#                with optimal parameters in any case
 class TuneParameters(Orange.core.Learner):
-    """Tune
+
+    """.. attribute:: examples
+
+        Data table with either discrete or continuous features.
+
+    .. attribute:: weightID
+
+        The ID of the weight meta attribute.
+
+    .. attribute:: object
+
+        The learning algorithm whose parameters are to be tuned. This can be,
+        for instance, orngTree.TreeLearner. You will usually use the wrapped
+        learners from modules rather than the built-in classifiers, such as
+        orange.TreeLearner, directly, since the arguments to be fitted are
+        easier to address in the wrapped versions. In principle, though,
+        either works.
+
+    .. attribute:: evaluate
+
+        The statistics to evaluate.
+        The default is orngStat.CA, so the learner will be fit for the
+        optimal classification accuracy. You can replace it with, for
+        instance, orngStat.AUC to optimize the AUC. Statistics can return
+        either a single value (classification accuracy), a list with a single
+        value (this is what orngStat.CA actually does), or arbitrary objects
+        which the compare function below must be able to compare.
+
+    .. attribute:: folds
+
+        The number of folds used in internal cross-validation. Default is 5.
+
+    .. attribute:: compare
+
+        The function used to compare the results. The function should accept
+        two arguments (e.g. two classification accuracies, AUCs or whatever
+        the result of evaluate is) and return a positive value if the first
+        argument is better, 0 if they are equal and a negative value if the
+        first is worse than the second. The default compare function is cmp.
+        You don't need to change this if evaluate is such that higher values
+        mean a better classifier.
+
+    .. attribute:: returnWhat
+
+        Decides what the result of tuning should be. Possible values are:
+
+        * TuneParameters.returnNone (or 0): tuning will return nothing,
+        * TuneParameters.returnParameters (or 1): return the optimal value(s)
+          of parameter(s),
+        * TuneParameters.returnLearner (or 2): return the learner set to
+          optimal parameters,
+        * TuneParameters.returnClassifier (or 3): return a classifier trained
+          with the optimal parameters on the entire data set. This is the
+          default setting.
+
+        Regardless of this, the learner (given as object) is left set to the
+        optimal parameters.
+
+    .. attribute:: verbose
+
+        If 0 (default), the class doesn't print anything. If set to 1, it
+        will print out the optimal value found; if set to 2, it will print
+        out all tried values and the related evaluation results.
+
+    If the tuner returns the classifier, it behaves as a learning algorithm.
+    As the examples below will demonstrate, it can be called with the
+    examples, and the result is a "trained" classifier. It can, for instance,
+    be used in cross-validation.
+
+    Out of these attributes, the only necessary argument is object. The real
+    tuning classes add two more: the attributes that tell which parameter(s)
+    to optimize and which values to use.
 
     """
… …
 # (eg <object>.<parameter> = <value>[i])
 class Tune1Parameter(TuneParameters):
+
+    """Class :obj:`Orange.optimization.Tune1Parameter` tunes a single parameter.
+
+    .. attribute:: parameter
+
+        The name of the parameter (or a list of names, if the same parameter
+        is stored at multiple places; see the examples) to be tuned.
+
+    .. attribute:: values
+
+        A list of the parameter's values to be tried.
+
+    To show how it works, we shall fit the minimal number of examples in a
+    leaf for a tree classifier.
+
+    part of `optimizationtuning1.py`_
+
+    .. literalinclude:: code/optimizationtuning1.py
+        :lines: 7-15
+
+    Set up like this, when the tuner is called, it will set learner.minSubset
+    to 1, 2, 3, 4, 5, 10, 15 and 20, and measure the AUC in 5-fold
+    cross-validation. It will then reset learner.minSubset to the optimal
+    value found and, since we left returnWhat at the default
+    (returnClassifier), construct and return the classifier from the entire
+    data set. So, what we get is a classifier, but if we'd also like to know
+    what the optimal value was, we can get it from learner.minSubset.
+
+    Tuning is of course not limited to setting numeric parameters. You can,
+    for instance, try to find the optimal criterion for assessing the quality
+    of attributes by tuning parameter="measure", trying settings like
+    values=[orange.MeasureAttribute_gainRatio(),
+    orange.MeasureAttribute_gini()].
+
+    Since the tuner returns a classifier and thus behaves like a learner, it
+    can be used in cross-validation.
+    Let us see whether tuning the tree indeed improves the AUC or not. We
+    shall reuse the tuner from above, add another tree learner, and test
+    them both.
+
+    part of `optimizationtuning1.py`_
+
+    .. literalinclude:: code/optimizationtuning1.py
+        :lines: 17-22
+
+    This will take some time: for each of 8 values for minSubset it will
+    perform 5-fold cross-validation inside a 10-fold cross-validation,
+    altogether 400 trees. Plus, it will learn the optimal tree afterwards
+    for each fold (10 more). Add a tree without tuning (another 10), and you
+    get 420 trees built.
+
+    Well, not that long, and the results are good::
+
+        Untuned tree: 0.930
+        Tuned tree: 0.986
+
+    .. _optimizationtuning1.py: code/optimizationtuning1.py
+
+    """
+
     def __call__(self, table, weight=None, verbose=0):
         import orngTest, orngStat, orngMisc
… …
 # (eg <object>.<parameter[j]> = <value[j]>[i])
 class TuneMParameters(TuneParameters):
+
+    """The use of :obj:`Orange.optimization.TuneMParameters` differs from
+    Tune1Parameter only in the specification of tuning parameters.
+
+    .. attribute:: parameters
+
+        A list of two-element tuples, each containing the name of a parameter
+        and its possible values.
+
+    As an exercise, we can tune both settings mentioned above, the minimal
+    number of examples in leaves and the splitting criterion, by setting up
+    the tuner as follows:
+
+    part of `optimizationtuningm.py`_
+
+    .. literalinclude:: code/optimizationtuningm.py
+        :lines: 9-12
+
+    Everything else stays the same as in the examples for Tune1Parameter.
+
+    .. _optimizationtuningm.py: code/optimizationtuningm.py
+
+    """
+
     def __call__(self, table, weight=None, verbose=0):
         import orngTest, orngStat, orngMisc
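
The docstrings in this changeset describe the tuning procedure only in prose: set the parameter to each candidate value, score the learner with internal cross-validation, keep the value that compare judges best, leave the learner at that value, and return a classifier trained on all the data. The following is a minimal standalone sketch of that idea in plain Python 3, not the Orange implementation. ThresholdLearner, accuracy and tune_one_parameter are hypothetical names invented for illustration, the fold split is a simple interleaved one, and cmp is defined by hand because Python 3 has no built-in cmp.

```python
def cmp(a, b):
    """Three-way comparison: positive if a > b, 0 if equal, negative if a < b."""
    return (a > b) - (a < b)


class ThresholdLearner:
    """Hypothetical toy learner: its classifier predicts 1 when x > threshold."""

    def __init__(self, threshold=0.0):
        self.threshold = threshold

    def __call__(self, data):
        # "Training" only freezes the current parameter setting.
        t = self.threshold
        return lambda x: int(x > t)


def accuracy(classifier, data):
    """Fraction of (x, label) pairs that the classifier predicts correctly."""
    return sum(classifier(x) == y for x, y in data) / len(data)


def tune_one_parameter(learner, parameter, values, data,
                       folds=5, evaluate=accuracy, compare=cmp):
    """Try each value, score it by k-fold cross-validation, keep the best."""
    best_value, best_score = None, None
    for value in values:
        setattr(learner, parameter, value)
        fold_scores = []
        for k in range(folds):
            # Interleaved split: every folds-th row goes to the test set.
            test = data[k::folds]
            train = [row for i, row in enumerate(data) if i % folds != k]
            fold_scores.append(evaluate(learner(train), test))
        score = sum(fold_scores) / folds
        if best_score is None or compare(score, best_score) > 0:
            best_value, best_score = value, score
    # The learner is left set to the best value found ...
    setattr(learner, parameter, best_value)
    # ... and a classifier trained on the entire data set is returned.
    return learner(data)


# Toy data: the label is 1 exactly when x >= 10.
data = [(x, int(x >= 10)) for x in range(20)]
learner = ThresholdLearner()
classifier = tune_one_parameter(learner, "threshold", [2.5, 9.5, 15.5], data)
```

After the call, learner.threshold holds the selected value, mirroring how the text reads the optimum back from learner.minSubset. For a statistic where lower is better, the same sketch works if compare is replaced with a function that swaps its arguments, for instance lambda a, b: cmp(b, a).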
