02/06/12 13:17:48 (2 years ago)
tomazc <tomaz.curk@…>

Updated documentation on Orange.feature.imputation.

1 edited


  • Orange/feature/imputation.py

    r9671 r9806  
    1 """ 
    2 ########################### 
    3 Imputation (``imputation``) 
    4 ########################### 
    6 .. index:: imputation 
    8 .. index::  
    9    single: feature; value imputation 
    12 Imputation is a procedure of replacing the missing feature values with some  
    13 appropriate values. Imputation is needed because of the methods (learning  
    14 algorithms and others) that are not capable of handling unknown values, for  
    15 instance logistic regression. 
    17 Missing values sometimes have a special meaning, so they need to be replaced 
    18 by a designated value. Sometimes we know what to replace the missing value 
    19 with; for instance, in a medical problem, some laboratory tests might not be 
    20 done when it is known what their results would be. In that case, we impute  
    21 a certain fixed value instead of the missing one. In the most complex case, we assign 
    22 values that are computed based on some model; we can, for instance, impute the 
    23 average or majority value or even a value which is computed from values of 
    24 other, known features, using a classifier. 
    26 In a learning/classification process, imputation is needed on two occasions. 
    27 Before learning, the imputer needs to process the training examples. 
    28 Afterwards, the imputer is called for each example to be classified. 
    30 In general, the imputer itself needs to be trained. This is, of course, not needed 
    31 when the imputer imputes certain fixed value. However, when it imputes the 
    32 average or majority value, it needs to compute the statistics on the training 
    33 examples, and use it afterwards for imputation of training and testing 
    34 examples. 
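This train-then-apply pattern can be sketched in plain Python (an illustration with made-up data and names, not the Orange API):

```python
# Sketch: fit imputation statistics on training rows only, then apply
# the same statistics to both training and testing rows. None marks a
# missing value.

def fit_average_imputer(rows):
    """Compute per-column averages from the known values of `rows`."""
    n_cols = len(rows[0])
    averages = []
    for c in range(n_cols):
        known = [r[c] for r in rows if r[c] is not None]
        averages.append(sum(known) / len(known))
    return averages

def impute(rows, averages):
    """Return new rows with missing values replaced by the averages."""
    return [[averages[c] if v is None else v for c, v in enumerate(r)]
            for r in rows]

train = [[1.0, None], [3.0, 4.0]]
test = [[None, 8.0]]

averages = fit_average_imputer(train)   # computed from training data only
imputed_train = impute(train, averages)
imputed_test = impute(test, averages)   # the same statistics are reused
```

In a cross-validation setting, `fit_average_imputer` would be called once per training fold.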
    36 While reading this document, bear in mind that imputation is a part of the 
    37 learning process. If we fit the imputation model, for instance, by learning 
    38 how to predict the feature's value from other features, or even if we  
    39 simply compute the average or the minimal value for the feature and use it 
    40 in imputation, this should only be done on learning data. If cross validation 
    41 is used for sampling, imputation should be done on training folds only. Orange 
    42 provides simple means for doing that. 
    44 This page will first explain how to construct various imputers. Then follow 
    45 the examples for `proper use of imputers <#using-imputers>`_. Finally, quite 
    46 often you will want to use imputation with special requests, such as certain 
    47 features' missing values getting replaced by constants and other by values 
    48 computed using models induced from specified other features. For instance, 
    49 in one of the studies we worked on, the patient's pulse rate needed to be 
    50 estimated using regression trees that included the scope of the patient's 
    51 injuries, sex and age; some attributes' values were replaced by the most 
    52 pessimistic ones, while others were computed with regression trees based on 
    53 values of all features. If you are using learners that need the imputer as a 
    54 component, you will need to `write your own imputer constructor  
    55 <#write-your-own-imputer-constructor>`_. This is trivial and is explained at 
    56 the end of this page. 
    58 Wrapper for learning algorithms 
    59 =============================== 
    61 This wrapper can be used with learning algorithms that cannot handle missing 
    62 values: it will impute the missing values using the imputer, call the 
    63 learning algorithm and, if the imputation is also needed by the classifier, wrap the 
    64 resulting classifier into another wrapper that will impute the missing values 
    65 in examples to be classified. 
    67 Even so, the module is somewhat redundant, as all learners that cannot handle  
    68 missing values should, in principle, provide the slots for imputer constructor. 
    69 For instance, :obj:`Orange.classification.logreg.LogRegLearner` has an attribute  
    70 :obj:`Orange.classification.logreg.LogRegLearner.imputerConstructor`, and even 
    71 if you don't set it, it will do some imputation by default. 
    73 .. class:: ImputeLearner 
    75     Wraps a learner and performs imputation before learning. 
    77     Most of Orange's learning algorithms do not use imputers because they can 
    78     appropriately handle the missing values. Bayesian classifier, for instance, 
    79     simply skips the corresponding attributes in the formula, while 
    80     classification/regression trees have components for handling the missing 
    81     values in various ways. 
    83     If for any reason you want to use these algorithms to run on imputed data, 
    84     you can use this wrapper. The class description is a matter of a separate 
    85     page, but we shall show its code here as another demonstration of how to 
    86     use the imputers - logistic regression is implemented essentially the same 
    87     as the below classes. 
    89     This is basically a learner, so the constructor will return either an 
    90     instance of :obj:`ImputeLearner` or, if called with examples, an instance 
    91     of some classifier. There are a few attributes that need to be set, though. 
    93     .. attribute:: base_learner  
    95     A wrapped learner. 
    97     .. attribute:: imputer_constructor 
    99     An instance of a class derived from :obj:`ImputerConstructor` (or a class 
    100     with the same call operator). 
    102     .. attribute:: dont_impute_classifier 
    104     If given and set (this attribute is optional), the classifier will not be 
    105     wrapped into an imputer. Do this if the classifier doesn't mind if the 
    106     examples it is given have missing values. 
    108     The learner is best illustrated by its code - here's its complete 
    109     :obj:`__call__` method:: 
    111         def __call__(self, data, weight=0): 
    112             trained_imputer = self.imputer_constructor(data, weight) 
    113             imputed_data = trained_imputer(data, weight) 
    114             base_classifier = self.base_learner(imputed_data, weight) 
    115             if self.dont_impute_classifier: 
    116                 return base_classifier 
    117             else: 
    118                 return ImputeClassifier(base_classifier, trained_imputer) 
    120     So "learning" goes like this. :obj:`ImputeLearner` will first construct 
    121     the imputer (that is, call :obj:`self.imputer_constructor` to get a trained 
    122     imputer). Then it will use the imputer to impute the data, and call the 
    123     given :obj:`base_learner` to construct a classifier. For instance, 
    124     :obj:`base_learner` could be a learner for logistic regression and the 
    125     result would be a logistic regression model. If the classifier can handle 
    126     unknown values (that is, if :obj:`dont_impute_classifier` is set), we return it as 
    127     it is; otherwise we wrap it into :obj:`ImputeClassifier`, which is given 
    128     the base classifier and the imputer which it can use to impute the missing 
    129     values in (testing) examples. 
    131 .. class:: ImputeClassifier 
    133     Objects of this class are returned by :obj:`ImputeLearner` when given data. 
    135     .. attribute:: baseClassifier 
    137     A wrapped classifier. 
    139     .. attribute:: imputer 
    141     An imputer for imputation of unknown values. 
    143     .. method:: __call__  
    145     This class is even more trivial than the learner. Its constructor accepts  
    146     two arguments, the classifier and the imputer, which are stored into the 
    147     corresponding attributes. The call operator which does the classification 
    148     then looks like this:: 
    150         def __call__(self, ex, what=orange.GetValue): 
    151             return self.base_classifier(self.imputer(ex), what) 
    153     It imputes the missing values by calling the :obj:`imputer` and passes the 
    154     imputed example to the base classifier. 
    156 .. note::  
    157    In this setup the imputer is trained on the training data - even if you do 
    158    cross validation, the imputer will be trained on the right data. In the 
    159    classification phase we again use the imputer which was trained on the 
    160    training data only. 
    162 .. rubric:: Code of ImputeLearner and ImputeClassifier  
    164 :obj:`Orange.feature.imputation.ImputeLearner` puts the keyword arguments into 
    165 the instance's dictionary. You are expected to call it like 
    166 :obj:`ImputeLearner(base_learner=<someLearner>, 
    167 imputer_constructor=<someImputerConstructor>)`. When the learner is called with examples, it 
    168 trains the imputer, imputes the data, induces a :obj:`base_classifier` by the 
    169 :obj:`base_learner` and constructs :obj:`ImputeClassifier` that stores the 
    170 :obj:`base_classifier` and the :obj:`imputer`. For classification, the missing 
    171 values are imputed and the classifier's prediction is returned. 
    173 Note that this code is slightly simplified; the omitted details handle 
    174 non-essential technical issues that are unrelated to imputation:: 
    176     class ImputeLearner(orange.Learner): 
    177         def __new__(cls, examples = None, weightID = 0, **keyw): 
    178             self = orange.Learner.__new__(cls, **keyw) 
    179             self.__dict__.update(keyw) 
    180             if examples: 
    181                 return self.__call__(examples, weightID) 
    182             else: 
    183                 return self 
    185         def __call__(self, data, weight=0): 
    186             trained_imputer = self.imputer_constructor(data, weight) 
    187             imputed_data = trained_imputer(data, weight) 
    188             base_classifier = self.base_learner(imputed_data, weight) 
    189             return ImputeClassifier(base_classifier, trained_imputer) 
    191     class ImputeClassifier(orange.Classifier): 
    192         def __init__(self, base_classifier, imputer): 
    193             self.base_classifier = base_classifier 
    194             self.imputer = imputer 
    196         def __call__(self, ex, what=orange.GetValue): 
    197             return self.base_classifier(self.imputer(ex), what) 
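The wrapper pattern itself is independent of Orange; here is a minimal plain-Python sketch, with a made-up mean-imputing constructor and a trivial learner that predicts the average class value (all names are illustrative, not the Orange API):

```python
# Sketch of the ImputeLearner/ImputeClassifier pattern in plain Python.
# A "learner" is a callable data -> classifier; an "imputer constructor"
# is a callable data -> imputer. None marks a missing value.

def mean_imputer_constructor(data):
    cols = list(zip(*[row for row, _ in data]))
    means = [sum(v for v in col if v is not None) /
             sum(1 for v in col if v is not None) for col in cols]
    def imputer(row):
        return [means[i] if v is None else v for i, v in enumerate(row)]
    return imputer

def average_class_learner(data):
    avg = sum(y for _, y in data) / len(data)
    def classifier(row):
        return avg                     # ignores the (imputed) features
    return classifier

def impute_learner(data, base_learner, imputer_constructor):
    imputer = imputer_constructor(data)          # train the imputer
    imputed = [(imputer(row), y) for row, y in data]
    base = base_learner(imputed)                 # learn on imputed data
    def impute_classifier(row):
        return base(imputer(row))                # impute at prediction time
    return impute_classifier

data = [([1.0, None], 0.0), ([3.0, 4.0], 1.0)]
model = impute_learner(data, average_class_learner, mean_imputer_constructor)
```

The same imputer instance is used for learning and for prediction, just as in the Orange classes above.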
    199 .. rubric:: Example 
    201 Although most of Orange's learning algorithms will take care of imputation 
    202 internally, if needed, it can sometimes happen that an expert will be able to 
    203 tell you exactly what to put in the data instead of the missing values. In this 
    204 example we shall suppose that we want to impute the minimal value of each 
    205 feature. We will try to determine whether the naive Bayesian classifier with 
    206 its  implicit internal imputation works better than one that uses imputation by  
    207 minimal values. 
    209 :download:`imputation-minimal-imputer.py <code/imputation-minimal-imputer.py>` (uses :download:`voting.tab <code/voting.tab>`): 
    211 .. literalinclude:: code/imputation-minimal-imputer.py 
    212     :lines: 7- 
    214 Should output this:: 
    216     Without imputation: 0.903 
    217     With imputation: 0.899 
    219 .. note::  
    220    Note that we constructed just one instance of \ 
    221    :obj:`Orange.classification.bayes.NaiveLearner`, but this same instance is 
    222    used twice in each fold. The first time it is given the examples as they are 
    223    and returns an instance of :obj:`Orange.classification.bayes.NaiveClassifier`. 
    224    The second time it is called by :obj:`imba` and the \ 
    225    :obj:`Orange.classification.bayes.NaiveClassifier` it returns is wrapped 
    226    into :obj:`Orange.feature.imputation.ImputeClassifier`. We thus have only one 
    227    learner, which produces two different classifiers in each round of 
    228    testing. 
    230 Abstract imputers 
    231 ================= 
    233 As is common in Orange, imputation is done by pairs of classes: one that does 
    234 the work and another that constructs it. :obj:`ImputerConstructor` is an 
    235 abstract root of the hierarchy of classes that get the training data (with an 
    236 optional id for weight) and construct an instance of a class derived from 
    237 :obj:`Imputer`. An :obj:`Imputer` can be called with an 
    238 :obj:`Orange.data.Instance` and it will return a new example with the missing 
    239 values imputed (it will leave the original example intact!). If an imputer is 
    240 called with an :obj:`Orange.data.Table`, it will return a new example table 
    241 with imputed examples. 
    243 .. class:: ImputerConstructor 
    245     .. attribute:: imputeClass 
    247     Tells whether to impute the class value (default) or not. 
    249 Simple imputation 
    250 ================= 
    252 The simplest imputers always impute the same value for a particular attribute, 
    253 disregarding the values of other attributes. They all use the same imputer 
    254 class, :obj:`Imputer_defaults`. 
    256 .. class:: Imputer_defaults 
    258     .. attribute:: defaults 
    260     An example with the default values to be imputed instead of the missing ones. 
    261     Examples to be imputed must be from the same domain as :obj:`defaults`. 
    263     Instances of this class can be constructed by  
    264     :obj:`Orange.feature.imputation.ImputerConstructor_minimal`,  
    265     :obj:`Orange.feature.imputation.ImputerConstructor_maximal`, 
    266     :obj:`Orange.feature.imputation.ImputerConstructor_average`.  
    268     For continuous features, they will impute the smallest, largest or the 
    269     average values encountered in the training examples. For discrete, they 
    270     will impute the lowest (the one with index 0, e.g. attr.values[0]), the 
    271     highest (attr.values[-1]), and the most common value encountered in the 
    272     data. The first two imputers will mostly be used when the discrete values 
    273     are ordered according to their impact on the class (for instance, possible 
    274     values for symptoms of some disease can be ordered according to their 
    275     seriousness). The minimal and maximal imputers will then represent 
    276     optimistic and pessimistic imputations. 
    278     The following code will load the bridges data, and first impute the values 
    279     in a single example and then in the whole table. 
    281 :download:`imputation-complex.py <code/imputation-complex.py>` (uses :download:`bridges.tab <code/bridges.tab>`): 
    283 .. literalinclude:: code/imputation-complex.py 
    284     :lines: 9-23 
    286 This example shows what the imputer does, not how it is to be used. Don't 
    287 impute all the data and then use it for cross-validation. As warned at the top 
    288 of this page, see the instructions for actual `use of 
    289 imputers <#using-imputers>`_. 
    291 .. note:: :obj:`ImputerConstructor` is another class with a dual-purpose 
    292   constructor: if you give the constructor the data, it will return an \ 
    293   :obj:`Imputer` - the above call is equivalent to calling \ 
    294   :obj:`Orange.feature.imputation.ImputerConstructor_minimal()(data)`. 
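What the three constructors compute can be sketched in plain Python (an illustrative sketch on made-up numeric data, not the Orange implementation):

```python
# Sketch: compute per-column defaults from training data, in the spirit
# of ImputerConstructor_minimal/_maximal/_average. Continuous columns get
# the minimum/maximum/mean; a discrete column would get its first value,
# its last value, or its most common value instead.

def continuous_defaults(rows):
    defaults = {}
    for c in range(len(rows[0])):
        known = [r[c] for r in rows if r[c] is not None]
        defaults[c] = {
            "minimal": min(known),
            "maximal": max(known),
            "average": sum(known) / len(known),
        }
    return defaults

rows = [[5.0, 2.0], [1.0, None], [3.0, 4.0]]
defaults = continuous_defaults(rows)
```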
    296 You can also construct the :obj:`Orange.feature.imputation.Imputer_defaults` 
    297 yourself and specify your own defaults. Or leave some values unspecified, in 
    298 which case the imputer won't impute them, as in the following example. Here, 
    299 the only attribute whose values will get imputed is "LENGTH"; the imputed value 
    300 will be 1234. 
    302 .. literalinclude:: code/imputation-complex.py 
    303     :lines: 56-69 
    305 :obj:`Orange.feature.imputation.Imputer_defaults`'s constructor will accept an 
    306 argument of type :obj:`Orange.data.Domain` (in which case it will construct an 
    307     empty instance for :obj:`defaults`) or an example. (Be careful with this: 
    308     :obj:`Orange.feature.imputation.Imputer_defaults` will have a reference to the 
    309     instance and not a copy. But you can make a copy yourself to avoid problems: 
    310     instead of `Imputer_defaults(data[0])` you may want to write 
    311     `Imputer_defaults(Orange.data.Instance(data[0]))`.) 
    313 Random imputation 
    314 ================= 
    316 .. class:: Imputer_Random 
    318     Imputes random values. The corresponding constructor is 
    319     :obj:`ImputerConstructor_Random`. 
    321     .. attribute:: impute_class 
    323     Tells whether to impute the class values or not. Defaults to True. 
    325     .. attribute:: deterministic 
    327     If true (default is False), the random generator is initialized for each 
    328     example using the example's hash value as a seed. This results in the same 
    329     examples always being imputed the same values. 
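The effect of the :obj:`deterministic` flag can be sketched in plain Python by seeding a random generator with the example's hash (a sketch of the idea, not the actual implementation; the value range is made up):

```python
import random

# Sketch: impute a random value from a given range, seeding the generator
# with a hash of the example so the same example always gets the same
# imputed values (cf. the `deterministic` flag).

def impute_random(row, low=0.0, high=10.0, deterministic=True):
    rng = random.Random(hash(tuple(row)) if deterministic else None)
    return [rng.uniform(low, high) if v is None else v for v in row]

row = [1.0, None, 3.0]
once = impute_random(row)
again = impute_random(row)   # identical, because the seed is the same
```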
    331 Model-based imputation 
    332 ====================== 
    334 .. class:: ImputerConstructor_model 
    336     Model-based imputers learn to predict the attribute's value from values of 
    337     other attributes. :obj:`ImputerConstructor_model` is given a learning 
    338     algorithm (two, actually - one for discrete and one for continuous 
    339     attributes) and constructs a classifier for each attribute. The 
    340     constructed imputer :obj:`Imputer_model` stores a list of classifiers which 
    341     are used when needed. 
    343     .. attribute:: learner_discrete, learner_continuous 
    345     Learner for discrete and for continuous attributes. If any of them is 
    346     missing, the attributes of the corresponding type won't get imputed. 
    348     .. attribute:: use_class 
    350     Tells whether the imputer is allowed to use the class value. As this is 
    351     most often undesired, this option is by default set to False. It can 
    352     however be useful for a more complex design in which we would use one 
    353     imputer for learning examples (this one would use the class value) and 
    354     another for testing examples (which would not use the class value as this 
    355     is unavailable at that moment). 
    357 .. class:: Imputer_model 
    359     .. attribute:: models 
    361     A list of classifiers, each corresponding to one attribute of the examples 
    362     whose values are to be imputed. The :obj:`classVar`'s of the models should 
    363     equal the examples' attributes. If any classifier is missing (that is, 
    364     the corresponding element of the list is :obj:`None`), the corresponding 
    365     attribute's values will not be imputed. 
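The mechanics of such a list can be sketched in plain Python, with one entry per attribute and :obj:`None` meaning the attribute is left alone (illustrative names, not the Orange API):

```python
# Sketch: apply a per-attribute list of "models" (callables that get the
# whole example and return a value). A None entry skips that attribute,
# mirroring Imputer_model.models.

def apply_models(row, models):
    imputed = list(row)
    for i, model in enumerate(models):
        if imputed[i] is None and model is not None:
            imputed[i] = model(row)
    return imputed

# Impute column 0 from column 1; leave column 2 untouched.
models = [lambda r: 2 * r[1], None, None]
row = [None, 3.0, None]
result = apply_models(row, models)
```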
    367 .. rubric:: Examples 
    369 The following imputer predicts the missing attribute values using 
    370 classification and regression trees with the minimum of 20 examples in a leaf.  
    371 Part of :download:`imputation-complex.py <code/imputation-complex.py>` (uses :download:`bridges.tab <code/bridges.tab>`): 
    373 .. literalinclude:: code/imputation-complex.py 
    374     :lines: 74-76 
    376 We could even use the same learner for discrete and continuous attributes, 
    377 as :class:`Orange.classification.tree.TreeLearner` checks the class type 
    378 and constructs regression or classification trees accordingly. The  
    379 common parameters, such as the minimal number of 
    380 examples in leaves, are used in both cases. 
    382 You can also use different learning algorithms for discrete and 
    383 continuous attributes. Probably a common setup will be to use 
    384 :class:`Orange.classification.bayes.BayesLearner` for discrete and  
    385 :class:`Orange.regression.mean.MeanLearner` (which 
    386 just remembers the average) for continuous attributes. Part of  
    387 :download:`imputation-complex.py <code/imputation-complex.py>` (uses :download:`bridges.tab <code/bridges.tab>`): 
    389 .. literalinclude:: code/imputation-complex.py 
    390     :lines: 91-94 
    392 You can also construct an :class:`Imputer_model` yourself. You will do  
    393 this if different attributes need different treatment. Brace for an  
    394 example that will be a bit more complex. First we shall construct an  
    395 :class:`Imputer_model` and initialize an empty list of models.  
    396 The following code snippets are from 
    397 :download:`imputation-complex.py <code/imputation-complex.py>` (uses :download:`bridges.tab <code/bridges.tab>`): 
    399 .. literalinclude:: code/imputation-complex.py 
    400     :lines: 108-109 
    402 Attributes "LANES" and "T-OR-D" will always be imputed the values 2 and 
    403 "THROUGH". Since "LANES" is continuous, it suffices to construct a 
    404 :obj:`DefaultClassifier` with the default value 2.0 (don't forget the 
    405 decimal part, or else Orange will think you are talking about an index of a discrete 
    406 value - how could it tell?). For the discrete attribute "T-OR-D", we could 
    407 construct a :class:`Orange.classification.ConstantClassifier` and give the index of value 
    408 "THROUGH" as an argument. But we shall do it more elegantly, by constructing a 
    409 :class:`Orange.data.Value`. Both classifiers will be stored at the appropriate places 
    410 in :obj:`imputer.models`. 
    412 .. literalinclude:: code/imputation-complex.py 
    413     :lines: 110-112 
    416 "LENGTH" will be computed with a regression tree induced from "MATERIAL",  
    417 "SPAN" and "ERECTED" (together with "LENGTH" as the class attribute, of 
    418 course). Note that we initialized the domain by simply giving a list with 
    419 the names of the attributes, with the domain as an additional argument 
    420 in which Orange will look for the named attributes. 
    422 .. literalinclude:: code/imputation-complex.py 
    423     :lines: 114-119 
    425 We printed the tree just to see what it looks like. 
    427 :: 
    429     SPAN=SHORT: 1158 
    430     SPAN=LONG: 1907 
    431     SPAN=MEDIUM 
    432     |    ERECTED<1908.500: 1325 
    433     |    ERECTED>=1908.500: 1528 
    436 Small and nice. Now for the "SPAN". Wooden bridges and walkways are short, 
    437 while the others are mostly medium. This could be done by 
    438 :class:`Orange.classification.lookup.ClassifierByLookupTable` - this would be faster 
    439 than what we plan here. See the corresponding documentation on lookup 
    440 classifier. Here we are going to do it with a Python function. 
    442 .. literalinclude:: code/imputation-complex.py 
    443     :lines: 121-128 
    445 :obj:`compute_span` could also be written as a class, if you'd prefer 
    446 it. It's important that it behaves like a classifier, that is, gets an example 
    447 and returns a value. The second argument tells, as usual, what the caller expects 
    448 the classifier to return - a value, a distribution or both. Since the caller, 
    449 :obj:`Imputer_model`, always wants values, we shall ignore the argument 
    450 (at the risk of having problems in the future when imputers might handle 
    451 distributions as well). 
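A function used this way can be sketched as follows; the rule below is a simplified stand-in for the actual compute_span in the downloadable script, and all names are illustrative:

```python
# Sketch: a plain Python function used as a per-attribute model. Like
# compute_span, it takes an example (and the usual "what to return"
# argument, which it ignores) and returns a value.

def compute_span_sketch(example, return_what=None):
    # Simplified rule from the text: wooden bridges and walkways are
    # SHORT, the rest are mostly MEDIUM.
    material, purpose = example["MATERIAL"], example["PURPOSE"]
    if material == "WOOD" or purpose == "WALK":
        return "SHORT"
    return "MEDIUM"

example = {"MATERIAL": "WOOD", "PURPOSE": "HIGHWAY"}
span = compute_span_sketch(example)
```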
    453 Missing values as special values 
    454 ================================ 
    456 Missing values sometimes have a special meaning. The fact that something was 
    457 not measured can sometimes tell a lot. Be, however, cautious when using such 
    458 values in decision models; if the decision not to measure something (for 
    459 instance, performing a laboratory test on a patient) is based on the expert's 
    460 knowledge of the class value, such unknown values clearly should not be used  
    461 in models. 
    463 .. class:: ImputerConstructor_asValue 
    465     Constructs a new domain in which each 
    466     discrete attribute is replaced with a new attribute that has one value more: 
    467     "NA". The new attribute will compute its values on the fly from the old one, 
    468     copying the normal values and replacing the unknowns with "NA". 
    470     For continuous attributes, it will 
    471     construct a two-valued discrete attribute with values "def" and "undef", 
    472     telling whether the continuous attribute was defined or not. The attribute's 
    473     name will equal the original's with "_def" appended. The original continuous 
    474     attribute will remain in the domain and its unknowns will be replaced by 
    475     averages. 
    477     :class:`ImputerConstructor_asValue` has no specific attributes. 
    479     It constructs :class:`Imputer_asValue` (I bet you 
    480     wouldn't guess). It converts the example into the new domain, which imputes 
    481     the values for discrete attributes. If continuous attributes are present, it 
    482     will also replace their unknown values by the averages. 
    484 .. class:: Imputer_asValue 
    486     .. attribute:: domain 
    488         The domain with the new attributes constructed by  
    489         :class:`ImputerConstructor_asValue`. 
    491     .. attribute:: defaults 
    493         Default values for continuous attributes. Present only if there are any. 
    495 The following code shows what this imputer actually does to the domain. 
    496 Part of :download:`imputation-complex.py <code/imputation-complex.py>` (uses :download:`bridges.tab <code/bridges.tab>`): 
    498 .. literalinclude:: code/imputation-complex.py 
    499     :lines: 137-151 
    501 The script's output looks like this:: 
    507     RIVER: M -> M 
    508     ERECTED: 1874 -> 1874 (def) 
    509     PURPOSE: RR -> RR 
    510     LENGTH: ? -> 1567 (undef) 
    511     LANES: 2 -> 2 (def) 
    512     CLEAR-G: ? -> NA 
    513     T-OR-D: THROUGH -> THROUGH 
    514     MATERIAL: IRON -> IRON 
    515     SPAN: ? -> NA 
    516     REL-L: ? -> NA 
    517     TYPE: SIMPLE-T -> SIMPLE-T 
    519 Seemingly, the two examples have the same attributes (with 
    520 :samp:`imputed` having a few additional ones). If you check this by 
    521 :samp:`original.domain[0] == imputed.domain[0]`, you will see that this 
    522 first impression is false. The attributes only have the same names, 
    523 but they are different attributes. If you have read this far (which is already a 
    524 bit advanced), you know that Orange does not really care about the attribute 
    525 names. 
    527 Therefore, if we wrote :samp:`imputed[i]` the program would fail 
    528 since :samp:`imputed` has no attribute :samp:`i`. But it has an 
    529 attribute with the same name (which even usually has the same value). We 
    530 therefore use :samp:`i.name` to index the attributes of 
    531 :samp:`imputed`. (Using names for indexing is not fast, though; if you do 
    532 it a lot, compute the integer index with 
    533 :samp:`imputed.domain.index(i.name)`.) 
    535 For continuous attributes, there is an additional attribute with "_def" 
    536 appended; we get it by :samp:`i.name+"_def"`. 
    538 The first continuous attribute, "ERECTED" is defined. Its value remains 1874 
    539 and the additional attribute "ERECTED_def" has value "def". Not so for 
    540 "LENGTH". Its undefined value is replaced by the average (1567) and the new 
    541 attribute has value "undef". The undefined discrete attribute "CLEAR-G" (and 
    542 all other undefined discrete attributes) is assigned the value "NA". 
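What :class:`Imputer_asValue` does to a single example can be sketched in plain Python (an illustration only; the ERECTED average below is made up, while 1567 is the LENGTH average from the output above):

```python
# Sketch of Imputer_asValue's effect on one example: discrete unknowns
# become "NA"; each continuous attribute gets a companion "<name>_def"
# indicator and its unknowns are replaced by the training average.

def as_value(example, discrete, continuous, averages):
    out = {}
    for name in discrete:
        v = example[name]
        out[name] = "NA" if v is None else v
    for name in continuous:
        v = example[name]
        out[name + "_def"] = "undef" if v is None else "def"
        out[name] = averages[name] if v is None else v
    return out

example = {"SPAN": None, "T-OR-D": "THROUGH", "LENGTH": None, "ERECTED": 1874}
result = as_value(example,
                  discrete=["SPAN", "T-OR-D"],
                  continuous=["LENGTH", "ERECTED"],
                  averages={"LENGTH": 1567, "ERECTED": 1905})
```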
    544 Using imputers 
    545 ============== 
    547 To properly use the imputation classes in learning process, they must be 
    548 trained on training examples only. Imputing the missing values and subsequently 
    549 using the data set in cross-validation will give overly optimistic results. 
    551 Learners with imputer as a component 
    552 ------------------------------------ 
    554 Orange learners that cannot handle missing values will generally provide a slot 
    555 for the imputer component. An example of such a class is 
    556 :obj:`Orange.classification.logreg.LogRegLearner` with an attribute called 
    557 :obj:`Orange.classification.logreg.LogRegLearner.imputerConstructor`. To it you 
    558 can assign an imputer constructor - one of the above constructors or a specific 
    559 constructor you wrote yourself. When given learning examples, 
    560 :obj:`Orange.classification.logreg.LogRegLearner` will pass them to 
    561 :obj:`Orange.classification.logreg.LogRegLearner.imputerConstructor` to get an 
    562 imputer (again some of the above or a specific imputer you programmed). It will 
    563 immediately use the imputer to impute the missing values in the learning data 
    564 set, so it can be used by the actual learning algorithm. Besides, when the 
    565 classifier :obj:`Orange.classification.logreg.LogRegClassifier` is constructed, 
    566 the imputer will be stored in its attribute 
    567 :obj:`Orange.classification.logreg.LogRegClassifier.imputer`. At 
    568 classification, the imputer will be used for imputation of missing values in 
    569 (testing) examples. 
    571 Although details may vary from algorithm to algorithm, this is how the 
    572 imputation is generally used in Orange's learners. Also, if you write your own 
    573 learners, it is recommended that you use imputation according to the described 
    574 procedure. 
    576 Write your own imputer 
    577 ====================== 
    579 Imputation classes provide the Python-callback functionality (not all Orange 
    580 classes do so, refer to the documentation on `subtyping the Orange classes  
    581 in Python <callbacks.htm>`_ for a list). If you want to write your own 
    582 imputation constructor or an imputer, you need to simply program a Python 
    583 function that will behave like the built-in Orange classes (and even less: 
    584 for an imputer, you only need to write a function that gets an example as an 
    585 argument; imputation for example tables will then use that function). 
    587 You will most often write the imputation constructor when you have a special 
    588 imputation procedure or separate procedures for various attributes, as we've  
    589 demonstrated in the description of 
    590 :obj:`Orange.feature.imputation.ImputerConstructor_model`. You basically only  
    591 need to pack everything we've written there to an imputer constructor that 
    592 will accept a data set and the id of the weight meta-attribute (ignore it if 
    593 you will, but you must accept two arguments), and return the imputer (probably 
    594 :obj:`Orange.feature.imputation.Imputer_model`). The benefit of implementing an 
    595 imputer constructor as opposed to what we did above is that you can use such a 
    596 constructor as a component for Orange learners (like logistic regression) or 
    597 for wrappers from module orngImpute, and that way properly use it in 
    598 classifier testing procedures. 
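The required interface can be sketched in plain Python: any callable that accepts the data and the weight id and returns an imputer will do (the mean-based rule and all names below are illustrative):

```python
# Sketch: an imputer constructor is any callable that accepts the data
# and the id of the weight meta-attribute (which it may ignore) and
# returns an imputer - here, a function imputing per-column means.

def my_imputer_constructor(data, weight_id=0):
    n_cols = len(data[0])
    means = []
    for c in range(n_cols):
        known = [row[c] for row in data if row[c] is not None]
        means.append(sum(known) / len(known))
    def imputer(row):
        return [means[c] if v is None else v for c, v in enumerate(row)]
    return imputer

imputer = my_imputer_constructor([[1.0, None], [3.0, 4.0]])
```

Such a constructor could then be assigned to a learner's imputer-constructor slot, just like the built-in ones.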
    600 """ 
    602 import Orange.core as orange 
    603 from orange import ImputerConstructor_minimal  