Changeset 8925:ea6a068ee6e1 in orange


Timestamp: 09/08/11 11:58:12
Author: lanz <lan.zagar@…>
Branch: default
Convert: a7b6fc9f68ff77b339cfea951aa47b3a3b2e37e2
Message: Improved documentation for lookup module.
Location: orange
Files: 3 edited

  • orange/Orange/classification/lookup.py

    r8066 r8925  
    99Lookup classifiers predict classes by looking into stored lists of 
    1010cases. There are two kinds of such classifiers in Orange. The simpler 
    11 and fastest :obj:`ClassifierByLookupTable` use up to three discrete 
    12 features and have a stored mapping from values of those features to 
    13 class value. The more complex classifiers store a 
     11and faster :obj:`ClassifierByLookupTable` uses up to three discrete 
     12features and has a stored mapping from values of those features to the 
     13class value. The more complex classifiers store an 
    1414:obj:`Orange.data.Table` and predict the class by matching the instance 
    1515to instances in the table. 
     
    2121they usually reside in :obj:`~Orange.data.variable.Variable.get_value_from` fields of constructed 
    2222features to facilitate their automatic computation. For instance, 
    23 the following script shows how to translate the `monks-1.tab`_ dataset 
     23the following script shows how to translate the `monks-1.tab`_ data set 
    2424features into a more useful subset that will only include the features 
    25 a, b, e, and features that will tell whether a and b are equal and 
    26 whether e is 1 (don't bother about the details, they follow later;  
     25``a``, ``b``, ``e``, and features that will tell whether ``a`` and ``b`` are equal and 
     26whether ``e`` is 1 (don't bother about the details, they follow later;  
    2727`lookup-lookup.py`_, uses: `monks-1.tab`_): 
    2828 
     
    3131     
    3232We can check the correctness of the script by printing out several 
    33 random examples from data2. 
     33random examples from ``data2``. 
    3434 
    3535    >>> for i in range(5): 
     
    4141    ['1', '3', 'no', '1', 'yes', '1'] 
    4242 
    43 The first :obj:`ClassifierByLookupTable` takes values of features a 
    44 and b and computes the value of ab according to the rule given in the 
    45 given table. The first three values correspond to a=1 and b=1, 2, 3; 
    46 for the first combination, value of ab should be "yes", for the other 
    47 two a and b are different. The next triplet correspond to a=2; 
     43The first :obj:`ClassifierByLookupTable` takes values of features ``a`` 
     44and ``b`` and computes the value of ``ab`` according to the rule given in the 
     45given table. The first three values correspond to ``a=1`` and ``b=1,2,3``; 
     46for the first combination, value of ``ab`` should be "yes", for the other 
     47two ``a`` and ``b`` are different. The next triplet corresponds to ``a=2``; 
    4848here, the middle value is "yes"... 
    4949 
    5050The second lookup is simpler: since it involves only a single feature, 
    51 the list is a simple one-to-one mapping from the four-valued e to the 
    52 two-valued e1. The last value in the list is returned when e is unknown 
    53 and tells that e1 should be unknown then as well. 
     51the list is a simple one-to-one mapping from the four-valued ``e`` to the 
     52two-valued ``e1``. The last value in the list is returned when ``e`` is unknown 
     53and tells that ``e1`` should be unknown then as well. 
    5454 
    5555Note that you don't need :obj:`ClassifierByLookupTable` for this. 
    56 The new feature e1 could be computed with a callback to Python, 
     56The new feature ``e1`` could be computed with a callback to Python, 
    5757for instance:: 
    5858 
    59     e2.get_value_from = lambda ex, rw: orange.Value(e2, ex["e"]=="1") 
     59    e2.get_value_from = lambda ex, rw: orange.Value(e2, ex["e"] == "1") 
    6060 
    6161 
     
    7272:obj:`ClassifierByLookupTable1`, :obj:`ClassifierByLookupTable2` or 
    7373:obj:`ClassifierByLookupTable3`. As their names tell, the first 
    74 classifies using a single feature (so that's what we had for e1), 
    75 the second uses a pair of features (and has been constructed for ab 
     74classifies using a single feature (so that's what we had for ``e1``), 
     75the second uses a pair of features (and has been constructed for ``ab`` 
    7676above), and the third uses three features. Class predictions for each 
    7777combination of feature values are stored in a (one dimensional) table. 
     
    8888.. py:class:: ClassifierByLookupTable(class_var, variable1[, variable2[, variable3]] [, lookup_table[, distributions]]) 
    8989     
    90     A general constructor that, based on the number of feature 
    91     descriptors, constructs one of the three classes discussed. 
    92     If lookup_table and distributions are omitted, constructor also 
    93     initializes lookup_table and distributions to two lists of the 
    94     right sizes, but their elements are don't knows and empty 
    95     distributions. If they are given, they must be of correct size. 
     90    A general constructor that, based on the number of feature descriptors, 
     91    constructs one of the three classes discussed. If :obj:`lookup_table` 
     92    and :obj:`distributions` are omitted, the constructor also initializes 
     93    them to two lists of the right sizes, but their elements are don't knows 
     94    and empty distributions. If they are given, they must be of correct size. 
    9695     
    9796    .. attribute:: variable1[, variable2[, variable3]](read only) 
     
    106105        The above variables, returned as a tuple. 
    107106 
    108     .. attribute:: noOfValues1, noOfValues2[, noOfValues3] (read only) 
     107    .. attribute:: no_of_values1[, no_of_values2[, no_of_values3]] (read only) 
    109108         
    110109        The number of values for variable1, variable2 and variable3. 
     
    115114    .. attribute:: lookup_table (read only) 
    116115         
    117         A list of values (ValueList), one for each possible combination of 
    118         features. For ClassifierByLookupTable1, there is an additional 
    119         element that is returned when the feature's value is unknown. 
    120         Values are ordered by values of features, with variable1 being the 
    121         most important. In case of two three valued features, the list 
    122         order is therefore 1-1, 1-2, 1-3, 2-1, 2-2, 2-3, 3-1, 3-2, 3-3, 
     116        A list of values (:obj:`Orange.core.ValueList`), one for each possible 
     117        combination of features. For ClassifierByLookupTable1, there is an 
     118        additional element that is returned when the feature's value is 
     119        unknown. Values are ordered by values of features, with variable1 
     120        being the most important. In case of two three valued features, the 
     121        list order is therefore 1-1, 1-2, 1-3, 2-1, 2-2, 2-3, 3-1, 3-2, 3-3, 
    123122        where the first digit corresponds to variable1 and the second to 
    124123        variable2. 
     
    130129    .. attribute:: distributions (read only) 
    131130         
    132         Similar to :obj:`lookup_table`, but is of type DistributionList 
    133         and stores a distribution for each combination of values.  
    134  
    135     .. attribute:: dataDescription 
    136          
    137         An object of type EFMDataDescription, defined only for 
    138         ClassifierByLookupTable2 and ClassifierByLookupTable3. They use 
    139         it to make predictions when one or more feature values are unknown. 
    140         ClassifierByLookupTable1 doesn't need it since this case is covered 
    141         by an additional element in lookup_table and distributions, 
     131        Similar to :obj:`lookup_table`, but is of type 
     132        :obj:`Orange.core.DistributionList` and stores a distribution 
     133        for each combination of values.  
     134 
     135    .. attribute:: data_description 
     136         
     137        An object of type :obj:`EFMDataDescription`, defined only for 
     138        ClassifierByLookupTable2 and ClassifierByLookupTable3. They use it 
     139        to make predictions when one or more feature values are unknown. 
     140        ClassifierByLookupTable1 doesn't need it since this case is covered by 
     141        an additional element in :obj:`lookup_table` and :obj:`distributions`, 
    142142        as told above. 
    143143         
    144     .. method:: getindex(example) 
    145      
    146         Returns an index into lookup_table or distributions. The formula 
    147         depends upon the type of the classifier. If value\ *i* is 
    148         int(example[variable\ *i*]), then the corresponding formulae are 
     144    .. method:: get_index(example) 
     145     
     146        Returns an index of ``example`` in :obj:`lookup_table` and 
     147        :obj:`distributions`. The formula depends upon the type of 
     148        the classifier. If value\ *i* is int(example[variable\ *i*]), 
     149        then the corresponding formulae are 
    149150 
    150151        ClassifierByLookupTable1: 
    151             index = value1, or len(lookup_table)-1 if value is unknown 
     152            index = value1, or len(lookup_table) - 1 if value is unknown 
    152153        ClassifierByLookupTable2: 
    153             index = value1*noOfValues1 + value2, or -1 if any value is unknown  
     154            index = value1 * no_of_values1 + value2, or -1 if any value is unknown 
    154155        ClassifierByLookupTable3: 
    155             index = (value1*noOfValues1 + value2) * noOfValues2 + value3, or -1 if any value is unknown 
     156            index = (value1 * no_of_values1 + value2) * no_of_values2 + value3, or -1 if any value is unknown 
    156157 
    157158        Let's see some indices for randomly chosen examples from the original table. 
     
    407408        >>> bound = [table.domain[name] for name in ["a", "b"]] 
    408409        >>> newVar = Orange.data.variable.Discrete("a=b", values=["no", "yes"]) 
    409         >>> lookup = lookup_from_function(newVar, bound, lambda x: x[0]==x[1]) 
     410        >>> lookup = lookup_from_function(newVar, bound, lambda x: x[0] == x[1]) 
    410411        >>> newVar.get_value_from = lookup 
    411412        >>> import orngCI 
     
    515516       
    516517 
    517 def lookup_from_data(examples, weight = 0, learnerForUnknown = None): 
     518def lookup_from_data(examples, weight=0, learnerForUnknown=None): 
    518519    if len(examples.domain.attributes) <= 3: 
    519520        lookup = lookup_from_bound(examples.domain.class_var, 
     
    522523        for example in examples: 
    523524            ind = lookup.getindex(example) 
    524             if not lookup_table[ind].isSpecial() and (lookup_table[ind] <> 
     525            if not lookup_table[ind].isSpecial() and (lookup_table[ind] != 
    525526                                                     example.getclass()): 
    526527                break 
     
    565566        for ex in cnt: 
    566567            for i in range(len(ex)): 
    567                 if ex[i]<len(boundset[i].values): 
     568                if ex[i] < len(boundset[i].values): 
    568569                    outp += "%s\t" % boundset[i].values[ex[i]] 
    569570                else: 
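
Reviewer sketch (not part of the changeset): the index formulae documented for `get_index` can be illustrated in plain Python without Orange. The helper name `lookup_index` and its arguments are hypothetical. The sketch assumes the ordering described in the `lookup_table` docstring, with variable1 as the most significant position, so each step multiplies by the number of values of the following feature. That coincides with the formulae quoted in the docstring whenever all features have the same number of values, as with the two three-valued features in the documented example.

```python
def lookup_index(values, sizes):
    """Return a flat index into a lookup table (illustrative only).

    values -- value index of each feature, or None when unknown
    sizes  -- number of possible values of each feature
    """
    if len(values) == 1:
        # Single feature: an unknown value maps to the extra last element,
        # i.e. len(lookup_table) - 1, which equals the number of values.
        return sizes[0] if values[0] is None else values[0]
    if any(v is None for v in values):
        # Two or three features: any unknown value yields -1.
        return -1
    index = 0
    for value, size in zip(values, sizes):
        # Row-major order: the first feature is the most significant.
        index = index * size + value
    return index


# Two three-valued features, order 1-1, 1-2, 1-3, 2-1, ...
print(lookup_index((1, 1), (3, 3)))
```

With two three-valued features, the pair 2-2 (value indices 1 and 1) lands on position 4, the middle of the nine-element table.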
  • orange/doc/Orange/rst/code/lookup-lookup.py

    r8042 r8925  
    1212 
    1313ab = Orange.data.variable.Discrete("a==b", values = ["no", "yes"]) 
    14 ab.getValueFrom = Orange.classification.lookup.ClassifierByLookupTable(ab, a, b, 
     14ab.get_value_from = Orange.classification.lookup.ClassifierByLookupTable(ab, a, b, 
    1515                    ["yes", "no", "no",  "no", "yes", "no",  "no", "no", "yes"]) 
    1616 
    1717e1 = Orange.data.variable.Discrete("e==1", values = ["no", "yes"]) 
    18 e1.getValueFrom = Orange.classification.lookup.ClassifierByLookupTable(e1, e, 
     18e1.get_value_from = Orange.classification.lookup.ClassifierByLookupTable(e1, e, 
    1919                    ["yes", "no", "no", "no", "?"]) 
    2020 
    21 table2 = table.select([a, b, ab, e, e1, table.domain.classVar]) 
     21table2 = table.select([a, b, ab, e, e1, table.domain.class_var]) 
    2222 
    2323for i in range(5): 
    24     print table2.randomexample() 
     24    print table2.random_example() 
    2525 
    2626for i in range(5): 
    27     ex = table.randomexample() 
    28     print "%s: ab %i, e1 %i " % (ex, ab.getValueFrom.getindex(ex), 
    29                                  e1.getValueFrom.getindex(ex)) 
     27    ex = table.random_example() 
     28    print "%s: ab %i, e1 %i " % (ex, ab.get_value_from.get_index(ex), 
     29                                 e1.get_value_from.get_index(ex)) 
    3030     
    3131# What follows is only for testing Orange... 
    3232 
    33 ab_c = ab.getValueFrom 
    34 print ab_c.variable1.name, ab_c.variable2.name, ab_c.classVar.name 
    35 print ab_c.noOfValues1, ab_c.noOfValues2 
     33ab_c = ab.get_value_from 
     34print ab_c.variable1.name, ab_c.variable2.name, ab_c.class_var.name 
     35print ab_c.no_of_values1, ab_c.no_of_values2 
    3636print [x.name for x in ab_c.variables] 
    3737 
    38 e1_c = e1.getValueFrom 
    39 print e1_c.variable1.name, e1_c.classVar.name 
     38e1_c = e1.get_value_from 
     39print e1_c.variable1.name, e1_c.class_var.name 
    4040print [x.name for x in e1_c.variables] 
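
Reviewer sketch (not part of the changeset): the two lookup tables built in `lookup-lookup.py` can be emulated in plain Python to check what they encode. The value lists below are assumptions (three values for `a` and `b`, four for `e`, each named "1", "2", ...), and the function names are hypothetical. Returning "?" for an unknown value in the two-feature case is a simplification; according to the documentation in this changeset, the real classifiers use `data_description` there.

```python
A_VALUES = ["1", "2", "3"]          # assumed values of feature a
B_VALUES = ["1", "2", "3"]          # assumed values of feature b
E_VALUES = ["1", "2", "3", "4"]     # assumed values of feature e

# Tables copied from lookup-lookup.py.
AB_TABLE = ["yes", "no", "no", "no", "yes", "no", "no", "no", "yes"]
E1_TABLE = ["yes", "no", "no", "no", "?"]


def classify_ab(a, b):
    """Look up "a==b"; unknown inputs give "?" (a simplification)."""
    if a not in A_VALUES or b not in B_VALUES:
        return "?"
    index = A_VALUES.index(a) * len(B_VALUES) + B_VALUES.index(b)
    return AB_TABLE[index]


def classify_e1(e):
    """Look up "e==1"; the last table element covers an unknown e."""
    index = E_VALUES.index(e) if e in E_VALUES else len(E1_TABLE) - 1
    return E1_TABLE[index]


print(classify_ab("1", "3"), classify_e1("1"))
```

For `a=1`, `b=3`, `e=1` this gives "no" and "yes", which agrees with the sample row `['1', '3', 'no', '1', 'yes', '1']` shown in the module documentation.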
  • orange/doc/Orange/rst/code/lookup-table.py

    r8042 r8925  
    1010a, b, e = table.domain["a"], table.domain["b"], table.domain["e"] 
    1111 
    12 table_s = table.select([a, b, e, table.domain.classVar]) 
     12table_s = table.select([a, b, e, table.domain.class_var]) 
    1313abe = Orange.classification.lookup.LookupLearner(table_s) 
    1414 
     
    2121 
    2222for i in abe.sorted_examples[:10]: 
    23     print i, i.getclass().svalue 
     23    print i, i.get_class().svalue 
    2424print 
    2525 
     
    2727abe2 = Orange.classification.lookup.LookupLearner(y2, [a, b, e], table) 
    2828for i in abe2.sorted_examples[:10]: 
    29     print i, i.getclass().svalue 
     29    print i, i.get_class().svalue 
    3030print 
    3131 
     
    3333abe2 = Orange.classification.lookup.LookupLearner(y2, [a, b], table) 
    3434for i in abe2.sorted_examples: 
    35     print i, i.getclass().svalue 
     35    print i, i.get_class().svalue 
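
Reviewer sketch (not part of the changeset): the module documentation describes a second kind of lookup classifier that stores a table and predicts the class by matching an instance against the stored instances. A minimal plain-Python analogue is shown below. It is not Orange's `LookupLearner`; the function names are hypothetical, and resolving repeated feature combinations by majority class is an assumption of this sketch.

```python
from collections import Counter, defaultdict


def learn_lookup(rows, n_features):
    """Build a dict from feature tuples to the majority class.

    rows       -- sequences of feature values followed by the class value
    n_features -- how many leading items of each row are features
    """
    counts = defaultdict(Counter)
    for row in rows:
        key = tuple(row[:n_features])
        counts[key][row[n_features]] += 1
    # Keep the most frequent class for each feature combination.
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def predict(table, features, default=None):
    """Return the stored class, or `default` if the combination is unseen."""
    return table.get(tuple(features), default)


rows = [
    ("1", "1", "1"),
    ("1", "1", "1"),
    ("1", "1", "0"),
    ("1", "2", "0"),
]
table = learn_lookup(rows, n_features=2)
print(predict(table, ("1", "1")), predict(table, ("3", "3"), default="?"))
```

A dictionary keeps the sketch short; a sorted list of examples, such as the `sorted_examples` attribute printed in the script above, would serve the same matching purpose.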