Changeset 8925:ea6a068ee6e1 in orange
Timestamp: 09/08/11 11:58:12 (3 years ago)
Branch: default
Convert: a7b6fc9f68ff77b339cfea951aa47b3a3b2e37e2
Location: orange
Files: 3 edited

Index: orange/Orange/classification/lookup.py
===================================================================
--- orange/Orange/classification/lookup.py  (revision 8066)
+++ orange/Orange/classification/lookup.py  (revision 8925)
@@ -9,7 +9,7 @@
 Lookup classifiers predict classes by looking into stored lists of
 cases. There are two kinds of such classifiers in Orange. The simpler
-and fastest :obj:`ClassifierByLookupTable` use up to three discrete
-features and have a stored mapping from values of those features to
-class value. The more complex classifiers store a
+and faster :obj:`ClassifierByLookupTable` uses up to three discrete
+features and has a stored mapping from values of those features to the
+class value. The more complex classifiers store an
 :obj:`Orange.data.Table` and predict the class by matching the instance
 to instances in the table.
@@ -21,8 +21,8 @@
 they usually reside in :obj:`~Orange.data.variable.Variable.get_value_from` fields of constructed
 features to facilitate their automatic computation. For instance,
-the following script shows how to translate the `monks-1.tab`_ data set
+the following script shows how to translate the `monks-1.tab`_ data set
 features into a more useful subset that will only include the features
-a, b, e, and features that will tell whether a and b are equal and
-whether e is 1 (don't bother about the details, they follow later;
+``a``, ``b``, ``e``, and features that will tell whether ``a`` and ``b`` are equal and
+whether ``e`` is 1 (don't bother about the details, they follow later;
 `lookup-lookup.py`_, uses: `monks-1.tab`_):
 
@@ -31,5 +31,5 @@
 
 We can check the correctness of the script by printing out several
-random examples from data2.
+random examples from ``data2``.
 
 >>> for i in range(5):
@@ -41,21 +41,21 @@
 ['1', '3', 'no', '1', 'yes', '1']
 
-The first :obj:`ClassifierByLookupTable` takes values of features a
-and b and computes the value of ab according to the rule given in the
-given table. The first three values correspond to a=1 and b=1, 2, 3;
-for the first combination, value of ab should be "yes", for the other
-two a and b are different. The next triplet correspond to a=2;
+The first :obj:`ClassifierByLookupTable` takes values of features ``a``
+and ``b`` and computes the value of ``ab`` according to the rule given in the
+given table. The first three values correspond to ``a=1`` and ``b=1,2,3``;
+for the first combination, value of ``ab`` should be "yes", for the other
+two ``a`` and ``b`` are different. The next triplet corresponds to ``a=2``;
 here, the middle value is "yes"...
 
 The second lookup is simpler: since it involves only a single feature,
-the list is a simple one-to-one mapping from the four-valued e to the
-two-valued e1. The last value in the list is returned when e is unknown
-and tells that e1 should be unknown then as well.
+the list is a simple one-to-one mapping from the four-valued ``e`` to the
+two-valued ``e1``. The last value in the list is returned when ``e`` is unknown
+and tells that ``e1`` should be unknown then as well.
 
 Note that you don't need :obj:`ClassifierByLookupTable` for this.
-The new feature e1 could be computed with a callback to Python,
+The new feature ``e1`` could be computed with a callback to Python,
 for instance::
 
-    e2.get_value_from = lambda ex, rw: orange.Value(e2, ex["e"]=="1")
+    e2.get_value_from = lambda ex, rw: orange.Value(e2, ex["e"] == "1")
 
 
@@ -72,6 +72,6 @@
 :obj:`ClassifierByLookupTable1`, :obj:`ClassifierByLookupTable2` or
 :obj:`ClassifierByLookupTable3`. As their names tell, the first
-classifies using a single feature (so that's what we had for e1),
-the second uses a pair of features (and has been constructed for ab
+classifies using a single feature (so that's what we had for ``e1``),
+the second uses a pair of features (and has been constructed for ``ab``
 above), and the third uses three features. Class predictions for each
 combination of feature values are stored in a (one dimensional) table.
@@ -88,10 +88,9 @@
 .. py:class:: ClassifierByLookupTable(class_var, variable1[, variable2[, variable3]] [, lookup_table[, distributions]])
 
-    A general constructor that, based on the number of feature
-    descriptors, constructs one of the three classes discussed.
-    If lookup_table and distributions are omitted, constructor also
-    initializes lookup_table and distributions to two lists of the
-    right sizes, but their elements are don't knows and empty
-    distributions. If they are given, they must be of correct size.
+    A general constructor that, based on the number of feature descriptors,
+    constructs one of the three classes discussed. If :obj:`lookup_table`
+    and :obj:`distributions` are omitted, the constructor also initializes
+    them to two lists of the right sizes, but their elements are don't knows
+    and empty distributions. If they are given, they must be of correct size.
 
     .. attribute:: variable1[, variable2[, variable3]](read only)
@@ -106,5 +105,5 @@
         The above variables, returned as a tuple.
 
-    .. attribute:: noOfValues1, noOfValues2[, noOfValues3] (read only)
+    .. attribute:: no_of_values1[, no_of_values2[, no_of_values3]] (read only)
 
         The number of values for variable1, variable2 and variable3.
@@ -115,10 +114,10 @@
     .. attribute:: lookup_table (read only)
 
-        A list of values (ValueList), one for each possible combination of
-        features. For ClassifierByLookupTable1, there is an additional
-        element that is returned when the feature's value is unknown.
-        Values are ordered by values of features, with variable1 being the
-        most important. In case of two three valued features, the list
-        order is therefore 11, 12, 13, 21, 22, 23, 31, 32, 33,
+        A list of values (:obj:`Orange.core.ValueList`), one for each possible
+        combination of features. For ClassifierByLookupTable1, there is an
+        additional element that is returned when the feature's value is
+        unknown. Values are ordered by values of features, with variable1
+        being the most important. In case of two three valued features, the
+        list order is therefore 11, 12, 13, 21, 22, 23, 31, 32, 33,
         where the first digit corresponds to variable1 and the second to
         variable2.
@@ -130,28 +129,30 @@
     .. attribute:: distributions (read only)
 
-        Similar to :obj:`lookup_table`, but is of type DistributionList
-        and stores a distribution for each combination of values.
-
-    .. attribute:: dataDescription
-
-        An object of type EFMDataDescription, defined only for
-        ClassifierByLookupTable2 and ClassifierByLookupTable3. They use
-        it to make predictions when one or more feature values are unknown.
-        ClassifierByLookupTable1 doesn't need it since this case is covered
-        by an additional element in lookup_table and distributions,
+        Similar to :obj:`lookup_table`, but is of type
+        :obj:`Orange.core.DistributionList` and stores a distribution
+        for each combination of values.
+
+    .. attribute:: data_description
+
+        An object of type :obj:`EFMDataDescription`, defined only for
+        ClassifierByLookupTable2 and ClassifierByLookupTable3. They use it
+        to make predictions when one or more feature values are unknown.
+        ClassifierByLookupTable1 doesn't need it since this case is covered by
+        an additional element in :obj:`lookup_table` and :obj:`distributions`,
         as told above.
 
-    .. method:: getindex(example)
-
-        Returns an index into lookup_table or distributions. The formula
-        depends upon the type of the classifier. If value\ *i* is
-        int(example[variable\ *i*]), then the corresponding formulae are
+    .. method:: get_index(example)
+
+        Returns an index of ``example`` in :obj:`lookup_table` and
+        :obj:`distributions`. The formula depends upon the type of
+        the classifier. If value\ *i* is int(example[variable\ *i*]),
+        then the corresponding formulae are
 
         ClassifierByLookupTable1:
-            index = value1, or len(lookup_table)-1 if value is unknown
+            index = value1, or len(lookup_table) - 1 if value is unknown
         ClassifierByLookupTable2:
-            index = value1*noOfValues1 + value2, or -1 if any value is unknown
+            index = value1 * no_of_values1 + value2, or -1 if any value is unknown
         ClassifierByLookupTable3:
-            index = (value1*noOfValues1 + value2) * noOfValues2 + value3, or -1 if any value is unknown
+            index = (value1 * no_of_values1 + value2) * no_of_values2 + value3, or -1 if any value is unknown
 
         Let's see some indices for randomly chosen examples from the original table.
@@ -407,5 +408,5 @@
 >>> bound = [table.domain[name] for name in ["a", "b"]]
 >>> newVar = Orange.data.variable.Discrete("a=b", values=["no", "yes"])
->>> lookup = lookup_from_function(newVar, bound, lambda x: x[0]==x[1])
+>>> lookup = lookup_from_function(newVar, bound, lambda x: x[0] == x[1])
 >>> newVar.get_value_from = lookup
 >>> import orngCI
@@ -515,5 +516,5 @@
 
 
-def lookup_from_data(examples, weight = 0, learnerForUnknown = None):
+def lookup_from_data(examples, weight=0, learnerForUnknown=None):
     if len(examples.domain.attributes) <= 3:
         lookup = lookup_from_bound(examples.domain.class_var,
@@ -522,5 +523,5 @@
         for example in examples:
             ind = lookup.getindex(example)
-            if not lookup_table[ind].isSpecial() and (lookup_table[ind] <>
+            if not lookup_table[ind].isSpecial() and (lookup_table[ind] !=
                                                       example.getclass()):
                 break
@@ -565,5 +566,5 @@
     for ex in cnt:
         for i in range(len(ex)):
-            if ex[i] <len(boundset[i].values):
+            if ex[i] < len(boundset[i].values):
                 outp += "%s\t" % boundset[i].values[ex[i]]
             else:
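Reviewer sketch (not part of the changeset): the index formulae documented for ``get_index`` can be illustrated in plain Python without Orange. The function ``lookup_index`` and its ``sizes`` argument are hypothetical names introduced here; the code assumes a dense row-major layout in which the first variable is the most significant, as the ``lookup_table`` docstring describes. Under that assumption each step multiplies by the number of values of the variable being appended, which coincides with the documented ``no_of_values1`` factor whenever the variables have equally many values (as ``a`` and ``b`` do in the monks example).

```python
def lookup_index(values, sizes):
    """Row-major index of a combination of value indices.

    values: one value index per variable, first variable most significant;
            None stands for an unknown value (an assumption of this sketch).
    sizes:  number of values of each variable (hypothetical stand-in for
            no_of_values1, no_of_values2, ...).
    Returns -1 if any value is unknown, mirroring the two- and
    three-variable formulae in the docstring.
    """
    if any(v is None for v in values):
        return -1
    index = 0
    for value, size in zip(values, sizes):
        if not 0 <= value < size:
            raise ValueError("value index out of range")
        # Horner-style accumulation: shift by the size of this variable.
        index = index * size + value
    return index


# Two three-valued features: combinations 11, 12, 13, 21, ..., 33
# map to consecutive positions in the table.
order = [lookup_index((v1, v2), (3, 3)) for v1 in range(3) for v2 in range(3)]
```

The single-variable case differs in the docstring (an unknown value maps to the last table element rather than -1); the sketch does not model that.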
Index: orange/doc/Orange/rst/code/lookup-lookup.py
===================================================================
--- orange/doc/Orange/rst/code/lookup-lookup.py  (revision 8042)
+++ orange/doc/Orange/rst/code/lookup-lookup.py  (revision 8925)
@@ -12,29 +12,29 @@
 
 ab = Orange.data.variable.Discrete("a==b", values = ["no", "yes"])
-ab.getValueFrom = Orange.classification.lookup.ClassifierByLookupTable(ab, a, b,
+ab.get_value_from = Orange.classification.lookup.ClassifierByLookupTable(ab, a, b,
                     ["yes", "no", "no", "no", "yes", "no", "no", "no", "yes"])
 
 e1 = Orange.data.variable.Discrete("e==1", values = ["no", "yes"])
-e1.getValueFrom = Orange.classification.lookup.ClassifierByLookupTable(e1, e,
+e1.get_value_from = Orange.classification.lookup.ClassifierByLookupTable(e1, e,
                     ["yes", "no", "no", "no", "?"])
 
-table2 = table.select([a, b, ab, e, e1, table.domain.classVar])
+table2 = table.select([a, b, ab, e, e1, table.domain.class_var])
 
 for i in range(5):
-    print table2.randomexample()
+    print table2.random_example()
 
 for i in range(5):
-    ex = table.randomexample()
-    print "%s: ab %i, e1 %i " % (ex, ab.getValueFrom.getindex(ex),
-                                 e1.getValueFrom.getindex(ex))
+    ex = table.random_example()
+    print "%s: ab %i, e1 %i " % (ex, ab.get_value_from.get_index(ex),
+                                 e1.get_value_from.get_index(ex))
 
 # What follows is only for testing Orange...
 
-ab_c = ab.getValueFrom
-print ab_c.variable1.name, ab_c.variable2.name, ab_c.classVar.name
-print ab_c.noOfValues1, ab_c.noOfValues2
+ab_c = ab.get_value_from
+print ab_c.variable1.name, ab_c.variable2.name, ab_c.class_var.name
+print ab_c.no_of_values1, ab_c.no_of_values2
 print [x.name for x in ab_c.variables]
 
-e1_c = e1.getValueFrom
-print e1_c.variable1.name, e1_c.classVar.name
+e1_c = e1.get_value_from
+print e1_c.variable1.name, e1_c.class_var.name
 print [x.name for x in e1_c.variables]
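Reviewer sketch (not part of the changeset): the two hand-written tables in this script can be derived from a predicate by enumerating value combinations in row-major order, which is roughly what the ``lookup_from_function`` doctest in lookup.py suggests. The helper ``table_from_function`` below is a hypothetical plain-Python stand-in, not Orange's function, and the value lists ``"1"``..``"4"`` are assumed from the monks data as described in the documentation.

```python
from itertools import product


def table_from_function(value_lists, function, labels=("no", "yes")):
    """Evaluate `function` on every combination of values.

    itertools.product varies the last variable fastest, so the result is
    ordered 11, 12, 13, 21, ... with the first variable most significant.
    """
    return [labels[int(bool(function(combo)))] for combo in product(*value_lists)]


three = ["1", "2", "3"]
four = ["1", "2", "3", "4"]

# Table for the feature "a==b" over two three-valued features.
ab_table = table_from_function([three, three], lambda x: x[0] == x[1])

# Table for "e==1"; the trailing "?" is the entry used when e is unknown.
e1_table = table_from_function([four], lambda x: x[0] == "1") + ["?"]
```

Both results match the literal lists passed to ``ClassifierByLookupTable`` in the script.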
Index: orange/doc/Orange/rst/code/lookup-table.py
===================================================================
--- orange/doc/Orange/rst/code/lookup-table.py  (revision 8042)
+++ orange/doc/Orange/rst/code/lookup-table.py  (revision 8925)
@@ -10,5 +10,5 @@
 a, b, e = table.domain["a"], table.domain["b"], table.domain["e"]
 
-table_s = table.select([a, b, e, table.domain.classVar])
+table_s = table.select([a, b, e, table.domain.class_var])
 abe = Orange.classification.lookup.LookupLearner(table_s)
 
@@ -21,5 +21,5 @@
 
 for i in abe.sorted_examples[:10]:
-    print i, i.getclass().svalue
+    print i, i.get_class().svalue
 print
 
@@ -27,5 +27,5 @@
 abe2 = Orange.classification.lookup.LookupLearner(y2, [a, b, e], table)
 for i in abe2.sorted_examples[:10]:
-    print i, i.getclass().svalue
+    print i, i.get_class().svalue
 print
 
@@ -33,3 +33,3 @@
 abe2 = Orange.classification.lookup.LookupLearner(y2, [a, b], table)
 for i in abe2.sorted_examples:
-    print i, i.getclass().svalue
+    print i, i.get_class().svalue
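Reviewer sketch (not part of the changeset): the module docstring says the more complex lookup classifiers store a table and predict the class by matching an instance against the stored instances. The idea can be shown with a dictionary keyed by feature values. This is only an illustration under assumptions: ``learn_lookup`` is a hypothetical name, the rows are made-up toy data, and taking the majority class per combination is a choice of this sketch, not a statement about how ``LookupLearner`` is implemented.

```python
from collections import Counter, defaultdict


def learn_lookup(rows, n_features):
    """Map each observed feature combination to its most frequent class.

    rows: tuples whose first n_features items are feature values and whose
          next item is the class value.
    """
    counts = defaultdict(Counter)
    for row in rows:
        counts[tuple(row[:n_features])][row[n_features]] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


# Toy data (invented for illustration): features a, b and a class value.
rows = [
    ("1", "1", "yes"),
    ("1", "1", "yes"),
    ("1", "1", "no"),
    ("2", "3", "no"),
]
model = learn_lookup(rows, 2)

seen = model[("1", "1")]        # majority class among matching rows
unseen = model.get(("3", "3"))  # no matching row, so no prediction
```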