source: orange/Orange/doc/modules/orngLookup.htm @ 9671:a7b056375472

Revision 9671:a7b056375472, 5.5 KB checked in by anze <anze.staric@…>, 2 years ago (diff)

Moved orange to Orange (part 2)

<html><HEAD>
<LINK REL=StyleSheet HREF="../style.css" TYPE="text/css">
</HEAD>
<body>
<h1>orngLookup: Functions for Working with Classifiers That Use a Stored Example Table</h1>
<index name="modules+lookup classification">
<index name="classifiers/lookup classification">

<P>This module contains several functions for working with classifiers that use a stored example table for making predictions. There are four such classifiers: the most general stores an <CODE>ExampleTable</CODE>, and the other three are specialized and optimized for domains that contain only one, two or three attributes (besides the class attribute).</P>

<hr>

<H2>Functions</H2>

<H3>lookupFromBound(classVar, bound)</H3>

<P>This function constructs an appropriate lookup classifier for one, two or three attributes. If there are more, it returns <CODE>None</CODE>. The resulting classifier is of type <CODE>ClassifierByLookupTable</CODE>, <CODE>ClassifierByLookupTable2</CODE> or <CODE>ClassifierByLookupTable3</CODE>, with <CODE>classVar</CODE> and the bound set assigned as given.</P>

<P>If, for instance, <CODE>data</CODE> contains the Monk 1 dataset and you would like to construct a new feature from attributes <CODE>a</CODE> and <CODE>b</CODE>, you can call this function as follows.</P>

<XMP class="code">>>> import orange, orngLookup
>>> newvar = orange.EnumVariable()
>>> bound = [data.domain[name] for name in ["a", "b"]]
>>> lookup = orngLookup.lookupFromBound(newvar, bound)
>>> print lookup.lookupTable
<?, ?, ?, ?, ?, ?, ?, ?, ?>
</XMP>

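<P>The nine undefined entries printed above are consistent with one slot for each combination of the three values of <CODE>a</CODE> and the three values of <CODE>b</CODE>. The following pure-Python sketch illustrates how such a flat table could be sized and addressed. It does not use Orange: <CODE>make_empty_lookup</CODE> and <CODE>flat_index</CODE> are hypothetical helpers, and the row-major ordering is an assumption rather than a documented property of the lookup classifiers.</P>

```python
# Pure-Python illustration; it does not import Orange.
# make_empty_lookup and flat_index are hypothetical names.

def make_empty_lookup(value_counts):
    """Return a flat table with one undefined slot per value combination."""
    size = 1
    for count in value_counts:
        size *= count
    return ["?"] * size


def flat_index(indices, value_counts):
    """Map a tuple of value indices to a position in the flat table.

    Assumption: row-major order, so the last bound attribute varies fastest.
    """
    position = 0
    for index, count in zip(indices, value_counts):
        if not 0 <= index < count:
            raise IndexError("value index out of range")
        position = position * count + index
    return position


# Two bound attributes with three values each give nine slots.
table = make_empty_lookup([3, 3])
print(len(table))                    # 9
print(flat_index((1, 2), [3, 3]))    # 5
```

<P>Under this assumption, a table for three bound attributes would be addressed the same way, with one more multiplication per lookup.</P>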
<P>Function <CODE>orngLookup.lookupFromBound</CODE> initializes neither <CODE>newvar</CODE> nor the lookup table...</P>


<H3>lookupFromFunction(classVar, bound, function)</H3>

<P>... and that's exactly where <CODE>lookupFromFunction</CODE> differs from <CODE>lookupFromBound</CODE>. <CODE>lookupFromFunction</CODE> first calls <CODE>lookupFromBound</CODE> and then uses the function to initialize the lookup table. The other difference is that <CODE>lookupFromFunction</CODE> also accepts bound sets with more than three attributes. In that case, it constructs a <CODE>ClassifierByExampleTable</CODE>.
</P>

<P>The function receives the attribute values as integer indices and should return the integer index of the class value. The class variable must be properly initialized before the call.</P>

<P>As an exercise, let us construct a new attribute called <CODE>a=b</CODE> whose value will be "yes" when <CODE>a</CODE> and <CODE>b</CODE> are equal and "no" when they are not. We will then add the attribute to the dataset.</P>

<XMP class="code">>>> bound = [data.domain[name] for name in ["a", "b"]]
>>> newVar = orange.EnumVariable("a=b", values = ["no", "yes"])
>>> lookup = orngLookup.lookupFromFunction(newVar, bound, lambda x: x[0]==x[1])
>>> newVar.getValueFrom = lookup
>>> import orngCI
>>> data2 = orngCI.addAnAttribute(newVar, data)
>>> for i in data2[:30]:
...     print i
['1', '1', '1', '1', '1', '1', 'yes', '1']
['1', '1', '1', '1', '1', '2', 'yes', '1']
['1', '1', '1', '1', '2', '1', 'yes', '1']
['1', '1', '1', '1', '2', '2', 'yes', '1']
   ...
['2', '1', '2', '3', '4', '1', 'no', '0']
['2', '1', '2', '3', '4', '2', 'no', '0']
['2', '2', '1', '1', '1', '1', 'yes', '1']
['2', '2', '1', '1', '1', '2', 'yes', '1']
   ...
</XMP>

<P>The attribute was inserted using <CODE>orngCI.addAnAttribute</CODE>.
By setting <CODE>newVar.getValueFrom</CODE> to <CODE>lookup</CODE> we state that whenever domains are converted (either by <CODE>addAnAttribute</CODE> or elsewhere), <CODE>lookup</CODE> should be used to compute <CODE>newVar</CODE>'s value. (A bit off topic, but important: you should <B>never call <CODE>getValueFrom</CODE> directly</B>, but always call it through <CODE>computeValue</CODE>.)</P>



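<P>To make the role of the function argument concrete, here is a minimal pure-Python sketch of filling a lookup table by calling a function once for every combination of value indices. It is an illustration only, not Orange's implementation: <CODE>fill_from_function</CODE> is a hypothetical helper, and the enumeration order (last attribute varies fastest) is an assumption.</P>

```python
import itertools

# Pure-Python illustration; it does not import Orange.
# fill_from_function is a hypothetical name.

def fill_from_function(value_counts, function):
    """Call function on every tuple of value indices and collect class indices."""
    combos = itertools.product(*[range(count) for count in value_counts])
    # int() makes the class index explicit even when function returns a bool.
    return [int(function(combo)) for combo in combos]


class_values = ["no", "yes"]    # index 0 is "no", index 1 is "yes"
table = fill_from_function([3, 3], lambda x: x[0] == x[1])
labels = [class_values[index] for index in table]
print(table)    # [1, 0, 0, 0, 1, 0, 0, 0, 1]
```

<P>The comparison <CODE>x[0]==x[1]</CODE> yields a Python boolean, and <CODE>True</CODE> and <CODE>False</CODE> equal the integers 1 and 0, so the result can serve as an index into <CODE>["no", "yes"]</CODE>.</P>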
<H3>lookupFromExamples(examples [, weight])</H3>

<P>This function takes a set of examples (an <CODE>ExampleTable</CODE>, for instance) and turns it into a classifier. If there are one, two or three attributes and no ambiguous examples (examples are ambiguous if they have the same attribute values but different class values), it constructs an appropriate <CODE>ClassifierByLookupTable</CODE>. Otherwise, it returns a <CODE>ClassifierByExampleTable</CODE>.</P>

<XMP class="code">>>> lookup = orngLookup.lookupFromExamples(data)
>>> testExample = orange.Example(data.domain, ['3', '2', '2', '3', '4', '1', '?'])
>>> lookup(testExample)
<orange.Value 'y'='0'>
</XMP>


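<P>The ambiguity condition described above can be sketched in plain Python as follows. This is only an illustration of the idea: <CODE>lookup_from_examples</CODE> is a hypothetical helper that works on tuples rather than on an <CODE>ExampleTable</CODE>, and it does not model example weights or the behaviour of <CODE>ClassifierByExampleTable</CODE>.</P>

```python
# Pure-Python illustration; it does not import Orange.
# lookup_from_examples is a hypothetical name.

def lookup_from_examples(examples):
    """Build a dict from attribute-value tuples to class values.

    examples is an iterable of (attribute_values, class_value) pairs.
    Returns (table, ambiguous), where ambiguous is True if two examples
    share attribute values but disagree on the class.
    """
    table = {}
    ambiguous = False
    for attributes, class_value in examples:
        key = tuple(attributes)
        if key in table and table[key] != class_value:
            ambiguous = True    # same attribute values, different class
        else:
            table[key] = class_value
    return table, ambiguous


clean = [(("1", "1"), "yes"), (("1", "2"), "no")]
table, ambiguous = lookup_from_examples(clean)

conflicting = clean + [(("1", "2"), "yes")]
_, has_conflict = lookup_from_examples(conflicting)
print(ambiguous, has_conflict)    # False True
```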
<H3>printLookupFunction(func)</H3>

<P><CODE>printLookupFunction</CODE> returns a string with a lookup function in tab-delimited format. Argument <CODE>func</CODE> can be any of the classifiers mentioned above or an attribute whose <CODE>getValueFrom</CODE> points to one of these classifiers.</P>

<P>Module <CODE>orngLookup</CODE> sets the output for those classifiers using the orange output schema. This means that you don't need to call <CODE>printLookupFunction</CODE> directly. Use the <CODE>dump</CODE> and <CODE>write</CODE> functions instead.</P>

<P>For instance, if <CODE>lookup</CODE> is the classifier constructed in the example for <CODE>lookupFromFunction</CODE>, you can print it with</P>

<XMP class="code">>>> print lookup.dump("tab")
a      b      a=b
------ ------ ------
1      1      yes
1      2      no
1      3      no
2      1      no
2      2      yes
2      3      no
3      1      no
3      2      no
3      3      yes
</XMP>

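<P>As a rough illustration of the tab-delimited layout, the sketch below formats a lookup table in plain Python. It is not the code behind <CODE>printLookupFunction</CODE>: <CODE>format_lookup</CODE> is a hypothetical helper, it separates columns with real tab characters, and it omits the dashed separator row shown above.</P>

```python
import itertools

# Pure-Python illustration; it does not import Orange.
# format_lookup is a hypothetical name.

def format_lookup(names, value_lists, class_name, class_values, table):
    """Return one header row plus one row per attribute-value combination."""
    lines = ["\t".join(list(names) + [class_name])]
    combos = itertools.product(*value_lists)
    for combo, class_index in zip(combos, table):
        lines.append("\t".join(list(combo) + [class_values[class_index]]))
    return "\n".join(lines)


text = format_lookup(
    ["a", "b"],
    [["1", "2", "3"], ["1", "2", "3"]],
    "a=b",
    ["no", "yes"],
    [1, 0, 0, 0, 1, 0, 0, 0, 1],    # class indices, last attribute fastest
)
print(text)
```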
<P>Function <CODE>write</CODE> writes the same output to a file, either to a new one</P>
<XMP class="code">>>> lookup.write("tab", "d:\\t.txt")
</XMP>

<P>or to an already open file, which lets you write several things to the same file.</P>
<XMP class="code">>>> lookup.write("tab", f)
</XMP>

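<P>Accepting either a file name or an open file is a common pattern. The sketch below shows one way it could be done in plain Python. <CODE>write_text</CODE> is a hypothetical helper and is not how <CODE>write</CODE> is implemented in Orange; an in-memory buffer stands in for the open file <CODE>f</CODE>.</P>

```python
import io

# Pure-Python illustration; it does not import Orange.
# write_text is a hypothetical name.

def write_text(text, target):
    """Write text to target, which is a path or an open file-like object."""
    if hasattr(target, "write"):
        # Already open: append to it and leave closing to the caller.
        target.write(text)
    else:
        # A file name: create (or overwrite) the file and close it afterwards.
        with open(target, "w") as handle:
            handle.write(text)


# Writing twice to the same open object puts both pieces in one place.
buffer = io.StringIO()
write_text("a\tb\n", buffer)
write_text("1\t1\n", buffer)
```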
</body> </html>