source: orange/Orange/doc/reference/Value.htm @ 9671:a7b056375472

Revision 9671:a7b056375472, 12.5 KB checked in by anze <anze.staric@…>, 2 years ago (diff)

Moved orange to Orange (part 2)

<html>
<HEAD>
<LINK REL=StyleSheet HREF="../style.css" TYPE="text/css">
<LINK REL=StyleSheet HREF="style-print.css" TYPE="text/css" MEDIA=print>
</HEAD>

<BODY>
<h1>Values of attributes</h1>
<index name="attribute values">

<p><code>Orange.<INDEX name="classes/Value">Value</code> contains a value of an attribute. The value can be discrete, continuous or of some other type, such as a discrete or continuous distribution, or a string. The value can also contain an attribute descriptor of type <code>Variable</code>. This enables several operations which are otherwise unavailable.</P>

<p>When taking a value from an example (e.g. <code>value = example[2]</code>), what you get is a copy of the value, not a reference. Changing the copy does not change the example from which it was taken.</P>
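<p>The copy-not-reference behaviour can be pictured with a small standalone sketch. It does not import Orange; the classes <code>Cell</code> and <code>Row</code> are hypothetical stand-ins for a value and an example, and writing back through item assignment is an assumption of the sketch.</p>

```python
import copy


class Cell:
    """Hypothetical stand-in for a value stored inside an example."""

    def __init__(self, value):
        self.value = value


class Row:
    """Hypothetical stand-in for an example; indexing hands out copies."""

    def __init__(self, cells):
        self._cells = list(cells)

    def __getitem__(self, index):
        # Return a copy, so the caller cannot modify the stored cell in place.
        return copy.copy(self._cells[index])

    def __setitem__(self, index, value):
        # Assumption: changing the row needs an explicit item assignment.
        self._cells[index] = Cell(value)


row = Row([Cell("plum"), Cell(105.0), Cell("lemon")])
taken = row[2]
taken.value = "apple"         # changes only the copy
unchanged = row[2].value      # the row still holds "lemon"
row[2] = "apple"              # explicit assignment changes the row
changed = row[2].value
```

<p>After running the sketch, <code>unchanged</code> still holds the original symbol, while <code>changed</code> reflects the explicit assignment.</p>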

<hr>

<p class=section>Attributes</p>
<DL class=attributes>
<DT>value</DT>
<DD>The attribute's value. Values of discrete and continuous attributes are internally stored as integers and floating point numbers, respectively. This field, however, "contains" floating point numbers for continuous attributes and strings for discrete ones. If the attribute descriptor (field <CODE>variable</CODE>) is known, the string is the symbolic value of the attribute; otherwise it contains the index enclosed in "&lt;" and "&gt;". If the value is continuous or unknown, no descriptor is needed. For unknown values, the result is the string '?', '~' or '.' for don't know, don't care and other, respectively.</DD>

<DT>svalue</DT>
<DD>While the previous field (<CODE>value</CODE>) can store only integers and floats, <CODE>svalue</CODE> can contain objects of any type derived from <code>SomeValue</code>, such as <code>StringValue</code>, <code>DiscDistribution</code> or <code>ContDistribution</code>.</DD>

<DT>variable</DT>
<DD>An attribute descriptor associated with the value. Can be <code>None</code>.</DD>

<DT>varType <SPAN class=normalfont>(read only)</SPAN></DT>
<DD>The value's type. It can be <code>orange.VarTypes.Discrete</code> (1), <code>orange.VarTypes.Continuous</code> (2) or <code>orange.VarTypes.OtherVar</code> (3).</DD>

<DT>valueType <SPAN class=normalfont>(read only)</SPAN></DT>
<DD>Tells whether the value is regular (known) or special. This field can be <code>orange.ValueTypes.Regular</code> (0), <code>orange.ValueTypes.DC</code> (1), <code>orange.ValueTypes.DK</code> (2), or any value from 3 to 255.</DD>
</DL>
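<p>The rules for what the <code>value</code> field shows can be summarized in a few lines of plain Python. This is only a sketch of the description above, not Orange's implementation; the function <code>shown_value</code> and its parameters are hypothetical, and the numeric codes are the ones quoted for <code>varType</code> and <code>valueType</code>.</p>

```python
DISCRETE, CONTINUOUS = 1, 2     # varType codes quoted above
REGULAR, DC, DK = 0, 1, 2       # valueType codes quoted above


def shown_value(var_type, value_type, raw, symbols=None):
    """Return what the `value` field would show for the given state.

    `raw` is the internally stored number; `symbols` plays the role of
    the descriptor's list of values (None when no descriptor is known).
    """
    if value_type == DK:
        return "?"
    if value_type == DC:
        return "~"
    if value_type != REGULAR:
        return "."                  # any other special type
    if var_type == CONTINUOUS:
        return float(raw)           # no descriptor needed
    if symbols is not None:
        return symbols[raw]         # symbolic value from the descriptor
    return "<%d>" % raw             # index only, no symbolic name


fruit_symbols = ["plum", "apple", "lemon"]
with_descriptor = shown_value(DISCRETE, REGULAR, 2, fruit_symbols)
without_descriptor = shown_value(DISCRETE, REGULAR, 2)
```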

<p class=section>Methods</p>
<DL class=attributes>
<DT>&lt;constructors&gt;</DT>
<DD>There are many ways to construct a Value. See the examples section.</DD>

<DT>&lt;casting to numerical types&gt;</DT>
<DD>A value can be cast to an int, float or long. Casting to numerical types requires the value to be known. For nominal values, the result of a numerical cast is the index of the value.</DD>

<DT>&lt;casting to string&gt;</DT>
<DD>For nominal values, it returns the symbolic value if the descriptor is defined. If not, it returns the index in angle brackets (like "&lt;2&gt;"). For continuous values, a string representation of the value is returned. Symbols "?", "~" and "." are used for don't-knows, don't-cares and other types of unspecified values, respectively. <code>StringValues</code> can be cast to strings as well.</DD>

<DT>&lt;casting to bool&gt;</DT>
<DD>Values can be used in conditional expressions. A <CODE>Value</CODE> is <CODE>true</CODE> if it is known (i.e. not special).</DD>


<DT>&lt;arithmetic operations&gt;</DT>
<DD>Continuous values can be added, subtracted, multiplied, divided and raised to powers. Negative and absolute values can also be computed. Results of arithmetic operations are not values but floats. A value can be added to (subtracted from...) another value, a float or an integer.
</DD>


<DT>&lt;comparison&gt;</DT>
<DD>Values can be compared. Both values must be of the same type (discrete or continuous). Continuous values are compared as expected. Discrete values are compared by indices, not alphabetically. This enables ordered attributes: values "tiny", "little", "big" and "huge" should be compared as listed, not alphabetically. All discrete attributes are treated here as ordinal, not nominal; the descriptor's flag <code>ordered</code> is ignored. It is possible to compare two values of the same attribute (i.e. with the same descriptor) or two values of different attributes. When the values belong to different attributes, Orange tries different ways to compare them; see the examples below for details. Both discrete and continuous values can be compared to strings, provided that the strings can be converted to the attribute's values. A value of the attribute above cannot be compared with the string "small", and a continuous value cannot be compared with "parrot". Discrete values can be compared with integers, which are treated as indices. Continuous values can be compared with any numeric object. Comparing two undefined values yields equality, and comparing an undefined value with a defined one yields an error.
</DD>

<DT>native()</DT>
<DD>Returns the same as the attribute <CODE>value</CODE> - a string for discrete and undefined values, a floating point number for continuous ones.</DD>

<DT>firstValue(), nextValue(), randomValue()</DT>
<DD>These functions set the value to the first value, to the next value (from the current one) or to a random value. This is not always possible - the value must have a descriptor and the descriptor must support the methods. The functions return <CODE>true</CODE> on success and <CODE>false</CODE> on failure.</DD>

<DT>isDK(), isDC(), isSpecial()</DT>
<DD>Return 1 if the value is don't know, don't care, or of any special type (these two or any other), respectively. <code>val.isSpecial()</code> is thus equivalent to <code>val.valueType!=0</code> and to <CODE>not val</CODE>.</DD>
</DL>
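<p>How the bool cast relates to the special-type tests can be illustrated without Orange. The class <code>SketchValue</code> below is a hypothetical model built only from the descriptions above (a value is true if it is known; <code>valueType</code> codes 0, 1 and 2 mean regular, don't care and don't know); it is not Orange's <code>Value</code>.</p>

```python
REGULAR, DC, DK = 0, 1, 2   # valueType codes quoted in the Attributes section


class SketchValue:
    """Hypothetical model of the special-value tests."""

    def __init__(self, value_type=REGULAR):
        self.valueType = value_type

    def isDK(self):
        return self.valueType == DK

    def isDC(self):
        return self.valueType == DC

    def isSpecial(self):
        # True for don't know, don't care and any other special type.
        return self.valueType != REGULAR

    def __bool__(self):
        # A value is true exactly when it is known.
        return not self.isSpecial()

    __nonzero__ = __bool__   # Python 2 name of the same hook


known = SketchValue()
dont_know = SketchValue(DK)
other_special = SketchValue(9)
```

<p>In this model a known value passes an <code>if val:</code> test, while <code>dont_know</code> and <code>other_special</code> do not; only <code>isDK()</code> and <code>isDC()</code> tell the special kinds apart.</p>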

<hr>

<H3>Examples</H3>

<H4>Construction</H4>
<A name="construction"></A>

<p>Let us first assume that you have defined two attribute descriptors.</p>
<xmp class="code">>>> fruit = orange.EnumVariable("fruit", values = ["plum", "apple", "lemon"])
>>> iq = orange.FloatVariable("iq")
</xmp>
<p>Let us now define several values.</p>
<xmp class="code">>>> lm = orange.Value(fruit, "lemon")
>>> ap = orange.Value(fruit, 1)
>>> un = orange.Value(fruit)
>>>
>>> Mary = orange.Value(iq, "105")
>>> Harry = orange.Value(iq, 80)
>>> Dick = orange.Value(iq)
</xmp>
<p>When a descriptor is given, as in the cases above, the attribute's value can be converted from a string. For an <code>EnumVariable</code>, this can be a symbolic value (as for <code>lm</code>); for continuous attributes, the string should contain a number (as for <code>Mary</code>). Discrete values can also be given as indices.</p>
<xmp class="code">>>> print ap
apple
</xmp>
<p>The value of <code>ap</code> is <code>apple</code>, since <code>apple</code> is the second fruit in the list of values. Thus, saying "<code>ap = orange.Value(fruit, 1)</code>" is equivalent to "<code>ap = orange.Value(fruit, fruit.values[1])</code>". Values of continuous attributes can be given numerically, as for <code>Harry</code>.</P>

<p>What about <code>un</code> and <code>Dick</code>? Those two <code>Value</code>s correspond to attributes <code>fruit</code> and <code>iq</code>, but the values are not specified. More accurately, they are unknown.</P>

<p>We can also omit the descriptor.</p>
<xmp class="code">>>> sf = orange.Value(2)
>>> Sally = orange.Value(118.0)
>>>
>>> print sf
<2>
>>> print Sally
118.000000
</xmp>

<p>Note that the outputs are different. There's a reason for it. <code>sf</code> is discrete and <code>Sally</code> is continuous. The type of the argument - int or float - defines the type of the value constructed. But why &lt;2&gt;? It means "the third value" of some discrete attribute, but there's no descriptor so we have no symbolic name for it. But we can assign a descriptor:</p>
<xmp class="code">>>> sf.variable = fruit
>>> print sf
lemon
</xmp>
<p>Before delving into the <code>Value</code>'s fields, let us list some further - working and non-working - ways to construct a <code>Value</code>.</p>
<xmp class="code">>>> m = orange.Value("plum")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot convert 'plum' to value of an unknown attribute
</xmp>
<p>This is hardly surprising when you think about it. The string "plum" cannot be converted to an index without having the attribute descriptor. The only two strings which can be passed without a descriptor are '?' and '~', Orange's representations of "don't know" and "don't care". The value will be discrete.</P>
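<p>The way the argument's type decides what gets constructed when no descriptor is given can be sketched as a small dispatch function. It is a plain-Python illustration of the rules described so far; the name <code>classify_argument</code> and the returned tuples are hypothetical, and nothing here calls Orange.</p>

```python
def classify_argument(arg):
    """Return (kind, state) for an argument passed without a descriptor."""
    if isinstance(arg, int):
        return ("discrete", "regular")       # an index, shown as <n>
    if isinstance(arg, float):
        return ("continuous", "regular")
    if arg == "?":
        return ("discrete", "don't know")
    if arg == "~":
        return ("discrete", "don't care")
    # Any other string would need a descriptor to be turned into an index.
    raise TypeError(
        "cannot convert %r to value of an unknown attribute" % (arg,)
    )


kind_of_two = classify_argument(2)
kind_of_sally = classify_argument(118.0)
```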

<p>How do we construct a continuous unknown value? You'll probably never need it, but, well, here it is.</p>
<xmp class="code">>>> udv = orange.Value(orange.VarTypes.Continuous, orange.ValueTypes.DK)
</xmp>
<p>This says that <code>udv</code> is a continuous value of type "don't know". Replace DK with DC and you have a don't-care value. Replace Continuous with Discrete and you have a discrete value.</P>

<p>There's another way to construct a <code>Value</code>: you can pass an instance of any class derived from <code>SomeValue</code> to the constructor. There are three such classes at the moment: <code>StringValue</code>, <code>DiscDistribution</code> and <code>ContDistribution</code>.</p>

<xmp class="code">>>> city = orange.Value(orange.StringValue("Cicely"))
>>> print city
Cicely
</xmp>

<p>There's another temptation you might have:</p>
<xmp class="code">>>> val = orange.Value(fruit)
>>> val = "plum"
</xmp>

<p>This actually works, but not as you might wish. After the second line, <code>val</code> becomes an ordinary string, not an <code>orange.Value</code>. What you can do is</p>
<xmp class="code">>>> val = orange.Value(fruit)
>>> val.value = "plum"
</xmp>
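<p>The difference between the two snippets is ordinary Python name binding and has nothing to do with Orange itself. The following sketch shows it with a hypothetical stand-in class <code>Holder</code> that has a settable <code>value</code> field.</p>

```python
class Holder:
    """Hypothetical stand-in for an object with a settable `value` field."""

    def __init__(self):
        self.value = None


val = Holder()
kept = val                          # second name for the same object
val = "plum"                        # rebinding: val now names a plain str
rebound_type = type(val).__name__

val = kept
val.value = "plum"                  # field assignment: same object, new content
```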

<H4>Casting, Arithmetic and Comparison</H4>

<p>There's not much to tell about casting and arithmetic since both work exactly as you would expect. Well, probably - it depends on whether you meant to apply arithmetic operations to attributes other than continuous ones. You cannot do that; you cannot add <code>lemon</code> to an <code>apple</code>.</P>
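<p>As a rough model of the arithmetic rules (results are floats, and the other operand may be a value, an int or a float), consider the sketch below. <code>ContinuousSketch</code> is a hypothetical class written for this illustration, not Orange's implementation, and it covers only a few of the operators.</p>

```python
class ContinuousSketch:
    """Hypothetical model of a continuous value."""

    def __init__(self, number):
        self._number = float(number)

    def __float__(self):
        return self._number

    def __add__(self, other):
        # `other` may be another sketch value, an int or a float.
        return self._number + float(other)

    __radd__ = __add__              # addition is symmetric

    def __sub__(self, other):
        return self._number - float(other)

    def __rsub__(self, other):
        return float(other) - self._number

    def __neg__(self):
        return -self._number

    def __abs__(self):
        return abs(self._number)


harry = ContinuousSketch(80)
mary = ContinuousSketch(105)
gap = mary - harry                  # a plain float, not a value
shifted = 10 + harry                # an int on the left works as well
```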

<p>When comparing values, you don't need to convert them into numbers. You can simply compare them to builtin types:</p>

<xmp class="code">>>> Harry>80
0
>>> Harry>=80
1
</xmp>

<p>More often, you will check values within examples. We're skipping a part of the documentation, but if the attribute <code>iq</code> appears in the domain of an example table <code>tab</code>, you can print the examples with lower <code>iq</code>'s by</p>

<xmp class="code">>>> for e in tab:
...    if e[iq]<90:
...        print e
</xmp>

<p>Comparing nominal values with strings is just as simple, except that strings are not compared alphabetically but by indices. Strings must be legal values of the attribute:</p>
<xmp class="code">>>> lm=="melon"
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
Exception: Attribute 'fruit' has no value 'melon'
</xmp>
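<p>Comparison by index rather than by alphabet can be modelled in plain Python as follows. <code>NominalSketch</code> is a hypothetical class for illustration only; it compares against strings and raises <code>ValueError</code>, whereas the session above shows Orange raising a generic exception.</p>

```python
class NominalSketch:
    """Hypothetical model of a discrete value that has a descriptor."""

    def __init__(self, name, symbols, symbol):
        self.name = name
        self.symbols = list(symbols)
        self.index = self._index_of(symbol)

    def _index_of(self, symbol):
        # A string is usable only if it is one of the attribute's values.
        if symbol not in self.symbols:
            raise ValueError(
                "Attribute %r has no value %r" % (self.name, symbol)
            )
        return self.symbols.index(symbol)

    def __eq__(self, other):
        return self.index == self._index_of(other)

    def __lt__(self, other):
        return self.index < self._index_of(other)


size = NominalSketch("size", ["tiny", "little", "big", "huge"], "little")
by_index = size < "big"             # index 1 < index 2
alphabetical = "little" < "big"     # plain string comparison disagrees
```

<p>Here <code>by_index</code> is true although "little" sorts after "big" alphabetically, which is the point of comparing by indices.</p>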

<p>When comparing values of different nominal attributes, Orange tries converting string representations from one attribute to another. Let us have a three- and a four-valued attribute.</p>
<xmp class="code">>>> deg3 = orange.EnumVariable("deg3", values=["little", "medium", "big"])
>>> deg4 = orange.EnumVariable("deg4", values=["tiny", "little", "big", "huge"])
>>> val3 = orange.Value(deg3)
>>> val4 = orange.Value(deg4)
</xmp>

<p>When both attributes have the same value, such as "little" or "big", there's no problem. A more interesting case is</p>
<xmp class="code">>>> val3.value = "medium"
>>> val4.value = "little"
</xmp>
<p>The values can be compared since it is known (from <code>deg3</code>) that "little" is less than "medium". But what about the following:</p>
<xmp class="code">>>> val3.value = "medium"
>>> val4.value = "huge"
</xmp>
<p><code>Medium</code> and <code>huge</code> can't be compared since they don't (both) appear in the same attribute. (There is a way: from attribute <code>deg4</code> we see that "huge" is bigger than "big", which is in turn bigger than <code>medium</code> (from <code>deg3</code>). But Orange is not as smart as we are.)</P>
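<p>One way to picture the rule is a lookup that succeeds only when a single attribute lists both symbols. The sketch below is an assumption about the behaviour described above, written in plain Python; the function <code>index_pair</code> is hypothetical and is not how Orange is implemented.</p>

```python
def index_pair(symbols_a, symbol_a, symbols_b, symbol_b):
    """Return the indices of both symbols within one attribute's values.

    Assumed rule: use the first value list that contains both symbols;
    if neither does, the two values cannot be compared.
    """
    for symbols in (symbols_a, symbols_b):
        if symbol_a in symbols and symbol_b in symbols:
            return symbols.index(symbol_a), symbols.index(symbol_b)
    raise TypeError("values cannot be compared: no attribute lists both")


deg3_values = ["little", "medium", "big"]
deg4_values = ["tiny", "little", "big", "huge"]

# "medium" and "little" both appear in deg3, so indices 1 and 0 are used.
medium_vs_little = index_pair(deg3_values, "medium", deg4_values, "little")
```

<p>Calling <code>index_pair(deg3_values, "medium", deg4_values, "huge")</code> raises <code>TypeError</code> in this sketch, mirroring the case that cannot be compared.</p>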

<p>Orange requires that both attributes have the same ordering.</p>
<xmp class="code">>>> degb = orange.EnumVariable(values=["black", "white"])
>>> degd = orange.EnumVariable(values=["white", "black"])
>>> print orange.Value(degb, "white") == orange.Value(degd, "white")
1
>>> print orange.Value(degb, "black") < orange.Value(degd, "white")
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
TypeError: Value.compare: values are of different types and have different orders
</xmp>

<p>The first test succeeds. <code>White</code> equals <code>white</code> for both attributes. The second reports an error since the two attributes rank <code>black</code> and <code>white</code> differently.</p>
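<p>A check of this kind could look like the sketch below. It is an assumption about what "the same ordering" means (the symbols shared by both attributes appear in the same relative order), written in plain Python; the functions <code>same_order</code> and <code>less_than</code> are hypothetical.</p>

```python
def same_order(symbols_a, symbols_b):
    """True if the symbols shared by both lists keep the same relative order."""
    shared = set(symbols_a) & set(symbols_b)
    in_a = [s for s in symbols_a if s in shared]
    in_b = [s for s in symbols_b if s in shared]
    return in_a == in_b


def less_than(symbols_a, symbol_a, symbols_b, symbol_b):
    """Compare two symbols by index in symbols_a, refusing conflicting orders.

    For simplicity this sketch expects symbol_b to appear in symbols_a too.
    """
    if not same_order(symbols_a, symbols_b):
        raise TypeError("values have different orders")
    return symbols_a.index(symbol_a) < symbols_a.index(symbol_b)


degb_values = ["black", "white"]
degd_values = ["white", "black"]
orders_agree = same_order(degb_values, degd_values)
```

<p>Equality needs no ordering, so comparing the symbols themselves is enough for the first test; only ordering comparisons such as <code>less_than</code> have to reject attributes whose orders conflict.</p>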
</BODY>