source: orange/docs/reference/rst/Orange.data.instance.rst @ 9524:c806ca0fa3a9

Revision 9524:c806ca0fa3a9, 11.6 KB checked in by janezd <janez.demsar@…>, 2 years ago (diff)

Added documentation about multiple classes

.. py:currentmodule:: Orange.data

=============================
Data instances (``Instance``)
=============================

Class `Orange.data.Instance` holds data instances. Each data instance
corresponds to a domain, which defines its length, data types and
values for symbolic indices.

--------
Features
--------

The data instance is described by a list of features, defined by the
domain descriptor. Instances support indexing with integer indices,
strings or variable descriptors.

Since "age" is the first attribute in the dataset lenses, the
statements below are equivalent::

    >>> age = data.domain["age"]
    >>> example = data[0]
    >>> print example[0]
    young
    >>> print example[age]
    young
    >>> print example["age"]
    young

Negative indices do not work as usual in Python, since they return
the values of meta attributes.

The last element of a data instance is the class label, if it
exists. It should be accessed using :obj:`get_class` and
:obj:`set_class`.

Data instances can be traversed using a for loop.

The list has a fixed length, determined by the domain to which the
instance corresponds.

---------------
Meta attributes
---------------

Meta attributes provide a way to attach additional information to
data instances. They are treated specially: they are not used for
learning, but can carry additional information such as the name of a
patient or the number of times the instance was misclassified during
some test procedure. The most common additional information is the
instance's weight.

In contrast to ordinary features, instances from the same domain do
not need to have the same meta attributes. Meta attributes are hence
not addressed by positions, but by their ids, which are represented
by negative indices. Ids are generated by the function
:obj:`Orange.data.variable.new_meta_id()`. Ids can be reused for
multiple domains.

If ordinary features resemble lists, meta attributes can be seen as
dictionaries.
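
The list-plus-dictionary picture can be sketched in plain Python. The
class ``ToyInstance`` below is hypothetical and only illustrates the
idea; it is not how Orange implements instances, and the id ``-2`` and
the value ``0.84`` are made up for the illustration::

    class ToyInstance(object):
        """Hypothetical stand-in for a data instance (not Orange's class)."""

        def __init__(self, values, metas=None):
            self.values = list(values)      # ordinary features, addressed by position
            self.metas = dict(metas or {})  # meta values, addressed by negative id

        def __getitem__(self, index):
            # Negative indices are meta ids, not positions counted from the end.
            if index < 0:
                return self.metas[index]
            return self.values[index]


    inst = ToyInstance(["young", "myope", "no", "reduced", "none"], {-2: 0.84})
    first_value = inst[0]    # ordinary feature at position 0
    meta_value = inst[-2]    # meta value stored under id -2

In this sketch, asking for an id that was never attached (for example
``inst[-1]``) raises a ``KeyError``, just as a missing dictionary key
would.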

The domain descriptor can, but does not need to, know about
meta descriptors. See the documentation on :obj:`Orange.data.Domain` for
more on that.

If there is a particular descriptor associated with the meta attribute
for the domain, the attribute or its name can also be used for
indexing. When registering meta attributes with domains, it is
recommended to use the same id for the same attribute in all domains.

Meta values can also be loaded from files in tab-delimited format.

Meta attributes are often used as weights. Many procedures, such as
learning algorithms, accept the id of the meta attribute defining the
weights of instances as an additional argument besides the data.

The following example adds a meta attribute with a random value to
each data instance.

.. literalinclude:: code/instance-metavar.py
    :lines: 1-

The code prints out something like::

    ['young', 'myope', 'no', 'reduced', 'none'], {-2:0.84}

The data instance now consists of two parts: ordinary features, which
resemble a list since they are addressed by positions (e.g. the first
value is "young"), and meta values, which are more like dictionaries,
where the id (-2) is a key and 0.84 is a value (of type
:obj:`Orange.data.Value`).

To tell the learning algorithm to use the weights, the id needs to be
passed along with the data::

    bayes = orange.BayesLearner(data, id)

Many other functions accept weights in a similar fashion.

Code::

    print orange.getClassDistribution(data)
    print orange.getClassDistribution(data, id)

prints out::

    <15.000, 5.000, 4.000>
    <9.691, 3.232, 1.969>
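
The difference between the two printed distributions can be illustrated
without Orange. The sketch below assumes that a weighted distribution
simply lets each instance contribute its weight instead of 1; the
function ``class_distribution`` and the sample data are hypothetical::

    from collections import defaultdict


    def class_distribution(instances, weight_id=None):
        """Sum weights per class; each instance counts as 1.0 if no id is given.

        Each instance is a (class_label, metas) pair, where metas maps meta
        ids to values. This layout is made up for the illustration.
        """
        totals = defaultdict(float)
        for label, metas in instances:
            weight = 1.0 if weight_id is None else metas[weight_id]
            totals[label] += weight
        return dict(totals)


    toy_data = [
        ("none", {-2: 0.5}),
        ("none", {-2: 0.25}),
        ("soft", {-2: 1.0}),
    ]
    unweighted = class_distribution(toy_data)      # counts instances
    weighted = class_distribution(toy_data, -2)    # sums the weights under id -2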

Registering the meta attribute changes how the data instance is
printed out and how it can be accessed::

    w = orange.FloatVariable("w")
    data.domain.addmeta(id, w)

The meta attribute can now be indexed just like ordinary features. The
following three statements are equivalent::

    print data[0][id]
    print data[0][w]
    print data[0]["w"]

Another consequence of registering the meta attribute is that it
allows for conversion from Python native types::

    ok = orange.EnumVariable("ok?", values=["no", "yes"])
    ok_id = orange.newmetaid()
    data.domain.addmeta(ok_id, ok)
    data[0][ok_id] = "yes"

The last line would fail if the attribute were not registered, since
Orange would not know which variable descriptor to use to convert the
string "yes" to an attribute value.

-------
Hashing
-------

Data instances compute hashes using CRC32 and can thus be used as
keys in dictionaries or stored in Python sets.
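
What hashability buys can be shown with a small stand-in. The class
below is hypothetical; it hashes its values with ``zlib.crc32``, but
the way Orange serializes values before computing the CRC is not
described here, so the ``repr``-based encoding is purely an
assumption::

    import zlib


    class HashableToyInstance(object):
        """Hypothetical instance whose hash is a CRC32 of its values."""

        def __init__(self, values):
            self.values = tuple(values)

        def __eq__(self, other):
            return (isinstance(other, HashableToyInstance)
                    and self.values == other.values)

        def __ne__(self, other):
            return not self == other

        def __hash__(self):
            # Assumed encoding: CRC32 over the textual form of the values.
            return zlib.crc32(repr(self.values).encode("utf-8")) & 0xffffffff


    a = HashableToyInstance(["young", "myope"])
    b = HashableToyInstance(["young", "myope"])
    c = HashableToyInstance(["young", "hypermetrope"])
    unique = set([a, b, c])    # a and b collapse into one element
    counts = {a: 1}
    counts[b] += 1             # b finds the entry stored under a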

.. class:: Instance

    .. attribute:: domain

        The domain to which the data instance corresponds. This
        attribute is read-only.

    .. method:: __init__(domain[, values])

        Construct a data instance with the given domain and initialize
        the values. Values should be given as a list containing
        objects that can be converted into values of corresponding
        variables; generally, they can be given as strings and
        integer indices (for discrete variables) or numbers (for
        continuous variables), and also as instances of
        :obj:`Orange.data.Value`.

        If values are omitted, they are set to unknown.

        :param domain: domain descriptor
        :type domain: Orange.data.Domain
        :param values: A list of values
        :type values: list

        The following example loads data on lenses and constructs
        another data instance from the same domain.

        .. literalinclude:: code/instance-construct.py
            :lines: 1-5

        The same can be done using other representations of values.

        .. literalinclude:: code/instance-construct.py
            :lines: 7-8

    .. method:: __init__([domain ,] instance)

        Construct a new data instance as a shallow copy of the
        original. If a domain descriptor is given, the instance is
        converted; conversion can add or remove variables, including
        transformations such as discretization.

        :param domain: domain descriptor
        :type domain: Orange.data.Domain
        :param instance: Data instance
        :type instance: :obj:`Instance`

        The following example constructs a reduced domain and a data
        instance in this domain. ::

            domain_red = Orange.data.Domain(["age", "lenses"], domain)
            inst_red = Orange.data.Instance(domain_red, inst)

    .. method:: __init__(domain, instances)

        Construct a new data instance for the given domain, where
        attribute values are taken from the provided instances, using
        both their ordinary features and meta attributes, which are
        registered with their corresponding domains. Meta attributes
        which appear in the provided instances and do not appear in
        the domain of the new instance are copied as well.

        :param domain: domain descriptor
        :type domain: Orange.data.Domain
        :param instances: data instances
        :type instances: list of Orange.data.Instance

        .. literalinclude:: code/instance_merge.py
            :lines: 3-

        The new domain consists of variables from `data1` and `data2`:
        `a1`, `a3` and `m1` are ordinary features, and `m2` and `a2`
        are meta attributes in the new domain. `m2` has the
        same meta attribute id as it has in `data1`, while `a2` gets a
        new meta id. In addition, the new domain has two new
        attributes, `n1` and `n2`.

        Here is the output::

            First example:  [1, 2], {"m1":3, "m2":4}
            Second example:  [1, 2.5], {"m1":3, "m3":4.5}
            Merge:  [1, 2.5, 3, ?], {"a2":2, "m2":4, -5:4.50, "n2":?}

        Since attributes `a1` and `m1` appear in the domains of both
        original instances, the new instance can only be constructed if
        these values match. `a3` comes from the second instance, and
        meta attributes `a2` and `m2` come from the first one. The
        meta attribute `m3` is also copied from the second instance;
        since it is not registered within the new domain, it is
        printed out with an id (-5) instead of with a name. Values of
        the two new attributes are left undefined.
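
        The matching rule can be sketched in plain Python. The function
        ``merge_values`` below is hypothetical and works on dictionaries
        that map attribute names to values; it only illustrates the rule
        that values shared by several instances must agree, and is not
        Orange's implementation::

            def merge_values(names, *instances):
                """Collect values for the given names from name -> value dicts.

                Raises ValueError if two instances disagree on a shared
                attribute; names found in no instance stay None (unknown).
                """
                merged = {}
                for name in names:
                    merged[name] = None
                    for inst in instances:
                        if name not in inst:
                            continue
                        if merged[name] is not None and merged[name] != inst[name]:
                            raise ValueError("conflicting values for %r" % name)
                        merged[name] = inst[name]
                return merged


            first = {"a1": 1, "a2": 2, "m1": 3, "m2": 4}
            second = {"a1": 1, "a3": 2.5, "m1": 3, "m3": 4.5}
            merged = merge_values(["a1", "a3", "m1", "n1"], first, second)

        The sample values mirror the output above; ``n1`` appears in
        neither dictionary, so it stays unknown.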

    .. method:: native([level])

        Convert the instance into an ordinary Python list. If the
        optional argument is 1 (default), the result is a list of
        objects of :obj:`Orange.data.Value`. If it is 0, it contains
        pure Python objects, that is, strings for discrete variables
        and numbers for continuous ones.

    .. method:: compatible(other, ignore_class=0)

        Return :obj:`True` if the two instances are compatible, that
        is, equal in all features which are not missing in one of
        them. The optional second argument can be used to omit the
        class from comparison.
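
        The notion of compatibility can be sketched as follows. The
        function is hypothetical and operates on plain lists, with
        ``None`` standing in for a missing value; it assumes the class
        is stored as the last element::

            def compatible(values, other, ignore_class=False):
                """Return True if two equally long lists agree wherever both are known."""
                if ignore_class:
                    # Assumes the class is stored as the last element.
                    values, other = values[:-1], other[:-1]
                return all(
                    a is None or b is None or a == b
                    for a, b in zip(values, other)
                )


            same = compatible(["young", None, "no", "none"],
                              ["young", "myope", "no", "none"])
            differ = compatible(["young", "myope", "no", "none"],
                                ["young", "myope", "no", "soft"])
            ignoring = compatible(["young", "myope", "no", "none"],
                                  ["young", "myope", "no", "soft"],
                                  ignore_class=True)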

    .. method:: get_class()

        Return the instance's class as :obj:`Orange.data.Value`.

    .. method:: get_classes()

        Return the values of multiple classes as a list of
        :obj:`Orange.data.Value`.

    .. method:: set_class(value)

        Set the instance's class.

        :param value: the new instance's class
        :type value: :obj:`Orange.data.Value`, number or string

    .. method:: set_classes(values)

        Set the values of multiple classes.

        :param values: a list of values; the length must match the number of multiple classes
        :type values: list

    .. method:: get_metas([key_type])

        Return a dictionary containing meta values of the data
        instance. The key type can be :obj:`int` (default), :obj:`str`
        or :obj:`Orange.data.variable.Variable` and determines whether
        the dictionary keys will be meta ids, variable names or
        variable descriptors. In the latter two cases, only registered
        attributes are returned. ::

            data = orange.ExampleTable("inquisition2")
            example = data[4]
            print example.getmetas()
            print example.getmetas(int)
            print example.getmetas(str)
            print example.getmetas(orange.Variable)

        :param key_type: the key type; either :obj:`int`, :obj:`str` or :obj:`Orange.data.variable.Variable`
        :type key_type: :obj:`type`

    .. method:: get_metas(optional, [key_type])

        Similar to the above, but return a dictionary containing only
        those meta values of the data instance which are optional, or
        only those which are not.

        :param optional: tells whether to return optional or non-optional attributes
        :type optional: :obj:`bool`
        :param key_type: the key type; either :obj:`int`, :obj:`str` or :obj:`Orange.data.variable.Variable`
        :type key_type: :obj:`type`

    .. method:: has_meta(meta_attr)

        Return :obj:`True` if the data instance has the specified meta
        attribute, specified by id, string or descriptor.

        :param meta_attr: meta attribute
        :type meta_attr: :obj:`int`, :obj:`str` or :obj:`Orange.data.variable.Variable`

    .. method:: remove_meta(meta_attr)

        Remove the specified meta attribute.

        :param meta_attr: meta attribute
        :type meta_attr: :obj:`int`, :obj:`str` or :obj:`Orange.data.variable.Variable`

    .. method:: get_weight(meta_attr)

        Return the value of the specified meta attribute. The value
        must be continuous; it is returned as a :obj:`float`.

        :param meta_attr: meta attribute
        :type meta_attr: :obj:`int`, :obj:`str` or :obj:`Orange.data.variable.Variable`

    .. method:: set_weight(meta_attr, weight=1)

        Set the value of the specified meta attribute to `weight`. The
        meta attribute must be continuous.

        :param meta_attr: meta attribute
        :type meta_attr: :obj:`int`, :obj:`str` or :obj:`Orange.data.variable.Variable`
        :param weight: weight of the instance
        :type weight: :obj:`float`