.. Source: orange/docs/reference/rst/Orange.data.instance.rst @ 9927:d6ca7b346864
   Revision 9927:d6ca7b346864, 11.4 KB, checked in by markotoplak, 2 years ago.
   Commit message: data.variable -> feature.

.. py:currentmodule:: Orange.data

=============================
Data instances (``Instance``)
=============================

Class `Orange.data.Instance` holds a data instance. Each data instance
corresponds to a domain, which defines its length, data types and
values for symbolic indices.

--------
Features
--------

The data instance is described by a list of features defined by the
domain descriptor (:obj:`Orange.data.Domain`). Instances support indexing
with either integer indices, strings or variable descriptors.

Since "age" is the first attribute in the lenses data set, the
statements below are equivalent::

    >>> import Orange
    >>> data = Orange.data.Table("lenses")
    >>> age = data.domain["age"]
    >>> example = data[0]
    >>> print example[0]
    young
    >>> print example[age]
    young
    >>> print example["age"]
    young

Negative indices do not work as usual in Python, since they refer to
the values of meta attributes.
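The indexing rules above can be illustrated with a small toy model in plain
Python. This is only a sketch and not Orange's implementation:
``ToyDescriptor`` and ``ToyInstance`` are hypothetical names, the feature
names other than "age" and "lenses" are made up for illustration, and the
meta id ``-2`` is arbitrary.

```python
class ToyDescriptor(object):
    """Hypothetical stand-in for a variable descriptor (it only has a name)."""

    def __init__(self, name):
        self.name = name


class ToyInstance(object):
    """Toy model of the indexing rules above; not Orange's implementation."""

    def __init__(self, descriptors, values, metas=None):
        self.descriptors = list(descriptors)  # one descriptor per position
        self.values = list(values)            # ordinary feature values
        self.metas = dict(metas or {})        # negative meta id -> value

    def __getitem__(self, key):
        if isinstance(key, ToyDescriptor):
            # A descriptor is resolved to the position it occupies.
            return self.values[self.descriptors.index(key)]
        if isinstance(key, str):
            # A string is resolved through the descriptor names.
            names = [d.name for d in self.descriptors]
            return self.values[names.index(key)]
        if key < 0:
            # Negative integers are meta ids, not offsets from the end.
            return self.metas[key]
        return self.values[key]


# Illustrative feature names; only "age" and "lenses" come from the text.
names = ["age", "prescription", "astigmatic", "tear_rate", "lenses"]
descriptors = [ToyDescriptor(n) for n in names]
example = ToyInstance(descriptors,
                      ["young", "myope", "no", "reduced", "none"],
                      metas={-2: 0.84})
age = descriptors[0]

# All three forms address the same value.
print(example[0])
print(example[age])
print(example["age"])
# -2 selects the meta value, not the second-to-last feature.
print(example[-2])
```

In this sketch a negative key never reaches the feature list, which is why
Python's usual "count from the end" reading of negative indices does not
apply.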

The last element of a data instance is the class label,
if the domain has a class. It should be accessed using
:obj:`~Orange.data.Instance.get_class()` and
:obj:`~Orange.data.Instance.set_class()`.

The list has a fixed length that equals the number of variables.

---------------
Meta attributes
---------------

Meta attributes provide a way to attach additional information to data
instances, such as a patient's id or the number of times
the instance was misclassified during some test procedure. The most
common additional information is the instance's weight. These attributes
do not appear in induced models.

Instances from the same domain do not need to have the same meta
attributes. Meta attributes are hence not addressed by positions,
but by their ids, which are represented by negative indices. Ids are
generated by function :obj:`Orange.feature.new_meta_id()`. Ids can
be reused for multiple domains.

A domain descriptor can, but does not need to, know about
meta descriptors. See documentation on :obj:`Orange.data.Domain` for
more on that.

If there is a particular descriptor associated with the meta attribute
for the domain, the attribute or its name can also be used for
indexing. When registering meta attributes with domains, it is
recommended to use the same id for the same attribute in all domains.

Meta values can also be loaded from files in tab-delimited format.

Meta attributes are often used as weights. Many procedures, such as
learning algorithms, accept the id of the meta attribute defining the
weights of instances as an additional argument.

The following example adds a meta attribute with a random value to
each data instance.

.. literalinclude:: code/instance-metavar.py
    :lines: 1-

The code prints out::

    ['young', 'myope', 'no', 'reduced', 'none'], {-2:0.84}

(except for a different random value). The data instance now consists of
two parts: ordinary features, which
resemble a list since they are addressed by positions (e.g. the first
value is "young"), and meta values, which are more like dictionaries,
where the id (-2) is a key and 0.84 is a value (of type
:obj:`Orange.data.Value`).
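The two-part layout described above (a positional list of features plus a
dictionary of meta values keyed by negative ids) can be mimicked in plain
Python. The sketch below is an analogy only and does not use Orange:
``toy_new_meta_id`` is a hypothetical id generator, the rows are
illustrative data, and the ids it produces need not match those Orange
would assign.

```python
import itertools
import random

# Hypothetical id generator: hands out -1, -2, -3, ... on successive calls.
_meta_ids = itertools.count(-1, -1)


def toy_new_meta_id():
    return next(_meta_ids)


# Illustrative rows; each row is a list of ordinary feature values.
rows = [
    ["young", "myope", "no", "reduced", "none"],
    ["young", "myope", "no", "normal", "soft"],
]

rng = random.Random(0)          # seeded so that reruns give the same weights
weight_id = toy_new_meta_id()   # one id shared by all instances

instances = []
for row in rows:
    # Features are addressed by position, meta values by (negative) id.
    instances.append({"features": list(row),
                      "metas": {weight_id: rng.random()}})

first = instances[0]
print(first["features"][0])         # positional access to a feature
print(weight_id in first["metas"])  # dictionary-style access to a meta value
```

Using a single id for the weight across all instances mirrors the
recommendation above to keep the same id for the same meta attribute.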

To tell the learning algorithm to use the weights, the id needs to be
passed along with the data::

    bayes = orange.BayesLearner(data, id)

Many other functions accept weights in a similar fashion.

Code ::

    print orange.getClassDistribution(data)
    print orange.getClassDistribution(data, id)

prints out ::

    <15.000, 5.000, 4.000>
    <9.691, 3.232, 1.969>

where the first line is the actual distribution and the second a
distribution with random weights assigned to the instances.
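What a weighted class distribution means can be shown with a few lines of
plain Python. This is a sketch under stated assumptions rather than the
way Orange computes it: ``class_distribution`` is a hypothetical helper,
instances are modelled as ``(features, metas)`` pairs, the class label is
taken to be the last feature, and the weights and the id ``-2`` are made
up.

```python
def class_distribution(instances, weight_id=None):
    """Sum the weights per class; plain counts when weight_id is None."""
    totals = {}
    for features, metas in instances:
        label = features[-1]  # the class label is the last element
        weight = 1.0 if weight_id is None else metas[weight_id]
        totals[label] = totals.get(label, 0.0) + weight
    return totals


WEIGHT_ID = -2  # made-up meta id holding the instance weight

# Illustrative (features, metas) pairs; the weights are arbitrary.
instances = [
    (["young", "none"], {WEIGHT_ID: 0.5}),
    (["young", "soft"], {WEIGHT_ID: 0.25}),
    (["presbyopic", "none"], {WEIGHT_ID: 0.75}),
]

plain = class_distribution(instances)
weighted = class_distribution(instances, WEIGHT_ID)
print(sorted(plain.items()))
print(sorted(weighted.items()))
```

Without a weight id every instance counts as 1; with it, each instance
contributes its own weight, which is why the two printed distributions
above differ.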

Registering the meta attribute using :obj:`Orange.data.Domain.add_meta`
changes how the data instance is printed out and how it can be
accessed::

    w = orange.FloatVariable("w")
    data.domain.addmeta(id, w)

The meta attribute can now be indexed just like ordinary features. The
following three statements are equivalent::

    print data[0][id]
    print data[0][w]
    print data[0]["w"]

Another consequence of registering the meta attribute is that it
allows for conversion from Python native types::

    ok = orange.EnumVariable("ok?", values=["no", "yes"])
    ok_id = Orange.data.new_meta_id()
    data.domain.addmeta(ok_id, ok)
    data[0][ok_id] = "yes"

The last line fails unless the attribute is registered since Orange
does not know which variable descriptor to use to convert the string
"yes" to an attribute value.

-------
Hashing
-------

Data instances compute hashes using CRC32 and can thus be used as
keys in dictionaries or collected in Python sets.

.. class:: Instance

    .. attribute:: domain

        The domain to which the data instance corresponds. This
        attribute is read-only.

    .. method:: __init__(domain[, values])

        Construct a data instance with the given domain and initialize
        the values. Values are given as a list of
        objects that can be converted into values of corresponding
        variables: strings and integer indices (for discrete variables),
        strings or numbers (for continuous variables), or instances of
        :obj:`Orange.data.Value`.

        If values are omitted, they are set to unknown.

        :param domain: domain descriptor
        :type domain: Orange.data.Domain
        :param values: A list of values
        :type values: list

        The following example loads data on lenses and constructs
        another data instance from the same domain.

        .. literalinclude:: code/instance-construct.py
            :lines: 1-5

        The same can be done using other representations of values.

        .. literalinclude:: code/instance-construct.py
            :lines: 7-8

    .. method:: __init__([domain,] instance)

        Construct a new data instance as a shallow copy of the
        original. If a domain descriptor is given, the instance is
        converted to another domain.

        :param domain: domain descriptor
        :type domain: Orange.data.Domain
        :param instance: Data instance
        :type instance: :obj:`Instance`

        The following example constructs a reduced domain and a data
        instance in this domain. ::

            domain_red = Orange.data.Domain(["age", "lenses"], domain)
            inst_red = Orange.data.Instance(domain_red, inst)

    .. method:: __init__(domain, instances)

        Construct a new data instance for the given domain, where the
        feature values are found in the provided instances using
        both their ordinary features and meta attributes that are
        registered with their corresponding domains. The new instance
        also includes the meta attributes that appear in the provided
        instances and whose values are not used for the instance's
        features.

        :param domain: domain descriptor
        :type domain: Orange.data.Domain
        :param instances: data instances
        :type instances: list of Orange.data.Instance

        .. literalinclude:: code/instance_merge.py
            :lines: 3-

        The new domain consists of variables from `data1` and `data2`:
        `a1`, `a3` and `m1` are ordinary features, and `m2` and `a2`
        are meta attributes in the new domain. `m2` has the
        same meta attribute id as it has in `data1`, while `a2` gets a
        new meta id. In addition, the new domain has two new
        attributes, `n1` and `n2`.

        Here is the output::

            First example:  [1, 2], {"m1":3, "m2":4}
            Second example:  [1, 2.5], {"m1":3, "m3":4.5}
            Merge:  [1, 2.5, 3, ?], {"a2":2, "m2":4, -5:4.50, "n2":?}

        Since attributes `a1` and `m1` appear in domains of both
        original instances, the new instance can only be constructed if
        these values match. `a3` comes from the second instance, and
        meta attributes `a2` and `m2` come from the first one. The
        meta attribute `m3` is also copied from the second instance;
        since it is not registered within the new domain, it is
        printed out with an id (-5) instead of with a name. Values of
        the two new attributes are left undefined.

    .. method:: native([level])

        Convert the instance into an ordinary Python list. If the
        optional argument `level` is 1 (default), the result is a list of
        instances of :obj:`Orange.data.Value`. If it is 0, it contains
        pure Python objects, that is, strings for discrete variables
        and numbers for continuous ones.

    .. method:: compatible(other, ignore_class=False)

        Return ``True`` if the two instances are compatible, that
        is, equal in all features which are not missing in one of
        them. The optional second argument can be used to omit the
        class from comparison.

    .. method:: get_class()

        Return the instance's class as :obj:`Orange.data.Value`.

    .. method:: get_classes()

        Return the values of multiple classes as a list of
        :obj:`Orange.data.Value`.

    .. method:: set_class(value)

        Set the instance's class.

        :param value: the instance's new class
        :type value: :obj:`Orange.data.Value`, number or string

    .. method:: set_classes(values)

        Set the values of multiple classes.

        :param values: a list of values; the length must match the number of classes
        :type values: list

    .. method:: get_metas([key_type])

        Return a dictionary containing meta values of the data
        instance. The argument ``key_type`` can be ``int`` (default),
        ``str`` or :obj:`Orange.feature.Descriptor` and
        determines whether
        the dictionary keys are meta ids, variable names or
        variable descriptors. In the latter two cases, only registered
        attributes are returned. ::

            data = orange.ExampleTable("inquisition2")
            example = data[4]
            print example.getmetas()
            print example.getmetas(int)
            print example.getmetas(str)
            print example.getmetas(orange.Variable)

        :param key_type: the key type; either ``int``, ``str`` or :obj:`~Orange.feature.Descriptor`
        :type key_type: ``type``

    .. method:: get_metas(optional, [key_type])

        Similar to above, but return a dictionary that contains
        only non-optional attributes (if ``optional`` is 0) or
        only optional attributes.

        :param optional: tells whether to return optional or non-optional attributes
        :type optional: ``bool``
        :param key_type: the key type; either ``int``, ``str`` or :obj:`~Orange.feature.Descriptor`
        :type key_type: ``type``

    .. method:: has_meta(meta_attr)

        Return ``True`` if the data instance has the specified meta
        attribute.

        :param meta_attr: meta attribute
        :type meta_attr: :obj:`id`, ``str`` or :obj:`~Orange.feature.Descriptor`

    .. method:: remove_meta(meta_attr)

        Remove the specified meta attribute.

        :param meta_attr: meta attribute
        :type meta_attr: :obj:`id`, ``str`` or :obj:`~Orange.feature.Descriptor`

    .. method:: get_weight(meta_attr)

        Return the value of the specified meta attribute. The
        attribute's value must be continuous and is returned as ``float``.

        :param meta_attr: meta attribute
        :type meta_attr: :obj:`id`, ``str`` or :obj:`~Orange.feature.Descriptor`

    .. method:: set_weight(meta_attr, weight=1)

        Set the value of the specified meta attribute to ``weight``.

        :param meta_attr: meta attribute
        :type meta_attr: :obj:`id`, ``str`` or :obj:`~Orange.feature.Descriptor`
        :param weight: weight of instance
        :type weight: ``float``