source: orange/docs/widgets/rst/data/continuize.rst @ 11050:e3c4699ca155

Revision 11050:e3c4699ca155, 3.6 KB, checked in by Miha Stajdohar <miha.stajdohar@…>, 16 months ago

Widget docs From HTML to Sphinx.

.. _Continuize:

Continuize
==========

.. image:: ../icons/Continuize.png

Turns discrete attributes into continuous dummy variables.

Signals
-------

Inputs:
   - Examples (ExampleTable)
      Input data set.

Outputs:
   - Examples (ExampleTable)
      Output data set.


Description
-----------

The Continuize widget receives a data set on its input and outputs the same data with the discrete attributes (including binary attributes) replaced by continuous ones, using the methods specified by the user.

.. image:: images/Continuize.png

The first box, :obj:`Multinominal attributes`, defines the treatment of multivalued discrete attributes. Suppose we have a discrete attribute status with values low, middle and high, listed in that order. The options for its transformation are:

   - :obj:`Target or First value as base`: the attribute is transformed into two continuous attributes: status=middle, with values 0 or 1 signifying whether the original attribute had the value middle on a particular example, and, similarly, status=high. Hence, a three-valued attribute is transformed into two continuous attributes, corresponding to all values except the first.

   - :obj:`Most frequent value as base`: similar to the above, except that the data is analyzed and the most frequent value is used as the base. So, if most examples have the value middle, the two newly constructed continuous attributes are status=low and status=high.

   - :obj:`One attribute per value`: constructs one continuous attribute for each value, so a three-valued discrete attribute yields three continuous attributes.

   - :obj:`Ignore multinominal attributes`: removes the multinominal attributes from the data.

   - :obj:`Treat as ordinal`: converts the attribute into a single continuous attribute with values 0, 1, and 2.

   - :obj:`Divide by number of values`: same as above, except that the values are normalized into the range 0-1, so our example gives the values 0, 0.5 and 1.


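The options above can be illustrated with a minimal sketch in plain Python. It does not use the widget's own code or any Orange API; the function ``continuize``, its parameters and the method names are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def continuize(name, values, domain, method):
    """Illustrative sketch of the multinominal options (not Orange code).

    name   -- attribute name, e.g. "status"
    values -- the attribute's value on each example
    domain -- all possible values, in their declared order
    Returns a dict mapping new attribute names to lists of numbers.
    """
    def indicators(kept):
        # One 0/1 attribute per kept value, named "<attribute>=<value>".
        return {f"{name}={v}": [1 if x == v else 0 for x in values]
                for v in kept}

    if method == "first_as_base":
        return indicators(domain[1:])
    if method == "most_frequent_as_base":
        base = Counter(values).most_common(1)[0][0]
        return indicators([v for v in domain if v != base])
    if method == "one_per_value":
        return indicators(domain)
    if method == "ignore":
        return {}
    if method == "ordinal":
        return {name: [domain.index(x) for x in values]}
    if method == "divide_by_number_of_values":
        # Three values give 0, 0.5 and 1, so the divisor is len(domain) - 1.
        top = max(len(domain) - 1, 1)
        return {name: [domain.index(x) / top for x in values]}
    raise ValueError(f"unknown method: {method}")


domain = ["low", "middle", "high"]
values = ["middle", "low", "middle", "high"]
print(continuize("status", values, domain, "first_as_base"))
print(continuize("status", values, domain, "most_frequent_as_base"))
print(continuize("status", values, domain, "divide_by_number_of_values"))
```

With the first value as base, the sketch produces status=middle and status=high; with the most frequent value (here middle) as base, it produces status=low and status=high.
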
The next box defines the treatment of continuous attributes. You will usually prefer the :obj:`Leave as is` option. The alternative is :obj:`Normalize by span`, which subtracts the lowest value found in the data and divides by the span, so all values fit into [0, 1]. Finally, :obj:`Normalize by variance` subtracts the average and divides by the variance.

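As a rough illustration of normalization by span, the following plain-Python sketch subtracts the lowest value and divides by the span. The function name ``normalize_by_span`` is hypothetical, and the handling of a constant attribute (zero span) is an assumption of this sketch, not documented widget behavior.

```python
def normalize_by_span(values):
    """Illustrative sketch: map values linearly onto [0, 1]."""
    lowest = min(values)
    span = max(values) - lowest
    if span == 0:
        # Assumption: a constant attribute is mapped to 0 everywhere.
        return [0.0 for _ in values]
    return [(v - lowest) / span for v in values]


ages = [20, 30, 60]
print(normalize_by_span(ages))
```

Here the lowest value, 20, maps to 0 and the highest, 60, maps to 1; the value 30 lies a quarter of the way along the span.
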
Finally, you can decide what happens with the class if it is discrete. Besides leaving it as it is, you can use the options available for multinominal attributes, except for those that split the attribute into more than one attribute; these cannot be supported, since there can be only one class attribute. Additionally, you can :obj:`specify a target value`; this transforms the class into a continuous attribute with value 1 if the original value equals the target and 0 otherwise.

With :obj:`value range`, you can define the values of the new attributes. The text above assumes the range :obj:`from 0 to 1`. You can change it to :obj:`from -1 to 1`.

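The target-value option and the alternative value range can be sketched in plain Python as follows. The function names are hypothetical, and the linear map from [0, 1] onto [-1, 1] is an assumption about how the range option behaves rather than a description of the widget's code.

```python
def target_indicator(class_values, target):
    """1 where the class equals the target value, 0 otherwise."""
    return [1 if c == target else 0 for c in class_values]


def to_symmetric_range(values):
    """Assumed linear rescaling from [0, 1] onto [-1, 1]."""
    return [2 * v - 1 for v in values]


classes = ["yes", "no", "yes"]
indicator = target_indicator(classes, "yes")
print(indicator)
print(to_symmetric_range(indicator))
print(to_symmetric_range([0, 0.5, 1]))
```

Under this assumption, a 0/1 indicator becomes -1/1, and the ordinal values 0, 0.5 and 1 from the status example become -1, 0 and 1.
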
If :obj:`Send automatically` is set, the data set is committed on any change. Otherwise, you have to press :obj:`Send data` after each change.

Examples
--------

The schema below shows a typical use of this widget: to properly plot a linear projection of the data, discrete attributes need to be converted to continuous ones, so we put the data through the Continuize widget before drawing it. The attribute "chest pain" originally had four values and was transformed into three continuous attributes; similarly, gender was transformed into a single attribute, gender=female.

.. image:: images/Continuize-Schema.png