source: orange/Orange/doc/widgets/Data/Continuize.htm @ 9671:a7b056375472

Revision 9671:a7b056375472, 4.5 KB checked in by anze <anze.staric@…>, 2 years ago (diff)

Moved orange to Orange (part 2)

<html>
<head>
<title>Continuize</title>
<link rel=stylesheet href="../../../style.css" type="text/css" media=screen>
<link rel=stylesheet href="style-print.css" type="text/css" media=print></link>
</head>

<body>

<h1>Continuize</h1>

<img class="screenshot" src="../icons/Continuize.png">
<p>Turns discrete attributes into continuous dummy variables.</p>

<h2>Channels</h2>

<h3>Inputs</h3>

<DL class=attributes>
<DT>Examples (ExampleTable)</DT>
<DD>Input data set.</DD>
</dl>

<h3>Outputs</h3>

<DL class=attributes>
<DT>Examples (ExampleTable)</DT>
<DD>Output data set.</DD>
</dl>

<h2>Description</h2>

<p>The Continuize widget receives a data set on its input and outputs the same data with the discrete attributes (including binary attributes) replaced by continuous ones, using the methods specified by the user.</p>

<table>
<tr><td valign="top">
<img class="screenshot" src="Continuize.png" align="left">
</td>
<td valign="top">
<p>The first box, <span class="option">Multinominal attributes</span>, defines the treatment of multivalued discrete attributes. Suppose we have a discrete attribute <em>status</em> with the values <em>low</em>, <em>middle</em> and <em>high</em>, listed in that order. The options for its transformation are:</p>
<ul>
<li><span class="option">Target or First value as base</span>: the attribute is transformed into two continuous attributes: <em>status=middle</em>, whose value is 1 if the original attribute had the value <em>middle</em> on a particular example and 0 otherwise, and, similarly, <em>status=high</em>. A three-valued attribute is thus transformed into two continuous attributes, corresponding to all values except the first.</li>

<li><span class="option">Most frequent value as base</span>: similar to the above, except that the data is analyzed and the most frequent value is used as the base. So, if most examples have the value <em>middle</em>, the two newly constructed continuous attributes are <em>status=low</em> and <em>status=high</em>.</li>

<li><span class="option">One attribute per value</span>: constructs one continuous attribute for each value, so a three-valued discrete attribute yields three continuous attributes.</li>

<li><span class="option">Ignore multinominal attributes</span>: removes the multinominal attributes from the data.</li>

<li><span class="option">Treat as ordinal</span>: converts the attribute into a single continuous attribute with the values 0, 1 and 2.</li>

<li><span class="option">Divide by number of values</span>: same as above, except that the values are normalized into the range 0-1. In our case, this gives the values 0, 0.5 and 1.</li>
</ul>
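<p>The options above can be sketched in plain Python for the <em>status</em> example. This is only an illustration of the encodings described in the list, not the widget's own code; the helper names <code>indicators</code>, <code>most_frequent</code> and <code>as_ordinal</code> and the sample column are hypothetical. The normalized variant divides the value's index by the number of values minus one, which is an assumption inferred from the 0, 0.5, 1 example.</p>

```python
from collections import Counter

# Hypothetical example data: the attribute's values in their declared order
# and one column of observed values.
VALUES = ["low", "middle", "high"]
column = ["middle", "low", "middle", "high"]


def indicators(name, value, values, base=None):
    """Return one 0/1 indicator per value of the attribute, skipping `base`."""
    return {
        "%s=%s" % (name, v): 1.0 if value == v else 0.0
        for v in values
        if v != base
    }


def most_frequent(col):
    """Return the value that occurs most often in the column."""
    return Counter(col).most_common(1)[0][0]


def as_ordinal(value, values, normalize=False):
    """Return the value's index, optionally scaled into the range 0-1."""
    index = values.index(value)
    if normalize:
        # Assumption: divide by (number of values - 1) so the last value maps to 1.
        return float(index) / (len(values) - 1)
    return float(index)


# First value (low) as base: indicators for middle and high only.
first_as_base = [indicators("status", v, VALUES, base=VALUES[0]) for v in column]
# Most frequent value (middle in this column) as base: indicators for low and high.
frequent_as_base = [
    indicators("status", v, VALUES, base=most_frequent(column)) for v in column
]
# One attribute per value: three indicators per example.
one_per_value = [indicators("status", v, VALUES) for v in column]
# Treat as ordinal, and the same scaled into 0-1.
ordinal = [as_ordinal(v, VALUES) for v in column]
normalized = [as_ordinal(v, VALUES, normalize=True) for v in column]
```

<p>With the first value as base, an example with the value <em>low</em> gets 0 in both new attributes, so no information is lost by dropping the base indicator.</p>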

<P>The next box defines the treatment of continuous attributes. You will usually prefer the <span class="option">Leave as is</span> option. The alternative is <span class="option">Normalize by span</span>, which subtracts the lowest value found in the data and divides by the span, so that all values fit into [0, 1]. Finally, <span class="option">Normalize by variance</span> subtracts the average and divides by the variance.</P>
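<p>As a minimal sketch of the span normalization described above, assuming a plain list of numbers: the function name <code>normalize_by_span</code>, the sample values and the handling of a constant column are hypothetical and not taken from the widget.</p>

```python
def normalize_by_span(column):
    """Subtract the lowest value and divide by the span (max - min)."""
    lowest, highest = min(column), max(column)
    span = highest - lowest
    if span == 0:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in column]
    return [(x - lowest) / span for x in column]


ages = [20.0, 35.0, 50.0, 80.0]  # hypothetical continuous attribute
scaled = normalize_by_span(ages)
```

<p>Here the lowest value, 20, maps to 0 and the highest, 80, maps to 1.</p>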

<P>Finally, you can decide what happens with the class if it is discrete. Besides leaving it as it is, you can use the options available for multinominal attributes, except for those that split the attribute into more than one attribute; these cannot be supported because there cannot be more than one class attribute. Additionally, you can <span class="option">specify a target value</span>; this transforms the class into a continuous attribute with the value 1 if the value of the original attribute equals the target and 0 otherwise.</P>
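<p>The target-value treatment of the class amounts to a single indicator. A short sketch, in which the function name <code>class_as_target_indicator</code> and the sample class values are hypothetical:</p>

```python
def class_as_target_indicator(class_values, target):
    """Return 1.0 where the class equals the target value and 0.0 elsewhere."""
    return [1.0 if value == target else 0.0 for value in class_values]


diagnosis = ["healthy", "sick", "sick", "healthy"]  # hypothetical discrete class
continuous_class = class_as_target_indicator(diagnosis, target="sick")
```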

<P>With <span class="option">value range</span>, you can define the values of the new attributes. The text above assumes the range <span class="option">from 0 to 1</span>. You can change it to <span class="option">from -1 to 1</span>.</P>
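<p>One way to picture the switch between the two ranges is a linear remapping of values computed for the 0 to 1 range. This is an assumption made for illustration, not a statement about how the widget computes its output; the function name <code>to_symmetric_range</code> is hypothetical.</p>

```python
def to_symmetric_range(x):
    """Linearly map a value from the range [0, 1] onto [-1, 1]."""
    return 2.0 * x - 1.0


unit_values = [0.0, 0.5, 1.0]  # e.g. a three-valued attribute scaled into 0-1
symmetric_values = [to_symmetric_range(x) for x in unit_values]
```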

<P>If <span class="option">Send automatically</span> is set, the data set is committed on any change. Otherwise, you have to press <span class="option">Send data</span> after each change.</P>
</td></tr></table>


<h2>Examples</h2>

<P>The schema below shows a typical use of this widget: to plot a linear projection of the data properly, discrete attributes need to be converted to continuous ones, so we pass the data through the Continuize widget before drawing it. The attribute "chest pain" originally had four values and was transformed into three continuous attributes; similarly, gender was transformed into a single attribute, gender=female.</P>

<img src="Continuize-Schema.png">


</body>
</html>