source: orange/orange/doc/widgets/Unsupervised/ExampleDistance.htm @ 9399:6bbe263e8bcf

Revision 9399:6bbe263e8bcf, 2.7 KB checked in by mitar, 2 years ago

Renaming widgets catalog.

1<html>
2<head>
3<title>Example Distance</title>
4<link rel=stylesheet href="../../../style.css" type="text/css" media=screen>
5<link rel=stylesheet href="../../../style-print.css" type="text/css" media=print></link>
6</head>
7
8<body>
9
10<h1>Example Distance</h1>
11
12<img class="screenshot" src="../icons/ExampleDistance.png">
13<p>Computes distances between examples in the data set.</p>
14
15<h2>Channels</h2>
16
17<h3>Inputs</h3>
18
19<DL class=attributes>
20<DT>Examples</DT>
21<DD>A list of examples</DD>
22</dl>
23
24<h3>Outputs</h3>
25<DL class="attributes">
26<DT>Distance Matrix</DT>
27<DD>A matrix of example distances</DD>
28</dl>
29
30<h2>Description</h2>
31
32<p>The Example Distance widget computes the distances between the examples in the data set. Do not confuse it with the similar widget that computes the distances between attributes.</p>
33
34<img class="screenshot" src="ExampleDistance.png" alt="Example Distance Widget" border=0>
35
36<P>The available <span class="option">Distance Metrics</span> are <span class="option">Euclidean</span>, <span class="option">Manhattan</span>, <span class="option">Hamming</span> and <span class="option">Relief</span>. Besides having different formal definitions, the measures also differ in how well they treat unknown values. Manhattan and Hamming distances do not excel in this respect: when computing by-attribute distances, if either of the two values is missing, the corresponding distance is set to 0.5 (on a normalized scale where the largest difference in attribute values is 1.0). Relief distance is similar to Manhattan, but treats discrete attributes more correctly: it computes the expected distances from the probability distributions estimated from the data (see any of Kononenko's papers on ReliefF for the definition).</P>
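<p>The missing-value rule described above for Manhattan and Hamming distances can be sketched in plain Python. This is only an illustration, not the widget's actual code: the function names, the use of <code>None</code> for unknown values and the <code>ranges</code> argument are assumptions made for the example.</p>

```python
def manhattan_distance(a, b, ranges):
    """Sum of normalized by-attribute differences between two examples.

    a, b   -- sequences of numbers; None marks an unknown value (assumption)
    ranges -- one (minimum, maximum) pair per attribute, used to scale each
              difference so that the largest possible difference is 1.0
    """
    total = 0.0
    for x, y, (lo, hi) in zip(a, b, ranges):
        if x is None or y is None:
            total += 0.5                      # fixed penalty for a missing value
        elif hi > lo:
            total += abs(x - y) / (hi - lo)   # normalized difference in [0, 1]
        # an attribute with a single observed value contributes nothing
    return total


def hamming_distance(a, b):
    """Count of differing attribute values, with 0.5 for each missing value."""
    total = 0.0
    for x, y in zip(a, b):
        if x is None or y is None:
            total += 0.5
        elif x != y:
            total += 1.0
    return total
```

<p>With these definitions, a pair of examples that differs in one attribute and has one unknown value in another gets a Hamming distance of 1.5, regardless of how likely the unknown value is to match.</p>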
37
38<P>Unknown values are treated most correctly by the Euclidean metric: for discrete attributes it computes and uses their probability distributions, while for continuous attributes it computes the expected distance assuming a Gaussian distribution of attribute values, with the distribution's parameters again estimated from the data.</p>
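<p>One way to read the idea of an expected distance is sketched below. The exact formulas used by the widget are not given here, so the code is an assumption: it uses the standard expectation of a squared difference under a Gaussian with mean <code>mean</code> and variance <code>var</code>, and the probability that two discrete values differ. All names are hypothetical, and attribute normalization is left out.</p>

```python
import math


def expected_sq_diff_continuous(x, y, mean, var):
    """Expected squared difference; None marks an unknown value (assumption)."""
    if x is None and y is None:
        return 2.0 * var                      # two independent draws
    if x is None or y is None:
        known = y if x is None else x
        return (known - mean) ** 2 + var      # E[(known - Y)^2] for Y ~ N(mean, var)
    return (x - y) ** 2


def expected_diff_discrete(x, y, probs):
    """Probability that two values differ; probs maps value -> probability."""
    if x is None and y is None:
        return 1.0 - sum(p * p for p in probs.values())
    if x is None or y is None:
        known = y if x is None else x
        return 1.0 - probs.get(known, 0.0)
    return 0.0 if x == y else 1.0


def euclidean_distance(a, b, stats):
    """Square root of the summed expected by-attribute squared differences.

    stats holds, per attribute, either a dict of value probabilities
    (discrete attribute) or a (mean, variance) pair (continuous attribute).
    """
    total = 0.0
    for x, y, stat in zip(a, b, stats):
        if isinstance(stat, dict):
            total += expected_diff_discrete(x, y, stat)
        else:
            mean, var = stat
            total += expected_sq_diff_continuous(x, y, mean, var)
    return math.sqrt(total)
```

<p>Unlike the fixed 0.5 penalty, the contribution of an unknown value here depends on the data: a known value far from the mean, or a rare discrete value, yields a larger expected distance.</p>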
39
40<P>The rows and columns of the resulting distance matrix can be labeled with the values of an attribute chosen in the bottom box, <span class="option">Example label</span>.</P>
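<p>As a rough picture of what a labeled distance matrix looks like, the following sketch builds a symmetric matrix from a list of examples and prints it with row and column labels. It is not the widget's implementation; the function names and the plain nested-list representation are assumptions.</p>

```python
def euclidean(a, b):
    """Plain Euclidean distance between two numeric examples (no unknowns)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def distance_matrix(rows, distance):
    """Return a symmetric matrix with the distance between every pair of rows."""
    n = len(rows)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):                    # compute each pair only once
            d = distance(rows[i], rows[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


def format_matrix(matrix, labels):
    """Render the matrix as text, labeling rows and columns."""
    width = max(8, max(len(str(label)) for label in labels) + 2)
    lines = [" " * width + "".join(str(label).rjust(width) for label in labels)]
    for label, row in zip(labels, matrix):
        cells = "".join(("%.2f" % d).rjust(width) for d in row)
        lines.append(str(label).ljust(width) + cells)
    return "\n".join(lines)


# Hypothetical data: three examples labeled by the value of one attribute.
examples = [(0.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
labels = ["a", "b", "c"]
print(format_matrix(distance_matrix(examples, euclidean), labels))
```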
41
42
43<h2>Examples</h2>
44
45<P>This widget is a typical intermediate widget: it shows no user-readable results, and its output needs to be fed to a widget that can do something useful with the computed distances, for instance the <a href="DistanceMap.htm">Distance Map</a>, <a href="HierarchicalClustering.htm">Hierarchical Clustering</a> or <a href="MDS.htm">MDS</a>.</P>
46
47<img class="screenshot" src="ExampleDistance-Schema.png" alt="Example Distance Schema" border=0>
48
49</body>
50</html>