source: orange/docs/widgets/rst/unsupervized/exampledistance.rst @ 11359:8d54e79aa135

Revision 11359:8d54e79aa135, 2.1 KB checked in by Ales Erjavec <ales.erjavec@…>, 14 months ago

Cleanup of 'Widget catalog' documentation.

Fixed rst text formating, replaced dead hardcoded reference links (now using
:ref:), etc.

.. _Example Distance:

Example Distance
================

.. image:: ../icons/ExampleDistance.png

Computes distances between examples in the data set

Signals
-------

Inputs:
   - Examples
      A list of examples


Outputs:
   - Distance Matrix
      A matrix of example distances


Description
-----------

The Example Distance widget computes the distances between the examples in a
data set. Do not confuse it with the similar widget that computes the distances
between attributes.

.. image:: images/ExampleDistance.png
   :alt: Example Distance Widget

The available :obj:`Distance Metrics` definitions are :obj:`Euclidean`,
:obj:`Manhattan`, :obj:`Hamming` and :obj:`Relief`. Besides having different
formal definitions, the measures also differ in how correctly they treat
unknown values. Manhattan and Hamming distance do not excel in this respect:
when computing by-attribute distances, if either of the two values is missing,
the corresponding distance is set to 0.5 (on a normalized scale where the
largest difference in attribute values is 1.0). Relief distance is similar to
Manhattan, but treats discrete attributes more correctly: it computes the
expected distances from the probability distribution estimated from the data
(see any of Kononenko's papers on ReliefF for the definition).
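
The handling of unknown values described above can be sketched in a few lines
of plain Python. This is only an illustration, not the widget's actual code:
the function names, the use of ``None`` for an unknown value, and the
``spans`` and ``probs`` arguments are all assumptions made for the example.
The Relief part also simplifies the ReliefF definition by using unconditional
value probabilities instead of class-conditional ones.

```python
def manhattan_with_unknowns(a, b, spans):
    """Sum per-attribute distances on a normalized 0..1 scale.

    a, b  -- equal-length sequences of numbers; None marks an unknown value
    spans -- largest difference (max - min) of each attribute in the data
    All names here are hypothetical, chosen for this sketch.
    """
    total = 0.0
    for x, y, span in zip(a, b, spans):
        if x is None or y is None:
            # Either value unknown: fixed distance of 0.5.
            total += 0.5
        elif span > 0:
            total += abs(x - y) / span
        # A constant attribute (span == 0) contributes nothing.
    return total


def relief_discrete_distance(x, y, probs):
    """Expected 0/1 distance for one discrete attribute.

    probs -- dict mapping each value to its relative frequency in the data
    (an assumed, unconditional estimate; ReliefF conditions on the class).
    """
    if x is None and y is None:
        # Probability that two independently drawn values differ.
        return 1.0 - sum(p * p for p in probs.values())
    if x is None or y is None:
        known = y if x is None else x
        # Probability that the unknown value differs from the known one.
        return 1.0 - probs.get(known, 0.0)
    return 0.0 if x == y else 1.0
```

With these definitions, an unknown value always costs 0.5 under the Manhattan
sketch, whereas under the Relief sketch it costs less when the known value is
frequent in the data and more when it is rare.
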

The most correct treatment of unknown values is provided by the Euclidean
metric, which computes and uses the probability distributions of discrete
attributes, while for continuous attributes it computes the expected distance
assuming a Gaussian distribution of attribute values, where the distribution's
parameters are again estimated from the data.
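
To illustrate the Gaussian assumption for continuous attributes, here is a
minimal sketch. It relies on a standard result: if a known value ``x`` is
compared with an unknown value drawn from a Gaussian with mean ``mu`` and
variance ``var``, the expected squared difference is ``(x - mu)**2 + var``;
if both values are unknown, it is ``2 * var``. Taking the square root of the
summed expected squared differences is one plausible reading of the text, and
the widget's exact formula may differ. The function names and the ``columns``
argument are hypothetical.

```python
import math
import statistics


def expected_sq_diff(x, y, mean, variance):
    """Expected squared difference; None marks an unknown value."""
    if x is None and y is None:
        # Two independent draws from the same Gaussian.
        return 2.0 * variance
    if x is None or y is None:
        known = y if x is None else x
        return (known - mean) ** 2 + variance
    return (x - y) ** 2


def euclidean_with_unknowns(a, b, columns):
    """Euclidean-style distance between examples a and b.

    columns -- for each attribute, all its values in the data set
    (None allowed); used to estimate the Gaussian's mean and variance.
    """
    total = 0.0
    for x, y, column in zip(a, b, columns):
        known = [v for v in column if v is not None]
        mean = statistics.mean(known)
        variance = statistics.pvariance(known, mean)
        total += expected_sq_diff(x, y, mean, variance)
    return math.sqrt(total)
```

Unlike the fixed 0.5 penalty, the contribution of an unknown value here
depends on the data: it grows with the attribute's variance and with how far
the known value lies from the mean.
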

The rows and columns of the resulting distance matrix can be labeled with the
values of a chosen attribute, selected in the bottom box,
:obj:`Example label`.
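
As a rough picture of what the widget outputs, the sketch below builds a
symmetric distance matrix and a parallel list of row/column labels taken from
one attribute. It is not Orange's data structure: representing examples as
dictionaries, and the names ``labeled_distance_matrix``, ``label_attr`` and
``distance``, are assumptions made for the example.

```python
def labeled_distance_matrix(examples, label_attr, distance):
    """Return (labels, matrix) for a list of dict-like examples.

    label_attr -- key whose values label the rows and columns
    distance   -- function taking two examples and returning a number
    """
    labels = [str(ex[label_attr]) for ex in examples]
    n = len(examples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Distances are symmetric, so compute each pair once.
            d = distance(examples[i], examples[j])
            matrix[i][j] = matrix[j][i] = d
    return labels, matrix


# Toy data, invented for this sketch.
examples = [
    {"name": "ann", "height": 1.0},
    {"name": "bob", "height": 4.0},
    {"name": "cid", "height": 2.0},
]
labels, matrix = labeled_distance_matrix(
    examples, "name", lambda p, q: abs(p["height"] - q["height"])
)
```

Here ``labels`` holds the ``name`` values in example order, and
``matrix[i][j]`` is the distance between the examples labeled ``labels[i]``
and ``labels[j]``.
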


Examples
--------

This widget is a typical intermediate widget: it shows no user-readable
results, and its output needs to be fed to a widget that can do something
useful with the computed distances, for instance the :ref:`Distance Map`,
:ref:`Hierarchical Clustering` or :ref:`MDS`.

.. image:: images/ExampleDistance-Schema.png
   :alt: Example Distance schema