source: orange/docs/widgets/rst/unsupervized/exampledistance.rst @ 11359:8d54e79aa135

Revision 11359:8d54e79aa135, 2.1 KB checked in by Ales Erjavec <ales.erjavec@…>, 14 months ago (diff) |
---|

Line | |
---|---|

1 | .. _Example Distance: |

2 | |

3 | Example Distance |

4 | ================ |

5 | |

6 | .. image:: ../icons/ExampleDistance.png |

7 | |

8 | Computes distances between examples in the data set. |

9 | |

10 | Signals |

11 | ------- |

12 | |

13 | Inputs: |

14 | - Examples |

15 | A list of examples |

16 | |

17 | |

18 | Outputs: |

19 | - Distance Matrix |

20 | A matrix of example distances |

21 | |

22 | |

23 | Description |

24 | ----------- |

25 | |

26 | The Example Distance widget computes the distances between the examples in |

27 | the data set. Don't confuse it with the similar widget that computes the distances |

28 | between attributes. |

29 | |

30 | .. image:: images/ExampleDistance.png |

31 | :alt: Example Distance Widget |

32 | |

33 | The available :obj:`Distance Metrics` definitions are :obj:`Euclidean`, |

34 | :obj:`Manhattan`, :obj:`Hamming` and :obj:`Relief`. Besides having different |

35 | formal definitions, the measures also differ in how correctly |

36 | they treat unknown values. Manhattan and Hamming distance do not excel in |

37 | this respect: when computing by-attribute distances, if either of the two values |

38 | is missing, the corresponding distance is set to 0.5 (on a normalized scale |

39 | where the largest difference in attribute values is 1.0). Relief distance is |

40 | similar to Manhattan, but with a more correct treatment of discrete |

41 | attributes: it computes the expected distances from the probability distribution |

42 | estimated from the data (see any of Kononenko's papers on ReliefF for the |

43 | definition). |

44 | |
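The missing-value rule described above for Manhattan distance can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the widget's actual code: the names ``attribute_spans``, ``normalized_manhattan`` and ``distance_matrix`` are hypothetical, unknown values are represented by ``None``, and each attribute is normalized by its observed range.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]

MISSING_DISTANCE = 0.5  # value used when either attribute value is unknown


def attribute_spans(rows: Sequence[Row]) -> List[float]:
    """Largest observed difference per attribute, ignoring unknown values."""
    spans = []
    for column in zip(*rows):
        known = [v for v in column if v is not None]
        spans.append(max(known) - min(known) if known else 0.0)
    return spans


def normalized_manhattan(a: Row, b: Row, spans: Sequence[float]) -> float:
    """Sum of by-attribute distances, each scaled so the largest is 1.0."""
    total = 0.0
    for x, y, span in zip(a, b, spans):
        if x is None or y is None:
            total += MISSING_DISTANCE
        elif span > 0:
            total += abs(x - y) / span
        # span == 0 means the attribute is constant and contributes nothing
    return total


def distance_matrix(rows: Sequence[Row]) -> List[List[float]]:
    """Symmetric matrix of pairwise distances; the diagonal is fixed at 0."""
    spans = attribute_spans(rows)
    n = len(rows)
    return [
        [0.0 if i == j else normalized_manhattan(rows[i], rows[j], spans)
         for j in range(n)]
        for i in range(n)
    ]


rows = [[0.0, 10.0], [4.0, None], [2.0, 20.0]]
matrix = distance_matrix(rows)
print(matrix[0][1])  # 1.5: full-range difference (1.0) plus 0.5 for the unknown
```

The diagonal is set to zero explicitly because the 0.5 rule would otherwise give an example with unknown values a non-zero distance to itself.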

45 | The most correct treatment of unknown values is done by the Euclidean metric, |

46 | which computes and uses the probability distributions of discrete attributes, |

47 | while for continuous attributes it computes the expected distance assuming |

48 | a Gaussian distribution of attribute values, where the distribution's |

49 | parameters are again estimated from the data. |

50 | |
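The expected-distance idea can be illustrated with a small Python sketch. It is an assumption about how such a treatment may look, not a transcription of the widget's implementation: the function names are hypothetical, unknown values are represented by ``None``, the two values are treated as independent, and the continuous case returns the expected squared difference. That expectation depends only on the mean and variance estimated from the data.

```python
from collections import Counter
from statistics import mean, pvariance


def value_probabilities(column):
    """Relative frequencies of the known values of a discrete attribute."""
    known = [v for v in column if v is not None]
    return {value: count / len(known) for value, count in Counter(known).items()}


def discrete_expected_distance(x, y, probs):
    """Expected 0/1 mismatch between two discrete values."""
    if x is not None and y is not None:
        return 0.0 if x == y else 1.0
    if x is None and y is None:
        # Probability that two independent draws differ
        return 1.0 - sum(p * p for p in probs.values())
    known = x if x is not None else y
    # Probability that a random draw differs from the known value
    return 1.0 - probs.get(known, 0.0)


def continuous_expected_sq_distance(x, y, mu, var):
    """Expected squared difference, with unknown values drawn from a
    distribution with mean mu and variance var."""
    if x is not None and y is not None:
        return (x - y) ** 2
    if x is None and y is None:
        return 2.0 * var
    known = x if x is not None else y
    return (known - mu) ** 2 + var


colors = ["red", "red", "blue", None]
probs = value_probabilities(colors)
print(discrete_expected_distance("red", None, probs))

heights = [1.0, 2.0, 3.0]
mu, var = mean(heights), pvariance(heights)
print(continuous_expected_sq_distance(4.0, None, mu, var))
```

A Euclidean-style distance would then sum such per-attribute terms and take the square root of the total.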

51 | The rows/columns of the resulting distance matrix can be labeled by the |

52 | values of a certain attribute which can be chosen in the bottom box, |

53 | :obj:`Example label`. |

54 | |

55 | |

56 | Examples |

57 | -------- |

58 | |

59 | This widget is a typical intermediate widget: it shows no user-readable |

60 | results and its output needs to be fed to a widget that can do something |

61 | useful with the computed distances, for instance the :ref:`Distance Map`, |

62 | :ref:`Hierarchical Clustering` or :ref:`MDS`. |

63 | |

64 | .. image:: images/ExampleDistance-Schema.png |

65 | :alt: Example Distance schema |
