- 02/06/12 10:56:22 (2 years ago)
- 1 edited
--- r9663
+++ r9719
@@ -1,2 +1,4 @@
+
+
 .. automodule:: Orange.distance
 
@@ -5,46 +7,37 @@
 ##########################################
 
-This page describes a bunch of classes for different metrics for measure
-distances (dissimilarities) between instances.
+Distance measures typically have to be adjusted to the data. For instance,
+when the data set contains continuous features, the distances between
+continuous values should be normalized to ensure that all features have
+similar impats, e.g. by dividing the distance with the range.
 
-Typical (although not all) measures of distance between instances require
-some "learning" - adjusting the measure to the data. For instance, when
-the dataset contains continuous features, the distances between continuous
-values should be normalized, e.g. by dividing the distance with the range
-of possible values or with some interquartile distance to ensure that all
-features have, in principle, similar impacts.
-
-Different measures of distance thus appear in pairs - a class that measures
-the distance and a class that constructs it based on the data. The abstract
-classes representing such a pair are `ExamplesDistance` and
-`ExamplesDistanceConstructor`.
+Distance measures thus appear in pairs - a class that measures
+the distance (:obj:`Distance`) and a class that constructs it based on the
+data (:obj:`DistanceConstructor`).
 
 Since most measures work on normalized distances between corresponding
-features, there is an abstract intermediate class
-`ExamplesDistance_Normalized` that takes care of normalizing.
-The remaining classes correspond to different ways of defining the distances,
-such as Manhattan or Euclidean distance.
+features, an abstract class `DistanceNormalized` takes care of
+normalizing.
 
-Unknown values are treated correctly only by Euclidean and Relief distance.
-For other measure of distance, a distance between unknown and known or between
-two unknown values is always 0.5.
+Unknown values are treated correctly only by Euclidean and Relief
+distance. For other measures, a distance between unknown and known or
+two unknown values is always 0.5.
 
-.. class:: ExamplesDistance
+.. class:: Distance
 
     .. method:: __call__(instance1, instance2)
 
-        Returns a distance between the given instances as floating point number.
+        Return.
 
-.. class:: ExamplesDistanceConstructor
+.. class:: DistanceConstructor
 
     .. method:: __call__([instances, weightID][, distributions][, basic_var_stat])
 
-        Constructs an instance of ExamplesDistance.
-        Not all the data needs to be given. Most measures can be constructed
-        from basic_var_stat; if it is not given, they can help themselves
-        either by instances or distributions.
-        Some (e.g. ExamplesDistance_Hamming) even do not need any arguments.
+        Constructs an :obj:`Distance`. Not all the data needs to be
+        given. Most measures can be constructed from basic_var_stat;
+        if it is not given, they can help themselves either by instances
+        or distributions. Some do not need any arguments.
 
-.. class:: ExamplesDistance_Normalized
+.. class:: Normalized
 
     This abstract class provides a function which is given two instances
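The constructor/measure pairing described in the changed text can be illustrated with a small standalone sketch. It does not use the Orange API: the class names `RangeNormalizedManhattan` and `RangeNormalizedManhattanConstructor` are hypothetical, instances are assumed to be plain tuples of numbers, and `None` is assumed to stand for an unknown value. The constructor learns each feature's range from the data, and the resulting measure divides each difference by that range, contributing 0.5 wherever a value is unknown.

```python
class RangeNormalizedManhattan:
    """Hypothetical measure: sum of per-feature differences scaled by range."""

    def __init__(self, ranges):
        # One range (max - min) per feature, learned by the constructor.
        self.ranges = ranges

    def __call__(self, instance1, instance2):
        total = 0.0
        for x, y, r in zip(instance1, instance2, self.ranges):
            if x is None or y is None:
                # Unknown vs. known, or two unknowns: fixed contribution of 0.5.
                total += 0.5
            elif r > 0:
                # Normalize so that every feature contributes at most 1.
                total += abs(x - y) / r
            # A feature with zero range is constant and contributes nothing.
        return total


class RangeNormalizedManhattanConstructor:
    """Hypothetical constructor: adjusts the measure to the given data."""

    def __call__(self, instances):
        ranges = []
        for column in zip(*instances):
            known = [v for v in column if v is not None]
            ranges.append(max(known) - min(known) if known else 0.0)
        return RangeNormalizedManhattan(ranges)


data = [(1.0, 100.0), (3.0, 300.0), (2.0, None)]
dist = RangeNormalizedManhattanConstructor()(data)
print(dist(data[0], data[1]))  # both features differ by their full range
print(dist(data[0], data[2]))  # second feature is unknown in one instance
```

Without the division by range, the second feature (values in the hundreds) would dominate the first, which is the effect the normalization is meant to prevent.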