Changeset 9821:6cc715432fa7 in orange for docs/reference/rst/Orange.distance.rst
 Timestamp:
 02/06/12 20:01:44
 Branch:
 default
 Parents:
 9818:2ec8ecdb81e5, 9819:e11e2ff31f47
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
 rebase_source:
 b34679a1d0e04c1b45b79ac3fddaa9094e6523ca
 File:

 1 edited

docs/reference/rst/Orange.distance.rst
r9819 r9821
   5    5   ##########################################
   6    6
        7 + The following example demonstrates how to compute distances between two instances:
        8 +
        9 + .. literalinclude:: code/distancesimple.py
       10 +     :lines: 1-7
       11 +
       12 + A matrix with all pairwise distances can be computed with :obj:`distance_matrix`:
       13 +
       14 + .. literalinclude:: code/distancesimple.py
       15 +     :lines: 9-11
       16 +
       17 + Unknown values are treated correctly only by Euclidean and Relief
       18 + distance. For other measures, a distance between unknown and known or
       19 + between two unknown values is always 0.5.
       20 +
       21 + ===================
       22 + Computing distances
       23 + ===================
       24 +
   7   25   Distance measures typically have to be adjusted to the data. For instance,
   8   26   when the data set contains continuous features, the distances between
   …    …
  10   28   similar impats, e.g. by dividing the distance with the range.
  11   29
  12      - Distance measures thus appear in pairs - a class that measures
  13      - the distance (:obj:`Distance`) and a class that constructs it based on the
  14      - data (:obj:`DistanceConstructor`).
       30 + Distance measures thus appear in pairs:
  15   31
  16      - Since most measures work on normalized distances between corresponding
  17      - features, an abstract class `DistanceNormalized` takes care of
  18      - normalizing.
  19      -
  20      - Unknown values are treated correctly only by Euclidean and Relief
  21      - distance. For other measures, a distance between unknown and known or
  22      - between two unknown values is always 0.5.
  23      -
  24      - .. autofunction:: distance_matrix
  25      -
  26      - .. class:: Distance
  27      -
  28      -     .. method:: __call__(instance1, instance2)
  29      -
  30      -         Return a distance between the given instances (as a floating point number).
       32 + - a class that constructs the distance measure based on the
       33 +   data (subclass of :obj:`DistanceConstructor`, for
       34 +   example :obj:`Euclidean`), and returns is as
       35 + - a class that measures the distance between two instances
       36 +   (subclass of :obj:`Distance`, for example :obj:`EuclideanDistance`).
  31   37
  32   38   .. class:: DistanceConstructor
   …    …
  38   44       not given, instances or distributions can be used.
  39   45
  40      - .. class:: DistanceNormalized
       46 + .. class:: Distance
  41   47
  42      -     An abstract class that provides normalization.
       48 +     .. method:: __call__(instance1, instance2)
  43   49
  44      -     .. attribute:: normalizers
       50 +         Return a distance between the given instances (as a floating point number).
  45   51
  46      -         A precomputed list of normalizing factors for feature values. They are:
       52 + Pairwise distances
       53 + ==================
  47   54
  48      -         - 1/(max_value-min_value) for continuous and 1/number_of_values
  49      -           for ordinal features.
  50      -           If either feature is unknown, the distance is 0.5. Such factors
  51      -           are used to multiply differences in feature's values.
  52      -         - ``1`` for nominal features; the distance
  53      -           between two values is 0 if they are same (or at least one is
  54      -           unknown) and 1 if they are different.
  55      -         - ``0`` for ignored features.
       55 + .. autofunction:: distance_matrix
  56   56
  57      -     .. attribute:: bases, averages, variances
       57 + =========
       58 + Measures
       59 + =========
  58   60
  59      -         The minimal values, averages and variances
  60      -         (continuous features only).
  61      -
  62      -     .. attribute:: domain_version
  63      -
  64      -         The domain version changes each time a domain description is
  65      -         changed (i.e. features are added or removed).
  66      -
  67      -     .. method:: feature_distances(instance1, instance2)
  68      -
  69      -         Return a list of floats representing normalized distances between
  70      -         pairs of feature values of the two instances.
       61 + Distance measures are defined with two classes: a subclass of obj:`DistanceConstructor`
       62 + and a subclass of :obj:`Distance`.
  71   63
  72   64   .. class:: Hamming
   …    …
  81   73       The maximal distance
  82   74       between two feature values. If dist is the result of
  83      -     ~:obj:`DistanceNormalized.feature_distances`,
       75 +     :obj:`~DistanceNormalized.feature_distances`,
  84   76       then :class:`Maximal` returns ``max(dist)``.
  85   77
   …    …
  89   81       The sum of absolute values
  90   82       of distances between pairs of features, e.g. ``sum(abs(x) for x in dist)``
  91      -     where dist is the result of ~:obj:`DistanceNormalized.feature_distances`.
       83 +     where dist is the result of :obj:`~DistanceNormalized.feature_distances`.
  92   84
  93   85   .. class:: Euclidean
   …    …
  96   88       The square root of sum of squared per-feature distances,
  97   89       i.e. ``sqrt(sum(x*x for x in dist))``, where dist is the result of
  98      -     ~:obj:`DistanceNormalized.feature_distances`.
       90 +     :obj:`~DistanceNormalized.feature_distances`.
  99   91
 100   92       .. method:: distributions
   …    …
 137  129       This class is derived directly from :obj:`Distance`.
 138  130
 139      -
 140  131   .. autoclass:: PearsonR
 141  132       :members:
   …    …
 150  141       :members:
 151  142
      143 + .. autoclass:: Mahalanobis
      144 +     :members:
 152  145
      146 + .. autoclass:: MahalanobisDistance
      147 +     :members:
      148 +
      149 + =========
      150 + Utilities
      151 + =========
      152 +
      153 + .. class:: DistanceNormalized
      154 +
      155 +     An abstract class that provides normalization.
      156 +
      157 +     .. attribute:: normalizers
      158 +
      159 +         A precomputed list of normalizing factors for feature values. They are:
      160 +
      161 +         - 1/(max_value-min_value) for continuous and 1/number_of_values
      162 +           for ordinal features.
      163 +           If either feature is unknown, the distance is 0.5. Such factors
      164 +           are used to multiply differences in feature's values.
      165 +         - ``1`` for nominal features; the distance
      166 +           between two values is 0 if they are same (or at least one is
      167 +           unknown) and 1 if they are different.
      168 +         - ``0`` for ignored features.
      169 +
      170 +     .. attribute:: bases, averages, variances
      171 +
      172 +         The minimal values, averages and variances
      173 +         (continuous features only).
      174 +
      175 +     .. attribute:: domain_version
      176 +
      177 +         The domain version changes each time a domain description is
      178 +         changed (i.e. features are added or removed).
      179 +
      180 +     .. method:: feature_distances(instance1, instance2)
      181 +
      182 +         Return a list of floats representing normalized distances between
      183 +         pairs of feature values of the two instances.
      184 +
      185 +
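
The normalizing factors and the per-feature formulas quoted in the diff above can be illustrated with a minimal plain-Python sketch. It does not use Orange itself: the helper names (``normalizer``, ``feature_distances``), the feature-kind constants and the use of ``None`` for unknown values are assumptions made for illustration only, and returning a zero factor for a constant feature is a choice of this sketch, not documented behaviour.

```python
from math import sqrt

# Hypothetical feature kinds; illustrative names, not part of Orange's API.
CONTINUOUS, NOMINAL, IGNORED = "continuous", "nominal", "ignored"


def normalizer(kind, values=()):
    """Return the normalizing factor for one feature.

    Follows the documented rules: 1/(max_value - min_value) for
    continuous features, 1 for nominal features, 0 for ignored ones.
    """
    if kind == CONTINUOUS:
        known = [v for v in values if v is not None]
        span = max(known) - min(known)
        # Assumption of this sketch: a constant feature gets factor 0.
        return 1.0 / span if span else 0.0
    if kind == NOMINAL:
        return 1.0
    return 0.0


def feature_distances(kinds, factors, a, b):
    """Return normalized per-feature distances between instances a and b."""
    dist = []
    for kind, factor, x, y in zip(kinds, factors, a, b):
        if kind == IGNORED:
            dist.append(0.0)
        elif kind == NOMINAL:
            # 0 if the values are the same (or at least one is unknown), else 1.
            differ = x is not None and y is not None and x != y
            dist.append(factor * (1.0 if differ else 0.0))
        elif x is None or y is None:
            # Unknown continuous value: the distance is 0.5.
            dist.append(0.5)
        else:
            dist.append(abs(x - y) * factor)
    return dist


kinds = [CONTINUOUS, NOMINAL, IGNORED, CONTINUOUS]
# Observed values per feature, used only to derive the ranges.
columns = [[0.0, 4.0, 2.0], None, None, [10.0, 20.0, None]]
factors = [normalizer(k, c or ()) for k, c in zip(kinds, columns)]

a = (1.0, "red", 7, 10.0)
b = (4.0, "blue", 9, None)
dist = feature_distances(kinds, factors, a, b)

maximal = max(dist)                        # Maximal
manhattan = sum(abs(x) for x in dist)      # Manhattan
euclidean = sqrt(sum(x * x for x in dist)) # Euclidean
```

Keeping the normalization separate from the aggregation mirrors the split the documentation describes: the three measures differ only in how they combine the same list of per-feature distances.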