Timestamp: 02/06/12 16:33:23 (2 years ago)
Author: markotoplak
Branch: default
Message: Moved instance_distance_matrix to Orange.distance.distance_matrix

File: 1 edited

  • docs/reference/rst/Orange.distance.rst

--- docs/reference/rst/Orange.distance.rst (r9720)
+++ docs/reference/rst/Orange.distance.rst (r9752)
@@ -1,5 +1,3 @@
 .. py:currentmodule:: Orange.distance
-
-.. automodule:: Orange.distance
 
 ##########################################
@@ -24,4 +22,6 @@
 between two unknown values is always 0.5.
 
+.. autofunction:: distance_matrix
+
 .. class:: Distance
 
@@ -34,114 +34,106 @@
     .. method:: __call__([instances, weightID][, distributions][, basic_var_stat])
 
-        Constructs an :obj:`Distance`.  Not all the data needs to be
-        given. Most measures can be constructed from basic_var_stat;
-        if it is not given, they can help themselves either by instances
-        or distributions. Some do not need any arguments.
+        Constructs an :obj:`Distance`. Not all arguments are required.
+        Most measures can be constructed from basic_var_stat; if it is
+        not given, instances or distributions can be used.
 
 .. class:: DistanceNormalized
 
-    This abstract class provides a function which is given two instances
-    and returns a list of normalized distances between values of their
-    features. Many distance measuring classes need such a function and are
-    therefore derived from this class
+    An abstract class that provides normalization.
 
     .. attribute:: normalizers
 
-        A precomputed list of normalizing factors for feature values
+        A precomputed list of normalizing factors for feature values. They are:
 
-        - If a factor positive, differences in feature's values
-          are multiplied by it; for continuous features the factor
-          would be 1/(max_value-min_value) and for ordinal features
-          the factor is 1/number_of_values. If either (or both) of
-          features are unknown, the distance is 0.5
-        - If a factor is -1, the feature is nominal; the distance
-          between two values is 0 if they are same (or at least
-          one is unknown) and 1 if they are different.
-        - If a factor is 0, the feature is ignored.
+        - 1/(max_value-min_value) for continuous and 1/number_of_values
+          for ordinal features.
+          If either feature is unknown, the distance is 0.5. Such factors
+          are used to multiply differences in feature's values.
+        - ``-1`` for nominal features; the distance
+          between two values is 0 if they are same (or at least one is
+          unknown) and 1 if they are different.
+        - ``0`` for ignored features.
 
     .. attribute:: bases, averages, variances
 
         The minimal values, averages and variances
-        (continuous features only)
+        (continuous features only).
 
     .. attribute:: domain_version
 
-        The domain version increases each time a domain description is
-        changed (i.e. features are added or removed); this checks
-        that the user is not attempting to measure distances between
-        instances that do not correspond to normalizers.
+        The domain version changes each time a domain description is
+        changed (i.e. features are added or removed).
 
-    .. method:: attribute_distances(instance1, instance2)
+    .. method:: feature_distances(instance1, instance2)
 
-        Return a list of floats representing distances between pairs of
-        feature values of the two instances.
+        Return a list of floats representing normalized distances between
+        pairs of feature values of the two instances.
 
-.. class:: HammingConstructor
 .. class:: Hamming
+.. class:: HammingDistance
 
-    Hamming distance between two instances is defined as the number of
-    features in which the two instances differ. Note that this measure
-    is not really appropriate for instances that contain continuous features.
+    The number of features in which the two instances differ. This measure
+    is not appropriate for instances that contain continuous features.
 
-.. class:: MaximalConstructor
 .. class:: Maximal
+.. class:: MaximalDistance
 
-    The maximal between two instances is defined as the maximal distance
+    The maximal distance
     between two feature values. If dist is the result of
-    DistanceNormalized.attribute_distances,
-    then Maximal returns max(dist).
+    ~:obj:`DistanceNormalized.feature_distances`,
+    then :class:`Maximal` returns ``max(dist)``.
 
-.. class:: ManhattanConstructor
 .. class:: Manhattan
+.. class:: ManhattanDistance
 
-    Manhattan distance between two instances is a sum of absolute values
+    The sum of absolute values
     of distances between pairs of features, e.g. ``sum(abs(x) for x in dist)``
-    where dist is the result of ExamplesDistance_Normalized.attributeDistances.
+    where dist is the result of ~:obj:`DistanceNormalized.feature_distances`.
 
-.. class:: EuclideanConstructor
 .. class:: Euclidean
+.. class:: EuclideanDistance
 
-    Euclidean distance is a square root of sum of squared per-feature distances,
+    The square root of sum of squared per-feature distances,
     i.e. ``sqrt(sum(x*x for x in dist))``, where dist is the result of
-    ExamplesDistance_Normalized.attributeDistances.
+    ~:obj:`DistanceNormalized.feature_distances`.
 
     .. method:: distributions
 
-        An object of type
-        :obj:`~Orange.statistics.distribution.Distribution` that holds
+        A :obj:`~Orange.statistics.distribution.Distribution` containing
         the distributions for all discrete features used for
         computation of distances between known and unknown values.
 
-    .. method:: bothSpecialDist
+    .. method:: both_special_dist
 
         A list containing the distance between two unknown values for each
         discrete feature.
 
-    This measure of distance deals with unknown values by computing the
-    expected square of distance based on the distribution obtained from the
+    Unknown values are handled by computing the
+    expected square of distance based on the distribution from the
     "training" data. Squared distance between
 
-        - A known and unknown continuous attribute equals squared distance
-          between the known and the average, plus variance
-        - Two unknown continuous attributes equals double variance
-        - A known and unknown discrete attribute equals the probability
-          that the unknown attribute has different value than the known
-          (i.e., 1 - probability of the known value)
-        - Two unknown discrete attributes equals the probability that two
+        - A known and unknown continuous feature equals squared distance
+          between the known and the average, plus variance.
+        - Two unknown continuous features equals double variance.
+        - A known and unknown discrete feature equals the probability
+          that the unknown feature has different value than the known
+          (i.e., 1 - probability of the known value).
+        - Two unknown discrete features equals the probability that two
           random chosen values are equal, which can be computed as
           1 - sum of squares of probabilities.
 
-    Continuous cases can be handled by averages and variances inherited from
-    ExamplesDistance_normalized. The data for discrete cases are stored in
-    distributions (used for unknown vs. known value) and in bothSpecial
-    (the precomputed distance between two unknown values).
+    Continuous cases are handled as inherited from
+    :class:`DistanceNormalized`. The data for discrete cases are
+    stored in distributions (used for unknown vs. known value) and
+    in :obj:`both_special_dist` (the precomputed distance between two
+    unknown values).
 
-.. class:: ReliefConstructor
 .. class:: Relief
+.. class:: ReliefDistance
 
-    Relief is similar to Manhattan distance, but incorporates a more
-    correct treatment of undefined values, which is used by ReliefF measure.
+    Relief is similar to Manhattan distance, but incorporates the
+    treatment of undefined values, which is used by ReliefF measure.
 
-This class is derived directly from ExamplesDistance, not from ExamplesDistance_Normalized.
+    This class is derived directly from :obj:`Distance`.
 
 
@@ -149,10 +141,12 @@
     :members:
 
-.. autoclass:: SpearmanR
+.. autoclass:: PearsonRDistance
     :members:
 
-.. autoclass:: PearsonRConstructor
+.. autoclass:: SpearmanR
     :members:
 
 .. autoclass:: SpearmanRConstructor
     :members:
+
+
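
For readers of this changeset, the normalizer rules and the Maximal, Manhattan, Euclidean and Hamming definitions documented in the changed file can be sketched in plain Python. This is an illustration only and does not use or reproduce Orange's implementation: the function names, the use of ``None`` for an unknown value, and the choice to let ignored features contribute 0.0 are all assumptions of this sketch.

```python
import math

UNKNOWN = None  # assumption: None stands in for an unknown value


def feature_distances(x, y, normalizers):
    """Per-feature normalized distances, following the documented rules."""
    dist = []
    for a, b, factor in zip(x, y, normalizers):
        if factor == 0:
            # Ignored feature (assumption: it contributes 0.0).
            dist.append(0.0)
        elif factor == -1:
            # Nominal: 0 if equal or at least one is unknown, else 1.
            same = a is UNKNOWN or b is UNKNOWN or a == b
            dist.append(0.0 if same else 1.0)
        elif a is UNKNOWN or b is UNKNOWN:
            # Continuous or ordinal with an unknown value.
            dist.append(0.5)
        else:
            # Continuous or ordinal: scale the difference by the factor.
            dist.append(abs(a - b) * factor)
    return dist


def maximal(dist):
    return max(dist)


def manhattan(dist):
    return sum(abs(d) for d in dist)


def euclidean(dist):
    return math.sqrt(sum(d * d for d in dist))


def hamming(x, y):
    """Number of features in which the two instances differ."""
    return sum(1 for a, b in zip(x, y) if a != b)


# Features: continuous in [1, 5], nominal, ordinal with 10 values, ignored.
normalizers = [1.0 / (5 - 1), -1, 1.0 / 10, 0]
x = [1.0, "red", 3, 7]
y = [3.0, "blue", UNKNOWN, 9]
dist = feature_distances(x, y, normalizers)
```

With these inputs ``dist`` is ``[0.5, 1.0, 0.5, 0.0]``, so ``maximal(dist)`` is 1.0 and ``manhattan(dist)`` is 2.0.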
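
The four rules for unknown values listed under the Euclidean class can be written out as a small plain-Python sketch. It is not Orange code: the helper names, the ``None`` marker for unknown values, and the dictionary of value probabilities are hypothetical, and normalization is left out. Note that ``1 - sum of squares of probabilities`` is the probability that two randomly chosen values differ, which is what the sketch computes for two unknown discrete values.

```python
def expected_sq_continuous(a, b, average, variance):
    """Expected squared distance for one continuous feature."""
    if a is None and b is None:
        # Two unknown values: double variance.
        return 2 * variance
    if a is None or b is None:
        # One unknown: squared distance to the average, plus variance.
        known = b if a is None else a
        return (known - average) ** 2 + variance
    return (a - b) ** 2


def expected_sq_discrete(a, b, probs):
    """Expected squared distance for one discrete feature.

    probs maps each value to its probability in the "training" data.
    """
    if a is None and b is None:
        # Two unknown values: 1 - sum of squares of probabilities.
        return 1 - sum(p * p for p in probs.values())
    if a is None or b is None:
        # One unknown: 1 - probability of the known value.
        known = b if a is None else a
        return 1 - probs[known]
    return 0.0 if a == b else 1.0


# A feature with average 1.0 and variance 2.0, and a fair two-valued feature.
known_vs_unknown = expected_sq_continuous(3.0, None, 1.0, 2.0)
both_unknown = expected_sq_continuous(None, None, 1.0, 2.0)
probs = {"a": 0.5, "b": 0.5}
discrete_both_unknown = expected_sq_discrete(None, None, probs)
```

The Euclidean distance would then be the square root of the sum of these per-feature terms.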