Changeset 7532:b19b076bfb0a in orange


Timestamp:
02/04/11 21:11:10
Author:
anze <anze.staric@…>
Branch:
default
Convert:
714b3951f9d07ba135ccbeb9d0ee0402ada7b49a
Message:

review

File:
1 edited

  • orange/Orange/projection/mds.py

    r7493 r7532  
    55   single: projection; multidimensional scaling (mds) 
    66 
    7 The functionality to perform multidimensional scaling. 
    8  
    9 ======================== 
     7The functionality to perform multidimensional scaling 
     8(http://en.wikipedia.org/wiki/Multidimensional_scaling). 
     9 
     10************************ 
    1011Multidimensional Scaling 
    11 ======================== 
     12************************ 
    1213 
    1314The main class to perform multidimensional scaling is 
     
    1617.. autoclass:: Orange.projection.mds.MDS 
    1718   :members: 
    18    :show-inheritance: 
     19   :exclude-members: Torgerson, getDistance, getStress 
    1920 
    2021Stress functions 
    21 ---------------- 
     22================ 
    2223 
    2324Stress functions that can be used for MDS have to be implemented as functions 
     
    4445   * Orange.projection.mds.SgnSammonStress 
    4546 
    46 ============== 
    4747Usage Examples 
    4848============== 
     
    5151--------------- 
    5252 
    53 In our first example, we will take iris data set, compute the Euclidean 
    54 distance between the instances and then run MDS on a distance matrix. 
    55  
    56 .. 
    57    Other distance functions that can be used are described in TODO. 
    58  
    59 part of `mds-scatterplot.py`_ (uses `iris.tab`_) 
     53The following script computes the Euclidean distance between the data instances 
     54of the iris dataset and runs MDS on a distance matrix. Coordinates computed 
     55with MDS are plotted using matplotlib (not included with orange,  
     56http://matplotlib.sourceforge.net/). 
     57 
     58Example (`mds-scatterplot.py`_, uses `iris.tab`_) 
    6059 
    6160.. literalinclude:: code/mds-scatterplot.py 
    62     :lines: 7-21 
     61    :lines: 7- 
    6362 
    6463.. _mds-scatterplot.py: code/mds-scatterplot.py 
    6564.. _iris.tab: code/iris.tab 
    6665 
    67 Notice that we are running MDS through 100 iterations. We will now use 
    68 matplotlib to plot the data points using the coordinates computed with MDS (you 
    69 need to install matplotlib, it does not come with Orange). Each data point in 
    70 iris is classified in one of the three classes, so we will use colors to denote 
    71 instance's class. 
    72  
    73 part of `mds-scatterplot.py`_ (uses `iris.tab`_) 
    74  
    75 .. literalinclude:: code/mds-scatterplot.py 
    76     :lines: 27-39 
    77  
    78 After running this script, a file named *mds-scatterplot.py.png* appears on the 
    79 filesystem: 
     66The script produces a file named *mds-scatterplot.py.png*. Color denotes 
     67point's class. Iris is a relatively simple data set with respect to 
     68classification; to no surprise we see that MDS finds such instance 
     69placement in 2D where instances of different classes are well separated. 
     70Note that MDS has no knowledge of points' classes. 
    8071 
    8172.. image:: files/mds-scatterplot.png 
    8273 
    83 Iris is a relatively simple data set with respect to classification, and to no 
    84 surprise we see that MDS found such instance placement in 2D where instances of 
    85 different classes are well separated. Notice also that MDS does this with no 
    86 knowledge of the instance classes. 
     74 
    8775 
    8876A more advanced example 
    8977----------------------- 
    9078 
    91 We are going to write a script that performs 10 steps of Smacof optimization 
    92 before computing the stress. This is suitable if you have a large dataset and 
    93 want to save some time. First we load the data and compute the distance matrix 
    94 (just like in our previous example). 
    95  
    96 part of `mds-advanced.py`_ (uses `iris.tab`_) 
     79The following script performs 10 steps of Smacof optimization before computing 
     80the stress. This is suitable if you have a large dataset and want to save some 
     81time. 
     82 
     83Example (`mds-advanced.py`_, uses `iris.tab`_) 
    9784 
    9885.. literalinclude:: code/mds-advanced.py 
    99     :lines: 7-18 
     86    :lines: 7- 
    10087 
    10188.. _mds-advanced.py: code/mds-advanced.py 
    102  
    103 Then we construct the MDS instance and perform the initial Torgerson 
    104 approximation, after which we update the stress matrix using the 
    105 Orange.projection.mds.KruskalStress function. 
    106  
    107 part of `mds-advanced.py`_ (uses `iris.tab`_) 
    108  
    109 .. literalinclude:: code/mds-advanced.py 
    110     :lines: 20-23 
    111  
    112 And finally the main optimization loop, after which we print the projected 
    113 points along with the data: 
    114  
    115 .. literalinclude:: code/mds-advanced.py 
    116     :lines: 25-37 
    11789 
    11890A few representative lines of the output are:: 
     
    208180    Main class for performing multidimensional scaling. 
    209181     
    210     Constructor takes the following parameters: 
    211      
    212182    :param distances: original dissimilarity - a distance matrix to operate on. 
    213183    :type distances: :class:`Orange.core.SymMatrix` 
     
    228198    .. attribute:: distances 
    229199     
    230        An :class:`Orange.core.SymMatrix` that contains the distances that we 
     200       An :class:`Orange.core.SymMatrix` containing the distances that we 
    231201       want to achieve (LSMT changes these). 
    232202        
    233203    .. attribute:: projectedDistances 
    234204 
    235        An :class:`Orange.core.SymMatrix` that contains the distances between 
     205       An :class:`Orange.core.SymMatrix` containing the distances between 
    236206       projected points. 
    237207        
    238208    .. attribute:: originalDistances 
    239209 
    240        An :class:`Orange.core.SymMatrix` that contains the original distances 
     210       An :class:`Orange.core.SymMatrix` containing the original distances 
    241211       between points. 
    242212        
     
    247217    .. attribute:: dim 
    248218 
    249        An int holding the dimension of the projected space. 
     219       An integer holding the dimension of the projected space. 
    250220        
    251221    .. attribute:: n 
     
    302272    def calcDistance(self): 
    303273        """ 
    304         Compute the distances between points and updates the 
     274        Compute the distances between points and update the 
    305275        :obj:`projectedDistances` matrix. 
    306276         
     
    313283        """ 
    314284        Compute the stress between the current :obj:`projectedDistances` and 
    315         :obj:`distances` matrix using *stressFunc* and updates the 
     285        :obj:`distances` matrix using *stressFunc* and update the 
    316286        :obj:`stress` matrix and :obj:`avgStress` accordingly. 
    317287         
     
    329299            progressCallback=None): 
    330300        """ 
    331         A convenience function that performs optimization until stopping 
    332         conditions are met. Stopping conditions are: 
     301        Perform optimization until stopping conditions are met. 
     302        Stopping conditions are: 
    333303            
    334304           * optimization runs for *iter* iterations of SMACOFstep function, or 
     
    336306             eps * old stress. 
    337307         
    338         :param iter: maximal number of optimization iterations. 
     308        :param iter: maximum number of optimization iterations. 
    339309        :type iter: int 
    340310         
    341         :param stressFunc: stress function 
     311        :param stressFunc: stress function. 
    342312        """ 
    343313        self.optimize(iter, stressFunc, eps, progressCallback) 
    344314 
    345     def Torgerson(self): 
    346         """ 
    347         Run the torgerson algorithm that computes an initial analytical 
     315    def torgerson(self): 
     316        """ 
     317        Run the Torgerson algorithm that computes an initial analytical 
    348318        solution of the problem. 
    349319         
     
    387357#        X = matrixmultiply(U,D) 
    388358#        self.X = take(X,idx,1) 
    389  
     359    Torgerson = torgerson 
    390360    # Kruskal's monotone transformation 
    391361    def LSMT(self): 
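
The last hunk renames ``Torgerson`` to ``torgerson`` and keeps the old name as a class-level alias, while the documentation hunk lists ``Torgerson`` under ``:exclude-members:`` so only the new name is documented. A minimal sketch of that aliasing pattern follows. ``MDSSketch`` and its placeholder method body are hypothetical stand-ins for illustration, not the Orange implementation:

```python
class MDSSketch(object):
    """Hypothetical stand-in for the MDS class; not Orange code."""

    def torgerson(self):
        # Placeholder body: the real method computes an initial
        # analytical solution; here we only return a marker string.
        return "torgerson called"

    # Old capitalized name kept as an alias, so existing callers
    # that use Torgerson() keep working after the rename.
    Torgerson = torgerson


mds = MDSSketch()
print(mds.torgerson())   # new, documented name
print(mds.Torgerson())   # old name, same underlying function
```

Because the alias is a plain class attribute bound to the same function object, both names always behave identically, and the old name can later be dropped without touching the method itself.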