# Changeset 7532:b19b076bfb0a in orange

Timestamp:
02/04/11 21:11:10 (3 years ago)
Branch:
default
Convert:
714b3951f9d07ba135ccbeb9d0ee0402ada7b49a
Message:

review

File:
1 edited

• ## orange/Orange/projection/mds.py

r7493
   single: projection; multidimensional scaling (mds)

The functionality to perform multidimensional scaling.
========================
The functionality to perform multidimensional scaling (http://en.wikipedia.org/wiki/Multidimensional_scaling).
************************
Multidimensional Scaling
========================
************************

The main class to perform multidimensional scaling is

.. autoclass:: Orange.projection.mds.MDS
   :members:
   :show-inheritance:
   :exclude-members: Torgerson, getDistance, getStress

Stress functions
----------------
================

Stress functions that can be used for MDS have to be implemented as functions
   * Orange.projection.mds.SgnSammonStress

==============
Usage Examples
==============
---------------

In our first example, we will take iris data set, compute the Euclidean distance between the instances and then run MDS on a distance matrix.
.. Other distance functions that can be used are described in TODO.
part of `mds-scatterplot.py`_ (uses `iris.tab`_)
The following script computes the Euclidean distance between the data instances of the iris dataset and runs MDS on a distance matrix. Coordinates computed with MDS are plotted using matplotlib (not included with orange, http://matplotlib.sourceforge.net/).
Example (`mds-scatterplot.py`_, uses `iris.tab`_)

.. literalinclude:: code/mds-scatterplot.py
   :lines: 7-21
   :lines: 7-

.. _mds-scatterplot.py: code/mds-scatterplot.py
.. _iris.tab: code/iris.tab

Notice that we are running MDS through 100 iterations.
We will now use matplotlib to plot the data points using the coordinates computed with MDS (you need to install matplotlib, it does not come with Orange). Each data point in iris is classified in one of the three classes, so we will use colors to denote instance's class.
part of `mds-scatterplot.py`_ (uses `iris.tab`_)

.. literalinclude:: code/mds-scatterplot.py
   :lines: 27-39

After running this script, a file named *mds-scatterplot.py.png* appears on the filesystem:
The script produces a file named *mds-scatterplot.py.png*. Color denotes point's class. Iris is a relatively simple data set with respect to classification; to no surprise we see that MDS finds such instance placement in 2D where instances of different classes are well separated. Note that MDS has no knowledge of points' classes.

.. image:: files/mds-scatterplot.png

Iris is a relatively simple data set with respect to classification, and to no surprise we see that MDS found such instance placement in 2D where instances of different classes are well separated. Notice also that MDS does this with no knowledge of the instance classes.

A more advanced example
-----------------------

We are going to write a script that performs 10 steps of Smacof optimization before computing the stress. This is suitable if you have a large dataset and want to save some time. First we load the data and compute the distance matrix (just like in our previous example).
part of `mds-advanced.py`_ (uses `iris.tab`_)
The following script performs 10 steps of Smacof optimization before computing the stress. This is suitable if you have a large dataset and want to save some time.
Example (`mds-advanced.py`_, uses `iris.tab`_)

.. literalinclude:: code/mds-advanced.py
   :lines: 7-18
   :lines: 7-

.. _mds-advanced.py: code/mds-advanced.py

Then we construct the MDS instance and perform the initial Torgerson approximation, after which we update the stress matrix using the Orange.projection.mds.KruskalStress function.
part of `mds-advanced.py`_ (uses `iris.tab`_)

.. literalinclude:: code/mds-advanced.py
   :lines: 20-23

And finally the main optimization loop, after which we print the projected points along with the data:

.. literalinclude:: code/mds-advanced.py
   :lines: 25-37

A few representative lines of the output are::

    Main class for performing multidimensional scaling.

    Constructor takes the following parameters:

    :param distances: original dissimilarity - a distance matrix to operate on.
    :type distances: :class:`Orange.core.SymMatrix`

    .. attribute:: distances

       An :class:`Orange.core.SymMatrix` that contains the distances that we
       An :class:`Orange.core.SymMatrix` containing the distances that we
       want to achieve (LSMT changes these).

    .. attribute:: projectedDistances

       An :class:`Orange.core.SymMatrix` that contains the distances between
       An :class:`Orange.core.SymMatrix` containing the distances between
       projected points.

    .. attribute:: originalDistances

       An :class:`Orange.core.SymMatrix` that contains the original distances
       An :class:`Orange.core.SymMatrix` containing the original distances
       between points.

    .. attribute:: dim

       An int holding the dimension of the projected space.
       An integer holding the dimension of the projected space.

    .. attribute:: n

    def calcDistance(self):
        """
        Compute the distances between points and updates the
        Compute the distances between points and update the
        :obj:`projectedDistances` matrix.

        """
        Compute the stress between the current :obj:`projectedDistances` and
        :obj:`distances` matrix using *stressFunc* and updates the
        :obj:`distances` matrix using *stressFunc* and update the
        :obj:`stress` matrix and :obj:`avgStress` accordingly.

            progressCallback=None):
        """
        A convenience function that performs optimization until stopping conditions are met. Stopping conditions are:
        Perform optimization until stopping conditions are met. Stopping conditions are:

           * optimization runs for *iter* iterations of SMACOFstep function, or
             eps * old stress.

        :param iter: maximal number of optimization iterations.
        :param iter: maximum number of optimization iterations.
        :type iter: int

        :param stressFunc: stress function
        :param stressFunc: stress function.
        """
        self.optimize(iter, stressFunc, eps, progressCallback)

    def Torgerson(self):
        """
        Run the torgerson algorithm that computes an initial analytical
    def torgerson(self):
        """
        Run the Torgerson algorithm that computes an initial analytical
        solution of the problem.

    #        X = matrixmultiply(U,D)
    #        self.X = take(X,idx,1)

    Torgerson = torgerson

    # Kruskal's monotone transformation
    def LSMT(self):
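The docstring in this changeset describes the Torgerson step only as "an initial analytical solution of the problem". For readers unfamiliar with it, the textbook form of classical (Torgerson) scaling can be sketched as below: double-center the squared distances, then scale the leading eigenvectors by the square roots of their eigenvalues. This is a minimal NumPy sketch under stated assumptions, not the code in `mds.py`: it takes a plain dense symmetric array instead of an `Orange.core.SymMatrix`, and the name `torgerson_sketch` is hypothetical.

```python
import numpy as np


def torgerson_sketch(distances, dim=2):
    """Classical (Torgerson) scaling on a dense distance matrix.

    Hypothetical helper for illustration; not part of Orange.
    """
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Centering matrix J = I - (1/n) * 1 1^T
    j = np.eye(n) - np.ones((n, n)) / n
    # Double-centered matrix B = -1/2 * J * D^2 * J
    b = -0.5 * j @ (d ** 2) @ j
    # B is symmetric, so eigh applies; eigenvalues come back ascending
    eigvals, eigvecs = np.linalg.eigh(b)
    # Keep the `dim` largest eigenpairs
    idx = np.argsort(eigvals)[::-1][:dim]
    # Clip tiny negative eigenvalues caused by rounding before the sqrt
    scale = np.sqrt(np.clip(eigvals[idx], 0.0, None))
    return eigvecs[:, idx] * scale


# Four corners of a 3 x 4 rectangle; their distances are exactly Euclidean.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = torgerson_sketch(dist, dim=2)
# Pairwise distances between the recovered coordinates
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

For distances that come from points in a 2D plane, as here, the recovered configuration reproduces the input distances up to rotation and reflection. For general dissimilarities it is only a starting point, which is presumably why the advanced example follows it with SMACOF iterations.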