Changeset 7532:b19b076bfb0a in orange
Timestamp: 02/04/11 21:11:10 (3 years ago)
Branch: default
Convert: 714b3951f9d07ba135ccbeb9d0ee0402ada7b49a
File: 1 edited

orange/Orange/projection/mds.py
--- orange/Orange/projection/mds.py (r7493)
+++ orange/Orange/projection/mds.py (r7532)
@@ -5,9 +5,10 @@
    single: projection; multidimensional scaling (mds)
 
-The functionality to perform multidimensional scaling.
-
-========================
+The functionality to perform multidimensional scaling
+(http://en.wikipedia.org/wiki/Multidimensional_scaling).
+
+************************
 Multidimensional Scaling
-========================
+************************
 
 The main class to perform multidimensional scaling is
@@ -16,8 +17,8 @@
 .. autoclass:: Orange.projection.mds.MDS
    :members:
-   :show-inheritance:
+   :exclude-members: Torgerson, getDistance, getStress
 
 Stress functions
-----------------
+================
 
 Stress functions that can be used for MDS have to be implemented as functions
@@ -44,5 +45,4 @@
    * Orange.projection.mds.SgnSammonStress
 
-==============
 Usage Examples
 ==============
@@ -51,68 +51,40 @@
 
 
-In our first example, we will take iris data set, compute the Euclidean
-distance between the instances and then run MDS on a distance matrix.
-
-..
-   Other distance functions that can be used are described in TODO.
-
-part of `mdsscatterplot.py`_ (uses `iris.tab`_)
+The following script computes the Euclidean distance between the data instances
+of the iris dataset and runs MDS on a distance matrix. Coordinates computed
+with MDS are plotted using matplotlib (not included with orange,
+http://matplotlib.sourceforge.net/).
+
+Example (`mdsscatterplot.py`_, uses `iris.tab`_)
 
 .. literalinclude:: code/mdsscatterplot.py
-    :lines: 7-21
+    :lines: 7-
 
 .. _mdsscatterplot.py: code/mdsscatterplot.py
 .. _iris.tab: code/iris.tab
 
-Notice that we are running MDS through 100 iterations. We will now use
-matplotlib to plot the data points using the coordinates computed with MDS (you
-need to install matplotlib, it does not come with Orange). Each data point in
-iris is classified in one of the three classes, so we will use colors to denote
-instance's class.
-
-part of `mdsscatterplot.py`_ (uses `iris.tab`_)
-
-.. literalinclude:: code/mdsscatterplot.py
-    :lines: 27-39
-
-After running this script, a file named *mdsscatterplot.py.png* appears on the
-filesystem:
+The script produces a file named *mdsscatterplot.py.png*. Color denotes
+point's class. Iris is a relatively simple data set with respect to
+classification; to no surprise we see that MDS finds such instance
+placement in 2D where instances of different classes are well separated.
+Note that MDS has no knowledge of points' classes.
 
 .. image:: files/mdsscatterplot.png
 
-Iris is a relatively simple data set with respect to classification, and to no
-surprise we see that MDS found such instance placement in 2D where instances of
-different classes are well separated. Notice also that MDS does this with no
-knowledge of the instance classes.
+
 
 A more advanced example
 -----------------------
 
-We are going to write a script that performs 10 steps of Smacof optimization
-before computing the stress. This is suitable if you have a large dataset and
-want to save some time. First we load the data and compute the distance matrix
-(just like in our previous example).
-
-part of `mdsadvanced.py`_ (uses `iris.tab`_)
+The following script performs 10 steps of Smacof optimization before computing
+the stress. This is suitable if you have a large dataset and want to save some
+time.
+
+Example (`mdsadvanced.py`_, uses `iris.tab`_)
 
 .. literalinclude:: code/mdsadvanced.py
-    :lines: 7-18
+    :lines: 7-
 
 .. _mdsadvanced.py: code/mdsadvanced.py
-
-Then we construct the MDS instance and perform the initial Torgerson
-approximation, after which we update the stress matrix using the
-Orange.projection.mds.KruskalStress function.
-
-part of `mdsadvanced.py`_ (uses `iris.tab`_)
-
-.. literalinclude:: code/mdsadvanced.py
-    :lines: 20-23
-
-And finally the main optimization loop, after which we print the projected
-points along with the data:
-
-.. literalinclude:: code/mdsadvanced.py
-    :lines: 25-37
 
 A few representative lines of the output are::
@@ -208,6 +180,4 @@
     Main class for performing multidimensional scaling.
 
-    Constructor takes the following parameters:
-
     :param distances: original dissimilarity - a distance matrix to operate on.
     :type distances: :class:`Orange.core.SymMatrix`
@@ -228,15 +198,15 @@
     .. attribute:: distances
 
-       An :class:`Orange.core.SymMatrix` that contains the distances that we
+       An :class:`Orange.core.SymMatrix` containing the distances that we
        want to achieve (LSMT changes these).
 
     .. attribute:: projectedDistances
 
-       An :class:`Orange.core.SymMatrix` that contains the distances between
+       An :class:`Orange.core.SymMatrix` containing the distances between
        projected points.
 
     .. attribute:: originalDistances
 
-       An :class:`Orange.core.SymMatrix` that contains the original distances
+       An :class:`Orange.core.SymMatrix` containing the original distances
        between points.
 
@@ -247,5 +217,5 @@
     .. attribute:: dim
 
-       An int holding the dimension of the projected space.
+       An integer holding the dimension of the projected space.
 
     .. attribute:: n
@@ -302,5 +272,5 @@
     def calcDistance(self):
         """
-        Compute the distances between points and updates the
+        Compute the distances between points and update the
         :obj:`projectedDistances` matrix.
 
@@ -313,5 +283,5 @@
         """
         Compute the stress between the current :obj:`projectedDistances` and
-        :obj:`distances` matrix using *stressFunc* and updates the
+        :obj:`distances` matrix using *stressFunc* and update the
         :obj:`stress` matrix and :obj:`avgStress` accordingly.
 
@@ -329,6 +299,6 @@
             progressCallback=None):
         """
-        A convenience function that performs optimization until stopping
-        conditions are met. Stopping conditions are:
+        Perform optimization until stopping conditions are met.
+        Stopping conditions are:
 
         * optimization runs for *iter* iterations of SMACOFstep function, or
@@ -336,14 +306,14 @@
           eps * old stress.
 
-        :param iter: maximal number of optimization iterations.
+        :param iter: maximum number of optimization iterations.
         :type iter: int
 
-        :param stressFunc: stress function
+        :param stressFunc: stress function.
         """
         self.optimize(iter, stressFunc, eps, progressCallback)
 
-    def Torgerson(self):
-        """
-        Run the torgerson algorithm that computes an initial analytical
+    def torgerson(self):
+        """
+        Run the Torgerson algorithm that computes an initial analytical
         solution of the problem.
 
@@ -387,5 +357,5 @@
         # X = matrixmultiply(U,D)
         # self.X = take(X,idx,1)
-
+    Torgerson = torgerson
     # Kruskal's monotone transformation
     def LSMT(self):
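Note on the rename in this changeset: the method is now defined as `torgerson`, and the class body rebinds the old name with `Torgerson = torgerson`, so existing callers should keep working. The sketch below illustrates that aliasing pattern in isolation. `MDSLike` and its call counter are hypothetical stand-ins for illustration only, not the real `Orange.projection.mds.MDS`.

```python
class MDSLike(object):
    """Hypothetical stand-in for a class whose method was renamed."""

    def __init__(self):
        self.calls = 0

    def torgerson(self):
        """New lowercase name; this body only counts how often it runs."""
        self.calls += 1
        return self.calls

    # Old spelling kept as an alias. Both names are bound to the same
    # function object, so code that still calls Torgerson() is unaffected.
    Torgerson = torgerson


mds = MDSLike()
mds.torgerson()   # new spelling
mds.Torgerson()   # old spelling, same method
print(mds.calls)
```

One caveat of this pattern: a subclass that overrides only `torgerson` does not change what `Torgerson` calls, because the alias stays bound to the original function. Listing `Torgerson` under `:exclude-members:` keeps the old name out of the generated documentation while leaving it callable.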