Changeset 10061:2769831a41e3 in orange


Timestamp: 02/08/12 10:53:16 (2 years ago)
Author: ales_erjavec
Branch: default
rebase_source: 83c341738af90d806dd99d52efef030e9106d1fb
Message: CamelCase to underscore. Minor changes to docstring formatting.

File: 1 edited

  • Orange/projection/som.py

    r9994 r10061  
    99   single: projection; self-organizing map (SOM) 
    1010 
    11 An implementation of `self-organizing map <http://en.wikipedia.org/wiki/Self-organizing_map>`_ algorithm (SOM).  
    12 SOM is an unsupervised learning  
    13 algorithm that infers low, typically two-dimensional discretized representation of the input space, 
    14 called a map. The map preserves topological properties of the input space, such that 
    15 the cells that are close in the map include data instances that are similar to each other. 
     11An implementation of the 
     12`self-organizing map <http://en.wikipedia.org/wiki/Self-organizing_map>`_ algorithm (SOM).  
     13SOM is an unsupervised learning algorithm that infers a low-dimensional,  
     14typically two-dimensional, discretized representation of the input 
     15space, called a map. The map preserves topological properties of the 
     16input space, such that the cells that are close in the map include data  
     17instances that are similar to each other. 
    1618 
    1719================================= 
     
    1921================================= 
    2022 
    21 The main class for inference of self-organizing maps is :obj:`SOMLearner`. The class initializes 
    22 the topology of the map and returns an inference objects which, given the data, performs the  
    23 optimization of the map::  
     23The main class for inference of self-organizing maps is :obj:`SOMLearner`.  
     24The class initializes the topology of the map and returns an inference 
     25object which, given the data, performs the optimization of the map::  
    2426 
    2527   import Orange 
     
    2931   map = som(data) 
    3032 
     33 
    3134.. autodata:: NeighbourhoodGaussian 
    3235 
     
    5962 
    6063Supervised learning requires class-labeled data. For training, 
    61 class information is first added to data instances as a regular feature  
    62 by extending the feature vectors accordingly. Next, the map is trained, and the 
    63 training data projected to nodes. Each node then classifies to the majority class. 
    64 For classification, the data instance is projected the cell, returning the associated class. 
    65 An example of the code that trains and then classifies on the same data set is:: 
     64class information is first added to data instances as a regular 
     65feature by extending the feature vectors accordingly. Next, the 
     66map is trained, and the training data projected to nodes. Each 
     67node then classifies to the majority class. The dimensions  
     68corresponding to the class features are then removed from the  
     69prototype vector of each node in the map. For classification,  
     70the data instance is projected to the best matching cell, returning  
     71the associated class. 
     72 
     73An example of the code that trains and then classifies on the same 
     74data set is:: 
    6675 
    6776    import Orange 
     
    156165# Inference of Self-Organizing Maps  
    157166 
     167from Orange.misc import deprecated_keywords, \ 
     168                        deprecated_attribute 
     169 
    158170class Solver(object): 
    159     """ SOM Solver class used to train the map. Supports batch and sequential training. 
    160     Based on ideas from `SOM Toolkit for Matlab <http://www.cis.hut.fi/somtoolbox>`_. 
     171    """ SOM Solver class used to train the map. Supports batch  
     172    and sequential training. Based on ideas from 
     173    `SOM Toolkit for Matlab <http://www.cis.hut.fi/somtoolbox>`_. 
    161174 
    162175    :param neighbourhood: neighborhood function id 
    163     :type neighbourhood: :obj:`NeighbourhoodGaussian`, :obj:`NeighbourhoodBubble`, or :obj:`NeighbourhoodEpanechicov` 
     176    :type neighbourhood: :obj:`NeighbourhoodGaussian`,  
     177        :obj:`NeighbourhoodBubble`, or :obj:`NeighbourhoodEpanechicov` 
     178         
    164179    :param radius_ini: initial radius 
    165180    :type radius_ini: int 
     181     
    166182    :param radius_fin: final radius 
    167183    :type radius_fin: int 
     184     
    168185    :param epoch: number of training iterations 
    169186    :type epoch: int 
    170     :param batch_train: if True run the batch training algorithm (default), else use the sequential one 
     187     
     188    :param batch_train: if True run the batch training algorithm  
     189        (default), else use the sequential one 
    171190    :type batch_train: bool 
     191     
    172192    :param learning_rate: learning rate for the sequential training algorithm 
    173193    :type learning_rate: float 
     194     
    174195    """ 
    175196     
     
    194215        return (1 - epoch/self.epochs)*self.learning_rate 
    195216             
    196     def __call__(self, data, map, progressCallback=None): 
    197         """ Train the map from data. Pass progressCallback function to report on the progress. 
     217    @deprecated_keywords({"progressCallback": "progress_callback"}) 
     218    def __call__(self, data, map, progress_callback=None): 
     219        """ Train the map from data. Pass progress_callback function to report on the progress. 
    198220        """ 
    199221        self.data = data 
     
    203225        self.bmu_cache = {} 
    204226        if self.batch_train: 
    205             self.train_batch(progressCallback) 
     227            self.train_batch(progress_callback) 
    206228        else: 
    207             self.train_sequential(progressCallback) 
     229            self.train_sequential(progress_callback) 
    208230        return self.map 
    209231 
    210     def train_sequential(self, progressCallback): 
     232    def train_sequential(self, progress_callback): 
    211233        """Sequential training algorithm.  
    212234        """ 
     
    225247                random.shuffle(ind) 
    226248            self.train_step_sequential(epoch, ind) 
    227             if progressCallback: 
    228                 progressCallback(100.0*epoch/self.epochs) 
     249            if progress_callback: 
     250                progress_callback(100.0*epoch/self.epochs) 
    229251            self.qerror.append(numpy.mean(numpy.sqrt(self.distances))) 
    230252#            print epoch, "q error:", numpy.mean(numpy.sqrt(self.distances)), self.radius(epoch) 
     
    263285            self.vectors[nonzero] = self.vectors[nonzero] - Dx[nonzero] * numpy.reshape(h, (len(h), 1)) 
    264286 
    265     def train_batch(self, progressCallback=None): 
     287    @deprecated_keywords({"progressCallback": "progress_callback"}) 
     288    def train_batch(self, progress_callback=None): 
    266289        """Batch training algorithm. 
    267290        """ 
     
    280303        for epoch in range(self.epochs): 
    281304            self.train_step_batch(epoch) 
    282             if progressCallback: 
    283                 progressCallback(100.0*epoch/self.epochs) 
     305            if progress_callback: 
     306                progress_callback(100.0*epoch/self.epochs) 
    284307            if False and epoch > 5 and numpy.mean(numpy.abs(numpy.array(self.qerror[-5:-1]) - self.qerror[-1])) <= self.eps: 
    285308                break 
     
    324347         
    325348        self.vectors[nonzero] = S[nonzero] / A[nonzero] 
    326  
     349         
    327350 
    328351class SOMLearner(orange.Learner): 
    329     """An implementation of self-organizing map. Considers an input data set, projects the data  
    330     instances onto a map, and returns a result in the form of a classifier holding projection 
    331     information together with an algorithm to project new data instances. Uses :obj:`Map` for 
    332     representation of projection space, :obj:`Solver` for training, and returns a trained  
    333     map with information on projection of the training data as crafted by :obj:`SOMMap`. 
     352    """An implementation of self-organizing map. Considers  
     353    an input data set, projects the data instances onto a map,  
     354    and returns a result in the form of a classifier holding  
     355    projection information together with an algorithm to project 
     356    new data instances. Uses :obj:`Map` for representation of  
     357    projection space, :obj:`Solver` for training, and returns a  
     358    trained map with information on projection of the training 
     359    data as crafted by :obj:`SOMMap`. 
    334360     
    335361    :param map_shape: dimension of the map 
    336362    :type map_shape: tuple 
    337     :param initialize: initialization type id; linear  
    338       initialization assigns the data to the cells according to its position in two-dimensional 
    339       principal component projection 
     363     
     364    :param initialize: initialization type id; linear initialization  
     365        assigns the data to the cells according to its position in 
     366        two-dimensional principal component projection     
    340367    :type initialize: :obj:`InitializeRandom` or :obj:`InitializeLinear` 
     368     
    341369    :param topology: topology type id 
    342370    :type topology: :obj:`HexagonalTopology` or :obj:`RectangularTopology` 
     371     
    343372    :param neighbourhood: cell neighborhood type id 
    344     :type neighbourhood: :obj:`NeighbourhoodGaussian`, obj:`NeighbourhoodBubble`, or obj:`NeighbourhoodEpanechicov` 
     373    :type neighbourhood: :obj:`NeighbourhoodGaussian`,  
     374        :obj:`NeighbourhoodBubble`, or :obj:`NeighbourhoodEpanechicov` 
     375         
    345376    :param batch_train: perform batch training? 
    346377    :type batch_train: bool 
     378     
    347379    :param learning_rate: learning rate 
    348380    :type learning_rate: float 
     381     
    349382    :param radius_ini: initial radius 
    350383    :type radius_ini: int 
     384     
    351385    :param radius_fin: final radius 
    352386    :type radius_fin: int 
     387     
    353388    :param epochs: number of epochs (iterations of the training step) 
    354389    :type epochs: int 
     390     
    355391    :param solver: a class with the optimization algorithm 
     392     
    356393    """ 
    357      
    358     def __new__(cls, examples=None, weightId=0, **kwargs): 
     394    @deprecated_keywords({"weightId": "weight_id"}) 
     395    def __new__(cls, examples=None, weight_id=0, **kwargs): 
    359396        self = orange.Learner.__new__(cls, **kwargs) 
    360397        if examples is not None: 
    361398            self.__init__(**kwargs) 
    362             return self.__call__(examples, weightId) 
     399            return self.__call__(examples, weight_id) 
    363400        else: 
    364401            return self 
     
    381418        orange.Learner.__init__(self, **kwargs) 
    382419         
    383     def __call__(self, data, weightID=0, progressCallback=None): 
     420    @deprecated_keywords({"weightID": "weight_id", 
     421                          "progressCallback": "progress_callback"}) 
     422    def __call__(self, data, weight_id=0, progress_callback=None): 
    384423        numdata, classes, w = data.toNumpyMA() 
    385424        map = Map(self.map_shape, topology=self.topology) 
     
    390429        map = self.solver(batch_train=self.batch_train, eps=self.eps, neighbourhood=self.neighbourhood, 
    391430                     radius_ini=self.radius_ini, radius_fin=self.radius_fin, learning_rate=self.learning_rate, 
    392                      epochs=self.epochs)(numdata, map, progressCallback=progressCallback) 
     431                     epochs=self.epochs)(numdata, map, progress_callback=progress_callback) 
    393432        return SOMMap(map, data) 
    394433 
     
    400439    :param data: the data to be mapped on the map 
    401440    :type data: :obj:`Orange.data.Table` 
     441     
    402442    """ 
    403443     
     
    406446        self.examples = data 
    407447        for node in map: 
    408             node.referenceExample = orange.Example(orange.Domain(self.examples.domain.attributes, False), 
     448            node.reference_example = orange.Example(orange.Domain(self.examples.domain.attributes, False), 
    409449                                                 [(var(value) if var.varType == orange.VarTypes.Continuous else var(int(value))) \ 
    410450                                                  for var, value in zip(self.examples.domain.attributes, node.vector)]) 
     
    412452 
    413453        for ex in self.examples: 
    414             node = self.getBestMatchingNode(ex) 
     454            node = self.get_best_matching_node(ex) 
    415455            node.examples.append(ex) 
    416456 
     
    423463            self.classVar = None 
    424464 
    425     def getBestMatchingNode(self, example): 
     465    def get_best_matching_node(self, example): 
    426466        """Return the best matching node for a given data instance 
    427467        """ 
     
    431471        bmu = ma.argmin(ma.sum(Dist**2, 1)) 
    432472        return list(self.map)[bmu] 
     473     
     474    getBestMatchingNode = \ 
     475        deprecated_attribute("getBestMatchingNode", 
     476                             "get_best_matching_node") 
    433477         
    434478    def __call__(self, example, what=orange.GetValue): 
    435         bmu = self.getBestMatchingNode(example) 
     479        bmu = self.get_best_matching_node(example) 
    436480        return bmu.classifier(example, what) 
    437481 
     
    456500 
    457501class SOMSupervisedLearner(SOMLearner): 
    458     """SOMSupervisedLearner is a class used to learn SOM from orange.ExampleTable, by using the 
    459     class information in the learning process. This is achieved by adding a value for each class 
    460     to the training instances, where 1.0 signals class membership and all other values are 0.0. 
    461     After the training, the new values are discarded from the node vectors. 
     502    """SOMSupervisedLearner is a class used to learn SOM from 
     503    orange.ExampleTable, by using the class information in the 
     504    learning process. This is achieved by adding a value for each 
     505    class to the training instances, where 1.0 signals class membership 
     506    and all other values are 0.0. After the training, the new values  
     507    are discarded from the node vectors. 
    462508     
    463509    :param data: class-labeled data set 
    464510    :type data: :obj:`Orange.data.Table` 
    465     :param progressCallback: a one argument function to report on inference progress (in %) 
     511    :param progress_callback: a one argument function to report  
     512        on inference progress (in %) 
     513         
    466514    """ 
    467     def __call__(self, examples, weightID=0, progressCallback=None): 
     515    @deprecated_keywords({"weightID": "weight_id", 
     516                          "progressCallback": "progress_callback"}) 
     517    def __call__(self, examples, weight_id=0, progress_callback=None): 
    468518        data, classes, w = examples.toNumpyMA() 
    469519        nval = len(examples.domain.classVar.values) 
     
    478528        map = Solver(batch_train=self.batch_train, eps=self.eps, neighbourhood=self.neighbourhood, 
    479529                     radius_ini=self.radius_ini, radius_fin=self.radius_fin, learning_rate=self.learning_rate, 
    480                      epoch=self.epochs)(data, map, progressCallback=progressCallback) 
     530                     epoch=self.epochs)(data, map, progress_callback=progress_callback) 
    481531        for node in map: 
    482532            node.vector = node.vector[:-nval] 
     
    493543        Node position. 
    494544 
    495     .. attribute:: referenceExample 
     545    .. attribute:: reference_example 
    496546 
    497547        Reference data instance (a prototype). 
     
    499549    .. attribute:: examples 
    500550     
    501         Data set with instances training instances that were mapped to the node.  
     551        Data set with the training instances that were mapped  
     552        to the node. 
     553          
    502554    """ 
    503555    def __init__(self, pos, map=None, vector=None): 
     
    505557        self.map = map 
    506558        self.vector = vector 
     559         
     560    referenceExample = deprecated_attribute("referenceExample", "reference_example") 
    507561 
    508562class Map(object): 
    509     """Self organizing map (the structure). Includes methods for data initialization. 
     563    """Self organizing map (the structure). Includes methods for 
     564    data initialization. 
    510565     
    511566    .. attribute:: map 
     
    516571     
    517572        Data set that was considered when optimizing the map. 
     573         
    518574    """ 
    519575     
     
    551607    def unit_distances(self): 
    552608        """Return a NxN numpy.array of internode distances (based on 
    553         node position in the map, not vector space) where N is the number of 
    554         nodes. 
     609        node position in the map, not vector space) where N is the  
     610        number of nodes. 
     611         
    555612        """ 
    556613        nodes = list(self) 
     
    564621 
    565622    def unit_coords(self): 
    566         """ Return the unit coordinates of all nodes in the map as an numpy.array. 
     623        """ Return the unit coordinates of all nodes in the map  
     624        as a numpy.array. 
     625         
    567626        """ 
    568627        nodes = list(self) 
     
    582641        """Initialize the map nodes vectors randomly, by supplying 
    583642        either training data or dimension of the data. 
     643         
    584644        """ 
    585645        if data is not None: 
     
    595655        """ Initialize the map node vectors linearly over the subspace 
    596656        of the two most significant eigenvectors. 
     657         
    597658        """ 
    598659        data = data.copy() #ma.array(data) 
     
    634695            node.vector = vectors[i] 
    635696 
    636     def getUMat(self): 
     697    def get_u_matrix(self): 
    637698        return get_u_matrix(self) 
     699     
     700    getUMat = deprecated_attribute("getUMat", "get_u_matrix") 
    638701         
    639702########################################################################## 
    640703# Supporting functions  
    641704 
    642 def getUMat(som): 
     705def get_u_matrix(som): 
    643706    dim1=som.map_shape[0]*2-1 
    644707    dim2=som.map_shape[1]*2-1 
     
    646709    a=numpy.zeros((dim1, dim2)) 
    647710    if som.topology == HexagonalTopology: 
    648         return __fillHex(a, som) 
     711        return __fill_hex(a, som) 
    649712    else: 
    650         return __fillRect(a, som) 
    651  
    652 def __fillHex(array, som): 
     713        return __fill_rect(a, som) 
     714     
     715def getUMat(som): 
     716    import warnings 
     717    warnings.warn("Deprecated function name 'getUMat'. Use 'get_u_matrix' instead.", 
     718                  DeprecationWarning) 
     719    return get_u_matrix(som) 
     720 
     721def __fill_hex(array, som): 
    653722    xDim, yDim = som.map_shape 
    654723##    for n in som.nodes: 
     
    672741    return array 
    673742 
    674 def __fillRect(array, som): 
     743def __fill_rect(array, som): 
    675744    xDim, yDim = som.map_shape 
    676745    d = dict([((i, j), som[i, j]) for i in range(xDim) for j in range(yDim)]) 
     
    698767    data = orange.ExampleTable("iris.tab") 
    699768    learner = SOMLearner() 
    700     learner = SOMLearner(batch_train=True, initialize=InitializeLinear, radius_ini=3, radius_fin=1, neighbourhood=Map.NeighbourhoodGaussian, epochs=1000) 
     769    learner = SOMLearner(batch_train=True, 
     770                         initialize=InitializeLinear,  
     771                         radius_ini=3, 
     772                         radius_fin=1, 
     773                         neighbourhood=Map.NeighbourhoodGaussian,  
     774                         epochs=1000) 
    701775    map = learner(data) 
    702776    for e in data: 
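The changeset keeps old callers working by wrapping the renamed functions with `deprecated_keywords` from `Orange.misc`. The implementation of that helper is not part of this diff, so the following is only a minimal sketch of how a keyword-renaming decorator of this kind could work. The name `rename_keywords` and the toy `train` function are hypothetical and are not Orange's API.

```python
import functools
import warnings


def rename_keywords(name_map):
    """Hypothetical stand-in for a keyword-renaming decorator.

    name_map maps old keyword names to their replacements,
    e.g. {"progressCallback": "progress_callback"}.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for old, new in name_map.items():
                if old in kwargs:
                    if new in kwargs:
                        raise TypeError(
                            "got both %r and its replacement %r" % (old, new))
                    # Warn the caller, then forward the value under the new name.
                    warnings.warn(
                        "%r is deprecated; use %r instead" % (old, new),
                        DeprecationWarning, stacklevel=2)
                    kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@rename_keywords({"progressCallback": "progress_callback"})
def train(data, progress_callback=None):
    """Toy training loop that reports progress in percent."""
    for i, _ in enumerate(data):
        if progress_callback:
            progress_callback(100.0 * i / len(data))
    return len(data)
```

With a wrapper like this, a call such as `train(data, progressCallback=report)` still runs but emits a `DeprecationWarning`, while `train(data, progress_callback=report)` passes through unchanged.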