Changeset 9020:99a0a9f4b1b3 in orange


Timestamp:
09/26/11 14:36:15
Author:
markotoplak
Branch:
default
Convert:
83d2ec2cd84e09c58502cb78a8842c66471e3f96
Message:

Orange.classification.svm updates.

File:
1 edited

  • orange/Orange/classification/svm/__init__.py

    r9013 r9020  
    88********************************* 
    99 
    10 This module includes classes that wrap the `LibSVM library 
     10This module wraps the `LibSVM library 
    1111<http://www.csie.ntu.edu.tw/~cjlin/libsvm/>`_, a library for `support vector 
    1212machines <http://en.wikipedia.org/wiki/Support_vector_machine>`_ (SVM). SVM  
    1313learners from LibSVM behave like ordinary Orange learners and can be 
    14 used as Python objects in training, classification and evaluation tasks. The 
    15 implementation supports Python-based kernels, that can be plugged-in into the 
     14used as Python objects for training, classification and evaluation. The 
     15implementation supports Python-based kernels that can be plugged into 
    1616LibSVM. 
    1717 
    1818.. note:: SVM can perform poorly on some data sets. Choose the parameters  
    19           carefully. In case of low classification accuracy, try scaling the  
    20           data and different parameters. :obj:`SVMLearnerEasy` class does this  
    21           automatically (similar to the `svm-easy.py`_ script in the LibSVM  
     19          carefully. In cases of low classification accuracy, try scaling the  
     20          data and different parameters. The :obj:`SVMLearnerEasy` class does this  
     21          automatically (it is similar to the `svm-easy.py` script in the LibSVM  
    2222          distribution). 
    2323           
     
    2525============ 
    2626 
    27 Choose an SVM learner suitable for the problem at hand. :obj:`SVMLearner` is a  
     27Choose an SVM learner suitable for the problem. :obj:`SVMLearner` is a  
    2828general SVM learner. Use :obj:`SVMLearnerSparse` to learn from the  
    29 :obj:`Orange.data.Table` meta attributes. :obj:`SVMLearnerEasy` will help with 
     29meta attributes. :obj:`SVMLearnerEasy` helps with 
    3030the data normalization and parameter tuning. Learn with a fast  
    31 :obj:`LinearLearner` on data sets with large number of features. 
    32  
    33 How to use SVM learners (`svm-easy.py`_ uses: `vehicle.tab`_):  
     31:obj:`LinearLearner` on data sets with a large number of features. 
     32 
     33The following example shows how to use SVM learners. :obj:`SVMLearnerEasy`,  
     34with automatic data preprocessing and parameter tuning,  
     35outperforms :obj:`SVMLearner` with the default :obj:`~SVMLearner.nu` and :obj:`~SVMLearner.gamma`:  
    3436     
    3537.. literalinclude:: code/svm-easy.py 
    36  
    37 :obj:`SVMLearnerEasy` with automatic data preprocessing and parameter tuning  
    38 outperforms :obj:`SVMLearner` with the default nu and gamma parameters. 
    3938 
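
The gist of `svm-easy.py`, as a minimal sketch rather than the shipped script
(it assumes the `vehicle` data set from the Orange distribution and the
`Orange.evaluation` cross-validation helpers)::

    import Orange

    data = Orange.data.Table("vehicle")
    learners = [Orange.classification.svm.SVMLearner(name="SVM"),
                Orange.classification.svm.SVMLearnerEasy(name="SVM easy")]
    # 5-fold cross validation; SVMLearnerEasy scales the data and tunes
    # parameters internally, so it typically scores higher here
    results = Orange.evaluation.testing.cross_validation(learners, data, folds=5)
    for learner, ca in zip(learners, Orange.evaluation.scoring.CA(results)):
        print "%s: %.3f" % (learner.name, ca)
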
    4039.. autoclass:: Orange.classification.svm.SVMLearner 
     
    5958.. automethod:: Orange.classification.svm.table_to_svm_format 
    6059 
    61 How to get lienear SVM weights (`svm-linear-weights.py`_,  
    62 uses: `brown-selected.tab`_): 
    63      
     60The following example shows how to get linear SVM weights:  
     61 
    6462.. literalinclude:: code/svm-linear-weights.py     
    6563 
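
In outline, the example trains a linear-kernel SVM and reads the weights off
the classifier; a sketch, assuming the `brown-selected` data set from the
Orange distribution::

    import Orange
    from Orange.classification.svm import SVMLearner, kernels, \
        get_linear_svm_weights

    data = Orange.data.Table("brown-selected")
    classifier = SVMLearner(kernel_type=kernels.Linear,
                            normalization=False)(data)
    weights = get_linear_svm_weights(classifier)
    for feature, weight in weights.items():
        print "%s: %.4f" % (feature.name, weight)
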
     
    7270.. _kernel-wrapper: 
    7371 
    74  
    7572Kernel wrappers 
    7673=============== 
    7774 
    78 Use kernel wrappers to build a custom kernel. All wrapper constructors take one 
     75Kernel wrappers are used to build custom kernels. All wrapper constructors take one 
    7976or more Python functions (`wrapped` attribute) to wrap. The function must be a 
    80 positive definite kernel, with two attributes (of type double) and return a  
    81 double 
     77positive definite kernel, taking two floating point parameters and returning a  
     78floating point number. 
    8279 
    8380.. autoclass:: Orange.classification.svm.kernels.KernelWrapper 
     
    108105   :members: 
    109106 
    110 Example (`svm-custom-kernel.py`_ uses: `iris.tab`_) 
     107Example: 
    111108 
    112109.. literalinclude:: code/svm-custom-kernel.py 
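
The same can be done with a hand-written kernel function, as long as it
follows the `kernel_func` contract (two :obj:`Orange.data.Instance` arguments
in, one float out). A sketch with an illustrative RBF-style kernel over
continuous features (the function itself is hypothetical, not part of the
module)::

    import math
    import Orange
    from Orange.classification.svm import SVMLearner, kernels

    def rbf_kernel(x, y):
        # squared Euclidean distance over the feature values
        # (list(x)[:-1] drops the class value)
        d = sum((float(a) - float(b)) ** 2
                for a, b in zip(list(x)[:-1], list(y)[:-1]))
        return math.exp(-0.5 * d)

    data = Orange.data.Table("iris")
    learner = SVMLearner(kernel_type=kernels.Custom, kernel_func=rbf_kernel)
    classifier = learner(data)
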
     
    182179def max_nu(data): 
    183180    """Return the maximum nu parameter for Nu_SVC support vector learning 
    184      for the given data table.  
    185      
    186     :param data: data table with continuous features 
     181     for the given data.  
     182     
     183    :param data: data with continuous features 
    187184    :type data: Orange.data.Table 
    188185     
     
    200197     
    201198class SVMLearner(_SVMLearner): 
    202     """:param svm_type: defines the type of SVM (can be C_SVC, Nu_SVC  
     199    """:param svm_type: defines the SVM type (can be C_SVC, Nu_SVC  
    203200        (default), OneClass, Epsilon_SVR, Nu_SVR) 
    204201    :type svm_type: SVMLearner.SVMType 
    205     :param kernel_type: defines the type of a kernel to use for learning 
     202    :param kernel_type: defines the kernel type for learning 
    206203        (can be kernels.RBF (default), kernels.Linear, kernels.Polynomial,  
    207204        kernels.Sigmoid, kernels.Custom) 
     
    216213    :param kernel_func: function that will be called if `kernel_type` is 
    217214        `Custom`. It must accept two :obj:`Orange.data.Instance` arguments and 
    218         return a float (the distance between the instances). 
     215        return a float (the kernel value for the two instances). 
    219216    :type kernel_func: callable function 
    220217    :param C: C parameter for C_SVC, Epsilon_SVR, Nu_SVR 
     
    228225    :param eps: tolerance of termination criterion (default 0.001) 
    229226    :type eps: float 
    230     :param probability: determines if a probability model should be build 
     227    :param probability: build a probability model 
    231228        (default False) 
    232229    :type probability: bool 
    233     :param shrinking: determines whether to use shrinking heuristics  
     230    :param shrinking: use shrinking heuristics  
    234231        (default True) 
    235232    :type shrinking: bool 
     
    284281 
    285282    def __call__(self, data, weight=0): 
    286         """Construct a SVM classifier 
    287          
    288         :param table: data table with continuous features 
     283        """Construct an SVM classifier. 
     284         
     285        :param data: data with continuous features 
    289286        :type data: Orange.data.Table 
    290287        :param weight: refer to `LibSVM documentation  
     
    341338    def tune_parameters(self, data, parameters=None, folds=5, verbose=0,  
    342339                       progress_callback=None): 
    343         """Tune the parameters of the SVMLearner on given instances using  
     340        """Tune parameters on the given data using  
    344341        cross validation. 
    345342         
    346         :param data: data table on which to tune the parameters 
     343        :param data: data for parameter tuning 
    347344        :type data: Orange.data.Table  
    348         :param parameters: if not set defaults to ["nu", "C", "gamma"] 
     345        :param parameters: defaults to ["nu", "C", "gamma"] 
    349346        :type parameters: list of strings 
    350347        :param folds: number of folds used for cross validation 
     
    355352        :type progress_callback: callback function 
    356353             
    357         Example: 
     354        An example that tunes the `gamma` parameter on `table` using 3-fold cross  
     355        validation: 
    358356         
    359357            >>> svm = Orange.classification.svm.SVMLearner() 
    360358            >>> svm.tune_parameters(table, parameters=["gamma"], folds=3) 
    361359             
    362         This code tunes the `gamma` parameter on `data` using 3-fold cross  
    363         validation   
    364          
    365360        """ 
    366361         
     
    444439class SVMLearnerSparse(SVMLearner): 
    445440     
    446     """Same as SVMLearner except that it learns from the  
    447     :obj:`Orange.data.Table` meta attributes. 
    448      
     441    """A :class:`SVMLearner` that learns from  
     442    meta attributes. 
     443 
    449444    Meta attributes do not need to be registered with the data set domain, or  
    450     present in all the instances. Use this if you are learning from a large  
    451     sparse data set. 
     445    present in all the instances. Use this for large  
     446    sparse data sets. 
    452447     
    453448    """ 
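
Usage mirrors :obj:`SVMLearner`; only the data differs. A sketch with a
hypothetical sparse data set whose features are stored as meta attributes::

    import Orange
    from Orange.classification.svm import SVMLearnerSparse

    data = Orange.data.Table("documents.tab")  # hypothetical sparse data
    classifier = SVMLearnerSparse()(data)
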
     
    459454class SVMLearnerEasy(SVMLearner): 
    460455     
    461     """Same as :obj:`SVMLearner` except that it will automatically scale the  
    462     data and perform parameter optimization using the  
    463     :obj:`SVMLearner.tune_parameters` method. Similar to the easy.py script in  
    464     LibSVM package. Use this if the SVMLearner performs badly. 
    465      
     456    """Apart from the functionality of :obj:`SVMLearner`, it automatically scales the  
     457    data and performs parameter optimization with  
     458    :func:`SVMLearner.tune_parameters`. It is similar to the easy.py script in  
     459    the LibSVM package. 
     460 
    466461    """ 
    467462     
     
    514509 
    515510class LinearLearner(Orange.core.LinearLearner): 
    516      
    517     """A wrapper around Orange.core.LinearLearner with a default 
    518     solver_type == L2Loss_SVM_Dual  
    519      
    520     The default in Orange.core.LinearLearner is L2_LR 
    521      
     511    """A fast linear learner (a wrapper around Orange.core.LinearLearner) with the 
     512    default solver type ``L2Loss_SVM_Dual``. 
    522513    """ 
    523514     
     
    531522         
    532523    def __init__(self, **kwargs): 
    533         if kwargs.get("solver_type", None) in [Orange.core.LinearLearner.L2_LR,  
    534                                                None]: 
    535             kwargs = dict(kwargs) 
     524        if "solver_type" not in kwargs: 
     525            # The default in Orange.core.LinearLearner is L2_LR 
    536526            kwargs["solver_type"] = Orange.core.LinearLearner.L2Loss_SVM_Dual 
    537527        for name, val in kwargs.items(): 
     
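
A minimal sketch; leaving `solver_type` unset selects ``L2Loss_SVM_Dual``
(the `brown-selected` data set is assumed)::

    import Orange
    from Orange.classification.svm import LinearLearner

    data = Orange.data.Table("brown-selected")
    lin = LinearLearner()  # solver_type defaults to L2Loss_SVM_Dual
    classifier = lin(data)
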
    539529 
    540530def get_linear_svm_weights(classifier, sum=True): 
    541     """Extract attribute weights from the linear svm classifier. 
     531    """Extract attribute weights from the linear SVM classifier. 
    542532     
    543533    For multi class classification the weights are square-summed over all binary  
    544     one vs. one classifiers. If you want weights for each binary classifier pass  
    545     `sum=False` flag (In this case the order of reported weights are for class1  
    546     vs class2, class1 vs class3 ... class2 vs class3 ... classifiers). 
     534    one vs. one classifiers. If :obj:`sum` is False, the reported weights are a 
     535    sequence: class1 vs class2, class1 vs class3, ..., class2 vs class3, ... . 
    547536         
    548537    """ 
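
With `classifier` a linear-kernel SVM classifier as in the earlier example,
the two modes look like this (a sketch; per the docstring, `sum=True` yields
one mapping and `sum=False` a sequence of per-binary-classifier mappings)::

    summed = get_linear_svm_weights(classifier)  # one weight per feature
    per_pair = get_linear_svm_weights(classifier, sum=False)
    # per_pair[0]: weights of the class1 vs class2 classifier,
    # per_pair[1]: class1 vs class3, and so on
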
     
    636625     
    637626    def __init__(self, learner=None, **kwargs): 
    638         """:param learner: Learner used for weight esstimation  
     627        """:param learner: Learner used for weight estimation  
    639628            (default LinearLearner(solver_type=L2Loss_SVM_Dual)) 
    640629        :type learner: Orange.core.Learner  
     
    666655class RFE(object): 
    667656     
    668     """Recursive feature elimination using linear svm derived attribute  
     657    """Recursive feature elimination using linear SVM derived attribute  
    669658    weights. 
    670659     
     
    672661     
    673662        >>> rfe = RFE(SVMLearner(kernel_type=kernels.Linear, \ 
    674 normalization=False)) # normalization=False -> do not change the domain  
     663normalization=False)) # normalization=False does not change the domain  
    675664        >>> data_with_removed_features = rfe(table, 5) # table with 5 best attributes 
    676665         
     
    681670                            kernels.Linear, normalization=False) 
    682671 
    683     @Orange.misc.deprecated_keywords({"progressCallback": "progress_callback"}) 
    684     def get_attr_scores(self, data, stopAt=0, progress_callback=None): 
    685         """Return a dict mapping attributes to scores (scores are not scores  
    686         in a general meaning; they represent the step number at which they  
    687         were removed from the recursive evaluation). 
    688          
     672    @Orange.misc.deprecated_keywords({"progressCallback": "progress_callback", "stopAt": "stop_at" }) 
     673    def get_attr_scores(self, data, stop_at=0, progress_callback=None): 
     674        """Return a dictionary mapping attributes to scores. 
     675        A score is the step number at which the attribute 
     676        was removed during the recursive elimination. 
    689677        """ 
    690678        iter = 1 
     
    692680        attrScores = {} 
    693681         
    694         while len(attrs) > stopAt: 
     682        while len(attrs) > stop_at: 
    695683            weights = get_linear_svm_weights(self.learner(data), sum=False) 
    696684            if progress_callback: 
    697                 progress_callback(100. * iter / (len(attrs) - stopAt)) 
     685                progress_callback(100. * iter / (len(attrs) - stop_at)) 
    698686            score = dict.fromkeys(attrs, 0) 
    699687            for w in weights: 
     
    721709         
    722710        """ 
    723         scores = self.get_attr_scores(data, progressCallback=progress_callback) 
     711        scores = self.get_attr_scores(data, progress_callback=progress_callback) 
    724712        scores = sorted(scores.items(), key=lambda item: item[1]) 
    725713         
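
Putting the scores to use, a sketch (assuming the `brown-selected` data set);
attributes removed at later steps carry higher scores, i.e. survived more
elimination rounds::

    import Orange
    from Orange.classification.svm import RFE

    data = Orange.data.Table("brown-selected")
    rfe = RFE()
    scores = rfe.get_attr_scores(data)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for attr, step in ranked[:5]:
        print "%s removed at step %d" % (attr.name, step)
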