Changeset 9074:39b097e31cb2 in orange


Timestamp:
10/07/11 10:46:48 (3 years ago)
Author:
matija <matija.polajnar@…>
Branch:
default
Convert:
4066f4e3cc963723597d34654430941ab8cfcd1e
Message:

Work on Orange.classification.rules documentation; hopefully closes #966.

File:
1 edited

  • orange/Orange/classification/rules.py

    r9062 r9074  
    1010************************** 
    1111 
    12 Orange implements several supervised rule induction algorithms 
    13 and rule-based classification methods. First, there is an implementation of the classic  
    14 `CN2 induction algorithm <http://www.springerlink.com/content/k6q2v76736w5039r/>`_.  
    15 The implementation of CN2 is modular, providing the opportunity to change, specialize 
    16 and improve the algorithm. The implementation is thus based on the rule induction  
    17 framework that we describe below. 
     12This module implements supervised rule induction algorithms 
     13and rule-based classification methods, specifically the  
     14`CN2 induction algorithm <http://www.springerlink.com/content/k6q2v76736w5039r/>`_ 
     15in multiple variants, including an argument-based learning one.  
     16The implementation is modular, based on the rule induction  
     17framework that is described below, providing the opportunity to change, specialize 
     18and improve the algorithm. 
    1819 
    1920CN2 algorithm 
     
    3940.. _titanic.tab: code/titanic.tab 
    4041 
    41 This is the resulting printout:: 
     42The result:: 
    4243     
    4344    IF sex=['female'] AND status=['first'] AND age=['child'] THEN survived=yes<0.000, 1.000> 
     
    178179    IF TRUE THEN survived=yes<0.000, 5.000> 
    179180 
    180 Notice that we first need to set the rule_finder component, because the default 
    181 components are not constructed when the learner is constructed, but only when 
    182 we run it on data. At that time, the algorithm checks which components are 
    183 necessary and sets defaults. Similarly, when the learner finishes, it destructs 
    184 all *default* components. Continuing with our example, assume that we wish to 
    185 set a different validation function and a different beam width. This is simply 
    186 written as: 
     181Notice that it is first necessary to set the :obj:`rule_finder` component, 
     182because the default components are not constructed when the learner is 
     183constructed, but only when we run it on data. At that time, the algorithm 
     184checks which components are necessary and sets defaults. Similarly, when the 
     185learner finishes, it destructs all *default* components. Continuing with our 
     186example, assume that we wish to set a different validation function and a 
     187different beam width. This is simply written as: 
    187188 
    188189part of `rules-customized.py`_ (uses `titanic.tab`_) 
     
    193194.. py:class:: Orange.classification.rules.Rule(filter, classifier, lr, dist, ce, w = 0, qu = -1) 
    194195    
    195    Base class for presentation of a single induced rule. 
     196   Representation of a single induced rule. 
    196197    
    197198   Parameters, that can be passed to the constructor, correspond to the first 
     
    305306              rules=rule_list, instances=all_instances) 
    306307                 
    307    The four customizable components here are the invoked data_stopping, 
    308    rule_finder, cover_and_remove and rule_stopping objects. By default, components 
    309    of the original CN2 algorithm will be used, but this can be changed by 
    310    modifying those attributes: 
     308   The four customizable components here are the invoked :obj:`data_stopping`, 
     309   :obj:`rule_finder`, :obj:`cover_and_remove` and :obj:`rule_stopping` 
     310   objects. By default, components of the original CN2 algorithm are used, 
     311   but this can be changed by modifying those attributes: 
    311312    
    312313   .. attribute:: data_stopping 
     
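The hunk above names four customizable components: data_stopping, rule_finder, cover_and_remove and rule_stopping. As a rough illustration of how such components could cooperate in a covering loop, here is a minimal sketch in plain Python. It does not use the Orange API: the function `learn_rules`, the toy components and the dict-based rule representation are all hypothetical, and the call signatures are simplified assumptions rather than the ones used in rules.py.

```python
def learn_rules(instances, rule_finder, cover_and_remove,
                data_stopping=None, rule_stopping=None):
    """Sequential covering: learn one rule at a time until a stop fires."""
    if data_stopping is None:
        # Assumed default for this sketch: stop when no instances are left.
        data_stopping = lambda remaining: len(remaining) == 0
    rules = []
    remaining = list(instances)
    while not data_stopping(remaining):
        new_rule = rule_finder(remaining)
        # A rule_stopping of None accepts every rule.
        if rule_stopping is not None and rule_stopping(rules, new_rule, remaining):
            break
        remaining = cover_and_remove(new_rule, remaining)
        rules.append(new_rule)
    return rules


def toy_rule_finder(remaining):
    # Hypothetical stand-in: a rule that matches the first remaining instance.
    return dict(remaining[0])


def toy_cover_and_remove(rule, remaining):
    # Keep only the instances that the rule does not cover.
    return [x for x in remaining
            if any(x.get(key) != value for key, value in rule.items())]


data = [
    {"sex": "female", "age": "child"},
    {"sex": "male", "age": "adult"},
    {"sex": "female", "age": "child"},
]
rules = learn_rules(data, toy_rule_finder, toy_cover_and_remove)
```

Swapping any of the four callables changes the behaviour of the loop without touching the loop itself, which is the kind of modularity the docstring describes.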
    326327      that decides from the last rule learned if it is worthwhile to use the 
    327328      rule and learn more rules. By default, no rule stopping criteria is 
    328       used (rule_stopping==None), thus accepting all rules. 
     329      used (:obj:`rule_stopping` == :obj:`None`), thus accepting all 
     330      rules. 
    329331        
    330332   .. attribute:: cover_and_remove 
    331333        
    332334      an object of class 
    333       :class:`~Orange.classification.rules.RuleCovererAndRemover` 
    334       that removes instances covered by the rule and returns remaining 
    335       instances. The default implementation 
     335      :class:`~Orange.classification.rules.RuleCovererAndRemover` that removes 
     336      instances covered by the rule and returns remaining instances. The 
     337      default implementation 
    336338      (:class:`~Orange.classification.rules.RuleCovererAndRemover_Default`) 
    337339      only removes the instances that belong to given target class, except if 
    338       it is not given (i.e. target_class==-1). 
     340      it is not given (i.e. :obj:`target_class` == -1). 
    339341     
    340342   .. attribute:: rule_finder 
     
    345347      :class:`~Orange.classification.rules.RuleBeamFinder`. 
    346348 
    347    Constructor can be given the following parameters: 
    348      
    349349   :param store_instances: if set to True, the rules will have data instances 
    350350       stored. 
     
    354354   :type target_class: int 
    355355    
    356    :param base_rules: Rules that we would like to use in rule_finder to 
     356   :param base_rules: Rules that we would like to use in :obj:`rule_finder` to 
    357357       constrain the learning space. If not set, it will be set to a set 
    358358       containing only an empty rule. 
     
    383383      :type target_class: int  
    384384       
    385       :param base_rules: Rules that we would like to use in rule_finder to 
    386           constrain the learning space. If not set, it will be set to a set 
     385      :param base_rules: Rules that we would like to use in :obj:`rule_finder` 
     386          to constrain the learning space. If not set, it will be set to a set 
    387387          containing only an empty rule. 
    388388      :type base_rules: :class:`~Orange.classification.rules.RuleList` 
     
    423423      an object of class 
    424424      :class:`~Orange.classification.rules.RuleBeamInitializer` 
    425       used to initialize rules_star and for selecting the 
     425      used to initialize :obj:`rules_star` and for selecting the 
    426426      initial best rule. By default 
    427427      (:class:`~Orange.classification.rules.RuleBeamInitializer_Default`), 
    428       base_rules are returned as starting rulesSet and the best from base_rules 
    429       is set as best_rule. If base_rules are not set, this class will return 
    430       rules_star with rule that covers all instances (has no selectors) and 
    431       this rule will be also used as best_rule. 
     428      :obj:`base_rules` are returned as starting :obj:`rulesSet` and the best 
     429      from :obj:`base_rules` is set as :obj:`best_rule`. If :obj:`base_rules` 
     430      are not set, this class will return :obj:`rules_star` with rule that 
     431      covers all instances (has no selectors) and this rule will be also used 
     432      as :obj:`best_rule`. 
    432433    
    433434   .. attribute:: candidate_selector 
     
    436437      :class:`~Orange.classification.rules.RuleBeamCandidateSelector` 
    437438      used to separate a subset from the current 
    438       rules_star and return it. These rules will be used in the next 
     439      :obj:`rules_star` and return it. These rules will be used in the next 
    439440      specification step. Default component (an instance of 
    440441      :class:`~Orange.classification.rules.RuleBeamCandidateSelector_TakeAll`) 
    441       takes all rules in rules_star 
     442      takes all rules in :obj:`rules_star`. 
    442443     
    443444   .. attribute:: refiner 
     
    703704    is called and the resulting classifier is returned instead of the learner. 
    704705 
    705     Constructor can be given the following parameters: 
    706      
    707706    :param evaluator: an object that evaluates a rule from covered instances. 
    708707        By default, entropy is used as a measure.  
     
    710709    :param beam_width: width of the search beam. 
    711710    :type beam_width: int 
    712     :param alpha: significance level of the statistical test to determine 
    713         whether rule is good enough to be returned by rulefinder. Likelihood 
    714         ratio statistics is used that gives an estimate if rule is 
    715         statistically better than the default rule. 
     711    :param alpha: significance level of the likelihood ratio statistics to 
     712        determine whether rule is better than the default rule. 
    716713    :type alpha: float 
    717714 
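The alpha parameter above refers to a likelihood ratio statistic. The sketch below shows the textbook form of that statistic as it is commonly given for CN2, comparing the class counts of the instances a rule covers with the counts expected from the base (default rule) distribution. It is an illustration only: the function names are hypothetical, the p-value helper handles just the two-class case (chi-square with one degree of freedom), and whether rules.py computes the test in exactly this way is an assumption.

```python
import math


def likelihood_ratio_statistic(covered_counts, base_counts):
    """2 * sum(observed * ln(observed / expected)) over all classes."""
    n_covered = sum(covered_counts)
    n_base = sum(base_counts)
    stat = 0.0
    for observed, base in zip(covered_counts, base_counts):
        if observed > 0:
            # Expected count if the rule followed the base distribution.
            expected = n_covered * base / float(n_base)
            stat += observed * math.log(observed / expected)
    return 2.0 * stat


def p_value_two_classes(stat):
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


def is_significant(covered_counts, base_counts, alpha):
    stat = likelihood_ratio_statistic(covered_counts, base_counts)
    return p_value_two_classes(stat) <= alpha
```

With this reading, a smaller alpha demands stronger evidence that a rule differs from the default rule before the rule is accepted.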
     
    759756    classifier. 
    760757     
    761     When constructing the classifier manually, the following parameters can 
    762     be passed: 
    763      
    764758    :param rules: learned rules to be used for classification (mandatory). 
    765759    :type rules: :class:`~Orange.classification.rules.RuleList` 
     
    793787         
    794788        :rtype: :class:`Orange.data.Value`,  
    795               :class:`Orange.statistics.Distribution` or a tuple with both 
     789              :class:`Orange.statistics.distribution.Distribution` or a tuple with both 
    796790        """ 
    797791        classifier = None 
     
    836830    is called and the resulting classifier is returned instead of the learner. 
    837831 
    838     Constructor can be given the following parameters: 
    839      
    840832    :param evaluator: an object that evaluates a rule from covered instances. 
    841833        By default, Laplace's rule of succession is used as a measure.  
     
    843835    :param beam_width: width of the search beam. 
    844836    :type beam_width: int 
    845     :param alpha: significance level of the statistical test to determine 
    846         whether rule is good enough to be returned by rulefinder. Likelihood 
    847         ratio statistics is used that gives an estimate if rule is 
    848         statistically better than the default rule. 
     837    :param alpha: significance level of the likelihood ratio statistics to 
     838        determine whether rule is better than the default rule. 
    849839    :type alpha: float 
    850840    """ 
     
    912902    construct the classifier. 
    913903     
    914     When constructing the classifier manually, the following parameters can 
    915     be passed: 
    916      
    917904    :param rules: learned rules to be used for classification (mandatory). 
    918905    :type rules: :class:`~Orange.classification.rules.RuleList` 
     
    948935         
    949936        :rtype: :class:`Orange.data.Value`,  
    950               :class:`Orange.statistics.Distribution` or a tuple with both 
     937              :class:`Orange.statistics.distribution.Distribution` or a tuple with both 
    951938        """ 
    952939        def add(disc1, disc2, sumd): 
     
    1002989    The difference between classical CN2 unordered and CN2-SD is selection of 
    1003990    specific evaluation function and covering function: 
    1004     :class:`Orange.classifier.rules.WRACCEvaluator` is used to implement 
     991    :class:`~Orange.classification.rules.WRACCEvaluator` is used to implement 
    1005992    weight-relative accuracy and  
    1006     :class:`Orange.classifier.rules.CovererAndRemover_MultWeight` avoids 
     993    :class:`~Orange.classification.rules.CovererAndRemover_MultWeights` avoids 
    1007994    excluding covered instances, multiplying their weight by the value of 
    1008995    mult parameter instead. 
     
    1011998    is called and the resulting classifier is returned instead of the learner. 
    1012999 
    1013     Constructor can be given the following parameters: 
    1014      
    10151000    :param evaluator: an object that evaluates a rule from covered instances. 
    10161001        By default, weighted relative accuracy is used. 
    10171002    :type evaluator: :class:`~Orange.classification.rules.RuleEvaluator` 
     1003     
    10181004    :param beam_width: width of the search beam. 
    10191005    :type beam_width: int 
    1020     :param alpha: significance level of the statistical test to determine 
    1021         whether rule is good enough to be returned by rulefinder. Likelihood 
    1022         ratio statistics is used that gives an estimate if rule is 
    1023         statistically better than the default rule. 
     1006     
     1007    :param alpha: significance level of the likelihood ratio statistics to 
     1008        determine whether rule is better than the default rule. 
    10241009    :type alpha: float 
     1010     
    10251011    :param mult: multiplier for weights of covered instances. 
    10261012    :type mult: float 
     
    14651451                allowed_conditions = [c for c in p.filter.conditions] 
    14661452                pruned_conditions = self.prune_arg_conditions(ae, allowed_conditions, examples, weight_id) 
    1467                 p.baseDist = orange.Distribution(examples.domain.classVar, examples, weight_id) 
     1453                p.baseDist = Orange.statistics.distribution.Distribution(examples.domain.classVar, examples, weight_id) 
    14681454                p.filter.conditions = pruned_conditions 
    14691455                p.learner.setattr("arg_length", 0) 
     
    15511537    is called and the resulting classifier is returned instead of the learner. 
    15521538 
    1553     Constructor can be given the following parameters: 
    1554      
    15551539    :param evaluator: an object that evaluates a rule from covered instances. 
    15561540        By default, weighted relative accuracy is used. 
    15571541    :type evaluator: :class:`~Orange.classification.rules.RuleEvaluator` 
     1542     
    15581543    :param beam_width: width of the search beam. 
    15591544    :type beam_width: int 
    1560     :param alpha: significance level of the statistical test to determine 
    1561         whether rule is good enough to be returned by rulefinder. Likelihood 
    1562         ratio statistics is used that gives an estimate if rule is 
    1563         statistically better than the default rule. 
     1545     
     1546    :param alpha: significance level of the likelihood ratio statistics to 
     1547        determine whether rule is better than the default rule. 
    15641548    :type alpha: float 
     1549     
    15651550    :param mult: multiplier for weights of covered instances. 
    15661551    :type mult: float 
     
    17131698class CovererAndRemover_MultWeights(RuleCovererAndRemover): 
    17141699    """ 
    1715     Covering and removing of instances using weight multiplication. 
     1700    Covering and removing of instances using weight multiplication: 
     1701     
    17161702    :param mult: weighting multiplication factor 
    1717     :type mult: float 
    1718      
     1703    :type mult: float     
    17191704    """ 
    17201705     
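The docstring above describes covering by weight multiplication: covered instances are kept, but their weights are multiplied by mult. A minimal sketch of that idea in plain Python follows. The function `cover_mult_weights` and the predicate-style rule are hypothetical, and weights are held in a plain list here, which is an assumption made for illustration and not how the Orange class stores them.

```python
def cover_mult_weights(rule_covers, instances, weights, mult):
    """Return new weights: covered instances are scaled by mult, others kept."""
    return [weight * mult if rule_covers(instance) else weight
            for instance, weight in zip(instances, weights)]


instances = [
    {"sex": "female", "status": "first"},
    {"sex": "male", "status": "crew"},
    {"sex": "female", "status": "third"},
]
weights = [1.0, 1.0, 1.0]

# Hypothetical rule: covers every instance with sex == "female".
new_weights = cover_mult_weights(lambda x: x["sex"] == "female",
                                 instances, weights, mult=0.5)
```

Because no instance is removed, later rules can still cover the same instances, only with less influence on the evaluation.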