Changeset 10370:5e8e3ba6c2fa in orange


Timestamp:
02/25/12 22:39:37
Author:
janezd <janez.demsar@…>
Branch:
default
Message:

Removed prefix Rule from most classes related to classification rules.
Moved the documentation to rst and polished it

Files:
2 edited

  • Orange/classification/rules.py

    r10219 r10370  
    1 """ 
    2  
    3 .. index:: rule induction 
    4  
    5 .. index::  
    6    single: classification; rule induction 
    7  
    8 ************************** 
    9 Rule induction (``rules``) 
    10 ************************** 
    11  
    12 This module implements supervised rule induction algorithms 
    13 and rule-based classification methods, specifically the  
    14 `CN2 induction algorithm <http://www.springerlink.com/content/k6q2v76736w5039r/>`_ 
    15 in multiple variants, including an argument-based learning one.  
    16 The implementation is modular, based on the rule induction  
    17 framework that is described below, providing the opportunity to change, specialize 
    18 and improve the algorithm. 
    19  
    20 CN2 algorithm 
    21 ============= 
    22  
    23 .. index::  
    24    single: classification; CN2 
    25  
    26 Several variations of the well-known CN2 rule learning algorithm are implemented.
    27 All are implemented by wrapping the 
    28 :class:`~Orange.classification.rules.RuleLearner` class. Each CN2 learner class 
    29 in this module changes some of RuleLearner's replaceable components to reflect 
    30 the required behavior. 
    31  
    32 Usage is consistent with typical learner usage in Orange: 
    33  
    34 :download:`rules-cn2.py <code/rules-cn2.py>` 
    35  
    36 .. literalinclude:: code/rules-cn2.py 
    37     :lines: 7- 
    38  
    39 The result:: 
    40      
    41     IF sex=['female'] AND status=['first'] AND age=['child'] THEN survived=yes<0.000, 1.000> 
    42     IF sex=['female'] AND status=['second'] AND age=['child'] THEN survived=yes<0.000, 13.000> 
    43     IF sex=['male'] AND status=['second'] AND age=['child'] THEN survived=yes<0.000, 11.000> 
    44     IF sex=['female'] AND status=['first'] THEN survived=yes<4.000, 140.000> 
    45     IF status=['first'] AND age=['child'] THEN survived=yes<0.000, 5.000> 
    46     IF sex=['male'] AND status=['second'] THEN survived=no<154.000, 14.000> 
    47     IF status=['crew'] AND sex=['female'] THEN survived=yes<3.000, 20.000> 
    48     IF status=['second'] THEN survived=yes<13.000, 80.000> 
    49     IF status=['third'] AND sex=['male'] AND age=['adult'] THEN survived=no<387.000, 75.000> 
    50     IF status=['crew'] THEN survived=no<670.000, 192.000> 
    51     IF age=['child'] AND sex=['male'] THEN survived=no<35.000, 13.000> 
    52     IF sex=['male'] THEN survived=no<118.000, 57.000> 
    53     IF age=['child'] THEN survived=no<17.000, 14.000> 
    54     IF TRUE THEN survived=no<89.000, 76.000> 
    55      
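The ordered rule list printed above is read top to bottom: the first rule whose conditions match an instance decides the class. The following is a minimal pure-Python sketch of that behaviour, not the Orange API; the names ``RULES``, ``covers`` and ``classify`` are hypothetical, and the four rules are a subset of the listing, kept in their original order.

```python
# Each rule is (conditions, predicted_class); conditions maps an attribute
# name to the value it must have. An empty dict is the default rule (IF TRUE).
RULES = [
    ({"sex": "female", "status": "first"}, "yes"),
    ({"sex": "male", "status": "second"}, "no"),
    ({"status": "crew"}, "no"),
    ({}, "no"),  # IF TRUE THEN survived=no
]


def covers(conditions, instance):
    """Return True if the instance satisfies every condition of the rule."""
    return all(instance.get(attr) == value for attr, value in conditions.items())


def classify(rules, instance):
    """Return the class predicted by the first rule that covers the instance."""
    for conditions, predicted in rules:
        if covers(conditions, instance):
            return predicted
    raise ValueError("rule list has no default rule")


if __name__ == "__main__":
    print(classify(RULES, {"sex": "female", "status": "first", "age": "adult"}))
```

Because the rules are ordered, a later rule is only consulted when all earlier ones fail to match, which is why the default rule comes last.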
    56 .. autoclass:: Orange.classification.rules.CN2Learner 
    57    :members: 
    58    :show-inheritance: 
    59    :exclude-members: baseRules, beamWidth, coverAndRemove, dataStopping, 
    60       ruleFinder, ruleStopping, storeInstances, targetClass, weightID 
    61     
    62 .. autoclass:: Orange.classification.rules.CN2Classifier 
    63    :members: 
    64    :show-inheritance: 
    65    :exclude-members: beamWidth, resultType 
    66     
    67 .. index:: unordered CN2 
    68  
    69 .. index::  
    70    single: classification; unordered CN2 
    71  
    72 .. autoclass:: Orange.classification.rules.CN2UnorderedLearner 
    73    :members: 
    74    :show-inheritance: 
    75    :exclude-members: baseRules, beamWidth, coverAndRemove, dataStopping, 
    76       ruleFinder, ruleStopping, storeInstances, targetClass, weightID 
    77     
    78 .. autoclass:: Orange.classification.rules.CN2UnorderedClassifier 
    79    :members: 
    80    :show-inheritance: 
    81     
    82 .. index:: CN2-SD 
    83 .. index:: subgroup discovery 
    84  
    85 .. index::  
    86    single: classification; CN2-SD 
    87     
    88 .. autoclass:: Orange.classification.rules.CN2SDUnorderedLearner 
    89    :members: 
    90    :show-inheritance: 
    91    :exclude-members: baseRules, beamWidth, coverAndRemove, dataStopping, 
    92       ruleFinder, ruleStopping, storeInstances, targetClass, weightID 
    93     
    94 .. autoclass:: Orange.classification.rules.CN2EVCUnorderedLearner 
    95    :members: 
    96    :show-inheritance: 
    97     
    98 References 
    99 ---------- 
    100  
    101 * Clark, Niblett. `The CN2 Induction Algorithm 
    102   <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.9180>`_. Machine 
    103   Learning 3(4):261--284, 1989. 
    104 * Clark, Boswell. `Rule Induction with CN2: Some Recent Improvements 
    105   <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.1700>`_. In 
    106   Machine Learning - EWSL-91. Proceedings of the European Working Session on 
    107   Learning, pp 151--163, Porto, Portugal, March 1991. 
    108 * Lavrac, Kavsek, Flach, Todorovski: `Subgroup Discovery with CN2-SD 
    109   <http://jmlr.csail.mit.edu/papers/volume5/lavrac04a/lavrac04a.pdf>`_. Journal 
    110   of Machine Learning Research 5: 153-188, 2004. 
    111  
    112  
    113 Argument based CN2 
    114 ================== 
    115  
    116 Orange also supports argument-based CN2 learning. 
    117  
    118 .. autoclass:: Orange.classification.rules.ABCN2 
    119    :members: 
    120    :show-inheritance: 
    121    :exclude-members: baseRules, beamWidth, coverAndRemove, dataStopping, 
    122       ruleFinder, ruleStopping, storeInstances, targetClass, weightID, 
    123       argument_id 
    124     
    125    This class has many more undocumented methods; see the source code for 
    126    reference. 
    127     
    128 .. autoclass:: Orange.classification.rules.ABCN2Ordered 
    129    :members: 
    130    :show-inheritance: 
    131     
    132 .. autoclass:: Orange.classification.rules.ABCN2M 
    133    :members: 
    134    :show-inheritance: 
    135    :exclude-members: baseRules, beamWidth, coverAndRemove, dataStopping, 
    136       ruleFinder, ruleStopping, storeInstances, targetClass, weightID 
    137  
    138 This module has many more undocumented classes related to argument-based learning;
    139 see the source code for reference. 
    140  
    141 References 
    142 ---------- 
    143  
    144 * Bratko, Mozina, Zabkar. `Argument-Based Machine Learning 
    145   <http://www.springerlink.com/content/f41g17t1259006k4/>`_. Lecture Notes in 
    146   Computer Science: vol. 4203/2006, 11-17, 2006. 
    147  
    148  
    149 Rule induction framework 
    150 ======================== 
    151  
    152 A general framework of classes supports the described CN2 implementation, and 
    153 can in fact be fine-tuned to specific needs by replacing individual components. 
    154 Here is a simple example; the detailed architecture is described in the
    155 documentation of the classes that follows it:
    156  
    157 part of :download:`rules-customized.py <code/rules-customized.py>` 
    158  
    159 .. literalinclude:: code/rules-customized.py 
    160     :lines: 7-17 
    161  
    162 In the example, the rule evaluation function is set to an m-estimate of probability with m=50. The result is::
    163  
    164     IF sex=['male'] AND status=['second'] AND age=['adult'] THEN survived=no<154.000, 14.000> 
    165     IF sex=['male'] AND status=['third'] AND age=['adult'] THEN survived=no<387.000, 75.000> 
    166     IF sex=['female'] AND status=['first'] THEN survived=yes<4.000, 141.000> 
    167     IF status=['crew'] AND sex=['male'] THEN survived=no<670.000, 192.000> 
    168     IF status=['second'] THEN survived=yes<13.000, 104.000> 
    169     IF status=['third'] AND sex=['male'] THEN survived=no<35.000, 13.000> 
    170     IF status=['first'] AND age=['adult'] THEN survived=no<118.000, 57.000> 
    171     IF status=['crew'] THEN survived=yes<3.000, 20.000> 
    172     IF sex=['female'] THEN survived=no<106.000, 90.000> 
    173     IF TRUE THEN survived=yes<0.000, 5.000> 
    174  
    175 Notice that it is first necessary to set the :obj:`rule_finder` component, 
    176 because the default components are not constructed when the learner is 
    177 constructed, but only when we run it on data. At that time, the algorithm 
    178 checks which components are necessary and sets defaults. Similarly, when the 
    179 learner finishes, it destructs all *default* components. Continuing with our 
    180 example, assume that we wish to set a different validation function and a 
    181 different beam width. This is simply written as:
    182  
    183 part of :download:`rules-customized.py <code/rules-customized.py>` 
    184  
    185 .. literalinclude:: code/rules-customized.py 
    186     :lines: 19-23 
    187  
    188 .. py:class:: Orange.classification.rules.Rule(filter, classifier, lr, dist, ce, w = 0, qu = -1) 
    189     
    190    Representation of a single induced rule. 
    191     
    192    Parameters that can be passed to the constructor correspond to the first
    193    7 attributes. All attributes are:
    194     
    195    .. attribute:: filter 
    196     
    197       contents of the rule; this is the basis of the Rule class. Must be of 
    198       type :class:`Orange.core.Filter`; an instance of 
    199       :class:`Orange.core.Filter_values` is set as a default. 
    200     
    201    .. attribute:: classifier 
    202        
    203       each rule can be used as a regular Orange
    204       classifier. Must be of type :class:`Orange.classification.Classifier`.
    205       By default, an instance of :class:`Orange.classification.ConstantClassifier` is used. 
    206     
    207    .. attribute:: learner 
    208        
    209       learner to be used for making a classifier. Must be of type 
    210       :class:`Orange.classification.Learner`. By default, 
    211       :class:`Orange.classification.majority.MajorityLearner` is used. 
    212     
    213    .. attribute:: class_distribution 
    214        
    215       distribution of class in data instances covered by this rule 
    216       (:class:`Orange.statistics.distribution.Distribution`). 
    217     
    218    .. attribute:: examples 
    219        
    220       data instances covered by this rule (:class:`Orange.data.Table`). 
    221     
    222    .. attribute:: weight_id 
    223     
    224       ID of the weight meta-attribute for the stored data instances (int). 
    225     
    226    .. attribute:: quality 
    227        
    228       quality of the rule. Rules with higher quality are better (float). 
    229     
    230    .. attribute:: complexity 
    231     
    232       complexity of the rule (float). Complexity is used for
    233       selecting between rules with equal quality, where rules with lower
    234       complexity are preferred. Typically, complexity corresponds to the
    235       number of selectors in the rule (that is, the number of conditions in
    236       its filter), but any other measure can be applied.
    237     
    238    .. method:: filter_and_store(instances, weight_id=0, target_class=-1) 
    239     
    240       Filter the passed data instances and store them in the attribute 'examples'.
    241       Also compute class_distribution, set the weight of the stored examples and create
    242       a new classifier using the 'learner' attribute.
    243        
    244       :param weight_id: ID of the weight meta-attribute. 
    245       :type weight_id: int 
    246       :param target_class: index of target class; -1 for all. 
    247       :type target_class: int 
    248     
    249    Objects of this class can be invoked: 
    250  
    251    .. method:: __call__(instance, instances, weight_id=0, target_class=-1) 
    252     
    253       There are two ways of invoking this method. One way is only passing the 
    254       data instance; then the Rule object returns True if the given instance is 
    255       covered by the rule's filter. 
    256        
    257       :param instance: data instance. 
    258       :type instance: :class:`Orange.data.Instance` 
    259        
    260       Another way of invocation is passing a table of data instances, 
    261       in which case a table of instances, covered by this rule, is returned. 
    262        
    263       :param instances: a table of data instances. 
    264       :type instances: :class:`Orange.data.Table` 
    265       :param ref: TODO 
    266       :type ref: bool 
    267       :param negate: if set to True, the result is inverted: the resulting 
    268           table contains instances *not* covered by the rule. 
    269       :type negate: bool 
    270  
    271 .. py:class:: Orange.classification.rules.RuleLearner(store_instances = True, target_class = -1, base_rules = Orange.classification.rules.RuleList())
    272     
    273    Bases: :class:`Orange.classification.Learner` 
    274     
    275    A base rule induction learner. The algorithm follows the separate-and-conquer
    276    strategy, which has its origins in the AQ family of algorithms
    277    (Fuernkranz J.; Separate-and-Conquer Rule Learning, Artificial Intelligence
    278    Review 13, 3-54, 1999). Such algorithms search for the "best"
    279    possible rule in the learning instances, remove the covered data from the learning
    280    instances (separate) and repeat the process (conquer) on the remaining
    281    instances.
    282     
    283    The class' functionality can be best explained by showing its __call__ 
    284    function: 
    285     
    286    .. parsed-literal:: 
    287  
    288       def \_\_call\_\_(self, instances, weight_id=0): 
    289           rule_list = Orange.classification.rules.RuleList() 
    290           all_instances = Orange.data.Table(instances) 
    291           while not self.\ **data_stopping**\ (instances, weight_id, self.target_class): 
    292               new_rule = self.\ **rule_finder**\ (instances, weight_id, self.target_class, 
    293                                         self.base_rules) 
    294               if self.\ **rule_stopping**\ (rule_list, new_rule, instances, weight_id): 
    295                   break 
    296               instances, weight_id = self.\ **cover_and_remove**\ (new_rule, instances, 
    297                                                       weight_id, self.target_class) 
    298               rule_list.append(new_rule) 
    299           return Orange.classification.rules.RuleClassifier_firstRule(
    300               rules=rule_list, instances=all_instances) 
    301                  
    302    The four customizable components here are the invoked :obj:`data_stopping`, 
    303    :obj:`rule_finder`, :obj:`cover_and_remove` and :obj:`rule_stopping` 
    304    objects. By default, components of the original CN2 algorithm are used,
    305    but this can be changed by modifying those attributes: 
    306     
    307    .. attribute:: data_stopping 
    308     
    309       an object of class 
    310       :class:`~Orange.classification.rules.RuleDataStoppingCriteria` 
    311       that determines whether there will be any benefit from further learning 
    312       (i.e., whether there is enough data to continue learning). The default
    313       implementation
    314       (:class:`~Orange.classification.rules.RuleDataStoppingCriteria_NoPositives`)
    315       returns True if there are no more instances of the given class.
    316     
    317    .. attribute:: rule_stopping 
    318        
    319       an object of class  
    320       :class:`~Orange.classification.rules.RuleStoppingCriteria` 
    321       that decides from the last rule learned whether it is worthwhile to use the
    322       rule and learn more rules. By default, no rule stopping criterion is
    323       used (:obj:`rule_stopping` == :obj:`None`), thus accepting all
    324       rules.
    325         
    326    .. attribute:: cover_and_remove 
    327         
    328       an object of 
    329       :class:`RuleCovererAndRemover` that removes 
    330       instances covered by the rule and returns the remaining instances. The
    331       default implementation
    332       (:class:`RuleCovererAndRemover_Default`)
    333       only removes the instances that belong to the given target class, unless
    334       no target class is given (i.e., :obj:`target_class` == -1).
    335      
    336    .. attribute:: rule_finder 
    337        
    338       an object of class 
    339       :class:`~Orange.classification.rules.RuleFinder` that learns a single 
    340       rule from instances. Default implementation is 
    341       :class:`~Orange.classification.rules.RuleBeamFinder`. 
    342  
    343    :param store_instances: if set to True, the rules will have data instances 
    344        stored. 
    345    :type store_instances: bool 
    346      
    347    :param target_class: index of a specific class being learned; -1 for all. 
    348    :type target_class: int 
    349     
    350    :param base_rules: Rules that we would like to use in :obj:`rule_finder` to 
    351        constrain the learning space. If not set, it will be set to a set 
    352        containing only an empty rule. 
    353    :type base_rules: :class:`~Orange.classification.rules.RuleList` 
    354  
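The separate-and-conquer loop of ``RuleLearner`` shown above can be illustrated without Orange. The sketch below is a pure-Python stand-in under stated assumptions: a rule is a single ``(attribute, value)`` test, the rule finder simply picks the test with the highest precision, and the function returns the rule list rather than a classifier. All names (``find_best_rule``, ``separate_and_conquer``, ``DATA``) are hypothetical and are not part of the Orange API.

```python
def find_best_rule(instances, target):
    """Toy rule finder: the attribute=value test with the highest precision."""
    best, best_score = None, -1.0
    tests = sorted({(a, v) for x, _ in instances for a, v in x.items()})
    for attr, value in tests:
        covered = [y for x, y in instances if x.get(attr) == value]
        score = covered.count(target) / len(covered)
        if score > best_score:
            best, best_score = (attr, value), score
    return best


def separate_and_conquer(instances, target):
    """Learn single-test rules for `target` until no target instance remains."""
    rules = []
    remaining = list(instances)
    # Data stopping: stop when no instance of the target class is left.
    while any(y == target for _, y in remaining):
        rule = find_best_rule(remaining, target)
        if rule is None:
            break
        attr, value = rule
        covered_pos = [x for x, y in remaining
                       if x.get(attr) == value and y == target]
        # Rule stopping: give up if the rule covers no target instance.
        if not covered_pos:
            break
        # Cover and remove: drop only the covered instances of the target class.
        remaining = [(x, y) for x, y in remaining
                     if not (x.get(attr) == value and y == target)]
        rules.append(rule)
    return rules


DATA = [
    ({"sex": "female", "status": "first"}, "yes"),
    ({"sex": "female", "status": "second"}, "yes"),
    ({"sex": "male", "status": "first"}, "no"),
    ({"sex": "male", "status": "second"}, "no"),
    ({"sex": "male", "status": "crew"}, "no"),
]

LEARNED = separate_and_conquer(DATA, "yes")
```

Each of the four steps in the loop corresponds to one replaceable component of the learner, so swapping a component amounts to swapping one of these code fragments.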
    355 Rule finders 
    356 ------------ 
    357  
    358 .. class:: Orange.classification.rules.RuleFinder 
    359  
    360    Base class for all rule finders. These are used to learn a single rule from 
    361    instances. 
    362     
    363    Rule finders are invokable in the following manner: 
    364     
    365    .. method:: __call__(table, weight_id, target_class, base_rules) 
    366     
    367       Return a new rule, induced from instances in the given table. 
    368        
    369       :param table: data instances to learn from. 
    370       :type table: :class:`Orange.data.Table` 
    371        
    372       :param weight_id: ID of the weight meta-attribute for the stored data 
    373           instances. 
    374       :type weight_id: int 
    375        
    376       :param target_class: index of a specific class being learned; -1 for all. 
    377       :type target_class: int  
    378        
    379       :param base_rules: Rules that we would like to use in :obj:`rule_finder` 
    380           to constrain the learning space. If not set, it will be set to a set 
    381           containing only an empty rule. 
    382       :type base_rules: :class:`~Orange.classification.rules.RuleList` 
    383  
    384 .. class:: Orange.classification.rules.RuleBeamFinder 
    385     
    386    Bases: :class:`~Orange.classification.rules.RuleFinder` 
    387     
    388    Beam search for the best rule. This is the default class used in RuleLearner 
    389    to find the best rule. Pseudo code of the algorithm is shown here: 
    390  
    391    .. parsed-literal:: 
    392  
    393       def \_\_call\_\_(self, table, weight_id, target_class, base_rules): 
    394           prior = Orange.statistics.distribution.Distribution(table.domain.class_var, table, weight_id) 
    395           rules_star, best_rule = self.\ **initializer**\ (table, weight_id, target_class, base_rules, self.evaluator, prior) 
    396           \# compute quality of rules in rules_star and best_rule 
    397           ... 
    398           while len(rules_star) \> 0: 
    399               candidates, rules_star = self.\ **candidate_selector**\ (rules_star, table, weight_id) 
    400               for cand in candidates: 
    401                   new_rules = self.\ **refiner**\ (cand, table, weight_id, target_class) 
    402                   for new_rule in new_rules: 
    403                       if self.\ **rule_stopping_validator**\ (new_rule, table, weight_id, target_class, cand.class_distribution): 
    404                           new_rule.quality = self.\ **evaluator**\ (new_rule, table, weight_id, target_class, prior) 
    405                           rules_star.append(new_rule) 
    406                           if self.\ **validator**\ (new_rule, table, weight_id, target_class, prior) and 
    407                               new_rule.quality \> best_rule.quality: 
    408                               best_rule = new_rule 
    409               rules_star = self.\ **rule_filter**\ (rules_star, table, weight_id) 
    410           return best_rule 
    411  
    412    Bolded in the pseudo-code are several exchangeable components, exposed as 
    413    attributes. These are: 
    414  
    415    .. attribute:: initializer 
    416     
    417       an object of class 
    418       :class:`~Orange.classification.rules.RuleBeamInitializer` 
    419       used to initialize :obj:`rules_star` and to select the
    420       initial best rule. By default
    421       (:class:`~Orange.classification.rules.RuleBeamInitializer_Default`),
    422       :obj:`base_rules` are returned as the starting :obj:`rules_star` and the best
    423       from :obj:`base_rules` is set as :obj:`best_rule`. If :obj:`base_rules`
    424       are not set, this class returns :obj:`rules_star` with a single rule that
    425       covers all instances (has no selectors), and this rule is also used
    426       as :obj:`best_rule`.
    427     
    428    .. attribute:: candidate_selector 
    429     
    430       an object of class 
    431       :class:`~Orange.classification.rules.RuleBeamCandidateSelector` 
    432       used to separate a subset from the current 
    433       :obj:`rules_star` and return it. These rules will be used in the next 
    434       specialization step. Default component (an instance of
    435       :class:`~Orange.classification.rules.RuleBeamCandidateSelector_TakeAll`) 
    436       takes all rules in :obj:`rules_star`. 
    437      
    438    .. attribute:: refiner 
    439     
    440       an object of class 
    441       :class:`~Orange.classification.rules.RuleBeamRefiner` 
    442       used to refine a given rule. The new rule should cover a
    443       strict subset of the examples covered by the given rule. The default component
    444       (:class:`~Orange.classification.rules.RuleBeamRefiner_Selector`) adds
    445       a conjunctive selector to the selectors present in the rule.
    446      
    447    .. attribute:: rule_filter 
    448     
    449       an object of class 
    450       :class:`~Orange.classification.rules.RuleBeamFilter` 
    451       used to filter rules to keep the beam relatively small
    452       and so contain search complexity. By default, it keeps the five best rules:
    453       :class:`~Orange.classification.rules.RuleBeamFilter_Width`\ *(width=5)*\ .
    454  
    455    .. method:: __call__(data, weight_id, target_class, base_rules) 
    456  
    457    Determines the next best rule to cover the remaining data instances. 
    458     
    459    :param data: data instances. 
    460    :type data: :class:`Orange.data.Table` 
    461     
    462    :param weight_id: index of the weight meta-attribute. 
    463    :type weight_id: int 
    464     
    465    :param target_class: index of the target class. 
    466    :type target_class: int 
    467     
    468    :param base_rules: existing rules. 
    469    :type base_rules: :class:`~Orange.classification.rules.RuleList` 
    470  
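The beam search pseudo-code of ``RuleBeamFinder`` above can be made concrete with a small pure-Python sketch. It is an illustration under stated assumptions, not the Orange implementation: a rule is a sorted tuple of ``(attribute, value)`` selectors, quality is a Laplace estimate, there is no significance validator, and the stopping validator only rejects rules that cover nothing. All names are hypothetical.

```python
def covered_classes(rule, data):
    """Classes of the instances that satisfy every selector of the rule."""
    return [y for x, y in data if all(x.get(a) == v for a, v in rule)]


def laplace(rule, data, target, n_classes=2):
    """Laplace estimate of the target class among the covered instances."""
    covered = covered_classes(rule, data)
    return (covered.count(target) + 1) / (len(covered) + n_classes)


def beam_search(data, target, beam_width=5):
    selectors = sorted({(a, v) for x, _ in data for a, v in x.items()})
    best = ()  # initializer: the empty rule covers all instances
    best_q = laplace(best, data, target)
    beam = [best]
    while beam:
        refined = set()
        for rule in beam:  # candidate selector: take all rules in the beam
            used = {a for a, _ in rule}
            for sel in selectors:  # refiner: add one conjunctive selector
                if sel[0] in used:
                    continue
                new_rule = tuple(sorted(rule + (sel,)))
                # Stopping validator: skip rules that cover no instance.
                if not covered_classes(new_rule, data):
                    continue
                refined.add(new_rule)
        # Evaluate the refinements; ties are broken by the rule itself.
        scored = sorted(refined, key=lambda r: (-laplace(r, data, target), r))
        for rule in scored:
            q = laplace(rule, data, target)
            if q > best_q:
                best, best_q = rule, q
        beam = scored[:beam_width]  # rule filter: keep the best few rules
    return best, best_q


DATA = [
    ({"sex": "female", "status": "first"}, "yes"),
    ({"sex": "female", "status": "second"}, "yes"),
    ({"sex": "male", "status": "first"}, "no"),
    ({"sex": "male", "status": "second"}, "no"),
    ({"sex": "male", "status": "crew"}, "no"),
]

BEST_RULE, BEST_QUALITY = beam_search(DATA, "yes")
```

The search stops on its own because every refinement adds a selector on a new attribute, so the beam becomes empty once all attributes are used.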
    471 Rule evaluators 
    472 --------------- 
    473  
    474 .. class:: Orange.classification.rules.RuleEvaluator 
    475  
    476    Base class for rule evaluators that evaluate the quality of the rule based 
    477    on covered data instances. All evaluators support being invoked in the 
    478    following manner: 
    479     
    480    .. method:: __call__(rule, instances, weight_id, target_class, prior) 
    481     
    482       Calculates a non-negative rule quality. 
    483        
    484       :param rule: rule to evaluate. 
    485       :type rule: :class:`~Orange.classification.rules.Rule` 
    486        
    487       :param instances: a table of instances, covered by the rule. 
    488       :type instances: :class:`Orange.data.Table` 
    489        
    490       :param weight_id: index of the weight meta-attribute. 
    491       :type weight_id: int 
    492        
    493       :param target_class: index of target class of this rule. 
    494       :type target_class: int 
    495        
    496       :param prior: prior class distribution. 
    497       :type prior: :class:`Orange.statistics.distribution.Distribution` 
    498  
    499 .. autoclass:: Orange.classification.rules.LaplaceEvaluator 
    500    :members: 
    501    :show-inheritance: 
    502    :exclude-members: targetClass, weightID 
    503  
    504 .. autoclass:: Orange.classification.rules.WRACCEvaluator 
    505    :members: 
    506    :show-inheritance: 
    507    :exclude-members: targetClass, weightID 
    508     
    509 .. class:: Orange.classification.rules.RuleEvaluator_Entropy 
    510  
    511    Bases: :class:`~Orange.classification.rules.RuleEvaluator` 
    512      
    513 .. class:: Orange.classification.rules.RuleEvaluator_LRS 
    514  
    515    Bases: :class:`~Orange.classification.rules.RuleEvaluator` 
    516  
    517 .. class:: Orange.classification.rules.RuleEvaluator_Laplace 
    518  
    519    Bases: :class:`~Orange.classification.rules.RuleEvaluator` 
    520  
    521 .. class:: Orange.classification.rules.RuleEvaluator_mEVC 
    522  
    523    Bases: :class:`~Orange.classification.rules.RuleEvaluator` 
    524     
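For orientation, the textbook forms of three rule quality measures named in this module (Laplace estimate, m-estimate of probability, weighted relative accuracy) are sketched below in plain Python. The function names and argument names are hypothetical, and the source does not confirm that the Orange evaluators compute exactly these expressions; treat the block as a reference for the formulas only.

```python
def laplace_estimate(n_target, n_covered, n_classes):
    """(n_target + 1) / (n_covered + k), with k the number of classes."""
    return (n_target + 1.0) / (n_covered + n_classes)


def m_estimate(n_target, n_covered, prior_target, m):
    """(n_target + m * prior) / (n_covered + m); m=0 gives relative frequency."""
    return (n_target + m * prior_target) / float(n_covered + m)


def wracc(n_target, n_covered, n_target_total, n_total):
    """Coverage times the gain in target-class accuracy over the prior."""
    if n_covered == 0:
        return 0.0
    coverage = n_covered / float(n_total)
    return coverage * (n_target / float(n_covered)
                       - n_target_total / float(n_total))


if __name__ == "__main__":
    # Counts for a rule covering 144 instances, 140 of them of the target class.
    print(laplace_estimate(140, 144, 2))
    print(m_estimate(140, 144, 0.5, 50))
```

A larger ``m`` pulls the estimate towards the prior probability, which penalizes rules that cover only a few instances.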
    525 Instance covering and removal 
    526 ----------------------------- 
    527  
    528 .. class:: RuleCovererAndRemover 
    529  
    530    Base class for rule coverers and removers that, when invoked, remove 
    531    instances covered by the rule and return remaining instances. 
    532  
    533    .. method:: __call__(rule, instances, weights, target_class) 
    534     
    535       Removes the instances covered by the rule and returns the remaining instances.
    536
    537       :param rule: rule whose covered instances are to be removed.
    538       :type rule: :class:`~Orange.classification.rules.Rule`
    539
    540       :param instances: a table of data instances.
    541       :type instances: :class:`Orange.data.Table` 
    542        
    543       :param weights: index of the weight meta-attribute. 
    544       :type weights: int 
    545        
    546       :param target_class: index of target class of this rule. 
    547       :type target_class: int 
    548  
    549 .. autoclass:: CovererAndRemover_MultWeights 
    550  
    551 .. autoclass:: CovererAndRemover_AddWeights 
    552     
    553 Miscellaneous functions 
    554 ----------------------- 
    555  
    556 .. automethod:: Orange.classification.rules.rule_to_string 
    557  
    558 .. 
    559     Undocumented are: 
    560     Data-based Stopping Criteria 
    561     ---------------------------- 
    562     Rule-based Stopping Criteria 
    563     ---------------------------- 
    564     Rule-based Stopping Criteria 
    565     ---------------------------- 
    566  
    567 """ 
    568  
    569 1 import random
    570 2 import math

    574 6 import Orange
    575 7 import Orange.core
    576 from Orange.core import \ 
    577     RuleClassifier, \ 
    578     RuleClassifier_firstRule, \ 
    579     RuleClassifier_logit, \ 
    580     RuleLearner, \ 
    581     Rule, \ 
    582     RuleBeamCandidateSelector, \ 
    583     RuleBeamCandidateSelector_TakeAll, \ 
    584     RuleBeamFilter, \ 
    585     RuleBeamFilter_Width, \ 
    586     RuleBeamInitializer, \ 
    587     RuleBeamInitializer_Default, \ 
    588     RuleBeamRefiner, \ 
    589     RuleBeamRefiner_Selector, \ 
    590     RuleClassifierConstructor, \ 
    591     RuleCovererAndRemover, \ 
    592     RuleCovererAndRemover_Default, \ 
    593     RuleDataStoppingCriteria, \ 
    594     RuleDataStoppingCriteria_NoPositives, \ 
    595     RuleEvaluator, \ 
    596     RuleEvaluator_Entropy, \ 
    597     RuleEvaluator_LRS, \ 
    598     RuleEvaluator_Laplace, \ 
    599     RuleEvaluator_mEVC, \ 
    600     RuleFinder, \ 
    601     RuleBeamFinder, \ 
    602     RuleList, \ 
    603     RuleStoppingCriteria, \ 
    604     RuleStoppingCriteria_NegativeDistribution, \ 
    605     RuleValidator, \ 
    606     RuleValidator_LRS 
     8 
     9 RuleClassifier = Orange.core.RuleClassifier
     10 RuleClassifier_firstRule = Orange.core.RuleClassifier_firstRule
     11 RuleClassifier_logit = Orange.core.RuleClassifier_logit
     12 RuleLearner = Orange.core.RuleLearner
     13 Rule = Orange.core.Rule
     14 RuleList = Orange.core.RuleList
     15
     16 BeamCandidateSelector = Orange.core.RuleBeamCandidateSelector
     17 BeamCandidateSelector_TakeAll = Orange.core.RuleBeamCandidateSelector_TakeAll
     18 BeamFilter = Orange.core.RuleBeamFilter
     19 BeamFilter_Width = Orange.core.RuleBeamFilter_Width
     20 BeamInitializer = Orange.core.RuleBeamInitializer
     21 BeamInitializer_Default = Orange.core.RuleBeamInitializer_Default
     22 BeamRefiner = Orange.core.RuleBeamRefiner
     23 BeamRefiner_Selector = Orange.core.RuleBeamRefiner_Selector
     24 ClassifierConstructor = Orange.core.RuleClassifierConstructor
     25 CovererAndRemover = Orange.core.RuleCovererAndRemover
     26 CovererAndRemover_Default = Orange.core.RuleCovererAndRemover_Default
     27 DataStoppingCriteria = Orange.core.RuleDataStoppingCriteria
     28 DataStoppingCriteria_NoPositives = Orange.core.RuleDataStoppingCriteria_NoPositives
     29 Evaluator = Orange.core.RuleEvaluator
     30 Evaluator_Entropy = Orange.core.RuleEvaluator_Entropy
     31 Evaluator_LRS = Orange.core.RuleEvaluator_LRS
     32 Evaluator_Laplace = Orange.core.RuleEvaluator_Laplace
     33 Evaluator_mEVC = Orange.core.RuleEvaluator_mEVC
     34 Finder = Orange.core.RuleFinder
     35 BeamFinder = Orange.core.RuleBeamFinder
     36 StoppingCriteria = Orange.core.RuleStoppingCriteria
     37 StoppingCriteria_NegativeDistribution = Orange.core.RuleStoppingCriteria_NegativeDistribution
     38 Validator = Orange.core.RuleValidator
     39 Validator_LRS = Orange.core.RuleValidator_LRS
     40     
    607 41 from Orange.misc import deprecated_keywords
    608 42 from Orange.misc import deprecated_members

    639 73
    640 74
    641 class LaplaceEvaluator(RuleEvaluator):
     75 class LaplaceEvaluator(Evaluator):
    642 76     """
    643 77     Laplace's rule of succession.

    659 93
    660 94
    661 class WRACCEvaluator(RuleEvaluator):
     95 class WRACCEvaluator(Evaluator):
    662 96     """
    663 97     Weighted relative accuracy.

    686 120
    687 121
    688 class MEstimateEvaluator(RuleEvaluator):
     122 class MEstimateEvaluator(Evaluator):
    689 123     """
    690 124     Rule evaluator using m-estimate of probability rule evaluation function.

    718 152 class CN2Learner(RuleLearner):
    719 153     """
    720     Classical CN2 (see Clark and Niblett; 1988) induces a set of ordered
    721     rules, which means that the classifier must try these rules in the same
    722     order as they were learned.
    723      
    724     If data instances are provided to the constructor, the learning algorithm 
    725     is called and the resulting classifier is returned instead of the learner. 
    726  
    727     :param evaluator: an object that evaluates a rule from covered instances. 
     154     Classical CN2 inducer (Clark and Niblett; 1988) that constructs a
     155     set of ordered rules. Constructor returns either an instance of
     156     :obj:`CN2Learner` or, if training data is provided, a
     157     :obj:`CN2Classifier`.
     158
     159     :param evaluator: an object that evaluates a rule from instances.
    728 160         By default, entropy is used as a measure.
    729     :type evaluator: :class:`~Orange.classification.rules.RuleEvaluator`
     161     :type evaluator: :class:`~Orange.classification.rules.Evaluator`
    730 162     :param beam_width: width of the search beam.
    731 163     :type beam_width: int
    732     :param alpha: significance level of the likelihood ratio statistics to
    733         determine whether rule is better than the default rule.
     164     :param alpha: significance level of the likelihood ratio statistics
     165         to determine whether rule is better than the default rule.
    734 166     :type alpha: float
    735 167

    744 176             return self
    745 177
    746     def __init__(self, evaluator=RuleEvaluator_Entropy(), beam_width=5,
     178     def __init__(self, evaluator=Evaluator_Entropy(), beam_width=5,
    747 179         alpha=1.0, **kwds):
    748 180         self.__dict__.update(kwds)
    749         self.rule_finder = RuleBeamFinder()
    750         self.rule_finder.ruleFilter = RuleBeamFilter_Width(width=beam_width)
     181         self.rule_finder = BeamFinder()
     182         self.rule_finder.ruleFilter = BeamFilter_Width(width=beam_width)
    751 183         self.rule_finder.evaluator = evaluator
    752         self.rule_finder.validator = RuleValidator_LRS(alpha=alpha)
     184         self.rule_finder.validator = Validator_LRS(alpha=alpha)
    753 185
    754 186     def __call__(self, instances, weight=0):

    772 204 class CN2Classifier(RuleClassifier):
    773 205     """
    774     Classical CN2 (see Clark and Niblett; 1988) classifies a new instance 
    775     using an ordered set of rules. Usually the learner 
    776     (:class:`~Orange.classification.rules.CN2Learner`) is used to construct the 
    777     classifier. 
    778      
    779     :param rules: learned rules to be used for classification (mandatory). 
    780     :type rules: :class:`~Orange.classification.rules.RuleList` 
    781      
    782     :param instances: data instances that were used for learning. 
     206    Classical CN2 classifier (Clark and Niblett; 1988) that predicts a 
     207    class from an ordered list of rules. The classifier is usually 
     208    constructed by :class:`~Orange.classification.rules.CN2Learner`. 
     209     
     210    :param rules: induced rules 
     211    :type rules: :class:`~Orange.classification.rules.List` 
     212     
     213    :param instances: stored training data instances 
    783214    :type instances: :class:`Orange.data.Table` 
    784215     
     
    842273class CN2UnorderedLearner(RuleLearner): 
    843274    """ 
    844     CN2 unordered (see Clark and Boswell; 1991) induces a set of unordered 
    845     rules - classification from rules does not assume ordering of rules. 
    846     Learning rules is quite similar to learning in classical CN2, where 
    847     the process of learning of rules is separated to learning rules for each 
    848     class. 
    849      
    850     If data instances are provided to the constructor, the learning algorithm 
    851     is called and the resulting classifier is returned instead of the learner. 
    852  
     275    Unordered CN2 (Clark and Boswell; 1991) induces a set of unordered 
      276    rules. Learning is similar to classical CN2, except that 
      277    the rules are learned separately for each 
      278    class. 
     279 
     280    Constructor returns either an instance of 
     281    :obj:`CN2UnorderedLearner` or, if training data is provided, a 
     282    :obj:`CN2UnorderedClassifier`. 
     283     
    853284    :param evaluator: an object that evaluates a rule from covered instances. 
    854285        By default, Laplace's rule of succession is used as a measure.  
    855     :type evaluator: :class:`~Orange.classification.rules.RuleEvaluator` 
     286    :type evaluator: :class:`~Orange.classification.rules.Evaluator` 
    856287    :param beam_width: width of the search beam. 
    857288    :type beam_width: int 
     
    868299            return self 
    869300 
    870     def __init__(self, evaluator=RuleEvaluator_Laplace(), beam_width=5, 
     301    def __init__(self, evaluator=Evaluator_Laplace(), beam_width=5, 
    871302        alpha=1.0, **kwds): 
    872303        self.__dict__.update(kwds) 
    873         self.rule_finder = RuleBeamFinder() 
    874         self.rule_finder.ruleFilter = RuleBeamFilter_Width(width=beam_width) 
     304        self.rule_finder = BeamFinder() 
     305        self.rule_finder.ruleFilter = BeamFilter_Width(width=beam_width) 
    875306        self.rule_finder.evaluator = evaluator 
    876         self.rule_finder.validator = RuleValidator_LRS(alpha=alpha) 
    877         self.rule_finder.rule_stoppingValidator = RuleValidator_LRS(alpha=1.0) 
    878         self.rule_stopping = RuleStopping_Apriori() 
    879         self.data_stopping = RuleDataStoppingCriteria_NoPositives() 
     307        self.rule_finder.validator = Validator_LRS(alpha=alpha) 
     308        self.rule_finder.rule_stoppingValidator = Validator_LRS(alpha=1.0) 
     309        self.rule_stopping = Stopping_Apriori() 
     310        self.data_stopping = DataStoppingCriteria_NoPositives() 
    880311 
    881312    @deprecated_keywords({"weight": "weight_id"}) 
     
    918349class CN2UnorderedClassifier(RuleClassifier): 
    919350    """ 
    920     CN2 unordered (see Clark and Boswell; 1991) classifies a new instance using 
    921     a set of unordered rules. Usually the learner 
    922     (:class:`~Orange.classification.rules.CN2UnorderedLearner`) is used to 
    923     construct the classifier. 
    924      
    925     :param rules: learned rules to be used for classification (mandatory). 
     351    Unordered CN2 classifier (Clark and Boswell; 1991) classifies an 
     352    instance using a set of unordered rules. The classifier is 
     353    typically constructed with 
     354    :class:`~Orange.classification.rules.CN2UnorderedLearner`. 
     355     
     356    :param rules: induced rules 
    926357    :type rules: :class:`~Orange.classification.rules.RuleList` 
    927358     
    928     :param instances: data instances that were used for learning. 
     359    :param instances: stored training data instances 
    929360    :type instances: :class:`Orange.data.Table` 
    930361     
     
    948379    def __call__(self, instance, result_type=Orange.classification.Classifier.GetValue, ret_rules=False): 
    949380        """ 
     381        The call has another optional argument that is used to tell 
     382        the classifier to also return the rules that cover the given 
     383        data instance. 
     384 
    950385        :param instance: instance to be classified. 
    951386        :type instance: :class:`Orange.data.Instance` 
     
    956391         
    957392        :rtype: :class:`Orange.data.Value`,  
    958               :class:`Orange.statistics.distribution.Distribution` or a tuple with both 
     393              :class:`Orange.statistics.distribution.Distribution` or a tuple with both, and a list of rules if :obj:`ret_rules` is ``True`` 
    959394        """ 
    960395        def add(disc1, disc2, sumd): 
     
    1006441class CN2SDUnorderedLearner(CN2UnorderedLearner): 
    1007442    """ 
    1008     CN2-SD (see Lavrac et al.; 2004) induces a set of unordered rules, which 
    1009     is the same as :class:`~Orange.classification.rules.CN2UnorderedLearner`. 
    1010     The difference between classical CN2 unordered and CN2-SD is selection of 
    1011     specific evaluation function and covering function: 
    1012     :class:`WRACCEvaluator` is used to implement 
    1013     weight-relative accuracy and  
    1014     :class:`CovererAndRemover_MultWeights` avoids 
    1015     excluding covered instances, multiplying their weight by the value of 
    1016     mult parameter instead. 
    1017      
    1018     If data instances are provided to the constructor, the learning algorithm 
    1019     is called and the resulting classifier is returned instead of the learner. 
     443    CN2-SD (Lavrac et al.; 2004) induces a set of unordered rules used 
     444    by :class:`~Orange.classification.rules.CN2UnorderedClassifier`. 
      445    CN2-SD differs from unordered CN2 by the default evaluation function and 
     446    covering function: :class:`WRACCEvaluator` computes weighted 
     447    relative accuracy and :class:`CovererAndRemover_MultWeights` 
     448    decreases the weight of covered data instances instead of removing 
     449    them. 
     450     
     451    Constructor returns either an instance of 
     452    :obj:`CN2SDUnorderedLearner` or, if training data is provided, a 
     453    :obj:`CN2UnorderedClassifier`. 
    1020454 
    1021455    :param evaluator: an object that evaluates a rule from covered instances. 
    1022456        By default, weighted relative accuracy is used. 
    1023     :type evaluator: :class:`~Orange.classification.rules.RuleEvaluator` 
     457    :type evaluator: :class:`~Orange.classification.rules.Evaluator` 
    1024458     
    1025459    :param beam_width: width of the search beam. 
     
    1059493class ABCN2(RuleLearner): 
    1060494    """ 
    1061     This is an implementation of argument-based CN2 using EVC as evaluation 
    1062     and LRC classification. 
    1063      
    1064     Rule learning parameters that can be passed to constructor: 
     495    Argument-based CN2 that uses EVC for evaluation 
     496    and LRC for classification. 
    1065497     
    1066498    :param width: beam width (default 5). 
    1067499    :type width: int 
    1068     :param learn_for_class: class for which to learn; None (default) if all 
    1069        classes are to be learnt. 
    1070     :param learn_one_rule: decides whether to rule one rule only (default 
    1071        False). 
     500    :param learn_for_class: class for which to learn; ``None`` (default) if all 
     501       classes are to be learned. 
     502    :param learn_one_rule: decides whether to learn only a single rule (default: 
     503       ``False``). 
    1072504    :type learn_one_rule: boolean 
    1073505    :param analyse_argument: index of argument to analyse; -1 to learn normally 
    1074506       (default) 
    1075507    :type analyse_argument: int 
    1076     :param debug: sets debug mode - prints some info during execution; False (default) 
     508    :param debug: sets debug mode that prints some info during execution (default: ``False``) 
    1077509    :type debug: boolean 
    1078510     
    1079     The following evaluator related arguments are supported: 
     511    The following evaluator related arguments are also supported: 
    1080512     
    1081513    :param m: m for m-estimate to be corrected with EVC (default 2). 
    1082514    :type m: int 
    1083515    :param opt_reduction: type of EVC correction: 0=no correction, 
    1084        1=pessimistic, 2=normal (default 2). 
     516       1=pessimistic, 2=normal (default). 
    1085517    :type opt_reduction: int 
    1086     :param nsampling: number of samples in estimating extreme value 
    1087        distribution for EVC (default 100). 
     518    :param nsampling: number of samples for estimation of extreme value 
     519       distribution for EVC (default: 100). 
    1088520    :type nsampling: int 
    1089521    :param evd: pre-given extreme value distributions. 
    1090522    :param evd_arguments: pre-given extreme value distributions for arguments. 
    1091523     
    1092     Those parameters control rule validation: 
     524    The following parameters control rule validation: 
    1093525     
    1094526    :param rule_sig: minimal rule significance (default 1.0). 
     
    1140572        self.postpruning = postpruning 
    1141573        # rule finder 
    1142         self.rule_finder = RuleBeamFinder() 
    1143         self.ruleFilter = RuleBeamFilter_Width(width=width) 
     574        self.rule_finder = BeamFinder() 
     575        self.ruleFilter = BeamFilter_Width(width=width) 
    1144576        self.ruleFilter_arguments = ABBeamFilter(width=width) 
    1145577        if max_rule_complexity - 1 < 0: 
    1146578            max_rule_complexity = 10 
    1147         self.rule_finder.rule_stoppingValidator = RuleValidator_LRS(alpha=1.0, min_quality=0., max_rule_complexity=max_rule_complexity - 1, min_coverage=min_coverage) 
    1148         self.refiner = RuleBeamRefiner_Selector() 
     579        self.rule_finder.rule_stoppingValidator = Validator_LRS(alpha=1.0, min_quality=0., max_rule_complexity=max_rule_complexity - 1, min_coverage=min_coverage) 
     580        self.refiner = BeamRefiner_Selector() 
    1149581        self.refiner_arguments = SelectorAdder(discretizer=Orange.feature.discretization.Entropy(forceAttribute=1, 
    1150582                                                                                           maxNumberOfIntervals=2)) 
     
    1152584        # evc evaluator 
    1153585        evdGet = Orange.core.EVDistGetter_Standard() 
    1154         self.rule_finder.evaluator = RuleEvaluator_mEVC(m=m, evDistGetter=evdGet, min_improved=min_improved, min_improved_perc=min_improved_perc) 
     586        self.rule_finder.evaluator = Evaluator_mEVC(m=m, evDistGetter=evdGet, min_improved=min_improved, min_improved_perc=min_improved_perc) 
    1155587        self.rule_finder.evaluator.returnExpectedProb = True 
    1156588        self.rule_finder.evaluator.optimismReduction = opt_reduction 
    1157589        self.rule_finder.evaluator.ruleAlpha = rule_sig 
    1158590        self.rule_finder.evaluator.attributeAlpha = att_sig 
    1159         self.rule_finder.evaluator.validator = RuleValidator_LRS(alpha=1.0, min_quality=min_quality, min_coverage=min_coverage, max_rule_complexity=max_rule_complexity - 1) 
     591        self.rule_finder.evaluator.validator = Validator_LRS(alpha=1.0, min_quality=min_quality, min_coverage=min_coverage, max_rule_complexity=max_rule_complexity - 1) 
    1160592 
    1161593        # learn stopping criteria 
    1162594        self.rule_stopping = None 
    1163         self.data_stopping = RuleDataStoppingCriteria_NoPositives() 
     595        self.data_stopping = DataStoppingCriteria_NoPositives() 
    1164596        # evd fitting 
    1165597        self.evd_creator = EVDFitter(self, n=nsampling) 
     
    1220652            while aes: 
    1221653                if self.analyse_argument > -1 and \ 
    1222                    (isinstance(self.analyse_argument, Orange.core.Example) and not Orange.core.Example(dich_data.domain, self.analyse_argument) == aes[0] or \ 
     654                   (isinstance(self.analyse_argument, Orange.data.Instance) and not Orange.data.Instance(dich_data.domain, self.analyse_argument) == aes[0] or \ 
    1223655                    isinstance(self.analyse_argument, int) and not dich_data[self.analyse_argument] == aes[0]): 
    1224656                    aes = aes[1:] 
     
    1392824 
    1393825    def change_domain(self, rule, cl, examples, weight_id): 
    1394         rule.filter = Orange.core.Filter_values(domain=examples.domain, 
    1395                                         conditions=rule.filter.conditions) 
     826        rule.filter = Orange.data.Values( 
     827            domain=examples.domain, conditions=rule.filter.conditions) 
    1396828        rule.filterAndStore(examples, weight_id, cl) 
    1397829        if hasattr(rule, "learner") and hasattr(rule.learner, "arg_example"): 
     
    1488920                p.filter.filter.conditions.extend(pruned_conditions) 
    1489921                # if argument does not contain all unspecialized reasons, add those reasons with minimum values 
    1490                 at_oper_pairs = [(c.position, c.oper) for c in p.filter.conditions if type(c) == Orange.core.ValueFilter_continuous] 
     922                at_oper_pairs = [(c.position, c.oper) for c in p.filter.conditions if type(c) == Orange.data.filter.ValueFilterContinuous] 
    1491923                for u in unspec_conditions: 
    1492924                    if not (u.position, u.oper) in at_oper_pairs: 
    1493925                        # find minimum value 
    1494                         if u.oper == Orange.core.ValueFilter_continuous.Greater or u.oper == Orange.core.ValueFilter_continuous.GreaterEqual: 
     926                        if u.oper == Orange.data.filter.ValueFilter.Greater or \ 
     927                            u.oper == Orange.data.filter.ValueFilter.GreaterEqual: 
    1495928                            u.ref = min([float(e[u.position]) - 10. for e in p.examples]) 
    1496929                        else: 
     
    1514947 
    1515948    def newFilter_values(self, filter): 
    1516         newFilter = Orange.core.Filter_values() 
     949        newFilter = Orange.data.filter.Values() 
    1517950        newFilter.conditions = filter.conditions[:] 
    1518951        newFilter.domain = filter.domain 
     
    1531964            return [] 
    1532965        cn2_learner = Orange.classification.rules.CN2UnorderedLearner() 
    1533         cn2_learner.rule_finder = RuleBeamFinder() 
     966        cn2_learner.rule_finder = BeamFinder() 
    1534967        cn2_learner.rule_finder.refiner = SelectorArgConditions(crit_example, allowed_conditions) 
    1535968        cn2_learner.rule_finder.evaluator = Orange.classification.rules.MEstimateEvaluator(self.rule_finder.evaluator.m) 
     
    1550983class CN2EVCUnorderedLearner(ABCN2): 
    1551984    """ 
    1552     CN2-SD (see Lavrac et al.; 2004) induces a set of unordered rules in a 
    1553     simmilar manner as 
    1554     :class:`~Orange.classification.rules.CN2SDUnorderedLearner`. This 
    1555     implementation uses the EVC rule evaluation. 
    1556      
    1557     If data instances are provided to the constructor, the learning algorithm 
    1558     is called and the resulting classifier is returned instead of the learner. 
     985    A learner similar to CN2-SD (:obj:`CN2SDUnorderedLearner`) except that 
     986    it uses EVC for rule evaluation. 
    1559987 
    1560988    :param evaluator: an object that evaluates a rule from covered instances. 
    1561989        By default, weighted relative accuracy is used. 
    1562     :type evaluator: :class:`~Orange.classification.rules.RuleEvaluator` 
     990    :type evaluator: :class:`~Orange.classification.rules.Evaluator` 
    1563991     
    1564992    :param beam_width: width of the search beam. 
     
    15781006            max_rule_complexity=int(max_rule_complexity)) 
    15791007 
    1580 class DefaultLearner(Orange.core.Learner): 
    1581     """ 
    1582     Default lerner - returns default classifier with predefined output class. 
     1008class DefaultLearner(Orange.classification.Learner): 
     1009    """ 
     1010    Default learner - returns default classifier with predefined output class. 
    15831011    """ 
    15841012    def __init__(self, default_value=None): 
    15851013        self.default_value = default_value 
    15861014    def __call__(self, examples, weight_id=0): 
    1587         return Orange.classification.ConstantClassifier(self.default_value, defaultDistribution=Orange.core.Distribution(examples.domain.class_var, examples, weight_id)) 
     1015        return Orange.classification.ConstantClassifier(self.default_value, defaultDistribution=Orange.statistics.Distribution(examples.domain.class_var, examples, weight_id)) 
    15881016 
    15891017class ABCN2Ordered(ABCN2): 
     
    16241052 
    16251053 
    1626 class RuleStopping_Apriori(RuleStoppingCriteria): 
     1054class Stopping_Apriori(StoppingCriteria): 
    16271055    def __init__(self, apriori=None): 
    16281056        self.apriori = None 
     
    16401068 
    16411069 
    1642 class RuleStopping_SetRules(RuleStoppingCriteria): 
     1070class Stopping_SetRules(StoppingCriteria): 
    16431071    def __init__(self, validator): 
    1644         self.rule_stopping = RuleStoppingCriteria_NegativeDistribution() 
     1072        self.rule_stopping = StoppingCriteria_NegativeDistribution() 
    16451073        self.validator = validator 
    16461074 
     
    16521080 
    16531081 
    1654 class LengthValidator(RuleValidator): 
     1082class LengthValidator(Validator): 
    16551083    """ prune rules with more conditions than self.length. """ 
    16561084    def __init__(self, length= -1): 
     
    16631091 
    16641092 
    1665 class NoDuplicatesValidator(RuleValidator): 
     1093class NoDuplicatesValidator(Validator): 
    16661094    def __init__(self, alpha=.05, min_coverage=0, max_rule_length=0, rules=RuleList()): 
    16671095        self.rules = rules 
    1668         self.validator = RuleValidator_LRS(alpha=alpha, \ 
     1096        self.validator = Validator_LRS(alpha=alpha, \ 
    16691097            min_coverage=min_coverage, max_rule_length=max_rule_length) 
    16701098 
     
    16761104 
    16771105 
    1678 class RuleClassifier_BestRule(RuleClassifier): 
     1106class Classifier_BestRule(RuleClassifier): 
    16791107    def __init__(self, rules, instances, weight_id=0, **argkw): 
    16801108        self.rules = rules 
     
    17171145 
    17181146 
    1719 class CovererAndRemover_MultWeights(RuleCovererAndRemover): 
    1720     """ 
    1721     Covering and removing of instances using weight multiplication: 
     1147class CovererAndRemover_MultWeights(CovererAndRemover): 
     1148    """ 
     1149    Covering and removing of instances using weight multiplication. 
    17221150     
    17231151    :param mult: weighting multiplication factor 
     
    17461174 
    17471175 
    1748 class CovererAndRemover_AddWeights(RuleCovererAndRemover): 
     1176class CovererAndRemover_AddWeights(CovererAndRemover): 
    17491177    """ 
    17501178    Covering and removing of instances using weight addition. 
     
    17811209 
    17821210 
    1783 class CovererAndRemover_Prob(RuleCovererAndRemover): 
     1211class CovererAndRemover_Prob(CovererAndRemover): 
     17841212    """ This class implements probabilistic covering. """ 
    17851213    def __init__(self, examples, weight_id, target_class, apriori, argument_id): 
     
    18161244 
    18171245    def filter_covers_example(self, example, filter): 
    1818         filter_indices = RuleCoversArguments.filterIndices(filter) 
     1246        filter_indices = CoversArguments.filterIndices(filter) 
    18191247        if filter(example): 
    18201248            try: 
     
    18411269 
    18421270    def condIn(self, cond, filter_indices): # is condition in the filter? 
    1843         condInd = RuleCoversArguments.conditionIndex(cond) 
     1271        condInd = CoversArguments.conditionIndex(cond) 
    18441272        if operator.or_(condInd, filter_indices[cond.position]) == filter_indices[cond.position]: 
    18451273            return True 
     
    18701298    """ 
    18711299    def selectSign(oper): 
    1872         if oper == Orange.core.ValueFilter_continuous.Less: 
     1300        if oper == Orange.data.filter.ValueFilter.Less: 
    18731301            return "<" 
    1874         elif oper == Orange.core.ValueFilter_continuous.LessEqual: 
     1302        elif oper == Orange.data.filter.ValueFilter.LessEqual: 
    18751303            return "<=" 
    1876         elif oper == Orange.core.ValueFilter_continuous.Greater: 
     1304        elif oper == Orange.data.filter.ValueFilter.Greater: 
    18771305            return ">" 
    1878         elif oper == Orange.core.ValueFilter_continuous.GreaterEqual: 
     1306        elif oper == Orange.data.filter.ValueFilter.GreaterEqual: 
    18791307            return ">=" 
    18801308        else: return "=" 
     
    18921320        if i > 0: 
    18931321            ret += " AND " 
    1894         if type(c) == Orange.core.ValueFilter_discrete: 
     1322        if isinstance(c, Orange.data.filter.ValueFilterDiscrete): 
    18951323            ret += domain[c.position].name + "=" + str([domain[c.position].\ 
    18961324                values[int(v)] for v in c.values]) 
    1897         elif type(c) == Orange.core.ValueFilter_continuous: 
     1325        elif isinstance(c, Orange.data.filter.ValueFilterContinuous): 
    18981326            ret += domain[c.position].name + selectSign(c.oper) + str(c.ref) 
    1899     if rule.classifier and type(rule.classifier) == Orange.classification.ConstantClassifier\ 
     1327    if isinstance(rule.classifier, Orange.classification.ConstantClassifier) \ 
    19001328            and rule.classifier.default_val: 
    19011329        ret = ret + " THEN " + domain.class_var.name + "=" + \ 
    1902         str(rule.classifier.default_value) 
     1330            str(rule.classifier.default_value) 
    19031331        if show_distribution: 
    19041332            ret += str(rule.class_distribution) 
    1905     elif rule.classifier and type(rule.classifier) == Orange.classification.ConstantClassifier\ 
    1906             and type(domain.class_var) == Orange.core.EnumVariable: 
     1333    elif isinstance(rule.classifier, Orange.classification.ConstantClassifier) \ 
     1334            and isinstance(domain.class_var, Orange.feature.Discrete): 
    19071335        ret = ret + " THEN " + domain.class_var.name + "=" + \ 
    1908         str(rule.class_distribution.modus()) 
     1336            str(rule.class_distribution.modus()) 
    19091337        if show_distribution: 
    19101338            ret += str(rule.class_distribution) 
     
    19141342    if not instances.domain.class_var: 
    19151343        raise Exception("Class variable is required!") 
    1916     if instances.domain.class_var.varType == Orange.core.VarTypes.Continuous: 
     1344    if instances.domain.class_var.var_type != Orange.feature.Type.Discrete: 
    19171345        raise Exception("CN2 requires a discrete class!") 
    19181346 
     
    19251353 
    19261354def rules_equal(rule1, rule2): 
    1927     if not len(rule1.filter.conditions) == len(rule2.filter.conditions): 
     1355    if len(rule1.filter.conditions) != len(rule2.filter.conditions): 
    19281356        return False 
    19291357    for c1 in rule1.filter.conditions: 
     
    19311359        for c2 in rule2.filter.conditions: 
    19321360            try: 
    1933                 if not c1.position == c2.position: continue # same feature? 
    1934                 if not type(c1) == type(c2): continue # same type of condition 
    1935                 if type(c1) == Orange.core.ValueFilter_discrete: 
    1936                     if not type(c1.values[0]) == type(c2.values[0]): continue 
    1937                     if not c1.values[0] == c2.values[0]: continue # same value? 
    1938                 if type(c1) == Orange.core.ValueFilter_continuous: 
    1939                     if not c1.oper == c2.oper: continue # same operator? 
    1940                     if not c1.ref == c2.ref: continue #same threshold? 
      1361                if c1.position != c2.position or type(c1) != type(c2): 
      1362                    continue # different feature or type 
     1363                if isinstance(c1, Orange.data.filter.ValueFilterDiscrete): 
     1364                    if type(c1.values[0]) != type(c2.values[0]) or \ 
     1365                            c1.values[0] != c2.values[0]: 
     1366                        continue # same value? 
     1367                if isinstance(c1, Orange.data.filter.ValueFilterContinuous): 
     1368                    if c1.oper != c2.oper or c1.ref != c2.ref: 
     1369                        continue # same operator? 
    19411370                found = True 
    19421371                break 
     
    19861415 
    19871416    def createRandomDataSet(self, data): 
    1988         newData = Orange.core.ExampleTable(data) 
     1417        newData = Orange.data.Table(data) 
    19891418        # shuffle data 
    19901419        cl_num = newData.toNumpy("C") 
    19911420        random.shuffle(cl_num[0][:, 0]) 
    1992         clData = Orange.core.ExampleTable(Orange.core.Domain([newData.domain.classVar]), cl_num[0]) 
     1421        clData = Orange.data.Table(Orange.data.Domain([newData.domain.classVar]), cl_num[0]) 
    19931422        for d_i, d in enumerate(newData): 
    19941423            d[newData.domain.classVar] = clData[d_i][newData.domain.classVar] 
     
    20291458        self.learner.ruleFinder.ruleStoppingValidator = Orange.core.RuleValidator_LRS(alpha=1.0) 
    20301459        self.learner.ruleFinder.ruleStoppingValidator.max_rule_complexity = 0 
    2031         self.learner.ruleFinder.refiner = Orange.core.RuleBeamRefiner_Selector() 
    2032         self.learner.ruleFinder.ruleFilter = Orange.core.RuleBeamFilter_Width(width=5) 
     1460        self.learner.ruleFinder.refiner = BeamRefiner_Selector() 
     1461        self.learner.ruleFinder.ruleFilter = BeamFilter_Width(width=5) 
    20331462 
    20341463 
     
    21061535        return self.createEVDistList(extremeDists) 
    21071536 
    2108 class ABBeamFilter(Orange.core.RuleBeamFilter): 
     1537class ABBeamFilter(BeamFilter): 
    21091538    """ 
    21101539    ABBeamFilter: Filters beam; 
     
    21171546 
    21181547    def __call__(self, rulesStar, examples, weight_id): 
    2119         newStar = Orange.core.RuleList() 
     1548        newStar = RuleList() 
    21201549        rulesStar.sort(lambda x, y:-cmp(x.quality, y.quality)) 
    21211550        argsNum = 0 
     
    21471576 
    21481577 
    2149 class RuleCoversArguments: 
     1578class CoversArguments: 
    21501579    """ 
    21511580    Class determines if rule covers one out of a set of arguments. 
     
    21571586            indNA = getattr(a.filter, "indices", None) 
    21581587            if not indNA: 
    2159                 a.filter.setattr("indices", RuleCoversArguments.filterIndices(a.filter)) 
     1588                a.filter.setattr("indices", CoversArguments.filterIndices(a.filter)) 
    21601589            self.indices.append(a.filter.indices) 
    21611590 
     
    21641593            return False 
    21651594        if not getattr(rule.filter, "indices", None): 
    2166             rule.filter.indices = RuleCoversArguments.filterIndices(rule.filter) 
     1595            rule.filter.indices = CoversArguments.filterIndices(rule.filter) 
    21671596        for index in self.indices: 
    21681597            if map(operator.or_, rule.filter.indices, index) == rule.filter.indices: 
     
    21761605        for c in filter.conditions: 
    21771606            ind[c.position] = operator.or_(ind[c.position], 
    2178                                          RuleCoversArguments.conditionIndex(c)) 
     1607                                         CoversArguments.conditionIndex(c)) 
    21791608        return ind 
    21801609    filterIndices = staticmethod(filterIndices) 
    21811610 
    21821611    def conditionIndex(c): 
    2183         if type(c) == Orange.core.ValueFilter_continuous: 
    2184             if (c.oper == Orange.core.ValueFilter_continuous.GreaterEqual or 
    2185                 c.oper == Orange.core.ValueFilter_continuous.Greater): 
     1612        if isinstance(c, Orange.data.filter.ValueFilterContinuous): 
     1613            if (c.oper == Orange.data.filter.ValueFilter.GreaterEqual or 
     1614                c.oper == Orange.data.filter.ValueFilter.Greater): 
    21861615                return 5# 0101 
    2187             elif (c.oper == Orange.core.ValueFilter_continuous.LessEqual or 
    2188                   c.oper == Orange.core.ValueFilter_continuous.Less): 
     1616            elif (c.oper == Orange.data.filter.ValueFilter.LessEqual or 
     1617                  c.oper == Orange.data.filter.ValueFilter.Less): 
    21891618                return 3 # 0011 
    21901619            else: 
     
    22131642 
    22141643 
    2215 class SelectorAdder(Orange.core.RuleBeamRefiner): 
     1644class SelectorAdder(BeamRefiner): 
    22161645    """ 
    22171646    Selector adder, this function is a refiner function: 
     
    22271656 
    22281657    def __call__(self, oldRule, data, weight_id, target_class= -1): 
    2229         inNotAllowedSelectors = RuleCoversArguments(self.not_allowed_selectors) 
    2230         new_rules = Orange.core.RuleList() 
     1658        inNotAllowedSelectors = CoversArguments(self.not_allowed_selectors) 
     1659        new_rules = RuleList() 
    22311660 
    22321661        # get positive indices (selectors already in the rule) 
    22331662        indices = getattr(oldRule.filter, "indices", None) 
    22341663        if not indices: 
    2235             indices = RuleCoversArguments.filterIndices(oldRule.filter) 
     1664            indices = CoversArguments.filterIndices(oldRule.filter) 
    22361665            oldRule.filter.setattr("indices", indices) 
    22371666 
     
    22401669        for nA in self.not_allowed_selectors: 
    22411670            #print indices, nA.filter.indices 
    2242             at_i, type_na = RuleCoversArguments.oneSelectorToCover(indices, nA.filter.indices) 
     1671            at_i, type_na = CoversArguments.oneSelectorToCover(indices, nA.filter.indices) 
    22431672            if at_i > -1: 
    22441673                negative_indices[at_i] = operator.or_(negative_indices[at_i], type_na) 
     
    22501679            if ind == 1: 
    22511680                continue 
    2252             if data.domain[i].varType == Orange.core.VarTypes.Discrete and not negative_indices[i] == 1: # DISCRETE attribute 
     1681            if data.domain[i].varType == Orange.feature.Type.Discrete and not negative_indices[i] == 1: # DISCRETE attribute 
    22531682                if self.example: 
    22541683                    values = [self.example[i]] 
     
    22571686                for v in values: 
    22581687                    tempRule = oldRule.clone() 
    2259                     tempRule.filter.conditions.append(Orange.core.ValueFilter_discrete(position=i, 
    2260                                                                                   values=[Orange.core.Value(data.domain[i], v)], 
    2261                                                                                   acceptSpecial=0)) 
     1688                    tempRule.filter.conditions.append( 
     1689                        Orange.data.filter.Discrete( 
     1690                            position=i, 
     1691                            values=[Orange.data.Value(data.domain[i], v)], 
     1692                            acceptSpecial=0)) 
    22621693                    tempRule.complexity += 1 
    2263                     tempRule.filter.indices[i] = 1 # 1 stands for discrete attribute (see RuleCoversArguments.conditionIndex) 
     1694                    tempRule.filter.indices[i] = 1 # 1 stands for discrete attribute (see CoversArguments.conditionIndex) 
    22641695                    tempRule.filterAndStore(oldRule.examples, oldRule.weightID, target_class) 
    22651696                    if len(tempRule.examples) < len(oldRule.examples): 
    22661697                        new_rules.append(tempRule) 
    2267             elif data.domain[i].varType == Orange.core.VarTypes.Continuous and not negative_indices[i] == 7: # CONTINUOUS attribute 
     1698            elif data.domain[i].varType == Orange.feature.Type.Continuous and not negative_indices[i] == 7: # CONTINUOUS attribute 
    22681699                try: 
    22691700                    at = data.domain[i] 
     
    22761707                        #LESS 
    22771708                        if not negative_indices[i] == 3: 
    2278                             tempRule = self.getTempRule(oldRule, i, Orange.core.ValueFilter_continuous.LessEqual, p, target_class, 3) 
     1709                            tempRule = self.getTempRule(oldRule, i, Orange.data.filter.ValueFilter.LessEqual, p, target_class, 3) 
    22791710                            if len(tempRule.examples) < len(oldRule.examples) and self.example[i] <= p:# and not inNotAllowedSelectors(tempRule): 
    22801711                                new_rules.append(tempRule) 
    22811712                        #GREATER 
    22821713                        if not negative_indices[i] == 5: 
    2283                             tempRule = self.getTempRule(oldRule, i, Orange.core.ValueFilter_continuous.Greater, p, target_class, 5) 
     1714                            tempRule = self.getTempRule(oldRule, i, Orange.data.filter.ValueFilter.Greater, p, target_class, 5) 
    22841715                            if len(tempRule.examples) < len(oldRule.examples) and self.example[i] > p:# and not inNotAllowedSelectors(tempRule): 
    22851716                                new_rules.append(tempRule) 
     
    22921723        tempRule = oldRule.clone() 
    22931724 
    2294         tempRule.filter.conditions.append(Orange.core.ValueFilter_continuous(position=pos, 
    2295                                                                         oper=oper, 
    2296                                                                         ref=ref, 
    2297                                                                         acceptSpecial=0)) 
     1725        tempRule.filter.conditions.append( 
     1726            Orange.data.filter.ValueFilterContinuous( 
     1727                position=pos, oper=oper, ref=ref, acceptSpecial=0)) 
    22981728        tempRule.complexity += 1 
    22991729        tempRule.filter.indices[pos] = operator.or_(tempRule.filter.indices[pos], atIndex) # from RuleCoversArguments.conditionIndex 
     
    23131743# This filter is the ugliest code ever! Problem is with Orange, I had some problems with inheriting deepCopy 
    23141744# I should take another look at it. 
    2315 class ArgFilter(Orange.core.Filter): 
     1745class ArgFilter(Orange.data.filter.Filter): 
    23161746    """ This class implements AB-covering principle. """ 
    2317     def __init__(self, argument_id=None, filter=Orange.core.Filter_values(), arg_example=None): 
     1747    def __init__(self, argument_id=None, filter=Orange.data.filter.Values(), arg_example=None): 
    23181748        self.filter = filter 
    23191749        self.indices = getattr(filter, "indices", []) 
    23201750        if not self.indices and len(filter.conditions) > 0: 
    2321             self.indices = RuleCoversArguments.filterIndices(filter) 
     1751            self.indices = CoversArguments.filterIndices(filter) 
    23221752        self.argument_id = argument_id 
    23231753        self.domain = self.filter.domain 
     
    23661796    def deep_copy(self): 
    23671797        newFilter = ArgFilter(argument_id=self.argument_id) 
    2368         newFilter.filter = Orange.core.Filter_values() #self.filter.deepCopy() 
     1798        newFilter.filter = Orange.data.filter.Values() #self.filter.deepCopy() 
    23691799        newFilter.filter.conditions = self.filter.conditions[:] 
    23701800        newFilter.domain = self.filter.domain 
     
    23781808ArgFilter = deprecated_members({"argumentID": "argument_id"})(ArgFilter) 
    23791809 
    2380 class SelectorArgConditions(Orange.core.RuleBeamRefiner): 
     1810class SelectorArgConditions(BeamRefiner): 
    23811811    """ 
    23821812    Selector adder, this function is a refiner function: 
     
    23901820    def __call__(self, oldRule, data, weight_id, target_class= -1): 
    23911821        if len(oldRule.filter.conditions) >= len(self.allowed_selectors): 
    2392             return Orange.core.RuleList() 
    2393         new_rules = Orange.core.RuleList() 
     1822            return RuleList() 
     1823        new_rules = RuleList() 
    23941824        for c in self.allowed_selectors: 
    23951825            # normal condition 
     
    24111841                for v in values: 
    24121842                    tempRule = oldRule.clone() 
    2413                     tempRule.filter.conditions.append(Orange.core.ValueFilter_continuous(position=c.position, 
    2414                                                                                     oper=c.oper, 
    2415                                                                                     ref=float(v), 
    2416                                                                                     acceptSpecial=0)) 
     1843                    tempRule.filter.conditions.append( 
     1844                        Orange.data.filter.ValueFilterContinuous( 
     1845                            position=c.position, oper=c.oper, 
     1846                            ref=float(v), acceptSpecial=0)) 
    24171847                    if tempRule(self.example): 
    2418                         tempRule.filterAndStore(oldRule.examples, oldRule.weightID, target_class) 
     1848                        tempRule.filterAndStore( 
     1849                            oldRule.examples, oldRule.weightID, target_class) 
    24191850                        if len(tempRule.examples) < len(oldRule.examples): 
    24201851                            new_rules.append(tempRule) 
     
    24401871        prob_dist = Orange.core.DistributionList() 
    24411872        for tex in res.results: 
    2442             d = Orange.core.Distribution(examples.domain.class_var) 
     1873            d = Orange.statistics.Distribution(examples.domain.class_var) 
    24431874            for di in range(len(d)): 
    24441875                d[di] = tex.probabilities[0][di] 
     
    24671898##            for e in examples: 
    24681899##                prob_dist.append(classifier(e,Orange.core.GetProbabilities)) 
    2469             cl = Orange.core.RuleClassifier_logit(rules, self.min_cl_sig, self.min_beta, examples, weight, self.set_prefix_rules, self.optimize_betas, classifier, prob_dist) 
     1900            cl = RuleClassifier_logit(rules, self.min_cl_sig, self.min_beta, examples, weight, self.set_prefix_rules, self.optimize_betas, classifier, prob_dist) 
    24701901        else: 
    2471             cl = Orange.core.RuleClassifier_logit(rules, self.min_cl_sig, self.min_beta, examples, weight, self.set_prefix_rules, self.optimize_betas) 
     1902            cl = RuleClassifier_logit(rules, self.min_cl_sig, self.min_beta, examples, weight, self.set_prefix_rules, self.optimize_betas) 
    24721903 
    24731904##        print "result" 
     
    24841915    def add_null_rule(self, rules, examples, weight): 
    24851916        for cl in examples.domain.class_var: 
    2486             tmpRle = Orange.core.Rule() 
    2487             tmpRle.filter = Orange.core.Filter_values(domain=examples.domain) 
     1917            tmpRle = Rule() 
     1918            tmpRle.filter = Orange.data.filter.Values(domain=examples.domain) 
    24881919            tmpRle.parentRule = None 
    24891920            tmpRle.filterAndStore(examples, weight, int(cl)) 
     
    24931924 
    24941925    def sort_rules(self, rules): 
    2495         new_rules = Orange.core.RuleList() 
     1926        new_rules = RuleList() 
    24961927        foundRule = True 
    24971928        while foundRule: 
     
    25221953 
    25231954 
    2524 class RuleClassifier_bestRule(Orange.core.RuleClassifier): 
     1955class RuleClassifier_bestRule(RuleClassifier): 
    25251956    """ 
    25261957    A very simple classifier, it takes the best rule of each class and 
     
    25301961        self.rules = rules 
    25311962        self.examples = examples 
    2532         self.apriori = Orange.core.Distribution(examples.domain.class_var, examples, weight_id) 
     1963        self.apriori = Orange.statistics.Distribution(examples.domain.class_var, examples, weight_id) 
    25331964        self.apriori_prob = [a / self.apriori.abs for a in self.apriori] 
    25341965        self.weight_id = weight_id 
     
    25381969    @deprecated_keywords({"retRules": "ret_rules"}) 
    25391970    def __call__(self, example, result_type=Orange.classification.Classifier.GetValue, ret_rules=False): 
    2540         example = Orange.core.Example(self.examples.domain, example) 
    2541         tempDist = Orange.core.Distribution(example.domain.class_var) 
     1971        example = Orange.data.Instance(self.examples.domain, example) 
     1972        tempDist = Orange.statistics.Distribution(example.domain.class_var) 
    25421973        best_rules = [None] * len(example.domain.class_var.values) 
    25431974 
     
    25601991        else: 
    25611992            tempDist.normalize() # prior probability 
    2562             tmp_examples = Orange.core.ExampleTable(self.examples) 
     1993            tmp_examples = Orange.data.Table(self.examples) 
    25631994            for r in best_rules: 
    25641995                if r: 
    25651996                    tmp_examples = r.filter(tmp_examples) 
    2566             tmpDist = Orange.core.Distribution(tmp_examples.domain.class_var, tmp_examples, self.weight_id) 
     1997            tmpDist = Orange.statistics.Distribution(tmp_examples.domain.class_var, tmp_examples, self.weight_id) 
    25671998            tmpDist.normalize() 
    25681999            probs = [0.] * len(self.examples.domain.class_var.values) 
    25692000            for i in range(len(self.examples.domain.class_var.values)): 
    25702001                probs[i] = tmpDist[i] + tempDist[i] * 2 
    2571             final_dist = Orange.core.Distribution(self.examples.domain.class_var) 
     2002            final_dist = Orange.statistics.Distribution(self.examples.domain.class_var) 
    25722003            for cl_i, cl in enumerate(self.examples.domain.class_var): 
    25732004                final_dist[cl] = probs[cl_i] 
     
    25772008            if result_type == Orange.classification.Classifier.GetValue: 
    25782009              return (final_dist.modus(), best_rules) 
    2579             if result_type == Orange.core.GetProbabilities: 
     2010            if result_type == Orange.classification.Classifier.GetProbabilities: 
    25802011              return (final_dist, best_rules) 
    25812012            return (final_dist.modus(), final_dist, best_rules) 
    25822013        if result_type == Orange.classification.Classifier.GetValue: 
    25832014          return final_dist.modus() 
    2584         if result_type == Orange.core.GetProbabilities: 
     2015        if result_type == Orange.classification.Classifier.GetProbabilities: 
    25852016          return final_dist 
    25862017        return (final_dist.modus(), final_dist) 
  • docs/reference/rst/Orange.classification.rules.rst

    r9372 r10370  
    1 .. automodule:: Orange.classification.rules 
     1.. py:currentmodule:: Orange.classification.rules 
     2 
     3.. index:: rule induction 
     4 
     5.. index::  
     6   single: classification; rule induction 
     7 
     8************************** 
     9Rule induction (``rules``) 
     10************************** 
     11 
     12Module ``rules`` implements supervised rule induction algorithms and 
     13rule-based classification methods. Rule induction is based on a 
     14comprehensive framework of components that can be modified or 
     15replaced. For ease of use, the module already provides multiple 
     16variations of the `CN2 induction algorithm 
     17<http://www.springerlink.com/content/k6q2v76736w5039r/>`_. 
     18 
     19CN2 algorithm 
     20============= 
     21 
     22.. index::  
     23   single: classification; CN2 
     24 
     25The use of rule learning algorithms is consistent with typical 
     26learner usage in Orange: 
     27 
     28:download:`rules-cn2.py <code/rules-cn2.py>` 
     29 
     30.. literalinclude:: code/rules-cn2.py 
     31    :lines: 7- 
     32 
     33:: 
     34     
     35    IF sex=['female'] AND status=['first'] AND age=['child'] THEN survived=yes<0.000, 1.000> 
     36    IF sex=['female'] AND status=['second'] AND age=['child'] THEN survived=yes<0.000, 13.000> 
     37    IF sex=['male'] AND status=['second'] AND age=['child'] THEN survived=yes<0.000, 11.000> 
     38    IF sex=['female'] AND status=['first'] THEN survived=yes<4.000, 140.000> 
     39    IF status=['first'] AND age=['child'] THEN survived=yes<0.000, 5.000> 
     40    IF sex=['male'] AND status=['second'] THEN survived=no<154.000, 14.000> 
     41    IF status=['crew'] AND sex=['female'] THEN survived=yes<3.000, 20.000> 
     42    IF status=['second'] THEN survived=yes<13.000, 80.000> 
     43    IF status=['third'] AND sex=['male'] AND age=['adult'] THEN survived=no<387.000, 75.000> 
     44    IF status=['crew'] THEN survived=no<670.000, 192.000> 
     45    IF age=['child'] AND sex=['male'] THEN survived=no<35.000, 13.000> 
     46    IF sex=['male'] THEN survived=no<118.000, 57.000> 
     47    IF age=['child'] THEN survived=no<17.000, 14.000> 
     48    IF TRUE THEN survived=no<89.000, 76.000> 
     49     
     50.. autoclass:: Orange.classification.rules.CN2Learner(evaluator=Evaluator_Entropy, beam_width=5, alpha=1) 
     51   :members: 
     52   :show-inheritance: 
     53   :exclude-members: baseRules, beamWidth, coverAndRemove, dataStopping, 
     54      ruleFinder, ruleStopping, storeInstances, targetClass, weightID 
     55    
     56.. autoclass:: Orange.classification.rules.CN2Classifier 
     57   :members: 
     58   :show-inheritance: 
     59   :exclude-members: beamWidth, resultType 
     60    
     61.. index:: unordered CN2 
     62 
     63.. index::  
     64   single: classification; unordered CN2 
     65 
     66.. autoclass:: Orange.classification.rules.CN2UnorderedLearner(evaluator=Evaluator_Laplace(), beam_width=5, alpha=1.0) 
     67   :members: 
     68   :show-inheritance: 
     69   :exclude-members: baseRules, beamWidth, coverAndRemove, dataStopping, 
     70      ruleFinder, ruleStopping, storeInstances, targetClass, weightID 
     71    
     72.. autoclass:: Orange.classification.rules.CN2UnorderedClassifier 
     73   :members: 
     74   :show-inheritance: 
     75    
     76.. index:: CN2-SD 
     77.. index:: subgroup discovery 
     78 
     79.. index::  
     80   single: classification; CN2-SD 
     81    
     82.. autoclass:: Orange.classification.rules.CN2SDUnorderedLearner(evaluator=WRACCEvaluator(), beam_width=5, alpha=0.05, mult=0.7) 
     83   :members: 
     84   :show-inheritance: 
     85   :exclude-members: baseRules, beamWidth, coverAndRemove, dataStopping, 
     86      ruleFinder, ruleStopping, storeInstances, targetClass, weightID 
     87    
     88.. autoclass:: Orange.classification.rules.CN2EVCUnorderedLearner 
     89   :members: 
     90   :show-inheritance: 
     91    
     92 
     93.. 
     94    This part is commented out since 
     95    - there is no documentation on how to provide arguments 
     96    - the whole thing represents original research work particular to 
     97      a specific project and belongs to an 
     98      extension rather than to the main package 
     99 
     100    Argument based CN2 
     101    ================== 
     102 
     103    Orange also supports argument-based CN2 learning. 
     104 
     105    .. autoclass:: Orange.classification.rules.ABCN2 
     106       :members: 
     107       :show-inheritance: 
     108       :exclude-members: baseRules, beamWidth, coverAndRemove, dataStopping, 
     109      ruleFinder, ruleStopping, storeInstances, targetClass, weightID, 
     110      argument_id 
     111 
     112       This class has many more undocumented methods; see the source code for 
     113       reference. 
     114 
     115    .. autoclass:: Orange.classification.rules.ABCN2Ordered 
     116       :members: 
     117       :show-inheritance: 
     118 
     119    .. autoclass:: Orange.classification.rules.ABCN2M 
     120       :members: 
     121       :show-inheritance: 
     122       :exclude-members: baseRules, beamWidth, coverAndRemove, dataStopping, 
     123      ruleFinder, ruleStopping, storeInstances, targetClass, weightID 
     124 
     125    This module has many more undocumented argument-based learning 
     126    related classes; see the source code for reference. 
     127 
     128    References 
     129    ---------- 
     130 
     131    * Bratko, Mozina, Zabkar. `Argument-Based Machine Learning 
     132      <http://www.springerlink.com/content/f41g17t1259006k4/>`_. Lecture Notes in 
     133      Computer Science: vol. 4203/2006, 11-17, 2006. 
     134 
     135 
     136Rule induction framework 
     137======================== 
     138 
     139The classes described above are based on a more general framework that 
     140can be fine-tuned to specific needs by replacing individual components. 
     141Here is an example: 
     142 
     143part of :download:`rules-customized.py <code/rules-customized.py>` 
     144 
     145.. literalinclude:: code/rules-customized.py 
     146    :lines: 7-17 
     147 
     148:: 
     149 
     150    IF sex=['male'] AND status=['second'] AND age=['adult'] THEN survived=no<154.000, 14.000> 
     151    IF sex=['male'] AND status=['third'] AND age=['adult'] THEN survived=no<387.000, 75.000> 
     152    IF sex=['female'] AND status=['first'] THEN survived=yes<4.000, 141.000> 
     153    IF status=['crew'] AND sex=['male'] THEN survived=no<670.000, 192.000> 
     154    IF status=['second'] THEN survived=yes<13.000, 104.000> 
     155    IF status=['third'] AND sex=['male'] THEN survived=no<35.000, 13.000> 
     156    IF status=['first'] AND age=['adult'] THEN survived=no<118.000, 57.000> 
     157    IF status=['crew'] THEN survived=yes<3.000, 20.000> 
     158    IF sex=['female'] THEN survived=no<106.000, 90.000> 
     159    IF TRUE THEN survived=yes<0.000, 5.000> 
     160 
     161In the example, we wanted to use a rule evaluator based on the 
     162m-estimate and set ``m`` to 50. The evaluator is a subcomponent of the 
     163:obj:`rule_finder` component. Thus, to be able to set the evaluator, 
     164we first set the :obj:`rule_finder` component, then added the 
     165desired subcomponent and set its options. All other components, which 
     166are left unspecified, are provided by the learner at training time 
     167and removed afterwards. 
     168 
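For readers unfamiliar with the m-estimate, the sketch below shows the
standard formula, ``(p + m * prior) / (n + m)``. It is a standalone
illustration and does not use Orange's evaluator classes; the function name
``m_estimate`` and the prior of 0.3 are assumptions made for this example only.

```python
def m_estimate(positives, covered, prior, m):
    """m-estimate of the probability of the target class.

    positives -- number of covered instances of the target class
    covered   -- number of all instances covered by the rule
    prior     -- prior probability of the target class
    m         -- weight of the prior, expressed in "virtual" instances
    """
    if covered + m == 0:
        raise ValueError("undefined for a rule covering nothing with m = 0")
    return (positives + m * prior) / float(covered + m)


# A rule covering 141 instances of the target class and 4 others.
# The prior (0.3) is a made-up value, not taken from any data set.
relative_frequency = m_estimate(141, 145, prior=0.3, m=0)
smoothed = m_estimate(141, 145, prior=0.3, m=50)
print(relative_frequency, smoothed)
```

With ``m=0`` the estimate is the plain relative frequency; a larger ``m``
pulls the estimate towards the prior, which penalizes rules that cover only a
few instances.
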
     169Continuing with the example, assume that we wish to set a different 
     170validation function and a different beam width. 
     171 
     172part of :download:`rules-customized.py <code/rules-customized.py>` 
     173 
     174.. literalinclude:: code/rules-customized.py 
     175    :lines: 19-23 
     176 
     177.. py:class:: Orange.classification.rules.Rule(filter, classifier, lr, dist, ce, w = 0, qu = -1) 
     178    
     179   Represents a single rule. Constructor arguments correspond to the 
     180   first seven of the attributes (from :obj:`filter` to 
     181   :obj:`quality`) below. 
     182    
     183   .. attribute:: filter 
     184    
     185      Rule condition; an instance of 
     186      :class:`Orange.data.filter.Filter`, typically an instance of a 
     187      class derived from :class:`Orange.data.filter.Values` 
     188    
     189   .. attribute:: classifier 
     190       
     191      A rule predicts the class by calling an embedded classifier that 
     192      must be an instance of 
     193      :class:`~Orange.classification.Classifier`, typically 
     194      :class:`~Orange.classification.ConstantClassifier`. This 
     195      classifier is called by the rule classifier, such as 
     196      :obj:`RuleClassifier`. 
     197    
     198   .. attribute:: learner 
     199       
     200      Learner that is used for constructing a classifier. It must be 
     201      an instance of :class:`~Orange.classification.Learner`, 
     202      typically 
     203      :class:`~Orange.classification.majority.MajorityLearner`. 
     204    
     205   .. attribute:: class_distribution 
     206       
     207      Distribution of class in data instances covered by this rule 
     208      (:class:`~Orange.statistics.distribution.Distribution`). 
     209    
     210   .. attribute:: instances 
     211       
     212      Data instances covered by this rule (:class:`~Orange.data.Table`). 
     213    
     214   .. attribute:: weight_id 
     215    
     216      ID of the weight meta-attribute for the stored data instances 
     217      (``int``). 
     218    
     219   .. attribute:: quality 
     220       
     221      Quality of the rule. Rules with higher quality are better 
     222      (``float``). 
     223    
     224   .. attribute:: complexity 
     225    
     226      Complexity of the rule (``float``), typically the number of 
     227      selectors (conditions) in the rule. Complexity is used for 
     228      choosing between rules with the same quality; rules with lower 
     229      complexity are preferred. 
     230    
     231   .. method:: filter_and_store(instances, weight_id=0, target_class=-1) 
     232    
     233      Filter passed data instances and store them in :obj:`instances`. 
     234      Also, compute :obj:`class_distribution`, set the weight of 
     235      stored examples and create a new classifier using :obj:`learner`. 
     236       
     237      :param weight_id: ID of the weight meta-attribute. 
     238      :type weight_id: int 
     239      :param target_class: index of target class; -1 for all classes. 
     240      :type target_class: int 
     241    
     242   .. method:: __call__(instance) 
     243    
     244      Return ``True`` if the given instance matches the rule condition. 
     245       
     246      :param instance: data instance 
     247      :type instance: :class:`Orange.data.Instance` 
     248       
     249   .. method:: __call__(instances, ref=True, negate=False) 
     250 
     251      Return a table of instances that match the rule condition. 
     252       
     253      :param instances: a table of data instances 
     254      :type instances: :class:`Orange.data.Table` 
     255      :param ref: if ``True`` (default), the constructed table contains 
     256          references to the original data instances; if ``False``, the 
     257          data is copied 
     258      :type ref: bool 
     259      :param negate: inverts the selection 
     260      :type negate: bool 
     261 
     262 
     263 
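The roles of the attributes described above can be illustrated with a small
plain-Python model of a rule. This is a sketch under simplifying assumptions
and not the Orange class: ``SimpleRule``, ``covers`` and ``better`` are
hypothetical names, instances are plain dictionaries, and conditions are
limited to equality tests.

```python
from collections import Counter


class SimpleRule(object):
    """Toy stand-in for a rule: a conjunction of attribute == value tests."""

    def __init__(self, conditions, class_attr="survived"):
        self.conditions = dict(conditions)       # e.g. {"sex": "female"}
        self.class_attr = class_attr
        self.complexity = len(self.conditions)   # number of selectors
        self.instances = []
        self.class_distribution = Counter()
        self.quality = -1.0

    def covers(self, instance):
        """Return True if the instance satisfies every condition."""
        return all(instance.get(a) == v for a, v in self.conditions.items())

    def filter_and_store(self, instances):
        """Keep the covered instances and recompute the class distribution."""
        self.instances = [i for i in instances if self.covers(i)]
        self.class_distribution = Counter(
            i[self.class_attr] for i in self.instances)


def better(rule_a, rule_b):
    """Prefer higher quality; on ties, prefer lower complexity."""
    def key(rule):
        return (rule.quality, -rule.complexity)
    return rule_a if key(rule_a) >= key(rule_b) else rule_b


data = [
    {"sex": "female", "status": "first", "survived": "yes"},
    {"sex": "female", "status": "third", "survived": "no"},
    {"sex": "male", "status": "first", "survived": "no"},
]
rule = SimpleRule({"sex": "female", "status": "first"})
rule.filter_and_store(data)
print(rule.class_distribution)
```
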
     264.. py:class:: Orange.classification.rules.RuleLearner(store_instances=True, target_class=-1, base_rules=Orange.classification.rules.RuleList()) 
     265    
     266   Bases: :class:`Orange.classification.Learner` 
     267    
     268   A base rule induction learner. The algorithm follows the 
     269   separate-and-conquer strategy, which has its origins in the AQ 
     270   family of algorithms (Fuernkranz J.; Separate-and-Conquer Rule 
     271   Learning, Artificial Intelligence Review 13, 3-54, 1999). Such 
     272   algorithms search for the optimal rule for the current training 
     273   set, remove the covered training instances (`separate`) and repeat 
     274   the process (`conquer`) on the remaining data. 
     275    
     276   :param store_instances: if ``True`` (default), the induced rules 
     277       contain a table with references to the stored data instances. 
     278   :type store_instances: bool 
     279     
     280   :param target_class: index of a specific class to learn; if -1 
     281        there is no target class 
     282   :type target_class: int 
     283    
     284   :param base_rules: An optional list of initial rules for constraining the :obj:`rule_finder`. 
     285   :type base_rules: :class:`~Orange.classification.rules.RuleList` 
     286 
     287   The class' functionality is best explained by its ``__call__`` 
     288   function. 
     289    
     290   .. parsed-literal:: 
     291 
     292      def \_\_call\_\_(self, instances, weight_id=0): 
     293          rule_list = Orange.classification.rules.RuleList() 
     294          all_instances = Orange.data.Table(instances) 
     295          while not self.\ **data_stopping**\ (instances, weight_id, self.target_class): 
     296              new_rule = self.\ **rule_finder**\ (instances, weight_id, self.target_class, self.base_rules) 
     297              if self.\ **rule_stopping**\ (rule_list, new_rule, instances, weight_id): 
     298                  break 
     299              instances, weight_id = self.\ **cover_and_remove**\ (new_rule, instances, weight_id, self.target_class) 
     300              rule_list.append(new_rule) 
     301          return Orange.classification.rules.RuleClassifier_FirstRule( 
     302              rules=rule_list, instances=all_instances) 
     303        
     304   The customizable components are :obj:`data_stopping`, 
     305   :obj:`rule_finder`, :obj:`cover_and_remove` and :obj:`rule_stopping` 
     306   objects. 
     307    
     308   .. attribute:: data_stopping 
     309    
     310      An instance of 
     311      :class:`~Orange.classification.rules.RuleDataStoppingCriteria` 
     312      that determines whether to continue the induction. The default 
     313      component, 
     314      :class:`~Orange.classification.rules.RuleDataStoppingCriteria_NoPositives`, 
     315      returns ``True`` if there are no more instances of the target class.  
     316    
     317   .. attribute:: rule_finder 
     318       
     319      An instance of :class:`~Orange.classification.rules.RuleFinder` 
     320      that learns a single rule. Default is 
     321      :class:`~Orange.classification.rules.RuleBeamFinder`. 
     322 
     323   .. attribute:: rule_stopping 
     324       
     325      An instance of 
     326      :class:`~Orange.classification.rules.RuleStoppingCriteria` that 
     327      decides whether to use the induced rule or to discard it and stop 
     328      the induction. If ``None`` (default) all rules are accepted. 
     329        
     330   .. attribute:: cover_and_remove 
     331        
     332      An instance of :class:`RuleCovererAndRemover` that removes 
     333      instances covered by the rule and returns remaining 
     334      instances. The default implementation 
     335      (:class:`RuleCovererAndRemover_Default`) removes the instances 
     336      that belong to the given target class; if the target is not 
     337      specified (:obj:`target_class` == -1), it removes all covered 
     338      instances.     
     339 
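The separate-and-conquer loop described above can be tried out without Orange.
The sketch below is an assumption-laden miniature, not the library
implementation: instances are ``(attributes, class_label)`` pairs, a rule is a
dictionary of equality conditions, and ``best_single_condition`` is a
deliberately naive, hypothetical rule finder. As in the default coverer
described above, only covered instances of the target class are removed.

```python
def covers(conditions, attributes):
    """True if the attribute dictionary satisfies every condition."""
    return all(attributes.get(a) == v for a, v in conditions.items())


def best_single_condition(instances, target):
    """Toy rule finder: the one attribute=value test with the highest share
    of target-class instances (ties broken by coverage)."""
    best, best_key = {}, (-1.0, 0)
    candidates = set((a, v) for attrs, _ in instances for a, v in attrs.items())
    for a, v in sorted(candidates):
        covered = [c for attrs, c in instances if attrs.get(a) == v]
        hits = sum(1 for c in covered if c == target)
        key = (hits / float(len(covered)), len(covered))
        if key > best_key:
            best, best_key = {a: v}, key
    return best


def separate_and_conquer(instances, target, rule_finder, rule_stopping=None):
    """Learn a list of rules (condition dictionaries) for class `target`."""
    rules = []
    remaining = list(instances)
    # data stopping: continue while instances of the target class are left
    while any(c == target for _, c in remaining):
        rule = rule_finder(remaining, target)
        if rule_stopping is not None and rule_stopping(rules, rule, remaining):
            break
        covered_targets = [x for x in remaining
                           if x[1] == target and covers(rule, x[0])]
        if not covered_targets:
            break  # guard: the finder made no progress
        # "separate": remove the covered instances of the target class
        remaining = [x for x in remaining if x not in covered_targets]
        rules.append(rule)
    return rules


data = [
    ({"sex": "female", "status": "first"}, "yes"),
    ({"sex": "female", "status": "second"}, "yes"),
    ({"sex": "male", "status": "first"}, "no"),
    ({"sex": "male", "status": "second"}, "no"),
    ({"sex": "male", "status": "crew"}, "yes"),
]
rules = separate_and_conquer(data, "yes", best_single_condition)
print(rules)
```
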
     340 
     341Rule finders 
     342------------ 
     343 
     344.. class:: Orange.classification.rules.RuleFinder 
     345 
     346   Base class for rule finders, which learn a single rule from 
     347   instances. 
     348    
     349   .. method:: __call__(table, weight_id, target_class, base_rules) 
     350    
     351      Induce a new rule from the given data. 
     352       
     353      :param table: training data instances 
     354      :type table: :class:`Orange.data.Table` 
     355       
     356      :param weight_id: ID of the weight meta-attribute 
     357      :type weight_id: int 
     358       
     359      :param target_class: index of a specific class being learned; -1 for all. 
     360      :type target_class: int  
     361       
     362      :param base_rules: A list of initial rules for constraining the search space 
     363      :type base_rules: :class:`~Orange.classification.rules.RuleList` 
     364 
     365 
     366.. class:: Orange.classification.rules.RuleBeamFinder 
     367    
     368   Bases: :class:`~Orange.classification.rules.RuleFinder` 
     369    
     370   Beam search for the best rule. This is the default finder for 
     371   :obj:`RuleLearner`. The pseudocode of the algorithm is as follows.
     372 
     373   .. parsed-literal:: 
     374 
     375      def \_\_call\_\_(self, table, weight_id, target_class, base_rules): 
     376          prior = Orange.statistics.distribution.Distribution(table.domain.class_var, table, weight_id) 
     377          rules_star, best_rule = self.\ **initializer**\ (table, weight_id, target_class, base_rules, self.evaluator, prior) 
     378          \# compute quality of rules in rules_star and best_rule 
     379          ... 
     380          while len(rules_star) \> 0: 
     381              candidates, rules_star = self.\ **candidate_selector**\ (rules_star, table, weight_id) 
     382              for cand in candidates: 
     383                  new_rules = self.\ **refiner**\ (cand, table, weight_id, target_class) 
     384                  for new_rule in new_rules: 
     385                      if self.\ **rule_stopping_validator**\ (new_rule, table, weight_id, target_class, cand.class_distribution): 
     386                          new_rule.quality = self.\ **evaluator**\ (new_rule, table, weight_id, target_class, prior) 
     387                          rules_star.append(new_rule) 
     388                          if (self.\ **validator**\ (new_rule, table, weight_id, target_class, prior) and
     389                                  new_rule.quality \> best_rule.quality):
     390                              best_rule = new_rule 
     391              rules_star = self.\ **rule_filter**\ (rules_star, table, weight_id) 
     392          return best_rule 
     393           
     394   Modifiable components are shown in bold. These are: 
     395 
     396   .. attribute:: initializer 
     397    
     398      An instance of 
     399      :obj:`~Orange.classification.rules.RuleBeamInitializer` that 
     400      is used to construct the initial list of rules. The default, 
     401      :class:`~Orange.classification.rules.RuleBeamInitializer_Default`, 
     402      returns :obj:`base_rules`, or a rule with no conditions if 
     403      :obj:`base_rules` is not set. 
     404    
     405   .. attribute:: candidate_selector 
     406    
     407      An instance of 
     408      :class:`~Orange.classification.rules.RuleBeamCandidateSelector` 
     409      used to separate a subset of rules from the current 
     410      :obj:`rules_star` that will be further specialized.  The default 
     411      component, an instance of 
     412      :class:`~Orange.classification.rules.RuleBeamCandidateSelector_TakeAll`, 
     413      selects all rules. 
     414     
     415   .. attribute:: refiner 
     416    
     417      An instance of 
     418      :class:`~Orange.classification.rules.RuleBeamRefiner` that is 
     419      used for refining the rules. A refined rule should cover a strict
     420      subset of the instances covered by the given rule. The default component
     421      (:class:`~Orange.classification.rules.RuleBeamRefiner_Selector`) 
     422      adds a conjunctive selector to selectors present in the rule. 
     423     
     424   .. attribute:: rule_filter 
     425    
     426      An instance of 
     427      :class:`~Orange.classification.rules.RuleBeamFilter` that is 
     428      used for filtering rules to trim the search beam. The default 
     429      component, 
     430      :class:`~Orange.classification.rules.RuleBeamFilter_Width`\ 
     431      *(m=5)*\, keeps the five best rules. 
     432 
     433   .. method:: __call__(data, weight_id, target_class, base_rules) 
     434 
     435       Determines the optimal rule to cover the given data instances. 
     436 
     437       :param data: data instances. 
     438       :type data: :class:`Orange.data.Table` 
     439 
     440       :param weight_id: index of the weight meta-attribute. 
     441       :type weight_id: int 
     442 
     443       :param target_class: index of the target class. 
     444       :type target_class: int 
     445 
     446       :param base_rules: existing rules 
     447       :type base_rules: :class:`~Orange.classification.rules.RuleList` 
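
The beam search loop can be tried out without Orange using the following rough sketch. Everything in it is a simplified stand-in, not the library's implementation: rules are frozensets of ``(attribute, value)`` selectors, data is a list of ``(features, class_index)`` pairs, and the helper names ``covers``, ``laplace_quality``, ``refine`` and ``beam_find`` are hypothetical. The two validators from the pseudocode are collapsed into a single check that a rule is new and covers at least one instance.

```python
def covers(rule, features):
    """A rule is a frozenset of (attribute, value) selectors joined by AND."""
    return all(features.get(attr) == value for attr, value in rule)


def laplace_quality(rule, data, target_class, n_classes):
    """Stand-in evaluator: Laplace estimate of the target-class probability."""
    covered = [cls for features, cls in data if covers(rule, features)]
    hits = sum(1 for cls in covered if cls == target_class)
    return (hits + 1.0) / (len(covered) + n_classes)


def refine(rule, data):
    """Stand-in refiner: add one selector on an attribute the rule does not use."""
    used = set(attr for attr, _ in rule)
    selectors = set(
        (attr, value)
        for features, _ in data
        for attr, value in features.items()
        if attr not in used
    )
    return [rule | frozenset([s]) for s in sorted(selectors)]


def beam_find(data, target_class, n_classes, width=5):
    best_rule = frozenset()  # initializer: start from a rule with no conditions
    best_quality = laplace_quality(best_rule, data, target_class, n_classes)
    rules_star = [best_rule]
    while rules_star:
        candidates, rules_star = rules_star, []  # candidate selector: take all
        scored = {}
        for cand in candidates:
            for new_rule in refine(cand, data):
                # Stand-in validator: skip duplicates and rules covering nothing.
                if new_rule in scored or not any(covers(new_rule, f) for f, _ in data):
                    continue
                quality = laplace_quality(new_rule, data, target_class, n_classes)
                scored[new_rule] = quality
                if quality > best_quality:
                    best_rule, best_quality = new_rule, quality
        # Filter: keep only the `width` best rules for the next iteration.
        ranked = sorted(scored, key=lambda r: (-scored[r], sorted(r)))
        rules_star = ranked[:width]
    return best_rule, best_quality


data = [
    ({"sex": "female", "status": "first"}, 1),
    ({"sex": "female", "status": "first"}, 1),
    ({"sex": "female", "status": "third"}, 1),
    ({"sex": "female", "status": "third"}, 0),
    ({"sex": "male", "status": "first"}, 0),
    ({"sex": "male", "status": "third"}, 0),
    ({"sex": "male", "status": "third"}, 0),
]
rule, quality = beam_find(data, target_class=1, n_classes=2)
```

On this toy data the search ends when no unused attribute is left to refine on, and it returns the rule ``sex=female AND status=first`` with quality 0.75.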
     448 
     449Rule evaluators 
     450--------------- 
     451 
     452.. class:: Orange.classification.rules.RuleEvaluator 
     453 
     454   Base class for rule evaluators, which evaluate the quality of a
     455   rule based on the data instances it covers.
     456    
     457   .. method:: __call__(rule, instances, weight_id, target_class, prior) 
     458    
     459      Calculate a (non-negative) rule quality. 
     460       
     461      :param rule: rule to evaluate 
     462      :type rule: :class:`~Orange.classification.rules.Rule` 
     463       
     464      :param instances: data instances covered by the rule 
     465      :type instances: :class:`Orange.data.Table` 
     466       
     467      :param weight_id: index of the weight meta-attribute 
     468      :type weight_id: int 
     469       
     470      :param target_class: index of target class of this rule 
     471      :type target_class: int 
     472       
     473      :param prior: prior class distribution 
     474      :type prior: :class:`Orange.statistics.distribution.Distribution` 
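
To show how the ``prior`` argument can enter a quality measure, here is a minimal sketch of an evaluator-like callable that computes a generic m-estimate. It is not one of Orange's evaluators and does not subclass :class:`~Orange.classification.rules.RuleEvaluator`. The class name ``MEstimateEvaluator``, the parameter ``m``, and the plain-list representations of the covered instances and of the prior are assumptions made for illustration.

```python
class MEstimateEvaluator(object):
    """Hypothetical evaluator, not part of Orange: m-estimate of the
    probability of the target class among the covered instances."""

    def __init__(self, m=2.0):
        self.m = m

    def __call__(self, rule, instances, weight_id, target_class, prior):
        # `instances` is a list of (class_index, weight) pairs already covered
        # by `rule`; `prior` is a list of prior class probabilities.
        # `rule` and `weight_id` are accepted only to mirror the signature.
        total = sum(weight for _, weight in instances)
        hits = sum(weight for cls, weight in instances if cls == target_class)
        return (hits + self.m * prior[target_class]) / (total + self.m)


evaluator = MEstimateEvaluator(m=2.0)
covered = [(1, 1.0), (1, 1.0), (0, 1.0)]
quality = evaluator(None, covered, 0, 1, [0.6, 0.4])
```

The result is non-negative by construction, and a rule that covers nothing falls back to the prior probability of the target class.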
     475 
     476.. autoclass:: Orange.classification.rules.LaplaceEvaluator 
     477   :members: 
     478   :show-inheritance: 
     479   :exclude-members: targetClass, weightID 
     480 
     481.. autoclass:: Orange.classification.rules.WRACCEvaluator 
     482   :members: 
     483   :show-inheritance: 
     484   :exclude-members: targetClass, weightID 
     485    
     486.. class:: Orange.classification.rules.RuleEvaluator_Entropy 
     487 
     488   Bases: :class:`~Orange.classification.rules.RuleEvaluator` 
     489     
     490.. class:: Orange.classification.rules.RuleEvaluator_LRS 
     491 
     492   Bases: :class:`~Orange.classification.rules.RuleEvaluator` 
     493 
     494.. class:: Orange.classification.rules.RuleEvaluator_Laplace 
     495 
     496   Bases: :class:`~Orange.classification.rules.RuleEvaluator` 
     497 
     498.. class:: Orange.classification.rules.RuleEvaluator_mEVC 
     499 
     500   Bases: :class:`~Orange.classification.rules.RuleEvaluator` 
     501    
     502Instance covering and removal 
     503----------------------------- 
     504 
     505.. class:: RuleCovererAndRemover 
     506 
     507   Base class for rule coverers and removers that, when invoked, remove 
     508   instances covered by the rule and return remaining instances. 
     509 
     510.. autoclass:: CovererAndRemover_MultWeights 
     511 
     512.. autoclass:: CovererAndRemover_AddWeights 
     513    
     514Miscellaneous functions 
     515----------------------- 
     516 
     517.. automethod:: Orange.classification.rules.rule_to_string 
     518 
     519.. 
     520    Undocumented are: 
     521    Data-based Stopping Criteria 
     522    ---------------------------- 
     523    Rule-based Stopping Criteria 
     524    ---------------------------- 
     527 
     528References 
     529---------- 
     530 
     531* Clark, Niblett. `The CN2 Induction Algorithm 
     532  <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.9180>`_. Machine 
     533  Learning 3(4):261--284, 1989. 
     534* Clark, Boswell. `Rule Induction with CN2: Some Recent Improvements 
     535  <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.1700>`_. In 
     536  Machine Learning - EWSL-91. Proceedings of the European Working Session on 
     537  Learning, pp 151--163, Porto, Portugal, March 1991. 
     538* Lavrac, Kavsek, Flach, Todorovski. `Subgroup Discovery with CN2-SD
     539  <http://jmlr.csail.mit.edu/papers/volume5/lavrac04a/lavrac04a.pdf>`_. Journal
     540  of Machine Learning Research 5:153--188, 2004.
     541 