Changeset 7242:9ba3fbcc592c in orange


Timestamp:
02/02/11 21:12:38 (3 years ago)
Author:
Gregor Rot <gregor.rot@…>
Branch:
default
Convert:
77a9b2c57a445f7bb2534cc4c5926b81a02c39b5
Message:
 
File:
1 edited

  • orange/Orange/associate/__init__.py

    r7237 r7242  
    1111 
    1212================= 
    13 Association rules inducer with Agrawal's algorithm 
     13Agrawal's algorithm 
    1414================= 
    1515 
     
    3838The last attribute deserves some explanation. The algorithm's running time (and its memory consumption) depends on the minimal support; the lower the requested support, the more eligible itemsets will be found. There is no general rule for knowing the right support in advance (generally, the value should be around 0.3, but this depends upon the number of different items, the diversity of examples...), so it's very easy to set the limit too low. In this case, the algorithm can induce hundreds of thousands of itemsets until it runs out of memory. To prevent this, it will stop inducing itemsets and report an error when the prescribed maximum maxItemSets is exceeded. In this case, you should increase the required support. On the other hand, you can (reasonably) increase maxItemSets to as high as your computer is able to handle. 
    3939 
    40          
    41 Examples for AssociationRulesSparseInducer 
    42 ======== 
    43  
    4440We shall test the rule inducer on a dataset consisting of a brief description of the Spanish Inquisition, given by Palin et al: 
    4541 
     
    5046The text needs to be cleaned of punctuation marks and of capital letters at the beginnings of sentences; each sentence needs to be put on a new line, and commas need to be inserted between the words. 
    5147 
    52 inquisition.basket :: 
     48**inquisition.basket** :: 
    5349 
    5450    nobody, expects, the, Spanish, Inquisition 
     
    6561Inducing the rules is trivial. 
    6662 
    67 assoc-agrawal.py (uses inquisition.basket) :: 
     63**assoc-agrawal.py** (uses inquisition.basket) :: 
    6864 
    6965    import Orange 
     
    9389If examples are weighted, the weight can be passed as an additional argument to the call operator. 
    9490 
    95 To get only a list of supported item sets, one should call the method getItemsets. The result is a list whose elements are tuples with two elements. The first is a tuple with indices of attributes in the item set. Sparse examples are usually represented with meta attributes, so these indices will be negative. The second element is a list of indices of the examples supporting the item set, that is, containing all the items in the set. If storeExamples is False, the second element is None. 
    96  
    97 assoc-agrawal.py (uses inquisition.basket) :: 
     91To get only a list of supported item sets, one should call the method getItemsets. The result is a list whose elements are tuples with two elements. The first is a tuple with indices of attributes in the item set. Sparse examples are usually represented with meta attributes, so these indices will be negative. The second element is a list of indices of the examples supporting the item set, that is, containing all the items in the set. If storeExamples is False, the second element is None. :: 
    9892 
    9993    inducer = Orange.associate.AssociationRulesSparseInducer(support = 0.5, storeExamples = True) 
     
    112106 
    113107================= 
    114 Association rules for non-sparse examples 
     108Non-sparse examples 
    115109================= 
    116110 
     
    134128    The maximal number of itemsets. 
    135129 
    136 Meaning of all attributes (except the new one, classificationRules) is the same as for AssociationRulesSparseInducer. See the description of maxItemSet there. 
    137  
    138 assoc.py (uses lenses.tab) :: 
    139  
    140     import orange 
    141  
    142     data = orange.ExampleTable("lenses") 
     130Meaning of all attributes (except the new one, classificationRules) is the same as for AssociationRulesSparseInducer. See the description of maxItemSet there. :: 
     131 
     132    import Orange 
     133 
     134    data = Orange.data.Table("lenses") 
    143135 
    144136    print "Association rules" 
    145     rules = orange.AssociationRulesInducer(data, supp = 0.5) 
     137    rules = Orange.associate.AssociationRulesInducer(data, supp = 0.5) 
    146138    for r in rules: 
    147139        print "%5.3f  %5.3f  %s" % (r.support, r.confidence, r) 
     
    158150To limit the algorithm to classification rules, set classificationRules to 1. :: 
    159151 
    160     import orange 
    161  
    162     data = orange.ExampleTable("inquisition") 
    163     rules = orange.AssociationRulesSparseInducer(data, 
     152    import Orange 
     153 
     154    data = Orange.data.Table("inquisition") 
     155    rules = Orange.associate.AssociationRulesSparseInducer(data, 
    164156                support = 0.5, storeExamples = True) 
    165157 
     
    178170Itemsets are induced in a similar fashion as for sparse data, except that the first element of the tuple, the item set, is represented not by indices of attributes, as before, but with tuples (attribute-index, value-index). :: 
    179171 
    180     inducer = orange.AssociationRulesInducer(support = 0.3, storeExamples = True) 
     172    inducer = Orange.associate.AssociationRulesInducer(support = 0.3, storeExamples = True) 
    181173    itemsets = inducer.getItemsets(data) 
    182174    print itemsets[8] 
     
    232224    Constructs an association rule and computes all measures listed above. 
    233225     
    234     .. method:: AssociationRule(left, right, support, confidence]]) 
     226    .. method:: AssociationRule(left, right, support, confidence) 
    235227    Constructs an association rule and sets its support and confidence. If you intend to pass such a rule to someone that expects more things to be set, you should set them manually; AssociationRule's constructor cannot compute anything from these two arguments. 
    236228     
     
    243235Association rule inducers do not store evidence about which example supports which rule (although this is available during induction, the information is discarded afterwards). Let us write a function that finds the examples that confirm the rule (i.e. fit both sides of it) and those that contradict it (fit the left-hand side but not the right). :: 
    244236 
    245     import orange 
    246  
    247     data = orange.ExampleTable("lenses") 
    248  
    249     rules = orange.AssociationRulesInducer(data, supp = 0.3) 
     237    import Orange 
     238 
     239    data = Orange.data.Table("lenses") 
     240 
     241    rules = Orange.associate.AssociationRulesInducer(data, supp = 0.3) 
    250242    rule = rules[0] 
    251243 
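The last hunk stops before the function that the documentation promises. As a rough illustration of the idea only, the sketch below splits examples into those that confirm a rule (fit both sides) and those that contradict it (fit the left-hand side but not the right). It deliberately does not use Orange: examples and rule sides are plain dicts, the helper names `fits` and `split_by_rule` are hypothetical, and the four toy examples are invented for illustration rather than taken from lenses.tab.

```python
# Hypothetical stand-ins, not Orange's classes: an example is a dict mapping
# attribute name -> value, and each side of a rule is a dict of required values.

def fits(example, side):
    """Return True if the example has every attribute value required by `side`."""
    return all(example.get(attr) == value for attr, value in side.items())


def split_by_rule(examples, left, right):
    """Split examples into those confirming and those contradicting left -> right."""
    confirming, contradicting = [], []
    for example in examples:
        if not fits(example, left):
            continue  # the rule says nothing about this example
        if fits(example, right):
            confirming.append(example)
        else:
            contradicting.append(example)
    return confirming, contradicting


# Toy data invented for this sketch (attribute names are assumptions).
examples = [
    {"tear_rate": "reduced", "lenses": "none"},
    {"tear_rate": "reduced", "lenses": "none"},
    {"tear_rate": "normal", "lenses": "soft"},
    {"tear_rate": "reduced", "lenses": "soft"},
]

confirming, contradicting = split_by_rule(
    examples, left={"tear_rate": "reduced"}, right={"lenses": "none"}
)

# Support: share of all examples that fit both sides of the rule.
support = len(confirming) / float(len(examples))
# Confidence: share of the examples fitting the left side that also fit the right.
confidence = len(confirming) / float(len(confirming) + len(contradicting))
print("%5.3f  %5.3f" % (support, confidence))
```

With the toy data, two examples confirm the rule and one contradicts it, so the sketch prints a support of 0.500 and a confidence of 0.667. With real Orange rules, the same split would be driven by the rule's own left and right sides instead of hand-written dicts.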