Changeset 5045:38095945049e in orange


Timestamp:
08/01/08 00:42:15 (6 years ago)
Author:
janezd <janez.demsar@…>
Branch:
default
Convert:
46a2e710114ae7842b90c62f622fdb3ef0e33d96
Message:
 
Location:
orange/doc/reference
Files:
3 edited

  • orange/doc/reference/assoc-agrawal.py

    r526 r5045  
    33data = orange.ExampleTable("inquisition") 
    44 
    5 rules = orange.AssociationRulesSparseInducer(data, support = 0.5) 
     5rules = orange.AssociationRulesSparseInducer(data, support = 0.5, storeExamples = True) 
    66print "%5s   %5s" % ("supp", "conf") 
    77for r in rules: 
    88    print "%5.3f   %5.3f   %s" % (r.support, r.confidence, r) 
     9 
     10rule0 = rules[10] 
     11print rule0 
     12print "Match left: " 
     13print [rule0.examples[i] for i in rule0.matchLeft] 
     14print "\nMatch both: " 
     15print [rule0.examples[i] for i in rule0.matchBoth] 
     16 
     17inducer = orange.AssociationRulesSparseInducer(support = 0.5) 
     18itemsets = inducer.getItemsets(data) 
     19print itemsets[5] 
  • orange/doc/reference/assoc-rule.py

    r526 r5045  
    33data = orange.ExampleTable("lenses") 
    44 
    5 rules = orange.AssociationRulesInducer(data, support = 0.3) 
     5rules = orange.AssociationRulesInducer(data, support = 0.3, storeExamples = True) 
    66rule = rules[0] 
    77 
     
    2121        print example 
    2222print 
     23 
     24print rule 
     25print "Match left: " 
     26print "\n".join(str(rule.examples[i]) for i in rule.matchLeft) 
     27print "\nMatch both: " 
     28print "\n".join(str(rule.examples[i]) for i in rule.matchBoth) 
     29 
     30inducer = orange.AssociationRulesInducer(support = 0.3, storeExamples = True) 
     31itemsets = inducer.getItemsets(data) 
     32print itemsets[8] 
  • orange/doc/reference/associationRules.htm

    r2653 r5045  
    2424right side of the rule is the class attribute.</P> 
    2525 
     26<p>It is also possible to extract item sets instead of association rules. These are often more interesting than the rules themselves.</p> 
     27 
    2628<P>Besides association rule inducer, Orange also provides a rather 
    2729simplified method for classification by association rules.</P> 
     
    4749<DT>confidence</DT> 
    4850<DD>Minimal confidence for the rule.</DD> 
     51 
     52<DT>storeExamples</DT> 
     53<DD>Tells the inducer to store the examples covered by each rule and those confirming it.</DD> 
    4954 
    5055<DT>maxItemSets</DT> 
     
    102107<P>If examples are weighted, weight can be passed as an additional argument to call operator.</P> 
    103108 
     109<p>To get only a list of supported item sets, one should call the method <code>getItemsets</code>. The result 
     110is a list whose elements are tuples with two elements. The first is a tuple with indices of attributes in the item set. Sparse examples are usually represented with meta attributes, so these indices will be negative. The second element is a list of indices of the examples supporting the item set, that is, containing all the items in the set. If <code>storeExamples</code> is <code>False</code>, the second element is <code>None</code>.</p> 
     111 
     112<p class="header"><a href="assoc-agrawal.py">assoc-agrawal.py</a> 
     113(uses <a href="inquisition.basket">inquisition.basket</a>)</p> 
     114<XMP class="code">inducer = orange.AssociationRulesSparseInducer(support = 0.5, storeExamples = True) 
     115itemsets = inducer.getItemsets(data) 
     116</XMP> 
     117 
     118<p>Now <code>itemsets</code> is a list of itemsets along with the examples supporting them since we set <code>storeExamples</code> to <code>True</code>.</p> 
     119 
     120<xmp class="code">>>> itemsets[5] 
     121((-11, -7), [1, 2, 3, 6, 9]) 
     122>>> [data.domain[i].name for i in itemsets[5][0]] 
     123['surprise', 'our'] 
     124</xmp> 
     125 
     126<p>The sixth itemset contains attributes with indices -11 and -7, that is, the words "surprise" and "our". The examples supporting it are those with indices 1, 2, 3, 6 and 9.</p> 
     127 
     128<p>This way of representing the itemsets is not very programmer-friendly, but it is much more memory-efficient and faster to work with than objects like Variable and Example.</p> 
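
The tuple format described above can be unpacked with a few lines of plain Python. The sketch below does not import Orange: `names` is a hypothetical stand-in for the `data.domain[i].name` lookup, and `describe` is an illustrative helper, not part of the library. The itemset and names are the ones shown in the `itemsets[5]` printout.

```python
# Hypothetical stand-in for data.domain[i].name (not an Orange object).
names = {-11: "surprise", -7: "our"}

# One itemset in the documented format: (attribute indices, example indices).
itemset = ((-11, -7), [1, 2, 3, 6, 9])

def describe(itemset, names):
    """Return (item names, number of supporting examples or None)."""
    items, supporting = itemset
    labels = [names[i] for i in items]
    # The second element is None when storeExamples was False.
    count = None if supporting is None else len(supporting)
    return labels, count

labels, count = describe(itemset, names)
print(labels, count)
```

Handling the `None` case explicitly keeps the helper usable whether or not the inducer stored the examples.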
     129 
     130 
     131 
    104132<H2>Association Rules for Non-sparse Examples</H2> 
    105133 
     
    124152<DD>If 1 (default is 0), the algorithm constructs classification rules instead of general association rules.</DD> 
    125153 
     154<DT>storeExamples</DT> 
     155<DD>Tells the inducer to store the examples covered by each rule and those confirming it.</DD> 
     156 
    126157<DT>maxItemSets</DT> 
    127158<DD>The maximal number of itemsets.</DD> 
     
    175206<P><CODE>AssociationRulesInducer</CODE> can also work with weighted examples; the ID of weight attribute should be passed as an additional argument in a call.</P> 
    176207 
     208<p>Itemsets are induced in a similar fashion as for sparse data, except that the first element of the tuple, the item set, is represented not by indices of attributes, as before, but with tuples (attribute-index, value-index).</p> 
     209 
     210<p class="header"><a href="assoc-rule.py">part of assoc-rule.py</a> 
     211(uses <a href="lenses.tab">lenses.tab</a>)</p> 
     212<xmp class="code">inducer = orange.AssociationRulesInducer(support = 0.3, storeExamples = True) 
     213itemsets = inducer.getItemsets(data) 
     214print itemsets[8]</xmp> 
     215 
     216<p>This prints out 
     217<xmp class="code">(((2, 1), (4, 0)), [2, 6, 10, 14, 15, 18, 22, 23])</xmp> 
     218meaning that the ninth itemset contains the second value of the third attribute, (2, 1), and the first value of the fifth, (4, 0).</p> 
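
As a rough illustration of reading the (attribute-index, value-index) pairs, the sketch below decodes the printed itemset against a placeholder domain. The `domain` list and the `decode` helper are assumptions made for this example; the attribute and value names are placeholders, not the real contents of lenses.tab.

```python
# Hypothetical domain: (attribute name, list of value names) per attribute.
# Names are placeholders, not the actual lenses.tab attributes.
domain = [("attr%d" % a, ["v0", "v1", "v2"]) for a in range(5)]

# The itemset printed above.
itemset = (((2, 1), (4, 0)), [2, 6, 10, 14, 15, 18, 22, 23])

def decode(itemset, domain):
    """Turn (attribute-index, value-index) pairs into 'name=value' strings."""
    items, supporting = itemset
    readable = ["%s=%s" % (domain[a][0], domain[a][1][v]) for a, v in items]
    return readable, supporting

readable, supporting = decode(itemset, domain)
print(readable)
```

With a real domain, the same two lookups (attribute by index, then value by index) would presumably be done through the domain object rather than a plain list.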
    177219 
    178220<H2>Association Rule</H2> 
     
    202244<DT>lift</DT><DD><CODE>nExamples * nAppliesBoth / (nAppliesLeft * nAppliesRight)</CODE></DD> 
    203245<DT>leverage</DT><DD><CODE>(nAppliesBoth * nExamples - nAppliesLeft * nAppliesRight)</CODE></DD> 
     246 
     247<dt>examples, matchLeft, matchBoth</dt> 
     248<dd>If <code>storeExamples</code> was <code>True</code> during induction, <code>examples</code> contains a copy of the example table used to induce the rules. Attributes <code>matchLeft</code> and <code>matchBoth</code> are lists of integers, representing the indices of examples which match the left-hand side of the rule and both sides, respectively.</dd> 
    204249</DL> 
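
The index lists can be turned into counts with `len`, which is enough to recompute some of the measures by hand. The sketch below is plain Python with made-up numbers; `rule_measures` is a hypothetical helper. Lift and leverage are transcribed as written in the entries above, while support and confidence assume the usual definitions. The right-hand-side count cannot be derived from `matchLeft` and `matchBoth`, so it is passed in separately.

```python
def rule_measures(n_examples, match_left, match_both, n_applies_right):
    """Recompute rule measures from index lists (illustrative only)."""
    n_left = len(match_left)    # examples matching the left-hand side
    n_both = len(match_both)    # examples matching both sides
    return {
        "support": n_both / n_examples,        # usual definition (assumed)
        "confidence": n_both / n_left,         # usual definition (assumed)
        "lift": n_examples * n_both / (n_left * n_applies_right),
        "leverage": n_both * n_examples - n_left * n_applies_right,
    }

# Made-up numbers, not taken from any real data set.
m = rule_measures(24, list(range(12)), list(range(6)), 8)
print(m)
```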
    205250 
     
    248293</XMP> 
    249294 
     295<p>The latter printouts get simpler and (way!) faster if we instruct the inducer to store the examples. We can then do, for instance, this.</p> 
     296 
     297 
     298(uses <a href="lenses.tab">lenses.tab</a>)</p> 
     299<XMP class=code>print "Match left: " 
     300print "\n".join(str(rule.examples[i]) for i in rule.matchLeft) 
     301print "\nMatch both: " 
     302print "\n".join(str(rule.examples[i]) for i in rule.matchBoth) 
     303</XMP> 
     304 
     305<p>The "contradicting" examples are then those whose indices are found in <code>matchLeft</code> but not in <code>matchBoth</code>. Memory-friendlier and faster ways to compute this are as follows.</p> 
     306 
     307<xmp class="code">>>> [x for x in rule.matchLeft if not x in rule.matchBoth] 
     308[0, 2, 8, 10, 16, 17, 18] 
     309>>> set(rule.matchLeft) - set(rule.matchBoth) 
     310set([0, 2, 8, 10, 16, 17, 18])</xmp> 
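
A variant that combines the two approaches is sketched below: it converts `matchBoth` to a set once, so each membership test is constant-time, while keeping the result as an ordered list. This is plain Python on made-up index lists; `contradicting` is a hypothetical helper, not an Orange function.

```python
def contradicting(match_left, match_both):
    """Indices in match_left that are absent from match_both, in order."""
    both = set(match_both)  # build once for fast membership tests
    return [i for i in match_left if i not in both]

# Made-up indices for illustration.
result = contradicting([0, 1, 2, 5, 8], [1, 5])
print(result)  # [0, 2, 8]
```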
     311 
    250312</BODY>  