Orange Forum • View topic - Instantiate AssociationRule from get_itemset() result?


Postby rickyegeland » Sun Nov 18, 2012 20:29

Is it possible to create a new AssociationRule from the output of AssociationRulesInducer.get_itemsets() ?

I'm working on implementing the interval-merging behavior outlined in Srikant & Agrawal 1996, "Mining Quantitative Association Rules...". I see I can get frequent itemsets from Orange using get_itemsets(), which will let me make the re-binning decisions. Can I then build AssociationRules directly from the itemsets and the data, without re-running AssociationRulesInducer() on the entire dataset?

Re: Instantiate AssociationRule from get_itemset() result?

Postby Ales » Mon Nov 19, 2012 14:35

rickyegeland wrote: Is it possible to create a new AssociationRule from the output of AssociationRulesInducer.get_itemsets()?

Not directly from the output of get_itemsets, but new 'AssociationRule' instances can be constructed from Python code. See Orange.associate.AssociationRule for the documentation.

An important thing to note is how the rules are encoded, i.e. the 'left' and 'right' members of the 'AssociationRule'. Both are 'Orange.data.Instance' objects: every feature that is not part of the rule has an unknown value, and the values that are present represent the rule's (feature = value) pairs.
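To make that encoding concrete, here is a minimal plain-Python sketch that does not use Orange at all. The names `UNKNOWN`, `encode_side` and `decode_side` are hypothetical helpers invented for illustration, and `None` merely stands in for Orange's unknown value ("?"). It is written for Python 3, unlike the Orange 2.x code in this thread.

```python
# Hypothetical stand-in for Orange's unknown-value marker ("?").
UNKNOWN = None


def encode_side(features, conditions):
    """Return one value per feature; features outside the rule stay unknown."""
    unexpected = set(conditions) - set(features)
    if unexpected:
        raise KeyError("not in domain: %s" % sorted(unexpected))
    return [conditions.get(name, UNKNOWN) for name in features]


def decode_side(features, values):
    """Recover the (feature, value) pairs that make up one side of a rule."""
    return [(name, value) for name, value in zip(features, values)
            if value is not UNKNOWN]


# Assumed feature order, for illustration only.
features = ["sepal length", "sepal width", "petal length", "petal width", "iris"]

left = encode_side(features, {"petal length": "<=2.20"})
right = encode_side(features, {"iris": "Iris-setosa"})

print(decode_side(features, left), "->", decode_side(features, right))
```

Each side of the rule has the same length as the domain, so the positions of the known values identify which features take part in the rule.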

Now the fun part of the whole thing is that the features in the rule can also be new transformations of the original features (Orange.feature.Descriptor.get_value_from). For instance you can use IntervalDiscretizer to construct the desired intervals.
A contrived example:
import Orange
iris = Orange.data.Table("iris")
attr = iris.domain['petal length']

# Create intervals for the 'petal length' (will actually be a new feature)
disc = Orange.feature.discretization.IntervalDiscretizer(points=[2.2, 5])
new_attr = disc.construct_variable(attr)

# A new domain for the rule (with the discretized 'petal length')
new_domain = Orange.data.Domain(iris.domain[:2] + [new_attr] + iris.domain[3:4], iris.domain.class_var)
# Left part of the rule (i.e. petal length <= 2.20)
left = Orange.data.Instance(new_domain, ["?", "?", 0, "?", "?"])
right = Orange.data.Instance(new_domain, ["?", "?", "?", "?", "Iris-setosa"])
rule1 = Orange.associate.AssociationRule(left, right, 50, 50, 50, 150)

left = Orange.data.Instance(new_domain, ["?", "?", 1, "?", "?"])
right = Orange.data.Instance(new_domain, ["?", "?", "?", "?", "Iris-versicolor"])
rule2 = Orange.associate.AssociationRule(left, right, 50, 50, 50, 150)

# Finally wrap the rules in a 'AssociationRules' list.
rules = Orange.associate.AssociationRules([rule1, rule2])

print rules
# <D_petal length=<=2.20 -> iris=Iris-setosa, D_petal length=(2.20, 5.00] -> iris=Iris-versicolor>
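The counts passed to the constructor above are contrived. As a rough sketch of how real counts could be derived from a frequent itemset and the data without re-running the inducer, the plain-Python code below counts how many transactions match each side of every left/right split of an itemset. It does not use Orange; `rule_counts` and `rules_from_itemset` are hypothetical helpers, the toy transactions are invented, and the assumption that the four counts correspond to the four numeric constructor arguments (applies-left, applies-right, applies-both, total examples) should be checked against the Orange.associate.AssociationRule documentation. Written for Python 3.

```python
from itertools import combinations


def rule_counts(transactions, left, right):
    """Count transactions matching the left side, the right side, and both."""
    left, right = frozenset(left), frozenset(right)
    both = left | right
    n_left = sum(1 for t in transactions if left <= t)
    n_right = sum(1 for t in transactions if right <= t)
    n_both = sum(1 for t in transactions if both <= t)
    return n_left, n_right, n_both, len(transactions)


def rules_from_itemset(transactions, itemset, min_confidence=0.0):
    """Split one itemset into every left -> right rule meeting min_confidence."""
    items = sorted(itemset)
    rules = []
    for size in range(1, len(items)):
        for left in combinations(items, size):
            right = tuple(i for i in items if i not in left)
            n_left, n_right, n_both, n = rule_counts(transactions, left, right)
            if n_left == 0:
                continue  # the left side never applies; confidence is undefined
            support = n_both / float(n)
            confidence = n_both / float(n_left)
            if confidence >= min_confidence:
                rules.append((left, right, support, confidence))
    return rules


# Invented toy data: each transaction is a set of (feature, value) items.
transactions = [
    {("petal", "short"), ("iris", "setosa")},
    {("petal", "short"), ("iris", "setosa")},
    {("petal", "medium"), ("iris", "versicolor")},
    {("petal", "short"), ("iris", "versicolor")},
]
itemset = {("petal", "short"), ("iris", "setosa")}

for left, right, support, confidence in rules_from_itemset(
        transactions, itemset, min_confidence=0.9):
    print(left, "->", right, "support=%.2f confidence=%.2f" % (support, confidence))
```

Here support is the fraction of transactions containing both sides and confidence is that count divided by the number of transactions containing the left side, which are the usual definitions.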


Re: Instantiate AssociationRule from get_itemset() result?

Postby rickyegeland » Tue Nov 20, 2012 17:10

Great, thanks for the info! I also see that the online documentation is much more detailed than the pydoc, so I will look there from now on :-)

