Orange Forum • View topic - Association rules: sparse data and classes?

Association rules: sparse data and classes?

Postby mjjohnson » Wed Nov 13, 2013 23:46

I'm trying to do association rule mining where I have sparse data (words in documents) as well as class labels (topics assigned to documents). I'm new to using Orange, and after looking through the documentation here and working through some examples, I tried importing a very small example set of my own.

Here is a very small example set (not real data, but same basic form as I'm working with):

Code:
topic   words
d   basket
class   
tame wild   cat dog fox   
tame   cat
tame   dog cat
tame wild   wolf cat
wild   wolf
wild   fox


When I save this as 'test.tab' and then run the following from a Python interpreter:

Code:
>>> import Orange; data = Orange.data.Table('test.tab'); Orange.associate.AssociationRulesInducer(data, support=0.1)

...I get no rules. If I ignore the multiclass attribute (mark it as "ignore"), then I can generate rules, but that's not helpful since I want classification rules. :) After looking at the documentation more closely, it appears that it's not possible to mix class labels and sparse/basket data. Is this correct? Is there some workaround I could do? (To add to the complication, I actually want to do multiclass rules, as I have multiple topics, but even single-class would be helpful...)

Re: Association rules: sparse data and classes?

Postby Ales » Thu Nov 14, 2013 13:16

mjjohnson wrote: After looking at the documentation more closely, it appears that it's not possible to mix class labels and sparse/basket data. Is this correct?
Yes. The AssociationRulesSparseInducer only considers basket data if the table does not have any regular attributes or class labels.
mjjohnson wrote: Is there some workaround I could do?
You could move the class labels to the basket (i.e. add topic-tame, topic-wild, ...)
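
A minimal sketch of that workaround in plain Python, with no Orange calls. It folds each row's topics into the basket as prefixed items. The helper names (`to_basket_line`, `write_basket`), the `topic-` prefix, and the output file name are illustrative assumptions, and the comma-separated line layout assumes Orange's `.basket` file format.

```python
# Rows copied from the example table in the question: (topics, words).
ROWS = [
    ("tame wild", "cat dog fox"),
    ("tame", "cat"),
    ("tame", "dog cat"),
    ("tame wild", "wolf cat"),
    ("wild", "wolf"),
    ("wild", "fox"),
]


def to_basket_line(topics, words, prefix="topic-"):
    """Return one comma-separated basket line, prefixed topic items first."""
    labels = [prefix + topic for topic in topics.split()]
    return ", ".join(labels + words.split())


def write_basket(rows, path):
    """Write one basket line per row; `path` is a hypothetical file name."""
    with open(path, "w") as handle:
        for topics, words in rows:
            handle.write(to_basket_line(topics, words) + "\n")


basket_lines = [to_basket_line(topics, words) for topics, words in ROWS]
for line in basket_lines:
    print(line)
```

Calling `write_basket(ROWS, 'test.basket')` and loading that file instead of 'test.tab' should give the sparse inducer a table with basket items only, assuming the file format is as described above.
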
mjjohnson wrote: To add to the complication, I actually want to do multiclass rules, as I have multiple topics, but even single-class would be helpful...
You could induce rules on the whole basket data (including classes) and then just filter out all rules whose consequent is not a subset of the class labels, or induce just the itemsets and build the rules manually.
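
A rough sketch of the second option (counting itemsets and building the rules by hand), restricted so that every consequent is a class label. This is a brute-force illustration in plain Python, not Orange's inducer: `class_rules`, `is_label`, the `topic-` prefix, and the thresholds are all assumptions made for the example, and each rule predicts a single topic, so a document with several topics is simply covered by several rules.

```python
from itertools import combinations

# The example documents with topics folded into the basket as "topic-*" items.
BASKETS = [
    {"topic-tame", "topic-wild", "cat", "dog", "fox"},
    {"topic-tame", "cat"},
    {"topic-tame", "dog", "cat"},
    {"topic-tame", "topic-wild", "wolf", "cat"},
    {"topic-wild", "wolf"},
    {"topic-wild", "fox"},
]


def is_label(item, prefix="topic-"):
    """True for items that encode a class label (assumed prefix)."""
    return item.startswith(prefix)


def count(itemset, baskets):
    """Number of baskets that contain every item of `itemset`."""
    return sum(1 for basket in baskets if itemset <= basket)


def class_rules(baskets, min_support=0.1, min_confidence=0.8, max_words=2):
    """Return (words, label, support, confidence) for rules words -> label."""
    words = sorted({i for b in baskets for i in b if not is_label(i)})
    labels = sorted({i for b in baskets for i in b if is_label(i)})
    rules = []
    for size in range(1, max_words + 1):
        for combo in combinations(words, size):
            antecedent = frozenset(combo)
            n_antecedent = count(antecedent, baskets)
            if n_antecedent == 0:
                continue  # this word combination never occurs
            for label in labels:
                n_both = count(antecedent | {label}, baskets)
                support = n_both / len(baskets)
                confidence = n_both / n_antecedent
                if support >= min_support and confidence >= min_confidence:
                    rules.append((combo, label, support, confidence))
    return rules


for words, label, support, confidence in class_rules(BASKETS):
    print(" ".join(words), "->", label, round(support, 2), round(confidence, 2))
```

Enumerating every word combination is only practical for tiny vocabularies like this one; on real text data the itemsets would come from a proper frequent-itemset miner and only the filtering step would stay the same.
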

Re: Association rules: sparse data and classes?

Postby mjjohnson » Thu Nov 14, 2013 14:08

Ales wrote: You could induce rules on the whole basket data (including classes) and then just filter out all rules whose consequent is not a subset of the class labels, or induce just the itemsets and build the rules manually.


Thanks! That will work.

