Orange Forum • View topic - New to Orange: Inputing data in a TAB file & Finding Itemset

Post by fmegahed » Sun Jan 13, 2013 19:20

Hello,

I was trying to run an example in Orange in which I wanted to find frequent itemsets in the data below:
{cat, and, dog, bites}
{Yahoo, news, claims, a, cat, mated, with, a, dog, and, produced, viable, offspring}
{cat, killer, likely, is, a, big, dog}
{professional, free, advice, on, dog, training, puppy, training}
{cat, and, kitten, training, and, behavior}
{Dog, &, cat, provides, dog, training, in, Eugene, Oregon}
{Dog, and, cat, is, a, slang, term, used, by, police, officers, for, a, male-female, relationship}
{Shop, for, your, show, dog, grooming, and, pet, supplies}

Therefore, I believe I should be using both the file widget and the itemset widget. However, I have the following problems.
* When the data was read, Orange first saw 7 examples instead of 8, so I added a row, {string, string, string, string}, to make Orange read all eight examples and not use one of them as a header. Is there a way to avoid having a header row? Any suggestions on how to deal with such data, since the number of items in each basket (each row) is different?

* How do I make it read the elements in each row as separate words?

* Is there a way to ignore capitalization, so that Orange treats Cat and cat as the same item?

Thank you very much.
P.S.: The data is copied as-is from a .txt file.

Re: New to Orange: Inputing data in a TAB file & Finding Ite

Post by Ales » Tue Jan 15, 2013 10:46

fmegahed wrote: Any suggestions on how to deal with such data since the number of items in the basket (each row) is different?
Use the .basket format.
fmegahed wrote: Is there a way to avoid the effect of capitalization such that Orange reads Cat and cat to be the same?
No.
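
Since Orange will not fold case on its own, one workaround is to normalize the text before it reaches Orange. The sketch below is plain Python with no Orange calls. It assumes a .basket file holds one basket per line with items separated by commas and no header row, and it turns the brace-delimited lines from the first post into that shape while lowercasing each item. The function names and the output filename are made up for illustration, and dropping repeated items within a basket is an assumption that seems reasonable for itemset mining, not something the format is known to require.

```python
def to_basket_line(raw, lowercase=True):
    """Convert one '{a, b, c}' line into a comma-separated basket line."""
    # Strip surrounding whitespace and the enclosing braces.
    body = raw.strip().lstrip("{").rstrip("}")
    # Split on commas and discard empty tokens.
    items = [tok.strip() for tok in body.split(",")]
    items = [tok for tok in items if tok]
    if lowercase:
        # Fold case so that 'Cat' and 'cat' become the same item.
        items = [tok.lower() for tok in items]
    # Keep the first occurrence of each item, preserving order (assumption:
    # repeated items inside one basket are not needed for itemset mining).
    seen = set()
    unique = []
    for tok in items:
        if tok not in seen:
            seen.add(tok)
            unique.append(tok)
    return ", ".join(unique)


def write_basket(path, raw_lines):
    """Write one basket per line to `path`, skipping blank input lines."""
    with open(path, "w") as out:
        for raw in raw_lines:
            if raw.strip():
                out.write(to_basket_line(raw) + "\n")
```

Calling `write_basket("pets.basket", open("pets.txt"))` (both filenames hypothetical) would then produce a file that can be loaded in place of the TAB file, with no dummy header row needed.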

