Orange Forum • View topic - Problem Loading Data

Problem Loading Data

Post by GrantD » Sun Apr 08, 2012 21:27

I'm very new to Orange, but I can't seem to get off the ground in my script because the data never imports properly. The problem centers on the ExampleTable() function not importing the class row properly.

The table I'm importing looks like this:

[Image: screenshot of the tab-delimited table]

It is a tab-delimited file with three header rows. I want to set the far-right column as the class.

My code looks like this:

Code:
import orange

data1 = orange.ExampleTable("writeFile1.tab")
print data1[0]
# prints: ['?', '?', '?', '?', '?', '?', '?', '?', 'class']
print "Data 1 Classes:", len(data1.domain.classVar.name)


This produces the following error:

Traceback (most recent call last):
  File "C:/NSF_Stuff/NLTK_Scripts/MachineLearning/ML_Script_TestTREE3.py", line 80, in <module>
    print "Data 1 Classes:", len(data1.domain.classVar.name)
AttributeError: 'NoneType' object has no attribute 'name'

So the question is: how do I get Orange to correctly interpret the third line of my .tab file? Or, alternatively, how can I force the last column to be the class?

Re: Problem Loading Data

Post by Ales » Tue Apr 10, 2012 15:48

It is hard to judge from the screenshot, but the problem is most likely inconsistent tabs.
Try importing your file into Excel or LibreOffice (import as CSV with tab separators), fix it, and then export it again.
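
The tab check suggested above can also be done in a script. This is a minimal sketch that uses only the Python standard library and assumes the file is plain text with one example per line. The function names `field_counts` and `inconsistent_lines` are hypothetical helpers, not part of Orange.

```python
def field_counts(path, delimiter="\t"):
    """Return (line_number, field_count) for every non-empty line."""
    counts = []
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            line = line.rstrip("\r\n")
            if line:
                counts.append((number, len(line.split(delimiter))))
    return counts


def inconsistent_lines(path, delimiter="\t"):
    """Return the lines whose field count differs from the first line's."""
    counts = field_counts(path, delimiter)
    if not counts:
        return []
    expected = counts[0][1]
    return [(number, count) for number, count in counts if count != expected]
```

Calling `inconsistent_lines("writeFile1.tab")` would list each line whose number of tab-separated fields differs from the header row. An empty list means the tab counts are at least consistent, though it does not prove the header rows themselves are valid.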

Re: Problem Loading Data

Post by GrantD » Thu Apr 12, 2012 5:57

I've brought the data into Excel and the number of tab delimiters is correct. At this point I'm just looking for any way to force a certain variable in my table to be the class I want to predict. Reading in the .tab file just isn't working, so I'm wondering if there is a way to read in data as a list instead of a table?

Re: Problem Loading Data

Post by Ales » Thu Apr 12, 2012 12:32

GrantD wrote: I'm wondering if there is a way to read in data as a list instead of a table?
You can create a table with
Code:
import orange

# Two discrete features plus a discrete class variable.
domain = orange.Domain(
    [orange.EnumVariable("Modal", values=["N", "Y"]),
     orange.EnumVariable("SpatialPhrase", values=["N", "Y"])],
    orange.EnumVariable("my_class", values=["N", "Y"]))

# Each row lists the feature values followed by the class value.
table = orange.ExampleTable(domain,
                            [["N", "N", "N"],
                             ["Y", "Y", "Y"]])
(Only two features and the class are shown.)
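
To get the rows for such a table out of the existing file, the data lines can be read into plain lists first. This is a minimal sketch using only the standard library. It assumes a tab-delimited file whose first row holds the column names and whose first three rows are headers, as described in the original post. `read_tab_rows` is a hypothetical helper, not an Orange function.

```python
def read_tab_rows(path, header_rows=3, delimiter="\t"):
    """Return (column_names, data_rows) from a tab-delimited file.

    The first line supplies the column names; the remaining header
    lines are skipped, and blank lines are ignored.
    """
    names = None
    rows = []
    with open(path) as handle:
        for index, line in enumerate(handle):
            fields = line.rstrip("\r\n").split(delimiter)
            if index == 0:
                names = fields
            elif index >= header_rows and any(fields):
                rows.append(fields)
    return names, rows
```

The resulting `rows` list has the same shape as the list of lists passed to `orange.ExampleTable(domain, ...)` above, with the class value last in each row. Presumably every value must be one of the values declared for its variable, so the domain would need to list all of them.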

Re: Problem Loading Data

Post by GrantD » Thu Apr 19, 2012 8:48

My workaround was this:

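Code:
import orange

data1 = orange.ExampleTable("writeFile1.tab")
# Keep every variable except "Place" as a feature (matched by exact name) and make "Place" the class.
new_domain1 = orange.Domain([a for a in data1.domain.variables if a.name != "Place"], data1.domain["Place"])
# Convert the loaded data to the new domain.
new_data1 = orange.ExampleTable(new_domain1, data1)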


It seems to be working correctly.

