Orange Forum • View topic - Problems importing Continuous Values

Problems importing Continuous Values

Report bugs (or imagined bugs).
(Archived/read-only, please use our ticketing system for reporting bugs and their discussion.)


Postby piecurus » Wed Sep 09, 2009 18:01

Hi all,
first of all, congratulations on the improvements. Orange now works perfectly on Ubuntu 9.04 with Python 2.6!

I found a (possible) bug. When ExampleTable imports continuous values, it drops the decimal digits. For example, if the real value is "24.23245", the imported data shows "24,00000000000" (always with 11 zeros).

I tried with tab and csv files but the result is the same.

If I don't specify that the attribute is continuous, it is treated as an EnumVariable.

Any ideas?

Thank you very much.

Postby Janez » Wed Sep 09, 2009 20:40

Weird, but my first guess would be that you have a trivial error in the data. Can you send the first few lines and the line with that particular example either to the forum or to janez<<dot>>demsar>>at<<fri.uni-lj.si?

Postby piecurus » Wed Sep 09, 2009 21:13

OK, I can confirm that I found a bug, but I have no idea why it happens... it's very strange.

These are the first few lines of my file dataset.csv:
C#sX,C#sY,C#sZ,C#pho,C#phi,C#theta,c#myClass
-2.125,-24.3725585938,2.75,24.6190929445,1.48382803331,1.45886080066,classOne
-2.125,-24.3725585938,2.5,24.5924223575,1.48382803331,1.46896308589,classOne
-2.125,-24.2475585938,2.375,24.4560901977,1.48338197261,1.47353020707,classOne
-2.25,-24.3725585938,2.125,24.5682668784,1.47874030957,1.48419442903,classOne
-2.25,-24.3725585938,2.125,24.5682668784,1.47874030957,1.48419442903,classOne

If I start Python with "ipython -wthread", this is what I get:

In [2]: data = orange.ExampleTable('dataset.csv')

In [3]: for index in range(5):
   ...:     print(data[index])

[-2,00000000000, -24,00000000000, 2,00000000000, 24,000000000000, 1, 1,000000000000000, 'classOne']
[-2,00000000000, -24,00000000000, 2,00000000000, 24,000000000000, 1, 1,000000000000000, 'classOne']
[-2,00000000000, -24,00000000000, 2,00000000000, 24,000000000000, 1, 1,000000000000000, 'classOne']
[-2,00000000000, -24,00000000000, 2,00000000000, 24,000000000000, 1, 1,000000000000000, 'classOne']
[-2,00000000000, -24,00000000000, 2,00000000000, 24,000000000000, 1, 1,000000000000000, 'classOne']

In [4]: data.domain.attributes
Out[4]: <FloatVariable 'sX', FloatVariable 'sY', FloatVariable 'sZ', FloatVariable 'pho', FloatVariable 'phi', FloatVariable 'theta'>


If I start Python with just "ipython", everything works fine:
In [2]: data = orange.ExampleTable('dataset.csv')

In [3]: for index in range(5):
   ...:     print(data[index])
[-2.12500000000, -24.37255859375, 2.75000000000, 24.619092941284, 1.48383, 1.458860754966736, 'classOne']
[-2.12500000000, -24.37255859375, 2.50000000000, 24.592422485352, 1.48383, 1.468963027000427, 'classOne']
[-2.12500000000, -24.24755859375, 2.37500000000, 24.456090927124, 1.48338, 1.473530173301697, 'classOne']
[-2.25000000000, -24.37255859375, 2.12500000000, 24.568267822266, 1.47874, 1.484194397926331, 'classOne']
[-2.25000000000, -24.37255859375, 2.12500000000, 24.568267822266, 1.47874, 1.484194397926331, 'classOne']

In [4]: data.domain.attributes
Out[4]: <FloatVariable 'sX', FloatVariable 'sY', FloatVariable 'sZ', FloatVariable 'pho', FloatVariable 'phi', FloatVariable 'theta'>

Thank you very much for answering.

Postby Janez » Fri Sep 25, 2009 18:48

When I wanted to start working on this, I noticed the commas: in one case you have -24.3725585938 and in another -24,0000000 -- the former number has a dot and the latter a comma. Could this have something to do with your locale settings?

To avoid problems with dots and commas, Orange supports both by converting commas to dots and then calling scanf. If your scanf expects commas (I have to confess I have no idea how the standard Linux/gcc scanf works), 24.372 would be read as 24.000 and printed as 24,000. Can you explore this a bit?
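
To make the suspected failure concrete, here is a minimal Python sketch of that behaviour. It is not Orange's actual parsing code: `scanf_like_float` and `parse_orange_style` are hypothetical names, and the regular expression is only a simplified stand-in for a C-style `%f` scan, which reads the longest valid numeric prefix and stops at the first character it does not expect.

```python
import re


def scanf_like_float(text, decimal_point="."):
    """Simplified stand-in for a C-style %f scan (hypothetical helper).

    Reads the leading number using the given decimal separator and stops
    at the first character that does not belong to it.
    """
    pattern = r"[+-]?\d+(?:" + re.escape(decimal_point) + r"\d*)?"
    match = re.match(pattern, text.strip())
    if match is None:
        raise ValueError("no leading number in %r" % text)
    # Python's float() always expects a dot, so normalise the separator.
    return float(match.group(0).replace(decimal_point, "."))


def parse_orange_style(text, decimal_point):
    """Convert commas to dots first, then scan (assumed order of steps)."""
    return scanf_like_float(text.replace(",", "."), decimal_point)


# Scanner expects a dot: the value survives intact.
print(parse_orange_style("-24.3725585938", "."))
# Scanner expects a comma: the scan stops at the dot, fraction is lost.
print(parse_orange_style("-24.3725585938", ","))
```

Under these assumptions, converting commas to dots helps only while the scanner itself expects a dot. Once the separator is a comma, every value is cut off at the dot, which would match the truncated output reported above.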

And I'll add some checking to Orange code for parsing numbers, too.

I have no clue how this is related to threads in ipython, though.

Postby Janez » Sat Sep 26, 2009 0:02

I may have fixed it - Orange now checks the locale before reading the values. Can you rebuild it and see if it works now?
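
The thread does not show what the fix looks like inside Orange. As an illustration only, one common way to make number parsing independent of the user's settings is to switch the numeric locale to "C" while reading and restore it afterwards. The sketch below does this with Python's standard `locale` module; `c_numeric_locale` and `read_float` are hypothetical names, not part of Orange.

```python
import locale
from contextlib import contextmanager


@contextmanager
def c_numeric_locale():
    """Temporarily force LC_NUMERIC to "C" (hypothetical helper)."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # query current setting
    locale.setlocale(locale.LC_NUMERIC, "C")
    try:
        yield
    finally:
        # Restore whatever the process was using before.
        locale.setlocale(locale.LC_NUMERIC, saved)


def read_float(text):
    """Parse a dot-separated number regardless of the active locale."""
    with c_numeric_locale():
        # In the "C" locale the decimal separator is always a dot.
        assert locale.localeconv()["decimal_point"] == "."
        return locale.atof(text)


print(read_float("24.23245"))
```

To inspect what the current session uses, `locale.localeconv()["decimal_point"]` can be printed in both the plain and the `-wthread` ipython sessions. Note that `setlocale` changes the locale for the whole process, so this pattern is not safe if other threads format or parse numbers at the same time.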

Postby piecurus » Sun Sep 27, 2009 15:31

Hi,
at first I noticed the conversion from commas to dots, so I always converted all the commas to dots in the float values, but the problem was still there (since I'm writing csv files, I thought the problem might lie there).
As soon as I can (tonight or tomorrow), I'll rebuild Orange and post the results.

Postby piecurus » Wed Nov 11, 2009 13:51

Hi Janez,
sorry for answering so late.
Now the values are imported correctly, even with the ipython option -wthread.
Thank you very much!

