## from numpy arrays to orange data

5 posts • Page **1** of **1**

### from numpy arrays to orange data

I have data and the associated class labels in numpy arrays.

    d = numpy.array([[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [5, 5, 2, 1, 1]])  # this is the data
    c = numpy.array([0, 0, 1])  # these are the class labels

How do I convert them to the Orange data format and feed them to Orange classifiers?

Last edited by Sheila on Thu Jul 10, 2014 15:03, edited 1 time in total.

### Re: from numpy arrays to orange data

If you are working with non-sparse data, the example below should do the job:

    import numpy
    import Orange

    # domain with five continuous features named a0 ... a4
    d = Orange.data.Domain([Orange.feature.Continuous('a%i' % x) for x in range(5)])
    a = numpy.array([[1, 2, 3, 4, 5], [5, 4, 3, 2, 1]])
    t = Orange.data.Table(a)

The example is taken from http://orange.biolab.si/docs/latest/reference/rst/Orange.data.table/#example-table-prog1; see that page for more details.

### Re: from numpy arrays to orange data

Sheila wrote: I have data and associated class in numpy array.

    data = numpy.array([[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [5, 5, 2, 1, 1]])
    class_ = numpy.array([0, 0, 1])  # "class" is a reserved word in Python, hence class_

How do I change it to Orange data format and feed it in orange classifiers?

First you need to create an appropriate domain for the dataset (here I am assuming 5 continuous features and a discrete class variable):

    import numpy
    import Orange

    _, p = data.shape  # number of features
    features = [Orange.feature.Continuous("X%i" % (i + 1)) for i in range(p)]
    class_var = Orange.feature.Discrete("C", values=["0", "1"])  # class variable
    domain = Orange.data.Domain(features, class_var)
    print domain  # --> [X1, X2, X3, X4, X5, C]

Then pass the domain and the data array (with the class column included) to the Orange.data.Table constructor:

    table = Orange.data.Table(domain, numpy.hstack((data, class_.reshape(-1, 1))))

Finally, train a classifier:

    tree = Orange.classification.tree.TreeLearner(table)
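To see what the array passed to the Orange.data.Table constructor looks like, here is a minimal numpy-only sketch of the stacking step (no Orange needed). The names `column` and `stacked` are only for illustration, and the sketch assumes the labels are already numeric:

```python
import numpy

data = numpy.array([[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [5, 5, 2, 1, 1]])
class_ = numpy.array([0, 0, 1])

# reshape(-1, 1) turns the 1-D label vector into a column of shape (n, 1)
column = class_.reshape(-1, 1)

# hstack appends that column on the right, so each row is [features..., class]
stacked = numpy.hstack((data, column))

print(stacked.shape)   # (3, 6)
print(stacked[:, -1])  # last column holds the class labels: [0 0 1]
```

The number of columns (6) matches the domain: 5 features plus the class variable.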

### Re: from numpy arrays to orange data

I am still confused. I do not understand where to specify the class label.

Here 'd' is the data matrix of size (n*m) (n samples and m features), and 'c' is the class label vector of size (n*1) (one label per sample).

    d = numpy.array([[1, 2, 3], [5, 4, 3], [5, 0.5, 2], [0.4, 3, 0.2], [0.1, 0.8, 3]])  # this is the data
    c = numpy.array([0, 1, 0, 0, 1])  # these are the class labels

Is it possible to make a simple function which takes the input data and the corresponding class labels and gives an Orange table?

Thank you.

### Re: from numpy arrays to orange data

Sheila wrote: I am still confused. I do not understand where to specify the class label.

The class is specified in the Orange.data.Domain(list_of_features, class_var) constructor, i.e. if X1, X2 and X3 are the predictor variables and C is the class variable (with "0" and "1" labels), then

    domain = Orange.data.Domain([X1, X2, X3], C)
    table = Orange.data.Table(domain, [[1, 2, 3, 0], [4, 3, 3, 1]])

gives a table with the domain:

    [X1 X2 X3 | C]

with values:

    [ 1 2 3 | 0]
    [ 4 3 3 | 1]
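The same `[features | class]` row layout can be checked with the arrays from the question, using numpy alone. This is only a sketch: the name `rows` is illustrative, and `numpy.column_stack` is used here as an alternative to `hstack` with `reshape`:

```python
import numpy

d = numpy.array([[1, 2, 3], [5, 4, 3], [5, 0.5, 2], [0.4, 3, 0.2], [0.1, 0.8, 3]])
c = numpy.array([0, 1, 0, 0, 1])

# column_stack treats the 1-D array c as a column, so no reshape is needed
rows = numpy.column_stack((d, c))

for row in rows:
    features, label = row[:-1], row[-1]
    print(features, "|", label)

print(rows.dtype)  # float64: the integer labels are stored as floats next to the data
```

Note that the class ends up as the last value of every row, which is the position the domain expects.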

Sheila wrote: Is it possible to make a simple function which takes input data and corresponding class label and give Orange table!!!

    import numpy
    import Orange

    def create_table_with_binary_class(d, c):
        _, p = d.shape  # number of features
        features = [Orange.feature.Continuous("X%i" % (i + 1)) for i in range(p)]
        class_var = Orange.feature.Discrete("C", values=["0", "1"])  # class variable
        domain = Orange.data.Domain(features, class_var)
        # append the class labels as the last column of the data
        return Orange.data.Table(domain, numpy.hstack((d, c.reshape(-1, 1))))
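The function above hard-codes the labels "0" and "1". If the labels take other values, one option is to derive the value list from the label array itself. Below is a numpy-only sketch; `encode_labels` is a hypothetical helper, not part of Orange, and it assumes the class column of the table should hold the index of each label in the `values` list:

```python
import numpy

def encode_labels(c):
    """Return (values, codes): label names as strings and the index of each label."""
    # unique() sorts the distinct labels; return_inverse gives, for every
    # element of c, its position in that sorted list
    uniq, codes = numpy.unique(numpy.asarray(c).ravel(), return_inverse=True)
    values = [str(v) for v in uniq]
    return values, codes

c = numpy.array([3, 7, 3, 3, 7])
values, codes = encode_labels(c)
print(values)  # ['3', '7']
print(codes)   # [0 1 0 0 1]
```

Under that assumption, `values` would be passed as `Orange.feature.Discrete("C", values=values)` and `codes.reshape(-1, 1)` would be stacked onto the data in place of `c.reshape(-1, 1)`.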
