Orange Forum • View topic - Manually create sparse data table

Manually create sparse data table


Post by menosys » Wed Jul 09, 2014 14:43


I am currently working with a sparse data set, similar to the inquisition basket, that is retrieved from a database.

I would therefore like to create a sparse data table manually from the retrieved data instead of loading it from a file. I have found a few examples of creating a data table from non-sparse data, but I cannot find an equivalent for sparse data.

Using IPython and the inquisition basket, I got surprising results that do not help me understand how to create this table: the domain lists the words under negative ids, while to_numpy() returns an array with no columns. Any ideas about the code required and the input type?

Code:
import Orange

data = Orange.data.Table("inquisition.basket")
print(data.domain)
# {-3:nobody, -4:expects, -5:the, -6:Spanish, -7:Inquisition, -8:our, -9:chief, -10:weapon, -11:is, -12:surprise, -13:and, -14:fear, -15:two, -16:weapons, -17:are, -18:ruthless, -19:efficiency, -20:three, -21:an, -22:almost, -23:fanatical, -24:devotion, -25:to, -26:Pope, -27:four, -28:no, -29:amongst, -30:weaponry, -31:such, -32:elements, -33:as, -34:I'll, -35:come, -36:in, -37:again, -38:diverse, -39:nice, -40:red, -41:uniforms, -42:oh damn}
print(data.to_numpy())
# (array([], shape=(10, 0), dtype=float64), None, None)

Re: Manually create sparse data table

Post by Ales » Thu Jul 10, 2014 16:30

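Unfortunately, this is very tedious.
You have to add the 'sparse features' to the domain as meta attributes:
Code:
# Start from a domain with no regular features
domain = Orange.data.Domain([])
mid = Orange.feature.Descriptor.new_meta_id
# Register each sparse feature as a meta attribute under a fresh meta id
domain.add_metas(
    {mid(): Orange.feature.Continuous("nobody"),
     mid(): Orange.feature.Continuous("expects")},
    True)  # True marks these meta attributes as optional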

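and then construct the data instances:
Code:
# Each instance only sets the features it actually contains
inst1 = Orange.data.Instance(domain)
inst1["nobody"] = 1.0
inst2 = Orange.data.Instance(domain)
inst2["expects"] = 2.0
table = Orange.data.Table(domain, [inst1, inst2])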

Re: Manually create sparse data table

Post by menosys » Thu Jul 17, 2014 13:54

Great!

It may be interesting to add this code sample here:

I have also found a former post (thanks to Ales again) with other details:

For those who don't understand the previous code, here is the equivalent with the simple market basket:

Code:
import Orange

# Empty domain: every item becomes an optional meta attribute
domain = Orange.data.Domain([])
mid = Orange.feature.Descriptor.new_meta_id
domain.add_metas(
    {mid(): Orange.feature.Continuous("Bread"),
     mid(): Orange.feature.Continuous("Milk"),
     mid(): Orange.feature.Continuous("Diapers"),
     mid(): Orange.feature.Continuous("Beer"),
     mid(): Orange.feature.Continuous("Eggs"),
     mid(): Orange.feature.Continuous("Cola")},
    True)  # True marks these meta attributes as optional

# One instance per transaction; only the purchased items are set
transaction1 = Orange.data.Instance(domain)
transaction1["Bread"] = 1.0
transaction1["Milk"] = 1.0

transaction2 = Orange.data.Instance(domain)
transaction2["Bread"] = 1.0
transaction2["Diapers"] = 1.0
transaction2["Beer"] = 1.0
transaction2["Eggs"] = 1.0

transaction3 = Orange.data.Instance(domain)
transaction3["Milk"] = 1.0
transaction3["Diapers"] = 1.0
transaction3["Beer"] = 1.0
transaction3["Cola"] = 1.0

transaction4 = Orange.data.Instance(domain)
transaction4["Bread"] = 1.0
transaction4["Milk"] = 1.0
transaction4["Diapers"] = 1.0
transaction4["Beer"] = 1.0

transaction5 = Orange.data.Instance(domain)
transaction5["Bread"] = 1.0
transaction5["Milk"] = 1.0
transaction5["Diapers"] = 1.0
transaction5["Cola"] = 1.0

table = Orange.data.Table(domain, [transaction1, transaction2, transaction3, transaction4, transaction5])
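
Since my data comes from a database, writing one block per transaction does not scale, so the same steps could probably be driven by a loop. Below is a minimal sketch, not a tested recipe. It assumes the query returns (transaction_id, item) rows, which is a hypothetical layout. The helper names rows_to_baskets, item_names and baskets_to_table are made up for illustration. The Orange 2.x calls are the ones discussed in this thread, and passing True as the second argument of add_metas (to mark the meta attributes as optional) is an assumption on my side.

```python
from collections import OrderedDict


def rows_to_baskets(rows):
    """Group (transaction_id, item) rows into one {item: 1.0} dict per transaction."""
    baskets = OrderedDict()
    for transaction_id, item in rows:
        baskets.setdefault(transaction_id, OrderedDict())[item] = 1.0
    return list(baskets.values())


def item_names(baskets):
    """Return every distinct item name, in order of first appearance."""
    names = []
    for basket in baskets:
        for name in basket:
            if name not in names:
                names.append(name)
    return names


def baskets_to_table(baskets):
    """Build a sparse table from the baskets (needs Orange 2.x installed)."""
    import Orange  # imported here so the helpers above work without Orange

    domain = Orange.data.Domain([])
    mid = Orange.feature.Descriptor.new_meta_id
    metas = dict((mid(), Orange.feature.Continuous(name))
                 for name in item_names(baskets))
    domain.add_metas(metas, True)  # assumption: True means "optional"

    instances = []
    for basket in baskets:
        inst = Orange.data.Instance(domain)
        for name, value in basket.items():
            inst[name] = value
        instances.append(inst)
    return Orange.data.Table(domain, instances)


# Hypothetical rows, as they might come back from a database query
rows = [(1, "Bread"), (1, "Milk"),
        (2, "Bread"), (2, "Diapers"), (2, "Beer"), (2, "Eggs")]
baskets = rows_to_baskets(rows)
names = item_names(baskets)
```

With Orange 2.x available, table = baskets_to_table(baskets) should then give the same kind of table as the hand-written version above, for the first two transactions.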
