Orange Forum • View topic - how build data with missing values ?

Post by pablo » Wed Aug 09, 2006 14:07

I want to build a data table that contains missing values. I tried putting "?" in place of the missing entries, but apparently that does not work.
How should I do this?
Thanks,
pablo

data = orange.ExampleTable(domain)
loe = [
["C1", "?", "C1", "C2", "C1", "C1", "m1"],
["C3", "C1", "C1", "C2", "?", "C1", "m2"],
["C2", "C3", "C1", "C2", "C2", "C1", "m1"],
["?", "C2", "C1", "?", "C1", "C1", "m1"],
["C3", "C1", "C2", "C2", "?", "C1", "m2"]]
for l in loe:
    data.append(l)
print data
import orngC45
tree = orange.C45Learner(data)

Post by Janez » Wed Aug 09, 2006 18:05

Your code is OK, except that you haven't imported orange and defined the domain. (I suppose you just haven't pasted everything into your post.) Apart from that, it works. What happens when you try it?

I'm attaching the script I used. The domain could be constructed in a more comprehensible way (but I like shortcuts). Note that you don't have to add the examples one by one; you can pass the whole list to the ExampleTable constructor at once.

Code:
import orange
nvals = [3, 3, 2, 2, 2, 2]
domain = orange.Domain([orange.EnumVariable("V%i" % i, values = ["C%i" % (j+1) for j in range(nvals[i])]) for i in range(len(nvals))] + [orange.EnumVariable("y", values=["m1", "m2"])])

loe = [
["C1", "?", "C1", "C2", "C1", "C1", "m1"],
["C3", "C1", "C1", "C2", "?", "C1", "m2"],
["C2", "C3", "C1", "C2", "C2", "C1", "m1"],
["?", "C2", "C1", "?", "C1", "C1", "m1"],
["C3", "C1", "C2", "C2", "?", "C1", "m2"]]

data = orange.ExampleTable(domain, loe)

tree = orange.C45Learner(data)

import orngC45
orngC45.printTree(tree)
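
For readers who find the one-line domain construction hard to follow, here is a rough sketch of the same idea written as an explicit loop. It is plain Python and does not call Orange at all: it only builds the (name, values) pairs that the script above feeds to orange.EnumVariable(name, values=values). The helpers check_rows and count_missing are hypothetical names introduced for this illustration, not Orange functions; they may help to confirm that every entry is either a declared value or the "?" marker before the table is built.

```python
# Number of values per attribute, copied from the script above.
nvals = [3, 3, 2, 2, 2, 2]

# Build (name, values) pairs with an explicit loop instead of a
# nested list comprehension.
specs = []
for i, n in enumerate(nvals):
    name = "V%i" % i
    values = ["C%i" % (j + 1) for j in range(n)]
    specs.append((name, values))

# The class variable comes last, as in the script above.
specs.append(("y", ["m1", "m2"]))

loe = [
    ["C1", "?", "C1", "C2", "C1", "C1", "m1"],
    ["C3", "C1", "C1", "C2", "?", "C1", "m2"],
    ["C2", "C3", "C1", "C2", "C2", "C1", "m1"],
    ["?", "C2", "C1", "?", "C1", "C1", "m1"],
    ["C3", "C1", "C2", "C2", "?", "C1", "m2"]]


def check_rows(rows, specs, missing="?"):
    """Hypothetical helper: list (row index, column name, value) for
    every entry that is neither a declared value nor the missing marker."""
    problems = []
    for r, row in enumerate(rows):
        for (name, values), value in zip(specs, row):
            if value != missing and value not in values:
                problems.append((r, name, value))
    return problems


def count_missing(rows, missing="?"):
    """Hypothetical helper: total number of missing markers in the rows."""
    return sum(row.count(missing) for row in rows)


problems = check_rows(loe, specs)
n_missing = count_missing(loe)
```

If problems comes back empty, the rows should be acceptable for the domain, and each pair in specs can be turned into a variable exactly as in the script above.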

