Orange Forum • View topic - Rounded cut-off points for IntervalDiscretizer do not work

Postby aguthrie » Fri Apr 10, 2009 19:25

I'm following the example in disc5.py to round the cut-off points of an EquiNDiscretization discretizer. The discretizer's points are updated, but the new values don't seem to be applied when I select data.

Consider the output of the following modified disc5.py:
Code:
# Description: Shows how to round-off the cut-off points used for categorization.
# Category:    preprocessing
# Uses:        iris
# Classes:     EquiNDiscretization, EntropyDiscretization
# Referenced:  o_categorization.htm

import orange
iris = orange.ExampleTable("iris")

equiN = orange.EquiNDiscretization(numberOfIntervals=4)
entropy = orange.EntropyDiscretization()

pl = equiN("petal length", iris)
sl = equiN("sepal length", iris)
sl_ent = entropy("sepal length", iris)

points = pl.getValueFrom.transformer.points
points2 = map(lambda x:round(x), points)
pl.getValueFrom.transformer.points = points2

for attribute in [pl, sl, sl_ent]:
    print "Cut-off points for", attribute.name, \
        "are", attribute.getValueFrom.transformer.points

iris = iris.select([pl, sl, sl_ent])
for ex in iris[:10]:
    print ex

Output:
Code:
Cut-off points for D_petal length are <2.0, 4.0, 5.0>
Cut-off points for D_sepal length are <5.15000009537, 5.84999990463, 6.44999980927>
Cut-off points for D_sepal length are <5.5, 6.09999990463>
['<=1.55', '<=5.15', '<=5.50']
['<=1.55', '<=5.15', '<=5.50']
['<=1.55', '<=5.15', '<=5.50']
['<=1.55', '<=5.15', '<=5.50']
['<=1.55', '<=5.15', '<=5.50']
['<=1.55', '(5.15, 5.85]', '<=5.50']
['<=1.55', '<=5.15', '<=5.50']
['<=1.55', '<=5.15', '<=5.50']
['<=1.55', '<=5.15', '<=5.50']
['<=1.55', '<=5.15', '<=5.50']

Note that the first column (D_petal length) is using 1.55 as a cut-off point, not the new 2.0 cut-off.
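
One way to narrow this down is to check which cut-off the rows are actually binned with, as opposed to which cut-off the value label shows. The following is a minimal plain-Python sketch of right-closed interval binning. It makes no Orange calls and is not Orange's implementation. The helper `interval_index` is hypothetical, and the petal lengths are assumed to be the first ten rows of the standard iris data rather than read from the table above.

```python
from bisect import bisect_left

def interval_index(value, points):
    # bisect_left places a value equal to a cut-off in the lower interval,
    # which matches right-closed labels such as "(5.15, 5.85]".
    return bisect_left(points, value)

# First ten petal lengths of the standard iris data (assumed, not read from Orange).
petal_lengths = [1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5]

old_first_cutoff = [1.55]      # the cut-off still shown in the value labels
new_points = [2.0, 4.0, 5.0]   # the rounded cut-offs printed by the script

# Interval index of each row under the old first cut-off and under the new points.
old_bins = [interval_index(v, old_first_cutoff) for v in petal_lengths]
new_bins = [interval_index(v, new_points) for v in petal_lengths]

print(old_bins)
print(new_bins)
```

Under these assumptions, only the sixth row (petal length 1.7) would move to the second interval if 1.55 were still the effective cut-off, yet the output shows '<=1.55' for that row too. That would be consistent with the new points being applied while only the value labels remain stale, though this has not been confirmed against Orange itself.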

I'm working on a deadline here, so any help (no matter how terse/rude) is appreciated.
