Orange Forum • View topic - Obtain bounds of discretized Orange.data.Value

Obtain bounds of discretized Orange.data.Value

Post by rickyegeland » Wed Nov 21, 2012 23:14

I'm working on combining quantitative rules output by AssociationInducer (the data was continuous but has been discretized). For this, I need to get at the boundaries of the intervals created by the discretizer (I am using EqualNDiscretizer at the moment). The question is, if I have a value:

>>> print D_data[1004][4]
(4.50, 8.50]
>>> type(D_data[1004][4])
<type 'Orange.data.Value'>

1) Can I get the index of the interval given a value? For the above example:

>>> print D_data.domain.features[4].values
<<=4.50, (4.50, 8.50], >8.50>
>>> type(D_data.domain.features[4].values)
<type 'Orange.core.StringList'>
>>> print D_data.domain.features[4].values[1]
(4.50, 8.50]

Is there a way I can get the index (1) without searching through features[i].values?

2) Given a Value can I access the interval bounds '4.50' and '8.50' somehow? Or do I need to parse the string?

If there is a solution for 1 it is easier to identify adjacent intervals. Of course, if there is a solution for 2 I can use it to solve that problem.

Re: Obtain bounds of discretized Orange.data.Value

Post by rickyegeland » Wed Nov 21, 2012 23:59

For what it's worth, I've already written the parser:

import Orange

def disc_interval(val):
    """Given a discretized Orange.data.Value, parse out its interval as (min, max)."""
    if not isinstance(val.variable, Orange.feature.Discrete):
        raise Exception("disc_interval can only be computed for discretized values")
    # Unknown values have no interval, so return nothing for them.
    if val.is_special():
        return None
    # Read the label only after the checks above have passed.
    text = val.value
    firstchar = text[0]

    upper = float('inf')
    lower = float('-inf')
    if firstchar == '<':
        # '<=4.50': unbounded below
        upper = float(text.lstrip('<='))
    elif firstchar == '(':
        # '(4.50, 8.50]': bounded on both sides
        lower, upper = text.split(', ')
        lower = float(lower.lstrip('('))
        upper = float(upper.rstrip(']'))
    elif firstchar == '>':
        # '>8.50': unbounded above
        lower = float(text.lstrip('>'))
    else:
        # str() is needed here: a Value cannot be concatenated to a string directly.
        raise Exception("Could not parse value: " + str(val))
    return (lower, upper)
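
To try the parsing logic without an Orange installation, the same idea can be sketched on plain label strings. The name `parse_interval_label` is hypothetical, and the three label shapes ('<=4.50', '(4.50, 8.50]', '>8.50') are assumed from the output shown earlier in this thread; other discretizers or number formats may print differently.

```python
def parse_interval_label(label):
    """Parse a discretizer label string into a (lower, upper) tuple.

    Hypothetical helper: it assumes only the three label shapes
    seen in this thread and raises ValueError for anything else.
    """
    inf = float('inf')
    label = label.strip()
    if label.startswith('<='):
        # '<=4.50': unbounded below
        return (-inf, float(label[2:]))
    if label.startswith('>'):
        # '>8.50': unbounded above
        return (float(label[1:]), inf)
    if label.startswith('(') and label.endswith(']'):
        # '(4.50, 8.50]': bounded on both sides
        lower, upper = label[1:-1].split(',')
        return (float(lower), float(upper))
    raise ValueError("Could not parse interval label: %r" % (label,))
```

With a real Orange.data.Value, the label would presumably be passed in as `val.value`, as in the function above.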

Re: Obtain bounds of discretized Orange.data.Value

Post by Ales » Thu Nov 22, 2012 11:21

rickyegeland wrote: Is there a way I can get the index (1) without searching through features[i].values?
int(D_data[1004][4])
returns the index into the values list (or raises an error if the value is unknown).

rickyegeland wrote: 2) Given a Value can I access the interval bounds '4.50' and '8.50' somehow? Or do I need to parse the string?

The interval cut points of a discretized feature can be accessed through its 'get_value_from' member:
print D_data.domain.features[4].get_value_from.transformer.points
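
Taken together, the two answers above suggest a way to get the bounds without parsing strings: use the interval index from `int(value)` to look up its neighbours in the list of cut points. The following is a minimal sketch on plain Python values, not Orange code. The names `interval_bounds` and `merge_adjacent` are hypothetical, and it assumes the cut points are sorted ascending and that index i covers (points[i-1], points[i]], which matches the values list printed earlier in the thread.

```python
def interval_bounds(index, points):
    """Return (lower, upper) for the interval at `index`.

    Assumes `points` is an ascending sequence of cut points, so that
    n cut points define n + 1 intervals numbered 0..n.
    """
    points = list(points)
    if not 0 <= index <= len(points):
        raise IndexError("interval index out of range: %r" % (index,))
    # The first interval is unbounded below, the last unbounded above.
    lower = points[index - 1] if index > 0 else float('-inf')
    upper = points[index] if index < len(points) else float('inf')
    return (lower, upper)


def merge_adjacent(i, j, points):
    """Return the bounds of the union of two adjacent intervals."""
    if abs(i - j) != 1:
        raise ValueError("intervals %d and %d are not adjacent" % (i, j))
    lower, _ = interval_bounds(min(i, j), points)
    _, upper = interval_bounds(max(i, j), points)
    return (lower, upper)
```

In the thread's example, the index would come from `int(D_data[1004][4])` and the cut points from `D_data.domain.features[4].get_value_from.transformer.points`, so two values are adjacent exactly when their indices differ by one.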