Ticket #1167 (closed bug: fixed)

Opened 2 years ago

Last modified 2 years ago

MultiTreeLearner does not work with single class .tab

Reported by: honkir
Owned by: Lan Zagar <lan.zagar@…>
Milestone:
Component: library
Severity: major
Keywords:
Cc: honkir
Blocking:
Blocked By:

Description

Creating a tree with MultiTreeLearner on a .tab file that has a single 'class' column raises an exception:

Traceback (most recent call last):
  File "./train.py", line 92, in <module>
    treeClassifier = learner(data)
  File "/usr/lib/python2.6/dist-packages/orange/Orange/multitarget/tree.py", line 201, in __call__
    self, data2, weight)
UnboundLocalError: local variable 'data2' referenced before assignment
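
An UnboundLocalError like the one above typically means the name is assigned on only one code path. The following is a minimal sketch of that pattern and of the usual fix. The function names and the `has_multiple_classes` flag are hypothetical, and this is not the actual code in Orange/multitarget/tree.py:

```python
def pick_data(data, has_multiple_classes):
    # Hypothetical stand-in for the failing pattern: data2 is bound
    # only when the multi-class branch runs.
    if has_multiple_classes:
        data2 = list(data)
    return data2  # raises UnboundLocalError on the single-class path


def pick_data_fixed(data, has_multiple_classes):
    # Bind the name on every path before it is used.
    data2 = list(data) if has_multiple_classes else data
    return data2


try:
    pick_data([1, 2, 3], has_multiple_classes=False)
except UnboundLocalError as err:
    print("single-class path failed: %s" % err)

print(pick_data_fixed([1, 2, 3], has_multiple_classes=False))
```

Binding the variable on every path removes the crash, but it does not by itself make the learner accept the data.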

Fixing this bug directly (by assigning data2 = data) does not help much, because the 'multiclass' column is missing:

Traceback (most recent call last):
  File "./train.py", line 92, in <module>
    treeClassifier = learner(data)
  File "/usr/lib/python2.6/dist-packages/orange/Orange/multitarget/tree.py", line 202, in __call__
    self, data2)
  File "/usr/lib/python2.6/dist-packages/orange/Orange/classification/tree.py", line 1860, in __call__
    tree = bl(instances, weight)
  File "/usr/lib/python2.6/dist-packages/orange/Orange/multitarget/__init__.py", line 83, in __call__
    raise Exception('No classes defined.')
Exception: No classes defined.

Well, OK. So there must be a 'multiclass' column and no 'class' column. But then another error comes up. I can only guess that the 'multiclass' column is required to be 'continuous' (?), but I need it to be 'discrete' (string):

Traceback (most recent call last):
  File "./train.py", line 92, in <module>
    treeClassifier = learner(data)
  File "/usr/lib/python2.6/dist-packages/orange/Orange/multitarget/tree.py", line 201, in __call__
    self, data2, weight)
  File "/usr/lib/python2.6/dist-packages/orange/Orange/classification/tree.py", line 1860, in __call__
    tree = bl(instances, weight)
  File "/usr/lib/python2.6/dist-packages/orange/Orange/multitarget/tree.py", line 104, in threshold_function
    ts = [(v1 + v2) / 2. for v1, v2 in zip(values, values[1:])]
TypeError: unsupported operand type(s) for +: 'float' and 'str'
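
The failing line computes candidate split thresholds as midpoints between consecutive values, which only makes sense for numbers. The sketch below reuses that one expression in isolation. The wrapper name `midpoints` and the sample values are assumptions, and how a string ends up among the values inside Orange is not shown here:

```python
def midpoints(values):
    # Same expression as in the traceback: candidate thresholds are the
    # midpoints between consecutive values.
    return [(v1 + v2) / 2. for v1, v2 in zip(values, values[1:])]


print(midpoints([1.0, 2.0, 4.0]))  # numeric values work

try:
    midpoints([1.0, "label"])  # a string mixed in with floats
except TypeError as err:
    print("TypeError: %s" % err)
```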

I am attaching part of the .tab file; sample code follows:

import Orange

# Load the attached sample and train a multi-target tree on it.
data = Orange.data.Table('test.tab')
learner = Orange.multitarget.tree.MultiTreeLearner()
treeClassifier = learner(data)

test.tab:

frekvence procesoru     typ procesoru   operační paměť  barva   hmotnost        úhlopříčka displeje     grafická karta  typ displeje    rozlišení displeje      dotykový displej        numerická klávesnice    čtečka otisků prstů     webkamera    class
c       d       c       d       c       c       d       d       d       d       d       d       d       d
                                                                                                        multiclass
        Intel Core i5   4       šedá    1.98    14.0    AMD Radeon HD   lesklý  1366x768                                1       Stylové
1.7     Intel Core i5   4       bílá    1.35    13.3    Intel HD Graphics       lesklý  1440x900                                1       Stylové
2.2     Intel Core i7   6       stříbrná                15.6    NVIDIA GeForce GT       lesklý  1280x720                                        Herní
2.6     Intel Core i5   4       stříbrná        1.65    13.0    Intel HD Graphics       matný   1366x768                                1       Profesionální
2.3     Intel Core i5   8       černá   3.3     17.3    AMD Radeon HD   lesklý  1600x900                1               1       Herní
2.5     Intel Core i5   4       černá   2.48    14.0    Intel GMA       matný   1366x768                                1       Profesionální
2.2     Intel Core i7   8       černá   2.0     15.6    NVIDIA GeForce GT       lesklý  1920x1080               1               1       Herní


Change History

comment:1 Changed 2 years ago by honkir

  • Cc honkir added

comment:2 Changed 2 years ago by Lan Zagar <lan.zagar@…>

  • Owner set to Lan Zagar <lan.zagar@…>
  • Status changed from new to closed
  • Resolution set to fixed

In [f021e2b1ed50f3eb61e3126bd954995b2716001b/orange]:

Make MultiTree work on single-class data (fixes #1167).

comment:3 Changed 2 years ago by lanz

It should now work for data sets with just one class. However, in that case it would be much better to use the normal trees (Orange.classification.tree.TreeLearner).

In your case you will have to, since MultiTreeLearner does not support discrete features yet (the class can be discrete). Another option is to continuize the data first, but normal trees handle this kind of data just fine, so use them if you do not have multiple classes.
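
To illustrate what continuizing means here: each discrete feature is replaced by numeric indicator columns, so that threshold-based splits can be applied. This is a minimal pure-Python sketch of the idea only. The helper `continuize_column` is hypothetical and is not Orange's own continuization API:

```python
def continuize_column(values):
    # Hypothetical helper: map each distinct discrete value to a 0/1
    # indicator column. Returns (column names, rows of indicators).
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows


names, rows = continuize_column(["matte", "glossy", "glossy"])
print(names)
print(rows)
```

Each row then contains only floats, one per distinct value of the original feature.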
