Last summer, Wencan Luo participated in Google Summer of Code as a student to implement multi-label classification in Orange. He provided a framework, implemented a few algorithms, and built some prototype widgets. His work has been "hidden" in our repositories for too long; we have finally merged part of his code into Orange (the widgets are not there yet ...) and added more general support for multi-target prediction.
You can load multi-label tab-delimited data (e.g. emotions.tab) just like any other tab-delimited data:
>>> import Orange
>>> zoo = Orange.data.Table('zoo')  # single-target
>>> emotions = Orange.data.Table('emotions')  # multi-label
The difference is that zoo's domain has a non-empty class_var field, while the list of emotions' labels can be obtained through its domain's class_vars:
>>> zoo.domain.class_var
EnumVariable 'type'
>>> emotions.domain.class_vars
<EnumVariable 'amazed-suprised', EnumVariable 'happy-pleased', EnumVariable 'relaxing-calm', EnumVariable 'quiet-still', EnumVariable 'sad-lonely', EnumVariable 'angry-aggresive'>
A simple example of a multi-label classification learner is a "binary relevance" learner. Let's try it out.
>>> learner = Orange.multilabel.BinaryRelevanceLearner()
>>> classifier = learner(emotions)
>>> classifier(emotions[0])
[<orange.Value 'amazed-suprised'='0'>, <orange.Value 'happy-pleased'='0'>, <orange.Value 'relaxing-calm'='1'>, <orange.Value 'quiet-still'='1'>, <orange.Value 'sad-lonely'='1'>, <orange.Value 'angry-aggresive'='0'>]
>>> classifier(emotions[0], Orange.classification.Classifier.GetProbabilities)
[<1.000, 0.000>, <0.881, 0.119>, <0.000, 1.000>, <0.046, 0.954>, <0.000, 1.000>, <1.000, 0.000>]
The actual label values of the emotions[0] instance can be obtained by calling emotions[0].get_classes(), which is analogous to the get_class method in the single-target case.
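To make the idea behind binary relevance concrete, here is a minimal sketch in plain Python that does not use Orange at all. It assumes the usual description of the method: train one independent single-label classifier per label and collect their predictions into a list. The NearestNeighbour base learner, the BinaryRelevance class and the toy data are all hypothetical names made up for illustration; they are not how Orange.multilabel.BinaryRelevanceLearner is actually implemented.

```python
class NearestNeighbour(object):
    """Hypothetical stand-in base learner: 1-nearest neighbour on numeric rows."""

    def fit(self, rows, targets):
        self.rows = rows
        self.targets = targets
        return self

    def predict(self, row):
        # Squared Euclidean distance between two feature vectors.
        def distance(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        nearest = min(range(len(self.rows)),
                      key=lambda i: distance(self.rows[i], row))
        return self.targets[nearest]


class BinaryRelevance(object):
    """Illustrative wrapper: one independent base model per label."""

    def __init__(self, base_factory):
        self.base_factory = base_factory

    def fit(self, rows, label_rows):
        n_labels = len(label_rows[0])
        # Train model j on the j-th label column only, ignoring the others.
        self.models = [
            self.base_factory().fit(rows, [labels[j] for labels in label_rows])
            for j in range(n_labels)
        ]
        return self

    def predict(self, row):
        # The multi-label prediction is the list of per-label predictions.
        return [model.predict(row) for model in self.models]


# Toy data (made up): two features, two binary labels per instance.
rows = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
label_rows = [[1, 0], [1, 0], [0, 1], [1, 1]]

model = BinaryRelevance(NearestNeighbour).fit(rows, label_rows)
prediction = model.predict([0.95, 1.0])
print(prediction)
```

Because every label gets its own model, any single-target learner can be plugged in, at the cost of ignoring dependencies between labels.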
For multi-label classification, we can also perform testing as usual; however, specialised evaluation measures have to be used:
>>> test = Orange.evaluation.testing.cross_validation([learner], emotions)
>>> Orange.evaluation.scoring.mlc_hamming_loss(test)
[0.2228780213603148]
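For intuition about what such a measure reports, the sketch below computes Hamming loss by hand, assuming the standard definition: the fraction of (instance, label) pairs that are predicted incorrectly. The hamming_loss function and the toy label rows are hypothetical and only illustrate the measure; the details of Orange's mlc_hamming_loss may differ.

```python
def hamming_loss(true_rows, predicted_rows):
    """Fraction of (instance, label) pairs where the prediction is wrong."""
    wrong = 0
    total = 0
    for true_labels, predicted_labels in zip(true_rows, predicted_rows):
        if len(true_labels) != len(predicted_labels):
            raise ValueError("label lists must have the same length")
        total += len(true_labels)
        wrong += sum(1 for t, p in zip(true_labels, predicted_labels) if t != p)
    if total == 0:
        raise ValueError("no labels to evaluate")
    # float() keeps the division exact under Python 2 as well.
    return float(wrong) / total


# Toy data (made up): two instances with six binary labels each.
true_rows = [[0, 0, 1, 1, 1, 0],
             [1, 0, 0, 0, 0, 1]]
predicted_rows = [[0, 0, 1, 1, 0, 0],
                  [1, 1, 0, 0, 0, 1]]

loss = hamming_loss(true_rows, predicted_rows)
print(loss)
```

Here two of the twelve label predictions are wrong, so the loss is 2/12. Lower is better, and a loss of 0 means every label of every instance was predicted correctly.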
One of the upcoming blog posts will describe PLS, a multi-target regression method that is currently being implemented.
