Multi-label classification (multilabel)¶
Multi-label classification is a machine learning prediction problem in which multiple binary variables (i.e. labels) are predicted at once. Orange offers a limited number of methods for this task.
Multi-label data is represented as multi-target data with discrete binary classes taking the values ‘0’ and ‘1’. Multi-target data is also supported by Orange’s tab file format using the multiclass directive.
Binary Relevance Learner¶
The most basic problem transformation method for multi-label classification is the Binary Relevance method. It learns |L| binary classifiers, one for each different label l in the label set L. It transforms the original data set into |L| data sets, each containing all examples of the original data set; in the data set for label l, an example is labelled ‘1’ if the label set of the original example contained l and ‘0’ otherwise. This is the same solution used to deal with a single-label multi-class problem using binary classifiers. For more information, see G. Tsoumakas and I. Katakis. Multi-label classification: An overview. International Journal of Data Warehousing and Mining, 3(3):1-13, 2007.
Note that a copy of the table is made in RAM for each label to enable construction of a classifier. Due to technical limitations, that is currently unavoidable and should be remedied in Orange 3.
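The transformation itself is simple to state outside Orange. A minimal sketch of the Binary Relevance split, independent of Orange’s API (the function name and the (features, label set) data layout are illustrative, not part of Orange):

```python
def binary_relevance_split(instances, labels):
    """Build one binary data set per label: each example is relabelled
    1 if its original label set contains the label, 0 otherwise."""
    datasets = {}
    for label in labels:
        datasets[label] = [
            (features, 1 if label in label_set else 0)
            for features, label_set in instances
        ]
    return datasets

# Toy data: (feature vector, set of labels)
data = [([0.1, 0.2], {"happy"}),
        ([0.9, 0.8], {"sad", "quiet"}),
        ([0.5, 0.4], {"happy", "quiet"})]
splits = binary_relevance_split(data, ["happy", "sad", "quiet"])
# splits["happy"] == [([0.1, 0.2], 1), ([0.9, 0.8], 0), ([0.5, 0.4], 1)]
```

Each of the resulting binary data sets is then handed to an independent copy of the base learner, which is why a separate copy of the table is made per label.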
- class Orange.multilabel.BinaryRelevanceLearner(**argkw)¶
Bases: Orange.multilabel.multibase.MultiLabelLearner
Class that implements the Binary Relevance (BR) method.
Parameters: - instances (Orange.data.Table) – a table of instances.
- base_learner (Orange.classification.Learner) – the binary learner; the default is Orange.classification.bayes.NaiveLearner.
- class Orange.multilabel.BinaryRelevanceClassifier(**kwds)¶
Bases: Orange.multilabel.multibase.MultiLabelClassifier
- __call__(instance, result_type=0)¶
Return type: a list of Orange.data.Value, a list of Orange.statistics.distribution.Distribution, or a tuple with both
Examples¶
The following example demonstrates a straightforward invocation of this algorithm (mlc-classify.py):
emotions = Orange.data.Table('emotions')
learner = Orange.multilabel.BinaryRelevanceLearner()
classifier = learner(emotions)
print classifier(emotions[0])
LabelPowerset Learner¶
LabelPowerset classification is another transformation method for multi-label classification. It considers each distinct set of labels that appears in the multi-label data as a single class. It thus learns an ordinary classification problem H: X → P(L), where P(L) is the power set of L. For more information, see G. Tsoumakas and I. Katakis. Multi-label classification: An overview. International Journal of Data Warehousing and Mining, 3(3):1-13, 2007.
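The transformation can be sketched without Orange: each distinct label set becomes one atomic class value for an ordinary single-label learner. A minimal illustration (the function name and data layout are hypothetical, not Orange’s API):

```python
def label_powerset_transform(instances):
    """Map each distinct label set to a single atomic class value,
    turning the multi-label problem into an ordinary multi-class one."""
    class_of = {}          # frozen label set -> class index
    transformed = []
    for features, label_set in instances:
        key = frozenset(label_set)
        if key not in class_of:
            class_of[key] = len(class_of)
        transformed.append((features, class_of[key]))
    return transformed, class_of

data = [([0.1], {"a"}), ([0.2], {"a", "b"}), ([0.3], {"a"})]
transformed, classes = label_powerset_transform(data)
# Two distinct label sets -> two classes; examples 0 and 2 share a class.
```

Note that the number of classes can grow up to 2^|L| in the worst case, which is the main practical limitation of this method.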
- class Orange.multilabel.LabelPowersetLearner(**argkw)¶
Bases: Orange.multilabel.multibase.MultiLabelLearner
Class that implements the LabelPowerset (LP) method.
Parameters: - instances (Orange.data.Table) – a table of instances.
- base_learner (Orange.classification.Learner) – the base learner; the default is BayesLearner.
- class Orange.multilabel.LabelPowersetClassifier(**argkw)¶
Bases: Orange.multilabel.multibase.MultiLabelClassifier
- __call__(instance, result_type=0)¶
Return type: a list of Orange.data.Value, a list of Orange.statistics.distribution.Distribution, or a tuple with both
Examples¶
The following example demonstrates a straightforward invocation of this algorithm (mlc-classify.py):
emotions = Orange.data.Table('emotions')
learner = Orange.multilabel.LabelPowersetLearner()
classifier = learner(emotions)
print classifier(emotions[0])
MultikNN Learner¶
MultikNN is the base class for kNN-based multi-label classification methods.
- class Orange.multilabel.MultikNNLearner(**argkw)¶
Bases: Orange.multilabel.multibase.MultiLabelLearner
Class implementing the MultikNN (Multi-Label k Nearest Neighbours) algorithm.
- k¶
Number of neighbours. The default value is 1.
- num_labels¶
Number of labels
- label_indices¶
The indices of labels in the domain
- knn¶
Orange.classification.knn.FindNearest for nearest neighbor search
Parameters: instances (Orange.data.Table) – a table of instances.
- class Orange.multilabel.MultikNNClassifier(**argkw)¶
Bases: Orange.multilabel.multibase.MultiLabelClassifier
ML-kNN Learner¶
ML-kNN classification is an adaptation of kNN for multi-label classification. In essence, ML-kNN uses the kNN algorithm independently for each label l: it finds the k nearest examples to the test instance and considers those whose label set contains l as positive and the rest as negative. What mainly differentiates this method from other binary relevance (BR) methods is the use of prior probabilities. ML-kNN can also rank labels.
For more information, see Zhang, M. and Zhou, Z. 2007. ML-KNN: A lazy learning approach to multi-label learning. Pattern Recogn. 40, 7 (Jul. 2007), 2038-2048.
- class Orange.multilabel.MLkNNLearner(**argkw)¶
Bases: Orange.multilabel.multiknn.MultikNNLearner
Class implementing the ML-kNN (Multi-Label k Nearest Neighbours) algorithm. The class is based on the pseudo-code made available by the authors.
- k¶
Number of neighbours. The default value is 1.
- smooth¶
Smoothing parameter controlling the strength of the uniform prior (the default value of 1 yields Laplace smoothing).
- knn¶
Orange.classification.knn.FindNearest for nearest neighbor search
Parameters: instances (Orange.data.Table) – a table of instances.
- compute_cond(instances)¶
Compute posterior probabilities for each label of the training set.
- compute_prior(instances)¶
Compute prior probability for each label of the training set.
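The prior computation is easy to state outside Orange. A hedged sketch of the smoothed per-label prior following Zhang and Zhou’s formulation, with the same role as the smooth parameter above (the function name and data layout are illustrative, not Orange’s API):

```python
def mlknn_priors(label_sets, labels, smooth=1.0):
    """Smoothed prior P(label present) per label, as in ML-kNN:
    (s + #examples carrying the label) / (s * 2 + #examples)."""
    n = len(label_sets)
    return {
        label: (smooth + sum(1 for ls in label_sets if label in ls))
               / (smooth * 2 + n)
        for label in labels
    }

priors = mlknn_priors([{"a"}, {"a"}, {"a", "b"}, set()], ["a", "b"])
# P(a) = (1 + 3) / (2 + 4) = 2/3 with Laplace smoothing (s = 1)
```

The posterior probabilities computed by compute_cond are counted analogously, over the number of neighbours carrying each label in the training set.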
- class Orange.multilabel.MLkNNClassifier(**argkw)¶
Bases: Orange.multilabel.multiknn.MultikNNClassifier
- __call__(instance, result_type=0)¶
Return type: a list of Orange.data.Value, a list of Orange.statistics.distribution.Distribution, or a tuple with both
Examples¶
The following example demonstrates a straightforward invocation of this algorithm (mlc-classify.py):
emotions = Orange.data.Table('emotions')
learner = Orange.multilabel.MLkNNLearner(k=5)
classifier = learner(emotions)
print classifier(emotions[0])
BR-kNN Learner¶
BR-kNN Classification is an adaptation of the kNN algorithm for multi-label classification that is conceptually equivalent to using the popular Binary Relevance problem transformation method in conjunction with the kNN algorithm. It also implements two extensions of BR-kNN. For more information, see E. Spyromitros, G. Tsoumakas, I. Vlahavas, An Empirical Study of Lazy Multilabel Classification Algorithms, Proc. 5th Hellenic Conference on Artificial Intelligence (SETN 2008), Springer, Syros, Greece, 2008.
- class Orange.multilabel.BRkNNLearner(**argkw)¶
Bases: Orange.multilabel.multiknn.MultikNNLearner
Class implementing the BR-kNN learner.
- k¶
Number of neighbours. If set to 0 (which is also the default value), the square root of the number of instances is used.
- ext¶
Extension type. The default, None, means standard BR-kNN; ‘a’ means predicting the top-ranked label when the prediction set would otherwise be empty; ‘b’ means predicting the top n ranked labels, where n is based on the average label-set size of the neighbours.
- knn¶
Orange.classification.knn.FindNearest for nearest neighbor search
Parameters: instances (Orange.data.Table) – a table of instances.
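The decision rules of the two extensions can be sketched independently of Orange. A minimal illustration following Spyromitros et al.; the function name, the probability dictionary, and the avg_labelset_size argument are assumptions for the sketch, not Orange’s API:

```python
def brknn_predict(label_probs, avg_labelset_size, ext=None):
    """Decision rules for BR-kNN and its two extensions.
    label_probs: {label: probability estimated from the k neighbours}."""
    positive = [l for l, p in label_probs.items() if p > 0.5]
    if ext is None:                      # standard BR-kNN
        return positive
    ranked = sorted(label_probs, key=label_probs.get, reverse=True)
    if ext == "a":                       # BRkNN-a: avoid empty predictions
        return positive if positive else ranked[:1]
    if ext == "b":                       # BRkNN-b: top n by average label-set size
        n = int(round(avg_labelset_size))
        return ranked[:n]
    raise ValueError("ext must be None, 'a' or 'b'")

probs = {"happy": 0.4, "sad": 0.2, "quiet": 0.1}
# Standard BR-kNN predicts nothing here; BRkNN-a falls back to the top label.
```

This shows why the extensions exist: when no label reaches probability 0.5, plain BR-kNN returns an empty label set, which is rarely a useful prediction.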
- class Orange.multilabel.BRkNNClassifier(**argkw)¶
Bases: Orange.multilabel.multiknn.MultikNNClassifier
- __call__(instance, result_type=0)¶
Return type: a list of Orange.data.Value, a list of Orange.statistics.distribution.Distribution, or a tuple with both
- get_labels_a(prob, _neighs=None)¶
Predict labels using the BRkNN-a extension.
Parameters: prob (list of double) – the probabilities of the labels Return type: a list of label values
- get_labels_b(prob, neighs)¶
Predict labels using the BRkNN-b extension.
Parameters: prob (list of double) – the probabilities of the labels Return type: a list of label values
- get_prob(neighbours)¶
Calculate the probabilities of the labels based on the neighbouring instances.
Parameters: neighbours (list of Orange.data.Instance) – a list of nearest neighbouring instances. Return type: a list of label probabilities
Examples¶
The following example demonstrates a straightforward invocation of this algorithm (mlc-classify.py):
emotions = Orange.data.Table('emotions')
learner = Orange.multilabel.BRkNNLearner(k=5)
classifier = learner(emotions)
print classifier(emotions[0])