Multi-target Tree Learner

Standard Orange trees (Orange.classification.tree.TreeLearner) can be used to learn from multi-target data. Only the feature-scoring measure and the node_learner component have to be chosen so that they work on multi-target data domains.

This module provides one such measure, MultitargetVariance, and a helper class, MultiTreeLearner, which extends TreeLearner and differs from it only in its (multi-target) defaults for measure and node_learner.

Examples

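The following example builds prediction models with a multi-target majority learner and with MultiTreeLearner, and uses them to predict (multiple) class values for a given instance (multitarget.py):

import Orange
data = Orange.data.Table('multitarget-synthetic')
print 'Features:', data.domain.features
print 'Classes:', data.domain.class_vars
majority = Orange.classification.majority.MajorityLearner()
mt_majority = Orange.multitarget.MultitargetLearner(majority)  # wrap for multi-target data
c_majority = mt_majority(data)
print 'Majority predictions:\n', c_majority(data[0])
mt_tree = Orange.multitarget.tree.MultiTreeLearner()  # multi-target defaults
c_tree = mt_tree(data)
print 'Multi-target tree predictions:\n', c_tree(data[0])

class Orange.multitarget.tree.MultitargetVariance(weights=None)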

Bases: Orange.feature.scoring.Score

A multi-target score that ranks features based on the average class variance of the subsets.

To compute it, a prototype has to be defined for each subset. Here, it is just the mean vector of class variables. Then the sum of squared distances to the prototypes is computed in each subset. The final score is obtained as the average of subset variances (weighted, to account for subset sizes).

Weights can be passed to the constructor to normalize classes with values of different magnitudes or to increase the importance of some classes. In this case, class values are first scaled according to the given weights.
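
The computation described above can be sketched in plain Python, without Orange. This is only an illustration under stated assumptions: the helper names (scale_rows, subset_variance, weighted_average_variance) are hypothetical, "variance" is taken to be the mean squared distance to the subset's mean vector, and weights are assumed to multiply the class values. How the library converts this quantity into a ranking score is not shown here.

```python
def scale_rows(rows, weights=None):
    """Multiply each class value by its weight (None means all weights are 1)."""
    if weights is None:
        return [list(row) for row in rows]
    return [[v * w for v, w in zip(row, weights)] for row in rows]


def subset_variance(rows):
    """Mean squared Euclidean distance of the rows to their mean vector."""
    n = float(len(rows))
    # The prototype of a subset is the mean vector of its class values.
    prototype = [sum(col) / n for col in zip(*rows)]
    squared = sum((v - p) ** 2 for row in rows for v, p in zip(row, prototype))
    return squared / n


def weighted_average_variance(subsets, weights=None):
    """Average the subset variances, weighting each subset by its size."""
    subsets = [scale_rows(s, weights) for s in subsets if s]
    total = float(sum(len(s) for s in subsets))
    return sum(len(s) / total * subset_variance(s) for s in subsets)


# Class vectors (two targets) grouped by the value of a hypothetical feature.
pure_split = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 4.0], [2.0, 4.0]]]
mixed_split = [[[0.0, 0.0], [2.0, 4.0]], [[0.0, 0.0], [2.0, 4.0]]]

pure = weighted_average_variance(pure_split)
mixed = weighted_average_variance(mixed_split)
mixed_weighted = weighted_average_variance(mixed_split, weights=[1.0, 0.5])
```

In this sketch a split that separates the class vectors cleanly (pure_split) yields a lower value than one that mixes them (mixed_split), and down-weighting the second class shrinks its contribution to the distances.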

__call__(feature, data, apriori_class_distribution=None, weights=0)
Returns: float

__init__(weights=None)
Parameters: weights – Weights of the class variables used when computing distances. If None, all weights are set to 1.

best_threshold(feature, data)

Computes the best threshold for a split of a continuous feature.

Returns: tuple (threshold, score, None)

threshold_function(feature, data, cont_distrib=None, weights=0)

Evaluates possible splits of a continuous feature into a binary one and scores them.

Returns: list of tuples [(threshold, score, None), ...]
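
The two threshold methods above can be illustrated with a small standalone sketch. It is not the library's implementation: the function names (sum_squared, threshold_scores, best_split) are hypothetical, candidate thresholds are assumed to be midpoints between consecutive distinct feature values, and the score is assumed to be the negated size-weighted average variance so that a larger score means a better split. Only the shape of the returned tuples follows the documentation.

```python
def sum_squared(rows):
    """Sum of squared distances of class vectors to their mean vector."""
    n = float(len(rows))
    means = [sum(col) / n for col in zip(*rows)]
    return sum((v - m) ** 2 for row in rows for v, m in zip(row, means))


def threshold_scores(values, targets):
    """Score every candidate binary split of a continuous feature.

    Returns a list of (threshold, score, None) tuples. The score is the
    negated size-weighted average variance (an assumption of this sketch).
    """
    pairs = sorted(zip(values, targets), key=lambda p: p[0])
    n = float(len(pairs))
    results = []
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold fits between equal feature values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        # Size-weighted average of subset variances equals total SS / n.
        score = -(sum_squared(left) + sum_squared(right)) / n
        results.append(((low + high) / 2.0, score, None))
    return results


def best_split(values, targets):
    """Pick the highest-scoring candidate threshold."""
    return max(threshold_scores(values, targets), key=lambda r: r[1])


values = [1.0, 2.0, 3.0, 4.0]
targets = [[0.0, 0.0], [0.0, 0.0], [5.0, 5.0], [5.0, 5.0]]
candidates = threshold_scores(values, targets)
best = best_split(values, targets)
```

With this toy data the split between 2.0 and 3.0 separates the two groups of class vectors exactly, so it receives the highest score among the three candidates.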

class Orange.multitarget.tree.MultiTreeLearner(**kwargs)

Bases: Orange.classification.tree.TreeLearner

MultiTreeLearner is a multi-target version of a tree learner. It is the same as TreeLearner, except for the default values of two parameters:

measure

A multi-target score is used by default: MultitargetVariance.

node_learner

Standard trees use MajorityLearner to construct prediction models in the leaves of the tree. MultiTreeLearner uses the multi-target equivalent which can be obtained simply by wrapping the majority learner:

Orange.multitarget.MultitargetLearner(Orange.classification.majority.MajorityLearner())
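
The wrapping idea can be sketched without Orange as follows. Everything here is hypothetical (the classes MajoritySketch and MultitargetWrapperSketch and their fit/predict methods are not part of the library), and the sketch assumes that the wrapper simply trains one independent single-target model per class variable.

```python
from collections import Counter


class MajoritySketch(object):
    """Hypothetical single-target learner: predicts the most common value."""

    def fit(self, values):
        self.prediction = Counter(values).most_common(1)[0][0]
        return self

    def predict(self):
        return self.prediction


class MultitargetWrapperSketch(object):
    """Hypothetical wrapper: trains one single-target model per class variable."""

    def __init__(self, make_learner):
        self.make_learner = make_learner
        self.models = []

    def fit(self, class_rows):
        # zip(*class_rows) turns rows of class values into one column per class.
        self.models = [self.make_learner().fit(list(col)) for col in zip(*class_rows)]
        return self

    def predict(self):
        # One prediction per class variable, in the order of the columns.
        return [model.predict() for model in self.models]


class_rows = [["a", 1], ["a", 0], ["b", 1]]
wrapper = MultitargetWrapperSketch(MajoritySketch).fit(class_rows)
predictions = wrapper.predict()
```

Because each class variable gets its own model, any single-target learner with the same interface could be substituted for the majority sketch.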

__call__(data, weight=0)
Parameters:
  • data (Orange.data.Table) – Data instances to learn from.
  • weight (int) – Id of meta attribute with weights of instances.
__init__(**kwargs)

The constructor passes all arguments to TreeLearner's constructor, Orange.classification.tree.TreeLearner.__init__.

class Orange.multitarget.tree.MultiTree(base_classifier=None)

Bases: Orange.classification.tree.TreeClassifier

The MultiTree classifier is almost identical to its base class, TreeClassifier. Only the __call__ method is modified so that it works with multi-target data.

__call__(instance, return_type=0)
Parameters:
  • instance (Orange.data.Instance) – Instance to be classified.
  • return_type – One of Orange.classification.Classifier.GetValue, Orange.classification.Classifier.GetProbabilities or Orange.classification.Classifier.GetBoth