This is documentation for Orange 2.7. For the latest documentation, see Orange 3.

Sampling and Testing (testing)

Module Orange.evaluation.testing contains methods for cross-validation, leave-one-out testing, random sampling and learning curves. These procedures split the data into training and testing sets and use the training data to induce models; the models then make predictions for the testing data. Predictions are collected in ExperimentResults, together with the actual classes and some other data, which can then be passed to the scoring functions in Orange.evaluation.scoring to compute performance scores of the models.

import Orange

iris = Orange.data.Table("iris")
learners = [Orange.classification.bayes.NaiveLearner(),
            Orange.classification.majority.MajorityLearner()]

cv = Orange.evaluation.testing.cross_validation(learners, iris, folds=5)
print ["%.4f" % score for score in Orange.evaluation.scoring.CA(cv)]

The following call makes 100 iterations of a 70:30 proportion test and stores all the induced classifiers.

res = Orange.evaluation.testing.proportion_test(learners, iris, 0.7, 100, store_classifiers=True)
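The idea behind a proportion test is simple: on each iteration, draw a fresh random split of the data into training and testing parts. The sampling step can be sketched with the standard library alone (an illustration of the concept, not Orange's internal sampler):

```python
import random

def proportion_split(n_examples, learning_proportion, rng):
    """Return (train_positions, test_positions) for one random split."""
    positions = list(range(n_examples))
    rng.shuffle(positions)
    cut = int(n_examples * learning_proportion)
    return positions[:cut], positions[cut:]

rng = random.Random(0)
train, test = proportion_split(150, 0.7, rng)
assert len(train) == 105 and len(test) == 45
assert not set(train) & set(test)  # train and test never overlap
```

Calling proportion_split repeatedly with the same shared rng yields a different split on each call, which is what the times argument iterates over.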

Different evaluation techniques are implemented as instance methods of Evaluation class. For ease of use, an instance of this class is created at module loading time and instance methods are exposed as functions in Orange.evaluation.testing.

Randomness in tests

If an evaluation method uses random sampling, the parameter random_generator can be used to provide either a random seed or an instance of Random. If omitted, a new random generator with seed 0 is constructed for each call of the method.

Note

Running the same script twice will generally give the same results.

To conduct a repeatable set of experiments, construct an instance of Random and pass it to all of them. This way, each method will use different random numbers, but those numbers will be the same on every run of the script.

For truly random results, set the seed to a number generated with Python's own random generator. Since Python's generator is seeded with the current system time each time Python is loaded, the results of the script will differ on every run.

class Orange.evaluation.testing.Evaluation

Common methods for learner evaluation.

cross_validation(learners, examples, folds=10, stratified=StratifiedIfPossible, preprocessors=(), random_generator=0, callback=None, store_classifiers=False, store_examples=False)

Cross validation test with specified number of folds.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • folds – number of folds
  • stratified – tells whether to stratify the sampling
  • preprocessors – a list of preprocessors to be used on data (obsolete)
  • random_generator – random seed or generator (see above)
  • callback – a function that is called after finishing each fold
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults

leave_one_out(learners, examples, preprocessors=(), callback=None, store_classifiers=False, store_examples=False)

Leave-one-out evaluation of learning algorithms.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • preprocessors – a list of preprocessors (obsolete)
  • callback – a function that is called after finishing each fold
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults
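Leave-one-out is equivalent to cross-validation with as many folds as there are examples: each example forms its own single-element test fold. That fold assignment can be sketched as follows (illustrative only, not Orange's internal code):

```python
def loo_fold_indices(n_examples):
    """Fold index for each example: example i is tested in fold i."""
    return list(range(n_examples))

indices = loo_fold_indices(5)
# Fold 2 tests on example 2 and trains on all the others.
test = [i for i, f in enumerate(indices) if f == 2]
train = [i for i, f in enumerate(indices) if f != 2]
assert test == [2]
assert train == [0, 1, 3, 4]
```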

proportion_test(learners, examples, learning_proportion=0.7, times=10, stratification=StratifiedIfPossible, preprocessors=(), random_generator=0, callback=None, store_classifiers=False, store_examples=False)

Iteratively split the data into training and testing sets, and train and test the learning algorithms.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • learning_proportion – proportion of data used for training
  • times – number of iterations
  • stratification – use stratified sampling
  • preprocessors – a list of preprocessors (obsolete)
  • random_generator – random seed or generator (see above)
  • callback – a function that is called after each fold
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults

test_with_indices(learners, examples, indices, preprocessors=(), callback=None, store_classifiers=False, store_examples=False)

Perform a cross-validation-like test. Examples for each fold are selected based on given indices.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • indices – a list of integer indices that sort examples into folds; each index corresponds to an example from examples
  • preprocessors – a list of preprocessors (obsolete)
  • callback – a function that is called after each fold
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults
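The indices argument assigns each example to a fold. A minimal sketch of how such a list can be built and consumed (using only the standard library, as an illustration rather than Orange's actual sampler):

```python
import random

def make_fold_indices(n_examples, folds, rng):
    """Assign each example a fold number 0..folds-1, balanced in size."""
    indices = [i % folds for i in range(n_examples)]
    rng.shuffle(indices)
    return indices

def fold_split(indices, fold):
    """Positions tested in the given fold, and positions trained on."""
    test = [i for i, f in enumerate(indices) if f == fold]
    train = [i for i, f in enumerate(indices) if f != fold]
    return train, test

indices = make_fold_indices(10, 5, random.Random(0))
train, test = fold_split(indices, 0)
assert len(test) == 2 and len(train) == 8
```

Passing a hand-built list like this to test_with_indices is also how custom sampling schemes (e.g. grouped or blocked folds) can be evaluated with the same machinery.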

one_fold_with_indices(learners, examples, fold, indices, preprocessors=(), weight=0)

Similar to test_with_indices, except that it performs a single fold of cross-validation, given by the argument fold.

learn_and_test_on_learn_data(learners, examples, preprocessors=(), callback=None, store_classifiers=False, store_examples=False)

Train learning algorithms and test them on the same data.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • preprocessors – a list of preprocessors (obsolete)
  • callback – a function that is called after each learning
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults

learn_and_test_on_test_data(learners, learn_set, test_set, preprocessors=(), callback=None, store_classifiers=False, store_examples=False)

Train learning algorithms on one data set and test them on another.

Parameters:
  • learners – list of learning algorithms
  • learn_set – training instances
  • test_set – testing instances
  • preprocessors – a list of preprocessors (obsolete)
  • callback – a function that is called after each learning
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults

learning_curve(learners, examples, cv_indices=None, proportion_indices=None, proportions=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0], preprocessors=(), random_generator=0, callback=None)

Compute a learning curve using multiple cross-validations where models are trained on different portions of the training data.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • cv_indices – indices used for cross validation (leave None for 10-fold CV)
  • proportion_indices – indices for proportion selection (leave None to let the function construct the folds)
  • proportions – list of proportions of data used for training
  • preprocessors – a list of preprocessors (obsolete)
  • random_generator – random seed or generator (see above)
  • callback – a function that is called after each learning
Returns:

list of ExperimentResults
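At each proportion p, roughly p times the available training examples are used to induce the models. One plausible way to compute the per-proportion training sizes (an illustrative calculation, not necessarily Orange's exact rounding rule):

```python
def training_sizes(n_train, proportions):
    """Approximate number of training examples used at each proportion."""
    return [int(round(p * n_train)) for p in proportions]

# With 10-fold CV on 150 instances, each fold trains on 135 examples.
sizes = training_sizes(135, [0.1, 0.5, 1.0])
assert sizes == [14, 68, 135]
```

Plotting a score against these sizes, one curve per learner, gives the usual learning-curve picture.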

learning_curve_n(learners, examples, folds=10, proportions=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0], stratification=StratifiedIfPossible, preprocessors=(), random_generator=0, callback=None)

Compute a learning curve using multiple cross-validations where models are trained on different portions of the training data. Similar to learning_curve except for simpler arguments.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • folds – number of folds for cross-validation
  • proportions – list of proportions of data used for training
  • stratification – use stratified sampling
  • preprocessors – a list of preprocessors (obsolete)
  • random_generator – random seed or generator (see above)
  • callback – a function that is called after each learning
Returns:

list of ExperimentResults

learning_curve_with_test_data(learners, learn_set, test_set, times=10, proportions=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0], stratification=StratifiedIfPossible, preprocessors=(), random_generator=0, store_classifiers=False, store_examples=False)

Compute a learning curve given two datasets. Models are learned on proportion of the first dataset and then tested on the second.

Parameters:
  • learners – list of learning algorithms
  • learn_set – training data
  • test_set – testing data
  • times – number of iterations
  • stratification – use stratified sampling
  • proportions – a list of proportions of training data to be used
  • preprocessors – a list of preprocessors (obsolete)
  • random_generator – random seed or generator (see above)
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

list of ExperimentResults

test_on_data(classifiers, examples, store_classifiers=False, store_examples=False)

Test classifiers on the given data.

Parameters:
  • classifiers – a list of classifiers
  • examples – testing data
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results