This is documentation for Orange 2.7. For the latest documentation, see Orange 3.

Sampling and Testing (testing)

Module Orange.evaluation.testing contains methods for cross-validation, leave-one-out testing, random sampling and learning curves. These procedures split the data into training and testing sets and use the training data to induce models; the models then make predictions for the testing data. Predictions are collected in ExperimentResults, together with the actual classes and some other data, which can then be passed to the scoring functions in Orange.evaluation.scoring to compute performance scores of the models.

import Orange

iris = Orange.data.Table("iris")
learners = [Orange.classification.bayes.NaiveLearner(),
            Orange.classification.majority.MajorityLearner()]

cv = Orange.evaluation.testing.cross_validation(learners, iris, folds=5)
print ["%.4f" % score for score in Orange.evaluation.scoring.CA(cv)]

The following call makes 100 iterations of a 70:30 proportion test and stores all the induced classifiers.

res = Orange.evaluation.testing.proportion_test(learners, iris, 0.7, 100, store_classifiers=True)
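The idea behind a proportion test is simple: on each iteration, draw a fresh random split of the data into training and testing parts. The sampling step can be sketched with the standard library alone (an illustration of the concept, not Orange's internal sampler):

```python
import random

def proportion_split(n_examples, learning_proportion, rng):
    """Return (train_positions, test_positions) for one random split."""
    positions = list(range(n_examples))
    rng.shuffle(positions)
    cut = int(n_examples * learning_proportion)
    return positions[:cut], positions[cut:]

rng = random.Random(0)
train, test = proportion_split(150, 0.7, rng)
assert len(train) == 105 and len(test) == 45
assert not set(train) & set(test)  # train and test never overlap
```

Calling proportion_split repeatedly with the same shared rng yields a different split on each call, which is what the times argument iterates over.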

Different evaluation techniques are implemented as instance methods of Evaluation class. For ease of use, an instance of this class is created at module loading time and instance methods are exposed as functions in Orange.evaluation.testing.

Randomness in tests

If an evaluation method uses random sampling, the parameter random_generator can be used to provide either a random seed or an instance of Random. If omitted, a new random generator with seed 0 is constructed for each call of the method.

Note

Running the same script twice will generally give the same results.

To conduct a repeatable set of experiments, construct an instance of Random and pass it to all of them. This way, each method will use different random numbers, but those numbers will be the same on every run of the script.

For truly random results, set the seed to a number generated with Python's own random generator. Since Python's generator is seeded with the current system time each time Python is loaded, the results of the script will differ on every run.

class Orange.evaluation.testing.Evaluation

Common methods for learner evaluation.

cross_validation(learners, examples, folds=10, stratified=StratifiedIfPossible, preprocessors=(), random_generator=0, callback=None, store_classifiers=False, store_examples=False)

Cross validation test with specified number of folds.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • folds – number of folds
  • stratified – tells whether to stratify the sampling
  • preprocessors – a list of preprocessors to be used on data (obsolete)
  • random_generator – random seed or generator (see above)
  • callback – a function that is called after finishing each fold
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults

leave_one_out(learners, examples, preprocessors=(), callback=None, store_classifiers=False, store_examples=False)

Leave-one-out evaluation of learning algorithms.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • preprocessors – a list of preprocessors (obsolete)
  • callback – a function that is called after finishing each fold
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults
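Leave-one-out is equivalent to cross-validation with as many folds as there are examples: each example forms its own single-element test fold. That fold assignment can be sketched as follows (illustrative only, not Orange's internal code):

```python
def loo_fold_indices(n_examples):
    """Fold index for each example: example i is tested in fold i."""
    return list(range(n_examples))

indices = loo_fold_indices(5)
# Fold 2 tests on example 2 and trains on all the others.
test = [i for i, f in enumerate(indices) if f == 2]
train = [i for i, f in enumerate(indices) if f != 2]
assert test == [2]
assert train == [0, 1, 3, 4]
```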

proportion_test(learners, examples, learning_proportion=0.7, times=10, stratification=StratifiedIfPossible, preprocessors=(), random_generator=0, callback=None, store_classifiers=False, store_examples=False)

Iteratively split the data into training and testing sets, and train and test the learning algorithms.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • learning_proportion – proportion of data used for training
  • times – number of iterations
  • stratification – use stratified sampling
  • preprocessors – a list of preprocessors (obsolete)
  • random_generator – random seed or generator (see above)
  • callback – a function that is called after each fold
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults

test_with_indices(learners, examples, indices, preprocessors=(), callback=None, store_classifiers=False, store_examples=False)

Perform a cross-validation-like test. Examples for each fold are selected based on given indices.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • indices – a list of integer indices that sort examples into folds; each index corresponds to an example from examples
  • preprocessors – a list of preprocessors (obsolete)
  • callback – a function that is called after each fold
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults
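The indices argument assigns each example to a fold. A minimal sketch of how such a list can be built and consumed (using only the standard library, as an illustration rather than Orange's actual sampler):

```python
import random

def make_fold_indices(n_examples, folds, rng):
    """Assign each example a fold number 0..folds-1, balanced in size."""
    indices = [i % folds for i in range(n_examples)]
    rng.shuffle(indices)
    return indices

def fold_split(indices, fold):
    """Positions tested in the given fold, and positions trained on."""
    test = [i for i, f in enumerate(indices) if f == fold]
    train = [i for i, f in enumerate(indices) if f != fold]
    return train, test

indices = make_fold_indices(10, 5, random.Random(0))
train, test = fold_split(indices, 0)
assert len(test) == 2 and len(train) == 8
```

Passing a hand-built list like this to test_with_indices is also how custom sampling schemes (e.g. grouped or blocked folds) can be evaluated with the same machinery.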

one_fold_with_indices(learners, examples, fold, indices, preprocessors=(), weight=0)

Similar to test_with_indices, except that it performs a single fold of cross-validation, given by the argument fold.

learn_and_test_on_learn_data(learners, examples, preprocessors=(), callback=None, store_classifiers=False, store_examples=False)

Train learning algorithms and test them on the same data.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • preprocessors – a list of preprocessors (obsolete)
  • callback – a function that is called after each learning
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults

learn_and_test_on_test_data(learners, learn_set, test_set, preprocessors=(), callback=None, store_classifiers=False, store_examples=False)

Train learning algorithms on one data set and test them on another.

Parameters:
  • learners – list of learning algorithms
  • learn_set – training instances
  • test_set – testing instances
  • preprocessors – a list of preprocessors (obsolete)
  • callback – a function that is called after each learning
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

ExperimentResults

learning_curve(learners, examples, cv_indices=None, proportion_indices=None, proportions=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0], preprocessors=(), random_generator=0, callback=None)

Compute a learning curve using multiple cross-validations where models are trained on different portions of the training data.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • cv_indices – indices used for cross validation (leave None for 10-fold CV)
  • proportion_indices – indices for proportion selection (leave None to let the function construct the folds)
  • proportions – list of proportions of data used for training
  • preprocessors – a list of preprocessors (obsolete)
  • random_generator – random seed or generator (see above)
  • callback – a function that is called after each learning
Returns:

list of ExperimentResults
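At each proportion p, roughly p times the available training examples are used to induce the models. One plausible way to compute the per-proportion training sizes (an illustrative calculation, not necessarily Orange's exact rounding rule):

```python
def training_sizes(n_train, proportions):
    """Approximate number of training examples used at each proportion."""
    return [int(round(p * n_train)) for p in proportions]

# With 10-fold CV on 150 instances, each fold trains on 135 examples.
sizes = training_sizes(135, [0.1, 0.5, 1.0])
assert sizes == [14, 68, 135]
```

Plotting a score against these sizes, one curve per learner, gives the usual learning-curve picture.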

learning_curve_n(learners, examples, folds=10, proportions=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0], stratification=StratifiedIfPossible, preprocessors=(), random_generator=0, callback=None)

Compute a learning curve using multiple cross-validations where models are trained on different portions of the training data. Similar to learning_curve except for simpler arguments.

Parameters:
  • learners – list of learning algorithms
  • examples – data instances used for training and testing
  • folds – number of folds for cross-validation
  • proportions – list of proportions of data used for training
  • stratification – use stratified sampling
  • preprocessors – a list of preprocessors (obsolete)
  • random_generator – random seed or generator (see above)
  • callback – a function that is called after each learning
Returns:

list of ExperimentResults

learning_curve_with_test_data(learners, learn_set, test_set, times=10, proportions=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0], stratification=StratifiedIfPossible, preprocessors=(), random_generator=0, store_classifiers=False, store_examples=False)

Compute a learning curve given two datasets. Models are learned on proportion of the first dataset and then tested on the second.

Parameters:
  • learners – list of learning algorithms
  • learn_set – training data
  • test_set – testing data
  • times – number of iterations
  • stratification – use stratified sampling
  • proportions – a list of proportions of training data to be used
  • preprocessors – a list of preprocessors (obsolete)
  • random_generator – random seed or generator (see above)
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results
Returns:

list of ExperimentResults

test_on_data(classifiers, examples, store_classifiers=False, store_examples=False)

Test classifiers on the given data.

Parameters:
  • classifiers – a list of classifiers
  • examples – testing data
  • store_classifiers – if True, classifiers are stored in results
  • store_examples – if True, examples are stored in results