Orange Forum • View topic - Area under ROC in multivalued class problems

Area under ROC in multivalued class problems

Post by alm8i » Wed Aug 22, 2012 8:28

Hi,

I am relatively new to Orange, so any help would be appreciated. I noticed that Orange's
default method for computing the area under the ROC curve is ByWeightedPairs. I also
found a Python script in the documentation that helped me understand how the area
under ROC is computed when the testing method is CrossValidation. But I couldn't figure out
how ByWeightedPairs works when the testing method is LeaveOneOut. Each fold
in this testing method contains only one item, so pairing is not possible.
Is ByWeightedPairs not the default method for computing the area under the ROC curve in
this case, or does ByWeightedPairs behave differently when LeaveOneOut testing is used?

Thanks in advance

Re: Area under ROC in multivalued class problems

Post by Anze » Mon Aug 27, 2012 12:22

ByWeightedPairs is used when the class variable has more than two values (as in the iris dataset). AUC is computed for each pair of class values (setosa vs. versicolor, versicolor vs. virginica, and setosa vs. virginica), and a weighted sum of these pairwise AUC values is returned.
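To make the pairwise idea concrete, here is a minimal sketch in plain Python. It is not Orange's implementation. It assumes that each pair is weighted by the product of the two class frequencies and that the weights are normalized, and it scores each pair using only the predicted probability of the first class in the pair. The names `binary_auc` and `auc_by_weighted_pairs` are made up for this illustration.

```python
from itertools import combinations


def binary_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def auc_by_weighted_pairs(true_classes, probabilities, class_values):
    """Weighted average of pairwise AUCs (weights are an assumption, see above)."""
    n = len(true_classes)
    prior = {c: true_classes.count(c) / n for c in class_values}
    weighted_sum = 0.0
    total_weight = 0.0
    for a, b in combinations(class_values, 2):
        # Keep only examples of classes a and b; rank them by P(a).
        pos = [p[a] for t, p in zip(true_classes, probabilities) if t == a]
        neg = [p[a] for t, p in zip(true_classes, probabilities) if t == b]
        if not pos or not neg:
            continue  # this pair cannot be scored
        weight = prior[a] * prior[b]
        weighted_sum += weight * binary_auc(pos, neg)
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("no pair of classes has examples of both classes")
    return weighted_sum / total_weight


# Toy predictions for six examples (made-up numbers, not real iris results).
class_values = ["setosa", "versicolor", "virginica"]
true_classes = ["setosa", "setosa", "versicolor", "versicolor",
                "virginica", "virginica"]
probabilities = [
    {"setosa": 0.8, "versicolor": 0.1, "virginica": 0.1},
    {"setosa": 0.6, "versicolor": 0.3, "virginica": 0.1},
    {"setosa": 0.2, "versicolor": 0.5, "virginica": 0.3},
    {"setosa": 0.1, "versicolor": 0.4, "virginica": 0.5},
    {"setosa": 0.1, "versicolor": 0.45, "virginica": 0.45},
    {"setosa": 0.0, "versicolor": 0.2, "virginica": 0.8},
]
result = auc_by_weighted_pairs(true_classes, probabilities, class_values)
print(result)
```

Note that with a single example there is no pair of classes to compare, which is exactly the problem raised in the question.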

For cross-validation, AUC is computed for each fold independently and the average score is returned (if the dataset has a multivalued class variable, the pairwise method from the first paragraph is applied within each fold). As you correctly noticed, per-fold AUC scores cannot be computed with leave-one-out validation, because each fold contains only one example. In that specific case, Orange combines the results from all folds and computes a single AUC over all of them.
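The difference between the two strategies can be sketched for the simpler two-class case as follows. This is an illustration only, not Orange's code: `binary_auc` and `cross_validated_auc` are hypothetical names, and the sketch falls back to pooling whenever any fold cannot be scored, which is a simplifying assumption.

```python
def binary_auc(labels, scores):
    """AUC for 0/1 labels, or None when one of the classes is missing."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None  # e.g. a leave-one-out fold with a single example
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def cross_validated_auc(folds):
    """Average the per-fold AUCs; if a fold cannot be scored, pool all folds."""
    per_fold = [binary_auc(labels, scores) for labels, scores in folds]
    if all(auc is not None for auc in per_fold):
        return sum(per_fold) / len(per_fold)
    # Fallback: merge the predictions of all folds and compute one AUC.
    all_labels = [l for labels, _ in folds for l in labels]
    all_scores = [s for _, scores in folds for s in scores]
    return binary_auc(all_labels, all_scores)


# Two folds with both classes present: per-fold AUCs are 1.0 and 0.0.
two_folds = [([1, 0], [0.9, 0.1]), ([1, 0], [0.4, 0.6])]
print(cross_validated_auc(two_folds))  # 0.5

# The same predictions as leave-one-out folds: pooled AUC over all four.
loo_folds = [([1], [0.9]), ([0], [0.1]), ([1], [0.4]), ([0], [0.6])]
print(cross_validated_auc(loo_folds))  # 0.75
```

The two numbers differ because averaging per-fold scores only compares examples within the same fold, while pooling also compares examples across folds.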

If you are interested in where this happens in the AUC code, see https://bitbucket.org/biolab/orange/src/c4f002bce47b/Orange/evaluation/scoring.py#cl-1725. When a fold contains only one example, is_CDT_empty(cdts[0]) returns True, and the code below it computes AUC using all of the examples.

I hope this answers your question.

Anže

