Orange Forum • View topic - How to compare classifiers across different data sets?


How to compare classifiers across different data sets?

Post by bricklemacho » Thu Oct 03, 2013 14:38

I have been using AUC as the primary measure to rank classifiers. When the classifiers are evaluated on the same data set, I use Orange.evaluation.scoring.McNemar_of_two() to determine whether the difference between them is significant.
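For readers unfamiliar with McNemar's test, here is a minimal from-scratch sketch of the textbook statistic (with continuity correction) for two classifiers scored on the same test instances. It is not Orange's McNemar_of_two() implementation, which may differ in detail; the function name `mcnemar` and the toy predictions are made up for illustration.

```python
import math

def mcnemar(y_true, pred_a, pred_b):
    """Textbook McNemar test with continuity correction (illustrative helper,
    not part of Orange). Returns (statistic, p_value)."""
    triples = list(zip(y_true, pred_a, pred_b))
    # b: classifier A correct, classifier B wrong
    b = sum(1 for t, a, p in triples if a == t and p != t)
    # c: classifier A wrong, classifier B correct
    c = sum(1 for t, a, p in triples if a != t and p == t)
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree
    stat = (abs(b - c) - 1) ** 2 / float(b + c)
    # Survival function of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Toy data (invented): 20 instances, all of class 1
y_true = [1] * 20
pred_a = [0, 0] + [1] * 18            # A is wrong on the first 2 instances
pred_b = [1, 1] + [0] * 10 + [1] * 8  # B is wrong on the next 10 instances

stat, p_value = mcnemar(y_true, pred_a, pred_b)
print("McNemar statistic = %.3f, p = %.4f" % (stat, p_value))
```

Only the instances on which the two classifiers disagree enter the statistic, which is why the test needs both classifiers to be scored on the same instances.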

I have two data sets with different features, and a third which is a combination of the other two. I think ranking the classifiers using AUC is still fine (need to find a mathematician to talk to).

How can I determine if the difference between classifiers, across different data sets, is significant?

Re: How to compare classifiers across different data sets?

Post by bricklemacho » Thu Oct 03, 2013 14:47

A couple of papers claim that the Wilcoxon signed-rank test is reasonable. The method Orange.evaluation.scoring.AUCWilcoxon() looks close. To perform the test, is it just a matter of collating the values from AUCWilcoxon() for each iteration (10-fold cross-validation) for each classifier, and then using scipy.stats.wilcoxon() to obtain the p-value?
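A minimal sketch of what that could look like on the SciPy side, assuming the per-fold AUC values have already been collected into two lists. The numbers below are invented, not real results. Note that scipy.stats.wilcoxon() takes two equal-length sequences and treats them as paired, so this assumes the scores can be meaningfully matched fold by fold.

```python
from scipy.stats import wilcoxon

# Hypothetical per-fold AUC values from 10-fold cross-validation
# (made-up numbers, one entry per fold, same fold order in both lists)
auc_a = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85, 0.80, 0.79]
auc_b = [0.76, 0.77, 0.80, 0.74, 0.80, 0.77, 0.75, 0.77, 0.71, 0.69]

# Two-sided Wilcoxon signed-rank test on the paired differences
statistic, p_value = wilcoxon(auc_a, auc_b)
print("W = %.1f, p = %.4f" % (statistic, p_value))
```

With only 10 folds the test has limited power, and folds from one cross-validation run are not fully independent, so the p-value is best read as a rough indication.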

Re: How to compare classifiers across different data sets?

Post by bricklemacho » Fri Oct 04, 2013 8:31

After discussion with the resident "math" guru in our lab, here is what I am doing. Hopefully it will be of some help to others, should they need to compare classifiers from different data sets (populations).

1. Use Orange.evaluation.scoring.AUCWilcoxon() for each classifier. As I understand this function, it produces a Wilcoxon statistic, that is, a value that can be used in the Wilcoxon signed-rank test.

2. Given two Wilcoxon statistics, I can now use scipy.stats.wilcoxon() to obtain the p-value.
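As background for step 1, the usual connection between AUC and the Wilcoxon statistic is that AUC equals the rank-sum (Mann-Whitney U) statistic of the positive instances divided by the number of positive-negative pairs. The sketch below computes AUC that way from scratch. It is only an illustration of that relationship, not Orange's AUCWilcoxon() code, and the helper name `auc_from_ranks` and the toy scores are made up. It is worth checking the Orange documentation for what AUCWilcoxon() actually returns before passing its output to another test.

```python
def auc_from_ranks(scores, labels):
    """AUC as the normalised rank-sum (Mann-Whitney U) statistic.
    labels: 1 = positive, 0 = negative; higher score = more likely positive.
    Illustrative helper; assumes at least one positive and one negative."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores so they share an average rank
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    # U counts the (positive, negative) pairs ranked in the right order
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / float(n_pos * n_neg)

# Toy example (invented scores): 3 positives, 3 negatives
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
auc = auc_from_ranks(scores, labels)
print("AUC = %.4f" % auc)
```

In the toy example, 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9.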

Michael.
--

Note: Posting as a separate message in case others are tracking the thread, as I am not sure whether "edits" trigger emails etc.

