Orange Forum • View topic - Canvas vs scripting feature ranking

Canvas vs scripting feature ranking

Post by mway » Wed Mar 27, 2013 15:41

I'm trying to reproduce, in a Python script, the feature rankings that I get from the Rank widget in Orange Canvas. Specifically, I want to rank the features by gain ratio. It seems that gain ratio can only be calculated for discrete attributes, so if I do

import Orange
# Score every feature in the domain by gain ratio; data is an Orange data table.
gainRatio = Orange.feature.scoring.GainRatio()
scoredFeatures = [(feature.name, gainRatio(feature, data)) for feature in data.domain.features]

where data is just an Orange data table, I get an error because some of my features are continuous. I've tried discretizing the domain with orange.DiscretizationEntropy and then running the code above, but the scores don't match the ones I get in Orange Canvas. I do get the same numbers if I apply the same discretization in Canvas before ranking, but I haven't been able to reproduce the gain ratio scores I get when I connect the data table directly to the Rank widget. So my question is: which discretization method does the Rank widget apply before scoring in that case, and how can I reproduce those scores in a script?

Thanks!!

Re: Canvas vs scripting feature ranking

Post by Ales » Thu Mar 28, 2013 11:29

The Rank widget uses equal-frequency discretization for score functions that don't support continuous features.
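
To make the idea concrete, here is a minimal plain-Python sketch of equal-frequency discretization followed by a gain ratio score. It does not use the Orange API and is not the widget's actual code: the function names (equal_freq_cutoffs, discretize, gain_ratio), the choice to place each cut halfway between neighbouring values, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def equal_freq_cutoffs(values, n_bins):
    """Cut points that split the sorted values into n_bins groups of roughly equal size."""
    ordered = sorted(values)
    cutoffs = []
    for k in range(1, n_bins):
        i = k * len(ordered) // n_bins
        if i == 0:
            continue  # fewer values than bins: nothing to cut here
        # Assumption: place the cut halfway between the two neighbouring values.
        cut = (ordered[i - 1] + ordered[i]) / 2.0
        if not cutoffs or cut > cutoffs[-1]:
            cutoffs.append(cut)  # skip duplicate cuts caused by tied values
    return cutoffs


def discretize(value, cutoffs):
    """Index of the interval that value falls into."""
    return sum(value > c for c in cutoffs)


def entropy(labels):
    """Shannon entropy (base 2) of a list of labels."""
    total = float(len(labels))
    return -sum((c / total) * math.log(c / total, 2) for c in Counter(labels).values())


def gain_ratio(bins, classes):
    """Information gain of the class given the bins, divided by the entropy of the bins."""
    total = float(len(classes))
    conditional = 0.0
    for b in set(bins):
        subset = [c for x, c in zip(bins, classes) if x == b]
        conditional += len(subset) / total * entropy(subset)
    split_info = entropy(bins)
    if split_info == 0:
        return 0.0  # a single interval carries no information
    return (entropy(classes) - conditional) / split_info


# Toy data (made up): one continuous feature and a binary class.
values = [1, 2, 3, 4, 5, 6, 7, 8]
classes = ["a", "a", "a", "a", "b", "b", "b", "b"]

for n_bins in (2, 4):
    cutoffs = equal_freq_cutoffs(values, n_bins)
    bins = [discretize(v, cutoffs) for v in values]
    print(n_bins, cutoffs, gain_ratio(bins, classes))
```

Note that the score depends on the number of intervals as well as on the discretization method, so a script has to match both to reproduce the widget's numbers. The number of intervals the widget uses is not stated in this thread.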

Re: Canvas vs scripting feature ranking

Post by mway » Fri Mar 29, 2013 19:50

Perfect, thank you!

