Orange Forum • View topic - data normalisation

data normalisation



Post by jon » Tue Jan 12, 2010 11:06

Hi

I'm relatively new to Orange, but enjoying this well-integrated system.

I have a question about scaling data.

I've trained an SVM and a Random Forest classifier on a bioinformatics dataset.

The parameters I got for the SVM were obtained with
orngSVM.SVMLearnerEasy, which by default scales the data with the _normalize function.

Training a new SVMLearner with these parameters also seems to have normalisation set by default.


If I cPickle the trained SVM, reload it and test it on some previously tested examples,
I get the same result, so it looks like the normalisation parameters are somehow stored in the classifier and scale any new examples appropriately,
which is great.
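
To make that picture concrete, here is a minimal sketch in plain Python (not Orange code). It assumes min-max scaling to [0, 1] and uses a 1-nearest-neighbour lookup as a stand-in for the SVM; the names fit, scale and predict are made up for illustration and are not part of orngSVM. The point is only that the ranges learned from the training data travel with the pickled model and are applied to new examples.

```python
import pickle

def scale(model, row):
    # Map each feature to [0, 1] using the *training* range stored in the model.
    # Constant features (hi == lo) are mapped to 0.0 to avoid division by zero.
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, model["mins"], model["maxs"])]

def fit(rows, labels):
    # Learn per-feature min and max from the training rows and keep them in the model.
    columns = list(zip(*rows))
    model = {"mins": [min(c) for c in columns],
             "maxs": [max(c) for c in columns],
             "labels": list(labels)}
    # Store the scaled training points for the nearest-neighbour lookup.
    model["points"] = [scale(model, r) for r in rows]
    return model

def predict(model, row):
    # Scale the new example with the stored parameters, then return the
    # label of the closest training point (squared Euclidean distance).
    x = scale(model, row)
    dists = [sum((a - b) ** 2 for a, b in zip(x, p)) for p in model["points"]]
    return model["labels"][dists.index(min(dists))]

train = [[1.0, 200.0], [2.0, 180.0], [8.0, 20.0], [9.0, 40.0]]
labels = ["a", "a", "b", "b"]
model = fit(train, labels)

# Round-trip through pickle: the scaling parameters come back with the model.
restored = pickle.loads(pickle.dumps(model))
new_example = [3.0, 150.0]
same = predict(model, new_example) == predict(restored, new_example)
```

If the real classifier behaves like this, unscaled examples can be passed straight to the reloaded object, which matches the identical results seen before and after pickling.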


I'm just using the default parameters for the random forest, which regardless gives a similar level of performance to the optimised SVM,
although it doesn't appear to scale the data by default.
(On the subject of Random Forests, does anyone have experience of which parameters are most important to optimise? Is scaling useful here?)
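
On the scaling part of that question, a small plain-Python sketch (again not Orange code) may help. It assumes the trees split continuous features on a threshold, which is the usual design but is an assumption about Orange's random forest here; best_split is a made-up helper. Under that assumption, rescaling a feature moves the threshold but leaves the resulting partition of the examples unchanged.

```python
def best_split(values, labels):
    """Return the best threshold split as a tuple of booleans (value <= threshold)."""
    order = sorted(set(values))
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(order, order[1:]):
        threshold = (lo + hi) / 2.0
        left = tuple(v <= threshold for v in values)
        # Count misclassifications when each side predicts its majority label.
        errors = 0
        for side in (True, False):
            side_labels = [l for l, s in zip(labels, left) if s == side]
            majority = max(set(side_labels), key=side_labels.count)
            errors += sum(1 for l in side_labels if l != majority)
        # Keep the first split with the fewest errors.
        if best is None or errors < best[0]:
            best = (errors, left)
    return best[1]

raw = [0.5, 1.2, 3.4, 7.8, 9.1, 12.0]
labels = [0, 0, 0, 1, 1, 1]

# Min-max scale the same feature to [0, 1].
lo, hi = min(raw), max(raw)
scaled = [(v - lo) / (hi - lo) for v in raw]

partition_raw = best_split(raw, labels)
partition_scaled = best_split(scaled, labels)
```

Both calls select the same examples for the left branch, which would explain why the forest does well without any normalisation step.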

Basically, I just wanted to check whether my picture of what is happening with regard to scaling is correct
before writing this up.


thanks

jon
