Orange Forum • View topic - Data pre-processing

Data pre-processing

Post by romzee » Wed Jan 02, 2013 20:18

IBM SPSS Modeler speeds up data pre-processing considerably through its Auto Data Preparation (ADP) node.
It handles the following tasks:
1. Analyzes the data and identifies fixes.
2. Screens out fields that are problematic or unlikely to be useful.
3. Derives new attributes when appropriate (i.e. the derived attributes replace the old ones).
4. Can be used interactively: you preview the changes before they are made and accept or reject them as desired.
5. Provides a detailed analysis of the attributes, including their predictive power.
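
To make task 2 concrete, here is a minimal plain-Python sketch of what "screening out problematic fields" could look like. It is not the ADP algorithm and it does not use any Orange or SPSS API. The function name `screen_fields`, the thresholds `max_missing` and `max_categories`, and the use of `None` for missing values are all assumptions made for illustration.

```python
# Hypothetical sketch: NOT the SPSS ADP algorithm and NOT an Orange API.
# Assumption: rows are tuples and missing values are represented as None.

def screen_fields(rows, names, max_missing=0.5, max_categories=20):
    """Return (kept, rejected): kept field names and a reason per rejected field."""
    kept, rejected = [], {}
    for i, name in enumerate(names):
        column = [row[i] for row in rows]
        present = [v for v in column if v is not None]
        # Fraction of rows where this field has no value.
        missing_ratio = 1.0 - len(present) / len(column) if column else 1.0
        # Treat a field as categorical when all of its defined values are strings.
        is_categorical = bool(present) and all(isinstance(v, str) for v in present)
        if missing_ratio > max_missing:
            rejected[name] = "too many missing values"
        elif is_categorical and len(set(present)) > max_categories:
            rejected[name] = "too many categories"
        else:
            kept.append(name)
    return kept, rejected
```

Returning a reason for each rejected field, instead of dropping it silently, is what would let a user preview the proposed changes and accept or reject them, as in task 4.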

Does Orange have a similar widget that makes data pre-processing easier?
Is Orange planning such an initiative?

Re: Data pre-processing

Post by Ales » Thu Jan 03, 2013 16:32

romzee wrote: Does Orange have a similar widget that makes data pre-processing easier?
The closest widget would be Purge Domain, but it is very limited: it only removes columns that have a single defined (or used) value.
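
For reference, the behaviour described here (dropping columns that have only one defined value) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the widget's actual implementation: the function name `purge_constant_columns` is hypothetical, rows are plain tuples, and missing values are represented as `None`.

```python
# Hypothetical sketch of the described behaviour; it does not call Orange.
# Assumption: rows are tuples and missing values are represented as None.

def purge_constant_columns(rows, names):
    """Drop every column whose defined (non-missing) values are all identical."""
    keep = []
    for i in range(len(names)):
        defined = set(row[i] for row in rows if row[i] is not None)
        # A column is informative only if it takes at least two distinct values.
        if len(defined) > 1:
            keep.append(i)
    new_names = [names[i] for i in keep]
    new_rows = [tuple(row[i] for i in keep) for row in rows]
    return new_rows, new_names
```

Note that a column with only missing values, or with one value plus missing entries, is also dropped by this sketch, because it never shows more than one defined value.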

romzee wrote: Is Orange planning such an initiative?

Yes. Are you interested in implementing it?

Re: Data pre-processing

Post by romzee » Fri Jan 04, 2013 8:24

Yes, I would like to take up the project.
I will provide a list of deliverables and a weekly plan for the project.
I hope to continue in this thread.

Re: Data pre-processing

Post by romzee » Sat Jan 12, 2013 11:30

Here's a high-level list of deliverables for the intended widget:
(It is basically a combined extension of widgets such as Purge Domain, Rank and Attribute Statistics.)

1. Gives a detailed analysis of the current set of attributes.
2. Gives a set of transformed attributes (improving the predictive power) along with their detailed analysis.
3. Allows you to preview the transformed attributes.
4. Suggests the best set of attributes (like Rank).
5. Provides a user-friendly UI to filter out the unwanted attributes and replace them with transformed attributes.
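
As a rough illustration of deliverable 4, the sketch below scores categorical attributes by information gain against a class label and sorts them, best first. It is plain Python and only one possible scoring measure; it is not the Rank widget's code, and the names `entropy`, `information_gain` and `rank_attributes` are hypothetical helpers introduced here.

```python
# Hypothetical sketch: ranks categorical attributes by information gain.
# It does not use Orange; all function names are assumptions for illustration.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in class entropy obtained by splitting on an attribute's values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the class labels within each attribute value.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_attributes(rows, names, labels):
    """Return (name, score) pairs sorted by decreasing information gain."""
    scores = [(name, information_gain([row[i] for row in rows], labels))
              for i, name in enumerate(names)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

A call such as `rank_attributes(rows, ["outlook", "temp"], labels)` returns the attribute names with their scores, so a UI could show the ranking and let the user choose how many attributes to keep.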
