Orange Forum • View topic - Participating in GSoC 2011: Time series analysis tool

Participating in GSoC 2011: Time series analysis tool

Post by pivithuru » Mon Mar 21, 2011 16:28

Hello,
I am Pivithuru Wijegunawardana, a final-year undergraduate in Computer Science and Engineering at the University of Moratuwa. I am interested in the project idea "Time series analysis tool" for Orange.
I have a sound background in data analysis, including time series analysis. Last summer I developed a statistical data analysis tool[1] for OpenOffice in the OpenOffice summer internship program 2010[2]. I have also developed a time series forecasting application in Java that includes several time series analysis methods such as ARIMA, regression and moving average.
I have very good knowledge of Java programming, but I am not very familiar with Python. Still, given my programming experience and my understanding of time series analysis, I am sure that catching up with Python will not be a big obstacle to completing the project.
It would be really helpful to have your suggestions regarding the project and the level of Python it requires.

[1] http://wiki.services.openoffice.org/wik ... lysis_Tool
[2] http://wiki.services.openoffice.org/wik ... plications

Thank you
Regards,
Pivithuru

Re: Participating in GSoC 2011: Time series analysis tool

Post by Mitar » Mon Mar 21, 2011 23:17

After some soul-searching we have sadly discovered that we do not have a suitable mentor for this idea, as we have not yet done much in this field. Please check the ideas page to see whether something else interests you, or propose something yourself.

Re: Participating in GSoC 2011: Time series analysis tool

Post by Mitar » Wed Mar 23, 2011 13:26

After even more soul-searching I have decided that I will try to be a mentor for this, but you will have to be more or less independent in researching time-series algorithms. I will help you with integration into Orange and with general guidance. So if this is OK with you, then check the ideas page again.

What is important is to find a way to integrate this with the rest of Orange, that is, to implement various feature extraction algorithms whose output can then be analyzed with existing tools. Do you have experience with this?

Re: Participating in GSoC 2011: Time series analysis tool

Post by wvanlint » Fri Mar 25, 2011 15:57

Hi,

I am Willem Van Lint, currently an engineering and computer science student in the first year of my master's degree at the Catholic University of Leuven.
I would also be interested in doing the project for time series analysis.

I am experienced in Java and Python, having used those languages in previous internships at my university and at CERN, Geneva, respectively.
Throughout my studies, I have had courses on machine learning, data mining and signal processing.
In my machine learning course, we focused mostly on attribute value learning but I am sure I can research the algorithms used for time series learning.
I also have a general understanding of feature extraction (again, mostly applied to attribute value problems, I've seen Component Analysis, Multidimensional Scaling, Self Organizing Maps and the basis of supervised methods).
After reading the proposal, I began researching further, and my background in signal processing might be useful, since I am comfortable with Fourier transforms and frequency-domain analysis.

I also have questions regarding my schedule and the application.

Since I study in Belgium, I have exams throughout June.
My summer holiday lasts from the beginning of July until the end of September.
Therefore, I could either start working sooner or start later and work longer.
Personally, I would like to start later but this would conflict with evaluations.
So I am willing to do as much as I can earlier on.
Do you have any suggestions regarding this?

For the project proposal, do I need to have thought out the algorithms/implementation structure or should I just mention what features I want to implement?

Best regards,

Willem Van Lint

Re: Participating in GSoC 2011: Time series analysis tool

Post by Mitar » Fri Mar 25, 2011 19:01

Orange, and we ourselves, currently do mostly attribute-value learning too. That is why we would like (at least for now) to concentrate mostly on techniques that translate time series into features to which we can apply our existing tools.
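
To make "translating time series into features" concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not Orange code: the function name extract_features, the sample_rate parameter and the choice of four features (mean, standard deviation, linear trend slope, dominant frequency) are all hypothetical. Each series becomes one fixed-length row of attribute values, which is the form attribute-value learners expect.

```python
import cmath
import math
from statistics import mean, pstdev


def extract_features(series, sample_rate=1.0):
    """Map one time series (a list of floats) to a fixed-length feature dict.

    Hypothetical helper for illustration only; not part of Orange.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two samples")

    mu = mean(series)

    # Least-squares slope of the values against their sample index.
    t_mean = (n - 1) / 2
    num = sum((t - t_mean) * (x - mu) for t, x in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den

    # Naive O(n^2) discrete Fourier transform of the mean-removed signal;
    # keep the frequency bin with the largest magnitude.
    centered = [x - mu for x in series]
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)

    return {
        "mean": mu,
        "std": pstdev(series),
        "slope": slope,
        "dominant_freq": best_k * sample_rate / n,
    }


# Two toy series: a pure sine with 4 cycles over 64 samples, and a ramp.
sine = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
ramp = [float(t) for t in range(5)]
rows = [extract_features(s) for s in (sine, ramp)]
```

Each dict in rows could then be turned into one example in an ordinary data table, so that existing classification or clustering tools apply unchanged. The naive transform is quadratic in the series length; a real implementation would more likely use an FFT.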

Of course, other techniques are also encouraged, but you will be much more on your own there, and you will have to write much more code (more components) to get visible, useful results.

The timeline is set by Google. Sadly, we cannot stretch it in any way. (Of course, you can always work on your ideas outside of GSoC and outside the GSoC timeline. We would be glad to have you aboard then, too.) As far as we are concerned, you can start working as soon as you know whether you have been accepted, and work as much as you want. What we want is that you do what you propose to do, and of course do it well. We will also evaluate you against what you proposed, so you should neither overbook yourself nor propose too little. If your proposal convinces us of how you will divide your time and shows us that it is doable, then that is that. ;-)

I suggest that you opt for known algorithms and implementations. Things by the book, and so on. For that, it is not really necessary to think them out in advance, but it would look good in your proposal if you researched exactly what you want to implement and referenced books or papers that provide the details you need. So research and choose. Maybe check other similar tools to see what they provide, which things are basic and should be included, and which things we can do later on.

