Changes between Version 17 and Version 18 of GSoC/Ideas


Timestamp:
03/10/11 17:10:17 (4 years ago)
Author:
mitar

  • GSoC/Ideas

    v17 v18  
    55''Ideas are listed in no particular order.'' 
    66 
    7 == Time-series analysis == 
     7== Repository for add-ons == 
    88 
    9 Orange currently lacks any [wikipedia:Time_series time-series] analysis tools. It would be great to develop some basic tools for dealing with time series: reading, normalizing, basic pattern search, some (auto-)correlation and similar basic techniques, and so on. Research what other similar applications support and propose which features would be useful to have as a basic set of tools. 
     9Orange supports add-ons, which can add new scripting features and new widgets (GUI). Currently, this feature is highly underused and serves only a few internally developed add-ons. It would be great to open it up so that contributors around the world could submit their add-ons to a central repository, from which add-ons could then be installed into Orange. This could take the form of a web portal, maybe something along the lines of [http://trac-hacks.org/ Trac Hacks]. The portal should encourage collaboration, code exchange, help and community. This would also improve collaboration in the global data mining and machine learning community. 
    1010 
    11 Useful skills: Python. Data analysis experience. Digital signal processing experience could also help. 
     11It would be good to try to integrate this with existing technologies and portals ([https://bitbucket.org/ Bitbucket], [https://github.com/ GitHub], [http://pypi.python.org/ Python Package Index]). 
     12 
     13Useful skills: Python. Web programming experience (suggested technologies are Django and jQuery). 
     14 
     15Level from 1 (beginner) to 5 (professional): 3 
     16 
     17== Bridge between Orange and R == 
     18 
     19[http://www.r-project.org/ R] contains many great methods/tools which would also be very useful in Orange. To prevent duplication of work (and implementation), it would be great to be able to use those methods/tools directly in Orange (so that it is not necessary to reimplement them in Orange). 
     20 
     21The idea is to research possibilities for this and then implement a future-proof bridge between Orange and R. 
     22 
     23Useful skills: Python. C/C++. Experience with R. Experience with program-to-program interfaces. 
    1224 
    1325Level from 1 (beginner) to 5 (professional): 4 
     
    2234 
    2335Level from 1 (beginner) to 5 (professional): 5 
     36 
     37== Support for parallel computation for scripting/backend == 
     38 
     39Another idea on this list proposes running GUI processing in parallel/separate processes. This idea, in contrast, is about having the scripting part (backend) of Orange support (semi)automatic parallelisation/separation into processes, possibly also processes on different computers. For example, [wikipedia:Cross-validation_(statistics) cross-validation] with multiple folds is easy to parallelize, as each fold can be computed independently and the per-fold results then easily combined into the final result. 
     40 
     41It would be good to analyze such opportunities for parallelization, find what they have in common and maybe devise a small helper library (possibly a wrapper for some existing grid computing system, like Xgrid) that makes code run in parallel if such an environment is available, and run normally if not. And then, of course, move as many of the existing implementations as possible to this new support for parallelization. 
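To make the helper-library idea more concrete, here is a minimal sketch of cross-validation where each fold is scored independently and the scores are then averaged. All names (`make_folds`, `evaluate_fold`, `cross_validate`) are hypothetical and not part of Orange, and the "learner" is a toy that just predicts the training mean. A thread pool is used only to keep the sketch runnable anywhere; the assumption is that a process pool or a grid wrapper exposing the same `map` method could be passed in instead.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def make_folds(n_items, k):
    """Split indices 0..n_items-1 into k (train, test) index pairs."""
    indices = list(range(n_items))
    folds = []
    for i in range(k):
        test = indices[i::k]  # every k-th index, starting at i
        held_out = set(test)
        train = [j for j in indices if j not in held_out]
        folds.append((train, test))
    return folds


def evaluate_fold(data, fold):
    """Toy learner (hypothetical): predict the training mean, return mean absolute error."""
    train, test = fold
    prediction = mean(data[j] for j in train)
    return mean(abs(data[j] - prediction) for j in test)


def cross_validate(data, k, executor=None):
    """Score each fold independently, then combine the scores by averaging.

    With an executor the folds are submitted through its map() method;
    without one they run serially, so the same call works in both cases.
    """
    folds = make_folds(len(data), k)
    if executor is None:
        scores = [evaluate_fold(data, fold) for fold in folds]
    else:
        scores = list(executor.map(evaluate_fold, [data] * k, folds))
    return mean(scores)


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
serial = cross_validate(data, k=3)
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel = cross_validate(data, k=3, executor=pool)
```

Both calls should give the same score, because the folds do not depend on each other. This is the property that makes the technique a good first candidate for such a helper.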
     42 
     43Useful skills: Python. Grid computing experience. 
     44 
     45Level from 1 (beginner) to 5 (professional): 4 
     46 
     47== Benchmarking and optimizing Orange == 
     48 
     49It would be useful to test and benchmark different aspects of Orange and find bottlenecks, and then to propose and implement solutions for them. Orange implements various algorithms, and some implementations are better than others. It would be useful to compare our implementations with others to see how they perform and whether they should be improved. 
     50 
     51Useful skills: Experience with testing and benchmarking software. Experience with common patterns which make programs run slowly. 
     52 
     53Level from 1 (beginner) to 5 (professional): 3 
    2454 
    2555== Anova regression == 
     
    4878Level from 1 (beginner) to 5 (professional): 3 
    4979 
    50 == Support for parallel computation for scripting/backend == 
     80== Time-series analysis == 
    5181 
    52 Another idea on this list proposes running GUI processing in parallel/separate processes. This idea, in contrast, is about having the scripting part (backend) of Orange support (semi)automatic parallelisation/separation into processes, possibly also processes on different computers. For example, [wikipedia:Cross-validation_(statistics) cross-validation] with multiple folds is easy to parallelize, as each fold can be computed independently and the per-fold results then easily combined into the final result. 
     82Orange currently lacks any [wikipedia:Time_series time-series] analysis tools. It would be great to develop some basic tools for dealing with time series: reading, normalizing, basic pattern search, some (auto-)correlation and similar basic techniques, and so on. Research what other similar applications support and propose which features would be useful to have as a basic set of tools. 
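As a rough illustration of what such a basic set could contain, here is a minimal sketch of two of the listed techniques: normalizing a series to zero mean and unit variance, and computing its autocorrelation at a given lag. The function names are hypothetical and not an existing Orange API, and the series is assumed to be a plain list of evenly spaced numeric samples.

```python
from math import sqrt


def normalize(series):
    """Return the z-scored series: zero mean, unit (population) variance."""
    n = len(series)
    mu = sum(series) / n
    sd = sqrt(sum((x - mu) ** 2 for x in series) / n)
    if sd == 0:
        raise ValueError("a constant series cannot be normalized")
    return [(x - mu) / sd for x in series]


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (1.0 at lag 0)."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mu = sum(series) / n
    denominator = sum((x - mu) ** 2 for x in series)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum(
        (series[t] - mu) * (series[t + lag] - mu) for t in range(n - lag)
    )
    return numerator / denominator


# A series that alternates with period 2: neighbouring samples are
# anti-correlated, samples two steps apart are positively correlated.
series = [1, 2, 1, 2, 1, 2, 1, 2]
z = normalize(series)
r1 = autocorrelation(series, 1)
r2 = autocorrelation(series, 2)
```

For the alternating series above, the lag-1 value is negative and the lag-2 value is positive, which is the kind of periodic pattern a basic correlation tool could surface.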
    5383 
    54 It would be good to analyze such opportunities for parallelization, find what they have in common and maybe devise a small helper library (possibly a wrapper for some existing grid computing system, like Xgrid) that makes code run in parallel if such an environment is available, and run normally if not. And then, of course, move as many of the existing implementations as possible to this new support for parallelization. 
    55  
    56 Useful skills: Python. Grid computing experience. 
     84Useful skills: Python. Data analysis experience. Digital signal processing experience could also help. 
    5785 
    5886Level from 1 (beginner) to 5 (professional): 4 
     
    6795 
    6896Level from 1 (beginner) to 5 (professional): 3 
    69  
    70 == Benchmarking and optimizing Orange == 
    71  
    72 It would be useful to test and benchmark different aspects of Orange and find bottlenecks, and then to propose and implement solutions for them. Orange implements various algorithms, and some implementations are better than others. It would be useful to compare our implementations with others to see how they perform and whether they should be improved. 
    73  
    74 Useful skills: Experience with testing and benchmarking software. Experience with common patterns which make programs run slowly. 
    75  
    76 Level from 1 (beginner) to 5 (professional): 3 
    77  
    78 == Bridge between Orange and R == 
    79  
    80 [http://www.r-project.org/ R] contains many great methods/tools which would also be very useful in Orange. To prevent duplication of work (and implementation), it would be great to be able to use those methods/tools directly in Orange (so that it is not necessary to reimplement them in Orange). 
    81  
    82 The idea is to research possibilities for this and then implement a future-proof bridge between Orange and R. 
    83  
    84 Useful skills: Python. C/C++. Experience with R. Experience with program-to-program interfaces. 
    85  
    86 Level from 1 (beginner) to 5 (professional): 4