Version 83 (modified by thocevar, 16 months ago)

Google Summer of Code Ideas

Here is a list of ideas for projects that might be interesting and useful to carry out for Orange during the Google Summer of Code program. Your own ideas for methods in data analytics and visualization that would complement Orange are most welcome, too!

You can find more information about our participation in Google Summer of Code here.

Ideas are listed in no particular order.

Open ideas for 2013

Porting Python code to Orange 3.0

We are (still) migrating Orange to Python 3.0 and at the same time reimplementing Orange from scratch. We are ditching the C++ code and relying on numpy, scipy.sparse, scikit-learn and similar libraries, and using Cython for anything that needs to be fast but is not provided elsewhere. We are looking for help in reimplementing various parts of Orange. Note that this is not about porting Orange from Python 2.x to Py3K: that is trivial and can be done in one evening (we tried it). This project requires working closely (e.g. almost daily communication) with the core team.

Required skills: Good knowledge of Orange, Python and its libraries, and Cython.

Level from 1 (beginner) to 5 (professional): 4

Possible mentors: Janez, Marko

Parallel model evaluation

Most techniques for model evaluation are embarrassingly parallel, for example bootstrap, cross-validation and leave-one-out validation. Implement their parallel versions for Orange 3. Parallelization should be seamless from the point of view of the user (the script writer). This project requires working closely (e.g. almost daily communication) with the core team.
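To illustrate why these techniques parallelize so easily, here is a minimal sketch of k-fold cross-validation in which every fold is an independent task. It is not Orange code: the function names (`k_fold_indices`, `majority_learner`, `evaluate_fold`, `cross_validate`), the toy learner and the representation of data as a list of `(features, label)` tuples are all assumptions made for the example. The caller chooses where the folds run by passing any `concurrent.futures` executor, which is one possible way of keeping parallelization out of the script writer's way.

```python
import concurrent.futures as cf
import random


def k_fold_indices(n_rows, k, seed=0):
    """Split row indices 0..n_rows-1 into k shuffled, disjoint test folds."""
    if not 1 <= k <= n_rows:
        raise ValueError("k must be between 1 and the number of rows")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_learner(train_rows):
    """Toy learner (hypothetical stand-in): always predicts the most common label."""
    labels = [label for _, label in train_rows]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda features: majority


def evaluate_fold(task):
    """Train on all rows outside the fold and return accuracy on the fold."""
    learner, rows, test_idx = task
    held_out = set(test_idx)
    train = [row for i, row in enumerate(rows) if i not in held_out]
    model = learner(train)
    hits = sum(model(rows[i][0]) == rows[i][1] for i in test_idx)
    return hits / len(test_idx)


def cross_validate(learner, rows, k=5, executor=None, seed=0):
    """Return per-fold accuracies.

    Folds do not depend on each other, so they can be mapped over any
    concurrent.futures executor; with executor=None they run serially.
    Each task carries a copy of the data, which is fine for a sketch but
    would need more care for large tables.
    """
    folds = k_fold_indices(len(rows), k, seed)
    tasks = [(learner, rows, fold) for fold in folds]
    if executor is None:
        return [evaluate_fold(task) for task in tasks]
    return list(executor.map(evaluate_fold, tasks))
```

Passing `cf.ProcessPoolExecutor()` as `executor` would run the folds in separate processes. In that case `learner` has to be a picklable, module-level function, and on platforms that start workers by spawning, the calling script needs the usual `if __name__ == "__main__":` guard.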

Required skills: Python, experience with multiprocessing.

Level from 1 (beginner) to 5 (professional): 4

Possible mentors: Marko

Neural Networks

Orange implements many algorithms for classification, but currently lacks support for neural network learning. The task consists of implementing neural networks (multilayer perceptrons, convolutional networks and deep belief networks) in Python with numpy. Neural networks will be implemented as an add-on to Orange.

Level from 1 (beginner) to 5 (professional): 4

Possible mentors: Jure

dictyExpress

dictyExpress is an interactive, web-based exploratory data analytics application for analysis of over 1,000 Dictyostelium gene expression experiments. The app is basically a web interface to Orange algorithms that run in the cloud. We are porting the obsolete Flash user interface to HTML5/JavaScript. This is an extremely multidisciplinary project that requires a wide range of skills.

Useful skills: JavaScript, HTML5, CSS3, AngularJS, Bootstrap, Python, Django, basic bioinformatics knowledge.

Level from 1 (beginner) to 5 (professional): 4

Possible mentors: Miha

Widgets in separate processes

Widgets in Orange Canvas currently run in a single process. As they are independent given their inputs, they could frequently work in parallel (in a data-flow manner). The objective of this task would be to modify Orange Canvas so that each widget could run in its own process.

It would also be useful to separate the GUI thread from the main payload computation of widgets. Currently we use a single thread (the GUI thread) for everything, so while a widget is working it has to call back into the GUI repeatedly to keep it responsive.

We would start by making a single widget able to run in its own process and then integrating it into the canvas. Afterwards, we would try to find a systematic way to branch off widgets into subprocesses with the least changes in the current code base.
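The first step could look roughly like the sketch below, which moves a widget's payload onto an executor and lets the GUI side poll for the result instead of blocking. This is only an illustration under assumptions: `WidgetTask`, `heavy_computation` and `run_event_loop` are hypothetical names, not Orange Canvas API, and the small polling loop merely stands in for a real Qt event loop (where a timer would call `poll()`).

```python
import concurrent.futures as cf
import time


def heavy_computation(n):
    """Stand-in for a widget's payload; module-level so it can be pickled."""
    return sum(i * i for i in range(n))


class WidgetTask:
    """Runs a widget's payload on an executor and lets the GUI poll for it."""

    def __init__(self, executor, func, *args):
        # submit() returns immediately; the work happens in the executor.
        self._future = executor.submit(func, *args)

    def poll(self):
        """Non-blocking check returning (done, result).

        If the payload raised an exception, it is re-raised here once
        the task has finished.
        """
        if self._future.done():
            return True, self._future.result()
        return False, None


def run_event_loop(task, tick=0.01, on_idle=lambda: None):
    """Toy replacement for a GUI event loop that polls the task each tick."""
    while True:
        done, result = task.poll()
        if done:
            return result
        on_idle()  # a real GUI would repaint and handle user input here
        time.sleep(tick)
```

With `cf.ProcessPoolExecutor()` as the executor, the payload runs in its own process, so its inputs and outputs must be picklable. Polling from the GUI side, rather than using `add_done_callback`, keeps all GUI updates on the GUI thread, since done-callbacks may run in another thread.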

Useful skills: Python programming with multiple processes and threads. Qt and PyQt experience.

Level from 1 (beginner) to 5 (professional): 5

Possible mentors: Marko