Orange Forum • View topic - [GSoC] Neural Network Proposal

[GSoC] Neural Network Proposal



Postby ResHacker » Thu Apr 12, 2012 12:36

I sent my proposal through GSoC's application system two weeks ago, but have received no reply whatsoever. Thus, I am now also posting it here.

In this project, I plan to implement a neural network module for Orange. I will use either my own scalable neural network library, Thunder Neural Networks, or the C++ library EBLearn.

Neural Network Modules
Linear module
Bias module
Branch module
Sum module
Tanh (Sigmoid) module
Softmax module
Negexp (negative exponential) module
Conv1 (1-D convolution) module
Conv2 (2-D convolution) module
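To illustrate how such modules would compose, here is a minimal NumPy sketch of a forward pass through a chain of linear, bias, tanh and softmax modules. The class names and the `forward` interface are purely illustrative, not the actual TNN or EBLearn API:

```python
import numpy as np

# Illustrative module objects; names and interface are assumptions,
# not the real TNN/EBLearn classes.
class Linear:
    def __init__(self, n_in, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0, 0.1, (n_out, n_in))  # weight matrix
    def forward(self, x):
        return self.W @ x

class Bias:
    def __init__(self, n):
        self.b = np.zeros(n)
    def forward(self, x):
        return x + self.b

class Tanh:
    def forward(self, x):
        return np.tanh(x)

class Softmax:
    def forward(self, x):
        e = np.exp(x - x.max())  # subtract max for numerical stability
        return e / e.sum()

# Chain the modules: linear -> bias -> tanh -> linear -> bias -> softmax
net = [Linear(4, 8), Bias(8), Tanh(), Linear(8, 3), Bias(3), Softmax()]
x = np.array([0.5, -1.0, 2.0, 0.1])
for m in net:
    x = m.forward(x)
print(x)  # class probabilities, summing to 1
```

The point of the modular design is exactly this kind of free composition: each module only needs forward (and backward) methods, so networks are built by chaining.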

Loss functions
Euclidean (square) loss
Negative log likelihood loss
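For concreteness, the two losses could look like the following NumPy sketch, assuming dense prediction/target vectors (this is just the textbook definitions, not the TNN/EBLearn implementations):

```python
import numpy as np

def euclidean_loss(pred, target):
    # Squared Euclidean distance, halved so the gradient is (pred - target).
    return 0.5 * np.sum((pred - target) ** 2)

def nll_loss(prob, label):
    # Negative log likelihood of the true class under predicted probabilities.
    return -np.log(prob[label])

prob = np.array([0.7, 0.2, 0.1])
e = euclidean_loss(prob, np.array([1.0, 0.0, 0.0]))  # 0.07
n = nll_loss(prob, 0)                                # -log(0.7)
```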

Training/optimization methods
Naive stochastic gradient descent
Averaged stochastic gradient descent
Parallelized stochastic gradient descent using the alternating direction method of multipliers (ADMM)
Others (maybe 2nd order methods) if required...
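As a rough illustration of the first two methods, here is a small NumPy sketch of naive SGD on a least-squares problem, with Polyak–Ruppert averaging of the iterates (the averaged-SGD idea); problem setup and step-size schedule are my own choices for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=200)

w = np.zeros(3)      # SGD iterate
w_avg = np.zeros(3)  # running average of iterates (averaged SGD)
lr = 0.05
for t in range(1, 2001):
    i = rng.integers(len(X))
    grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5*(x.w - y)^2
    w -= lr / np.sqrt(t) * grad      # decaying step size
    w_avg += (w - w_avg) / t         # incremental mean of w_1..w_t
# w_avg is now close to w_true
```

Averaging smooths out the noise of the individual SGD iterates, which is why the averaged variant is listed separately from the naive one.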

Target Usages
Feature Learning / Deep Learning

April 1st - June 14th: Continue the implementation of Thunder Neural Networks, and study the Python language and the Orange software. All the core parts will be developed.
June 15th - July 31st: Design and implement neural network models for Orange, interfacing the TNN library or the EBLearn library with Orange.
August 1st - August 20th: Test and experiment on the developed software modules. Here is a list of potential tests:
1. Use the UCI speech recognition data to test multi-layer neural network models
2. Implement a LeNet-5 document recognition network in the library and test it on the MNIST dataset
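In spirit, test 1 would look like the following NumPy sketch: train a small multi-layer network and measure held-out accuracy. Synthetic two-class data stands in for the UCI speech set here, and the hand-rolled two-layer net is only a placeholder for the real module chain:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # XOR-like labels
Xtr, ytr, Xte, yte = X[:200], y[:200], X[200:], y[200:]

# Two-layer net: tanh hidden layer, sigmoid output, trained by SGD.
W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.1
for epoch in range(200):
    for i in range(len(Xtr)):
        h = np.tanh(Xtr[i] @ W1 + b1)             # hidden activations
        p = 1 / (1 + np.exp(-(h @ W2 + b2)))      # output probability
        d2 = p - ytr[i]                           # output delta
        d1 = (1 - h ** 2) * (W2[:, 0] * d2)       # backprop through tanh
        W2 -= lr * np.outer(h, d2); b2 -= lr * d2
        W1 -= lr * np.outer(Xtr[i], d1); b1 -= lr * d1

h = np.tanh(Xte @ W1 + b1)
pred = (1 / (1 + np.exp(-(h @ W2 + b2)))).ravel() > 0.5
acc = (pred == yte).mean()  # held-out accuracy
```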

April 1st - May 15th: 10 hours / week
May 16th - August 20th: > 40 hours / week

Biography and Free Software experience

During my B.E. study at Tianjin University, I was the assistant administrator of the Computer Graphics and Animation Group (CGAG), where over 3 years I gained research experience in computer vision and computational photography, advised by professor Liu Shiguang. In 2011 I started my M.S. study in computer science at New York University, where I was admitted as a student of the Computational and Biological Learning Lab (CBLL), conducting research in machine learning advised by professor Yann LeCun.

For now I am focusing on improving the efficiency of gradient-based learning algorithms using parallel computing techniques. Inspired by this project, I have initiated a new parallelized and scalable neural network library, Thunder Neural Networks (TNN, ... l-Networks). The goal of this project is to provide the user with a scalable environment for deploying neural networks, energy-based models, regression, feature learning models and deep learning models on top of pthreads, MPI and Hadoop. Furthermore, we will implement the library so that it can adapt to different numerical environments based on user configuration, such as Intel IPL, OpenCL, CUDA and FFTW. Simply put, we aim to make it the fastest and most scalable neural network library available.

For more information regarding the open source C library Thunder Neural Networks, please check its Github repository: ... l-Networks
It is nearly finished, with a first release scheduled for June.

If you think the TNN library is not mature enough to be used for Orange, I intend to use the EBLearn library, which has the same functionality but is constrained to a more limited set of training algorithms.

I am familiar with both codebases, since TNN is now mainly developed by me, with reference to the design paradigm of EBLearn. If this proposal draws interest, I sincerely hope for your response and advice.
