Changes between Version 62 and Version 63 of GSoC/Ideas


Timestamp: 03/08/12 12:32:13
Author: crt

  • GSoC/Ideas

    v62 v63  
    12 12 The current [https://bitbucket.org/biolab/orange-addon-text Orange add-on for text mining] is outdated and incomplete. The source code needs refactoring to comply with the [http://orange.biolab.si/trac/wiki/Orange25 Orange 2.5 development guidelines]. Additionally, the add-on lacks documentation in reST format (including a tutorial for beginners), unit tests, and installation through PyPI (http://pypi.python.org/pypi).
    13 13
    14    The project should also compare the basic text (pre)processing techniques already implemented in the current version of the add-on (lemmatization, stemming, document distance, feature subset selection, phrase detection) with the latest state-of-the-art techniques. If necessary, additional algorithms should be implemented. It would be very nice if the text mining add-on's functionality were also available from widgets in OrangeCanvas.
       14 The project should also compare the basic text (pre)processing techniques already implemented in the current version of the add-on (lemmatization, stemming, document distance, feature subset selection, phrase detection) with the latest state-of-the-art techniques. If necessary, additional algorithms (for example, multinomial Naive Bayes) should be (re)implemented. It would be very nice if the text mining add-on's functionality were also available from widgets in OrangeCanvas.
    15 15
    16 16 Useful skills: Python, data mining.