Changes between Version 2 and Version 3 of MultiLabelClassification

05/04/11 18:34:18 (3 years ago)

Initial version of the MultiLabelClassification plan for Google Summer of Code, by Wencan Luo


  • MultiLabelClassification

    v2 v3  
    11= Plan of Multi-label Classification Implementation = 
    22== Dataset Support == 
    3 TODO 
    4 == TODO == 
     3=== Method 1: add a special prefix to each label === 
     4Add a special prefix to each class label and set its optional flag to ‘meta’. For example, consider a data set with four labels: “Sports”, “Religion”, “Science”, and “Politics”. Their attribute names would then be “_c_Sports”, “_c_Religion”, “_c_Science”, and “_c_Politics”. With this prefix and flag, the code can identify the label attributes and handle them separately. 
     5What needs to be changed? 
     6* [//doc/reference/tabdelimited.htm tab file] 
     7* Methods of [//doc/reference/Example.htm Example], like getClass() 
     8Problems to be solved: 
     9* Whenever the code accesses the class attributes, it has to scan all attributes to find those with the prefix “_c_” 
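The scan described in the last point can be sketched in plain Python. This is only an illustration of the idea, not Orange code: the helper name `split_label_attributes` and the feature names `length` and `source` are made up for the example, and a real implementation would work on Orange's domain objects rather than on a list of strings.

```python
LABEL_PREFIX = "_c_"  # prefix proposed in Method 1


def split_label_attributes(attribute_names, prefix=LABEL_PREFIX):
    """Split attribute names into ordinary features and labels.

    Hypothetical helper: label names are returned with the prefix removed.
    """
    features, labels = [], []
    for name in attribute_names:
        if name.startswith(prefix):
            labels.append(name[len(prefix):])
        else:
            features.append(name)
    return features, labels


# "length" and "source" are invented feature names for the example.
names = ["length", "source", "_c_Sports", "_c_Religion", "_c_Science", "_c_Politics"]
features, labels = split_label_attributes(names)
```

Because every access to the labels repeats this linear scan, an implementation would probably want to cache the result once per data set.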
     11=== Method 2: use a special attribute value ===  
     12Orange also supports arbitrary attribute types, such as a list, through variables derived from [//doc/reference/PythonVariable.htm PythonVariable], and these can be converted to ordinary types; see the [//doc/reference/PythonVariable.htm PythonVariable] documentation for details. We can therefore store the labels as a list and use a special converter such as [//doc/reference/Variable.htm#getValueFrom getValueFrom] to handle them. 
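A rough sketch of the conversion step, in plain Python rather than through the PythonVariable or getValueFrom API: the labels of one example are kept as a single list value, and a converter derives an ordinary 0/1 value for any one label. The function names and the comma-separated cell format are assumptions made for this example.

```python
def parse_label_cell(cell, separator=","):
    """Turn a raw cell such as "Sports, Politics" into a list of labels.

    The comma-separated format is an assumption of this sketch.
    """
    return [part.strip() for part in cell.split(separator) if part.strip()]


def label_indicator(label_list, label):
    """Convert the list-valued attribute to an ordinary 0/1 value for one label."""
    return 1 if label in label_list else 0


example_labels = parse_label_cell("Sports, Politics")
is_sports = label_indicator(example_labels, "Sports")
is_science = label_indicator(example_labels, "Science")
```

In Orange itself, the second function would play the role of the converter attached to a derived attribute.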
     14=== Method 3: adding a special value into their 'attributes' dictionary ===  
     16=== Method 4: allow multiple 'class' optional flags ===  
     17Currently the [//doc/reference/tabdelimited.htm tab file] can have at most one 'class' flag. We can allow several attributes to be flagged as 'class'. 
     18What needs to be changed? 
     19* [//doc/reference/ExampleTable.htm ExampleTable]: add a vector to store the names of all class attributes 
     20* data input and output related to ExampleTable 
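To make the proposal concrete, here is a minimal sketch of reading a tab file whose flag row marks several attributes as 'class'. It assumes the usual three header rows (names, types, flags) and parses them by hand; the function name and the sample data are invented, and this is not how Orange's loader is written.

```python
def class_attribute_names(tab_text):
    """Return the names of all attributes whose flag is 'class'.

    Assumes three header rows: attribute names, types, optional flags.
    """
    lines = tab_text.splitlines()
    names = lines[0].split("\t")
    flags = lines[2].split("\t")
    return [name for name, flag in zip(names, flags) if flag.strip() == "class"]


# Invented sample: one meta attribute and two class attributes.
sample = (
    "title\tSports\tReligion\n"
    "string\tdiscrete\tdiscrete\n"
    "meta\tclass\tclass\n"
    "Match report\t1\t0\n"
)
class_names = class_attribute_names(sample)
```

The returned list corresponds to the vector of class names that ExampleTable would have to store.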
     22== Problem-Transformation Methods == 
     23* BR (mulan/classifier/transformation/ 
     24* CLR (mulan/classifier/transformation/ 
     25* LP (mulan/classifier/transformation/ 
     26* PPT (mulan/classifier/transformation/ 
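As a reminder of what the two simplest transformations do, the following plain-Python sketch converts a list of label sets into the targets used by binary relevance (one binary problem per label) and by label powerset (each distinct label set becomes one class). It is independent of Mulan and Orange; the function names are chosen for this example only.

```python
def binary_relevance(label_sets, all_labels):
    """BR: one 0/1 target vector per label, over all examples."""
    return {
        label: [1 if label in labels else 0 for labels in label_sets]
        for label in all_labels
    }


def label_powerset(label_sets):
    """LP: map each example's label set to a single class value.

    A sorted tuple is used so that equal sets give equal classes.
    """
    return [tuple(sorted(labels)) for labels in label_sets]


all_labels = ["Sports", "Religion", "Science", "Politics"]
label_sets = [{"Sports"}, {"Sports", "Politics"}, set()]

br_targets = binary_relevance(label_sets, all_labels)
lp_targets = label_powerset(label_sets)
```

Any single-label learner can then be trained on each BR target vector, or once on the LP targets.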
     28== Algorithm Adaptation Methods == 
     29* ML-kNN (mulan/classifier/lazy/ 
     30* BR-KNN (mulan/classifier/lazy/ 
     31* MultiLabel-KNN (mulan/classifier/lazy/ 
     32* MMP (mulan/classifier/neural/model/ 
     33* NaiveBayes (weka.classifiers.bayes/NaiveBayes.class) 
     35== Evaluation Measures == 
     36* Hamming loss (mulan/evaluation/loss/ 
     37* Example-based Accuracy, Precision, Recall (mulan/evaluation/measure/ExampleBased*.java) 
     38* Label-based (mulan/evaluation/measure/LabelBased*.java) 
     39* Ranking (mulan/evaluation/measure/ 
     40* Average precision (mulan/evaluation/measure/ 
     41* Hierarchical (mulan/evaluation/measure/ 
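Two of these measures are simple enough to sketch directly. Hamming loss is the fraction of label assignments that are wrong, averaged over examples and labels; example-based accuracy is the average ratio of the intersection to the union of the true and predicted label sets. The code below is a plain-Python illustration and not the Mulan implementation; scoring an example with empty true and predicted sets as 1.0 is a convention assumed here.

```python
def hamming_loss(true_sets, predicted_sets, n_labels):
    """Average size of the symmetric difference, normalised by the label count."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, predicted_sets))
    return wrong / (len(true_sets) * n_labels)


def example_based_accuracy(true_sets, predicted_sets):
    """Average of |Y & Z| / |Y | Z| over all examples."""
    scores = []
    for t, p in zip(true_sets, predicted_sets):
        union = t | p
        # Assumed convention: both sets empty counts as a perfect prediction.
        scores.append(len(t & p) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


true_sets = [{"Sports"}, {"Sports", "Politics"}]
predicted_sets = [{"Sports", "Science"}, {"Politics"}]

loss = hamming_loss(true_sets, predicted_sets, n_labels=4)
accuracy = example_based_accuracy(true_sets, predicted_sets)
```

Each of the two examples has one wrong label out of four, so the loss here is 0.25, and each has an intersection-to-union ratio of 1/2, so the accuracy is 0.5.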
     43== Feature Selection == 
     44* LP based (mulan/dimensionalityReduction/ 
     45* Transformation based (../ 
     46* Ranking (../ 
     48== GUI Support == 
     49See [//doc/widgets Widget development] manual 
    551== Timeline == 
    6 TODO 
     52 April 25 – May 23 (Before official coding time) :: Discuss the details of my ideas with my mentor to reach a final agreement, including the design of the dataset support and the choice of transformation and adaptation methods to implement. Based on that agreement, I will write tests that make all goals clear. 
     53 May 23 – June 18 (Official coding period starts) :: Start designing the framework to support multi-label classification, including the multi-label data structures (instance, instances, attribute, evaluator, etc.). 
     54 Implement the basic multi-label dataset, two problem-transformation methods (Binary relevance (BR) and Calibrated label ranking (CLR)), one GUI widget, and two groups of evaluation measures: example-based (Hamming loss, classification accuracy, precision, recall) and label-based. 
     55 June 18 – July 5 :: Finish improving the problem-transformation methods and test all the work so far to make sure it works properly. 
     56 Start coding the algorithm adaptation methods: ML-KNN and Multi-class multi-label perceptron (MMP). 
     57 July 6 – July 15 (Mid-term) :: Finish the adaptation methods and test them to make sure they work properly. Start implementing the feature selection methods: LP based and Transformation based. 
     58 Submit mid-term evaluation. 
     59 July 16 – July 31 :: Finish all my work on the multi-label project, then fix bugs and test. 
     60 Write a document describing what we have now and what to do next. 
     61 August 1 – August 15 :: Buffer time for unforeseen work. 
     62 If possible, I could implement more problem-transformation methods, adaptation methods and evaluation measures. 
     63 Submit final evaluation.