Changes between Version 1 and Version 2 of MatrixFactorization


Timestamp:
05/06/11 00:44:51 (3 years ago)
Author:
MarinkaZitnik
  • MatrixFactorization

    v1 v2  
    1 To be populated. 
     1 
     2= Matrix Factorization Techniques for Data Mining = 
     3 
     4A unified and efficient interface to matrix factorization algorithms and methods. A scripting library which includes a number of published factorization algorithms and initialization methods and facilitates combining them to produce new strategies. 
     5 
     6Extensive documentation with working examples which demonstrate real applications, commonly used benchmark data and visualization methods are provided to help with the interpretation and comprehension of the results. 
     7 
     8== Overview == 
     9 
     10List of algorithms for matrix factorization.  
     11 
     12=== nmf === 
     13Standard NMF based on Kullback-Leibler divergence, using simple multiplicative updates enhanced to avoid numerical overflow [2]. [[BR]] 
     14Reference: (Brunet, 2004). 
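
A minimal sketch of the multiplicative updates for the Kullback-Leibler objective, assuming NumPy. The names `nmf_kl` and `kl_divergence` and the small constant `eps` (a simple guard against division by zero, standing in for the overflow handling mentioned above) are illustrative assumptions, not the interface of this library.

```python
import numpy as np

def kl_divergence(V, WH, eps=1e-9):
    """Generalized Kullback-Leibler divergence D(V || WH)."""
    V = np.asarray(V, dtype=float)
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))

def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorize a nonnegative V (n x m) as W (n x rank) @ H (rank x m)."""
    rng = np.random.default_rng(seed)
    V = np.asarray(V, dtype=float)
    n, m = V.shape
    # Random nonnegative initialization (hypothetical choice for this sketch).
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # H <- H * (W^T (V / WH)) / (column sums of W)
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # W <- W * ((V / WH) H^T) / (row sums of H)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative as long as the initialization is nonnegative.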
     15 
     16=== s-nmf === 
     17Sparse NMF based on alternating non-negativity constrained least squares, with each subproblem solved by a fast non-negativity constrained least squares algorithm. Sparseness can be imposed on either the left or the right factor. [[BR]] 
     18Reference: (Kim, 2007). 
     19 
     20=== ns-nmf === 
     21Non-smooth NMF. Uses a modified version of Lee and Seung's multiplicative updates for Kullback-Leibler divergence. It is meant to give sparser results.[[BR]] 
     22Reference: (Pascual-Montano, 2006). 
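
A sketch of the non-smooth model, assuming NumPy. It factorizes V as W S H, where S = (1 - theta) I + (theta / k) 1 1^T is a smoothing matrix, and reuses the Kullback-Leibler multiplicative updates with W S in place of W and S H in place of H. The function names, the default `theta=0.5` and the `eps` guard are assumptions made for illustration.

```python
import numpy as np

def smoothing_matrix(rank, theta):
    """S = (1 - theta) * I + (theta / rank) * ones; theta = 0 gives plain NMF."""
    return (1.0 - theta) * np.eye(rank) + (theta / rank) * np.ones((rank, rank))

def nsnmf(V, rank, theta=0.5, n_iter=200, seed=0, eps=1e-9):
    """Approximate a nonnegative V by W @ S @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    V = np.asarray(V, dtype=float)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    S = smoothing_matrix(rank, theta)
    for _ in range(n_iter):
        # Update H, treating the smoothed basis W S as the basis matrix.
        WS = W @ S
        H *= (WS.T @ (V / (WS @ H + eps))) / (WS.sum(axis=0)[:, None] + eps)
        # Update W, treating the smoothed coefficients S H as the coefficients.
        SH = S @ H
        W *= ((V / (W @ SH + eps)) @ SH.T) / (SH.sum(axis=1)[None, :] + eps)
    return W, S, H
```

Larger values of `theta` smooth the product more strongly, which pushes W and H themselves toward sparser solutions to compensate.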
     23 
     24=== l-nmf === 
     25Local Fisher NMF. Adds the Fisher constraint to maximize the between-class scatter and minimize the within-class scatter.[[BR]] 
     26Reference: (Wang, 2004). 
     27 
     28=== ls-nmf === 
     29Alternating Least Squares NMF. Intended to be very fast compared to other approaches.[[BR]] 
     30Reference: (Kim, 2007). 
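
To illustrate the alternating idea, here is a sketch of the basic projected alternating least squares scheme, assuming NumPy: each factor is solved for by unconstrained least squares and negative entries are then clipped to zero. This is a generic textbook variant shown for orientation only; it is not necessarily the algorithm of the cited paper, and `als_nmf` is a hypothetical name.

```python
import numpy as np

def als_nmf(V, rank, n_iter=50, seed=0):
    """Projected ALS: alternate least-squares solves with clipping at zero."""
    rng = np.random.default_rng(seed)
    V = np.asarray(V, dtype=float)
    W = rng.random((V.shape[0], rank))
    for _ in range(n_iter):
        # Solve W H ~= V for H in the least-squares sense, then project.
        H = np.linalg.lstsq(W, V, rcond=None)[0]
        H = np.maximum(H, 0.0)
        # Solve H^T W^T ~= V^T for W, then project.
        W = np.linalg.lstsq(H.T, V.T, rcond=None)[0].T
        W = np.maximum(W, 0.0)
    return W, H
```

Each iteration costs two least-squares solves, which is why such schemes are usually fast, but the clipping step means the error is not guaranteed to decrease monotonically.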
     31 
     32=== pmf === 
     33Probabilistic MF. A model which scales linearly with the number of observations and performs well on large, sparse, imbalanced datasets.[[BR]] 
     34Reference: (Salakhutdinov, 2008). 
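
A small sketch of how such a model can be fitted, assuming NumPy: with Gaussian priors on the factors, the MAP estimate minimizes the squared error over observed entries plus an L2 penalty, here by stochastic gradient descent. Each epoch visits every observed entry once, so its cost is linear in the number of observations. The names `pmf` and `rmse` and the hyperparameter defaults are assumptions for illustration.

```python
import numpy as np

def pmf(ratings, n_rows, n_cols, rank=2, lam=0.05, lr=0.02, n_epochs=300, seed=0):
    """Fit row factors U and column factors V from (i, j, value) triples."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_rows, rank))
    V = 0.1 * rng.standard_normal((n_cols, rank))
    for _ in range(n_epochs):
        for i, j, r in ratings:
            err = r - U[i] @ V[j]
            u_old = U[i].copy()
            # Gradient step on squared error with L2 regularization.
            U[i] += lr * (err * V[j] - lam * U[i])
            V[j] += lr * (err * u_old - lam * V[j])
    return U, V

def rmse(ratings, U, V):
    """Root mean squared error over the given observed entries."""
    return float(np.sqrt(np.mean([(r - U[i] @ V[j]) ** 2 for i, j, r in ratings])))
```

Only observed entries are stored and visited, so the missing cells of a sparse matrix never enter the computation.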
     35 
     36=== nnma === 
     37Nonnegative matrix approximation. A method for dimensionality reduction that respects the nonnegativity of the input data. Uses a multiplicative iterative scheme.[[BR]] 
     38Reference: (Sra, 2006). 
     39 
     40=== psmf === 
     41Probabilistic Sparse MF. PSMF allows for varying levels of sensor noise, uncertainty in the hidden prototypes used to explain the data and uncertainty as to the prototypes selected to explain each data vector.[[BR]] 
     42Reference: (Dueck, 2005). 
     43 
     44=== bd === 
     45Bayesian decomposition. A Bayesian treatment of NMF, based on a normal likelihood and exponential priors, with a Gibbs sampler to approximate the posterior density.[[BR]] 
     46Reference: (Schmidt, 2003). 
     47 
     48=== bfrm === 
     49Bayesian factor regression model. Markov chain Monte Carlo technique.[[BR]] 
     50Reference: (Schmidt, 2003). 
     51 
     52=== i-nmf === 
     53Interval-valued NMF.[[BR]] 
     54Reference: (Shen, 2010). 
     55 
     56=== i-pmf === 
     57Interval-valued PMF.[[BR]] 
     58Reference: (Shen, 2010). 
     59 
     60 
     61== Milestones == 
     62 
     63'''April 20 -- May 23 (Before the official coding time)''' 
     64 * Do some exploratory coding to deepen my understanding of the techniques. 
     65 * Become clear about the implementations, design and approaches I will follow. 
     66 
     67'''May 23 -- June 18 (Official coding period starts)''' 
     68 * Interface to perform all algorithms and combine them with initialization methods and extensions. 
     69 * Implementing family of NMF techniques: NMF, nsNMF, lsNMF. 
     70 
     71'''June 18 -- July 5''' 
     72 * Implementing family of NMF techniques: sNMF, lNMF, NNMA. 
     73 
     74'''July 5 -- July 15''' 
     75 * Extend NMF implementations with various extensions, additive and multiplicative update rules and initialization methods. 
     76 * Provide factorization quality measures: measure based on cophenetic correlation coefficient, sparseness, dispersion, residuals.  
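
Two of the quality measures listed above can be sketched as follows, assuming NumPy. The sparseness formula is the common norm-ratio definition, (sqrt(n) - L1/L2) / (sqrt(n) - 1), and the residual measure is the residual sum of squares; whether the library uses exactly these definitions is an assumption, and the function names are hypothetical.

```python
import numpy as np

def sparseness(x):
    """Norm-ratio sparseness in [0, 1]; needs at least 2 entries and x != 0."""
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    l1 = np.abs(x).sum()
    l2 = np.sqrt((x ** 2).sum())
    return float((np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1.0))

def rss(V, W, H):
    """Residual sum of squares between the target V and the estimate W @ H."""
    V = np.asarray(V, dtype=float)
    return float(np.sum((V - W @ H) ** 2))
```

The sparseness is 1 for a vector with a single nonzero entry and 0 for a vector whose entries are all equal.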
     77 
     78'''July 15th mid-term evaluation deadline''' 
     79 
     80'''July 15 -- July 25''' 
     81 * Implement Bayesian methods: Bayesian decomposition using a Gibbs sampler, and MCMC techniques such as Bayesian factor regression modeling. 
     82 * Handling PMF on large, sparse and unbalanced datasets (algorithm for probabilistic sparse matrix factorization (PSMF)). 
     83 
     84'''July 25 -- July 31''' 
     85 * Adapt PMF model to the interval-valued matrices and implement Interval-valued PMF (I-PMF) and Interval-valued NMF (I-NMF). 
     86 * Improve efficiency of the code, bug removal, exception handling, additional testing. 
     87 
     88'''August 1 -- August 15''' 
     89 * Write documentation. 
     90 * Devise working examples that demonstrate various types of applications. 
     91 
     92'''August 15 -- August 22''' 
     93 * A buffer for unpredictable delays. 
     94 
     95'''Redundant time''' 
     96 * Extend Bayesian methods (variational BD, linearly constrained BD). 
     97 
     98'''August 26th final evaluation deadline''' 
     99 
     100== References ==  
     101 * Brunet, J. P., Tamayo, P., Golub, T. R., and Mesirov, J. P. Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci USA, 2004, 101(12), 4164--4169. 
     102 * Kim, H., Park, H. Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis. Bioinformatics (Oxford, England), 2007, 23(12), 1495--502. 
     103 * Pascual-Montano, A., Carazo, J. M., Kochi, K., Lehmann, D., and Pascual-Marqui, R. D. Nonsmooth nonnegative matrix factorization (nsNMF). IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(3), 403--415. 
     104 * Wang, Y., Turk, M. Fisher non-negative matrix factorization for learning local features, 2004. 
     105 * Salakhutdinov, R., Mnih, A. Probabilistic Matrix Factorization Learning, ICML, 2008. 
     106 * Sra, S., Dhillon, I. S. Nonnegative Matrix Approximation: Algorithms and Applications. Sciences-New York, 2006, 1--36. 
     107 * Zhiyong Shen, Liang Du, Xukun Shen, Yidong Shen, Interval-valued Matrix Factorization with Applications, Data Mining, IEEE International Conference on, pp. 1037--1042, 2010 IEEE International Conference on Data Mining, 2010. 
     108 * Dueck, D., Morris, Q. D., Frey, B. J. Multi-way clustering of microarray data using probabilistic sparse matrix factorization. Bioinformatics (Oxford, England), 21 Suppl 1, 2005, 144--51. 
     109 * Schmidt, M. N., Winther, O., Kai Hansen, L. Bayesian non-negative matrix factorization. Bayesian Statistics 7 (Oxford), 2003. 
     110 * Ochs, M. F., Kossenkov, A. V. NIH Public Access. Methods, Methods Enzymol., 2009, 467: 59--77.