# SCIENTIA SINICA Informationis, Volume 48, Issue 12: 1622-1633(2018) https://doi.org/10.1360/N112017-00238

## Information compression based on principal component analysis: from one-order to higher-order

- Accepted: Jan 10, 2018
- Published: Sep 19, 2018

### Abstract

This paper surveys principal component analysis (PCA), a statistical technique for information compression, from first-order to higher-order data. It examines the characteristics and limitations of PCA at each order from three perspectives and points out possible research directions. First, the technique and its existing extensions are summarized within a common framework, and their shared statistical meaning is analyzed; the common structure is the Pythagorean theorem, which yields two equivalent formulations of PCA: "maximizing variability" and "minimizing square loss." Second, three important perspectives on PCA are analyzed. The first starts from the Pythagorean theorem and shows that PCA can be extended to more general loss functions, yielding "robust" or "sparse" PCA. The second reveals the relationship between tensor decomposition and PCA, which suggests a new way of constructing tensor decompositions from the PCA perspective. The third shows that higher-order PCA is limited in its ability to reveal anisotropic structure, and points out that a new method, "deep PCA," can overcome this limitation.
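The equivalence of the two formulations mentioned above can be checked numerically: the direction that maximizes projected variance (the leading eigenvector of the covariance matrix) coincides, up to sign, with the direction that minimizes squared reconstruction error (the top right singular vector). A minimal NumPy sketch on synthetic data (the data and variable names here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
# Anisotropic Gaussian cloud: most variance lies along the first axis.
X = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.5])
X -= X.mean(axis=0)  # center the data, as PCA assumes

# "Maximizing variability": leading eigenvector of the covariance matrix.
cov = X.T @ X / len(X)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
v_max = eigvecs[:, -1]                   # direction of maximal projected variance

# "Minimizing square loss": best rank-1 approximation via the SVD.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
v_min = Vt[0]                            # direction minimizing squared reconstruction error

# The two directions agree up to sign.
assert abs(abs(v_max @ v_min) - 1.0) < 1e-6

# Pythagorean decomposition: total sum of squares splits into the
# projected (explained) part plus the residual squared loss.
proj = (X @ v_min) ** 2
resid = ((X - np.outer(X @ v_min, v_min)) ** 2).sum()
assert np.isclose(proj.sum() + resid, (X ** 2).sum())
```

The same identity underlies the extensions discussed in the paper: replacing the squared loss with a robust or sparsity-inducing loss gives robust or sparse PCA, respectively.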

Copyright 2020 Science China Press Co., Ltd. All rights reserved.