Principal Component Analysis on non-Gaussian Dependent Data

Abstract

In this paper, we analyze the performance of a semiparametric principal component analysis named Copula Component Analysis (COCA) (Han & Liu, 2012) when the data are dependent. The semiparametric model assumes that, after unspecified marginally monotone transformations, the distributions are multivariate Gaussian. We study the scenario where the observations are drawn from non-i.i.d. processes (m-dependency or a more general φ-mixing case). We show that COCA can allow weak dependence. In particular, we provide the generalization bounds of convergence for both support recovery and parameter estimation of COCA for the dependent data. We provide explicit sufficient conditions on the degree of dependence, under which the parametric rate can be maintained. To our knowledge, this is the first work analyzing the theoretical performance of PCA for the dependent data in high dimensional settings. Our results strictly generalize the analysis in Han & Liu (2012) and the techniques we used have the separate interest for analyzing a variety of other multivariate statistical methods.
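To make the model in the abstract concrete, here is a minimal sketch of a rank-based principal direction estimate under a Gaussian copula, applied to 1-dependent observations. It is an illustration under stated assumptions, not the authors' implementation: it uses Kendall's tau with the transform sin(π/2 · τ) as the latent correlation estimate, omits the sparsity constraint that high-dimensional support recovery would need, and all function names (`simulate_one_dependent`, `latent_correlation`, `leading_direction`) as well as the MA(1) construction and the choice of marginal transforms are hypothetical.

```python
import numpy as np


def simulate_one_dependent(n, sigma, rng, theta=0.5):
    """Draw n observations that are 1-dependent over time (assumed MA(1) setup).

    Latent rows are Gaussian with covariance `sigma`; two columns are then
    passed through monotone maps, so the observed marginals are non-Gaussian.
    """
    d = sigma.shape[0]
    eps = rng.multivariate_normal(np.zeros(d), sigma, size=n + 1)
    # Moving average of order 1: rows t and t+2 share no innovations.
    z = (eps[1:] + theta * eps[:-1]) / np.sqrt(1.0 + theta ** 2)
    x = z.copy()
    x[:, 0] = np.exp(z[:, 0])   # monotone transform of column 0
    x[:, 1] = z[:, 1] ** 3      # monotone transform of column 1
    return x


def latent_correlation(x):
    """Estimate the latent correlation matrix from Kendall's tau.

    Under a Gaussian copula, the latent correlation equals sin(pi/2 * tau),
    and tau depends only on ranks, so the marginal transforms drop out.
    """
    n = x.shape[0]
    # signs[i, j, k] = sign(x[i, k] - x[j, k]) for every ordered pair (i, j).
    signs = np.sign(x[:, None, :] - x[None, :, :])
    tau = np.einsum("ijk,ijl->kl", signs, signs) / (n * (n - 1))
    return np.sin(0.5 * np.pi * tau)


def leading_direction(r):
    """Return the unit eigenvector for the largest eigenvalue of r."""
    _, eigvecs = np.linalg.eigh(r)  # eigenvalues come back in ascending order
    return eigvecs[:, -1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 4
    sigma = 0.6 * np.ones((d, d)) + 0.4 * np.eye(d)  # equicorrelated latent model
    x = simulate_one_dependent(500, sigma, rng)
    u_hat = leading_direction(latent_correlation(x))
    u_true = np.ones(d) / np.sqrt(d)
    print("absolute cosine with true direction:", abs(u_hat @ u_true))
```

The pairwise-sign computation uses O(n² d) memory, which is fine for a small demonstration but would need a different implementation for large samples.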

Authors

Fang Han and Han Liu

Journal

Proceedings of the 30th International Conference on Machine Learning

Paper Publication Date

2013

Paper Type

Astrostatistics