Abstract
In this paper, we analyze the performance of a semiparametric principal component analysis named Copula Component Analysis (COCA) (Han & Liu, 2012) when the data are dependent. The semiparametric model assumes that, after unspecified marginally monotone transformations, the distributions are multivariate Gaussian. We study the scenario where the observations are drawn from non-i.i.d. processes (m-dependency or a more general φ-mixing case). We show that COCA can allow weak dependence. In particular, we provide the generalization bounds of convergence for both support recovery and parameter estimation of COCA for the dependent data. We provide explicit sufficient conditions on the degree of dependence, under which the parametric rate can be maintained. To our knowledge, this is the first work analyzing the theoretical performance of PCA for the dependent data in high dimensional settings. Our results strictly generalize the analysis in Han & Liu (2012) and the techniques we used have the separate interest for analyzing a variety of other multivariate statistical methods.
Author
Fang Han and Han Liu
Journal
Proceedings of the 30th International Conference on Machine Learning
Paper Publication Date
2013
Paper Type
Astrostatistics