Stability of principal components under normal and non-normal parent populations and different covariance structures scenarios

Research output: Contribution to journal › Article › peer-review

Abstract

Principal Component Analysis (PCA) is one of the most widely used multivariate techniques for dimension reduction, and it has gained particular relevance as large datasets become increasingly common. Because it is mainly used as a descriptive/exploratory tool, it does not require any explicit a priori assumption. However, even when the parent population is mischaracterized or unknown, sample principal components are often used to characterize the parent population structure, as they are frequently used to visualize multivariate datasets on a 2D graphical display or to infer the first two latent dimensions. In this context, although the main goal might not be inferential, sample principal components may fail to provide a valid solution, since they may vary considerably depending on the sample extracted. The stability of the PCA solution is studied here considering normal and non-normal parent populations and three covariance structure scenarios. In addition, the effects of the covariance parameter, the dimension, and the sample size are investigated via Monte Carlo simulations. This study aims to understand how stability varies with population and sample features, to characterize the conditions under which PCA results are expected to be stable, and to study a sample criterion for PCA stability.
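
As a rough illustration of the kind of Monte Carlo experiment the abstract describes, the following Python sketch estimates how closely the first sample principal component tracks its population counterpart. Everything specific here is an assumption made for illustration and is not taken from the article: the compound-symmetry covariance (`compound_symmetry`), the multivariate t distribution with 5 degrees of freedom as the non-normal case, and the mean absolute cosine between sample and population eigenvectors as the stability measure. The article's own covariance structures and stability criterion may differ.

```python
import numpy as np


def compound_symmetry(p, rho):
    """Hypothetical covariance: unit variances, common correlation rho."""
    return (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))


def first_pc(cov):
    """Unit-norm eigenvector for the largest eigenvalue of a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -1]


def stability(p, rho, n, heavy_tailed=False, reps=200, seed=0):
    """Mean |cosine| between sample and population first PCs over `reps` samples.

    Values near 1 indicate a stable first component; this measure is an
    illustrative assumption, not the criterion studied in the article.
    """
    rng = np.random.default_rng(seed)
    sigma = compound_symmetry(p, rho)
    chol = np.linalg.cholesky(sigma)
    target = first_pc(sigma)
    cosines = np.empty(reps)
    for r in range(reps):
        # Normal sample with covariance sigma
        x = rng.standard_normal((n, p)) @ chol.T
        if heavy_tailed:
            # Assumed non-normal case: multivariate t with 5 degrees of freedom,
            # obtained by rescaling each row with sqrt(df / chi-square_df)
            df = 5
            x = x * np.sqrt(df / rng.chisquare(df, size=(n, 1)))
        sample_cov = np.cov(x, rowvar=False)
        # The sign of an eigenvector is arbitrary, so compare absolute cosines
        cosines[r] = abs(first_pc(sample_cov) @ target)
    return cosines.mean()


if __name__ == "__main__":
    for rho in (0.1, 0.8):
        for n in (20, 200):
            print(rho, n,
                  round(stability(5, rho, n), 3),
                  round(stability(5, rho, n, heavy_tailed=True), 3))
```

Under these assumptions, varying `rho`, `p`, `n`, and `heavy_tailed` mirrors the factors the abstract lists (covariance parameter, dimension, sample size, and parent distribution), with larger sample sizes and stronger correlation expected to give cosines closer to 1.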

Original language: English
Number of pages: 18
Journal: Journal of Statistical Computation and Simulation
Publication status: E-pub ahead of print - 7 Oct 2022

Keywords

  • Principal components
  • eigenvectors
  • nonnormality
  • simulation
  • stability
