Component retention in principal component analysis with application to cDNA microarray data

  • University of Arizona
Research Output: Contribution to journal Article Peer-review

Open access

Abstract

Shannon entropy is used to provide an estimate of the number of interpretable components in a principal component analysis. In addition, several ad hoc stopping rules for dimension determination are reviewed, and a modification of the broken stick model is presented. The modification incorporates a test for the presence of an "effective degeneracy" among the subspaces spanned by the eigenvectors of the correlation matrix of the data set, and then allocates the total variance among the subspaces. A summary of the performance of the methods, applied both to published microarray data sets and to simulated data, is given.
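
The abstract does not spell out the estimators, so the following is only a minimal sketch under stated assumptions, not the article's method. It shows the classical broken stick rule (without the degeneracy-aware modification described above) and one common way to turn Shannon entropy into a component count, namely the exponential of the entropy of the normalized eigenvalues. The function names are hypothetical, and the input is assumed to be the eigenvalues of a correlation matrix.

```python
import numpy as np


def broken_stick(p):
    """Expected variance proportions under the classical broken stick model.

    b_k = (1/p) * sum_{i=k}^{p} 1/i  for k = 1..p (returned in that order).
    """
    inv = 1.0 / np.arange(1, p + 1)
    # Reverse cumulative sum gives sum_{i=k}^{p} 1/i for each k.
    return np.cumsum(inv[::-1])[::-1] / p


def retained_by_broken_stick(eigvals):
    """Count leading components whose variance share exceeds the expectation."""
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    share = ev / ev.sum()
    expected = broken_stick(len(ev))
    k = 0
    for s, b in zip(share, expected):
        if s <= b:  # stop at the first component that fails the rule
            break
        k += 1
    return k


def entropy_component_count(eigvals):
    """Shannon entropy of normalized eigenvalues and exp(entropy).

    Assumption: exp(H) is used here as an 'effective number' of components;
    the article's own entropy-based estimator may be defined differently.
    """
    ev = np.asarray(eigvals, dtype=float)
    share = ev / ev.sum()
    share = share[share > 0]  # treat 0 * log(0) as 0
    entropy = -np.sum(share * np.log(share))
    return entropy, np.exp(entropy)
```

For example, with the hypothetical eigenvalues `[2.8, 0.8, 0.3, 0.1]`, only the first component exceeds its broken stick expectation, while four equal eigenvalues give an entropy-based count of four.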