statsmodels.multivariate.pca.pca

statsmodels.multivariate.pca.pca(data, ncomp=None, standardize=True, demean=True, normalize=True, gls=False, weights=None, method='svd')
Principal Component Analysis
Parameters:  data (array) – Variables in columns, observations in rows.
 ncomp (int, optional) – Number of components to return. If None, returns as many as the smaller of the number of rows or columns of data.
 standardize (bool, optional) – Flag indicating to use standardized data with mean 0 and unit variance. Setting standardize to True implies demean.
 demean (bool, optional) – Flag indicating whether to demean data before computing principal components. demean is ignored if standardize is True.
 normalize (bool, optional) – Indicates whether to normalize the factors to have unit inner product. If False, the loadings will have unit inner product.
 weights (array, optional) – Series weights to use after transforming data according to standardize or demean when computing the principal components.
 gls (bool, optional) – Flag indicating to implement a two-step GLS estimator. In the first step, principal components are used to estimate residuals; the inverse residual variance is then used as a set of weights to estimate the final principal components.
 method (str, optional) – Determines the linear algebra routine used. ‘svd’, the default, uses a singular value decomposition. ‘eig’ uses an eigenvalue decomposition.
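The preprocessing flags and the two linear algebra routes described above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the statsmodels implementation: the helper name `transform`, the synthetic data, and the use of the population standard deviation are all choices made for this example. It shows that standardize implies demean, and that the squared singular values of the transformed data match the eigenvalues of its cross-product matrix.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 50 observations, 4 variables
rng = np.random.default_rng(0)
x = rng.standard_normal((50, 4)) * np.array([1.0, 2.0, 5.0, 0.5]) + 3.0


def transform(data, standardize=True, demean=True):
    """Hypothetical helper mirroring the documented flags.

    standardize=True implies demeaning, so demean is only consulted
    when standardize is False.
    """
    if standardize:
        return (data - data.mean(axis=0)) / data.std(axis=0)
    if demean:
        return data - data.mean(axis=0)
    return data


z = transform(x, standardize=True)

# 'svd' route: squared singular values of the transformed data
_, s, _ = np.linalg.svd(z, full_matrices=False)
eigvals_svd = s ** 2

# 'eig' route: eigenvalues of the cross-product matrix z'z
eigvals_eig, _ = np.linalg.eigh(z.T @ z)
eigvals_eig = eigvals_eig[::-1]  # eigh returns ascending order

# Both routes should agree up to floating-point error
same = np.allclose(eigvals_svd, eigvals_eig)
print(same)
```

Both routes recover the same spectrum here, so in a sketch like this the choice of method mainly affects numerical behavior and speed rather than the result.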
Returns:  factors (array or DataFrame) – nobs by ncomp array of principal components (also known as scores)
 loadings (array or DataFrame) – ncomp by nvar array of principal component loadings for constructing the factors
 projection (array or DataFrame) – nobs by nvar array containing the projection of the data onto the ncomp estimated factors
 rsquare (array or Series) – ncomp array where the element in the ith position is the R-square of including the first i principal components. The values are calculated on the transformed data, not the original data.
 ic (array or DataFrame) – ncomp by 3 array containing the Bai and Ng (2003) information criteria. Each column is a different criterion, and each row represents the number of included factors.
 eigenvals (array or Series) – nvar array of eigenvalues
 eigenvecs (array or DataFrame) – nvar by nvar array of eigenvectors
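As a rough usage illustration, the sketch below calls pca on synthetic data and unpacks the seven return values in the order listed above. The random data, the seed, and the choice of ncomp=2 are assumptions made for this example only.

```python
import numpy as np
from statsmodels.multivariate.pca import pca

# Synthetic data (an assumption for illustration): 100 observations, 5 variables
rng = np.random.default_rng(12345)
data = rng.standard_normal((100, 5))

# Unpack the return values in the documented order
factors, loadings, projection, rsquare, ic, eigenvals, eigenvecs = pca(
    data, ncomp=2, standardize=True, normalize=True, method="svd"
)

print(factors.shape)         # scores: one row per observation
print(projection.shape)      # projection of the data onto the retained factors
print(np.asarray(rsquare))   # fit as components are added

# With normalize=True the factors are expected to have unit inner product
gram = factors.T @ factors
print(np.round(gram, 6))
```

Passing a pandas DataFrame instead of an array should yield DataFrame and Series outputs, as the return types above indicate.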
Notes
This is a simple function wrapper around the PCA class. See PCA for more information and additional methods.
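The relationship between the function and the class can be checked with a short sketch, assuming the same arguments are passed to both. The synthetic data and seed are assumptions for illustration; `project` is one of the additional methods available on a PCA instance.

```python
import numpy as np
from statsmodels.multivariate.pca import PCA, pca

# Synthetic data (an assumption for illustration): 60 observations, 4 variables
rng = np.random.default_rng(0)
data = rng.standard_normal((60, 4))

# Function interface: returns a tuple of results
factors, loadings, projection, *_ = pca(data, ncomp=2)

# Class interface: the same results are available as attributes
pc = PCA(data, ncomp=2)

matches = (
    np.allclose(factors, pc.factors)
    and np.allclose(loadings, pc.loadings)
    and np.allclose(projection, pc.projection)
)
print(matches)

# The instance can also re-project the data using fewer components
one_factor = pc.project(ncomp=1)
print(one_factor.shape)
```

The class is the more convenient choice when results are needed for several component counts, since the decomposition is computed once and reused.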