class statsmodels.multivariate.factor.Factor(endog=None, n_factor=1, corr=None, method='pa', smc=True, endog_names=None, nobs=None, missing='drop')

Factor analysis

  • endog (array-like) – Variables in columns, observations in rows. May be None if corr is not None.

  • n_factor (int) – The number of factors to extract.

  • corr (array-like) – Directly specify the correlation matrix instead of estimating it from endog. If provided, endog is not used for the factor analysis, but it may still be used in post-estimation.

  • method (str) – The factor extraction method. Currently must be either ‘pa’ (principal axis factor analysis) or ‘ml’ (maximum likelihood estimation).

  • smc (bool) – Whether to use squared multiple correlations as the initial communality estimates. Only used when method=’pa’.

  • endog_names (str) – Names of endogenous variables. If specified, these are used instead of the column names in endog.

  • nobs (int) – The number of observations, not used if endog is present. Needs to be provided for inference if endog is None.

  • missing ('none', 'drop', or 'raise') – Missing value handling for endog. The default, ‘drop’, is row-wise deletion. If ‘none’, no nan checking is done. If ‘drop’, any observations with nans are dropped. If ‘raise’, an error is raised.
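As a minimal sketch of the two ways to supply input described above, the example below builds one model from raw data (endog) and one from a precomputed correlation matrix (corr plus nobs). The synthetic data, the variable names, and the loading pattern are assumptions made for illustration only.

```python
import numpy as np
from statsmodels.multivariate.factor import Factor

rng = np.random.default_rng(0)

# Synthetic data (illustrative assumption): 500 observations of 6 variables.
# Variables 0-2 load on one latent factor, variables 3-5 on another.
n_obs = 500
true_loadings = np.zeros((6, 2))
true_loadings[:3, 0] = 0.8
true_loadings[3:, 1] = 0.8
latent = rng.standard_normal((n_obs, 2))
noise = 0.6 * rng.standard_normal((n_obs, 6))
endog = latent @ true_loadings.T + noise

# Option 1: pass the raw data; the correlation matrix is estimated from endog.
res_data = Factor(endog=endog, n_factor=2, method="pa").fit()

# Option 2: pass a correlation matrix directly. nobs is supplied because
# there is no endog from which to count observations.
corr = np.corrcoef(endog, rowvar=False)
res_corr = Factor(corr=corr, n_factor=2, method="pa", nobs=n_obs).fit()

# Both routes start from the same correlation matrix, so the estimated
# loadings should agree.
print(res_data.loadings.shape)
print(np.allclose(res_data.loadings, res_corr.loadings))
```

Passing corr is convenient when only a published correlation table is available; in that case nobs is what makes later inference possible.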



Supported rotations: ‘varimax’, ‘quartimax’, ‘biquartimax’, ‘equamax’, ‘oblimin’, ‘parsimax’, ‘parsimony’, ‘biquartimin’, ‘promax’
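A rotation from the list above can be applied to a fitted result. The sketch below assumes the results object returned by fit() exposes rotate(method), loadings, and rotation_method, and it uses synthetic data invented for the example. Because varimax is an orthogonal rotation, each variable's communality (the row sum of squared loadings) should be unchanged by it.

```python
import numpy as np
from statsmodels.multivariate.factor import Factor

rng = np.random.default_rng(1)

# Synthetic two-factor data (illustrative assumption).
n_obs = 500
true_loadings = np.zeros((6, 2))
true_loadings[:3, 0] = 0.8
true_loadings[3:, 1] = 0.8
endog = (rng.standard_normal((n_obs, 2)) @ true_loadings.T
         + 0.6 * rng.standard_normal((n_obs, 6)))

res = Factor(endog=endog, n_factor=2, method="pa").fit()

# Keep a copy of the unrotated loadings, then rotate the results in place.
unrotated = res.loadings.copy()
res.rotate("varimax")
rotated = res.loadings

# Communalities before and after the orthogonal rotation.
comm_before = (unrotated ** 2).sum(axis=1)
comm_after = (rotated ** 2).sum(axis=1)
print(np.allclose(comm_before, comm_after))
```

Oblique rotations such as ‘oblimin’ or ‘promax’ allow correlated factors, so this communality check applies only to the orthogonal choices.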

If method=’ml’, the factors are rotated to satisfy condition IC3 of Bai and Li (2012). This means that the scores have covariance I, so the model for the covariance matrix is L * L’ + diag(U), where L is the matrix of loadings and U is the vector of uniquenesses. In addition, L’ * diag(U)^{-1} * L must be diagonal.
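The diagonality condition can be illustrated with NumPy alone. The sketch below is not the library's internal code: it takes hypothetical loadings L0 and uniquenesses U, rotates L0 by the eigenvectors of L0’ * diag(U)^{-1} * L0, and checks that the rotated loadings satisfy the condition while leaving the implied covariance L * L’ + diag(U) unchanged.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical unrotated solution: 5 variables, 2 factors.
L0 = rng.standard_normal((5, 2))
U = rng.uniform(0.2, 0.8, size=5)  # uniquenesses, strictly positive

# L0' diag(U)^{-1} L0 is symmetric, so its eigenvectors form an
# orthogonal matrix Q that diagonalizes it.
M0 = L0.T @ (L0 / U[:, None])
_, Q = np.linalg.eigh(M0)
L = L0 @ Q

# After rotation, L' diag(U)^{-1} L should have zero off-diagonal entries.
M = L.T @ (L / U[:, None])
off_diag = M - np.diag(np.diag(M))

# An orthogonal rotation leaves L L' unchanged, so the model-implied
# covariance is the same before and after.
cov_before = L0 @ L0.T + np.diag(U)
cov_after = L @ L.T + np.diag(U)

print(np.allclose(off_diag, 0.0, atol=1e-10))
print(np.allclose(cov_before, cov_after))
```

The condition removes the rotational indeterminacy of the loadings, which is why a maximum likelihood fit can report a single solution.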



Hofacker, C. (2004). Exploratory Factor Analysis, Mathematical Marketing.

Bai, J. and Li, K. (2012). Statistical analysis of factor models of high dimension. Annals of Statistics.


fit([maxiter, tol, start, opt_method, opt, …])

Estimate factor model parameters.

from_formula(formula, data[, subset, drop_cols])

Create a Model from a formula and dataframe.


loglike(par)

Evaluate the log-likelihood function.

predict(params[, exog])

After a model has been fit, predict returns the fitted values.


score(par)

Evaluate the score function (first derivative of loglike).



endog_names

Names of endogenous variables


exog_names

Names of exogenous variables