statsmodels.stats.correlation_tools.cov_nearest_factor_homog

statsmodels.stats.correlation_tools.cov_nearest_factor_homog(cov, rank)

Approximate an arbitrary square matrix with a factor-structured matrix of the form k*I + XX’.

Parameters
  • cov (array-like) – The input array; it must be square but need not be positive semidefinite

  • rank (positive integer) – The rank of the fitted factor structure

Returns

A FactoredPSDMatrix instance containing the fitted matrix

Return type

FactoredPSDMatrix

Notes

This routine is useful if one has an estimated covariance matrix that is not symmetric positive definite (SPD), and the ultimate goal is to estimate the inverse, square root, or inverse square root of the true covariance matrix. The factor structure allows these tasks to be performed without constructing any n x n matrices.
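To illustrate why the factor structure avoids n x n matrices, here is a minimal NumPy sketch that solves (k*I + XX')y = b with the Woodbury identity, using only n x r and r x r arrays. The helper name factored_solve is hypothetical and is not part of statsmodels; the sketch assumes k > 0 and is not necessarily how FactoredPSDMatrix is implemented.

```python
import numpy as np

def factored_solve(k, X, b):
    """Solve (k*I + X X') y = b without forming the n x n matrix.

    Hypothetical helper for illustration; assumes k > 0.
    Uses (k*I + X X')^{-1} = (1/k) * (I - X (k*I_r + X'X)^{-1} X').
    """
    r = X.shape[1]
    small = k * np.eye(r) + X.T @ X            # r x r system only
    return (b - X @ np.linalg.solve(small, X.T @ b)) / k

rng = np.random.default_rng(0)
n, r, k = 500, 3, 0.7
X = rng.standard_normal((n, r))
b = rng.standard_normal(n)
y = factored_solve(k, X, b)

# The dense matrix is built here only to verify the result.
dense = k * np.eye(n) + X @ X.T
print(np.allclose(dense @ y, b))
```

The cost is dominated by forming X'X, which is O(n r^2), rather than the O(n^3) of a dense solve.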

The calculations use the fact that if k is known, then X can be determined from the eigen-decomposition of cov - k*I, which can in turn be easily obtained from the eigen-decomposition of cov. Thus the problem can be reduced to a 1-dimensional search for k that does not require repeated eigen-decompositions.
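The reduction to a one-dimensional search can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the statsmodels source: it assumes a symmetric input, measures fit in the Frobenius norm, clips negative shifted eigenvalues to zero, and replaces the scalar optimizer with a plain grid search. The name fit_homog_factor and the n_grid parameter are hypothetical.

```python
import numpy as np

def fit_homog_factor(cov, rank, n_grid=2001):
    """Fit k*I + X X' to a symmetric matrix (illustrative sketch only)."""
    lam, Q = np.linalg.eigh(cov)               # single eigen-decomposition
    lam, Q = lam[::-1], Q[:, ::-1]             # sort in descending order

    def loss(k):
        shifted = lam - k                      # eigenvalues of cov - k*I
        kept = np.clip(shifted[:rank], 0.0, None)  # part captured by X X'
        resid = shifted.copy()
        resid[:rank] -= kept
        return np.sum(resid ** 2)              # squared Frobenius error

    # Grid search over k; no further eigen-decompositions are needed.
    grid = np.linspace(0.0, max(lam[0], 0.0), n_grid)
    k = grid[np.argmin([loss(v) for v in grid])]
    X = Q[:, :rank] * np.sqrt(np.clip(lam[:rank] - k, 0.0, None))
    return k, X

# Demo on a matrix that has exactly the assumed structure (k = 0.5, rank 2).
rng = np.random.default_rng(1)
X0 = rng.standard_normal((8, 2))
cov_demo = 0.5 * np.eye(8) + X0 @ X0.T
k_hat, X_hat = fit_homog_factor(cov_demo, 2)
```

Each evaluation of loss only shifts the stored eigenvalues, which is why the search over k is cheap once the decomposition is available.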

If the input matrix is sparse, then cov - k*I is also sparse, so the eigen-decomposition can be done efficiently using sparse routines.
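As a rough illustration of the sparse case, the sketch below uses scipy.sparse.linalg.eigsh to compute a few leading eigenpairs of a sparse symmetric matrix and checks that subtracting shift*I moves the eigenvalues by shift while leaving the eigenvectors unchanged. The test matrix and the variable names are made up for the example; whether statsmodels calls eigsh internally is not assumed here.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n, rank, shift = 300, 3, 0.4

# Made-up sparse symmetric matrix with three dominant eigenvalues.
diag = np.ones(n)
diag[:rank] = [10.0, 8.0, 6.0]
noise = 0.1 * sp.random(n, n, density=0.01, random_state=0, format="csr")
cov_sparse = (sp.diags(diag) + (noise + noise.T) / 2).tocsr()

# Leading eigenpairs of the original matrix.
lam, vecs = eigsh(cov_sparse, k=rank, which="LA")
idx = np.argsort(lam)
lam, vecs = lam[idx], vecs[:, idx]

# The shift only changes the diagonal, so the matrix stays sparse.
shifted = cov_sparse - shift * sp.identity(n, format="csr")
lam_shifted, vecs_shifted = eigsh(shifted, k=rank, which="LA")
idx = np.argsort(lam_shifted)
lam_shifted, vecs_shifted = lam_shifted[idx], vecs_shifted[:, idx]

print(np.allclose(lam - shift, lam_shifted, atol=1e-6))
```

Because the eigenvectors are shared, one sparse decomposition of cov is enough for every candidate value of the shift.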

The objective of the one-dimensional search over k is not convex, so the search may return a local minimum rather than the global minimum.

Examples

Hard thresholding a covariance matrix may result in a matrix that is not positive semidefinite. We can approximate a hard thresholded covariance matrix with a PSD matrix as follows:

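>>> import numpy as np
>>> from statsmodels.stats.correlation_tools import cov_nearest_factor_homog
>>> np.random.seed(1234)
>>> # Simulate a 100 x 10 data array driven by one common factor
>>> b = 1.5 - np.random.rand(10, 1)
>>> x = np.random.randn(100, 1).dot(b.T) + np.random.randn(100, 10)
>>> # np.cov treats each row of x as a variable, so cov is 100 x 100
>>> cov = np.cov(x)
>>> # Hard threshold: zero out entries with absolute value below 0.3
>>> cov = cov * (np.abs(cov) >= 0.3)
>>> # Fit a rank-3 factor structure; rslt is a FactoredPSDMatrix
>>> rslt = cov_nearest_factor_homog(cov, 3)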