statsmodels.tsa.stattools.acf
- statsmodels.tsa.stattools.acf(x, adjusted=False, nlags=None, qstat=False, fft=True, alpha=None, bartlett_confint=True, missing='none')
Calculate the autocorrelation function.
- Parameters:
- x : array_like
The time series data.
- adjusted : bool, default False
If True, then denominators for autocovariance are n-k, otherwise n.
- nlags : int, optional
Number of lags to return autocorrelation for. If not provided, uses min(10 * np.log10(nobs), nobs - 1). The returned value includes lag 0 (i.e., 1), so the size of the acf vector is (nlags + 1,).
- qstat : bool, default False
If True, returns the Ljung-Box q statistic for each autocorrelation coefficient. See q_stat for more information.
- fft : bool, default True
If True, computes the ACF via FFT.
- alpha : scalar, default None
If a number is given, the confidence intervals for the given level are returned. For instance, if alpha=.05, 95% confidence intervals are returned, where the standard deviation is computed according to Bartlett's formula.
- bartlett_confint : bool, default True
Confidence intervals for ACF values are generally placed at 2 standard errors around r_k. The formula used for the standard error depends on the situation.
If the autocorrelations are being used to test for randomness of residuals as part of the ARIMA routine, the standard errors are determined assuming the residuals are white noise. The approximate formula for any lag is that the standard error of each r_k is 1/sqrt(N). See section 9.4 of [2] for more details on the 1/sqrt(N) result. For a more elementary discussion, see section 5.3.2 in [3].
For the ACF of raw data, the standard error at lag k is found as if the right model were an MA(k-1). This allows the possible interpretation that if all autocorrelations past a certain lag are within the limits, the model might be an MA of the order defined by the last significant autocorrelation. In this case, a moving average model is assumed for the data, and the standard errors for the confidence intervals should be generated using Bartlett's formula. For more details on Bartlett's formula, see section 7.2 in [2].
- missing : str, default "none"
A string in ["none", "raise", "conservative", "drop"] specifying how NaNs are to be treated. "none" performs no checks. "raise" raises an exception if NaN values are found. "drop" removes the missing observations and then estimates the autocovariances, treating the non-missing observations as contiguous. "conservative" computes the autocovariance using nan-ops so that NaNs are removed when computing the mean and the cross-products that are used to estimate the autocovariance. When using "conservative", n is set to the number of non-missing observations.
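The effect of adjusted and the default for nlags can be illustrated with a small NumPy sketch of a direct estimator. This is not the statsmodels implementation: the function name acf_direct is hypothetical, and truncating the default lag count to an integer is an assumption made for the illustration.

```python
import numpy as np


def acf_direct(x, nlags=None, adjusted=False):
    """Illustrative direct ACF estimator (hypothetical helper, not statsmodels code)."""
    x = np.asarray(x, dtype=float)
    nobs = x.shape[0]
    if nlags is None:
        # Default described above; truncation to int is an assumption of this sketch.
        nlags = min(int(10 * np.log10(nobs)), nobs - 1)
    xd = x - x.mean()
    acov = np.empty(nlags + 1)
    for k in range(nlags + 1):
        # adjusted=True divides by n - k, otherwise by n.
        denom = (nobs - k) if adjusted else nobs
        acov[k] = np.dot(xd[: nobs - k], xd[k:]) / denom
    # Normalising by the lag-0 autocovariance makes the lag-0 value exactly 1.
    return acov / acov[0]


rng = np.random.default_rng(0)
x = rng.standard_normal(100)
r = acf_direct(x)                              # lags 0, 1, ..., nlags
r_adj = acf_direct(x, nlags=5, adjusted=True)  # n - k denominators
```

With n observations, the adjusted value at lag k equals the unadjusted value multiplied by n / (n - k), so the two estimates differ most at large lags.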
- Returns:
- acf : ndarray
The autocorrelation function for lags 0, 1, …, nlags. Shape (nlags+1,).
- confint : ndarray, optional
Confidence intervals for the ACF at lags 0, 1, …, nlags. Shape (nlags + 1, 2). Returned if alpha is not None.
- qstat : ndarray, optional
The Ljung-Box Q-statistic for lags 1, 2, …, nlags (excludes lag zero). Returned if qstat is True.
- pvalues : ndarray, optional
The p-values associated with the Q-statistics for lags 1, 2, …, nlags (excludes lag zero). Returned if qstat is True.
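The shapes of the optional return values follow from the formulas involved. The sketch below computes Bartlett-style and white-noise intervals and the Ljung-Box Q-statistic from a given vector of autocorrelations. The helper names acf_confint and ljung_box_q are hypothetical, the input values are made up for illustration, and the code is a reading of the textbook formulas rather than the library's own code. P-values are omitted because they require a chi-square survival function.

```python
from statistics import NormalDist

import numpy as np


def acf_confint(r, nobs, alpha=0.05, bartlett=True):
    """Intervals for lags 0..nlags; returns an array of shape (nlags + 1, 2)."""
    r = np.asarray(r, dtype=float)
    var = np.full(r.shape, 1.0 / nobs)  # white-noise variance 1/N at every lag
    var[0] = 0.0                        # lag 0 is exactly 1, so no uncertainty
    if bartlett and r.size > 2:
        # Bartlett: var(r_k) ~ (1 + 2 * sum_{j<k} r_j**2) / N for k >= 2
        var[2:] *= 1.0 + 2.0 * np.cumsum(r[1:-1] ** 2)
    half = NormalDist().inv_cdf(1.0 - alpha / 2.0) * np.sqrt(var)
    return np.column_stack([r - half, r + half])


def ljung_box_q(r, nobs):
    """Q_k = n (n + 2) * sum_{j=1..k} r_j**2 / (n - j) for k = 1..nlags."""
    r = np.asarray(r, dtype=float)[1:]  # drop lag 0
    lags = np.arange(1, r.size + 1)
    return nobs * (nobs + 2) * np.cumsum(r ** 2 / (nobs - lags))


r_example = np.array([1.0, 0.5, 0.2, 0.1])  # made-up autocorrelations, lags 0..3
ci = acf_confint(r_example, nobs=100)       # shape (4, 2), includes lag 0
q = ljung_box_q(r_example, nobs=100)        # shape (3,), excludes lag 0
```

The interval array has one more row than the Q-statistic array because the intervals include lag 0 while the Q-statistics start at lag 1.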
Notes
The acf at lag 0 (i.e., 1) is returned.
For very long time series, FFT convolution (fft=True) is recommended. When fft is False, a simple, direct estimator of the autocovariances is used that computes only the first nlags + 1 values. The direct estimator can be much faster when the time series is long and only a small number of autocovariances are needed.
If adjusted is True, the denominator for the autocovariance is adjusted for the loss of data.
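The two computation strategies mentioned above give the same autocovariances. The following NumPy sketch compares an FFT-based estimate with a direct one. Both function names are hypothetical and the code only illustrates the general technique; it is not taken from statsmodels.

```python
import numpy as np


def acovf_fft(x):
    """All n autocovariances at once via the FFT (denominator n)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    xd = x - x.mean()
    nfft = 2 * n  # zero-padding to at least 2n - 1 avoids circular wrap-around
    f = np.fft.rfft(xd, n=nfft)
    # The inverse transform of the power spectrum is the autocorrelation sequence.
    return np.fft.irfft(f * np.conj(f), n=nfft)[:n] / n


def acovf_direct(x, nlags):
    """Only the first nlags + 1 autocovariances, one dot product per lag."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    xd = x - x.mean()
    return np.array([np.dot(xd[: n - k], xd[k:]) / n for k in range(nlags + 1)])


rng = np.random.default_rng(1)
series = rng.standard_normal(500)
fft_vals = acovf_fft(series)[:11]
direct_vals = acovf_direct(series, nlags=10)
```

The FFT route always computes every lag, whereas the direct route does work proportional to the number of requested lags, which is why the direct route can win when only a few lags of a long series are needed.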
References
[1] Parzen, E., 1963. On spectral analysis with missing observations and amplitude modulation. Sankhya: The Indian Journal of Statistics, Series A, pp. 383-392.
[2] Brockwell and Davis, 1987. Time Series: Theory and Methods.
[3] Brockwell and Davis, 2010. Introduction to Time Series and Forecasting, 2nd edition.