statsmodels.stats.outliers_influence.OLSInfluence

class statsmodels.stats.outliers_influence.OLSInfluence(results)

class to calculate outlier and influence measures for OLS results

Parameters
results : RegressionResults

currently assumes the results are from an OLS regression

Notes

Some of the results can be calculated without any auxiliary regression (some of these have the _internal postfix in the name). Other statistics require leave-one-observation-out (LOOO) auxiliary regressions and will be slower (mainly results with the _external postfix in the name). From the auxiliary LOOO regressions, only the required results are stored.
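As an illustration of the distinction between results that need no auxiliary regression and results obtained from a LOOO loop, the sketch below computes PRESS residuals both ways with plain NumPy. It relies on the standard OLS identity that the leave-one-out prediction error equals resid / (1 - hat_diag). The data and all variable names are made up for the example, and this is not necessarily how the class computes these quantities internally.

```python
import numpy as np

# Simulated data, purely illustrative.
rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])  # design matrix with intercept
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Full-sample fit: everything below needs no auxiliary regression.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)
# Diagonal of the hat matrix X (X'X)^-1 X', one value per observation.
hat_diag = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
press_shortcut = resid / (1.0 - hat_diag)

# Brute-force LOOO loop: refit n times, each time dropping one observation.
press_loop = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    press_loop[i] = y[i] - X[i] @ beta_i

# The two routes agree up to floating-point error.
max_gap = np.max(np.abs(press_shortcut - press_loop))
print(max_gap)
```

The loop costs n extra fits, which is why measures that genuinely require LOOO results are slower on large data sets.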

Using the LOOO measures is currently only recommended if the data set is not too large. One possible approach for LOOO measures would be to identify possible problem observations with the _internal measures, and then run the leave-one-observation-out regressions only for observations that are possible outliers. (However, this is not yet available in an automated way.)
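A minimal sketch of that two-step approach, written by hand with NumPy because the class does not automate it. Internally studentized residuals are used for screening, and the leave-one-out refit is run only for flagged observations. The cutoff of 2.0, the planted outlier, and all variable names are assumptions made for the example.

```python
import numpy as np

# Simulated data with one planted outlier, purely illustrative.
rng = np.random.default_rng(1)
n = 40
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 0.5 + 1.5 * x + rng.normal(scale=0.5, size=n)
y[7] += 6.0  # gross outlier at index 7

k = X.shape[1]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
hat_diag = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
sigma2 = resid @ resid / (n - k)  # error variance from the full fit
stud_internal = resid / np.sqrt(sigma2 * (1.0 - hat_diag))

# Step 1: screen with the cheap internal measure.
threshold = 2.0  # hypothetical cutoff, not a library default
suspects = np.flatnonzero(np.abs(stud_internal) > threshold)

# Step 2: run the leave-one-observation-out refit only for the suspects.
stud_external = {}
for i in suspects:
    keep = np.arange(n) != i
    beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    resid_i = y[keep] - X[keep] @ beta_i
    sigma2_i = resid_i @ resid_i / (n - 1 - k)  # variance without obs i
    stud_external[int(i)] = resid[i] / np.sqrt(sigma2_i * (1.0 - hat_diag[i]))

print(stud_external)
```

Only len(suspects) auxiliary regressions are run instead of n.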

This should be extended to general least squares.

The leave-one-variable-out (LOVO) auxiliary regressions are currently not used.

Attributes
det_cov_params_not_obsi

determinant of cov_params of all LOOO regressions

params_not_obsi

parameter estimates for all LOOO regressions

Methods

cooks_distance()

Cook's distance

cov_ratio()

covariance ratio between LOOO and original

dfbeta()

dfbeta

dfbetas()

uses results from leave-one-observation-out loop

dffits()

dffits measure for influence of an observation, based on resid_studentized_external

dffits_internal()

dffits measure for influence of an observation, based on resid_studentized_internal

ess_press()

Error sum of squares of PRESS residuals

get_resid_studentized_external([sigma])

calculate studentized residuals

hat_diag_factor()

Factor of diagonal of hat_matrix used in influence

hat_matrix_diag()

Diagonal of the hat_matrix for OLS

influence()

Influence measure

plot_index([y_var, threshold, title, ax, idx])

index plot for influence attributes

plot_influence([external, alpha, criterion, …])

Plot of influence in regression.

resid_press()

PRESS residuals

resid_std()

estimate of standard deviation of the residuals

resid_studentized()

Studentized residuals using variance from OLS

resid_studentized_external()

Studentized residuals using LOOO variance

resid_studentized_internal()

Studentized residuals using variance from OLS

resid_var()

estimate of variance of the residuals

sigma2_not_obsi()

error variance for all LOOO regressions

summary_frame()

Creates a DataFrame with all available influence results.

summary_table([float_fmt])

create a summary table with all influence and outlier measures