statsmodels.stats.outliers_influence.GLMInfluence

class statsmodels.stats.outliers_influence.GLMInfluence(results, resid=None, endog=None, exog=None, hat_matrix_diag=None, cov_params=None, scale=None)

Influence and outlier measures (experimental)

Some of the formulas used here are specific to GLM. In particular, cooks_distance is based on the Hessian, i.e. the observed or expected information matrix, and not on cov_params, in contrast to MLEInfluence. Standardization of the changes in parameters, in fittedvalues, and in the linear predictor is based on cov_params.

Parameters
results : instance of results class

This only works for model and results classes that have the necessary helper methods.

The other arguments only override default behavior. When given, they are used
instead of the corresponding attribute of the results class.
By default, resid_pearson is used as resid.

Notes

This has not yet been tested for correctness when offset or exposure is used, although the code should support both.

Some GLM-specific measures, such as d_deviance, are still missing.

An explicit leave-one-observation-out (LOOO) loop is included, but no influence measures are currently computed from it.

Attributes
dfbetas

Change in parameters divided by the standard error of the parameters from the full model results, bse.

d_fittedvalues_scaled

Change in fittedvalues scaled by standard errors

d_linpred

Change in linear prediction

d_linpred_scaled

Local change in linear prediction scaled by the standard errors for the prediction based on cov_params.

Methods

cooks_distance()

Cook’s distance

d_fittedvalues()

Change in expected response, fittedvalues

d_params()

Change in parameter estimates

dfbetas()

Scaled change in parameter estimates

hat_matrix_diag()

Diagonal of the hat_matrix for GLM

params_one()

Parameter estimate based on one-step approximation

plot_index([y_var, threshold, title, ax, idx])

Index plot for influence attributes

plot_influence([external, alpha, criterion, …])

Plot of influence in regression.

resid_studentized()

Internally studentized Pearson residuals

summary_frame()

Creates a DataFrame with influence results.
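
The hat matrix diagonal, the internally studentized Pearson residuals, and Cook's distance listed above can be illustrated with plain NumPy. This is a sketch of the standard textbook one-step formulas for a Poisson model with log link, not a copy of the statsmodels implementation, which may differ in detail. The toy data, the hand-written IRLS loop, and all variable names are assumptions made for illustration.

```python
import numpy as np

# Toy Poisson regression with log link (illustrative data)
rng = np.random.default_rng(0)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.poisson(np.exp(0.5 + 0.3 * X[:, 1]))

# Fit by iteratively reweighted least squares (IRLS)
beta = np.zeros(X.shape[1])
for _ in range(25):
    mu = np.exp(X @ beta)
    z = X @ beta + (y - mu) / mu  # working response
    beta = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))

mu = np.exp(X @ beta)
# Hessian weights; for the canonical log link the observed and
# expected information coincide, so w = mu in both cases
w = mu

# Hat matrix diagonal: h_i = w_i * x_i' (X' W X)^{-1} x_i
xtwx_inv = np.linalg.inv(X.T @ (w[:, None] * X))
h = w * np.einsum("ij,jk,ik->i", X, xtwx_inv, X)

# Internally studentized Pearson residuals (Poisson: Var(mu) = mu, scale = 1)
r = (y - mu) / np.sqrt(mu) / np.sqrt(1.0 - h)

# Cook's distance from the one-step approximation
p = X.shape[1]
cooks_d = r**2 / p * h / (1.0 - h)

print(h.sum())  # trace of the hat matrix, equal to p up to rounding
```

Because the weights come from the Hessian, this version of Cook's distance does not depend on which covariance estimate is used for inference, which is the contrast with MLEInfluence drawn in the class description.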