statsmodels.stats.outliers_influence.GLMInfluence

class statsmodels.stats.outliers_influence.GLMInfluence(results, resid=None, endog=None, exog=None, hat_matrix_diag=None, cov_params=None, scale=None)

Influence and outlier measures (experimental)

This class partly uses formulas specific to GLM. In particular, cooks_distance is based on the hessian, i.e. the observed or expected information matrix, and not on cov_params, in contrast to MLEInfluence. Standardization of changes in parameters, in fittedvalues, and in the linear predictor is based on cov_params.

Parameters:
results : instance of results class

This only works for model and results classes that have the necessary helper methods.

Other arguments are only to override default behavior and are used instead of the corresponding attribute of the results class. By default resid_pearson is used as resid.

Attributes:

dfbetas

change in parameters divided by the standard error of parameters from the full model results, bse.

d_fittedvalues_scaled

same as d_fittedvalues but scaled by the standard errors of a predicted mean of the response.

d_linpred

local change in linear prediction.

d_linpred_scaled

local change in linear prediction scaled by the standard errors for the prediction based on cov_params.

Notes

This has not yet been tested for correctness when offset or exposure are used, although they should be supported by the code.

Some GLM specific measures like d_deviance are still missing.

Code for computing an explicit leave-one-observation-out (LOOO) loop is included, but no influence measures are currently computed from it.

Methods

plot_index([y_var, threshold, title, ax, idx])

index plot for influence attributes

plot_influence([external, alpha, criterion, ...])

Plot of influence in regression.

resid_score([joint, index, studentize])

Score observations scaled by inverse hessian.

resid_score_factor()

Score residual divided by sqrt of hessian factor.

summary_frame()

Creates a DataFrame with influence results.

Properties

cooks_distance

Cook's distance

d_fittedvalues

Change in expected response, fittedvalues.

d_fittedvalues_scaled

Change in fittedvalues scaled by standard errors.

d_linpred

Change in linear prediction

d_linpred_scaled

Change in linpred scaled by standard errors

d_params

Change in parameter estimates

dfbetas

Scaled change in parameter estimates.

hat_matrix_diag

Diagonal of the hat_matrix for GLM

hat_matrix_exog_diag

Diagonal of the hat_matrix using only exog as in OLS

params_one

Parameter estimate based on one-step approximation.

resid_studentized

Internally studentized Pearson residuals