statsmodels.stats.outliers_influence.GLMInfluence

class statsmodels.stats.outliers_influence.GLMInfluence(results, resid=None, endog=None, exog=None, hat_matrix_diag=None, cov_params=None, scale=None)

Influence and outlier measures (experimental)

Some of the formulas used here are specific to GLM. In particular, cooks_distance is based on the hessian, i.e. the observed or expected information matrix, and not on cov_params, in contrast to MLEInfluence. Standardization of the changes in parameters, in fittedvalues, and in the linear predictor is based on cov_params.

Parameters:
results : instance of results class

This only works for model and results classes that have the necessary helper methods.

The other arguments only override default behavior. When given, they are used instead of the corresponding attribute of the results class.
By default, resid_pearson is used as resid.

Notes

This has not yet been tested for correctness when offset or exposure is used, although both should be supported by the code.

Some GLM specific measures like d_deviance are still missing.

Computing an explicit leave-one-observation-out (LOOO) loop is included but no influence measures are currently computed from it.

Attributes:
dfbetas

Change in parameters divided by the standard error of the parameters from the full model results, bse.

d_fittedvalues_scaled

Change in fittedvalues scaled by standard errors.

d_linpred

Change in linear prediction.

d_linpred_scaled

Local change in linear prediction scaled by the standard errors for the prediction based on cov_params.

Methods

plot_index([y_var, threshold, title, ax, idx])

Index plot for influence attributes.

plot_influence([external, alpha, criterion, ...])

Plot of influence in regression.

resid_score([joint, index, studentize])

Score observations scaled by inverse hessian.

resid_score_factor()

Score residual divided by sqrt of hessian factor.

summary_frame()

Creates a DataFrame with influence results.

Properties

cooks_distance

Cook's distance.

d_fittedvalues

Change in expected response, fittedvalues.

d_fittedvalues_scaled

Change in fittedvalues scaled by standard errors.

d_linpred

Change in linear prediction.

d_linpred_scaled

Change in linear prediction (linpred) scaled by standard errors.

d_params

Change in parameter estimates.

dfbetas

Scaled change in parameter estimates.

hat_matrix_diag

Diagonal of the hat_matrix for GLM.

hat_matrix_exog_diag

Diagonal of the hat_matrix using only exog, as in OLS.

params_one

Parameter estimate based on one-step approximation.

resid_studentized

Internally studentized Pearson residuals.


Last update: Mar 18, 2024