statsmodels.stats.outliers_influence.MLEInfluence

class statsmodels.stats.outliers_influence.MLEInfluence(results, resid=None, endog=None, exog=None, hat_matrix_diag=None, cov_params=None, scale=None)

Global Influence and outlier measures (experimental)

Parameters:
results : instance of results class

This only works for model and results classes that have the necessary helper methods.

other arguments

These are only available to override the default behavior and are used instead of the corresponding attributes of the results class. By default, resid_pearson is used as resid.

Attributes:

hat_matrix_diag (hii)

The generalized leverage, computed as the local derivative of fittedvalues (predicted mean) with respect to the observed response for each observation. Not available for ZeroInflated models because of nondifferentiability.

d_params

Change in parameters computed with one Newton step using the full Hessian, corrected by division by (1 - hii). If hat_matrix_diag is not available, then the division by (1 - hii) is not included.

dfbetas

Change in parameters divided by the standard error of the parameters from the full model results, bse.

cooks_distance

Quadratic form for the change in parameters weighted by cov_params from the full model, divided by the number of variables. It includes p-values based on the F-distribution, which are only approximate outside of linear Gaussian models.

resid_studentized

In the general MLE case, resid_studentized is computed from the score residuals scaled by the hessian factor and leverage. This does not use cov_params.

d_fittedvalues

Local change of the expected mean given the change in the parameters as computed in d_params.

d_fittedvalues_scaled

Same as d_fittedvalues but scaled by the standard errors of a predicted mean of the response.

params_one

The one-step parameter estimate, computed as params from the full sample minus d_params.

Notes

MLEInfluence uses generic definitions based on maximum likelihood models.

MLEInfluence produces the same results as GLMInfluence for canonical links (verified for GLM Binomial, Poisson and Gaussian). There will be some differences for non-canonical links or if a robust cov_type is used. For example, the generalized leverage differs from the definition of the GLM hat matrix in the case of Probit, which corresponds to family Binomial with a non-canonical link.

The extension to non-standard models, e.g. multi-link models like BetaModel and the ZeroInflated models, is still experimental and might still change. Additionally, ZeroInflated and some threshold models have a nondifferentiability in the generalized leverage. How this case is treated might also change.

Warning: This currently does not work for constrained or penalized models, e.g. models estimated with fit_constrained or fit_regularized.

This has not yet been tested for correctness when offset or exposure are used, although they should be supported by the code.

Status: experimental. This class will need changes to support different kinds of models, e.g. extra parameters in discrete.NegativeBinomial or two-part models like ZeroInflatedPoisson.

Methods

plot_index([y_var, threshold, title, ax, idx])

Index plot for influence attributes.

plot_influence([external, alpha, criterion, ...])

Plot of influence in regression.

resid_score([joint, index, studentize])

Score observations scaled by inverse hessian.

resid_score_factor()

Score residual divided by sqrt of hessian factor.

summary_frame()

Creates a DataFrame with influence results.

Properties

cooks_distance

Cook's distance and p-values.

d_fittedvalues

Change in expected response, fittedvalues.

d_fittedvalues_scaled

Change in fittedvalues scaled by standard errors.

d_params

Approximate change in parameter estimates when an observation is dropped.

dfbetas

Scaled change in parameter estimates.

hat_matrix_diag

Diagonal of the generalized leverage.

hat_matrix_exog_diag

Diagonal of the hat_matrix using only exog, as in OLS.

params_one

Parameter estimate based on one-step approximation.

resid_studentized

Studentized default residuals.