statsmodels.stats.anova.anova_lm

statsmodels.stats.anova.anova_lm(*args, **kwargs)

Anova table for one or more fitted linear models.

Parameters:
  • args (fitted linear model results instance) – One or more fitted linear models
  • scale (float) – Estimate of variance. If None, it will be estimated from the largest model. Default is None.
  • test (str {"F", "Chisq", "Cp"} or None) – Test statistics to provide. Default is “F”.
  • typ (str or int {"I","II","III"} or {1,2,3}) – The type of Anova test to perform. See notes.
  • robust ({None, "hc0", "hc1", "hc2", "hc3"}) – Use heteroscedasticity-corrected coefficient covariance matrix. If robust covariance is desired, it is recommended to use hc3.
Returns:

anova – When args is a single model, the return value is a DataFrame with columns:

sum_sq : float64

Sum of squares for model terms.

df : float64

Degrees of freedom for model terms.

F : float64

F statistic value for significance of adding model terms.

PR(>F) : float64

P-value for significance of adding model terms.

When args contains multiple models, the return value is a DataFrame with columns:

df_resid : float64

Degrees of freedom of residuals in models.

ssr : float64

Sum of squares of residuals in models.

df_diff : float64

Difference in degrees of freedom from the previous model in args.

ss_diff : float64

Difference in ssr from the previous model in args.

F : float64

F statistic comparing to the previous model in args.

PR(>F) : float64

P-value for significance comparing to the previous model in args.

Return type:

DataFrame

Notes

Model statistics are given in the order of args. Models must have been fit using the formula api.

See also

model_results.compare_f_test, model_results.compare_lm_test

Examples

>>> import statsmodels.api as sm
>>> from statsmodels.formula.api import ols
>>> moore = sm.datasets.get_rdataset("Moore", "carData", cache=True) # load the Moore dataset
>>> data = moore.data
>>> data = data.rename(columns={"partner.status" :
...                             "partner_status"}) # make name pythonic
>>> moore_lm = ols('conformity ~ C(fcategory, Sum)*C(partner_status, Sum)',
...                 data=data).fit()
>>> table = sm.stats.anova_lm(moore_lm, typ=2) # Type 2 Anova DataFrame
>>> print(table)