statsmodels.stats.anova.anova_lm

statsmodels.stats.anova.anova_lm(*args, **kwargs)

ANOVA table for one or more fitted linear models.

Parameters:

*args (fitted linear model results instance): One or more fitted linear models.
scale (float): Estimate of variance. If None, it is estimated from the largest model. Default is None.
test (str {"F", "Chisq", "Cp"} or None): Test statistics to provide. Default is "F".
typ (str or int, {"I", "II", "III"} or {1, 2, 3}): The type of ANOVA test to perform. See notes.
robust ({None, "hc0", "hc1", "hc2", "hc3"}): Use a heteroscedasticity-corrected coefficient covariance matrix. If a robust covariance is desired, hc3 is recommended.

Returns:

anova (DataFrame)

When args is a single model, the return value is a DataFrame with columns:

sum_sq (float64): Sum of squares for model terms.
df (float64): Degrees of freedom for model terms.
F (float64): F statistic value for significance of adding model terms.
PR(>F) (float64): P-value for significance of adding model terms.

When args contains multiple models, the return value is a DataFrame with columns:

df_resid (float64): Degrees of freedom of residuals in models.
ssr (float64): Sum of squares of residuals in models.
df_diff (float64): Degrees of freedom difference from previous model in args.
ss_diff (float64): Difference in ssr from previous model in args.
F (float64): F statistic comparing to previous model in args.
PR(>F) (float64): P-value for significance comparing to previous model in args.
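
As a rough sketch of the multiple-model case, the snippet below compares two nested OLS fits. The synthetic data, the column names x1, x2 and y, and the variable names small and large are all invented for illustration; only ols and anova_lm come from statsmodels.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Synthetic data (illustrative assumption, not from the statsmodels docs).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 1.0 + 2.0 * df["x1"] + 0.5 * df["x2"] + rng.normal(size=n)

# Two nested models, both fitted through the formula API.
small = ols("y ~ x1", data=df).fit()
large = ols("y ~ x1 + x2", data=df).fit()

# One row per model, in the order the models are passed.
comparison = anova_lm(small, large)
print(comparison)
```

The first row has no previous model to compare against, so its difference and test columns are expected to be NaN; the second row tests whether adding x2 reduces the residual sum of squares significantly.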

See also

model_results.compare_f_test, model_results.compare_lm_test

Notes

Model statistics are given in the order of args. Models must have been fit using the formula API.

Examples

>>> import statsmodels.api as sm
>>> from statsmodels.formula.api import ols
>>> moore = sm.datasets.get_rdataset("Moore", "carData", cache=True) # load
>>> data = moore.data
>>> data = data.rename(columns={"partner.status" :
...                             "partner_status"}) # make name pythonic
>>> moore_lm = ols('conformity ~ C(fcategory, Sum)*C(partner_status, Sum)',
...                 data=data).fit()
>>> table = sm.stats.anova_lm(moore_lm, typ=2) # Type 2 Anova DataFrame
>>> print(table)
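
The robust option can be illustrated in a similar way. The following is a minimal sketch that needs no dataset download: the data are synthetic, and the names group, x, y, classic_table and robust_table are hypothetical. It fits one model and builds a Type 2 table twice, once with the default covariance and once with robust="hc3".

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Synthetic data (illustrative assumption): one factor and one covariate.
rng = np.random.default_rng(42)
n = 120
df = pd.DataFrame({
    "group": rng.choice(["a", "b", "c"], size=n),
    "x": rng.normal(size=n),
})
# Noise whose spread grows with |x|, so the errors are heteroscedastic.
noise = rng.normal(size=n) * (0.5 + np.abs(df["x"]))
df["y"] = 1.0 + 0.8 * df["x"] + (df["group"] == "c") * 1.5 + noise

fit = ols("y ~ C(group) + x", data=df).fit()

# Type 2 table with the default (non-robust) covariance.
classic_table = anova_lm(fit, typ=2)
# Same table using the HC3 heteroscedasticity-corrected covariance.
robust_table = anova_lm(fit, typ=2, robust="hc3")

print(classic_table)
print(robust_table)
```

Both tables have the same layout, so the F statistics and p-values can be compared row by row to see how much the heteroscedasticity correction changes the conclusions.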