statsmodels.imputation.mice.MICE

class statsmodels.imputation.mice.MICE(model_formula, model_class, data, n_skip=3, init_kwds=None, fit_kwds=None)

Multiple Imputation with Chained Equations.

This class can be used to fit most statsmodels models to data sets with missing values using the ‘multiple imputation with chained equations’ (MICE) approach.

Parameters
  • model_formula (string) – The model formula to be fit to the imputed data sets. This formula is for the ‘analysis model’.

  • model_class (statsmodels model) – The model to be fit to the imputed data sets. This model class is for the ‘analysis model’.

  • data (MICEData instance) – MICEData object containing the data set for which missing values will be imputed.

  • n_skip (int) – The number of imputed datasets to skip between consecutive imputed datasets that are used for analysis.

  • init_kwds (dict-like) – Dictionary of keyword arguments passed to the init method of the analysis model.

  • fit_kwds (dict-like) – Dictionary of keyword arguments passed to the fit method of the analysis model.
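
As a rough illustration of how n_skip might interact with the n_burnin and n_imputations arguments of fit, the sketch below enumerates which imputation cycles would supply the data sets that are analysed. It assumes that fit first runs n_burnin cycles and that each analysed data set is then preceded by n_skip discarded cycles; this is a reading of the parameter descriptions, not a statement about the implementation. The helper retained_cycles is hypothetical and is not part of statsmodels.

```python
def retained_cycles(n_burnin, n_imputations, n_skip=3):
    """Return the 1-based indices of the imputation cycles that are analysed.

    Assumption: n_burnin cycles are discarded up front, then every analysed
    data set is preceded by n_skip discarded cycles.
    """
    kept = []
    cycle = n_burnin
    for _ in range(n_imputations):
        cycle += n_skip + 1  # discard n_skip cycles, keep the next one
        kept.append(cycle)
    return kept


# With the defaults used in the examples (10 burn-in cycles, 10 imputations,
# n_skip=3), the chain would run for 50 cycles in total under this assumption.
print(retained_cycles(10, 10, n_skip=3))
```

Under this reading, a larger n_skip spaces the analysed data sets further apart in the chain, at the cost of proportionally more imputation cycles.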

Examples

Run all MICE steps and obtain results:

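>>> import statsmodels.api as sm
>>> from statsmodels.imputation import mice
>>> # data is a pandas DataFrame that contains missing values
>>> imp = mice.MICEData(data)
>>> fml = 'y ~ x1 + x2 + x3 + x4'
>>> mi = mice.MICE(fml, sm.OLS, imp)
>>> results = mi.fit(10, 10)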
>>> print(results.summary())
                          Results: MICE
=================================================================
Method:                    MICE       Sample size:           1000
Model:                     OLS        Scale                  1.00
Dependent variable:        y          Num. imputations       10
-----------------------------------------------------------------
           Coef.  Std.Err.    t     P>|t|   [0.025  0.975]  FMI
-----------------------------------------------------------------
Intercept -0.0234   0.0318  -0.7345 0.4626 -0.0858  0.0390 0.0128
x1         1.0305   0.0578  17.8342 0.0000  0.9172  1.1437 0.0309
x2        -0.0134   0.0162  -0.8282 0.4076 -0.0451  0.0183 0.0236
x3        -1.0260   0.0328 -31.2706 0.0000 -1.0903 -0.9617 0.0169
x4        -0.0253   0.0336  -0.7520 0.4521 -0.0911  0.0406 0.0269
=================================================================

Obtain a sequence of fitted analysis models without pooling them into a summary:

>>> imp = mice.MICEData(data)
>>> fml = 'y ~ x1 + x2 + x3 + x4'
>>> mi = mice.MICE(fml, sm.OLS, imp)
>>> results = []
>>> for k in range(10):
...     x = mi.next_sample()
...     results.append(x)

Methods

combine()

Pools MICE imputation results.

fit([n_burnin, n_imputations])

Fit a model using MICE.

next_sample()

Perform one complete MICE iteration.
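
This page does not spell out how combine() pools the fitted models. The usual approach in multiple imputation is Rubin's rules, sketched below with NumPy only, on the assumption that the pooling follows that standard form. The function pool_rubin and its arguments are hypothetical and are not part of statsmodels; the fraction of missing information computed here is the simple ratio of between-imputation to total variance, which may differ from the estimator a given library reports.

```python
import numpy as np


def pool_rubin(params_list, cov_list):
    """Pool m sets of estimates with Rubin's rules (requires m >= 2).

    params_list: m parameter vectors, each of length p.
    cov_list:    m covariance matrices, each p x p.
    Returns the pooled estimates, the total covariance, and a simple
    per-parameter fraction of missing information.
    """
    params = np.asarray(params_list, dtype=float)  # shape (m, p)
    covs = np.asarray(cov_list, dtype=float)       # shape (m, p, p)
    m = params.shape[0]

    pooled = params.mean(axis=0)                   # average of the estimates
    within = covs.mean(axis=0)                     # average within-imputation covariance
    between = np.atleast_2d(np.cov(params, rowvar=False))  # between-imputation covariance

    f = 1.0 + 1.0 / m                              # finite-m inflation factor
    total = within + f * between
    fmi = f * np.diag(between) / np.diag(total)
    return pooled, total, fmi


# Two imputations, two parameters; only the first parameter varies.
pooled, total, fmi = pool_rubin(
    [[1.0, 2.0], [3.0, 2.0]],
    [np.eye(2), np.eye(2)],
)
print(pooled, np.diag(total), fmi)
```

A parameter whose estimate barely changes across imputations gets a fraction of missing information near zero, which is how the FMI column in a pooled summary is typically read.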