MICEData.set_imputer(endog_name, formula=None, model_class=None, init_kwds=None, fit_kwds=None, predict_kwds=None, k_pmm=20, perturbation_method=None, regularized=False)

Specify the imputation process for a single variable.


endog_name : str

Name of the variable to be imputed.


formula : str

Conditional formula for imputation. Defaults to a formula with main effects for all other variables in the dataset. The formula should contain only the right-hand side (the mean structure), e.g. use 'x1 + x2', not 'x4 ~ x1 + x2'.

model_class : statsmodels model

Conditional model for imputation. Defaults to OLS. See below for more information.


init_kwds : dict-like

Keyword arguments passed to the model init method.


fit_kwds : dict-like

Keyword arguments passed to the model fit method.


predict_kwds : dict-like

Keyword arguments passed to the model predict method.


k_pmm : int

Number of neighboring observations from which to sample at random when using predictive mean matching. Defaults to 20.


perturbation_method : str

Either 'gaussian' or 'boot' (bootstrap). Determines the method for perturbing parameters in the imputation model. If None, uses the default specified at class initialization.


regularized : bool

If True, fit_regularized rather than fit is called when fitting imputation models for this variable. When regularized is True for any variable, perturbation_method must be set to 'boot'.


The model class must meet the following conditions:
  • The model must have a fit method that returns a results object.

  • The object returned from fit must have a params attribute that is an array-like object.

  • The object returned from fit must have a cov_params method that returns a square array-like object.

  • The model must have a predict method.