statsmodels.regression.mixed_linear_model.MixedLM.from_formula

classmethod MixedLM.from_formula(formula, data, re_formula=None, subset=None, *args, **kwargs)

Create a Model from a formula and dataframe.

Parameters:

formula : str or generic Formula object

The formula specifying the model

data : array-like

The data for the model. See Notes.

re_formula : str

A one-sided formula defining the variance structure of the model. The default gives a random intercept for each group.

subset : array-like

An array-like object of booleans, integers, or index values that indicates the subset of data to use in the model. Assumes data is a pandas.DataFrame.

args : extra arguments

These are passed to the model

kwargs : extra keyword arguments

These are passed to the model, with one exception: the eval_env keyword is passed to patsy. It can be either a patsy.EvalEnvironment object or an integer indicating the depth of the namespace to use. For example, the default eval_env=0 uses the calling namespace. If you wish to use a “clean” environment, set eval_env=-1.

Returns:

model : Model instance

Notes

data must define __getitem__ with the keys in the formula terms, e.g., a numpy structured or rec array, a dictionary, or a pandas DataFrame. args and kwargs are passed on to the model instantiation.

If re_formula is not provided, the default is a random intercept for each group.

This method currently does not correctly handle missing values, so missing values should be explicitly dropped from the DataFrame before calling this method.