statsmodels.regression.mixed_linear_model.MixedLM

class statsmodels.regression.mixed_linear_model.MixedLM(endog, exog, groups, exog_re=None, exog_vc=None, use_sqrt=True, missing='none', **kwargs)

Linear Mixed Effects Model

Parameters
endog : 1d array_like

The dependent variable.

exog : 2d array_like

A matrix of covariates used to determine the mean structure (the “fixed effects” covariates).

groups : 1d array_like

A vector of labels determining the groups. Data from different groups are independent.

exog_re : 2d array_like

A matrix of covariates used to determine the variance and covariance structure (the “random effects” covariates). If None, defaults to a random intercept for each group.

exog_vc : VCSpec instance or dict-like (deprecated)

A VCSpec instance defines the structure of the variance components in the model. Alternatively, see the Notes below for a dictionary-based format. The dictionary format is deprecated and may be removed in a future release.

use_sqrt : bool

If True, optimization is carried out using the lower triangle of the square root of the random effects covariance matrix, otherwise it is carried out using the lower triangle of the random effects covariance matrix.

missing : str

The approach to missing data handling.

Notes

If exog_vc is not a VCSpec instance, then it must be a dictionary of dictionaries. Specifically, exog_vc[a][g] is a matrix whose columns are linearly combined using independent random coefficients. This random term then contributes to the variance structure of the data for group g. The random coefficients all have mean zero, and have the same variance. The matrix must be m x k, where m is the number of observations in group g. The number of columns may differ among the top-level groups.
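The dictionary layout described above can be sketched with NumPy alone. This is only an illustration of the data structure, not statsmodels code: the arrays `groups` and `z`, the component name "a", and the variance value `sigma2_a` are all hypothetical. Because the random coefficients are independent with mean zero and a common variance, the term built from a matrix Z contributes that variance times Z Z^T to the covariance of the observations in the group.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: five observations in two groups (sizes 3 and 2).
groups = np.array([0, 0, 0, 1, 1])
z = rng.normal(size=(groups.size, 2))  # hypothetical covariates for component "a"

# exog_vc[a][g] holds the rows of z that belong to group g (an m x k matrix).
exog_vc = {"a": {int(g): z[groups == g] for g in np.unique(groups)}}

# Assumed variance shared by the independent, mean-zero random coefficients.
sigma2_a = 0.8

# Contribution of component "a" to the covariance of the observations in
# group 0: Var(Z b) = sigma2_a * Z Z^T when b has covariance sigma2_a * I.
Z0 = exog_vc["a"][0]
cov_contribution = sigma2_a * Z0 @ Z0.T
print(Z0.shape, cov_contribution.shape)
```

Each inner matrix has as many rows as its group has observations, which is the m x k requirement stated above.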

The covariates in exog, exog_re and exog_vc may (but need not) partially or wholly overlap.

use_sqrt should almost always be set to True. The main use case for use_sqrt=False is when complicated patterns of fixed values in the covariance structure are set (using the free argument to fit) that cannot be expressed in terms of the Cholesky factor L.
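One way to see why the square-root parameterization is usually preferable: any real values placed in the lower triangle of a factor L yield a valid covariance matrix L L^T, whereas arbitrary values placed directly in the lower triangle of the covariance matrix may not. The sketch below illustrates this with NumPy; the helper names `cov_from_sqrt` and `symmetric_from_lower` are hypothetical and are not part of statsmodels.

```python
import numpy as np

def cov_from_sqrt(lower_vals, k):
    """Build a k x k covariance matrix L @ L.T from the lower triangle of L."""
    L = np.zeros((k, k))
    L[np.tril_indices(k)] = lower_vals
    return L @ L.T

def symmetric_from_lower(lower_vals, k):
    """Build a symmetric k x k matrix directly from its lower triangle."""
    S = np.zeros((k, k))
    S[np.tril_indices(k)] = lower_vals
    return S + np.tril(S, -1).T

vals = np.array([1.5, -0.7, 0.3])  # arbitrary unconstrained values

cov = cov_from_sqrt(vals, 2)
direct = symmetric_from_lower(vals, 2)

# The square-root form is always positive semidefinite ...
eigvals = np.linalg.eigvalsh(cov)
# ... while the direct form can have a negative eigenvalue.
direct_eigvals = np.linalg.eigvalsh(direct)
print(eigvals.min() >= 0, direct_eigvals.min() < 0)
```

With the square-root form, an optimizer can move freely over the parameters without leaving the set of valid covariance matrices.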

Examples

A basic mixed model with fixed effects for the columns of exog and a random intercept for each distinct value of groups:

>>> import statsmodels.api as sm
>>> model = sm.MixedLM(endog, exog, groups)
>>> result = model.fit()

A mixed model with fixed effects for the columns of exog and correlated random coefficients for the columns of exog_re:

>>> model = sm.MixedLM(endog, exog, groups, exog_re=exog_re)
>>> result = model.fit()

A mixed model with fixed effects for the columns of exog and independent random coefficients for the columns of exog_re:

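>>> import numpy as np
>>> from statsmodels.regression.mixed_linear_model import MixedLMParams
>>> # Entries equal to 1 are estimated; entries equal to 0 are held fixed at their starting values.
>>> free = MixedLMParams.from_components(
...            fe_params=np.ones(exog.shape[1]),
...            cov_re=np.eye(exog_re.shape[1]))
>>> model = sm.MixedLM(endog, exog, groups, exog_re=exog_re)
>>> result = model.fit(free=free)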

A different way to specify independent random coefficients for the columns of exog_re. In this example, groups must be a pandas Series whose index is compatible with that of exog_re, and exog_re must be a DataFrame with two columns labeled 0 and 1.

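>>> # Map each group label to the index labels of its rows.
>>> g = groups.groupby(groups).groups
>>> vc = {}
>>> # Select with a list so each entry is a 2d (m x 1) matrix.
>>> vc['1'] = {k : exog_re.loc[g[k], [0]] for k in g}
>>> vc['2'] = {k : exog_re.loc[g[k], [1]] for k in g}
>>> model = sm.MixedLM(endog, exog, groups, exog_vc=vc)
>>> result = model.fit()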

Methods

fit([start_params, reml, niter_sa, do_cg, …])

Fit a linear mixed model to the data.

fit_regularized([start_params, method, …])

Fit a model in which the fixed effects parameters are penalized.

from_formula(formula, data[, re_formula, …])

Create a Model from a formula and dataframe.

get_distribution(params, scale, exog)

get_fe_params(cov_re, vcomp[, tol])

Use GLS to update the fixed effects parameter estimates.

get_scale(fe_params, cov_re, vcomp)

Returns the estimated error variance based on given estimates of the slopes and random effects covariance matrix.

group_list(array)

Returns array split into subarrays corresponding to the grouping structure.

hessian(params)

Returns the model’s Hessian matrix.

information(params)

Fisher information matrix of model.

initialize()

Initialize (possibly re-initialize) a Model instance.

loglike(params[, profile_fe])

Evaluate the (profile) log-likelihood of the linear mixed effects model.

predict(params[, exog])

Return predicted values from a design matrix.

score(params[, profile_fe])

Returns the score vector of the profile log-likelihood.

score_full(params, calc_fe)

Returns the score with respect to untransformed parameters.

score_sqrt(params[, calc_fe])

Returns the score with respect to transformed parameters.

Properties

endog_names

Names of endogenous variables.

exog_names

Names of exogenous variables.