statsmodels.regression.mixed_linear_model.MixedLM

class statsmodels.regression.mixed_linear_model.MixedLM(endog, exog, groups, exog_re=None, exog_vc=None, use_sqrt=True, missing='none', **kwargs)

An object specifying a linear mixed effects model. Use the fit method to fit the model and obtain a results object.

Parameters:

endog : 1d array-like

The dependent variable

exog : 2d array-like

A matrix of covariates used to determine the mean structure (the “fixed effects” covariates).

groups : 1d array-like

A vector of labels determining the groups. Data from different groups are independent.

exog_re : 2d array-like

A matrix of covariates used to determine the variance and covariance structure (the “random effects” covariates). If None, defaults to a random intercept for each group.

exog_vc : dict-like

A dictionary containing specifications of the variance component terms. See below for details.

use_sqrt : bool

If True, optimization is carried out using the lower triangle of the square root of the random effects covariance matrix, otherwise it is carried out using the lower triangle of the random effects covariance matrix.

missing : string

The approach to missing data handling. Available options are 'none', 'drop', and 'raise'.

Notes

exog_vc is a dictionary of dictionaries. Specifically, exog_vc[a][g] is a matrix whose columns are linearly combined using independent random coefficients, where a names a variance component and g is a group label. This random term then contributes to the variance structure of the data for group g. The random coefficients all have mean zero and share the same variance. The matrix must be m x k, where m is the number of observations in group g and k is the number of columns (one per random coefficient). The number of columns may differ among the top-level groups.
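The dictionary-of-dictionaries layout described above can be sketched with plain NumPy. This is only an illustration of the shapes involved: the grouping labels, the nested factor `sub`, the component name "sub", and the helper `indicator_matrix` are all hypothetical and not part of statsmodels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: three top-level groups with 4, 3 and 5 observations.
groups = np.repeat(["a", "b", "c"], [4, 3, 5])

# Hypothetical nested factor observed for every row (values 0 or 1).
sub = rng.integers(0, 2, size=groups.size)


def indicator_matrix(labels):
    """Return an m x k 0/1 matrix with one column per distinct label."""
    levels = np.unique(labels)
    return (labels[:, None] == levels[None, :]).astype(float)


# exog_vc[a][g]: one m x k matrix per variance component a and group g.
exog_vc = {"sub": {}}
for g in np.unique(groups):
    rows = groups == g
    exog_vc["sub"][g] = indicator_matrix(sub[rows])

for g, mat in exog_vc["sub"].items():
    # m equals the number of observations in group g; k may differ by group.
    print(g, mat.shape)
```

Each matrix has one row per observation in its group, and each column receives its own independent random coefficient with a variance shared across the whole component.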

The covariates in exog, exog_re and exog_vc may (but need not) partially or wholly overlap.

use_sqrt should almost always be set to True. The main use case for use_sqrt=False is when complicated patterns of fixed values in the covariance structure are set (using the free argument to fit) that cannot be expressed in terms of the Cholesky factor L.
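One way to see why the square-root parameterization is usually preferable is the NumPy sketch below. It is not the statsmodels implementation; it only illustrates, under the assumption that the "square root" is a lower-triangular factor L with cov_re = L L', that any values in the lower triangle of L give a valid covariance matrix, while arbitrary values in the lower triangle of cov_re itself may not.

```python
import numpy as np

# A valid 2 x 2 random effects covariance matrix (illustrative values).
cov_re = np.array([[2.0, 0.6],
                   [0.6, 1.0]])

# Lower-triangular Cholesky factor L with cov_re = L @ L.T.
L = np.linalg.cholesky(cov_re)
tril = np.tril_indices(2)

# The free parameters under each parameterization.
theta_sqrt = L[tril]       # lower triangle of the square root
theta_cov = cov_re[tril]   # lower triangle of the covariance itself

# Any real vector used as the lower triangle of L yields a positive
# semidefinite covariance matrix.
values = [1.0, -3.0, 0.5]
L_new = np.zeros((2, 2))
L_new[tril] = values
cov_from_sqrt = L_new @ L_new.T
min_eig_sqrt = np.linalg.eigvalsh(cov_from_sqrt).min()

# The same values used directly as the lower triangle of the covariance
# matrix give a symmetric matrix that is not positive semidefinite.
cov_direct = np.zeros((2, 2))
cov_direct[tril] = values
cov_direct = cov_direct + cov_direct.T - np.diag(np.diag(cov_direct))
min_eig_direct = np.linalg.eigvalsh(cov_direct).min()

print(min_eig_sqrt >= 0, min_eig_direct >= 0)
```

An optimizer working on the lower triangle of L can therefore move freely without leaving the set of valid covariance matrices, which is a plausible reason the default is use_sqrt=True.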

Examples

A basic mixed model with fixed effects for the columns of exog and a random intercept for each distinct value of group:

>>> model = sm.MixedLM(endog, exog, groups)
>>> result = model.fit()

A mixed model with fixed effects for the columns of exog and correlated random coefficients for the columns of exog_re:

>>> model = sm.MixedLM(endog, exog, groups, exog_re=exog_re)
>>> result = model.fit()

A mixed model with fixed effects for the columns of exog and independent random coefficients for the columns of exog_re:

>>> from statsmodels.regression.mixed_linear_model import MixedLMParams
>>> free = MixedLMParams.from_components(fe_params=np.ones(exog.shape[1]),
...                                      cov_re=np.eye(exog_re.shape[1]))
>>> model = sm.MixedLM(endog, exog, groups, exog_re=exog_re)
>>> result = model.fit(free=free)

A different way to specify independent random coefficients for the columns of exog_re. In this example, groups must be a pandas Series whose index is compatible with that of exog_re, and exog_re must have two columns.

>>> g = groups.groupby(groups).groups
>>> vc = {}
>>> vc['1'] = {k : exog_re.loc[g[k], 0] for k in g}
>>> vc['2'] = {k : exog_re.loc[g[k], 1] for k in g}
>>> model = sm.MixedLM(endog, exog, groups, exog_vc=vc)
>>> result = model.fit()

Attributes

endog_names Names of endogenous variables
exog_names Names of exogenous variables

Methods

fit([start_params, reml, niter_sa, do_cg, ...]) Fit a linear mixed model to the data.
fit_regularized([start_params, method, ...]) Fit a model in which the fixed effects parameters are penalized.
from_formula(formula, data[, re_formula, ...]) Create a Model from a formula and dataframe.
get_fe_params(cov_re, vcomp) Use GLS to update the fixed effects parameter estimates.
get_scale(fe_params, cov_re, vcomp) Returns the estimated error variance based on given estimates of the slopes and random effects covariance matrix.
group_list(array) Returns array split into subarrays corresponding to the grouping structure.
hessian(params) Returns the model’s Hessian matrix.
information(params) Fisher information matrix of model
initialize() Initialize (possibly re-initialize) a Model instance.
loglike(params[, profile_fe]) Evaluate the (profile) log-likelihood of the linear mixed effects model.
predict(params[, exog]) Return predicted values from a design matrix.
score(params[, profile_fe]) Returns the score vector of the profile log-likelihood.
score_full(params, calc_fe) Returns the score with respect to untransformed parameters.
score_sqrt(params[, calc_fe]) Returns the score with respect to transformed parameters.
