statsmodels.genmod.bayes_mixed_glm.BinomialBayesMixedGLM

class statsmodels.genmod.bayes_mixed_glm.BinomialBayesMixedGLM(endog, exog, exog_vc, ident, vcp_p=1, fe_p=2, fep_names=None, vcp_names=None, vc_names=None)

Generalized Linear Mixed Model with Bayesian estimation

The class implements the Laplace approximation to the posterior distribution (fit_map) and a variational Bayes approximation to the posterior (fit_vb). See the two fit method docstrings for more information about the fitting approaches.

Parameters:
endog : array_like

Vector of response values.

exog : array_like

Array of covariates for the fixed effects part of the mean structure.

exog_vc : array_like

Array of covariates for the random part of the model. A scipy.sparse array may be provided, or else the passed array will be converted to sparse internally.

ident : array_like

Array of integer labels showing which random terms (columns of exog_vc) have a common variance.

vcp_p : float

Prior standard deviation for variance component parameters (the prior standard deviation of log(s) is vcp_p, where s is the standard deviation of a random effect).

fe_p : float

Prior standard deviation for fixed effects parameters.

family : statsmodels.genmod.families instance

The GLM family.

fep_names : list[str]

The names of the fixed effects parameters (corresponding to columns of exog). If None, default names are constructed.

vcp_names : list[str]

The names of the variance component parameters (corresponding to distinct labels in ident). If None, default names are constructed.

vc_names : list[str]

The names of the random effect realizations.

Attributes:
endog_names

Names of endogenous variables.

exog_names

Names of exogenous variables.

Returns:
BayesMixedGLMResults object, returned by the fit methods (fit, fit_map, fit_vb).

Notes

There are three types of values in the posterior distribution: fixed effects parameters (fep), corresponding to the columns of exog; random effects realizations (vc), corresponding to the columns of exog_vc; and the standard deviations of the random effects realizations (vcp), corresponding to the unique integer labels in ident.

All random effects are modeled as being independent Gaussian values (given the variance structure parameters). Every column of exog_vc has a distinct realized random effect that is used to form the linear predictors. The elements of ident determine the distinct variance structure parameters. Two random effect realizations that have the same value in ident have the same variance. When fitting with a formula, ident is constructed internally (each element of vc_formulas yields a distinct label in ident).
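The relationship between exog_vc and ident can be made concrete with a small array-based sketch. The simulated data, group sizes, and coefficient values below are illustrative assumptions, not part of the statsmodels documentation. The first block of exog_vc columns holds per-group indicators (random intercepts) and the second block holds the same indicators multiplied by a covariate (random slopes); ident assigns label 0 to the first block and label 1 to the second, so the model estimates two variance parameters.

```python
import numpy as np
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

# Simulated data; all sizes and coefficients here are illustrative assumptions.
rng = np.random.default_rng(0)
n_groups, n_per = 10, 20
n = n_groups * n_per
group = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=n)

# Fixed effects design: intercept and one covariate.
exog = np.column_stack([np.ones(n), x])

# Random effects design: one indicator column per group (random intercepts),
# followed by indicator * x columns (random slopes).
indicators = (group[:, None] == np.arange(n_groups)).astype(float)
exog_vc = np.hstack([indicators, indicators * x[:, None]])

# Columns sharing a label in ident share a variance parameter:
# label 0 for the intercept columns, label 1 for the slope columns.
ident = np.repeat([0, 1], n_groups)

# Simulate a binary response from the model.
u = rng.normal(scale=0.5, size=2 * n_groups)
lin = exog @ np.array([0.2, 0.5]) + exog_vc @ u
endog = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin)))

model = BinomialBayesMixedGLM(endog, exog, exog_vc, ident, vcp_p=0.5)
result = model.fit_vb()

print(result.fe_mean)   # posterior means of the fixed effects (fep)
print(result.vcp_mean)  # posterior means of log(s), one per label in ident (vcp)
print(result.vc_mean)   # posterior means of the random effect realizations (vc)
```

A dense array is passed for exog_vc here for readability; for large numbers of groups a scipy.sparse matrix avoids storing the zeros.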

The random effect standard deviation parameters (vcp) have log-normal prior distributions: for a random effect with standard deviation s, log(s) is Gaussian with mean 0 and standard deviation vcp_p.

Note that for some families, e.g. Binomial, the posterior mode may be difficult to find numerically if vcp_p is set too large. A value of vcp_p = 0.5 tends to work well in practice.
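To get a feel for what a given vcp_p implies, it can help to translate it into a prior interval for the random effect standard deviation s. The sketch below assumes only that log(s) is Gaussian with mean 0 and standard deviation vcp_p, as stated above; the helper name prior_sd_interval is hypothetical and not part of statsmodels.

```python
import math

def prior_sd_interval(vcp_p, z=1.96):
    """Central prior interval for s when log(s) ~ Normal(0, vcp_p**2).

    Hypothetical helper for illustration; not a statsmodels function.
    With z=1.96 the interval covers roughly 95% of the prior mass.
    """
    return math.exp(-z * vcp_p), math.exp(z * vcp_p)

for vcp_p in (0.5, 1.0, 2.0):
    lo, hi = prior_sd_interval(vcp_p)
    print(f"vcp_p={vcp_p}: prior interval for s is ({lo:.3f}, {hi:.3f})")
```

The interval is symmetric around 1 on the log scale, so its width on the natural scale grows exponentially with vcp_p. This is one way to see why large values of vcp_p leave the variance parameters weakly constrained.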

The prior for the fixed effects parameters is Gaussian with mean 0 and standard deviation fe_p. It is recommended that quantitative covariates be standardized.

References

Introduction to generalized linear mixed models: https://stats.idre.ucla.edu/other/mult-pkg/introduction-to-generalized-linear-mixed-models

SAS documentation: https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_intromix_a0000000215.htm

An assessment of estimation methods for generalized linear mixed models with binary outcomes: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3866838/

Examples

A binomial (logistic) random effects model with random intercepts for villages and random slopes for each year within each village:

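>>> from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM
>>> # `data` is assumed to be a DataFrame with columns y, year_cen and Village
>>> random = {"a": '0 + C(Village)', "b": '0 + C(Village)*year_cen'}
>>> model = BinomialBayesMixedGLM.from_formula(
...     'y ~ year_cen', random, data)
>>> result = model.fit_vb()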
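The formula example above does not define its data set. A self-contained variant might look like the following sketch, which simulates a small data frame using the same column names (y, year_cen, Village). The simulated values and sizes are assumptions made for illustration, and only the random intercept term is included to keep the example short.

```python
import numpy as np
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

# Simulated stand-in for the data set; sizes and effects are assumptions.
rng = np.random.default_rng(1)
n_villages, n_years = 12, 8
village = np.repeat(np.arange(n_villages), n_years)
year_cen = np.tile(np.arange(n_years) - (n_years - 1) / 2, n_villages)

# Village-level random intercepts on the logit scale.
intercepts = rng.normal(scale=0.7, size=n_villages)
lin = -0.2 + 0.3 * year_cen + intercepts[village]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin)))

data = pd.DataFrame({"y": y, "year_cen": year_cen, "Village": village})

# One variance component: random intercepts for villages.
random = {"a": "0 + C(Village)"}
model = BinomialBayesMixedGLM.from_formula(
    "y ~ year_cen", random, data, vcp_p=0.5)
result = model.fit_vb()

print(result.fe_mean)           # posterior means: Intercept, year_cen
print(np.exp(result.vcp_mean))  # random intercept SD on the natural scale
```

Because the variance parameters are stored on the log scale, exponentiating vcp_mean gives a point summary of the random effect standard deviation.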

Methods

fit([method, minim_opts])

fit is equivalent to fit_map.

fit_map([method, minim_opts, scale_fe])

Construct the Laplace approximation to the posterior distribution.

fit_vb([mean, sd, fit_method, minim_opts, ...])

Fit a model using the variational Bayes mean field approximation.

from_formula(formula, vc_formulas, data[, ...])

Construct a BayesMixedGLM from a formula.

logposterior(params)

The overall log-density: log p(y, fe, vc, vcp).

logposterior_grad(params)

The gradient of the log posterior.

predict(params[, exog, linear])

Return the fitted mean structure.

vb_elbo(vb_mean, vb_sd)

Returns the evidence lower bound (ELBO) for the model.

vb_elbo_base(h, tm, fep_mean, vcp_mean, ...)

Returns the evidence lower bound (ELBO) for the model.

vb_elbo_grad(vb_mean, vb_sd)

Returns the gradient of the model's evidence lower bound (ELBO).

vb_elbo_grad_base(h, tm, tv, fep_mean, ...)

Returns the gradient of the ELBO function.
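As a rough illustration of how several of these methods fit together, the sketch below fits the same model with fit_map and fit_vb, then calls predict and logposterior on the resulting parameter vectors. The simulated data are an assumption made for the example. In current statsmodels releases predict appears to use only the fixed effects coefficients at the start of params, so the fitted values do not include the random effects.

```python
import numpy as np
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

# Simulated random-intercept data; sizes and effects are assumptions.
rng = np.random.default_rng(2)
n_groups, n_per = 8, 25
group = np.repeat(np.arange(n_groups), n_per)
n = group.size
x = rng.normal(size=n)

exog = np.column_stack([np.ones(n), x])
exog_vc = (group[:, None] == np.arange(n_groups)).astype(float)
ident = np.zeros(n_groups, dtype=int)  # a single shared variance parameter

u = rng.normal(scale=0.6, size=n_groups)
prob = 1.0 / (1.0 + np.exp(-(0.3 * x + u[group])))
endog = rng.binomial(1, prob)

model = BinomialBayesMixedGLM(endog, exog, exog_vc, ident, vcp_p=0.5)

map_result = model.fit_map()  # Laplace approximation at the posterior mode
vb_result = model.fit_vb()    # mean field variational approximation

# Fitted means on the probability scale and on the linear predictor scale.
fitted = model.predict(vb_result.params)
linpred = model.predict(vb_result.params, linear=True)

# Log posterior density evaluated at the posterior mode.
logpost = model.logposterior(map_result.params)

print(map_result.fe_mean, vb_result.fe_mean)
print(fitted[:5], logpost)
```

Comparing fe_mean from the two fits is a quick sanity check: large disagreements may indicate that one of the optimizers did not converge.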

Properties

endog_names

Names of endogenous variables.

exog_names

Names of exogenous variables.

rng

verbose


Last update: Oct 29, 2024