Regression with Discrete Dependent Variable

Regression models for limited and qualitative dependent variables. The module currently allows the estimation of models with binary (Logit, Probit), nominal (MNLogit), or count (Poisson, NegativeBinomial) data.

Starting with version 0.9, the module also includes new count models that are still experimental in 0.9: NegativeBinomialP and GeneralizedPoisson, and the zero-inflated models ZeroInflatedPoisson, ZeroInflatedNegativeBinomialP and ZeroInflatedGeneralizedPoisson.
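As a rough illustration of the zero-inflated models, the sketch below fits ZeroInflatedPoisson to synthetic data. The simulated data, the coefficient values, the seed and the `maxiter` setting are assumptions made for this example only; the import path assumes the class lives in `statsmodels.discrete.count_model`.

```python
import numpy as np
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(0)
n = 1000

# Synthetic regressors: a constant and one covariate (illustrative only).
x = rng.normal(size=n)
exog = np.column_stack([np.ones(n), x])

# Poisson counts with log-mean 0.5 + 0.3 * x ...
counts = rng.poisson(np.exp(0.5 + 0.3 * x))
# ... where roughly 30% of observations are forced to be structural zeros.
structural_zero = rng.random(n) < 0.3
endog = np.where(structural_zero, 0, counts)

# With no exog_infl given, the inflation part uses a constant only.
zip_mod = ZeroInflatedPoisson(endog, exog)
zip_res = zip_mod.fit(maxiter=500, disp=False)

# One inflation parameter followed by the two count-model coefficients.
print(zip_res.params)
```

Because these models are flagged as experimental, it is probably wise to check `zip_res.mle_retvals` for convergence before relying on the estimates.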

See Module Reference for commands and arguments.

Examples

# Assumes the usual import: import statsmodels.api as sm
# Load the data from Spector and Mazzeo (1980)
In [1]: spector_data = sm.datasets.spector.load()

In [2]: spector_data.exog = sm.add_constant(spector_data.exog)

# Logit Model
In [3]: logit_mod = sm.Logit(spector_data.endog, spector_data.exog)

In [4]: logit_res = logit_mod.fit()
Optimization terminated successfully.
         Current function value: 0.402801
         Iterations 7

In [5]: print(logit_res.summary())
                           Logit Regression Results                           
==============================================================================
Dep. Variable:                      y   No. Observations:                   32
Model:                          Logit   Df Residuals:                       28
Method:                           MLE   Df Model:                            3
Date:                Mon, 14 May 2018   Pseudo R-squ.:                  0.3740
Time:                        21:46:24   Log-Likelihood:                -12.890
converged:                       True   LL-Null:                       -20.592
                                        LLR p-value:                  0.001502
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const        -13.0213      4.931     -2.641      0.008     -22.687      -3.356
x1             2.8261      1.263      2.238      0.025       0.351       5.301
x2             0.0952      0.142      0.672      0.501      -0.182       0.373
x3             2.3787      1.065      2.234      0.025       0.292       4.465
==============================================================================

Technical Documentation

Currently all models are estimated by Maximum Likelihood and assume independently and identically distributed errors.

All discrete regression models define the same methods and follow the same structure, which is similar to that of the linear regression models and results but adds some methods specific to discrete models. Some models also provide further model-specific methods and attributes.

References

General references for this class of models are:

A.C. Cameron and P.K. Trivedi. `Regression Analysis of Count Data`.
    Cambridge, 1998.

G.S. Maddala. `Limited-Dependent and Qualitative Variables in Econometrics`.
    Cambridge, 1983.

W. Greene. `Econometric Analysis`. Prentice Hall, 5th edition, 2003.

Module Reference

The specific model classes are:

Logit(endog, exog, **kwargs) Binary choice logit model
Probit(endog, exog, **kwargs) Binary choice Probit model
MNLogit(endog, exog, **kwargs) Multinomial logit model
Poisson(endog, exog[, offset, exposure, missing]) Poisson model for count data
NegativeBinomial(endog, exog[, …]) Negative Binomial Model for count data
NegativeBinomialP(endog, exog[, p, offset, …]) Generalized Negative Binomial (NB-P) model for count data
GeneralizedPoisson(endog, exog[, p, offset, …]) Generalized Poisson model for count data
ZeroInflatedPoisson(endog, exog[, …]) Poisson Zero Inflated model for count data
ZeroInflatedNegativeBinomialP(endog, exog[, …]) Zero Inflated Generalized Negative Binomial model for count data
ZeroInflatedGeneralizedPoisson(endog, exog) Zero Inflated Generalized Poisson model for count data

The specific result classes are:

LogitResults(model, mlefit[, cov_type, …]) A results class for Logit Model
ProbitResults(model, mlefit[, cov_type, …]) A results class for Probit Model
CountResults(model, mlefit[, cov_type, …]) A results class for count data
MultinomialResults(model, mlefit[, …]) A results class for multinomial data
NegativeBinomialResults(model, mlefit[, …]) A results class for NegativeBinomial 1 and 2
GeneralizedPoissonResults(model, mlefit[, …]) A results class for Generalized Poisson
ZeroInflatedPoissonResults(model, mlefit[, …]) A results class for Zero Inflated Poisson
ZeroInflatedNegativeBinomialResults(model, …) A results class for Zero Inflated Generalized Negative Binomial
ZeroInflatedGeneralizedPoissonResults(model, …) A results class for Zero Inflated Generalized Poisson

DiscreteModel is a superclass of all discrete regression models. The estimation results are returned as an instance of one of the subclasses of DiscreteResults. Each category of models (binary, count and multinomial) has its own intermediate level of model and results classes. These intermediate classes mostly serve to facilitate the implementation of the methods and attributes defined by DiscreteModel and DiscreteResults.

DiscreteModel(endog, exog, **kwargs) Abstract class for discrete choice models.
DiscreteResults(model, mlefit[, cov_type, …]) A results class for the discrete dependent variable models.
BinaryModel(endog, exog, **kwargs)
BinaryResults(model, mlefit[, cov_type, …]) A results class for binary data
CountModel(endog, exog[, offset, exposure, …])
MultinomialModel(endog, exog, **kwargs)
GenericZeroInflated(endog, exog[, …]) Generic Zero Inflated model for count data
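The class hierarchy described above can be checked directly with `issubclass`. This sketch assumes the import paths `statsmodels.discrete.discrete_model` and `statsmodels.discrete.count_model`; the `model_parents` mapping is a name introduced here for illustration.

```python
from statsmodels.discrete.discrete_model import (
    DiscreteModel, DiscreteResults, BinaryModel, CountModel, MultinomialModel,
    Logit, Probit, MNLogit, Poisson, NegativeBinomial,
    LogitResults, ProbitResults, CountResults, MultinomialResults,
)
from statsmodels.discrete.count_model import (
    GenericZeroInflated, ZeroInflatedPoisson,
)

# Each concrete model and the intermediate class it is expected to derive from.
model_parents = {
    Logit: BinaryModel,
    Probit: BinaryModel,
    MNLogit: MultinomialModel,
    Poisson: CountModel,
    NegativeBinomial: CountModel,
    ZeroInflatedPoisson: GenericZeroInflated,
}

for cls, parent in model_parents.items():
    print(cls.__name__, "->", parent.__name__,
          issubclass(cls, parent), issubclass(cls, DiscreteModel))

# The results classes all derive from DiscreteResults.
for res_cls in (LogitResults, ProbitResults, CountResults, MultinomialResults):
    print(res_cls.__name__, issubclass(res_cls, DiscreteResults))
```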