statsmodels.nonparametric.kernel_density.KDEMultivariateConditional

class statsmodels.nonparametric.kernel_density.KDEMultivariateConditional(endog, exog, dep_type, indep_type, bw, defaults=<statsmodels.nonparametric._kernel_base.EstimatorSettings object>)

Conditional multivariate kernel density estimator.

Calculates P(Y_1, Y_2, ..., Y_n | X_1, X_2, ..., X_m) = P(X_1, X_2, ..., X_m, Y_1, Y_2, ..., Y_n) / P(X_1, X_2, ..., X_m). By definition, the conditional density is the ratio of the joint density of (X, Y) to the marginal density of X; see [1].
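
The ratio above can be sketched for one dependent and one independent continuous variable. This is an illustrative stand-alone sketch using only the standard library, not the statsmodels implementation; the function names and the fixed bandwidths h_y and h_x are hypothetical and chosen only for the example.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def conditional_density(y, x, y_train, x_train, h_y, h_x):
    """Estimate f(y | x) as the joint KDE f(y, x) divided by the marginal KDE f(x)."""
    n = len(y_train)
    # Joint estimate: product of one Gaussian kernel per variable.
    joint = sum(
        gaussian_kernel((y - yi) / h_y) * gaussian_kernel((x - xi) / h_x)
        for yi, xi in zip(y_train, x_train)
    ) / (n * h_y * h_x)
    # Marginal estimate of the conditioning variable only.
    marginal = sum(gaussian_kernel((x - xi) / h_x) for xi in x_train) / (n * h_x)
    return joint / marginal


random.seed(0)
x_train = [random.gauss(2.0, 1.0) for _ in range(300)]
y_train = [random.gauss(0.0, 1.0) for _ in range(300)]  # independent of x in this toy data

# Hypothetical fixed bandwidths, for illustration only.
f_y_given_x = conditional_density(0.0, 2.0, y_train, x_train, h_y=0.4, h_x=0.4)
```

For a fixed x, the estimate is a weighted mixture of kernels in y, so it integrates to one over y, as a conditional density should.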

Parameters:
  • endog (list of ndarrays or 2-D ndarray) – The training data for the dependent variables, used to determine the bandwidth(s). If a 2-D array, should be of shape (num_observations, num_variables). If a list, each list element is a separate observation.
  • exog (list of ndarrays or 2-D ndarray) – The training data for the independent variables; same format as endog, with the same number of observations.
  • dep_type (str) –

    The type of the dependent variables:

    • c : Continuous
    • u : Unordered (Discrete)
    • o : Ordered (Discrete)

    The string should contain one type specifier per variable. For example, dep_type='ccuo' describes four dependent variables: two continuous, one unordered and one ordered.

  • indep_type (str) – The type of the independent variables; specified like dep_type.
  • bw (array_like or str, optional) –

    If an array, it is a fixed user-specified bandwidth. If a string, should be one of:

    • normal_reference: normal reference rule of thumb (default)
    • cv_ml: cross validation maximum likelihood
    • cv_ls: cross validation least squares
  • defaults (Instance of class EstimatorSettings) – The default settings for efficient bandwidth estimation.
Attributes:
  • bw (array_like) – The bandwidth parameters.

See also

KDEMultivariate

References

[1] http://en.wikipedia.org/wiki/Conditional_probability_distribution

Examples

>>> import numpy as np
>>> import statsmodels.api as sm
>>> nobs = 300
>>> c1 = np.random.normal(size=(nobs,1))
>>> c2 = np.random.normal(2,1,size=(nobs,1))
>>> dens_c = sm.nonparametric.KDEMultivariateConditional(endog=[c1],
...     exog=[c2], dep_type='c', indep_type='c', bw='normal_reference')
>>> dens_c.bw   # show computed bandwidth
array([ 0.41223484,  0.40976931])
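
The two bandwidths are each close to 0.41 because the data are roughly unit-variance. The sketch below shows where such values can come from, assuming the normal reference rule takes its common form h_j = 1.06 * sigma_j * n ** (-1 / (4 + q)), where q is the total number of variables (dependent plus independent). The helper normal_reference_bw is hypothetical, and the use of the population standard deviation is an assumption; check the library source before relying on exact values.

```python
import statistics


def normal_reference_bw(columns):
    """Rule-of-thumb bandwidth per variable: 1.06 * std * n ** (-1 / (4 + q)).

    `columns` is a list of equal-length sequences, one per variable.
    """
    q = len(columns)      # total number of variables
    n = len(columns[0])   # number of observations
    scale = 1.06 * n ** (-1.0 / (4 + q))
    # Assumption: population standard deviation (ddof = 0).
    return [scale * statistics.pstdev(col) for col in columns]


# Toy data with known spread: std 1 for y and std 2 for x, n = 300.
y = [-1.0, 1.0] * 150
x = [0.0, 4.0] * 150
bw = normal_reference_bw([y, x])
```

With n = 300 and q = 2 the scale factor is about 0.41, so a variable with unit standard deviation gets a bandwidth near 0.41, consistent with the output shown above.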

Methods

  • cdf([endog_predict, exog_predict]) – Cumulative distribution function for the conditional density.
  • imse(bw) – The integrated mean square error for the conditional KDE.
  • loo_likelihood(bw[, func]) – Returns the leave-one-out conditional likelihood of the data.
  • pdf([endog_predict, exog_predict]) – Evaluate the probability density function.