statsmodels.nonparametric.kernel_density.EstimatorSettings

class statsmodels.nonparametric.kernel_density.EstimatorSettings(efficient=False, randomize=False, n_res=25, n_sub=50, return_median=True, return_only_bw=False, n_jobs=-1)

Object to specify settings for density estimation or regression.

EstimatorSettings has several properties related to how bandwidth estimation for the KDEMultivariate, KDEMultivariateConditional, KernelReg and CensoredKernelReg classes behaves.

Parameters
efficient : bool, optional

If True, bandwidth estimation is performed efficiently: the data is split into smaller sub-samples and a scaling factor is estimated for each one. This is useful for large samples (nobs >> 300) and/or many variables (k_vars > 3). If False (default), all data is used at once.
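The idea behind a scaling factor can be illustrated with a minimal sketch for a single continuous variable. It assumes the common rate bw = c * sigma * n ** (-1 / (4 + k_vars)); the function names and sample values are hypothetical, and this is not the statsmodels implementation.

```python
import statistics

def scaling_factor(bw_sub, sub_sample, k_vars=1):
    """Return c such that bw_sub == c * sigma_sub * n_sub ** (-1 / (4 + k_vars))."""
    n_sub = len(sub_sample)
    sigma_sub = statistics.stdev(sub_sample)
    return bw_sub / (sigma_sub * n_sub ** (-1.0 / (4 + k_vars)))

def rescale_to_full_sample(scale, full_sample, k_vars=1):
    """Turn a scaling factor back into a bandwidth for a sample of a different size."""
    nobs = len(full_sample)
    sigma = statistics.stdev(full_sample)
    return scale * sigma * nobs ** (-1.0 / (4 + k_vars))

sub_sample = [0.0, 1.0, 2.0, 3.0]
full_sample = sub_sample * 8               # hypothetical larger sample
scale = scaling_factor(0.5, sub_sample)    # 0.5 is a made-up sub-sample bandwidth
bw_full = rescale_to_full_sample(scale, full_sample)
```

Because the scaling factor removes the dependence on sample size and spread, it can be estimated on small sub-samples and then applied to the full sample, where the resulting bandwidth is smaller.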

randomize : bool, optional

If True, bandwidth estimation uses n_res random resamples (with replacement) of size n_sub drawn from the full sample. If False (default), the full sample is sliced into sub-samples of size n_sub so that every observation is used once. Only has an effect if efficient is True.
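The two sub-sampling strategies can be sketched with the standard library alone. The helper names below are hypothetical and the code only illustrates the difference between slicing and resampling; it does not reproduce how statsmodels builds its sub-samples (for example, how a remainder shorter than n_sub is handled).

```python
import random

def slice_subsamples(data, n_sub):
    """Split data into consecutive chunks of size n_sub; each observation is used once."""
    return [data[i:i + n_sub] for i in range(0, len(data), n_sub)]

def random_subsamples(data, n_sub, n_res, seed=None):
    """Draw n_res resamples of size n_sub from data, with replacement."""
    rng = random.Random(seed)
    return [rng.choices(data, k=n_sub) for _ in range(n_res)]

data = list(range(200))                                          # stand-in for 200 observations
slices = slice_subsamples(data, n_sub=50)                        # 4 disjoint chunks
resamples = random_subsamples(data, n_sub=50, n_res=25, seed=0)  # 25 random draws
```

With slicing, the number of sub-samples is fixed by the sample size and n_sub; with resampling, it is set by n_res and an observation may appear several times or not at all.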

n_sub : int, optional

Size of the sub-samples. Default is 50.

n_res : int, optional

The number of random re-samples used to estimate the bandwidth. Only has an effect if randomize == True. Default value is 25.

return_median : bool, optional

If True (default), the estimator uses the median of the scaling factors from all sub-samples to estimate the bandwidth of the full sample. If False, the estimator uses the mean.
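A small sketch shows why the median can be the safer default. The function name and the scaling factors are hypothetical, and the example handles a single variable only.

```python
import statistics

def aggregate_scaling_factors(factors, return_median=True):
    """Combine per-sub-sample scaling factors into one value."""
    if return_median:
        return statistics.median(factors)
    return statistics.mean(factors)

# Hypothetical factors; the last sub-sample produced an outlier.
factors = [1.0, 1.1, 0.9, 1.05, 8.0]

median_scale = aggregate_scaling_factors(factors, return_median=True)
mean_scale = aggregate_scaling_factors(factors, return_median=False)
```

Here the median stays close to the typical sub-sample value, while the mean is pulled upward by the single outlier.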

return_only_bw : bool, optional

If True, the estimator uses the sub-sample bandwidths directly instead of the scaling factors. This is not theoretically justified and should be used only for experimentation. Default is False.

n_jobs : int, optional

The number of jobs to use for parallel estimation with joblib.Parallel. Default is -1, which joblib interprets as using all available CPU cores. See the joblib documentation for more details.
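As a rough guide to negative values, joblib documents that n_jobs < 0 resolves to n_cores + 1 + n_jobs workers, so -1 selects all cores and -2 all but one. The helper below is a hypothetical sketch of that rule, not a joblib or statsmodels function.

```python
import os

def resolve_n_jobs(n_jobs, n_cores=None):
    """Translate an n_jobs setting into a worker count, following joblib's documented rule."""
    if n_cores is None:
        n_cores = os.cpu_count() or 1
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # Negative values count back from the number of cores, with a floor of 1.
        return max(n_cores + 1 + n_jobs, 1)
    return n_jobs

workers_default = resolve_n_jobs(-1, n_cores=8)   # all 8 cores
workers_spare = resolve_n_jobs(-2, n_cores=8)     # leave one core free
```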

Examples

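>>> from statsmodels.nonparametric.kernel_density import EstimatorSettings, KDEMultivariate
>>> # data and var_type are assumed to be defined already
>>> settings = EstimatorSettings(randomize=True, n_jobs=3)
>>> k_dens = KDEMultivariate(data, var_type, defaults=settings)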

Methods