Tools

Our tool collection contains some convenience functions for users and functions that were written mainly for internal use.

In addition to this tools directory, several other subpackages have their own tools modules, for example statsmodels.tsa.tsatools.

Module Reference

Basic tools (tools)

These are basic and miscellaneous tools. The full import path is statsmodels.tools.tools.

tools.add_constant(data[, prepend, has_constant])

Adds a column of ones to an array
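As a minimal sketch of typical usage, the example below adds an intercept column to a small design matrix. The array values are made up for illustration, and the sketch assumes the default prepend=True, which places the constant in the first column.

```python
import numpy as np
from statsmodels.tools.tools import add_constant

# Hypothetical regressor values, made up for illustration.
x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Default call: the column of ones is placed first.
x_const = add_constant(x)

# prepend=False places the column of ones last instead.
x_const_last = add_constant(x, prepend=False)

print(x_const)
print(x_const_last)
```

The has_constant argument controls what happens when the input already contains a constant column; it is left at its default here.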

The next group consists mostly of helper functions that are either not separately tested or insufficiently tested.

tools.categorical(data[, col, dictnames, drop])

Returns a dummy matrix given an array of categorical variables.

tools.clean0(matrix)

Erase columns of zeros: can save some time in pseudoinverse.

tools.fullrank(X[, r])

Return a matrix whose column span is the same as X.

tools.isestimable(C, D)

True if (Q, P) contrast C is estimable for (N, P) design D
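The following sketch illustrates fullrank and isestimable on a deliberately rank-deficient design. The design matrix D is a made-up example whose third column is the sum of the first two, so only contrasts that lie in the row space of D are estimable.

```python
import numpy as np
from statsmodels.tools.tools import fullrank, isestimable

# Hypothetical (6, 3) design: column 3 = column 1 + column 2, so rank is 2.
D = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1],
              [1, 1, 1, 1, 1, 1]], dtype=float).T

# fullrank returns a matrix with the same column span but no redundant columns.
basis = fullrank(D)
print(basis.shape)

# The difference between the two group effects is estimable ...
est_diff = isestimable([1, -1, 0], D)
# ... but the first coefficient on its own is not.
est_single = isestimable([1, 0, 0], D)
print(est_diff, est_single)
```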

tools.recipr(x)

Return the reciprocal of an array, setting all entries less than or equal to 0 to 0.

tools.recipr0(x)

Return the reciprocal of an array, setting all entries equal to 0 to 0.
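The difference between recipr and recipr0 only shows up for negative entries. A small sketch with made-up input values:

```python
import numpy as np
from statsmodels.tools.tools import recipr, recipr0

# Hypothetical input containing a positive, a zero, and a negative entry.
x = np.array([2.0, 0.0, -4.0])

# recipr: every entry <= 0 is mapped to 0.
r = recipr(x)

# recipr0: only entries exactly equal to 0 are mapped to 0.
r0 = recipr0(x)

print(r)
print(r0)
```

With this input, the negative entry is zeroed by recipr but inverted by recipr0.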

tools.unsqueeze(data, axis, oldshape)

Unsqueeze a collapsed array

Numerical Differentiation

numdiff.approx_fprime(x, f[, epsilon, args, …])

Gradient of function, or Jacobian if function f returns 1d array

numdiff.approx_fprime_cs(x, f[, epsilon, …])

Calculate gradient or Jacobian with complex step derivative approximation
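As an illustration, the sketch below compares the finite-difference and complex-step gradients against a known analytic gradient. The test function f is a hypothetical example chosen because its derivatives are easy to write down; note that the complex-step variant requires f to accept complex-valued input.

```python
import numpy as np
from statsmodels.tools import numdiff

def f(x):
    # Hypothetical test function: f(x) = x0**2 + 3*x0*x1.
    return x[0] ** 2 + 3.0 * x[0] * x[1]

x0 = np.array([1.0, 2.0])

# Analytic gradient: (2*x0 + 3*x1, 3*x0).
grad_exact = np.array([2.0 * x0[0] + 3.0 * x0[1], 3.0 * x0[0]])

# Forward differences (default) and central differences.
grad_forward = numdiff.approx_fprime(x0, f)
grad_central = numdiff.approx_fprime(x0, f, centered=True)

# Complex-step approximation; f must work with complex arguments.
grad_cs = numdiff.approx_fprime_cs(x0, f)

for name, g in [("forward", grad_forward),
                ("central", grad_central),
                ("complex-step", grad_cs)]:
    print(name, np.max(np.abs(g - grad_exact)))
```

The complex-step result is usually the most accurate of the three, since it avoids the subtractive cancellation of finite differences.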

numdiff.approx_hess1(x, f[, epsilon, args, …])

Calculate Hessian with finite difference derivative approximation

numdiff.approx_hess2(x, f[, epsilon, args, …])

Calculate Hessian with finite difference derivative approximation

numdiff.approx_hess3(x, f[, epsilon, args, …])

Calculate Hessian with finite difference derivative approximation

numdiff.approx_hess_cs(x, f[, epsilon, …])

Calculate Hessian with complex-step derivative approximation
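The Hessian functions share the same calling convention, so they can be compared side by side. The sketch below uses a hypothetical quadratic test function with a known Hessian; as with the complex-step gradient, approx_hess_cs requires f to accept complex-valued input.

```python
import numpy as np
from statsmodels.tools import numdiff

def f(x):
    # Hypothetical test function: f(x) = x0**2 + 3*x0*x1.
    return x[0] ** 2 + 3.0 * x[0] * x[1]

x0 = np.array([1.0, 2.0])

# Analytic Hessian of f (constant, because f is quadratic).
hess_exact = np.array([[2.0, 3.0],
                       [3.0, 0.0]])

# All variants take the evaluation point first, then the function.
results = {
    "approx_hess1": numdiff.approx_hess1(x0, f),
    "approx_hess2": numdiff.approx_hess2(x0, f),
    "approx_hess3": numdiff.approx_hess3(x0, f),
    "approx_hess_cs": numdiff.approx_hess_cs(x0, f),
}

for name, hess in results.items():
    print(name, np.max(np.abs(hess - hess_exact)))
```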

Measures of fit performance (eval_measures)

The first group of functions in this module are standalone versions of the information criteria aic, bic and hqic. The functions with the _sigma suffix take the error sum of squares as argument; those without it take the value of the log-likelihood, llf, as argument.

The second group of functions are measures of fit or prediction performance, which are mostly one-liners to be used as helper functions. All of them calculate a performance or distance statistic for the difference between two arrays. For example, in the case of Monte Carlo or cross-validation, the first array would hold the estimation results for the different replications or draws, while the second array would hold the true or observed values.

eval_measures.aic(llf, nobs, df_modelwc)

Akaike information criterion

eval_measures.aic_sigma(sigma2, nobs, df_modelwc)

Akaike information criterion

eval_measures.aicc(llf, nobs, df_modelwc)

Akaike information criterion (AIC) with small sample correction

eval_measures.aicc_sigma(sigma2, nobs, …)

Akaike information criterion (AIC) with small sample correction

eval_measures.bic(llf, nobs, df_modelwc)

Bayesian information criterion (BIC) or Schwarz criterion

eval_measures.bic_sigma(sigma2, nobs, df_modelwc)

Bayesian information criterion (BIC) or Schwarz criterion

eval_measures.hqic(llf, nobs, df_modelwc)

Hannan-Quinn information criterion (HQC)

eval_measures.hqic_sigma(sigma2, nobs, …)

Hannan-Quinn information criterion (HQC)
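A short sketch of the log-likelihood based versions follows. The values of llf, nobs and df_modelwc are made up for illustration; df_modelwc is taken here to be the number of estimated parameters including the constant.

```python
import numpy as np
from statsmodels.tools import eval_measures

# Hypothetical fit summary, made up for illustration.
llf = -120.5      # log-likelihood at the estimate
nobs = 100        # number of observations
df_modelwc = 3    # number of parameters, including the constant

aic = eval_measures.aic(llf, nobs, df_modelwc)
bic = eval_measures.bic(llf, nobs, df_modelwc)
hqic = eval_measures.hqic(llf, nobs, df_modelwc)

# The same quantities from the usual textbook definitions.
aic_manual = -2.0 * llf + 2.0 * df_modelwc
bic_manual = -2.0 * llf + np.log(nobs) * df_modelwc
hqic_manual = -2.0 * llf + 2.0 * np.log(np.log(nobs)) * df_modelwc

print(aic, bic, hqic)
```

When comparing candidate models fitted to the same data, the model with the smaller criterion value is preferred.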

eval_measures.bias(x1, x2[, axis])

bias, mean error

eval_measures.iqr(x1, x2[, axis])

interquartile range of error

eval_measures.maxabs(x1, x2[, axis])

maximum absolute error

eval_measures.meanabs(x1, x2[, axis])

mean absolute error

eval_measures.medianabs(x1, x2[, axis])

median absolute error

eval_measures.medianbias(x1, x2[, axis])

median bias, median error

eval_measures.mse(x1, x2[, axis])

mean squared error

eval_measures.rmse(x1, x2[, axis])

root mean squared error

eval_measures.stde(x1, x2[, ddof, axis])

standard deviation of error

eval_measures.vare(x1, x2[, ddof, axis])

variance of error
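The error measures all follow the same pattern of taking two arrays. A minimal sketch, with made-up predicted and observed values, that checks a few of them against plain NumPy expressions:

```python
import numpy as np
from statsmodels.tools import eval_measures

# Hypothetical predictions and observations, made up for illustration.
predicted = np.array([2.5, 0.0, 2.0, 8.0])
observed = np.array([3.0, -0.5, 2.0, 7.0])

# Errors are computed as first argument minus second argument.
errors = predicted - observed

bias = eval_measures.bias(predicted, observed)        # mean error
mse = eval_measures.mse(predicted, observed)          # mean squared error
rmse = eval_measures.rmse(predicted, observed)        # root mean squared error
meanabs = eval_measures.meanabs(predicted, observed)  # mean absolute error
maxabs = eval_measures.maxabs(predicted, observed)    # maximum absolute error

print(bias, mse, rmse, meanabs, maxabs)
```

For two-dimensional inputs, the axis argument selects the dimension along which each statistic is computed.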