Multinomial logit Hessian matrix of the log-likelihood

Parameters: params (array-like) – The parameters of the model
Returns: hess – The Hessian, the second derivative of the log-likelihood function with respect to the flattened parameters, evaluated at params
Return type: ndarray, (J*K, J*K)


\[\frac{\partial^{2}\ln L}{\partial\beta_{j}\partial\beta_{l}}=-\sum_{i=1}^{n}\frac{\exp\left(\beta_{j}^{\prime}x_{i}\right)}{\sum_{k=0}^{J}\exp\left(\beta_{k}^{\prime}x_{i}\right)}\left[\boldsymbol{1}\left(j=l\right)-\frac{\exp\left(\beta_{l}^{\prime}x_{i}\right)}{\sum_{k=0}^{J}\exp\left(\beta_{k}^{\prime}x_{i}\right)}\right]x_{i}x_{i}^{\prime}\]

where \(\boldsymbol{1}\left(j=l\right)\) equals 1 if j = l and 0 otherwise.

The actual Hessian consists of J**2 blocks, one for each pair (j, l), and each block is a K x K matrix. Our Hessian is reshaped to be a square (J*K, J*K) array so that the solvers can use it.
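As an illustration of the formula and the reshaping, here is a minimal NumPy sketch. It is not the library's implementation. The function name `mnlogit_hessian` is hypothetical. The sketch assumes `params` is a (K, J) array for the J non-reference alternatives, that alternative 0 is the base with its coefficients fixed at zero, and that the flattened parameters are ordered alternative by alternative.

```python
import numpy as np

def mnlogit_hessian(params, X):
    """Sketch: Hessian of the multinomial logit log-likelihood.

    params : (K, J) coefficients for the J non-reference alternatives
             (assumption: alternative 0 is the base, beta_0 = 0)
    X      : (n, K) regressors
    """
    n, K = X.shape
    J = params.shape[1]
    # Linear predictors, with a zero column for the base alternative.
    eta = np.column_stack([np.zeros(n), X @ params])
    eta -= eta.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(eta)
    p /= p.sum(axis=1, keepdims=True)
    p = p[:, 1:]  # probabilities of the non-reference alternatives

    H = np.zeros((J, J, K, K))
    for j in range(J):
        for l in range(J):
            # Weight p_ij * (1(j = l) - p_il) for each observation i.
            w = p[:, j] * (float(j == l) - p[:, l])
            # Block (j, l): -sum_i w_i * x_i x_i'
            H[j, l] = -(X * w[:, None]).T @ X
    # (J, J, K, K) -> (J, K, J, K) -> square (J*K, J*K)
    return H.transpose(0, 2, 1, 3).reshape(J * K, J * K)

# Simulated inputs, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))              # n = 200, K = 3
params = 0.5 * rng.normal(size=(3, 2))     # K = 3, J = 2
H = mnlogit_hessian(params, X)
print(H.shape)
```

The transpose before the reshape places block (j, l) in rows j*K to (j+1)*K - 1 and columns l*K to (l+1)*K - 1, which matches a parameter vector that stacks the K coefficients of each alternative in turn.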

This implementation does not take advantage of the symmetry of the Hessian and could probably be refactored for speed.
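One possible way to exploit the symmetry, sketched here under the same assumptions as the formula above (alternative 0 as the base, `params` of shape (K, J), a hypothetical function name), is to compute only the blocks with l >= j and copy each one to its mirror position. This works because the off-diagonal weight p_ij * p_il is unchanged when j and l are swapped, and every block is itself a symmetric K x K matrix.

```python
import numpy as np

def mnlogit_hessian_sym(params, X):
    """Sketch: multinomial logit Hessian computing only blocks with l >= j."""
    n, K = X.shape
    J = params.shape[1]
    eta = np.column_stack([np.zeros(n), X @ params])  # base alternative: beta_0 = 0
    eta -= eta.max(axis=1, keepdims=True)             # numerical stability
    p = np.exp(eta)
    p /= p.sum(axis=1, keepdims=True)
    p = p[:, 1:]

    H = np.zeros((J, J, K, K))
    for j in range(J):
        for l in range(j, J):  # upper triangle of blocks only
            w = p[:, j] * (float(j == l) - p[:, l])
            block = -(X * w[:, None]).T @ X
            H[j, l] = block
            H[l, j] = block    # block (l, j) equals block (j, l)
    return H.transpose(0, 2, 1, 3).reshape(J * K, J * K)
```

This roughly halves the number of K x K products, from J**2 to J*(J+1)/2. Whether that matters in practice depends on J, K, and n.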