{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Custom statespace models\n", "\n", "The true power of the state space model is to allow the creation and estimation of custom models. This notebook shows various statespace models that subclass `sm.tsa.statespace.MLEModel`.\n", "\n", "Remember the general state space model can be written in the following general way:\n", "\n", "$$\n", "\\begin{aligned}\n", "y_t & = Z_t \\alpha_{t} + d_t + \\varepsilon_t \\\\\n", "\\alpha_{t+1} & = T_t \\alpha_{t} + c_t + R_t \\eta_{t}\n", "\\end{aligned}\n", "$$\n", "\n", "You can check the details and the dimensions of the objects [in this link](https://www.statsmodels.org/stable/statespace.html#custom-state-space-models)\n", "\n", "Most models won't include all of these elements. For example, the design matrix $Z_t$ might not depend on time ($\\forall t \\;Z_t = Z$), or the model won't have an observation intercept $d_t$.\n", "\n", "We'll start with something relatively simple and then show how to extend it bit by bit to include more elements.\n", "\n", "+ Model 1: time-varying coefficients. One observation equation with two state equations\n", "+ Model 2: time-varying parameters with non identity transition matrix\n", "+ Model 3: multiple observation and multiple state equations\n", "+ Bonus: pymc3 for Bayesian estimation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import statsmodels.api as sm\n", "import matplotlib.pyplot as plt\n", "from collections import OrderedDict\n", "plt.rc(\"figure\", figsize=(16,8))\n", "plt.rc(\"font\", size=15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model 1: time-varying coefficients\n", "\n", "$$\n", "\\begin{aligned}\n", "y_t & = d + x_t \\beta_{x,t} + w_t \\beta_{w,t} + \\varepsilon_t \\hspace{4em} \\varepsilon_t \\sim N(0, \\sigma_\\varepsilon^2)\\\\\n", "\\begin{bmatrix} \\beta_{x,t} \\\\ \\beta_{w,t} \\end{bmatrix} & = \\begin{bmatrix} \\beta_{x,t-1} \\\\ \\beta_{w,t-1} \\end{bmatrix} + \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\hspace{3.7em} \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{\\beta, x}^2 & 0 \\\\ 0 & \\sigma_{\\beta, w}^2 \\end{bmatrix} \\right )\n", "\\end{aligned}\n", "$$\n", "\n", "The observed data is $y_t, x_t, w_t$. With $x_t, w_t$ being the exogenous variables. Notice that the design matrix is time-varying, so it will have three dimensions (`k_endog x k_states x nobs`)\n", "\n", "The states are $\\beta_{x,t}$ and $\\beta_{w,t}$. The state equation tells us these states evolve with a random walk. Thus, in this case the transition matrix is a 2 by 2 identity matrix.\n", "\n", "We'll first simulate the data, the construct a model and finally estimate it." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def gen_data_for_model1():\n", " nobs = 1000\n", "\n", " rs = np.random.RandomState(seed=93572)\n", "\n", " d = 5\n", " var_y = 5\n", " var_coeff_x = 0.01\n", " var_coeff_w = 0.5\n", "\n", " x_t = rs.uniform(size=nobs)\n", " w_t = rs.uniform(size=nobs)\n", " eps = rs.normal(scale=var_y**0.5, size=nobs)\n", "\n", " beta_x = np.cumsum(rs.normal(size=nobs, scale=var_coeff_x**0.5))\n", " beta_w = np.cumsum(rs.normal(size=nobs, scale=var_coeff_w**0.5))\n", "\n", " y_t = d + beta_x * x_t + beta_w * w_t + eps\n", " return y_t, x_t, w_t, beta_x, beta_w \n", "\n", "y_t, x_t, w_t, beta_x, beta_w = gen_data_for_model1()\n", "_ = plt.plot(y_t)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class TVRegression(sm.tsa.statespace.MLEModel):\n", " def __init__(self, y_t, x_t, w_t):\n", " exog = np.c_[x_t, w_t] # shaped nobs x 2\n", "\n", " super(TVRegression, self).__init__(\n", " endog=y_t, exog=exog, k_states=2,\n", " initialization='diffuse')\n", "\n", " # Since the design matrix is time-varying, it must be\n", " # shaped k_endog x k_states x nobs\n", " # Notice that exog.T is shaped k_states x nobs, so we\n", " # just need to add a new first axis with shape 1\n", " self.ssm['design'] = exog.T[np.newaxis, :, :] # shaped 1 x 2 x nobs\n", " self.ssm['selection'] = np.eye(self.k_states)\n", " self.ssm['transition'] = np.eye(self.k_states)\n", " \n", " #Which parameters need to be positive?\n", " self.positive_parameters = slice(1, 4)\n", "\n", " @property\n", " def param_names(self):\n", " return ['intercept', 'var.e', 'var.x.coeff', 'var.w.coeff']\n", "\n", " @property\n", " def start_params(self):\n", " \"\"\"\n", " Defines the starting values for the parameters\n", " The linear regression gives us reasonable starting values for the constant\n", " d and the variance of the epsilon error\n", " \"\"\"\n", " exog = sm.add_constant(self.exog)\n", " res = sm.OLS(self.endog, exog).fit()\n", " params = np.r_[res.params[0], res.scale, 0.001, 0.001]\n", " return params\n", "\n", " def transform_params(self, unconstrained):\n", " \"\"\"\n", " We constraint the last three parameters\n", " ('var.e', 'var.x.coeff', 'var.w.coeff') to be positive, \n", " because they are variances\n", " \"\"\"\n", " constrained = unconstrained.copy()\n", " constrained[self.positive_parameters] = constrained[self.positive_parameters]**2\n", " return constrained\n", "\n", " def untransform_params(self, constrained):\n", " \"\"\"\n", " Need to unstransform all the parameters you transformed\n", " in the `transform_params` function\n", " \"\"\"\n", " unconstrained = constrained.copy()\n", " unconstrained[self.positive_parameters] = unconstrained[self.positive_parameters]**0.5\n", " return unconstrained\n", "\n", " def update(self, params, **kwargs):\n", " params = super(TVRegression, self).update(params, **kwargs)\n", "\n", " self['obs_intercept', 0, 0] = params[0]\n", " self['obs_cov', 0, 0] = params[1]\n", " self['state_cov'] = np.diag(params[2:4])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### And then estimate it with our custom model class" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod = TVRegression(y_t, x_t, w_t)\n", "res = mod.fit()\n", "\n", "print(res.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The values that generated the data were:\n", "\n", "+ intercept = 5\n", "+ var.e = 5\n", 
"+ var.x.coeff = 0.01\n", "+ var.w.coeff = 0.5\n", "\n", "\n", "As you can see, the estimation recovered the real parameters pretty well.\n", "\n", "We can also recover the estimated evolution of the underlying coefficients (or states in Kalman filter talk)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, axes = plt.subplots(2, figsize=(16, 8))\n", "\n", "ss = pd.DataFrame(res.smoothed_state.T, columns=['x', 'w'])\n", "\n", "axes[0].plot(beta_x, label='True')\n", "axes[0].plot(ss['x'], label='Smoothed estimate')\n", "axes[0].set(title='Time-varying coefficient on x_t')\n", "axes[0].legend()\n", "\n", "axes[1].plot(beta_w, label='True')\n", "axes[1].plot(ss['w'], label='Smoothed estimate')\n", "axes[1].set(title='Time-varying coefficient on w_t')\n", "axes[1].legend()\n", "\n", "fig.tight_layout();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model 2: time-varying parameters with non identity transition matrix\n", "\n", "This is a small extension from Model 1. Instead of having an identity transition matrix, we'll have one with two parameters ($\\rho_1, \\rho_2$) that we need to estimate. \n", "\n", "\n", "$$\n", "\\begin{aligned}\n", "y_t & = d + x_t \\beta_{x,t} + w_t \\beta_{w,t} + \\varepsilon_t \\hspace{4em} \\varepsilon_t \\sim N(0, \\sigma_\\varepsilon^2)\\\\\n", "\\begin{bmatrix} \\beta_{x,t} \\\\ \\beta_{w,t} \\end{bmatrix} & = \\begin{bmatrix} \\rho_1 & 0 \\\\ 0 & \\rho_2 \\end{bmatrix} \\begin{bmatrix} \\beta_{x,t-1} \\\\ \\beta_{w,t-1} \\end{bmatrix} + \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\hspace{3.7em} \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{\\beta, x}^2 & 0 \\\\ 0 & \\sigma_{\\beta, w}^2 \\end{bmatrix} \\right )\n", "\\end{aligned}\n", "$$\n", "\n", "\n", "What should we modify in our previous class to make things work?\n", "+ Good news: not a lot!\n", "+ Bad news: we need to be careful about a few things" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1) Change the starting parameters function\n", "\n", "We need to add names for the new parameters $\\rho_1, \\rho_2$ and we need to start corresponding starting values.\n", "\n", "The `param_names` function goes from:\n", "\n", "```python\n", "def param_names(self):\n", " return ['intercept', 'var.e', 'var.x.coeff', 'var.w.coeff']\n", "```\n", "\n", "\n", "to \n", "\n", "```python\n", "def param_names(self):\n", " return ['intercept', 'var.e', 'var.x.coeff', 'var.w.coeff',\n", " 'rho1', 'rho2']\n", "```\n", "\n", "and we change the `start_params` function from \n", "\n", "```python\n", "def start_params(self):\n", " exog = sm.add_constant(self.exog)\n", " res = sm.OLS(self.endog, exog).fit()\n", " params = np.r_[res.params[0], res.scale, 0.001, 0.001]\n", " return params\n", "```\n", "\n", "to\n", "\n", "```python\n", "def start_params(self):\n", " exog = sm.add_constant(self.exog)\n", " res = sm.OLS(self.endog, exog).fit()\n", " params = np.r_[res.params[0], res.scale, 0.001, 0.001, 0.8, 0.8]\n", " return params\n", "```\n", "\n", "2) Change the `update` function\n", "\n", "It goes from\n", "\n", "```python\n", "def update(self, params, **kwargs):\n", " params = super(TVRegression, self).update(params, **kwargs)\n", "\n", " self['obs_intercept', 0, 0] = params[0]\n", " self['obs_cov', 0, 0] = params[1]\n", " self['state_cov'] = np.diag(params[2:4])\n", "```\n", "\n", "\n", "to \n", "\n", "```python\n", 
"def update(self, params, **kwargs):\n", " params = super(TVRegression, self).update(params, **kwargs)\n", "\n", " self['obs_intercept', 0, 0] = params[0]\n", " self['obs_cov', 0, 0] = params[1]\n", " self['state_cov'] = np.diag(params[2:4])\n", " self['transition', 0, 0] = params[4]\n", " self['transition', 1, 1] = params[5]\n", "```\n", "\n", "\n", "3) (optional) change `transform_params` and `untransform_params`\n", "\n", "This is not required, but you might wanna restrict $\\rho_1, \\rho_2$ to lie between -1 and 1.\n", "In that case, we first import two utility functions from `statsmodels`.\n", "\n", "\n", "```python\n", "from statsmodels.tsa.statespace.tools import (\n", " constrain_stationary_univariate, unconstrain_stationary_univariate)\n", "```\n", "\n", "`constrain_stationary_univariate` constraint the value to be within -1 and 1. \n", "`unconstrain_stationary_univariate` provides the inverse function.\n", "The transform and untransform parameters function would look like this\n", "(remember that $\\rho_1, \\rho_2$ are in the 4 and 5th index):\n", "\n", "```python\n", "def transform_params(self, unconstrained):\n", " constrained = unconstrained.copy()\n", " constrained[self.positive_parameters] = constrained[self.positive_parameters]**2\n", " constrained[4] = constrain_stationary_univariate(constrained[4:5])\n", " constrained[5] = constrain_stationary_univariate(constrained[5:6])\n", " return constrained\n", "\n", "def untransform_params(self, constrained):\n", " unconstrained = constrained.copy()\n", " unconstrained[self.positive_parameters] = unconstrained[self.positive_parameters]**0.5\n", " unconstrained[4] = unconstrain_stationary_univariate(constrained[4:5])\n", " unconstrained[5] = unconstrain_stationary_univariate(constrained[5:6])\n", " return unconstrained\n", "```\n", "\n", "I'll write the full class below (without the optional changes I have just discussed)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class TVRegressionExtended(sm.tsa.statespace.MLEModel):\n", " def __init__(self, y_t, x_t, w_t):\n", " exog = np.c_[x_t, w_t] # shaped nobs x 2\n", "\n", " super(TVRegressionExtended, self).__init__(\n", " endog=y_t, exog=exog, k_states=2,\n", " initialization='diffuse')\n", "\n", " # Since the design matrix is time-varying, it must be\n", " # shaped k_endog x k_states x nobs\n", " # Notice that exog.T is shaped k_states x nobs, so we\n", " # just need to add a new first axis with shape 1\n", " self.ssm['design'] = exog.T[np.newaxis, :, :] # shaped 1 x 2 x nobs\n", " self.ssm['selection'] = np.eye(self.k_states)\n", " self.ssm['transition'] = np.eye(self.k_states)\n", " \n", " #Which parameters need to be positive?\n", " self.positive_parameters = slice(1, 4)\n", "\n", " @property\n", " def param_names(self):\n", " return ['intercept', 'var.e', 'var.x.coeff', 'var.w.coeff',\n", " 'rho1', 'rho2']\n", "\n", " @property\n", " def start_params(self):\n", " \"\"\"\n", " Defines the starting values for the parameters\n", " The linear regression gives us reasonable starting values for the constant\n", " d and the variance of the epsilon error\n", " \"\"\"\n", "\n", " exog = sm.add_constant(self.exog)\n", " res = sm.OLS(self.endog, exog).fit()\n", " params = np.r_[res.params[0], res.scale, 0.001, 0.001, 0.7, 0.8]\n", " return params\n", "\n", " def transform_params(self, unconstrained):\n", " \"\"\"\n", " We constraint the last three parameters\n", " ('var.e', 'var.x.coeff', 'var.w.coeff') to be positive, \n", " because they 
are variances\n", " \"\"\"\n", " constrained = unconstrained.copy()\n", " constrained[self.positive_parameters] = constrained[self.positive_parameters]**2\n", " return constrained\n", "\n", " def untransform_params(self, constrained):\n", " \"\"\"\n", " Need to unstransform all the parameters you transformed\n", " in the `transform_params` function\n", " \"\"\"\n", " unconstrained = constrained.copy()\n", " unconstrained[self.positive_parameters] = unconstrained[self.positive_parameters]**0.5\n", " return unconstrained\n", "\n", " def update(self, params, **kwargs):\n", " params = super(TVRegressionExtended, self).update(params, **kwargs)\n", "\n", " self['obs_intercept', 0, 0] = params[0]\n", " self['obs_cov', 0, 0] = params[1]\n", " self['state_cov'] = np.diag(params[2:4])\n", " self['transition', 0, 0] = params[4]\n", " self['transition', 1, 1] = params[5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To estimate, we'll use the same data as in model 1 and expect the $\\rho_1, \\rho_2$ to be near 1.\n", "\n", "The results look pretty good!\n", "Note that this estimation can be quite sensitive to the starting value of $\\rho_1, \\rho_2$. If you try lower values, you'll see it fails to converge." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod = TVRegressionExtended(y_t, x_t, w_t)\n", "res = mod.fit(maxiter=2000) #it doesn't converge with 50 iters\n", "print(res.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model 3: multiple observation and state equations\n", "\n", "We'll keep the time-varying parameters, but this time we'll also have two observation equations.\n", "\n", "### Observation equations\n", "\n", "$\\hat{i_t}, \\hat{M_t}, \\hat{s_t}$ are observed each period.\n", "\n", "The model for the observation equation has two equations:\n", "\n", "$$ \\hat{i_t} = \\alpha_1 * \\hat{s_t} + \\varepsilon_1 $$\n", "\n", "$$ \\hat{M_t} = \\alpha_2 + \\varepsilon_2 $$\n", "\n", "Following the [general notation from state space models](https://www.statsmodels.org/stable/statespace.html), the endogenous part of the observation equation is $y_t = (\\hat{i_t}, \\hat{M_t})$ and we only have one exogenous variable $\\hat{s_t}$\n", "\n", "\n", "### State equations\n", "\n", "\n", "$$ \\alpha_{1, t+1} = \\delta_1 \\alpha_{1, t} + \\delta_2 \\alpha_{2, t} + W_1 $$\n", "\n", "$$ \\alpha_{2, t+1} = \\delta_3 \\alpha_{2, t} + W_2 $$\n", "\n", "\n", "### Matrix notation for the state space model\n", "\n", "$$\n", "\\begin{aligned}\n", "\\begin{bmatrix} \\hat{i_t} \\\\ \\hat{M_t} \\end{bmatrix} &= \n", "\\begin{bmatrix} \\hat{s_t} & 0 \\\\ 0 & 1 \\end{bmatrix} \\begin{bmatrix} \\alpha_{1, t} \\\\ \\alpha_{2, t} \\end{bmatrix} + \\begin{bmatrix} \\varepsilon_{1, t} \\\\ \\varepsilon_{1, t} \\end{bmatrix} \\hspace{6.5em} \\varepsilon_t \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{\\varepsilon_1}^2 & 0 \\\\ 0 & \\sigma_{\\varepsilon_2}^2 \\end{bmatrix} \\right )\n", "\\\\\n", "\\begin{bmatrix} \\alpha_{1, t+1} \\\\ \\alpha_{2, t+1} \\end{bmatrix} & = \\begin{bmatrix} \\delta_1 & \\delta_1 \\\\ 0 & \\delta_3 \\end{bmatrix} \\begin{bmatrix} \\alpha_{1, t} \\\\ \\alpha_{2, t} \\end{bmatrix} + \\begin{bmatrix} W_1 \\\\ W_2 \\end{bmatrix} \\hspace{3.em} \\begin{bmatrix} W_1 \\\\ W_2 \\end{bmatrix} \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{W_1}^2 & 0 \\\\ 0 & \\sigma_{W_2}^2 \\end{bmatrix} \\right )\n", "\\end{aligned}\n", "$$\n", "\n", 
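"Before writing the code, it helps to pin down the dimension bookkeeping this matrix form implies (a sketch; the names match the simulation below):\n", "\n", "```python\n", "# k_endog = 2 observation equations, k_states = 2 states\n", "# design:     (k_endog, k_states, nobs)  time-varying because of s_t\n", "# obs_cov:    (k_endog, k_endog)         diag(var_e1, var_e2)\n", "# transition: (k_states, k_states)       [[delta1, delta2], [0, delta3]]\n", "# state_cov:  (k_states, k_states)       diag(var_w1, var_w2)\n", "# selection:  (k_states, k_states)       identity\n", "```\n", "\n",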
"I'll simulate some data, talk about what we need to modify and finally estimate the model to see if we're recovering something reasonable.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "true_values = {'var_e1': 0.01, 'var_e2': 0.01,\n", " 'var_w1': 0.01, 'var_w2': 0.01,\n", " 'delta1': 0.8, 'delta2': 0.5, 'delta3': 0.7}\n", "\n", "def gen_data_for_model3():\n", " #Starting values\n", " alpha1_0 = 2.1\n", " alpha2_0 = 1.1\n", "\n", " t_max = 500\n", "\n", " def gen_i(alpha1, s):\n", " return alpha1*s + np.sqrt(true_values['var_e1'])*np.random.randn()\n", "\n", " def gen_m_hat(alpha2):\n", " return 1*alpha2 + np.sqrt(true_values['var_e2'])*np.random.randn()\n", "\n", " def gen_alpha1(alpha1, alpha2):\n", " w1 = np.sqrt(true_values['var_w1'])*np.random.randn()\n", " return true_values['delta1'] * alpha1 + true_values['delta2'] * alpha2 + w1\n", "\n", " def gen_alpha2(alpha2):\n", " w2 = np.sqrt(true_values['var_w2'])*np.random.randn()\n", " return true_values['delta3'] * alpha2 + w2\n", "\n", " s_t = 0.3 + np.sqrt(1.4)*np.random.randn(t_max)\n", " i_hat = np.empty(t_max)\n", " m_hat = np.empty(t_max)\n", "\n", " current_alpha1 = alpha1_0\n", " current_alpha2 = alpha2_0\n", " for t in range(t_max):\n", " #Obs eqns\n", " i_hat[t] = gen_i(current_alpha1, s_t[t])\n", " m_hat[t] = gen_m_hat(current_alpha2)\n", "\n", " #state eqns\n", " new_alpha1 = gen_alpha1(current_alpha1, current_alpha2)\n", " new_alpha2 = gen_alpha2(current_alpha2)\n", "\n", " #Update states for next period\n", " current_alpha1 = new_alpha1\n", " current_alpha2 = new_alpha2\n", " \n", " return i_hat, m_hat, s_t\n", "\n", "i_hat, m_hat, s_t = gen_data_for_model3()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What do we need to modify?\n", "\n", "Once again, we don't need to change much, but we need to be careful about the dimensions.\n", "\n", "#### 1) The `__init__` function changes from\n", "\n", "\n", "```python\n", "def __init__(self, y_t, x_t, w_t):\n", " exog = np.c_[x_t, w_t]\n", "\n", " super(TVRegressionExtended, self).__init__(\n", " endog=y_t, exog=exog, k_states=2,\n", " initialization='diffuse')\n", "\n", " self.ssm['design'] = exog.T[np.newaxis, :, :] # shaped 1 x 2 x nobs\n", " self.ssm['selection'] = np.eye(self.k_states)\n", " self.ssm['transition'] = np.eye(self.k_states)\n", "```\n", "\n", "to\n", "\n", "\n", "```python\n", "def __init__(self, i_t: np.array, s_t: np.array, m_t: np.array):\n", "\n", " exog = np.c_[s_t, np.repeat(1, len(s_t))] # exog.shape => (nobs, 2)\n", " \n", " super(MultipleYsModel, self).__init__(\n", " endog=np.c_[i_t, m_t], exog=exog, k_states=2,\n", " initialization='diffuse')\n", " \n", " self.ssm['design'] = np.zeros((self.k_endog, self.k_states, self.nobs))\n", " self.ssm['design', 0, 0, :] = s_t\n", " self.ssm['design', 1, 1, :] = 1\n", "```\n", "\n", "Note that we did not have to specify `k_endog` anywhere. 
The initialization does this for us after checking the dimensions of the `endog` matrix.\n", "\n", "\n", "#### 2) The `update()` function \n", "\n", "changes from \n", "\n", "```python\n", "def update(self, params, **kwargs):\n", " params = super(TVRegressionExtended, self).update(params, **kwargs)\n", "\n", " self['obs_intercept', 0, 0] = params[0]\n", " self['obs_cov', 0, 0] = params[1]\n", " \n", " self['state_cov'] = np.diag(params[2:4])\n", " self['transition', 0, 0] = params[4]\n", " self['transition', 1, 1] = params[5]\n", "```\n", "\n", "\n", "to\n", "\n", "\n", "```python\n", "def update(self, params, **kwargs):\n", " params = super(MultipleYsModel, self).update(params, **kwargs)\n", " \n", " \n", " #The following line is not needed (by default, this matrix is initialized by zeroes),\n", " #But I leave it here so the dimensions are clearer\n", " self['obs_intercept'] = np.repeat([np.array([0, 0])], self.nobs, axis=0).T\n", " self['obs_cov', 0, 0] = params[0]\n", " self['obs_cov', 1, 1] = params[1]\n", "\n", " self['state_cov'] = np.diag(params[2:4])\n", " #delta1, delta2, delta3\n", " self['transition', 0, 0] = params[4]\n", " self['transition', 0, 1] = params[5]\n", " self['transition', 1, 1] = params[6]\n", "```\n", "\n", "The rest of the methods change in pretty obvious ways (need to add parameter names, make sure the indexes work, etc). The full code for the function is right below" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "starting_values = {'var_e1': 0.2, 'var_e2': 0.1,\n", " 'var_w1': 0.15, 'var_w2': 0.18,\n", " 'delta1': 0.7, 'delta2': 0.1, 'delta3': 0.85}\n", "\n", "\n", "class MultipleYsModel(sm.tsa.statespace.MLEModel):\n", " def __init__(self, i_t: np.array, s_t: np.array, m_t: np.array):\n", "\n", " exog = np.c_[s_t, np.repeat(1, len(s_t))] # exog.shape => (nobs, 2)\n", " \n", " super(MultipleYsModel, self).__init__(\n", " endog=np.c_[i_t, m_t], exog=exog, k_states=2,\n", " initialization='diffuse')\n", " \n", " self.ssm['design'] = np.zeros((self.k_endog, self.k_states, self.nobs))\n", " self.ssm['design', 0, 0, :] = s_t\n", " self.ssm['design', 1, 1, :] = 1\n", " \n", " #These have ok shape. 
Placeholders since I'm changing them\n", " #in the update() function\n", " self.ssm['selection'] = np.eye(self.k_states)\n", " self.ssm['transition'] = np.eye(self.k_states)\n", " \n", " #Dictionary of positions to names\n", " self.position_dict = OrderedDict(var_e1=1, var_e2=2, \n", " var_w1=3, var_w2=4,\n", " delta1=5, delta2=6, delta3=7)\n", " self.initial_values = starting_values\n", " self.positive_parameters = slice(0, 4)\n", " \n", "\n", " @property\n", " def param_names(self):\n", " return list(self.position_dict.keys())\n", "\n", " @property\n", " def start_params(self):\n", " \"\"\"\n", " Initial values\n", " \"\"\"\n", " #(optional) Use scale for var_e1 and var_e2 starting values\n", " params = np.r_[self.initial_values['var_e1'],\n", " self.initial_values['var_e2'],\n", " self.initial_values['var_w1'],\n", " self.initial_values['var_w2'],\n", " self.initial_values['delta1'],\n", " self.initial_values['delta2'],\n", " self.initial_values['delta3']]\n", " return params\n", "\n", " def transform_params(self, unconstrained):\n", " \"\"\"\n", " If you need to restrict parameters\n", " For example, variances should be > 0\n", " Parameters maybe have to be within -1 and 1\n", " \"\"\"\n", " constrained = unconstrained.copy()\n", " constrained[self.positive_parameters] = constrained[self.positive_parameters]**2\n", " return constrained\n", "\n", " def untransform_params(self, constrained):\n", " \"\"\"\n", " Need to reverse what you did in transform_params()\n", " \"\"\"\n", " unconstrained = constrained.copy()\n", " unconstrained[self.positive_parameters] = unconstrained[self.positive_parameters]**0.5\n", " return unconstrained\n", "\n", " def update(self, params, **kwargs):\n", " params = super(MultipleYsModel, self).update(params, **kwargs)\n", " \n", " #The following line is not needed (by default, this matrix is initialized by zeroes),\n", " #But I leave it here so the dimensions are clearer\n", " self['obs_intercept'] = np.repeat([np.array([0, 0])], self.nobs, axis=0).T\n", " \n", " self['obs_cov', 0, 0] = params[0]\n", " self['obs_cov', 1, 1] = params[1]\n", " \n", " self['state_cov'] = np.diag(params[2:4])\n", " \n", " #delta1, delta2, delta3\n", " self['transition', 0, 0] = params[4]\n", " self['transition', 0, 1] = params[5]\n", " self['transition', 1, 1] = params[6]\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod = MultipleYsModel(i_hat, s_t, m_hat)\n", "res = mod.fit()\n", "\n", "print(res.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bonus: pymc3 for fast Bayesian estimation\n", "\n", "In this section I'll show how you can take your custom state space model and easily plug it to `pymc3` and estimate it with Bayesian methods. In particular, this example will show you an estimation with a version of Hamiltonian Monte Carlo called the No-U-Turn Sampler (NUTS).\n", "\n", "I'm basically copying the ideas contained [in this notebook](https://www.statsmodels.org/dev/examples/notebooks/generated/statespace_sarimax_pymc3.html), so make sure to check that for more details." 
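] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The glue between `pymc3` and our model is the pair of methods that every `MLEModel` exposes: `loglike(params)`, which returns the log-likelihood at a parameter vector, and `score(params)`, which returns its gradient. The helper classes below simply wrap these two calls. As a quick sketch of what they return (using the `MultipleYsModel` instance from the previous cell):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch: the two model methods that the Theano Ops below wrap\n", "params = mod.start_params\n", "print(mod.loglike(params))  # scalar log-likelihood\n", "print(mod.score(params))    # gradient vector, one entry per parameter"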
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Extra requirements\n", "import theano\n", "import theano.tensor as tt\n", "import pymc3 as pm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need to define some helper functions to connect theano to the likelihood function that is implied in our model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Loglike(tt.Op):\n", "\n", " itypes = [tt.dvector] # expects a vector of parameter values when called\n", " otypes = [tt.dscalar] # outputs a single scalar value (the log likelihood)\n", "\n", " def __init__(self, model):\n", " self.model = model\n", " self.score = Score(self.model)\n", "\n", " def perform(self, node, inputs, outputs):\n", " theta, = inputs # contains the vector of parameters\n", " llf = self.model.loglike(theta)\n", " outputs[0][0] = np.array(llf) # output the log-likelihood\n", "\n", " def grad(self, inputs, g):\n", " # the method that calculates the gradients - it actually returns the\n", " # vector-Jacobian product - g[0] is a vector of parameter values\n", " theta, = inputs # our parameters\n", " out = [g[0] * self.score(theta)]\n", " return out\n", "\n", "\n", "class Score(tt.Op):\n", " itypes = [tt.dvector]\n", " otypes = [tt.dvector]\n", "\n", " def __init__(self, model):\n", " self.model = model\n", "\n", " def perform(self, node, inputs, outputs):\n", " theta, = inputs\n", " outputs[0][0] = self.model.score(theta)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll simulate again the data we used for model 1.\n", "We'll also `fit` it again and save the results to compare them to the Bayesian posterior we get." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y_t, x_t, w_t, beta_x, beta_w = gen_data_for_model1()\n", "plt.plot(y_t)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod = TVRegression(y_t, x_t, w_t)\n", "res_mle = mod.fit(disp=False)\n", "print(res_mle.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bayesian estimation\n", "\n", "We need to define a prior for each parameter and the number of draws and burn-in points" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set sampling params\n", "ndraws = 3000 # 3000 number of draws from the distribution\n", "nburn = 600 # 600 number of \"burn-in points\" (which will be discarded)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Construct an instance of the Theano wrapper defined above, which\n", "# will allow PyMC3 to compute the likelihood and Jacobian in a way\n", "# that it can make use of. 
Here we are using the same model instance\n", "# created earlier for MLE analysis (we could also create a new model\n", "# instance if we preferred)\n", "loglike = Loglike(mod)\n", "\n", "with pm.Model():\n", " # Priors\n", " intercept = pm.Uniform('intercept', 1, 10)\n", " var_e = pm.InverseGamma('var.e', 2.3, 0.5)\n", " var_x_coeff = pm.InverseGamma('var.x.coeff', 2.3, 0.1)\n", " var_w_coeff = pm.InverseGamma('var.w.coeff', 2.3, 0.1)\n", "\n", " # convert variables to tensor vectors\n", " theta = tt.as_tensor_variable([intercept, var_e, var_x_coeff, var_w_coeff])\n", "\n", " # use a DensityDist (use a lamdba function to \"call\" the Op)\n", " pm.DensityDist('likelihood', lambda v: loglike(v), observed={'v': theta})\n", "\n", " # Draw samples\n", " trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True, cores=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How does the posterior distribution compare with the MLE estimation?\n", "\n", "The clearly peak around the MLE estimate." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "results_dict = {'intercept': res_mle.params[0], 'var.e': res_mle.params[1],\n", " 'var.x.coeff': res_mle.params[2], 'var.w.coeff': res_mle.params[3]}\n", "plt.tight_layout()\n", "_ = pm.traceplot(trace,\n", " lines=[(k, {}, [v]) for k, v in dict(results_dict).items()],\n", " combined=True,\n", " figsize=(12, 12))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 4 }