{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Custom statespace models\n", "\n", "The true power of the state space model is to allow the creation and estimation of custom models. This notebook shows various statespace models that subclass `sm.tsa.statespace.MLEModel`.\n", "\n", "Remember the general state space model can be written in the following general way:\n", "\n", "$$\n", "\\begin{aligned}\n", "y_t & = Z_t \\alpha_{t} + d_t + \\varepsilon_t \\\\\n", "\\alpha_{t+1} & = T_t \\alpha_{t} + c_t + R_t \\eta_{t}\n", "\\end{aligned}\n", "$$\n", "\n", "You can check the details and the dimensions of the objects [in this link](https://www.statsmodels.org/stable/statespace.html#custom-state-space-models)\n", "\n", "Most models won't include all of these elements. For example, the design matrix $Z_t$ might not depend on time ($\\forall t \\;Z_t = Z$), or the model won't have an observation intercept $d_t$.\n", "\n", "We'll start with something relatively simple and then show how to extend it bit by bit to include more elements.\n", "\n", "+ Model 1: time-varying coefficients. One observation equation with two state equations\n", "+ Model 2: time-varying parameters with non identity transition matrix\n", "+ Model 3: multiple observation and multiple state equations\n", "+ Bonus: pymc3 for Bayesian estimation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import statsmodels.api as sm\n", "import matplotlib.pyplot as plt\n", "from collections import OrderedDict\n", "plt.rc(\"figure\", figsize=(16,8))\n", "plt.rc(\"font\", size=15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model 1: time-varying coefficients\n", "\n", "$$\n", "\\begin{aligned}\n", "y_t & = d + x_t \\beta_{x,t} + w_t \\beta_{w,t} + \\varepsilon_t \\hspace{4em} \\varepsilon_t \\sim N(0, \\sigma_\\varepsilon^2)\\\\\n", "\\begin{bmatrix} \\beta_{x,t} \\\\ \\beta_{w,t} \\end{bmatrix} & = \\begin{bmatrix} \\beta_{x,t-1} \\\\ \\beta_{w,t-1} \\end{bmatrix} + \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\hspace{3.7em} \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{\\beta, x}^2 & 0 \\\\ 0 & \\sigma_{\\beta, w}^2 \\end{bmatrix} \\right )\n", "\\end{aligned}\n", "$$\n", "\n", "The observed data is $y_t, x_t, w_t$. With $x_t, w_t$ being the exogenous variables. Notice that the design matrix is time-varying, so it will have three dimensions (`k_endog x k_states x nobs`)\n", "\n", "The states are $\\beta_{x,t}$ and $\\beta_{w,t}$. The state equation tells us these states evolve with a random walk. Thus, in this case the transition matrix is a 2 by 2 identity matrix.\n", "\n", "We'll first simulate the data, the construct a model and finally estimate it." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def gen_data_for_model1():\n", " nobs = 1000\n", "\n", " rs = np.random.RandomState(seed=93572)\n", "\n", " d = 5\n", " var_y = 5\n", " var_coeff_x = 0.01\n", " var_coeff_w = 0.5\n", "\n", " x_t = rs.uniform(size=nobs)\n", " w_t = rs.uniform(size=nobs)\n", " eps = rs.normal(scale=var_y**0.5, size=nobs)\n", "\n", " beta_x = np.cumsum(rs.normal(size=nobs, scale=var_coeff_x**0.5))\n", " beta_w = np.cumsum(rs.normal(size=nobs, scale=var_coeff_w**0.5))\n", "\n", " y_t = d + beta_x * x_t + beta_w * w_t + eps\n", " return y_t, x_t, w_t, beta_x, beta_w \n", "\n", "y_t, x_t, w_t, beta_x, beta_w = gen_data_for_model1()\n", "_ = plt.plot(y_t)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class TVRegression(sm.tsa.statespace.MLEModel):\n", " def __init__(self, y_t, x_t, w_t):\n", " exog = np.c_[x_t, w_t] # shaped nobs x 2\n", "\n", " super(TVRegression, self).__init__(\n", " endog=y_t, exog=exog, k_states=2,\n", " initialization='diffuse')\n", "\n", " # Since the design matrix is time-varying, it must be\n", " # shaped k_endog x k_states x nobs\n", " # Notice that exog.T is shaped k_states x nobs, so we\n", " # just need to add a new first axis with shape 1\n", " self.ssm['design'] = exog.T[np.newaxis, :, :] # shaped 1 x 2 x nobs\n", " self.ssm['selection'] = np.eye(self.k_states)\n", " self.ssm['transition'] = np.eye(self.k_states)\n", " \n", " #Which parameters need to be positive?\n", " self.positive_parameters = slice(1, 4)\n", "\n", " @property\n", " def param_names(self):\n", " return ['intercept', 'var.e', 'var.x.coeff', 'var.w.coeff']\n", "\n", " @property\n", " def start_params(self):\n", " \"\"\"\n", " Defines the starting values for the parameters\n", " The linear regression gives us reasonable starting values for the constant\n", " d and the variance of the epsilon error\n", " \"\"\"\n", " exog = sm.add_constant(self.exog)\n", " res = sm.OLS(self.endog, exog).fit()\n", " params = np.r_[res.params[0], res.scale, 0.001, 0.001]\n", " return params\n", "\n", " def transform_params(self, unconstrained):\n", " \"\"\"\n", " We constraint the last three parameters\n", " ('var.e', 'var.x.coeff', 'var.w.coeff') to be positive, \n", " because they are variances\n", " \"\"\"\n", " constrained = unconstrained.copy()\n", " constrained[self.positive_parameters] = constrained[self.positive_parameters]**2\n", " return constrained\n", "\n", " def untransform_params(self, constrained):\n", " \"\"\"\n", " Need to unstransform all the parameters you transformed\n", " in the `transform_params` function\n", " \"\"\"\n", " unconstrained = constrained.copy()\n", " unconstrained[self.positive_parameters] = unconstrained[self.positive_parameters]**0.5\n", " return unconstrained\n", "\n", " def update(self, params, **kwargs):\n", " params = super(TVRegression, self).update(params, **kwargs)\n", "\n", " self['obs_intercept', 0, 0] = params[0]\n", " self['obs_cov', 0, 0] = params[1]\n", " self['state_cov'] = np.diag(params[2:4])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### And then estimate it with our custom model class" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod = TVRegression(y_t, x_t, w_t)\n", "res = mod.fit()\n", "\n", "print(res.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The values that generated the data were:\n", "\n", "+ intercept = 5\n", "+ var.e = 5\n", 
"+ var.x.coeff = 0.01\n", "+ var.w.coeff = 0.5\n", "\n", "\n", "As you can see, the estimation recovered the real parameters pretty well.\n", "\n", "We can also recover the estimated evolution of the underlying coefficients (or states in Kalman filter talk)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, axes = plt.subplots(2, figsize=(16, 8))\n", "\n", "ss = pd.DataFrame(res.smoothed_state.T, columns=['x', 'w'])\n", "\n", "axes[0].plot(beta_x, label='True')\n", "axes[0].plot(ss['x'], label='Smoothed estimate')\n", "axes[0].set(title='Time-varying coefficient on x_t')\n", "axes[0].legend()\n", "\n", "axes[1].plot(beta_w, label='True')\n", "axes[1].plot(ss['w'], label='Smoothed estimate')\n", "axes[1].set(title='Time-varying coefficient on w_t')\n", "axes[1].legend()\n", "\n", "fig.tight_layout();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model 2: time-varying parameters with non identity transition matrix\n", "\n", "This is a small extension from Model 1. Instead of having an identity transition matrix, we'll have one with two parameters ($\\rho_1, \\rho_2$) that we need to estimate. \n", "\n", "\n", "$$\n", "\\begin{aligned}\n", "y_t & = d + x_t \\beta_{x,t} + w_t \\beta_{w,t} + \\varepsilon_t \\hspace{4em} \\varepsilon_t \\sim N(0, \\sigma_\\varepsilon^2)\\\\\n", "\\begin{bmatrix} \\beta_{x,t} \\\\ \\beta_{w,t} \\end{bmatrix} & = \\begin{bmatrix} \\rho_1 & 0 \\\\ 0 & \\rho_2 \\end{bmatrix} \\begin{bmatrix} \\beta_{x,t-1} \\\\ \\beta_{w,t-1} \\end{bmatrix} + \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\hspace{3.7em} \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{\\beta, x}^2 & 0 \\\\ 0 & \\sigma_{\\beta, w}^2 \\end{bmatrix} \\right )\n", "\\end{aligned}\n", "$$\n", "\n", "\n", "What should we modify in our previous class to make things work?\n", "+ Good news: not a lot!\n", "+ Bad news: we need to be careful about a few things" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1) Change the starting parameters function\n", "\n", "We need to add names for the new parameters $\\rho_1, \\rho_2$ and we need to start corresponding starting values.\n", "\n", "The `param_names` function goes from:\n", "\n", "```python\n", "def param_names(self):\n", " return ['intercept', 'var.e', 'var.x.coeff', 'var.w.coeff']\n", "```\n", "\n", "\n", "to \n", "\n", "```python\n", "def param_names(self):\n", " return ['intercept', 'var.e', 'var.x.coeff', 'var.w.coeff',\n", " 'rho1', 'rho2']\n", "```\n", "\n", "and we change the `start_params` function from \n", "\n", "```python\n", "def start_params(self):\n", " exog = sm.add_constant(self.exog)\n", " res = sm.OLS(self.endog, exog).fit()\n", " params = np.r_[res.params[0], res.scale, 0.001, 0.001]\n", " return params\n", "```\n", "\n", "to\n", "\n", "```python\n", "def start_params(self):\n", " exog = sm.add_constant(self.exog)\n", " res = sm.OLS(self.endog, exog).fit()\n", " params = np.r_[res.params[0], res.scale, 0.001, 0.001, 0.8, 0.8]\n", " return params\n", "```\n", "\n", "2) Change the `update` function\n", "\n", "It goes from\n", "\n", "```python\n", "def update(self, params, **kwargs):\n", " params = super(TVRegression, self).update(params, **kwargs)\n", "\n", " self['obs_intercept', 0, 0] = params[0]\n", " self['obs_cov', 0, 0] = params[1]\n", " self['state_cov'] = np.diag(params[2:4])\n", "```\n", "\n", "\n", "to \n", "\n", "```python\n", 
"def update(self, params, **kwargs):\n", " params = super(TVRegression, self).update(params, **kwargs)\n", "\n", " self['obs_intercept', 0, 0] = params[0]\n", " self['obs_cov', 0, 0] = params[1]\n", " self['state_cov'] = np.diag(params[2:4])\n", " self['transition', 0, 0] = params[4]\n", " self['transition', 1, 1] = params[5]\n", "```\n", "\n", "\n", "3) (optional) change `transform_params` and `untransform_params`\n", "\n", "This is not required, but you might wanna restrict $\\rho_1, \\rho_2$ to lie between -1 and 1.\n", "In that case, we first import two utility functions from `statsmodels`.\n", "\n", "\n", "```python\n", "from statsmodels.tsa.statespace.tools import (\n", " constrain_stationary_univariate, unconstrain_stationary_univariate)\n", "```\n", "\n", "`constrain_stationary_univariate` constraint the value to be within -1 and 1. \n", "`unconstrain_stationary_univariate` provides the inverse function.\n", "The transform and untransform parameters function would look like this\n", "(remember that $\\rho_1, \\rho_2$ are in the 4 and 5th index):\n", "\n", "```python\n", "def transform_params(self, unconstrained):\n", " constrained = unconstrained.copy()\n", " constrained[self.positive_parameters] = constrained[self.positive_parameters]**2\n", " constrained[4] = constrain_stationary_univariate(constrained[4:5])\n", " constrained[5] = constrain_stationary_univariate(constrained[5:6])\n", " return constrained\n", "\n", "def untransform_params(self, constrained):\n", " unconstrained = constrained.copy()\n", " unconstrained[self.positive_parameters] = unconstrained[self.positive_parameters]**0.5\n", " unconstrained[4] = unconstrain_stationary_univariate(constrained[4:5])\n", " unconstrained[5] = unconstrain_stationary_univariate(constrained[5:6])\n", " return unconstrained\n", "```\n", "\n", "I'll write the full class below (without the optional changes I have just discussed)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class TVRegressionExtended(sm.tsa.statespace.MLEModel):\n", " def __init__(self, y_t, x_t, w_t):\n", " exog = np.c_[x_t, w_t] # shaped nobs x 2\n", "\n", " super(TVRegressionExtended, self).__init__(\n", " endog=y_t, exog=exog, k_states=2,\n", " initialization='diffuse')\n", "\n", " # Since the design matrix is time-varying, it must be\n", " # shaped k_endog x k_states x nobs\n", " # Notice that exog.T is shaped k_states x nobs, so we\n", " # just need to add a new first axis with shape 1\n", " self.ssm['design'] = exog.T[np.newaxis, :, :] # shaped 1 x 2 x nobs\n", " self.ssm['selection'] = np.eye(self.k_states)\n", " self.ssm['transition'] = np.eye(self.k_states)\n", " \n", " #Which parameters need to be positive?\n", " self.positive_parameters = slice(1, 4)\n", "\n", " @property\n", " def param_names(self):\n", " return ['intercept', 'var.e', 'var.x.coeff', 'var.w.coeff',\n", " 'rho1', 'rho2']\n", "\n", " @property\n", " def start_params(self):\n", " \"\"\"\n", " Defines the starting values for the parameters\n", " The linear regression gives us reasonable starting values for the constant\n", " d and the variance of the epsilon error\n", " \"\"\"\n", "\n", " exog = sm.add_constant(self.exog)\n", " res = sm.OLS(self.endog, exog).fit()\n", " params = np.r_[res.params[0], res.scale, 0.001, 0.001, 0.7, 0.8]\n", " return params\n", "\n", " def transform_params(self, unconstrained):\n", " \"\"\"\n", " We constraint the last three parameters\n", " ('var.e', 'var.x.coeff', 'var.w.coeff') to be positive, \n", " because they 
are variances\n", " \"\"\"\n", " constrained = unconstrained.copy()\n", " constrained[self.positive_parameters] = constrained[self.positive_parameters]**2\n", " return constrained\n", "\n", " def untransform_params(self, constrained):\n", " \"\"\"\n", " Need to unstransform all the parameters you transformed\n", " in the `transform_params` function\n", " \"\"\"\n", " unconstrained = constrained.copy()\n", " unconstrained[self.positive_parameters] = unconstrained[self.positive_parameters]**0.5\n", " return unconstrained\n", "\n", " def update(self, params, **kwargs):\n", " params = super(TVRegressionExtended, self).update(params, **kwargs)\n", "\n", " self['obs_intercept', 0, 0] = params[0]\n", " self['obs_cov', 0, 0] = params[1]\n", " self['state_cov'] = np.diag(params[2:4])\n", " self['transition', 0, 0] = params[4]\n", " self['transition', 1, 1] = params[5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To estimate, we'll use the same data as in model 1 and expect the $\\rho_1, \\rho_2$ to be near 1.\n", "\n", "The results look pretty good!\n", "Note that this estimation can be quite sensitive to the starting value of $\\rho_1, \\rho_2$. If you try lower values, you'll see it fails to converge." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod = TVRegressionExtended(y_t, x_t, w_t)\n", "res = mod.fit(maxiter=2000) #it doesn't converge with 50 iters\n", "print(res.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model 3: multiple observation and state equations\n", "\n", "We'll keep the time-varying parameters, but this time we'll also have two observation equations.\n", "\n", "### Observation equations\n", "\n", "$\\hat{i_t}, \\hat{M_t}, \\hat{s_t}$ are observed each period.\n", "\n", "The model for the observation equation has two equations:\n", "\n", "$$ \\hat{i_t} = \\alpha_1 * \\hat{s_t} + \\varepsilon_1 $$\n", "\n", "$$ \\hat{M_t} = \\alpha_2 + \\varepsilon_2 $$\n", "\n", "Following the [general notation from state space models](https://www.statsmodels.org/stable/statespace.html), the endogenous part of the observation equation is $y_t = (\\hat{i_t}, \\hat{M_t})$ and we only have one exogenous variable $\\hat{s_t}$\n", "\n", "\n", "### State equations\n", "\n", "\n", "$$ \\alpha_{1, t+1} = \\delta_1 \\alpha_{1, t} + \\delta_2 \\alpha_{2, t} + W_1 $$\n", "\n", "$$ \\alpha_{2, t+1} = \\delta_3 \\alpha_{2, t} + W_2 $$\n", "\n", "\n", "### Matrix notation for the state space model\n", "\n", "$$\n", "\\begin{aligned}\n", "\\begin{bmatrix} \\hat{i_t} \\\\ \\hat{M_t} \\end{bmatrix} &= \n", "\\begin{bmatrix} \\hat{s_t} & 0 \\\\ 0 & 1 \\end{bmatrix} \\begin{bmatrix} \\alpha_{1, t} \\\\ \\alpha_{2, t} \\end{bmatrix} + \\begin{bmatrix} \\varepsilon_{1, t} \\\\ \\varepsilon_{1, t} \\end{bmatrix} \\hspace{6.5em} \\varepsilon_t \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{\\varepsilon_1}^2 & 0 \\\\ 0 & \\sigma_{\\varepsilon_2}^2 \\end{bmatrix} \\right )\n", "\\\\\n", "\\begin{bmatrix} \\alpha_{1, t+1} \\\\ \\alpha_{2, t+1} \\end{bmatrix} & = \\begin{bmatrix} \\delta_1 & \\delta_1 \\\\ 0 & \\delta_3 \\end{bmatrix} \\begin{bmatrix} \\alpha_{1, t} \\\\ \\alpha_{2, t} \\end{bmatrix} + \\begin{bmatrix} W_1 \\\\ W_2 \\end{bmatrix} \\hspace{3.em} \\begin{bmatrix} W_1 \\\\ W_2 \\end{bmatrix} \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{W_1}^2 & 0 \\\\ 0 & \\sigma_{W_2}^2 \\end{bmatrix} \\right )\n", "\\end{aligned}\n", "$$\n", "\n", 
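"Before writing the code, it helps to pin down the dimension bookkeeping this matrix form implies (a sketch; the names match the simulation below):\n", "\n", "```python\n", "# k_endog = 2 observation equations, k_states = 2 states\n", "# design:     (k_endog, k_states, nobs)  time-varying because of s_t\n", "# obs_cov:    (k_endog, k_endog)         diag(var_e1, var_e2)\n", "# transition: (k_states, k_states)       [[delta1, delta2], [0, delta3]]\n", "# state_cov:  (k_states, k_states)       diag(var_w1, var_w2)\n", "# selection:  (k_states, k_states)       identity\n", "```\n", "\n",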
"I'll simulate some data, talk about what we need to modify and finally estimate the model to see if we're recovering something reasonable.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "true_values = {'var_e1': 0.01, 'var_e2': 0.01,\n", " 'var_w1': 0.01, 'var_w2': 0.01,\n", " 'delta1': 0.8, 'delta2': 0.5, 'delta3': 0.7}\n", "\n", "def gen_data_for_model3():\n", " #Starting values\n", " alpha1_0 = 2.1\n", " alpha2_0 = 1.1\n", "\n", " t_max = 500\n", "\n", " def gen_i(alpha1, s):\n", " return alpha1*s + np.sqrt(true_values['var_e1'])*np.random.randn()\n", "\n", " def gen_m_hat(alpha2):\n", " return 1*alpha2 + np.sqrt(true_values['var_e2'])*np.random.randn()\n", "\n", " def gen_alpha1(alpha1, alpha2):\n", " w1 = np.sqrt(true_values['var_w1'])*np.random.randn()\n", " return true_values['delta1'] * alpha1 + true_values['delta2'] * alpha2 + w1\n", "\n", " def gen_alpha2(alpha2):\n", " w2 = np.sqrt(true_values['var_w2'])*np.random.randn()\n", " return true_values['delta3'] * alpha2 + w2\n", "\n", " s_t = 0.3 + np.sqrt(1.4)*np.random.randn(t_max)\n", " i_hat = np.empty(t_max)\n", " m_hat = np.empty(t_max)\n", "\n", " current_alpha1 = alpha1_0\n", " current_alpha2 = alpha2_0\n", " for t in range(t_max):\n", " #Obs eqns\n", " i_hat[t] = gen_i(current_alpha1, s_t[t])\n", " m_hat[t] = gen_m_hat(current_alpha2)\n", "\n", " #state eqns\n", " new_alpha1 = gen_alpha1(current_alpha1, current_alpha2)\n", " new_alpha2 = gen_alpha2(current_alpha2)\n", "\n", " #Update states for next period\n", " current_alpha1 = new_alpha1\n", " current_alpha2 = new_alpha2\n", " \n", " return i_hat, m_hat, s_t\n", "\n", "i_hat, m_hat, s_t = gen_data_for_model3()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What do we need to modify?\n", "\n", "Once again, we don't need to change much, but we need to be careful about the dimensions.\n", "\n", "#### 1) The `__init__` function changes from\n", "\n", "\n", "```python\n", "def __init__(self, y_t, x_t, w_t):\n", " exog = np.c_[x_t, w_t]\n", "\n", " super(TVRegressionExtended, self).__init__(\n", " endog=y_t, exog=exog, k_states=2,\n", " initialization='diffuse')\n", "\n", " self.ssm['design'] = exog.T[np.newaxis, :, :] # shaped 1 x 2 x nobs\n", " self.ssm['selection'] = np.eye(self.k_states)\n", " self.ssm['transition'] = np.eye(self.k_states)\n", "```\n", "\n", "to\n", "\n", "\n", "```python\n", "def __init__(self, i_t: np.array, s_t: np.array, m_t: np.array):\n", "\n", " exog = np.c_[s_t, np.repeat(1, len(s_t))] # exog.shape => (nobs, 2)\n", " \n", " super(MultipleYsModel, self).__init__(\n", " endog=np.c_[i_t, m_t], exog=exog, k_states=2,\n", " initialization='diffuse')\n", " \n", " self.ssm['design'] = np.zeros((self.k_endog, self.k_states, self.nobs))\n", " self.ssm['design', 0, 0, :] = s_t\n", " self.ssm['design', 1, 1, :] = 1\n", "```\n", "\n", "Note that we did not have to specify `k_endog` anywhere. 
The initialization does this for us after checking the dimensions of the `endog` matrix.\n", "\n", "\n", "#### 2) The `update()` function \n", "\n", "changes from \n", "\n", "```python\n", "def update(self, params, **kwargs):\n", " params = super(TVRegressionExtended, self).update(params, **kwargs)\n", "\n", " self['obs_intercept', 0, 0] = params[0]\n", " self['obs_cov', 0, 0] = params[1]\n", " \n", " self['state_cov'] = np.diag(params[2:4])\n", " self['transition', 0, 0] = params[4]\n", " self['transition', 1, 1] = params[5]\n", "```\n", "\n", "\n", "to\n", "\n", "\n", "```python\n", "def update(self, params, **kwargs):\n", " params = super(MultipleYsModel, self).update(params, **kwargs)\n", " \n", " \n", " #The following line is not needed (by default, this matrix is initialized by zeroes),\n", " #But I leave it here so the dimensions are clearer\n", " self['obs_intercept'] = np.repeat([np.array([0, 0])], self.nobs, axis=0).T\n", " self['obs_cov', 0, 0] = params[0]\n", " self['obs_cov', 1, 1] = params[1]\n", "\n", " self['state_cov'] = np.diag(params[2:4])\n", " #delta1, delta2, delta3\n", " self['transition', 0, 0] = params[4]\n", " self['transition', 0, 1] = params[5]\n", " self['transition', 1, 1] = params[6]\n", "```\n", "\n", "The rest of the methods change in pretty obvious ways (need to add parameter names, make sure the indexes work, etc). The full code for the function is right below" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "starting_values = {'var_e1': 0.2, 'var_e2': 0.1,\n", " 'var_w1': 0.15, 'var_w2': 0.18,\n", " 'delta1': 0.7, 'delta2': 0.1, 'delta3': 0.85}\n", "\n", "\n", "class MultipleYsModel(sm.tsa.statespace.MLEModel):\n", " def __init__(self, i_t: np.array, s_t: np.array, m_t: np.array):\n", "\n", " exog = np.c_[s_t, np.repeat(1, len(s_t))] # exog.shape => (nobs, 2)\n", " \n", " super(MultipleYsModel, self).__init__(\n", " endog=np.c_[i_t, m_t], exog=exog, k_states=2,\n", " initialization='diffuse')\n", " \n", " self.ssm['design'] = np.zeros((self.k_endog, self.k_states, self.nobs))\n", " self.ssm['design', 0, 0, :] = s_t\n", " self.ssm['design', 1, 1, :] = 1\n", " \n", " #These have ok shape. 
Placeholders since I'm changing them\n", " #in the update() function\n", " self.ssm['selection'] = np.eye(self.k_states)\n", " self.ssm['transition'] = np.eye(self.k_states)\n", " \n", " #Dictionary of positions to names\n", " self.position_dict = OrderedDict(var_e1=1, var_e2=2, \n", " var_w1=3, var_w2=4,\n", " delta1=5, delta2=6, delta3=7)\n", " self.initial_values = starting_values\n", " self.positive_parameters = slice(0, 4)\n", " \n", "\n", " @property\n", " def param_names(self):\n", " return list(self.position_dict.keys())\n", "\n", " @property\n", " def start_params(self):\n", " \"\"\"\n", " Initial values\n", " \"\"\"\n", " #(optional) Use scale for var_e1 and var_e2 starting values\n", " params = np.r_[self.initial_values['var_e1'],\n", " self.initial_values['var_e2'],\n", " self.initial_values['var_w1'],\n", " self.initial_values['var_w2'],\n", " self.initial_values['delta1'],\n", " self.initial_values['delta2'],\n", " self.initial_values['delta3']]\n", " return params\n", "\n", " def transform_params(self, unconstrained):\n", " \"\"\"\n", " If you need to restrict parameters\n", " For example, variances should be > 0\n", " Parameters maybe have to be within -1 and 1\n", " \"\"\"\n", " constrained = unconstrained.copy()\n", " constrained[self.positive_parameters] = constrained[self.positive_parameters]**2\n", " return constrained\n", "\n", " def untransform_params(self, constrained):\n", " \"\"\"\n", " Need to reverse what you did in transform_params()\n", " \"\"\"\n", " unconstrained = constrained.copy()\n", " unconstrained[self.positive_parameters] = unconstrained[self.positive_parameters]**0.5\n", " return unconstrained\n", "\n", " def update(self, params, **kwargs):\n", " params = super(MultipleYsModel, self).update(params, **kwargs)\n", " \n", " #The following line is not needed (by default, this matrix is initialized by zeroes),\n", " #But I leave it here so the dimensions are clearer\n", " self['obs_intercept'] = np.repeat([np.array([0, 0])], self.nobs, axis=0).T\n", " \n", " self['obs_cov', 0, 0] = params[0]\n", " self['obs_cov', 1, 1] = params[1]\n", " \n", " self['state_cov'] = np.diag(params[2:4])\n", " \n", " #delta1, delta2, delta3\n", " self['transition', 0, 0] = params[4]\n", " self['transition', 0, 1] = params[5]\n", " self['transition', 1, 1] = params[6]\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod = MultipleYsModel(i_hat, s_t, m_hat)\n", "res = mod.fit()\n", "\n", "print(res.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bonus: pymc3 for fast Bayesian estimation\n", "\n", "In this section I'll show how you can take your custom state space model and easily plug it to `pymc3` and estimate it with Bayesian methods. In particular, this example will show you an estimation with a version of Hamiltonian Monte Carlo called the No-U-Turn Sampler (NUTS).\n", "\n", "I'm basically copying the ideas contained [in this notebook](https://www.statsmodels.org/dev/examples/notebooks/generated/statespace_sarimax_pymc3.html), so make sure to check that for more details." 
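] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The glue between `pymc3` and our model is the pair of methods that every `MLEModel` exposes: `loglike(params)`, which returns the log-likelihood at a parameter vector, and `score(params)`, which returns its gradient. The helper classes below simply wrap these two calls. As a quick sketch of what they return (using the `MultipleYsModel` instance from the previous cell):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch: the two model methods that the Theano Ops below wrap\n", "params = mod.start_params\n", "print(mod.loglike(params))  # scalar log-likelihood\n", "print(mod.score(params))    # gradient vector, one entry per parameter"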
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Extra requirements\n", "import theano\n", "import theano.tensor as tt\n", "import pymc3 as pm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need to define some helper functions to connect theano to the likelihood function that is implied in our model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Loglike(tt.Op):\n", "\n", " itypes = [tt.dvector] # expects a vector of parameter values when called\n", " otypes = [tt.dscalar] # outputs a single scalar value (the log likelihood)\n", "\n", " def __init__(self, model):\n", " self.model = model\n", " self.score = Score(self.model)\n", "\n", " def perform(self, node, inputs, outputs):\n", " theta, = inputs # contains the vector of parameters\n", " llf = self.model.loglike(theta)\n", " outputs[0][0] = np.array(llf) # output the log-likelihood\n", "\n", " def grad(self, inputs, g):\n", " # the method that calculates the gradients - it actually returns the\n", " # vector-Jacobian product - g[0] is a vector of parameter values\n", " theta, = inputs # our parameters\n", " out = [g[0] * self.score(theta)]\n", " return out\n", "\n", "\n", "class Score(tt.Op):\n", " itypes = [tt.dvector]\n", " otypes = [tt.dvector]\n", "\n", " def __init__(self, model):\n", " self.model = model\n", "\n", " def perform(self, node, inputs, outputs):\n", " theta, = inputs\n", " outputs[0][0] = self.model.score(theta)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll simulate again the data we used for model 1.\n", "We'll also `fit` it again and save the results to compare them to the Bayesian posterior we get." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y_t, x_t, w_t, beta_x, beta_w = gen_data_for_model1()\n", "plt.plot(y_t)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod = TVRegression(y_t, x_t, w_t)\n", "res_mle = mod.fit(disp=False)\n", "print(res_mle.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bayesian estimation\n", "\n", "We need to define a prior for each parameter and the number of draws and burn-in points" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set sampling params\n", "ndraws = 3000 # 3000 number of draws from the distribution\n", "nburn = 600 # 600 number of \"burn-in points\" (which will be discarded)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Construct an instance of the Theano wrapper defined above, which\n", "# will allow PyMC3 to compute the likelihood and Jacobian in a way\n", "# that it can make use of. 
Here we are using the same model instance\n", "# created earlier for MLE analysis (we could also create a new model\n", "# instance if we preferred)\n", "loglike = Loglike(mod)\n", "\n", "with pm.Model():\n", " # Priors\n", " intercept = pm.Uniform('intercept', 1, 10)\n", " var_e = pm.InverseGamma('var.e', 2.3, 0.5)\n", " var_x_coeff = pm.InverseGamma('var.x.coeff', 2.3, 0.1)\n", " var_w_coeff = pm.InverseGamma('var.w.coeff', 2.3, 0.1)\n", "\n", " # convert variables to tensor vectors\n", " theta = tt.as_tensor_variable([intercept, var_e, var_x_coeff, var_w_coeff])\n", "\n", " # use a DensityDist (use a lamdba function to \"call\" the Op)\n", " pm.DensityDist('likelihood', lambda v: loglike(v), observed={'v': theta})\n", "\n", " # Draw samples\n", " trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True, cores=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How does the posterior distribution compare with the MLE estimation?\n", "\n", "The clearly peak around the MLE estimate." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "results_dict = {'intercept': res_mle.params[0], 'var.e': res_mle.params[1],\n", " 'var.x.coeff': res_mle.params[2], 'var.w.coeff': res_mle.params[3]}\n", "plt.tight_layout()\n", "_ = pm.traceplot(trace,\n", " lines=[(k, {}, [v]) for k, v in dict(results_dict).items()],\n", " combined=True,\n", " figsize=(12, 12))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 4 }