{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# VARMAX models\n", "\n", "This notebook is a brief introduction to VARMAX models in statsmodels. The VARMAX model is generically specified as:\n", "$$\n", "y_t = \\nu + A_1 y_{t-1} + \\dots + A_p y_{t-p} + B x_t + \\epsilon_t +\n", "M_1 \\epsilon_{t-1} + \\dots + M_q \\epsilon_{t-q}\n", "$$\n", "\n", "where $y_t$ is a $\\text{k_endog} \\times 1$ vector." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2022-11-02T17:07:46.057234Z", "iopub.status.busy": "2022-11-02T17:07:46.056729Z", "iopub.status.idle": "2022-11-02T17:07:46.540664Z", "shell.execute_reply": "2022-11-02T17:07:46.539979Z" } }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2022-11-02T17:07:46.546034Z", "iopub.status.busy": "2022-11-02T17:07:46.544747Z", "iopub.status.idle": "2022-11-02T17:07:47.293268Z", "shell.execute_reply": "2022-11-02T17:07:47.292568Z" }, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import statsmodels.api as sm\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2022-11-02T17:07:47.298986Z", "iopub.status.busy": "2022-11-02T17:07:47.297595Z", "iopub.status.idle": "2022-11-02T17:07:47.576227Z", "shell.execute_reply": "2022-11-02T17:07:47.575561Z" }, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "dta = sm.datasets.webuse('lutkepohl2', 'https://www.stata-press.com/data/r12/')\n", "dta.index = dta.qtr\n", "dta.index.freq = dta.index.inferred_freq\n", "endog = dta.loc['1960-04-01':'1978-10-01', ['dln_inv', 'dln_inc', 'dln_consump']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model specification\n", "\n", "The VARMAX 
class in statsmodels allows estimation of VAR, VMA, and VARMA models (through the order argument), optionally with a constant term (via the trend argument). Exogenous regressors may also be included (as usual in statsmodels, via the exog argument), and in this way a time trend may be added. Finally, the class allows measurement error (via the measurement_error argument) and allows specifying either a diagonal or unstructured innovation covariance matrix (via the error_cov_type argument)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 1: VAR\n", "\n", "Below is a simple VARX(2) model with two endogenous variables and one exogenous series, but no constant term. Notice that we needed to allow more iterations than the default (maxiter=50) for the likelihood estimation to converge. This is not unusual in VAR models, which have to estimate a large number of parameters, often from a relatively small number of observations: this model, for example, estimates 13 parameters (8 autoregressive coefficients, 2 exogenous coefficients, and 3 error covariance terms) from 75 observations of 2 endogenous variables." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2022-11-02T17:07:47.581814Z", "iopub.status.busy": "2022-11-02T17:07:47.580638Z", "iopub.status.idle": "2022-11-02T17:07:51.973937Z", "shell.execute_reply": "2022-11-02T17:07:51.973242Z" }, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Statespace Model Results \n", "==================================================================================\n", "Dep. Variable: ['dln_inv', 'dln_inc'] No. 
Observations: 75\n", "Model: VARX(2) Log Likelihood 361.037\n", "Date: Wed, 02 Nov 2022 AIC -696.075\n", "Time: 17:07:51 BIC -665.947\n", "Sample: 04-01-1960 HQIC -684.045\n", " - 10-01-1978 \n", "Covariance Type: opg \n", "===================================================================================\n", "Ljung-Box (L1) (Q): 0.05, 10.07 Jarque-Bera (JB): 11.05, 2.46\n", "Prob(Q): 0.82, 0.00 Prob(JB): 0.00, 0.29\n", "Heteroskedasticity (H): 0.45, 0.40 Skew: 0.16, -0.38\n", "Prob(H) (two-sided): 0.05, 0.03 Kurtosis: 4.85, 3.44\n", " Results for equation dln_inv \n", "====================================================================================\n", " coef std err z P>|z| [0.025 0.975]\n", "------------------------------------------------------------------------------------\n", "L1.dln_inv -0.2399 0.093 -2.578 0.010 -0.422 -0.058\n", "L1.dln_inc 0.2776 0.449 0.618 0.536 -0.602 1.157\n", "L2.dln_inv -0.1654 0.155 -1.066 0.286 -0.470 0.139\n", "L2.dln_inc 0.0643 0.421 0.153 0.879 -0.761 0.889\n", "beta.dln_consump 0.9840 0.637 1.545 0.122 -0.264 2.232\n", " Results for equation dln_inc \n", "====================================================================================\n", " coef std err z P>|z| [0.025 0.975]\n", "------------------------------------------------------------------------------------\n", "L1.dln_inv 0.0633 0.036 1.770 0.077 -0.007 0.133\n", "L1.dln_inc 0.0803 0.107 0.750 0.453 -0.129 0.290\n", "L2.dln_inv 0.0111 0.033 0.337 0.736 -0.054 0.076\n", "L2.dln_inc 0.0335 0.134 0.250 0.803 -0.229 0.296\n", "beta.dln_consump 0.7756 0.113 6.893 0.000 0.555 0.996\n", " Error covariance matrix \n", "============================================================================================\n", " coef std err z P>|z| [0.025 0.975]\n", "--------------------------------------------------------------------------------------------\n", "sqrt.var.dln_inv 0.0434 0.004 12.295 0.000 0.036 0.050\n", "sqrt.cov.dln_inv.dln_inc 6.006e-05 0.002 0.030 0.976 -0.004 
0.004\n", "sqrt.var.dln_inc 0.0109 0.001 11.212 0.000 0.009 0.013\n", "============================================================================================\n", "\n", "Warnings:\n", "[1] Covariance matrix calculated using the outer product of gradients (complex-step).\n" ] } ], "source": [ "exog = endog['dln_consump']\n", "mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(2,0), trend='n', exog=exog)\n", "res = mod.fit(maxiter=1000, disp=False)\n", "print(res.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From the estimated VAR model, we can plot the impulse response functions of the endogenous variables." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2022-11-02T17:07:51.978536Z", "iopub.status.busy": "2022-11-02T17:07:51.978278Z", "iopub.status.idle": "2022-11-02T17:07:52.191422Z", "shell.execute_reply": "2022-11-02T17:07:52.190813Z" }, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "