{ "cells": [ { "cell_type": "markdown", "id": "8732de12-d3f2-4a09-8c39-e5c52a5ac94a", "metadata": {}, "source": [ "# Autoregressive Distributed Lag (ARDL) models\n", "\n", "\n", "## ARDL Models\n", "\n", "Autoregressive Distributed Lag (ARDL) models extend Autoregressive models with lags of explanatory variables. While ARDL models are technically AR-X models, the key difference is that ARDL models focus on the exogenous variables and on selecting the correct lag structure for both the endogenous variable and the exogenous variables. ARDL models are also closely related to Vector Autoregressions, and a single ARDL is effectively one row of a VAR. The key distinction is that an ARDL assumes that the explanatory variables are exogenous, in the sense that it is not necessary to include the endogenous variable as a predictor of the exogenous variables.\n", "\n", "The full specification of ARDL models is\n", "\n", "$$\n", "Y_t = \\underset{\\text{Constant and Trend}}{\\underbrace{\\delta_0 + \\delta_1 t + \\ldots + \\delta_k t^k}}\n", " + \\underset{\\text{Seasonal}}{\\underbrace{\\sum_{i=0}^{s-1} \\gamma_i S_i}}\n", " + \\underset{\\text{Autoregressive}}{\\underbrace{\\sum_{p=1}^P \\phi_p Y_{t-p}}}\n", " + \\underset{\\text{Distributed Lag}}{\\underbrace{\\sum_{k=1}^M \\sum_{j=0}^{Q_k} \\beta_{k,j} X_{k, t-j}}}\n", " + \\underset{\\text{Fixed}}{\\underbrace{Z_t \\Gamma}} + \\epsilon_t\n", "$$\n", "\n", "The terms in the model are:\n", "\n", "* $\\delta_i$ are the coefficients on the constant and the deterministic time trend regressors. Set using `trend`.\n", "* $S_i$ are seasonal dummies, which are included if `seasonal=True`.\n", "* $X_{k,t-j}$ are the exogenous regressors. There are a number of formats that can be used to specify which lags are included. Note that the included lag lengths do not need to be the same. If `causal=True`, then the lags start with lag 1. Otherwise lags begin with 0, so that the model includes the contemporaneous relationship between $Y_t$ and $X_t$.\n", "* $Z_t$ are any other fixed regressors that are not part of the distributed lag specification. In practice these regressors may be included when they do not contribute to the long-run relationship between $Y_t$ and the vector of exogenous variables $X_t$.\n", "* $\\{\\epsilon_t\\}$ is assumed to be a White Noise process." ] }, { "cell_type": "code", "execution_count": 1, "id": "f7bb53a8-63a9-4f11-a9db-eb09280d457d", "metadata": { "execution": { "iopub.execute_input": "2021-11-12T23:38:47.822704Z", "iopub.status.busy": "2021-11-12T23:38:47.822195Z", "iopub.status.idle": "2021-11-12T23:38:49.246392Z", "shell.execute_reply": "2021-11-12T23:38:49.246807Z" } }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "\n", "sns.set_style(\"darkgrid\")\n", "sns.mpl.rc(\"figure\", figsize=(16, 6))\n", "sns.mpl.rc(\"font\", size=14)" ] }, { "cell_type": "markdown", "id": "f2df3123-3ef9-4bc5-bc3f-c2d3f2fe945d", "metadata": {}, "source": [ "### Data\n", "\n", "This notebook uses money demand data from Denmark, first used by Johansen and Juselius (1990). The key variables are:\n", "\n", "* lrm: Log of real money measured using M2\n", "* lry: Log of real income\n", "* ibo: Interest rate on bonds\n", "* ide: Interest rate on bank deposits\n", "\n", "The standard model uses lrm as the dependent variable and the other three as exogenous drivers.\n", "\n", "Johansen, S. and Juselius, K. (1990), Maximum Likelihood Estimation and Inference on Cointegration – with Applications to the Demand for Money, Oxford Bulletin of Economics and Statistics, 52, 2, 169–210.\n", "\n", "We start by loading the data and examining it."
] }, { "cell_type": "code", "execution_count": 2, "id": "e835b518-53f6-45de-9c93-d397eaa08831", "metadata": { "execution": { "iopub.execute_input": "2021-11-12T23:38:49.252464Z", "iopub.status.busy": "2021-11-12T23:38:49.251974Z", "iopub.status.idle": "2021-11-12T23:38:49.531444Z", "shell.execute_reply": "2021-11-12T23:38:49.531843Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>lrm</th><th>lry</th><th>ibo</th><th>ide</th></tr>\n",
"    <tr><th>period</th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n", "  <tbody>\n",
"    <tr><th>1986-07-01</th><td>12.056189</td><td>6.098992</td><td>0.111500</td><td>0.067941</td></tr>\n",
"    <tr><th>1986-10-01</th><td>12.071628</td><td>6.080706</td><td>0.114267</td><td>0.075396</td></tr>\n",
"    <tr><th>1987-01-01</th><td>12.027952</td><td>6.061175</td><td>0.119333</td><td>0.076653</td></tr>\n",
"    <tr><th>1987-04-01</th><td>12.039788</td><td>6.063730</td><td>0.117333</td><td>0.076259</td></tr>\n",
"    <tr><th>1987-07-01</th><td>12.015294</td><td>6.050830</td><td>0.118967</td><td>0.075163</td></tr>\n",
"  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " lrm lry ibo ide\n", "period \n", "1986-07-01 12.056189 6.098992 0.111500 0.067941\n", "1986-10-01 12.071628 6.080706 0.114267 0.075396\n", "1987-01-01 12.027952 6.061175 0.119333 0.076653\n", "1987-04-01 12.039788 6.063730 0.117333 0.076259\n", "1987-07-01 12.015294 6.050830 0.118967 0.075163" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from statsmodels.datasets.danish_data import load\n", "from statsmodels.tsa.api import ARDL\n", "from statsmodels.tsa.ardl import ardl_select_order\n", "\n", "data = load().data\n", "data = data[[\"lrm\", \"lry\", \"ibo\", \"ide\"]]\n", "data.tail()" ] }, { "cell_type": "markdown", "id": "32139ec2-4e69-4e53-a5b9-a6aef64edefe", "metadata": {}, "source": [ "We plot the demeaned data so that all series appear on the same scale. The lrm series appears to be non-stationary, as does lry. The stationarity of the other two is less obvious." ] }, { "cell_type": "code", "execution_count": 3, "id": "6ca52a18-3752-4c65-9043-6c91ba543d44", "metadata": { "execution": { "iopub.execute_input": "2021-11-12T23:38:49.538989Z", "iopub.status.busy": "2021-11-12T23:38:49.538484Z", "iopub.status.idle": "2021-11-12T23:38:49.913265Z", "shell.execute_reply": "2021-11-12T23:38:49.913690Z" } }, "outputs": [ { "data": { "image/png": "text/plain": [ "