# Time Series Filters

In [ ]:
from __future__ import print_function
import pandas as pd
import matplotlib.pyplot as plt

import statsmodels.api as sm
In [ ]:
dta = sm.datasets.macrodata.load_pandas().data
In [ ]:
index = pd.Index(sm.tsa.datetools.dates_from_range('1959Q1', '2009Q3'))
print(index)
DatetimeIndex(['1959-03-31', '1959-06-30', '1959-09-30', '1959-12-31',
               '1960-03-31', '1960-06-30', '1960-09-30', '1960-12-31',
               '1961-03-31', '1961-06-30',
               ...
               '2007-06-30', '2007-09-30', '2007-12-31', '2008-03-31',
               '2008-06-30', '2008-09-30', '2008-12-31', '2009-03-31',
               '2009-06-30', '2009-09-30'],
              dtype='datetime64[ns]', length=203, freq=None, tz=None)
In [ ]:
dta.index = index
del dta['year']
del dta['quarter']
In [ ]:
print(sm.datasets.macrodata.NOTE)
::
Number of Observations - 203

Number of Variables - 14

Variable name definitions::

year      - 1959q1 - 2009q3
quarter   - 1-4
realgdp   - Real gross domestic product (Bil. of chained 2005 US$, seasonally adjusted annual rate)
realcons  - Real personal consumption expenditures (Bil. of chained 2005 US$, seasonally adjusted annual rate)
realinv   - Real gross private domestic investment (Bil. of chained 2005 US$, seasonally adjusted annual rate)
realgovt  - Real federal consumption expenditures & gross investment (Bil. of chained 2005 US$, seasonally adjusted annual rate)
realdpi   - Real private disposable income (Bil. of chained 2005 US$, seasonally adjusted annual rate)
cpi       - End of the quarter consumer price index for all urban consumers: all items (1982-84 = 100, seasonally adjusted).
m1        - End of the quarter M1 nominal money stock (Seasonally adjusted)
tbilrate  - Quarterly monthly average of the monthly 3-month treasury bill: secondary market rate
unemp     - Seasonally adjusted unemployment rate (%)
pop       - End of the quarter total population: all ages incl. armed forces over seas
infl      - Inflation rate (ln(cpi_{t}/cpi_{t-1}) * 400)
realint   - Real interest rate (tbilrate - infl)
In [ ]:
print(dta.head(10))
             realgdp  realcons  realinv  realgovt  realdpi    cpi     m1  \
1959-03-31  2710.349    1707.4  286.898   470.045   1886.9  28.98  139.7
1959-06-30  2778.801    1733.7  310.859   481.301   1919.7  29.15  141.7
1959-09-30  2775.488    1751.8  289.226   491.260   1916.4  29.35  140.5
1959-12-31  2785.204    1753.7  299.356   484.052   1931.3  29.37  140.0
1960-03-31  2847.699    1770.5  331.722   462.199   1955.5  29.54  139.6
1960-06-30  2834.390    1792.9  298.152   460.400   1966.1  29.55  140.2
1960-09-30  2839.022    1785.8  296.375   474.676   1967.8  29.75  140.9
1960-12-31  2802.616    1788.2  259.764   476.434   1966.6  29.84  141.1
1961-03-31  2819.264    1787.7  266.405   475.854   1984.5  29.81  142.1
1961-06-30  2872.005    1814.3  286.246   480.328   2014.4  29.92  142.9

            tbilrate  unemp      pop  infl  realint
1959-03-31      2.82    5.8  177.146  0.00     0.00
1959-06-30      3.08    5.1  177.830  2.34     0.74
1959-09-30      3.82    5.3  178.657  2.74     1.09
1959-12-31      4.33    5.6  179.386  0.27     4.06
1960-03-31      3.50    5.2  180.007  2.31     1.19
1960-06-30      2.68    5.2  180.671  0.14     2.55
1960-09-30      2.36    5.6  181.528  2.70    -0.34
1960-12-31      2.29    6.3  182.287  1.21     1.08
1961-03-31      2.37    6.8  182.992 -0.40     2.77
1961-06-30      2.29    7.0  183.691  1.47     0.81
In [ ]:
fig = plt.figure(figsize=(12,8))
ax = fig.add_subplot(111)
dta.realgdp.plot(ax=ax);
legend = ax.legend(loc = 'upper left');
legend.prop.set_size(20);

### Hodrick-Prescott Filter

The Hodrick-Prescott filter separates a time series $y_t$ into a trend $\tau_t$ and a cyclical component $\zeta_t$

$$y_t = \tau_t + \zeta_t$$

The components are determined by
minimizing the following quadratic loss function

$$\min_{\{\tau_{t}\}}\sum_{t=1}^{T}\zeta_{t}^{2}+\lambda\sum_{t=1}^{T}\left[\left(\tau_{t}-\tau_{t-1}\right)-\left(\tau_{t-1}-\tau_{t-2}\right)\right]^{2}$$

In [ ]:
# hpfilter defaults to lamb=1600, the usual choice for quarterly data
gdp_cycle, gdp_trend = sm.tsa.filters.hpfilter(dta.realgdp)
In [ ]:
# .copy() avoids pandas' SettingWithCopyWarning when adding columns
gdp_decomp = dta[['realgdp']].copy()
gdp_decomp["cycle"] = gdp_cycle
gdp_decomp["trend"] = gdp_trend
In [ ]:
fig = plt.figure(figsize=(12,8))
ax = fig.add_subplot(111)
gdp_decomp[["realgdp", "trend"]]["2000-03-31":].plot(ax=ax, fontsize=16);
legend = ax.get_legend()
legend.prop.set_size(20);

### Baxter-King approximate band-pass filter: Inflation and Unemployment

#### Explore the hypothesis that inflation and unemployment are counter-cyclical.

The Baxter-King filter is intended to deal explicitly with the periodicity of the business cycle. By applying their band-pass filter to a series, they produce a new series that does not contain fluctuations at frequencies higher or lower than those of the business cycle.
Specifically, the BK filter takes the form of a symmetric moving average

$$y_{t}^{*}=\sum_{k=-K}^{K}a_ky_{t-k}$$

where $a_{-k}=a_k$ and $\sum_{k=-K}^{K}a_k=0$ to eliminate any trend in the series and render it stationary if the series is I(1) or I(2).

For completeness, the filter weights are determined as follows

$$a_{j} = B_{j}+\theta\text{ for }j=0,\pm1,\pm2,\dots,\pm K$$

$$B_{0} = \frac{\left(\omega_{2}-\omega_{1}\right)}{\pi}$$

$$B_{j} = \frac{1}{\pi j}\left(\sin\left(\omega_{2}j\right)-\sin\left(\omega_{1}j\right)\right)\text{ for }j=\pm1,\pm2,\dots,\pm K$$

where $\theta$ is a normalizing constant such that the weights sum to zero.

$$\theta=\frac{-\sum_{j=-K}^{K}B_{j}}{2K+1}$$

$$\omega_{1}=\frac{2\pi}{P_{H}}$$

$$\omega_{2}=\frac{2\pi}{P_{L}}$$

$P_L$ and $P_H$ are the periodicities of the low and high cut-off frequencies. Following Burns and Mitchell's work on US business cycles, which suggests cycles last from 1.5 to 8 years, we use $P_L=6$ and $P_H=32$ quarters by default.

In [ ]:
bk_cycles = sm.tsa.filters.bkfilter(dta[["infl","unemp"]])

• We lose K observations on both ends. It is suggested to use K=12 for quarterly data.

In [ ]:
fig = plt.figure(figsize=(12,10))
ax = fig.add_subplot(111)
bk_cycles.plot(ax=ax, style=['r--', 'b-']);

### Christiano-Fitzgerald approximate band-pass filter: Inflation and Unemployment

The Christiano-Fitzgerald filter is a generalization of BK and can thus also be seen as a weighted moving average. However, the CF filter is asymmetric about $t$ and uses the entire series.
The implementation of their filter involves the calculation of the weights in

$$y_{t}^{*}=B_{0}y_{t}+B_{1}y_{t+1}+\dots+B_{T-1-t}y_{T-1}+\tilde B_{T-t}y_{T}+B_{1}y_{t-1}+\dots+B_{t-2}y_{2}+\tilde B_{t-1}y_{1}$$

for $t=3,4,...,T-2$, where

$$B_{j} = \frac{\sin(jb)-\sin(ja)}{\pi j},j\geq1$$

$$B_{0} = \frac{b-a}{\pi},a=\frac{2\pi}{P_{U}},b=\frac{2\pi}{P_{L}}$$

$\tilde B_{T-t}$ and $\tilde B_{t-1}$ are linear functions of the $B_{j}$'s, and the values for $t=1,2,T-1,$ and $T$ are also calculated in much the same way. $P_{U}$ and $P_{L}$ are the upper and lower cut-off periods, with the same interpretation as $P_{H}$ and $P_{L}$ above.

The CF filter is appropriate for series that may follow a random walk.
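
Both filters above are built from the same ideal band-pass weights $B_j$. As a rough illustration only, the following pure-Python sketch evaluates those weights and then applies the Baxter-King normalization $a_j = B_j + \theta$ from the formulas given earlier. The function names `ideal_bandpass_weights` and `bk_weights` are hypothetical helpers written for this note, not part of statsmodels; the defaults ($P_L=6$, $P_H=32$, $K=12$) are the values quoted in the text.

```python
import math

def ideal_bandpass_weights(low_period, high_period, K):
    """Return [B_0, B_1, ..., B_K] for cycles with periods in
    [low_period, high_period] (hypothetical helper, not a statsmodels API)."""
    omega_1 = 2 * math.pi / high_period  # lower cut-off frequency
    omega_2 = 2 * math.pi / low_period   # upper cut-off frequency
    B = [(omega_2 - omega_1) / math.pi]
    for j in range(1, K + 1):
        B.append((math.sin(omega_2 * j) - math.sin(omega_1 * j)) / (math.pi * j))
    return B

def bk_weights(low_period=6, high_period=32, K=12):
    """Return the 2K+1 Baxter-King weights a_{-K}, ..., a_0, ..., a_K."""
    B = ideal_bandpass_weights(low_period, high_period, K)
    full = B[:0:-1] + B                # B_K, ..., B_1, B_0, B_1, ..., B_K
    theta = -sum(full) / (2 * K + 1)   # shift so the weights sum to zero
    return [b + theta for b in full]

weights = bk_weights()
print(len(weights))              # number of weights, 2K + 1
print(abs(sum(weights)) < 1e-12) # weights sum to zero up to rounding
```

The zero-sum property is what removes a trend from the series, and the symmetry $a_{-k}=a_k$ is why $K$ observations are lost at each end when the weights are applied as a moving average.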

In [ ]:
In [ ]:
(-2.5364584673346386, 0.10685366457233414, 9)
In [ ]:
cf_cycles, cf_trend = sm.tsa.filters.cffilter(dta[["infl","unemp"]])
(-3.0545144962572355, 0.030107620863485937, 2)
In [ ]:
print(cf_cycles.head(10))
                infl     unemp
1959-03-31  0.237927 -0.216867
1959-06-30  0.770007 -0.343779
1959-09-30  1.177736 -0.511024
1959-12-31  1.256754 -0.686967
1960-03-31  0.972128 -0.770793
1960-06-30  0.491889 -0.640601
1960-09-30  0.070189 -0.249741
1960-12-31 -0.130432  0.301545
1961-03-31 -0.134155  0.788992
1961-06-30 -0.092073  0.985356
In [ ]:
fig = plt.figure(figsize=(14,10))
ax = fig.add_subplot(111)
cf_cycles.plot(ax=ax, style=['r--','b-']);
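
One informal way to look at the counter-cyclicality hypothesis is the correlation between the two filtered cycles. The sketch below is illustrative only: it uses just the ten CF-filtered rows printed above, copied by hand, and a hypothetical `pearson` helper written in plain Python. Ten quarters are far too few to support any conclusion; the point is only to show the calculation.

```python
import math

# First ten CF-filtered cycle values, copied from the printed output above.
infl = [0.237927, 0.770007, 1.177736, 1.256754, 0.972128,
        0.491889, 0.070189, -0.130432, -0.134155, -0.092073]
unemp = [-0.216867, -0.343779, -0.511024, -0.686967, -0.770793,
         -0.640601, -0.249741, 0.301545, 0.788992, 0.985356]

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences
    (hypothetical helper for illustration)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

r = pearson(infl, unemp)
print(r < 0)  # a negative value means the two cycles move in opposite directions here
```

With the full filtered DataFrame available, `cf_cycles["infl"].corr(cf_cycles["unemp"])` performs the same calculation over the whole sample.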

Filtering assumes a priori that business cycles exist. Because of this assumption, many macroeconomic modelers try to match the shape of impulse response functions rather than replicate the properties of filtered series. See the VAR notebook.