Dates in timeseries models
==========================

.. _tsa_dates_notebook:

In [ ]:
from __future__ import print_function
import statsmodels.api as sm
import numpy as np
import pandas as pd
   

Getting started

In [ ]:
data = sm.datasets.sunspots.load()
   

Currently, an annual date series must consist of datetimes that fall at the end of each year (for example, 1700-12-31).

In [ ]:
from datetime import datetime
dates = sm.tsa.datetools.dates_from_range('1700', length=len(data.endog))
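
As a rough illustration of the year-end convention, the sketch below builds the same kind of list by hand using only the standard library. The helper name `year_end_dates` is hypothetical and is not part of statsmodels; it only mirrors what `dates_from_range('1700', length=n)` is expected to return for annual data.

```python
from datetime import datetime


def year_end_dates(start_year, length):
    """Return `length` consecutive datetimes, each on December 31."""
    # Hypothetical helper for illustration; not a statsmodels function.
    return [datetime(start_year + i, 12, 31) for i in range(length)]


example_dates = year_end_dates(1700, 3)
print(example_dates[0])   # first year-end date
print(example_dates[-1])  # last year-end date
```

Any list of datetimes built this way should be usable as the `dates` argument or as a pandas index, provided its length matches the data.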
   

Using Pandas

Make a pandas Series or DataFrame indexed by the dates

In [ ]:
endog = pd.Series(data.endog, index=dates)
   

Instantiate the model

In [ ]:
ar_model = sm.tsa.AR(endog, freq='A')
pandas_ar_res = ar_model.fit(maxlag=9, method='mle', disp=-1)
   

Out-of-sample prediction

In [ ]:
pred = pandas_ar_res.predict(start='2005', end='2015')
print(pred)

2005-12-31    20.003285
2006-12-31    24.703979
2007-12-31    20.026123
2008-12-31    23.473638
2009-12-31    30.858572
2010-12-31    61.335449
2011-12-31    87.024691
2012-12-31    91.321256
2013-12-31    79.921629
2014-12-31    60.799526
2015-12-31    40.374879
Freq: A-DEC, dtype: float64

Using explicit dates

In [ ]:
ar_model = sm.tsa.AR(data.endog, dates=dates, freq='A')
ar_res = ar_model.fit(maxlag=9, method='mle', disp=-1)
pred = ar_res.predict(start='2005', end='2015')
print(pred)

[ 20.00328112  24.70398853  20.02613031  23.47364995  30.85857026
  61.33544403  87.02467843  91.32123263  79.92159878  60.79948588
  40.37483539]

This returns a plain array rather than a date-indexed Series, but because the model has date information attached, you can still get the prediction dates in a roundabout way.

In [ ]:
print(ar_res.data.predict_dates)
   

Note: This attribute only exists if predict has been called. It holds the dates associated with the last call to predict.
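
One way to make the plain-array result easier to read is to pair each forecast with its date. The sketch below does this with the standard library only, using the array values shown earlier and rebuilding the year-end dates by hand. The names `values`, `pred_dates`, and `dated_pred` are illustrative; in a live session the dates would come from `ar_res.data.predict_dates` instead.

```python
from datetime import datetime

# Forecast values copied from the array output shown earlier.
values = [20.00328112, 24.70398853, 20.02613031, 23.47364995, 30.85857026,
          61.33544403, 87.02467843, 91.32123263, 79.92159878, 60.79948588,
          40.37483539]

# Year-end dates for 2005 through 2015, rebuilt by hand for this sketch.
pred_dates = [datetime(year, 12, 31) for year in range(2005, 2016)]

# Pair each date with its forecast.
dated_pred = dict(zip(pred_dates, values))

for date, value in dated_pred.items():
    print(date.strftime('%Y-%m-%d'), round(value, 6))
```

With pandas available, something like `pd.Series(pred, index=ar_res.data.predict_dates)` should give the same pairing as a date-indexed Series.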