Welcome. The topic of this lecture is time series, and in particular, specification and estimation. In this lecture, you will learn which steps to take to specify time series models and to estimate the parameters in such models. Stationarity is crucial here, and therefore you should take care of any non-stationarity right at the start. Once a stationary series is obtained after proper transformation, you can use the autocorrelation and partial autocorrelation functions to specify a first-guess model. We start with the major motivation for using time series, that is, forecasting. Forecasts are based on a model that properly summarizes past information. This past information can concern the past of the dependent variable y itself, or possibly also the past of an explanatory factor x. For notational convenience, we use PY and PX to denote this past information. In a univariate model, the forecast is based only on the past of the dependent variable y itself. And if we also use an explanatory factor x, then the forecast is a function of the past of both x and y. Of course, we want to use the past information in an optimal way, such that there is no predictive value anymore in the errors that we make. Indeed, we wish to arrive at a forecast error that is uncorrelated with the information in PY and PX. Note that if the forecast error were predictable, then some relevant information would still be missing that could be used to improve the forecast. Let us start with a univariate time series model. Here the forecast for y is a function of past observations of y only. Now we first have to decide on the function F. And although many functions F can be chosen, a popular choice is the linear function. When the lagged information in PY is limited at lag p, the well-known autoregressive model of order p emerges. The true value of y is the forecast plus an error term that is equal to the forecast error. And together, this gives rise to the AR(p) model.
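As a concrete illustration of the AR(p) model, here is a minimal numpy sketch. The function name `fit_ar` and the simulated AR(1) series are illustrative choices of my own, not from the lecture; the estimation method shown (ordinary least squares on the lagged regressors) is the one discussed next in the lecture.

```python
import numpy as np

def fit_ar(y, p):
    """OLS estimates for the AR(p) model
    y_t = alpha + phi_1*y_{t-1} + ... + phi_p*y_{t-p} + eps_t."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    # Regressor matrix: an intercept column plus p lagged columns of y.
    X = np.column_stack([np.ones(T - p)] + [y[p - k:T - k] for k in range(1, p + 1)])
    return np.linalg.lstsq(X, y[p:], rcond=None)[0]  # [alpha, phi_1, ..., phi_p]

# Simulate an AR(1) with alpha = 0 and phi = 0.6, then recover the parameters.
rng = np.random.default_rng(0)
y = np.zeros(2000)
for t in range(1, 2000):
    y[t] = 0.6 * y[t - 1] + rng.standard_normal()
alpha, phi = fit_ar(y, 1)
```

With 2000 simulated observations, the OLS estimates land close to the true values alpha = 0 and phi = 0.6.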
Now, I invite you to consider the following test question. Consider forecasts from an autoregression of order p, which says that there is useful information in the past until, and including, p observations ago. And consider the forecast error epsilon, which should be uncorrelated with the past of y. Can you show that, in this situation, the epsilon process is white noise, that is, that future values of the epsilons cannot be predicted in a linear way? The answer is shown on the slide. The crucial step here is that epsilon is a linear function of current and past values of the dependent variable. To estimate the parameters in an AR model of order p, we use the same ideas as in linear regression, where an optimal strategy is to minimize the sum of squared errors, which, of course, now are the forecast errors. So, you can use ordinary least squares. As a moving average model also includes lagged forecast errors, which are still unknown before estimation, we have to resort to the method of maximum likelihood in the case of ARMA models. The usefulness of least squares extends to the case where the time series model also includes the past of an explanatory factor x. And again, the popular choice here is a linear forecast function. The model that includes p lags of y and r lags of x is usually called the autoregressive distributed lag model of order p and r, or ADL(p,r) for short. Also for this ADL model, we can use the least squares method to estimate the parameters. The ADL model is particularly useful to examine what is called Granger causality, named after the Nobel Laureate Sir Clive Granger. This idea of causality builds on the idea of forecastability. That is, when the past of one variable is helpful to predict the future of another, you might consider that as some form of causality. What you do is construct two ADL models: one for the variable y, and another for the variable x.
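The ADL estimation step can be sketched in a few lines of numpy. This is a minimal illustration with simulated data of my own choosing (an ADL(1,1) data-generating process), not an example from the lecture: the forecast function is linear in one lag of y and one lag of x, and OLS recovers the parameters.

```python
import numpy as np

# Simulate an explanatory factor x and a dependent variable y whose forecast
# uses one lag of each: y_t = alpha + beta*y_{t-1} + gamma*x_{t-1} + eps_t.
rng = np.random.default_rng(1)
T = 1000
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()

# ADL(1,1) by OLS: regress y_t on an intercept, y_{t-1} and x_{t-1}.
X = np.column_stack([np.ones(T - 1), y[:-1], x[:-1]])
alpha, beta, gamma = np.linalg.lstsq(X, y[1:], rcond=None)[0]
```

Because the model contains only lagged (predetermined) regressors, a single least squares regression per equation suffices, as the lecture notes.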
Note that both models include the past of the dependent variable itself plus the past of the other variable. If some of the gamma parameters in the ADL model for y are different from 0, then the past of x helps to predict the future of y. And vice versa, in case the gamma star parameters are non-zero in the ADL model for x. In the case of such non-zero parameters, you can say that one variable is Granger causal for the other. For example, we may find that x is Granger causal for y, but not the other way around. This indicates that the past of x can be helpful for predicting y, but for forecasting x, only its own past is relevant. You can check the significance of, for example, the gamma star coefficients in the ADL model for x by means of the familiar F-test. And, conveniently, as these two models only include lagged variables, you can still estimate the parameters using least squares for each of the two equations separately. The first thing that needs to be done in modeling is to make sure that the time series you wish to analyze is stationary. The reason is that stationarity is required for proper statistical analysis. So, one should first somehow test whether the variable of interest is stationary. To do this, we can again make use of a time series model. For example, in the case of an autoregression of order 1, we know that when the parameter is equal to 1, the time series is not stationary. And this suggests that we can test for stationarity by testing the value of this parameter. As we are familiar with statistical tests for parameters being equal to 0, we usually rewrite the AR(1) model by subtracting the one-period lagged y from both sides of the equation. Now the rho parameter can be tested to be equal to 0 using a t-test. As we are interested in the parameter rho being 0 or smaller than 0, we reject non-stationarity when the t-ratio is more negative than minus 2.9. Note that this is not the usual value of minus 1.65.
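The Granger causality check described above can be sketched as a restricted-versus-unrestricted F-test computed by hand. This is a minimal numpy illustration under my own simulated design, where x leads y by one period, so we expect x to Granger cause y but not the reverse; the helper names `lagmat` and `granger_f` are mine, not from the lecture.

```python
import numpy as np

def lagmat(v, lags, T, maxlag):
    # Columns v_{t-1}, ..., v_{t-lags}, aligned so row 0 corresponds to t = maxlag.
    return np.column_stack([v[maxlag - k:T - k] for k in range(1, lags + 1)])

def granger_f(y, x, p, r):
    """F-statistic for H0: gamma_1 = ... = gamma_r = 0 in the ADL(p, r)
    y_t = alpha + beta_1*y_{t-1} + ... + beta_p*y_{t-p}
                + gamma_1*x_{t-1} + ... + gamma_r*x_{t-r} + eps_t."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    T, m = len(y), max(p, r)
    dep = y[m:]
    X_r = np.column_stack([np.ones(T - m), lagmat(y, p, T, m)])  # own lags only
    X_u = np.column_stack([X_r, lagmat(x, r, T, m)])             # add lags of x
    ssr_r = np.sum((dep - X_r @ np.linalg.lstsq(X_r, dep, rcond=None)[0]) ** 2)
    ssr_u = np.sum((dep - X_u @ np.linalg.lstsq(X_u, dep, rcond=None)[0]) ** 2)
    return ((ssr_r - ssr_u) / r) / (ssr_u / (len(dep) - X_u.shape[1]))

# x leads y by one period, so x should Granger cause y but not vice versa.
rng = np.random.default_rng(1)
T = 500
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()
F_xy = granger_f(y, x, 1, 1)  # does the past of x help predict y?
F_yx = granger_f(x, y, 1, 1)  # does the past of y help predict x?
```

In this design, F_xy comes out far above any conventional critical value, while F_yx stays small, matching the one-directional causality built into the simulation.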
And this is due to the fact that under the null hypothesis, y is non-stationary, which gives rise to a different statistical theory. Now I invite you to consider the following test question. This question deals with a test for the value of 1, which is often called a unit root, in case the model is not an AR(1) but an AR(2). It is then convenient to write the AR(2) model as a mixture of variables in first differences and in levels. The answer is shown on the slide. Please take some time to check the steps on the slide. The rho parameter now is the sum of the two autoregressive parameters minus 1. The expression in the foregoing test exercise provides the basis of the so-called Dickey-Fuller test. The test equation can either include a deterministic trend or not, and the choice is usually based on the visual impression of the data. The inclusion or exclusion of the deterministic trend term, beta times t, matters for the relevant 5% critical value. When the trend term is not included, the critical value for the t-test on rho is minus 2.9, as we saw before. But when the trend is included, it becomes minus 3.5. In practice, we decide on the number of lags in the autoregression by testing for correlation in the residuals, or by using a model selection criterion. When the autoregression has more than one lag, the Dickey-Fuller test is usually called the augmented Dickey-Fuller test, abbreviated as ADF. So, now, how should you proceed to specify a time series model? Well, first you perform an ADF test. And when you can reject a unit root, or non-stationarity, this means that y is stationary, and you can model the series y without further transformation. But when the unit root is not rejected, you should take the first difference, and continue with delta y.
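The (augmented) Dickey-Fuller test regression and the specify-then-difference procedure can be sketched as follows. This is a minimal numpy version under my own conventions (the function name `adf_tstat` and the simulated random walk with drift are illustrative assumptions); the critical values minus 2.9 and minus 3.5 are the ones quoted in the lecture, and proper practice would take them from Dickey-Fuller tables.

```python
import numpy as np

def adf_tstat(y, lags=0, trend=False):
    """t-ratio on rho in the (augmented) Dickey-Fuller regression
    delta y_t = alpha [+ beta*t] + rho*y_{t-1} + sum_j c_j * delta y_{t-j} + eps_t.
    5% critical value: about -2.9 without trend, -3.5 with trend."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    n = len(dy) - lags                          # usable observations
    cols = [np.ones(n), y[lags:-1]]             # intercept and y_{t-1}
    if trend:
        cols.insert(1, np.arange(n, dtype=float))  # deterministic trend term
    for j in range(1, lags + 1):
        cols.append(dy[lags - j:len(dy) - j])      # lagged first differences
    X = np.column_stack(cols)
    dep = dy[lags:]
    coef = np.linalg.lstsq(X, dep, rcond=None)[0]
    resid = dep - X @ coef
    s2 = resid @ resid / (n - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    i = 2 if trend else 1                       # position of rho in coef
    return coef[i] / np.sqrt(cov[i, i])

# A random walk with drift: the level has a unit root, its first difference does not.
rng = np.random.default_rng(3)
y = np.cumsum(0.1 + rng.standard_normal(500))
t_level = adf_tstat(y, lags=1, trend=True)      # compare with -3.5
t_diff = adf_tstat(np.diff(y), lags=1)          # compare with -2.9
```

The differenced series produces a strongly negative t-ratio, so after taking first differences you would continue modeling delta y, exactly the procedure the lecture prescribes.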
Next, you can use OLS to estimate the parameters in an autoregression. Or, when you wish to consider an autoregressive distributed lag model, you perform unit root tests for both series y and x, and proceed with levels or with first differences. And then again, you can use OLS to estimate the parameters in the model. There is, however, one exceptional and practically relevant case, namely, when y and x are not stationary, but a linear combination of the two variables is. Two variables are called cointegrated when a linear combination is stationary. This can only occur when y and x have the same stochastic trend. You might then interpret this linear combination, y equal to c times x, as the long-run equilibrium, which is an attractive concept in economics. A simple test for cointegration is the Engle-Granger test. This test amounts to regressing y on an intercept and x to estimate the long-run equilibrium relationship between these variables. The residuals from this regression can then be interpreted as deviations from the equilibrium. And if the equilibrium relation actually exists, that is, if y and x actually are cointegrated, the residuals should be stationary. This can be examined, again, using the ADF test as before. Because the test is now applied to residuals instead of an actually observed time series, the distribution of the ADF test is different. The 5% critical value for the relevant t-test is now minus 3.4, or even minus 3.8 if the trend term is included. In case of cointegration, the ADL model can be written in the so-called error correction format, which includes lagged difference variables and the stationary linear combination of y and x. You see an example here on the slide. The term error correction follows from the notion that deviations from the long-run equilibrium get corrected by the beta-1 parameter in forecasting the changes of y.
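The two steps of the Engle-Granger test can be sketched in numpy. This is a minimal illustration on simulated cointegrated data of my own design (x is a random walk and y equals two times x plus stationary noise, so the true long-run coefficient c is 2); the helper name `df_tstat` is mine, and the critical value minus 3.4 is the residual-based one quoted in the lecture.

```python
import numpy as np

def df_tstat(u):
    # t-ratio on rho in: delta u_t = alpha + rho*u_{t-1} + eps_t
    u = np.asarray(u, dtype=float)
    du = np.diff(u)
    X = np.column_stack([np.ones(len(du)), u[:-1]])
    coef = np.linalg.lstsq(X, du, rcond=None)[0]
    resid = du - X @ coef
    s2 = resid @ resid / (len(du) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

# Simulate cointegrated series: x is a random walk (stochastic trend),
# and y shares that trend: y = 2*x + stationary noise.
rng = np.random.default_rng(4)
x = np.cumsum(rng.standard_normal(500))
y = 2.0 * x + rng.standard_normal(500)

# Step 1: estimate the long-run equilibrium y = a + c*x by OLS.
Z = np.column_stack([np.ones(500), x])
a_hat, c_hat = np.linalg.lstsq(Z, y, rcond=None)[0]
u = y - (a_hat + c_hat * x)       # deviations from the estimated equilibrium

# Step 2: Dickey-Fuller-type test on the residuals; the relevant
# 5% critical value is about -3.4 (residual-based distribution).
t_resid = df_tstat(u)
```

Here the residual test statistic falls far below minus 3.4, so cointegration is found, and the deviations u could then enter an error correction model for delta y.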
Now I invite you to make the training exercise, where you can train yourself with the topics that were treated in this lecture. As always, you can find this exercise also on the website. And this concludes our lecture on specification and estimation of time series models.