Essentials of Time Series Modeling

Chapter 3 in The Essentials of Time Series Modeling: An Applied Treatment with Emphasis on Topics Relevant to Financial Analysis © by Houston H. Stokes 4 September 2013. All rights reserved. Preliminary Draft

Chapter 3

Time Series Analysis Basics

3.0 Overview

3.1 Basic Definitions

3.2 Examples

Table 3.1 ACF, PACF and Spectral Plots of AR Series

Figure 3.1 ACF and PACF of High Frequency Series

Figure 3.2 Periodogram and Spectrum of High Frequency Series

Table 3.2 Estimating the Characteristic Roots of an AR Model

Table 3.3 Geweke and Porter-Hudak Fractional Differencing Test

Table 3.4 Fractional Differencing Example

Table 3.5 Program to Obtain Theoretical ACF of Fractionally Differenced Series

Table 3.6 Listing of the fdifinfo Subroutine

Table 3.7 Testing the Tsay-Chung Result on Fractional Differencing

Table 3.8 Tests of the Polyroot Program

Table 3.9 Understanding Stationarity by Inspection of a Polynomial

3.3 Conclusion

Time Series Analysis Basics

3.0 Overview

Time series analysis is based on forecasting the realization of jointly distributed random variables. The estimated filter captures the systematic information in the series, which can be used to project ahead. Additional applications include various diagnostic tests. In this chapter only first-order relationships will be considered: assuming $\{y_t\}$ is a time series, only relationships between $y_t$, its own past values and current and past shocks will be studied.

3.1 Basic Definitions

A stochastic process is a sequence of observations evolving through a probability law. A structural change implies that the probability law underlying a stochastic process is changing. The random walk model assumes

$y_t = y_{t-1} + \epsilon_t$  (3.1-1)

where $\epsilon_t$ has a zero expectation. Consider the case of a coin flip where $\epsilon_t$ takes on the values 1 and -1. A random walk with a drift is

$y_t = a_0 + y_{t-1} + \epsilon_t$  (3.1-2)

The more general form is

$(1 - B) y_t = a_0 + \epsilon_t$  (3.1-3)

where $B$ is the lag operator defined as $B^k y_t \equiv y_{t-k}$. If the random walk model is correct then

$y_t = y_0 + \sum_{i=1}^{t} \epsilon_i$  (3.1-4)

and, using a simple case with $t = 3$,

$y_3 = y_0 + \epsilon_1 + \epsilon_2 + \epsilon_3.$  (3.1-5)

Note that the effect of long past shocks is given the same weight as shocks that occurred in more recent periods. If $\epsilon_t$ is normally distributed we can make a probability statement about the forecast.
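As a quick illustration (a minimal Python sketch, not part of the chapter's B34S examples), the coin-flip random walk can be simulated directly. Because every past shock keeps full weight, the level wanders and the sample variance grows with the length of the series:

import numpy as np

rng = np.random.default_rng(42)
n = 2000

# coin-flip shocks: +1 or -1 with equal probability, zero expectation
eps = rng.choice([1.0, -1.0], size=n)

# random walk (3.1-1): y[t] = y[t-1] + eps[t], so y[t] is the running sum
y = np.cumsum(eps)

# every shock carries the same weight, so the variance grows with t
print("variance of first 100 obs:", y[:100].var())
print("variance of all 2000 obs :", y.var())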

Box-Jenkins time series models involve the solution of $n$th-order linear difference equations. Examples:

$y_t = a_1 y_{t-1} + \epsilon_t, \qquad y_t = a_1 y_{t-1} + a_2 y_{t-2} + \epsilon_t$  (3.1-6)

The homogeneous difference equation $y_t = a_1 y_{t-1}$ has a general solution $y_t = c a_1^t$, where $c$ is an arbitrary constant, since $c a_1^t = a_1 (c a_1^{t-1})$. Consider

$y_t = a_1 y_{t-1} + \epsilon_t = \epsilon_t + a_1 \epsilon_{t-1} + a_1^2 \epsilon_{t-2} + \cdots = \sum_{i=0}^{\infty} a_1^i \epsilon_{t-i}$  (3.1-7)

Equation (3.1-7) shows how an autoregressive process (AR(1)) with coefficient $a_1$ can be written as an infinite moving average model (MA($\infty$)) in which the shocks die out. Note that this will only work if the AR coefficient is less than 1 in absolute value. In contrast to the random walk model, here long past shocks have less weight than more recent shocks. If $a_1 = 1$ we have a unit root and the shocks do not die out. This is the classic random walk model, which will be shown not to have a finite variance. It will be shown that, given some stationarity conditions, an AR or ARMA model can be written as an infinite MA model (the Wold decomposition).
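The geometric decay of the MA weights is easy to verify numerically. The following hedged Python sketch (illustrative only; the variable names are mine) simulates an AR(1) with $a_1 = .9$ and checks that a long truncation of the MA($\infty$) form reproduces the same value:

import numpy as np

a1 = 0.9                     # AR(1) coefficient, |a1| < 1 so shocks die out
n, burn = 200, 500
rng = np.random.default_rng(0)
eps = rng.normal(size=n + burn)

# simulate the AR(1) recursion y[t] = a1*y[t-1] + eps[t], y[0] = 0
y = np.zeros(n + burn)
for t in range(1, n + burn):
    y[t] = a1 * y[t - 1] + eps[t]

# truncated MA(inf) form of (3.1-7): y[t] = sum over i of a1**i * eps[t-i]
weights = a1 ** np.arange(burn)
t = n + burn - 1
ma_approx = np.dot(weights, eps[t::-1][:burn])
print(y[t], ma_approx)       # the two values agree to many digits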

This suggests that a model of the form

$y_t = \alpha_0 + \alpha_1 x_t + e_t$  (3.1-8)

can be written as an MA model, since $x_t$ and $e_t$ can both be written as MA models, and in the ARIMA model of $y_t$ the shocks are combined to be aggregates of the shocks implicit in the ARIMA models of the series $x_t$ and $e_t$. In ARIMA model building there is one shock sequence. Each shock is a composite of the shocks from the different Wold decompositions. An ARIMA model can be thought of as the ultimate reduced form of some structural model that may or may not be known. The VMA model allows shocks from different processes to be explicitly modeled. Shocks originating from different parts of the economy may have different effects. Assume $k$ series. A VMA model in which there is no mapping from the shocks of one series to any other series is actually $k$ separate MA models.

Since the MA model is an ultimate reduced form, inspection of the coefficients of the VMA model reveals the relative magnitudes of the responses to different shocks. Assume a model of interest rates, consumption, investment and exports. Shocks from the export sector may have different impacts on interest rates and on consumption.

Difference equations can be solved by iteration provided the initial conditions are known. We want to solve $y_t = a_0 + a_1 y_{t-1} + \epsilon_t$ given the initial value $y_0$:

$y_1 = a_0 + a_1 y_0 + \epsilon_1$
$y_2 = a_0 + a_1 y_1 + \epsilon_2 = a_0 (1 + a_1) + a_1^2 y_0 + a_1 \epsilon_1 + \epsilon_2$
$\vdots$
$y_t = a_0 \sum_{i=0}^{t-1} a_1^i + a_1^t y_0 + \sum_{i=0}^{t-1} a_1^i \epsilon_{t-i}$  (3.1-9)
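The closed-form last line of (3.1-9) can be confirmed against brute-force iteration. A hedged Python sketch (the parameter values are arbitrary illustrations):

import numpy as np

a0, a1, y0 = 0.5, 0.8, 2.0
T = 50
rng = np.random.default_rng(1)
eps = rng.normal(size=T + 1)   # eps[1]..eps[T] are used; eps[0] is ignored

# brute-force iteration of y[t] = a0 + a1*y[t-1] + eps[t]
y = np.zeros(T + 1)
y[0] = y0
for t in range(1, T + 1):
    y[t] = a0 + a1 * y[t - 1] + eps[t]

# closed form from the last line of (3.1-9)
i = np.arange(T)
closed = a0 * (a1 ** i).sum() + a1 ** T * y0 + (a1 ** i * eps[T - i]).sum()
print(y[T], closed)            # identical up to rounding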

Now assume that we do not know $y_0$ but that $|a_1| < 1$. Take the last equation in (3.1-9) and substitute in for $y_0$, using the first equation of (3.1-9) shifted back one period, $y_0 = a_0 + a_1 y_{-1} + \epsilon_0$. Continuing to iterate backward $m$ additional periods gives

$y_t = a_0 \sum_{i=0}^{t+m} a_1^i + a_1^{t+m+1} y_{-m-1} + \sum_{i=0}^{t+m} a_1^i \epsilon_{t-i}$  (3.1-10)

Since $|a_1| < 1$, the term $a_1^{t+m+1} y_{-m-1}$ converges to zero as $m \rightarrow \infty$ and

$a_0 \sum_{i=0}^{\infty} a_1^i = \frac{a_0}{1 - a_1}.$  (3.1-11)

Equation (3.1-10) can be written as an MA model of the form

$y_t = \frac{a_0}{1 - a_1} + \sum_{i=0}^{\infty} a_1^i \epsilon_{t-i}$  (3.1-12)

For added detail on this see Enders (2010, 11). The importance of (3.1-12) is that by reducing the AR model to an MA model we can forecast outside the sample period, since $E_t \epsilon_{t+j} = 0$ for $j > 0$. In addition, out-of-sample forecasts can be updated in the field as new error values become available. Note that (3.1-12) does not work with the random walk model (3.1-1), since all shocks are equally weighted. In such a case it is not possible to derive (3.1-12) from (3.1-10) directly.
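Since $E_t \epsilon_{t+j} = 0$ for $j > 0$, the $s$-step-ahead forecast from (3.1-12) reduces to $E_t y_{t+s} = a_0 (1 - a_1^s)/(1 - a_1) + a_1^s y_t$, which decays toward the unconditional mean $a_0/(1 - a_1)$. A hedged Python sketch with illustrative parameter values:

a0, a1 = 0.5, 0.8
y_t = 4.0                          # last observed value
mean = a0 / (1.0 - a1)             # unconditional mean, here 2.5

# future shocks drop out of the conditional expectation
for s in (1, 2, 5, 10, 50):
    fcst = a0 * (1.0 - a1 ** s) / (1.0 - a1) + a1 ** s * y_t
    print(s, round(fcst, 4))       # forecasts decay toward the mean

print("unconditional mean:", mean)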

Modify equation (3.1-1) to have a random walk with a drift. Iterating from the initial value $y_0$ gives the solution

$y_t = y_0 + a_0 t + \sum_{i=1}^{t} \epsilon_i$  (3.1-13)
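The solution (3.1-13) splits the series into an initial condition, a deterministic trend $a_0 t$ and a stochastic trend (the accumulated shocks). A small hedged Python simulation shows the drift term, which grows like $t$, coming to dominate the shock sum, which grows like $\sqrt{t}$:

import numpy as np

rng = np.random.default_rng(7)
a0, y0, T = 0.1, 0.0, 5000
eps = rng.normal(size=T)

# equation (3.1-13): initial value + deterministic trend + accumulated shocks
y = y0 + a0 * np.arange(1, T + 1) + np.cumsum(eps)

print("drift part a0*T    :", a0 * T)
print("accumulated shocks :", eps.sum())
print("final value y_T    :", y[-1])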

3.2 Examples

The file genarma2 in matrix.mac can be used to study the effect of noise on AR models. A more extensive example is given in Table 3.1 to help in understanding what a series with specific structure looks like in both the frequency domain (defined later) and the time domain. The goal is to use the ACF and PACF of a series to develop an ARIMA model (filter) that "cleans" the series. Once the systematic information in a series is captured, it can be used to make projections.

Table 3.1 ACF, PACF and Spectral Plots of AR Series

______

b34sexec matrix;

* Enders test cases on page 14;

* Use to study ar values of .9, .5, -.5, 1., 1.01, -1.01 ;

* By adjusting var can show effect of noise ;

n=2000;

ndrop=0;

ar=-.9;

* ma must be defined for the genarma call below; set to 0.0 for a pure AR model ;

ma=0.0;

var=.1;

con=0.0;

start=array(1:1.);

test=genarma(ar,ma,con,start,var,n,ndrop);

call load(data2acf);

call load(do2spec);

call data2acf(test,'ACF & PACF of AR(-.9)', dmax1(norows(test)/10,2),

'acf.wmf');

call character(cc,'Spectral Analysis of AR(-.9) Model');

weights=index(1 2 3 4 5 6 7 6 5 4 3 2 1);

/; Low Tech way to Proceed

/; call spectral(test,sinx,cosx,px,sx1,freq: weights);

/; call graph(freq,sx1 :heading cc

/; :plottype xyplot :file 'specar1.wmf');

/;

call do2spec(test,cc,weights,'spec.wmf');

b34srun;

Graphs produced from the example in Table 3.1 are

Figure 3.1 ACF and PACF of High Frequency Series

Figure 3.2 Periodogram and Spectrum of High Frequency Series

Using the code in Table 3.1, try alternative AR(1) models with coefficients = .9, .5, -.5, 1., 1.01, and -1.01. Describe what you see. Why can you not use 1.2? If running in text mode, add call print(test); since graphics will not show the ACF. Alter the statement var=.1; to var=10.; What do you see in the plot? For AR coefficients > 1.0 in absolute value, N cannot be made too large. Why is this the case?

To save the ACF, PACF and spectrum (terms to be defined below) we can use

acf1=acf(test,dmax1(norows(test)/50,2),se,pacf1);

call spectral(test,sinx,cosx,px,sx1,freq:1 2 3 2 1);

A discussion of how the spectrum relates to the ACF will be presented later. While the ACF looks at the structure in the time domain, the spectrum decomposes the variance by frequency.
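The variance-decomposition property can be checked directly: by Parseval's relation the periodogram ordinates sum, up to a normalization, to the sample variance. A hedged Python sketch using the FFT (not the B34S spectral routine used above):

import numpy as np

rng = np.random.default_rng(3)
n = 1024
x = rng.normal(size=n)
x = x - x.mean()

# periodogram I(w_j) = |FFT|**2 / n at the Fourier frequencies
pgram = (np.abs(np.fft.rfft(x)) ** 2) / n

# Parseval: variance = sum of ordinates / n, interior frequencies counted twice
total = (pgram[0] + 2.0 * pgram[1:-1].sum() + pgram[-1]) / n
print(total, x.var())   # the two numbers agree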

An important topic concerns ARMA model stability conditions. Given an AR(k) model $y_t = a_0 + \sum_{i=1}^{k} a_i y_{t-i} + \epsilon_t$, a necessary condition for stability (all characteristic roots lie within the unit circle) is $\sum_{i=1}^{k} a_i < 1$, while a sufficient condition is $\sum_{i=1}^{k} |a_i| < 1$. At least one characteristic root equals unity if $\sum_{i=1}^{k} a_i = 1$. The test program polyroot1, shown in Table 3.2, can be used to estimate the characteristic roots of an AR(k) model, which can then be inspected.

Table 3.2 Estimating the Characteristic Roots of an AR Model

______

b34sexec matrix$

call echooff;

* Polyroot Equation used to test if AR model is stable ;

* Model is stable if Characteristic roots < 1;

* or inside unit circle.;

* For high order systems necessary condition is sum (coef) < 1;

* Sufficient condition is sum (abs(coef)) < 1 ;

* At least one root is unity if sum(coef) = 1;

* test y(t) = -.9*y(t-1) + u(t) ;

coef=array(2:.9,1.);

roots=polyroot(coef);

call print('Tests y(t) = -.9*y(t-1) + u(t)',

coef,roots);

* test y(t) = -1.1*y(t-1) + u(t) ;

coef=array(2:1.1,1.);

roots=polyroot(coef);

call print('Tests y(t) = -1.1*y(t-1) + u(t)',

coef,roots);

* test y(t) = .2*y(t-1) + .35*y(t-2) + u(t) ;

* Enders page 26 case 1 ;

coef=array(3:-.35,-.2,1.);

roots=polyroot(coef);

call print('Tests y(t) -.2*y(t-1) - .35*y(t-2)= u(t)',

coef,roots);

* test y(t) = .7*y(t-1) + .35*y(t-2) + u(t) ;

* Enders page 27 case 2;

coef=array(3:-.35,-.7,1.);

roots=polyroot(coef);

call print('Tests y(t) -.7*y(t-1) - .35*y(t-2)= u(t)',

coef,roots);

* Enders page 30 Imaginary case 1;

coef=array(3:.9,-1.6,1.);

roots=polyroot(coef);

call print('Tests y(t) -1.6*y(t-1) + .9*y(t-2)= u(t)',

coef,roots);

* Enders page 30 Imaginary case 2;

coef=array(3:.9,.6,1.);

roots=polyroot(coef);

call print('Tests y(t) +.6*y(t-1) + .9*y(t-2)= u(t)',

coef,roots);

b34srun$

Edited output from running the code in Table 3.2 is

Matrix Command. Version January 2000

Report any problems.

=> CALL ECHOOFF$

Tests y(t) = -.9*y(t-1) + u(t)

COEF = Array of 2 elements

0.900000 1.00000

ROOTS = Complex Array of 1

( -0.9000 , 0.000 )

Tests y(t) = -1.1*y(t-1) + u(t)

COEF = Array of 2 elements

1.10000 1.00000

ROOTS = Complex Array of 1

( -1.100 , 0.000 )

Tests y(t) -.2*y(t-1) - .35*y(t-2)= u(t)

COEF = Array of 3 elements

-0.350000 -0.200000 1.00000

ROOTS = Complex Array of 2

( -0.5000 , 0.000 ) ( 0.7000 , 0.000 )

Tests y(t) -.7*y(t-1) - .35*y(t-2)= u(t)

COEF = Array of 3 elements

-0.350000 -0.700000 1.00000

ROOTS = Complex Array of 2

( -0.3374 , 0.000 ) ( 1.037 , 0.000 )

Tests y(t) -1.6*y(t-1) + .9*y(t-2)= u(t)

COEF = Array of 3 elements

0.900000 -1.60000 1.00000

ROOTS = Complex Array of 2

( 0.8000 , 0.5099 ) ( 0.8000 , -0.5099 )

Tests y(t) +.6*y(t-1) + .9*y(t-2)= u(t)

COEF = Array of 3 elements

0.900000 0.600000 1.00000

ROOTS = Complex Array of 2

( -0.3000 , 0.9000 ) ( -0.3000 , -0.9000 )

B34S Matrix Command Ending. Last Command reached.
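For readers working outside B34S, the same characteristic roots can be obtained with numpy, whose roots function takes coefficients in decreasing powers (the reverse of polyroot's ordering). A hedged Python sketch reproducing three of the cases above:

import numpy as np

# y(t) = .2*y(t-1) + .35*y(t-2) + u(t): lambda**2 - .2*lambda - .35 = 0
print(np.roots([1.0, -0.2, -0.35]))   # 0.7 and -0.5: stable, inside unit circle

# y(t) = .7*y(t-1) + .35*y(t-2) + u(t)
print(np.roots([1.0, -0.7, -0.35]))   # 1.037 and -0.337: one explosive root

# y(t) = 1.6*y(t-1) - .9*y(t-2) + u(t)
print(np.roots([1.0, -1.6, 0.9]))     # 0.8 +/- 0.5099i, modulus < 1: stable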

In practice, inspection of whether the autocorrelation function (ACF) dies out will tell us if a model is stationary or whether it needs some form of differencing. The Dickey-Fuller (1979) and Phillips-Perron (1988) tests for unit roots are usually performed (see Stokes (200x), chapter 12). The Elliott, Rothenberg and Stock (1996) DF-GLS test is an alternative way to proceed and is also discussed in Stokes (200x, chapter 12). The random walk with drift model of (3.1-13) will be discussed below. Before proceeding to that task, a little about notation is in order. Equation (3.2-1) shows various applications of the lag operator:

$B y_t = y_{t-1}, \quad B^k y_t = y_{t-k}, \quad B^{-1} y_t = y_{t+1}, \quad (1 - B) y_t = y_t - y_{t-1}$  (3.2-1)

Differencing can be used to make a series stationary. First and second differencing are defined in equations (3.2-2) and (3.2-3) as

$\Delta y_t = (1 - B) y_t = y_t - y_{t-1}$  (3.2-2)

$\Delta^2 y_t = (1 - B)^2 y_t = y_t - 2 y_{t-1} + y_{t-2}$  (3.2-3)

while seasonal differencing and second seasonal differencing for a monthly series are defined in equation (3.2-4) as

$\Delta_{12} y_t = (1 - B^{12}) y_t = y_t - y_{t-12}, \qquad \Delta_{12}^2 y_t = (1 - B^{12})^2 y_t$  (3.2-4)

The usual test is to see if the ACF dies out. We prove later, using an MA(1) model, that the ACF(1) of a differenced white noise series has a value of -.5. Differencing can add structure to a series.
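The -.5 value is easy to check by simulation. A hedged Python sketch that differences a white noise series and computes the lag-1 sample autocorrelation:

import numpy as np

rng = np.random.default_rng(11)
e = rng.normal(size=100000)    # white noise
d = np.diff(e)                 # first difference: an MA(1) with coefficient -1

d = d - d.mean()
acf1 = (d[1:] * d[:-1]).sum() / (d * d).sum()
print(acf1)                    # close to -.5: differencing added structure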

Fractional differencing assumes that $d$ in $(1 - B)^d y_t$ is not an integer. Fractional differencing has been used in financial market research into long memory models. Assume a monthly stock return series with ACF spikes at lags 1 and 3. If this series is now obtained on a daily basis, in theory these spikes would be roughly 30 and 90 periods back. Something more than integer differencing is needed to make such a series stationary. For references see Campbell, Lo and MacKinlay (1997) and Baillie (1996). Define $\Gamma(\cdot)$ as the Gamma function. It can be shown (see Greene 2000, page 786) that

$(1 - B)^d y_t = \sum_{k=0}^{\infty} \pi_k y_{t-k}, \qquad \pi_k = \frac{\Gamma(k - d)}{\Gamma(-d)\, \Gamma(k + 1)}$  (3.2-5)
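The weights $\pi_k$ in (3.2-5) can be generated without evaluating Gamma functions directly, using the recursion $\pi_0 = 1$, $\pi_k = \pi_{k-1}(k - 1 - d)/k$. A hedged Python sketch (the function name is mine) that cross-checks the recursion against the Gamma form:

import numpy as np
from math import gamma

def frac_diff_weights(d, k_max):
    # pi_0 = 1; pi_k = pi_{k-1} * (k - 1 - d) / k
    w = np.empty(k_max)
    w[0] = 1.0
    for k in range(1, k_max):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

d = 0.4
w = frac_diff_weights(d, 8)
# Gamma-function form of (3.2-5) for k >= 1
g = [gamma(k - d) / (gamma(-d) * gamma(k + 1)) for k in range(1, 8)]
print(w[1:])
print(np.array(g))   # same numbers; note the slow hyperbolic decay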

The fractionally differenced series is stationary if $-.5 < d < .5$. If $d > 0$, the ACF decays so slowly that its sum diverges to infinity. Geweke and Porter-Hudak (1983) developed a spectral-based test to detect fractional differencing that is based on a regression of the logs of the first $T^{power}$ periodogram values of the series being tested on $\log(4 \sin^2(\omega_j / 2))$ at the first $T^{power}$ Fourier frequencies.[1] If $power = .5$ then $\sqrt{T}$ terms are being used. A t test on the slope determines if fractional differencing is called for. The gph subroutine listed in Table 3.3 illustrates the test:


Table 3.3 Geweke and Porter-Hudak Fractional Differencing Test

______

subroutine gph(series,power,d,se,se2,iprint);

/;

/; Estimates fractional difference power for series using the

/; frequency domain regression techniques of Geweke and Porter-Hudak.

/;

/; Works in a manner similar to RATS GPH routine which was used as a

/; basis

/;

/; series => series to analyze. This is usually the first difference

/; of the series actually being studied.

/;

/; POWER => Lowest T**power frequencies are used in running the

/; regression. Usually set as .5 (i.e. use square root of the

/; number of observations)

/;

/; d => estimate of fractional differencing

/;

/; se => Standard error of d

/;

/; se2 => Asymptotic SE of d

/;

/; iprint => 0 => no printing

/; 1 => print results

/; 2 => print results and regression

/;

/; Reference: Geweke and Porter-Hudak, "The Estimation and Application

/; of Long Memory Time Series Models", J. of Time Series Analysis,

/; 1983, (221-238).

/;

/; Built 30 April 2006 by Houston H. Stokes.

/; Improved error messages 5 September 2006.

/;

call spectral(series,sinx,cosx,px,sx,freq1);

n=idint(dfloat(norows(series))**power);