Two Methodologies to Build Inflation Leading Indicators for Brazil

Marcelle Chauvet

Solange Gouvea

Marta Baltar Moreira

José Ricardo da Costa e Silva

Research and Studies Department

Central Bank of Brazil

July 2000

Abstract

The goal of this paper is to describe and compare the performance of two methodologies for building composite leading inflation indicators: a dynamic factor model, estimated through the Kalman filter, and a linear regression framework. Each indicator combines four variables with predictive content for inflation, drawn from a total of 19 pre-selected candidates. In the case of the dynamic factor model, all possible combinations of four of these 19 variables entail 3876 potential cases. In the case of the linear regressions with k steps ahead (k = 1, …, 12), there are 46512 possible combinations. Both methodologies produced effective leading inflation indicators, according to several statistical tests.

  1. INTRODUCTION

Over the last 12 months, monetary policy in Brazil has been guided by the inflation targeting regime. Since the Real had already been floated, there was no concern about the flexibility needed to pursue inflation as an overriding objective. Still, it was essential to eliminate possible fiscal dominance, to set an explicit target for inflation and to make decisions transparent. In addition, the role of the monetary authority remained to be established: the Central Bank was to be assigned operational independence to set monetary policy instruments and the responsibility for meeting the targets. Therefore, when implementing this new system, the government put in place institutional and fiscal arrangements to better fulfill these requirements.

Once those main decisions were taken, the Central Bank of Brazil focused on the technical challenges implicit in the inflation targeting framework. It was necessary to develop alternative methods to forecast inflation. Because monetary policy is transmitted with lags, the inflation targeting system demands a preemptive stance from the Central Bank when setting its instruments. It was therefore essential to use econometric forecasting techniques to properly assess the repercussions of possible shocks that may divert the future path of inflation from its previously fixed target.

Macroeconomic forecasts aim to provide the policy maker with the best possible set of information to support the decision making process. There are several approaches to forecasting either an event itself or its magnitude. First, small-scale models of the monetary policy transmission mechanism were built using theoretical relations among macroeconomic variables. Second, non-structural models such as vector autoregressions (VAR) have been used to provide short-term inflation forecasts. A third method involves the construction of leading indicators for inflation, with the main purpose of detecting early-warning signals of turning points in the future inflation path. In contrast to macroeconomic models, the leading indicator approach is based on very little economic theory.

Leading indicators are a forecasting tool for predicting cyclical movements of an economic time series[1]. Unlike other forecasting methods, their main goal is not to predict future values of the dependent variable, but rather its turning points. Predictions are based on observation of the leading indicator once the cycles and the lead have been identified. In effect, it is a qualitative forecasting approach.

The first and main purpose of leading indicators was to predict changes in economic activity. It was based on the idea that cyclical fluctuations in activity may be related to some economic indicators. The National Bureau of Economic Research started an important research program on business cycles in the late 1930s. As part of this program, Burns and Mitchell (1946) classified economic time series as coincident, lagging or leading, depending on whether they reproduce business cycles simultaneously, with a lag, or in advance.

Although the construction of leading indicators is based on atheoretical statistical analysis, there are economic reasons why indicators may anticipate the behavior of business cycles. Leeuw (1991) identifies five reasons, of which the first three have been extensively discussed in the literature. They can be summarized as follows:

  • Production time: In the production process, some activities occur earlier than others in a linked sequence. An expansion of economic activity should begin with a prior increase in demand for consumer goods and materials, and in new contracts and orders for plant and equipment. For example, increases in the production of cardboard should precede growth in industrial sales.
  • Ease of adaptation: Some variables are more flexible than others and react in the earlier stages of an expansion or decline in economic activity, because some investment-related decisions cost less than others. It is less risky to expand hours of work than the number of workers when an expansion of the economy appears to be on the way.
  • Market expectation: Some economic indicators are more sensitive to future changes in economic activity than others. Stock prices and some material and commodity prices are good examples.
  • Prime movers: Some economic time series may record in advance the behavior of economic fundamentals that eventually lead to a short-run movement in economic activity. Fiscal and monetary variables represent this kind of leading indicator.
  • Changes versus level: Changes in the growth rate of economic indicators may signal future changes in direction. In other words, the first difference of an economic indicator can be interpreted as leading a change in its level.

A single economic time series, such as unemployment or hours of work, can be a leading indicator candidate. However, the literature suggests that a composite index performs better than any one variable alone. The main explanation is that a composite index can neutralize false signals from its components. For example, an increase in commodity prices may not be anticipating a demand increase, but reflecting a cartel action. Since there are several reasons why an indicator may anticipate future changes in economic activity, it is useful to work with a set of different indicators.

Although leading economic indicators have mostly been used to predict movements in economic activity, they can also be applied to anticipate inflation turning points. The rationale is that both series are characterized by cycles and can be forecast by some economic indicators.

Recent literature proposes the use of leading indicators to anticipate inflation turning points. Moore (1986)[2], Roth (1991), Webb (1995), Dasgupta (1991), and Chauvet (1999a, 1999b) are good examples. According to Moore (1989), import prices, domestic credit and the percentage of the working-age population employed are important sensitive indicators of future inflation. Boughton and Branson (1989) conclude that commodity prices may lead consumer prices when the data are denominated in a currency other than the US dollar.

The objective of this paper is to compare two alternative methods to build leading inflation indicators for Brazil.

Section 2 sets out the methodology used to construct the leading inflation indicators, including variable selection and the two estimation procedures used to compose them. Section 3 presents an in-sample analysis, including turning point analysis, the Quadratic Probability Score and the lead average and standard deviation approach. Section 4 presents an out-of-sample analysis of the selected indicators, and Section 5 concludes.

  2. METHODOLOGY
I. Variable Selection and Treatment[3]

In order to build a leading inflation indicator for Brazil, more than 200 economic time series that could anticipate inflation movements were collected and analyzed. They were drawn from different fields. Monetary aggregates, interest rates and stock indexes represent the monetary and financial sectors. Analogously, the exchange rate, trade balance, public sector net external debt, commodity price indexes and others were chosen to represent external pressures. In addition, industrial production indexes, industrial sales, employment rates, wage indexes and others were selected for being related to the level of activity, while wholesale, aggregate and consumer price indexes were selected to represent prices.

A careful review of data quality and reliability was carried out to discard series affected by changes in methodology or by problems in the collection procedure. Timeliness was also used as a selection criterion[4]. This investigation found 115 time series to be inadequate.

The series were tested for the presence of a unit root[5] and transformed to achieve stationarity[6]: by first differencing if the trend was found to be stochastic, or by detrending in the deterministic case. All series were then put on the same scale through normalization[7].
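The two transformations mentioned above can be sketched in a few lines of Python. This is only an illustration under assumptions: normalization is taken here to mean rescaling to zero mean and unit standard deviation (the paper's own definition is given in its footnote), the unit root test itself is not shown, and the function names and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def first_difference(series):
    """Return x[t] - x[t-1]; the result is one observation shorter."""
    return [b - a for a, b in zip(series, series[1:])]

def normalize(series):
    """Rescale to zero mean and unit (population) standard deviation.

    Assumption: this z-score is one common meaning of "normalization";
    the source does not spell out its formula in the main text.
    """
    m, s = mean(series), pstdev(series)
    if s == 0:
        raise ValueError("a constant series cannot be normalized")
    return [(x - m) / s for x in series]

# Made-up index levels, used only to exercise the functions.
level = [100.0, 102.0, 105.0, 104.0, 108.0, 113.0]
diff = first_difference(level)
z = normalize(diff)
print(diff)
print([round(v, 3) for v in z])
```

Differencing comes first because it removes a stochastic trend; normalizing afterwards puts every transformed series on a comparable scale before they are combined.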

The series used as the reference variable is the first difference of the National Consumer Price Index (IPCA[8]), detrended using an exponential function[9].

After this treatment, a further data selection was performed using two econometric procedures. The first evaluated the ability of each series to Granger-cause inflation. This test was performed for each lag from two to twelve. Based on these results, 49 time series were selected according to their predictive content in explaining inflation.

In the second procedure, variables were categorized according to the lead or lag at which their cross-correlation with inflation is highest. A maximum correlation[10] around lags 0, -6 or 6 corresponded to a coincident, lagging or leading indicator classification, respectively. A total of twenty variables turned out to belong to the leading group[11]. However, one more variable was discarded due to multicollinearity[12]. These selection criteria yielded 19 variables.
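A hedged Python sketch of the cross-correlation classification follows. It computes the correlation between a candidate series and inflation at each lead from -12 to 12 and picks the lead with the largest absolute value. The rule that maps a lead to a class (nearest of the centres 0, -6 and 6) is an assumption made for illustration, since the text only says "around" those lags; the function names and the simulated data are likewise hypothetical.

```python
import random
from math import sqrt

def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def cross_corr(x, infl, lead):
    """Correlation between x[t] and infl[t + lead] (lead < 0: x lags inflation)."""
    if lead >= 0:
        return corr(x[:len(x) - lead], infl[lead:])
    return corr(x[-lead:], infl[:len(infl) + lead])

def best_lead(x, infl, max_lead=12):
    """Lead in -max_lead..max_lead with the largest absolute cross-correlation."""
    return max(range(-max_lead, max_lead + 1),
               key=lambda k: abs(cross_corr(x, infl, k)))

def classify(lead):
    """Assumed rule: assign the lead to the nearest of the centres 0, -6 and 6."""
    centres = {0: "coincident", -6: "lagging", 6: "leading"}
    return centres[min(centres, key=lambda c: abs(lead - c))]

# Simulated data: x is built so that it moves six periods before "inflation".
rng = random.Random(0)
infl = [rng.gauss(0, 1) for _ in range(120)]
x = infl[6:] + [0.0] * 6          # x[t] = infl[t + 6] for the first 114 periods
k = best_lead(x, infl)
print(k, classify(k))
```

In this constructed example the candidate series reproduces inflation six periods early, so the maximum occurs at a lead of 6 and the series falls in the leading group.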

II. Methods to compose the index

The literature presents several ways to combine variables to build a leading indicator. Webb (1995) and Cifuents (2000) used the arithmetic mean, Contador (1999) suggested the use of a weighted mean[13] and Chauvet (1999a, 1999b) used a dynamic factor model to extract the common movement underlying a set of variables that lead inflation. The resulting factor is a linear combination of the variables used, weighted by the estimated factor loadings.

This work uses two distinct approaches to construct the composite indexes. One is based on Chauvet (1999a, 1999b). The other uses a linear regression to find the best set of coefficients to be used as weights on the composition of the indexes.

II.1 The Dynamic Factor Exercise

Chauvet (1999a, 1999b) suggested the combination of four variables to build the leading inflation indicator based on the idea that each of them would capture different sources of inflation pressure[14]. The leading inflation indicators were estimated as an unobserved variable using the Kalman Filter[15]. The idea is to extract a common factor of all component economic time series. Following this rationale, all possible combinations of four variables with those 19 pre-selected variables were considered[16].

The model used was suggested by Chauvet (1999a, 1999b). It is built around a dynamic factor, as described in the following equations, and estimated through the Kalman filter algorithm:

(1) $Y_t = \lambda \, LII_t + \varepsilon_t$

(2) $LII_t = \phi \, LII_{t-1} + \eta_t$

where $\varepsilon_t \sim$ i.i.d. $N(0, \Sigma)$ and $\eta_t \sim$ i.i.d. $N(0, \sigma_\eta^2)$. $Y_t$ represents a 4x1 vector that contains the four economic time series previously selected, $\lambda$ is a 4x1 vector that measures the sensitivity of the component series to the leading inflation indicator, $\phi$ is the scalar autoregressive coefficient of the factor, $\varepsilon_t$ is the 4x1 vector of measurement errors, and $\eta_t$ is the scalar transition shock. $LII_t$ is the unobserved dynamic factor, i.e. the leading inflation indicator.

The estimated model assumes that $\Sigma$ is a diagonal matrix, that is, the errors associated with each variable in the measurement equation (1) are uncorrelated. In addition, $\varepsilon_t$ is uncorrelated with $\eta_t$.[17]
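The filtering recursion for a one-factor model of this form can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes the parameters are already known (the loadings `lam`, the diagonal measurement-error variances `sigma2`, an autoregressive coefficient `phi` for the factor, and the transition-shock variance `q`), whereas in the paper they are estimated, a step not shown here. All names and numerical values are hypothetical. Because the measurement-error covariance is diagonal and the factor is scalar, the update step can be written in precision form without any matrix inversion.

```python
def kalman_filter_one_factor(Y, lam, sigma2, phi, q):
    """Filtered estimates of a scalar factor from several observed series.

    Y      : list of observation vectors, one per period
    lam    : factor loadings, one per series
    sigma2 : measurement-error variances (the diagonal of the covariance)
    phi    : autoregressive coefficient of the factor, |phi| < 1 (assumed)
    q      : variance of the transition shock
    """
    # Start from the unconditional mean and variance of the AR(1) factor.
    f, P = 0.0, q / (1.0 - phi ** 2)
    # Information about the factor carried by one observation vector.
    data_precision = sum(l * l / s for l, s in zip(lam, sigma2))
    filtered = []
    for y in Y:
        # Prediction step.
        f_pred = phi * f
        P_pred = phi * phi * P + q
        # Update step in precision (information) form.
        P = 1.0 / (1.0 / P_pred + data_precision)
        weighted_obs = sum(l * yi / s for l, yi, s in zip(lam, y, sigma2))
        f = P * (f_pred / P_pred + weighted_obs)
        filtered.append(f)
    return filtered

# Toy check: with almost no measurement noise the filter should track the factor.
lam = [1.0, 0.5, 2.0, -1.0]
factor = [0.5, -0.2, 1.0]
Y = [[l * f for l in lam] for f in factor]
est = kalman_filter_one_factor(Y, lam, sigma2=[1e-8] * 4, phi=0.5, q=1.0)
print([round(e, 4) for e in est])
```

The filtered value in each period is a precision-weighted average of the model's prediction and the four observed series, which is why the resulting indicator is a linear combination of its components.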

Initially, all possible combinations of four variables generated 3876 candidate leading inflation indicators, whose usefulness depends on their predictive power. However, indexes that were highly[18] correlated with one of their four component series were discarded. The intention was to avoid indicators with the same predictive properties as the original series. This procedure reduced the number of potential indicators to 690.

In order to analyze the causal relation between each composite index and inflation, a Granger causality test was performed for each lag from 2 to 12. Indicators that were found not to Granger-cause inflation, using a 5% significance level for all lags, were discarded.

Afterwards, the correlation coefficients between each indicator and inflation led by 0 to 12 months were evaluated. The maximum correlation of each index was used to identify the lead of the indicator and also to rank the indicators. Leading indicators with a maximum correlation below 0.45 were eliminated.

In Table 1, the shaded row marks the lead, in months, at which each indicator's correlation with inflation is highest, that is, the number of months by which it best anticipates inflation (a lead of 6 months for all three indicators).

Table 1: Correlation between Inflation and Dynamic Factor Indicators
Absolute Value of Correlation
Lead / LII_5 / LII_6 / LII_7
0 / 0.1225 / 0.1786 / 0.1068
1 / 0.1446 / 0.1719 / 0.1126
2 / 0.2552 / 0.1934 / 0.2429
3 / 0.0476 / 0.0366 / 0.0675
4 / 0.1591 / 0.2261 / 0.1287
5 / 0.3776 / 0.3735 / 0.3704
6 / 0.5430 / 0.4959 / 0.5321
7 / 0.4606 / 0.4078 / 0.4801
8 / 0.1485 / 0.0795 / 0.1588
9 / 0.0190 / 0.0028 / 0.0122
10 / 0.0318 / 0.0839 / 0.0296
11 / 0.0971 / 0.0511 / 0.1150
12 / 0.0431 / 0.0578 / 0.0598
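The lead-selection rule can be illustrated with the figures reported in Table 1. The short Python sketch below copies the three columns of the table, finds the lead with the highest absolute correlation for each indicator, and applies the 0.45 cutoff stated in the text. The variable and function names are hypothetical.

```python
# Absolute correlations from Table 1, indexed by lead (0 to 12 months).
TABLE_1 = {
    "LII_5": [0.1225, 0.1446, 0.2552, 0.0476, 0.1591, 0.3776, 0.5430,
              0.4606, 0.1485, 0.0190, 0.0318, 0.0971, 0.0431],
    "LII_6": [0.1786, 0.1719, 0.1934, 0.0366, 0.2261, 0.3735, 0.4959,
              0.4078, 0.0795, 0.0028, 0.0839, 0.0511, 0.0578],
    "LII_7": [0.1068, 0.1126, 0.2429, 0.0675, 0.1287, 0.3704, 0.5321,
              0.4801, 0.1588, 0.0122, 0.0296, 0.1150, 0.0598],
}
CUTOFF = 0.45  # minimum acceptable maximum correlation, as stated in the text

def best_lead(abs_corr):
    """Return (lead, correlation) at the maximum of a list indexed by lead."""
    lead = max(range(len(abs_corr)), key=abs_corr.__getitem__)
    return lead, abs_corr[lead]

selected = {}
for name, column in TABLE_1.items():
    lead, value = best_lead(column)
    if value >= CUTOFF:
        selected[name] = (lead, value)

# Rank the surviving indicators by their maximum correlation.
ranking = sorted(selected, key=lambda n: selected[n][1], reverse=True)
print(selected)   # each indicator peaks at a lead of 6 months
print(ranking)    # ['LII_5', 'LII_7', 'LII_6']
```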

In order to avoid a negligible contribution or the predominance of any variable, indicators whose correlation with a component variable was below 0.12 or above 0.85 were discarded.

Finally, a recursive n-step-ahead out-of-sample stability test (ROSST) and a stochastic simulation test (SST) were implemented to select the most stable indicators. The first test consists of successive re-estimations of the index through the Kalman filter, removing one observation from the component variables each time until half of the sample is reached[19]. Each re-estimated indicator should behave similarly to the one generated using the full sample. Figure 1 shows the 32 re-estimated indicators.

Figure 1: Stability Test of LII_7
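The recursive re-estimation loop behind the ROSST can be sketched in Python as follows. This is an illustration under assumptions. The estimator is passed in as a function, and a simple demeaned average stands in for the Kalman filter estimate. Similarity between each re-estimated indicator and the full-sample one is measured by their correlation over the common periods, which is a choice made here and not necessarily the comparison used by the authors. All names and data are hypothetical.

```python
from math import sin, sqrt

def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def rosst(components, estimate):
    """Re-estimate on samples shortened one observation at a time.

    components : list of T observation vectors
    estimate   : function mapping such a list to an indicator of equal length
    Returns (sample_end, correlation with the full-sample indicator) pairs,
    stopping once the sample has been cut to half its length.
    """
    T = len(components)
    full = estimate(components)
    results = []
    for end in range(T - 1, T // 2 - 1, -1):
        shortened = estimate(components[:end])
        results.append((end, corr(shortened, full[:end])))
    return results

def toy_estimate(components):
    """Placeholder estimator (assumption): demeaned cross-sectional average."""
    avg = [sum(row) / len(row) for row in components]
    m = sum(avg) / len(avg)
    return [a - m for a in avg]

T = 20
components = [[sin(0.3 * t + j) + 0.05 * j * t for j in range(4)] for t in range(T)]
results = rosst(components, toy_estimate)
print(len(results), min(c for _, c in results))
```

A stable indicator keeps a correlation close to one across all shortened samples; a sharp drop at some sample length would flag sensitivity to the most recent observations.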

The idea behind the SST is to identify and discard indicators that might behave differently when re-estimated with a new simulated observation. The simulation adds positive and negative shocks, based on the measurement errors, to the last observation of each component. For each indicator, all possible combinations of shocks were implemented, assuming just one level of shock one step ahead. All the procedures described above yielded three leading inflation indicators.

The following figure presents two of these indicators and their relation with inflation phases. The shaded areas represent periods of growing inflation. The indicators were lagged by 6 months in order to compare their movements with inflation.
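The set of simulated observations used in the SST can be enumerated with a short Python sketch. It assumes that one shock size per component is given (for instance the standard deviation of its measurement error, which is an assumption about how the shocks are "based on" those errors) and that each shock enters with either a positive or a negative sign. With four components this gives 16 scenarios. The names and numbers are hypothetical.

```python
from itertools import product

def shock_scenarios(last_obs, shock_sizes):
    """All sign combinations of one-step-ahead shocks added to the last observation.

    last_obs    : last observed value of each component
    shock_sizes : one positive shock magnitude per component (assumed given)
    Returns a list of 2**n simulated next observations.
    """
    scenarios = []
    for signs in product((+1, -1), repeat=len(last_obs)):
        scenarios.append([y + s * d
                          for y, s, d in zip(last_obs, signs, shock_sizes)])
    return scenarios

last_obs = [0.2, -0.1, 0.5, 0.0]        # made-up last observations
shock_sizes = [0.3, 0.1, 0.2, 0.4]      # made-up shock magnitudes
scenarios = shock_scenarios(last_obs, shock_sizes)
print(len(scenarios))   # 16 scenarios for four components
```

Each scenario would then be appended to the sample and the indicator re-estimated, so that indicators whose path changes markedly under any sign pattern can be discarded.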

Figure 2: Comparison between inflation and the Dynamic Factor indicators

II.2 Regression Exercise[20]

As an alternative to the dynamic factor model, a weighted mean approach was implemented to combine the same 19 variables and generate composite indexes.

The literature proposes many different ways to build leading indicators using a weighted mean. The rationale used to define the weights may involve economic judgment or statistical relations. In this work, the second approach was adopted.

Using OLS, inflation was regressed on the lagged leading series as a way to find the best statistical weights. This procedure provides not only an alternative methodology to compose the leading index, but also a way to impose causality between the indicator and inflation.

The 19 pre-selected variables were combined in groups of four, following the rationale mentioned above. This also allows the indicators obtained from this method to be compared with those from the dynamic factor model.

The weights were estimated by the following model:

(3) t = Yt-k + t t ~ N(0, 2)

where Yt represents a 4x1 vector that contains the four economic time series previously selected,  is 1x4 weight vector and t is a scalar error.

The indicator is built using the following equation: [21]

(4) LSt = Yt

where LSt is the leading inflation indicator.
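The two steps above, estimating the weights by regressing inflation on the component series lagged k periods and then applying those weights to the current values of the components, can be sketched in Python. This is an illustration under assumptions: the regression has no intercept, following the equation as printed, and the normal equations are solved with a small Gaussian elimination routine written for this example. The names and the simulated data are hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def ols_weights(Y, infl, k):
    """Regress infl[t] on Y[t-k] (no intercept) and return the weight vector."""
    X = Y[:len(Y) - k]          # rows Y[t-k]
    y = infl[k:]                # matching values infl[t]
    n = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(n)]
    return solve(XtX, Xty)

def indicator(Y, beta):
    """Weighted combination of the components at each date."""
    return [sum(b * v for b, v in zip(beta, row)) for row in Y]

# Simulated data in which inflation is an exact function of Y lagged 7 periods.
rng = random.Random(1)
T, k = 60, 7
Y = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(T)]
true_beta = [0.5, -0.3, 0.2, 0.1]
infl = [0.0] * k + [sum(b * v for b, v in zip(true_beta, Y[t - k]))
                    for t in range(k, T)]
beta = ols_weights(Y, infl, k)
LS = indicator(Y, beta)
print([round(b, 6) for b in beta])
```

Because the weights are fitted on components lagged k periods but applied to their current values, the resulting indicator is available k periods before the inflation movement it is fitted to.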

Similarly to the other exercise, all possible combinations of the variables were estimated. For each combination of variables, 12 different regressions were run (one for each lag k from 1 to 12). This means that model (3) was run 46512 times.
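The number of regressions follows directly from the number of four-variable groups that can be drawn from 19 variables, multiplied by the 12 lags. A short Python check, with hypothetical variable names:

```python
from itertools import combinations
from math import comb

N_VARIABLES = 19
GROUP_SIZE = 4
N_LAGS = 12

# Enumerate the groups explicitly, then compare with the closed-form count.
n_groups = sum(1 for _ in combinations(range(N_VARIABLES), GROUP_SIZE))
n_regressions = n_groups * N_LAGS

print(n_groups == comb(N_VARIABLES, GROUP_SIZE))  # True
print(n_groups)        # 3876
print(n_regressions)   # 46512
```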

For each set of 12 estimations generated from a combination of variables, the best lead relation (k) was chosen based on the maximum correlation between $LS_{t-k}$ and inflation, $\pi_t$. Then, the indexes were grouped by their best lead relation (1 to 12) and ranked within each group according to their correlation with $\pi_t$. Leading indicators with a correlation below 0.55 were eliminated.

In Table 2, the shaded row marks the lead, in months, at which each indicator's correlation with inflation is highest, that is, the number of months by which it best anticipates inflation (a lead of 7 months for all seven indicators).

Table 2: Absolute Value of Correlations between the Regression Indicators and Inflation

Lead / LS_1 / LS_2 / LS_3 / LS_4 / LS_5 / LS_6 / LS_7
0 / 0.1177 / 0.1227 / 0.1112 / 0.0278 / 0.0347 / 0.1349 / 0.0084
1 / 0.1900 / 0.1300 / 0.1425 / 0.2458 / 0.2085 / 0.1489 / 0.2505
2 / 0.3312 / 0.3439 / 0.3337 / 0.4663 / 0.3356 / 0.3333 / 0.4077
3 / 0.2116 / 0.2591 / 0.2325 / 0.3436 / 0.2529 / 0.2319 / 0.3752
4 / 0.1868 / 0.2314 / 0.1360 / 0.4476 / 0.1844 / 0.2126 / 0.5079
5 / 0.2849 / 0.3534 / 0.3054 / 0.5134 / 0.2923 / 0.3198 / 0.4549
6 / 0.4541 / 0.5550 / 0.4483 / 0.4982 / 0.4535 / 0.4822 / 0.4380
7 / 0.5671 / 0.6233 / 0.5931 / 0.6695 / 0.5776 / 0.5750 / 0.5893
8 / 0.3557 / 0.3618 / 0.3526 / 0.4891 / 0.3057 / 0.3657 / 0.2917
9 / 0.2137 / 0.2126 / 0.1269 / 0.2066 / 0.3116 / 0.1745 / 0.1909
10 / 0.1806 / 0.2943 / 0.2560 / 0.2353 / 0.3682 / 0.2451 / 0.3313
11 / 0.2058 / 0.1782 / 0.1801 / 0.1353 / 0.1721 / 0.1766 / 0.1437
12 / 0.2707 / 0.2721 / 0.2678 / 0.2813 / 0.2150 / 0.3075 / 0.1432

It is worth emphasizing that the indexes most correlated with inflation belonged to the groups whose best lead relation lies within the range of 4 to 10 months. This result is consistent with the fact that the selected component variables also showed maximum correlation with inflation within the same range.

Unlike in the dynamic factor exercise, the Granger causality test[22] did not eliminate any of the candidate indicators. This was expected, given that the construction method itself implicitly imposes a causal relation.

In order to avoid a negligible contribution or the predominance of a single variable, indicators with a coefficient three times larger than any of the others were rejected.

In order to analyze the stability of the regression coefficients, a modified recursive least squares test[23] was applied. It differs from the standard test in that it compares the coefficients of each re-estimation with those obtained using the full sample. Based on these results, a measure of total dispersion was created to rank the candidate leading indicators. As in the dynamic factor exercise, ROSST was also applied. Finally, seven leading inflation indicators emerged from the procedures outlined above.

The following figure presents two of these indicators and their relation with inflation phases. The shaded areas represent periods of growing inflation. The indicators were lagged by 7 months in order to compare their movements with inflation.