# Battese and Coelli (1992) Propose a Stochastic Frontier Production Function for (Unbalanced)

**Efficiency Measures for Spanish Service Firms. A Stochastic Frontier Approach.**

Justo de Jorge

Cristina Suárez

Universidad de Alcalá

February 2004

### Abstract

The paper estimates the levels of technical efficiency reached by Spanish service firms over the period 1996-2002. The intangible nature of services makes them more difficult to quantify, so we combine work measurements with the methodology of the stochastic frontier production function for unbalanced panel data, in which the firm effects are an exponential function of time.

We also obtain other relevant technological measurements of these productive processes, such as the scale and technical progress parameters. Our econometric results show great heterogeneity in the firms' efficiency and a predominance of decreasing returns to scale. When the sample is split by sector, the returns-to-scale parameter varies considerably across Spanish service sectors, and the rate of technical progress is also highly heterogeneous among them.

Keywords: stochastic frontier production functions, technical efficiency, unbalanced panel data, Spanish service firms

Address for correspondence:

Justo de Jorge

Facultad de Ciencias Económicas y Empresariales

Universidad de Alcalá

Plaza de la Victoria, 2

28802 Alcalá de Henares, Madrid (SPAIN)

e-mail:

Cristina Suárez

Facultad de Ciencias Económicas y Empresariales

Universidad de Alcalá

Plaza de la Victoria, 2

28802 Alcalá de Henares, Madrid (SPAIN)

e-mail:

**1. Introduction**

How far are the various strands of contemporary economics leading towards a better understanding of services dynamics? Following Tordoir (1987), economists doing research on services can be divided into three camps: those who do not (yet) recognize services as an important field for research and theory development; those who have worked on the subject and are able to encapsulate it within a pre-existing theoretical framework without changing the latter; and, finally, those who adapt theory and concepts, confronting new information on services and the need for new methods of investigation and interpretation of services dynamics. This investigation aims to belong to the last camp.

The concept of a “service” is unusually complex: it brings problems of quantifying output, of value-added, of relative prices, of intangibility, of defining service occupations, of intermediate and final consumption, of the fact that a consumer of a service is also an active agent in its production, that many service transactions are unique and cannot be replicated, and so on. However, it is necessary to proceed with caution: “… such is the diversity of the service sector that it defies applications of a principal theory, a particular analytical method, or a dominant mode of interpretation” (Daniels (1985)).

Spanish service firms are a cause of constant debate and concern for government officials, as well as for society as a whole. This is because the economy has become more service oriented, and in recent decades the service sector has generated a large proportion of employment in Spain. Today, workers employed in the service sector account for approximately 60% of total employment (see Cuadrado Roura et al. (1999) for a survey of this sector).

This paper estimates the levels of technical efficiency reached by Spanish service firms through an econometric estimation of stochastic frontier production functions from an unbalanced panel (between 1996 and 2002) of 4344 Spanish service firms. In addition to these (in)efficiency indicators, other important technological measurements of these productive processes are obtained, such as the scale and technical progress parameters. The most common frontiers in the empirical literature are deterministic frontiers computed by mathematical programming. This paper instead estimates stochastic parametric frontiers using a panel data approach in which the firm effects are an exponential function of time, because this type of frontier has relative advantages over the alternatives.

The paper is organized as follows: in the second section the theoretical framework is developed, in the third the data, the econometric specification and the results of the estimates are dealt with, and at the end, the main conclusions are presented.

**2. Time-varying model for unbalanced panel data**

To measure the (in)efficiency of Spanish service firms, we propose a stochastic frontier production function for unbalanced panel data (Battese and Coelli (1992)). This type of frontier and its computation method have advantages over other alternatives, for example deterministic frontiers[1]. First, deterministic frontiers assume that the only explanation for the deviation between observed output and frontier output is the firm's own inefficiency. This idea is difficult to maintain at the empirical level because it ignores the possibility that observed output can differ from potential output because of two other factors: stochastic shocks and measurement error in the variables.

Second, mathematical programming methods have two disadvantages relative to specifying a statistical relationship between outputs and inputs. On the one hand, the frontier is estimated from a subsample of the data, so these methods are extremely sensitive to outliers. On the other hand, the estimated coefficients lack statistical properties, so it is not possible to make statistical inference or carry out hypothesis tests with them.

The proposed stochastic frontier production function has firm effects that are assumed to be distributed as truncated normal random variables and are also permitted to vary systematically with time. The model may be expressed as:

$$Y_{it} = \beta_0 + \sum_{k} \beta_k X_{kit} + (V_{it} - U_{it}), \qquad i = 1, \dots, N, \quad t = 1, \dots, T, \qquad [1]$$

where $Y_{it}$ denotes (the logarithm of) the production of the i-th firm in the t-th time period; $X_{kit}$ represents the k-th (transformation of the) input quantities; $\beta_k$ stands for the output elasticity with respect to the k-th input; and $V_{it}$ is a random variable which is assumed to be iid $N(0, \sigma_V^2)$ and distributed independently of $U_{it}$, which has the specification:

$$U_{it} = \eta_{it} U_i = U_i \exp(-\eta (t - T_i)), \qquad [2]$$

where $U_i$ is a non-negative random variable which is assumed to account for technical inefficiency in production and to be iid as truncations at zero of the $N(\mu, \sigma_U^2)$ distribution, and $\eta$ is a parameter to be estimated.

The last period ($t = T_i$) for firm i contains the base level of inefficiency for that firm ($U_{it} = U_i$). If $\eta > 0$, the level of inefficiency decays toward the base level. If $\eta < 0$, the level of inefficiency increases to the base level, and if $\eta = 0$, the level of inefficiency remains constant.
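The three cases for the sign of the time parameter in equation [2] can be illustrated with a minimal Python sketch. The function names, the base level of 1.0 and the observed periods are hypothetical choices made for illustration; only the functional form $U_i \exp(-\eta (t - T_i))$ comes from the text.

```python
import math


def inefficiency_multiplier(t: int, last_period: int, eta: float) -> float:
    """Factor exp(-eta * (t - T_i)) that scales the base inefficiency U_i."""
    return math.exp(-eta * (t - last_period))


def inefficiency_path(u_base: float, periods: list, eta: float) -> list:
    """U_it for each observed period; the last observed period plays the role of T_i."""
    last_period = max(periods)
    return [u_base * inefficiency_multiplier(t, last_period, eta) for t in periods]


if __name__ == "__main__":
    # Hypothetical firm observed in 4 of the 7 sample years (unbalanced panel).
    periods = [1, 2, 5, 7]
    for eta in (0.05, 0.0, -0.05):
        path = inefficiency_path(u_base=1.0, periods=periods, eta=eta)
        print(f"eta = {eta:+.2f}:", [round(u, 3) for u in path])
```

With a positive parameter the path starts above the base level and falls toward it; with a negative parameter it starts below and rises to it; with zero it is flat. Gaps in the observed periods cause no difficulty because the multiplier depends only on the distance to the last observed period.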

We utilize the parametrization of Battese and Corra (1977), who replace $\sigma_V^2$ and $\sigma_U^2$ with $\sigma^2 = \sigma_V^2 + \sigma_U^2$ and $\gamma = \sigma_U^2 / (\sigma_V^2 + \sigma_U^2)$. The parameter $\gamma$ must lie between 0 and 1.

The imposition of one or more restrictions upon this model formulation can provide a number of special cases of this particular model, which have appeared in the literature. For example, setting $\eta$ to be zero provides the time-invariant model. One can also test whether any form of stochastic frontier production function is required at all by testing the significance of the $\gamma$ parameter. If the null hypothesis that $\gamma$ equals zero is accepted, this would indicate that $\sigma_U^2$ is zero and hence that the $U_{it}$ term should be removed from the model, leaving a specification with parameters that can be consistently estimated using ordinary least squares.
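To make the Battese and Corra (1977) parametrization concrete, the sketch below maps the total variance and the share parameter back to the two underlying variance parameters. It writes `sigma_v2` for the noise variance and `sigma_u2` for the variance parameter of the inefficiency distribution before truncation; the subscript U, the function name and the numerical values are labelling assumptions made here for illustration.

```python
def split_variance(sigma2: float, gamma: float) -> tuple:
    """Recover (sigma_v2, sigma_u2) from sigma2 = sigma_v2 + sigma_u2
    and gamma = sigma_u2 / sigma2."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie between 0 and 1")
    sigma_u2 = gamma * sigma2
    sigma_v2 = (1.0 - gamma) * sigma2
    return sigma_v2, sigma_u2


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the mapping.
    sigma_v2, sigma_u2 = split_variance(sigma2=0.5, gamma=0.7)
    print("noise variance:", sigma_v2)
    print("inefficiency variance parameter:", sigma_u2)
    # gamma = 0 removes the inefficiency term, as in the average response function.
    print(split_variance(sigma2=0.5, gamma=0.0))
```

The closer the share parameter is to one, the larger the part of the composed error attributed to the inefficiency term rather than to statistical noise.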

The predictions of individual firm technical efficiencies from the estimated stochastic production frontiers are defined as:

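$$EF_{it} = \exp(-U_{it}) = E[\exp(-U_{it}) \mid E_i] = \left\{ \frac{1 - \Phi\left(\eta_{it}\sigma_i^{*} - \mu_i^{*}/\sigma_i^{*}\right)}{1 - \Phi\left(-\mu_i^{*}/\sigma_i^{*}\right)} \right\} \exp\left(-\eta_{it}\mu_i^{*} + \tfrac{1}{2}\eta_{it}^{2}\sigma_i^{*2}\right) \qquad [3]$$

where $E_i$ represents the ($T_i \times 1$) vector of $E_{it}$'s associated with the time periods observed for the i-th firm, where $E_{it} = V_{it} - U_{it}$;

$$\mu_i^{*} = \frac{\mu\sigma_V^2 - \eta_i' E_i \sigma_U^2}{\sigma_V^2 + \eta_i'\eta_i \sigma_U^2} \qquad [4]$$

$$\sigma_i^{*2} = \frac{\sigma_V^2 \sigma_U^2}{\sigma_V^2 + \eta_i'\eta_i \sigma_U^2} \qquad [5]$$

where $\eta_i$ represents the ($T_i \times 1$) vector of $\eta_{it}$'s associated with the time periods observed for the i-th firm, and $\Phi(\cdot)$ represents the distribution function for the standard normal random variable. If the firm effects are time invariant, then the technical efficiency is obtained by setting $\eta_{it} = 1$ and $\eta = 0$.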
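As an illustration of how such predictions can be computed for one firm, the following Python sketch implements the conditional-expectation predictor $E[\exp(-U_{it}) \mid E_i]$ in the form given by Battese and Coelli (1992). The function names, the argument names (`sigma_v2` for the noise variance, `sigma_u2` for the variance parameter of the inefficiency distribution) and all numerical inputs are assumptions made for illustration; in practice the residuals and parameters would come from the maximum-likelihood estimates.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Distribution function of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def predict_efficiency(residuals, periods, mu, eta, sigma_v2, sigma_u2):
    """Return E[exp(-U_it) | E_i] for every observed period of one firm.

    residuals: composed errors E_it = V_it - U_it, one per observed period
    periods:   observed time indices t (the largest one is taken as T_i)
    """
    last_period = max(periods)
    eta_it = [math.exp(-eta * (t - last_period)) for t in periods]

    eta_dot_e = sum(h * e for h, e in zip(eta_it, residuals))  # eta_i' E_i
    eta_dot_eta = sum(h * h for h in eta_it)                    # eta_i' eta_i
    denom = sigma_v2 + eta_dot_eta * sigma_u2

    mu_star = (mu * sigma_v2 - eta_dot_e * sigma_u2) / denom    # equation [4]
    sigma_star = math.sqrt(sigma_v2 * sigma_u2 / denom)         # root of equation [5]

    tail_base = 1.0 - std_normal_cdf(-mu_star / sigma_star)
    efficiencies = []
    for h in eta_it:
        tail = 1.0 - std_normal_cdf(h * sigma_star - mu_star / sigma_star)
        scale = math.exp(-h * mu_star + 0.5 * h * h * sigma_star ** 2)
        efficiencies.append(tail / tail_base * scale)           # equation [3]
    return efficiencies


if __name__ == "__main__":
    # Hypothetical firm with three observed periods and made-up parameters.
    te = predict_efficiency(
        residuals=[-0.5, -0.4, -0.6], periods=[1, 2, 3],
        mu=0.0, eta=0.1, sigma_v2=0.1, sigma_u2=0.2,
    )
    print([round(x, 3) for x in te])
```

Setting `eta=0.0` makes every multiplier equal to one, so the function returns the same value for all periods, which is the time-invariant case described above. The sketch uses only the standard library; for extreme arguments a more careful treatment of the normal tail probabilities would be needed.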

**3. Empirical specification and results**

The SABE (*Sistema de Análisis de Balances Españoles*) provides the data needed to estimate an efficiency measure. It is an annual survey that follows a panel of firms representative of Spanish service firms and contains balance sheet, cash flow and qualitative data. The database is an unbalanced panel observed over the period 1996-2002. SABE also provides the major two-digit NACE codes to which the firms belong. Using these codes, we subdivide the data into four groups. Table 1 shows the sectorial division of the service sector analyzed in this paper, together with the number of firms and the number of observations in each subsample.

For each sector a Cobb-Douglas production frontier is estimated. Although the Cobb-Douglas production function imposes considerable restrictions, it has certain clear advantages; for example, the estimation requires few independent variables.

## Table 1: Sector classification, number of firms and number of observations

| Sector classification | CNAE 1993 | Num. of firms | Num. of observations |
|---|---|---|---|
| Sale, maintenance & repair of motor vehicles; retail sale of fuel | 50 | 549 | 2371 |
| Wholesale trade & commission trade, except of motor vehicles & motorcycles | 51 | 2393 | 10381 |
| Retail trade, except of motor vehicles & motorcycles; repair of personal & household goods | 52 | 1050 | 4056 |
| Hotels and restaurants | 55 | 352 | 1338 |

Output is measured by yearly value added (VA), defined as sales less cost of goods plus inventories, and is converted into real terms[2]. Labor (N) is measured as the number of employees. In this type of study the standard practice is to define labor in terms of hours worked, but this information is not available. Capital (K) is defined as the market value of the assets owned by the firms, in constant prices. Table 2 presents summary statistics of the data used in this study.

Taking logarithms of this Cobb-Douglas production function we have:

$$\log(VA_{it}) = \beta_0 + \beta_1 \log(N_{it}) + \beta_2 \log(K_{it}) + \beta_3 t + (V_{it} - U_{it}) \qquad [6]$$

The coefficients $\beta_1$ and $\beta_2$ are the output elasticities of the inputs, and their sum gives the elasticity of scale, which indicates the returns to scale. The variable t is added to measure Hicks-neutral technical change, which is common to firms in the same sector. $V_{it}$ and $U_{it}$ are the random variables whose distributional properties are defined in the last section.

**Table 2: Summary statistics** (a)

| Sector | Variable | Mean | Std. dev. | Min. | Max. |
|---|---|---|---|---|---|
| 50 | VA | 780.41 | 4139.26 | 1.37 | 106016.3 |
| | N | 20.98 | 42.12 | 1 | 634 |
| | K | 456.22 | 1522.29 | 0.79 | 21888.4 |
| 51 | VA | 559.37 | 2519.35 | 0.69 | 68069.03 |
| | N | 18.42 | 73.80 | 1 | 2324 |
| | K | 427.16 | 3271.02 | 0.79 | 98998.11 |
| 52 | VA | 688.53 | 6126.89 | 0.74 | 186628.1 |
| | N | 30.41 | 203.96 | 1 | 6506 |
| | K | 1059.35 | 9531.63 | 0.79 | 187612.2 |
| 55 | VA | 1188.61 | 6666.18 | 1.43 | 111267.8 |
| | N | 90.59 | 812.43 | 1 | 18110 |
| | K | 1460.23 | 6100.47 | 0.82 | 108213 |

Notes: (a) Output (VA) and capital (K) are in thousands of euros.

The stochastic frontier model defined by equation [6] contains four $\beta$ parameters and four additional parameters associated with the distributions of the $V_{it}$ and $U_{it}$ random variables, and we estimate them by maximum likelihood.

The estimation process is as follows. Model A is the stochastic frontier production function in which the firm effects, $U_{it}$, have the time-varying structure defined in the last section. Model B is the case in which the $U_i$'s have a half-normal distribution (it assumes that $\mu = 0$). Model C is the time-invariant model (it assumes that $\eta = 0$). Model D is the time-invariant model in which the $U_i$'s have a half-normal distribution (it assumes that $\mu = \eta = 0$). Finally, Model E is the average response function, in which firms are assumed to be fully technically efficient and the firm effects are absent from the model (it assumes that $\gamma = \mu = \eta = 0$).

Table 3 displays the estimated coefficients. Below the name of each sector, in brackets, is the name of the best model for that estimation, based on the generalized likelihood-ratio statistic. To determine the most suitable model, we conducted several hypothesis tests of restrictions on the parameters of the production structure. These generalized likelihood-ratio statistics, along with the decisions, are reported in the table.

As can be observed, the estimation results are satisfactory. The labor and capital inputs are significant across all the estimations. The null hypothesis $H_0$: $\gamma = \mu = \eta = 0$ is rejected in all the sectors, so the traditional average production function is not an adequate representation of the sample. Furthermore, the hypothesis that the half-normal distribution is an adequate representation of the distribution of the firm effects is also rejected ($H_0$: $\mu = \eta = 0$ and $H_0$: $\mu = 0$). However, the hypothesis that the firm effects are time invariant is not rejected for three of the four sectors (sectors 50, 51 and 55). Given that the time-invariant model is appropriate for the firm effects in those sectors, the half-normal hypothesis is tested again within that model, and it is also rejected. For sector 52, Model A is not rejected against any of the restricted models ($H_0$: $\mu = \eta = 0$, $H_0$: $\mu = 0$ and $H_0$: $\eta = 0$ are all rejected), so the time-varying model for the firm effects is adequate.
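The generalized likelihood-ratio statistic behind these decisions is $-2[\ln L(H_0) - \ln L(H_1)]$. As a minimal sketch, the Python snippet below recomputes the statistic for the time-invariance hypothesis in each sector from the Model A and Model C log-likelihood values reported in Table 3. The 10% critical value of a chi-square with one degree of freedom (2.706) is an assumption of this sketch, since the text does not state the significance level used; the function and variable names are likewise illustrative.

```python
# Log-likelihoods copied from Table 3: Model A (unrestricted) and Model C (eta = 0).
LOGLIK = {
    "50": {"A": -1257.241, "C": -1257.382},
    "51": {"A": -7440.042, "C": -7441.299},
    "52": {"A": -2937.425, "C": -2939.065},
    "55": {"A": -895.660, "C": -896.900},
}

# Assumption: 10% critical value of a chi-square with one degree of freedom.
CHI2_10PCT_1DF = 2.706


def lr_statistic(loglik_restricted: float, loglik_unrestricted: float) -> float:
    """Generalized likelihood-ratio statistic, -2 [ln L(H0) - ln L(H1)]."""
    return -2.0 * (loglik_restricted - loglik_unrestricted)


def eta_zero_tests() -> dict:
    """LR statistic and decision for H0: eta = 0 (Model C against Model A)."""
    results = {}
    for sector, ll in LOGLIK.items():
        stat = lr_statistic(ll["C"], ll["A"])
        decision = "Reject H0" if stat > CHI2_10PCT_1DF else "Accept H0"
        results[sector] = (stat, decision)
    return results


if __name__ == "__main__":
    for sector, (stat, decision) in eta_zero_tests().items():
        print(f"Sector {sector}: LR = {stat:.2f} -> {decision}")
```

Under this assumed critical value, only sector 52 rejects time invariance, which agrees with the decisions reported in Table 3. The same function reproduces the other statistics in the table when given the corresponding pair of log-likelihoods, with the degrees of freedom equal to the number of restrictions.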


**Table 3: Estimations and tests of hypothesis for four sectors (period 1996-2002)**

SECTOR 50 (Model C) / Maximum-likelihood estimates

| Variable | Model A | Model B | Model C | Model D | Model E |
|---|---|---|---|---|---|
| c | 2.597 | 26.149 | -18.244 | -6.997 | 5.621 |
| | (41.256) | (19.818) | (64.554) | (8.993) | (10.097) |
| n | 0.817 | 0.867 | 0.817 | 0.867 | 0.932 |
| | (0.021)** | (0.018)** | (0.021)** | (0.018)** | (0.015)** |
| k | 0.113 | 0.154 | 0.114 | 0.156 | 0.138 |
| | (0.010)** | (0.010)** | (0.010)** | (0.010)** | (0.009)** |
| t | 0.002 | -0.011 | 0.012 | 0.005 | -0.001 |
| | (0.020) | (0.009) | (0.004)** | (0.004) | (0.005) |
| $\mu$ | 3.176 | - | 3.497 | - | - |
| | (1.912)* | | (64.043) | | |
| $\eta$ | 0.003 | 0.024 | - | - | - |
| | (0.007) | (0.013)* | | | |
| $\sigma^2 = \sigma_V^2 + \sigma_U^2$ | 0.323 | 0.844 | 0.325 | 0.903 | 0.370 |
| | (0.018)** | (0.083)** | (0.017)** | (0.082)** | (0.011)** |
| $\gamma$ | 0.691 | 0.845 | 0.693 | 0.854 | - |
| | (0.019)** | (0.018)** | (0.019)** | (0.016)** | |
| Log (likelihood) | -1257.241 | -1499.962 | -1257.382 | -1501.762 | -1586.506 |

Test of hypothesis for parameters

| Null hypothesis | Assumptions | $\chi^2$-statistics | Decision |
|---|---|---|---|
| $\gamma = \mu = \eta = 0$ | Model A | 658.53 | Reject H0 |
| $\mu = \eta = 0$ | Model A | 489.04 | Reject H0 |
| $\mu = 0$ | Model A | 485.44 | Reject H0 |
| $\eta = 0$ | Model A | 0.28 | Accept H0 |
| $\gamma = \mu = 0$ | Model C ($\eta = 0$) | 658.25 | Reject H0 |
| $\mu = 0$ | Model C ($\eta = 0$) | 488.76 | Reject H0 |

SECTOR 51 (Model C) / Maximum-likelihood estimates

| Variable | Model A | Model B | Model C | Model D | Model E |
|---|---|---|---|---|---|
| c | -9.278 | 11.389 | -36.534 | -27.551 | 5.496 |
| | (17.678) | (9.766) | (6.820)** | (4.733)** | (5.802) |
| n | 0.596 | 0.643 | 0.597 | 0.647 | 0.756 |
| | (0.010)** | (0.009)** | (0.010)** | (0.009)** | (0.008)** |
| k | 0.137 | 0.149 | 0.137 | 0.151 | 0.162 |
| | (0.005)** | (0.005)** | (0.005)** | (0.005)** | (0.005)** |
| t | 0.008 | -0.003 | 0.022 | 0.016 | -0.001 |
| | (0.009) | (0.005) | (0.002)** | (0.002)** | (0.003) |
| $\mu$ | 3.100 | - | 3.444 | - | - |
| | (0.357)** | | (5.110) | | |
| $\eta$ | 0.005 | 0.019 | - | - | - |
| | (0.003) | (0.004)** | | | |
| $\sigma^2 = \sigma_V^2 + \sigma_U^2$ | 0.529 | 1.756 | 0.534 | 1.863 | 0.558 |
| | (0.014)** | (0.072)** | (0.014)** | (0.072)** | (0.008) |
| $\gamma$ | 0.738 | 0.910 | 0.740 | 0.915 | - |
| | (0.008)** | (0.004)** | (0.008)** | (0.004)** | |
| Log (likelihood) | -7440.042 | -8135.787 | -7441.299 | -8146.283 | -9102.874 |

Test of hypothesis for parameters

| Null hypothesis | Assumptions | $\chi^2$-statistics | Decision |
|---|---|---|---|
| $\gamma = \mu = \eta = 0$ | Model A | 3325.66 | Reject H0 |
| $\mu = \eta = 0$ | Model A | 1412.48 | Reject H0 |
| $\mu = 0$ | Model A | 1391.49 | Reject H0 |
| $\eta = 0$ | Model A | 2.51 | Accept H0 |
| $\gamma = \mu = 0$ | Model C ($\eta = 0$) | 1409.97 | Reject H0 |
| $\mu = 0$ | Model C ($\eta = 0$) | 3323.29 | Reject H0 |


SECTOR 52 (Model A) / Maximum-likelihood estimates

| Variable | Model A | Model B | Model C | Model D | Model E |
|---|---|---|---|---|---|
| c | -10.606 | 6.972 | -58.758 | -45.307 | -31.008 |
| | (27.340) | (16.180) | (28.011)** | (8.327)** | (9.119)** |
| n | 0.650 | 0.721 | 0.653 | 0.726 | 0.756 |
| | (0.015)** | (0.013)** | (0.015)** | (0.013)** | (0.012)** |
| k | 0.157 | 0.163 | 0.158 | 0.166 | 0.162 |
| | (0.008)** | (0.008)** | (0.008)** | (0.008)** | (0.007)** |
| t | 0.008 | -0.002 | 0.032 | 0.024 | 0.017 |
| | (0.014) | (0.008) | (0.004)** | (0.004)** | (0.004)** |
| $\mu$ | 2.287 | - | 2.939 | - | - |
| | (0.301)** | | (26.882) | | |
| $\eta$ | 0.011 | 0.039 | - | - | - |
| | (0.006)* | (0.010)** | | | |
| $\sigma^2 = \sigma_V^2 + \sigma_U^2$ | 0.393 | 0.890 | 0.401 | 0.980 | 0.492 |
| | (0.014)** | (0.055)** | (0.014)** | (0.055)** | (0.011)** |
| $\gamma$ | 0.610 | 0.797 | 0.615 | 0.814 | - |
| | (0.016)** | (0.014)** | (0.016)** | (0.012)** | |
| Log (likelihood) | -2937.425 | -3166.700 | -2939.065 | -3173.945 | -3307.078 |

Test of hypothesis for parameters

| Null hypothesis | Assumptions | $\chi^2$-statistics | Decision |
|---|---|---|---|
| $\gamma = \mu = \eta = 0$ | Model A | 739.31 | Reject H0 |
| $\mu = \eta = 0$ | Model A | 473.04 | Reject H0 |
| $\mu = 0$ | Model A | 458.55 | Reject H0 |
| $\eta = 0$ | Model A | 3.28 | Reject H0 |

SECTOR 55 (Model C) / Maximum-likelihood estimates

| Variable | Model A | Model B | Model C | Model D | Model E |
|---|---|---|---|---|---|
| c | 106.730 | 54.940 | -49.716 | -35.956 | 3.147 |
| | (117.343) | (28.602)* | (31.434) | (14.092)** | (15.993) |
| n | 0.598 | 0.541 | 0.599 | 0.548 | 0.641 |
| | (0.023)** | (0.017)** | (0.023)** | (0.017)** | (0.016)** |
| k | 0.126 | 0.213 | 0.126 | 0.212 | 0.212 |
| | (0.015)** | (0.012)** | (0.015)** | (0.011)** | (0.010)** |
| t | -0.039 | -0.025 | 0.028 | 0.020 | 4.8e-08 |
| | (0.044) | (0.014)* | (0.006)** | (0.007)** | (0.008) |
| $\mu$ | 24.921 | - | 3.359 | - | - |
| | (60.333) | | (28.551) | | |
| $\eta$ | 0.003 | 0.064 | - | - | - |
| | (0.006) | (0.016)** | | | |
| $\sigma^2 = \sigma_V^2 + \sigma_U^2$ | 0.422 | 1.014 | 0.425 | 1.168 | -0.768 |
| | (0.030)** | (0.113)** | (0.030)** | (0.122)** | (0.039)** |
| $\gamma$ | 0.703 | 0.854 | 0.704 | 0.870 | - |
| | (0.026)** | (0.019)** | (0.026)** | (0.017)** | |
| Log (likelihood) | -895.660 | -979.602 | -896.900 | -986.535 | -1051.406 |

Test of hypothesis for parameters

| Null hypothesis | Assumptions | $\chi^2$-statistics | Decision |
|---|---|---|---|
| $\gamma = \mu = \eta = 0$ | Model A | 311.49 | Reject H0 |
| $\mu = \eta = 0$ | Model A | 181.75 | Reject H0 |
| $\mu = 0$ | Model A | 167.89 | Reject H0 |
| $\eta = 0$ | Model A | 2.48 | Accept H0 |
| $\gamma = \mu = 0$ | Model C ($\eta = 0$) | 309.01 | Reject H0 |
| $\mu = 0$ | Model C ($\eta = 0$) | 179.27 | Reject H0 |

Notes: The estimated standard errors for the parameter estimators are presented below the corresponding estimates. *, ** indicate significance at the 10% and 5% levels, respectively.


Table 4 reports the elasticity of scale, the rate of technical progress and the efficiency measure by sector. The sum of the input elasticities is below one in every sector, which points to decreasing returns to scale, with a scale parameter ranging from 0.93 to 0.72 (Sale, maintenance & repair of motor vehicles; retail sale of fuel and Hotels and restaurants, respectively).

The rate of technical progress, in turn, reveals great heterogeneity among sectors. It ranges from a rate not significantly different from zero in sector 52 (Retail trade, except of motor vehicles & motorcycles; repair of personal & household goods) to 2.8% for Hotels and restaurants.

## Table 4: Relevant parameters

| Sector classification | Elasticity of scale | Rate of technical progress (%) | Efficiency measure: Mean | Efficiency measure: Min. value | Efficiency measure: Max. value |
|---|---|---|---|---|---|
| 50 | 0.931 | 1.2 | 0.034 | 0.005 | 0.432 |
| 51 | 0.734 | 2.2 | 0.039 | 0.001 | 0.572 |
| 52 | 0.807 | 0.8b | 0.115c | 0.016c | 0.787c |
| 55 | 0.725 | 2.8 | 0.041 | 0.003 | 0.462 |

Notes: a Not significantly different from 1. b Not significantly different from 0. c Base level of inefficiency (t=Ti).
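The first two columns of Table 4 follow directly from the preferred model of each sector in Table 3: the elasticity of scale is the sum of the labor and capital coefficients, and the rate of technical progress is the coefficient on t expressed as a percentage. The short Python sketch below reproduces that arithmetic; the dictionary layout and function names are illustrative assumptions, while the coefficients are copied from Table 3.

```python
# Coefficients of the preferred model in each sector, copied from Table 3.
BEST_MODEL = {
    "50": {"model": "C", "n": 0.817, "k": 0.114, "t": 0.012},
    "51": {"model": "C", "n": 0.597, "k": 0.137, "t": 0.022},
    "52": {"model": "A", "n": 0.650, "k": 0.157, "t": 0.008},
    "55": {"model": "C", "n": 0.599, "k": 0.126, "t": 0.028},
}


def scale_elasticity(coef: dict) -> float:
    """Sum of the output elasticities of labor and capital."""
    return coef["n"] + coef["k"]


def technical_progress_pct(coef: dict) -> float:
    """Coefficient on the time trend, expressed in percent per year."""
    return 100.0 * coef["t"]


if __name__ == "__main__":
    for sector, coef in BEST_MODEL.items():
        print(
            f"Sector {sector} (Model {coef['model']}): "
            f"scale = {scale_elasticity(coef):.3f}, "
            f"technical progress = {technical_progress_pct(coef):.1f}%"
        )
```

A scale elasticity below one means that doubling both inputs less than doubles value added. Whether the difference from one is statistically significant requires the covariance between the two coefficient estimates, which is not reported here, so the sketch does not attempt that test.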

The mean of the efficiency indicator is 0.058 and the standard deviation is 0.036 for the full sample (4344 Spanish service firms); these results confirm the heterogeneity among service firms in Spain. As can be seen from the information presented, and as expected, there is also great heterogeneity in the mean level and in the dispersion of the inefficiency at the sector level. The mean value of the inefficiency indicator varies from 11.7% for the Retail trade, except of motor vehicles & motorcycles; repair of personal & household goods sector to 3.4% for the Sale, maintenance & repair of motor vehicles; retail sale of fuel sector. The histograms (presented in Figure 1) display the dispersion of the efficiency indicator and a great divergence across sectors; the histogram for the full sample (4344 firms) is also presented.

The technical efficiency for sectors 50, 51 and 55 is obtained by setting $\eta_{it} = 1$ and $\eta = 0$ in equation [3], because the hypothesis of time-invariant firm effects is accepted for these sectors. The technical efficiency for the Retail trade, except of motor vehicles & motorcycles; repair of personal & household goods sector varies from 11.5% in 1996 to 11.7% in 2002 because the estimate of the parameter $\eta$ is positive ($\hat{\eta} = 0.011$), so efficiency increases over time according to the exponential model defined by equation [2].

Table 5 presents a descriptive approach to the efficiency measure of the service firms. We distinguish between firms according to other variables, such as firm age and export activity, and we also present an indicator of the efficiency difference according to these characteristics. Firm age is computed as the difference between the calendar year at t and the birth year reported by the firm, and export activity distinguishes exporters from non-exporters.

Figure 1. Histogram and descriptive statistics of efficiency indicator

Panels: Sector 50 (Sale, maintenance & repair of motor vehicles; retail sale of fuel); Sector 51 (Wholesale trade & commission trade, except of motor vehicles & motorcycles); Sector 52, t=Ti (Retail trade, exc. of motor vehicles & motorcycles; repair of personal & household goods); Sector 55 (Hotels and restaurants); Full sample.

Our results confirm the empirical findings of previous papers that use micro panel data to measure the relationship between efficiency and indicators such as age and export activity. Efficiency differentials between younger and older firms are substantial and significant among Spanish service firms: younger firms have a lower efficiency measure than older ones. The efficiency measure of the oldest firms exceeds that of the youngest by 4.4 percentage points in Retail trade, exc. of motor vehicles & motorcycles; repair of personal & household goods, and by 2.1 percentage points in Hotels and restaurants.

**Table 5: Efficiency measure according to firm age and export activity**

Firm age (years)

| Sector | Less than 5 | | 6-15 | | More than 15 |
|---|---|---|---|---|---|
| 50 | 0.026 | ** | 0.033 | ** | 0.041 |
| 51 | 0.033 | ** | 0.038 | ** | 0.048 |
| 52 | 0.102 | ** | 0.113 | ** | 0.146 |
| 55 | 0.031 | ** | 0.039 | ** | 0.052 |

Export Activity

| Sector | Exporters | | Non-exporters |
|---|---|---|---|
| 50 | 0.053 | ** | 0.033 |
| 51 | 0.054 | ** | 0.035 |
| 52 | 0.194 | ** | 0.109 |
| 55 | 0.057 | ** | 0.040 |

Note: ** indicates significance at 5% (test of the null hypothesis that the means of the efficiency measures are equal between groups).
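The efficiency differentials quoted in the text can be recovered from Table 5 as simple differences between group means. The Python sketch below does this for the age and export groups; the data structures and function name are illustrative assumptions, and the figures are copied from Table 5.

```python
# Mean efficiency by firm age group (less than 5, 6-15, more than 15 years), from Table 5.
BY_AGE = {
    "50": (0.026, 0.033, 0.041),
    "51": (0.033, 0.038, 0.048),
    "52": (0.102, 0.113, 0.146),
    "55": (0.031, 0.039, 0.052),
}

# Mean efficiency of exporters and non-exporters, from Table 5.
BY_EXPORT = {
    "50": (0.053, 0.033),
    "51": (0.054, 0.035),
    "52": (0.194, 0.109),
    "55": (0.057, 0.040),
}


def gap_in_points(high: float, low: float) -> float:
    """Difference between two mean efficiency measures, in percentage points."""
    return 100.0 * (high - low)


if __name__ == "__main__":
    for sector in BY_AGE:
        youngest, _, oldest = BY_AGE[sector]
        exporters, non_exporters = BY_EXPORT[sector]
        print(
            f"Sector {sector}: oldest vs youngest = "
            f"{gap_in_points(oldest, youngest):.1f} points, "
            f"exporters vs non-exporters = "
            f"{gap_in_points(exporters, non_exporters):.1f} points"
        )
```

These are descriptive gaps between group means only; the significance markers in Table 5 come from the authors' tests of equal means and are not recomputed here.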