homework #6

15.6 (i) Plugging (15.26) into (15.22) and rearranging gives

y1 = b0 + b1(pi0 + pi1z1 + pi2z2 + v2) + b2z1 + u1

= (b0 + b1pi0) + (b1pi1 + b2)z1 + b1pi2z2 + u1 + b1v2,

and so a0 = b0 + b1pi0, a1 = b1pi1 + b2, and a2 = b1pi2.

(ii) From the equation in part (i), v1 = u1 + b1v2.

(iii) By assumption, u1 has zero mean and is uncorrelated with z1 and z2, and v2 has these properties by definition. So v1 has zero mean and is uncorrelated with z1 and z2, which means that OLS consistently estimates the aj. [OLS would be unbiased only if we add the stronger assumptions E(u1|z1,z2) = E(v2|z1,z2) = 0.]
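
As a rough numerical check on parts (i) to (iii), the sketch below simulates the structural equation and the reduced form for y2, then compares OLS estimates of the reduced form for y1 with the formulas for a0, a1, and a2. The parameter values, the sample size, the normal errors, and the helper ols are all assumptions chosen for illustration; none of them come from the problem.

```python
import random

def ols(rows, y):
    """OLS coefficients from the normal equations (X'X)a = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda idx: abs(m[idx][col]))
        m[col], m[piv] = m[piv], m[col]
        for row in range(k):
            if row != col:
                f = m[row][col] / m[col][col]
                m[row] = [a - f * p for a, p in zip(m[row], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]

random.seed(0)
b0, b1, b2 = 1.0, 0.5, -2.0      # hypothetical structural parameters
pi0, pi1, pi2 = 0.3, 1.5, 0.8    # hypothetical reduced-form parameters for y2
n = 50000

rows, y1 = [], []
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    v2 = random.gauss(0, 1)
    u1 = 0.6 * v2 + random.gauss(0, 1)   # u1 correlated with v2, so y2 is endogenous
    y2 = pi0 + pi1 * z1 + pi2 * z2 + v2
    y1.append(b0 + b1 * y2 + b2 * z1 + u1)
    rows.append([1.0, z1, z2])           # regress y1 on the exogenous variables only

a_hat = ols(rows, y1)
a_true = [b0 + b1 * pi0, b1 * pi1 + b2, b1 * pi2]   # formulas from part (i)
print("OLS reduced form:", [round(a, 3) for a in a_hat])
print("a0, a1, a2      :", a_true)
```

Even though u1 and v2 are correlated in this setup, the reduced-form error v1 = u1 + b1v2 is uncorrelated with z1 and z2, so the two printed lines should agree closely.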

15.7 (i) Even at a given income level, some students are more motivated and more able than others, and their families are more supportive (say, in terms of providing transportation) and enthusiastic about education. Therefore, there is likely to be a self-selection problem: students who would do better anyway were also more likely to attend a choice school.

(ii) Assuming we have the functional form for faminc correct, the answer is yes. Since u1 does not contain income, random assignment of grants within income class means that grant designation is not correlated with unobservables such as student ability, motivation, and family support.

(iii) The reduced form is

choice = pi0 + pi1faminc + pi2grant + v2,

and we need pi2 not equal to 0. In other words, after accounting for income, the grant amount must have some effect on choice. This seems reasonable, provided the grant amounts differ within each income class.

(iv) The reduced form for score is just a linear function of the exogenous variables (see Problem 15.6):

score = alpha0 + alpha1faminc + alpha2grant + v1.

This equation allows us to directly estimate the effect of increasing the grant amount on the test score, holding family income fixed. From a policy perspective this is itself of some interest.
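
The argument in parts (i) to (iv) can be illustrated with a small simulation. This is only a sketch under strong assumptions: it looks at a single income class (so faminc drops out), the effect of choice is the same for every student, and all numbers (the true effect b1 = 5, the three grant levels, the role of the unobserved variable ability) are hypothetical. With one instrument and one endogenous regressor, the IV estimate equals the ratio of the two reduced-form slopes, alpha2/pi2.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

random.seed(3)
b1 = 5.0          # hypothetical true effect of attending a choice school
n = 100000
grant, choice, score = [], [], []
for _ in range(n):
    ability = random.gauss(0, 1)            # unobserved; part of u1
    g = random.choice([0.0, 1.0, 2.0])      # grant amount, randomly assigned
    # Self-selection: abler students are more likely to attend a choice school.
    c = 1.0 if 0.8 * g + ability + random.gauss(0, 1) > 1.0 else 0.0
    s = 50.0 + b1 * c + 4.0 * ability + random.gauss(0, 2)
    grant.append(g)
    choice.append(c)
    score.append(s)

pi2_hat = cov(grant, choice) / cov(grant, grant)     # reduced form for choice
alpha2_hat = cov(grant, score) / cov(grant, grant)   # reduced form for score
b1_ols = cov(choice, score) / cov(choice, choice)    # biased by self-selection
b1_iv = alpha2_hat / pi2_hat                         # IV estimate using grant

print("OLS:", round(b1_ols, 2), " IV:", round(b1_iv, 2), " true:", b1)
```

In this setup OLS overstates the effect of choice because ability raises both attendance and scores, while the IV estimate should land near the true value.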

15.11 (i) We plug x*t = xt - et into yt = b0 + b1x*t + ut:

yt = b0 + b1(xt - et) + ut = b0 + b1xt + ut - b1et

= b0 + b1xt + vt,

where vt = ut - b1et. By assumption, ut is uncorrelated with x*t and et; therefore, ut is uncorrelated with xt. Since et is uncorrelated with x*t, E(xt et) = E[(x*t + et)et] = E(x*t et) + E(et^2) = (sigmae)^2. Therefore, with vt defined as above, Cov(xt,vt) = Cov(xt,ut) - b1Cov(xt,et) = -b1(sigmae)^2 < 0 when b1 > 0. Because the explanatory variable and the error have negative covariance, the OLS estimator of b1 has a downward bias [see equation (5.4)].
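
A simulation sketch of the covariance result in part (i) follows. It draws independent observations rather than a time series, and the values b1 = 1, sigmae = 0.5, and Var(x*t) = 1 are assumptions made only for illustration. Under those assumptions Cov(xt,vt) should be near -b1(sigmae)^2 = -0.25, and the OLS slope should be near b1 Var(x*t)/[Var(x*t) + (sigmae)^2] = 0.8, which is the usual attenuation formula.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

random.seed(1)
b0, b1 = 2.0, 1.0        # hypothetical parameters
sigma_e = 0.5            # hypothetical sd of the measurement error
n = 100000

x_star = [random.gauss(0, 1) for _ in range(n)]      # true regressor
e = [random.gauss(0, sigma_e) for _ in range(n)]     # measurement error
u = [random.gauss(0, 1) for _ in range(n)]           # structural error

x = [xs + ei for xs, ei in zip(x_star, e)]           # observed regressor
y = [b0 + b1 * xs + ui for xs, ui in zip(x_star, u)]
v = [ui - b1 * ei for ui, ei in zip(u, e)]           # composite error vt

cov_xv = cov(x, v)                 # should be close to -b1 * sigma_e**2
slope_ols = cov(x, y) / cov(x, x)  # should be close to 0.8, below b1

print("Cov(x, v):", round(cov_xv, 3))
print("OLS slope:", round(slope_ols, 3), "vs true b1 =", b1)
```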

(ii) By assumption, E(x*t-1 ut) = E(et-1 ut) = E(x*t-1 et) = E(et-1 et) = 0, and so

E(xt-1 ut) = E(xt-1 et) = 0 because xt-1 = x*t-1 + et-1. Therefore, E(xt-1 vt) = E(xt-1 ut) - b1 E(xt-1 et) = 0.

(iii) Most economic time series, unless they represent the first difference of a series or the percentage change, are positively correlated over time. If the initial equation is in levels or logs, xt and xt-1 are likely to be positively correlated. If the model is for first differences or percentage changes, there still may be positive or negative correlation between xt and xt-1.

(iv) Under the assumptions made, xt-1 is exogenous in yt = b0 + b1xt + vt,

as we showed in part (ii): Cov(xt-1, vt) = E(xt-1 vt) = 0. Second, xt-1 will often be correlated with xt, and we can check this easily enough by running a regression of xt on xt-1. This suggests estimating the equation by instrumental variables, where xt-1 is the IV for xt. The IV estimator will be consistent for b1 (and b0), and asymptotically normally distributed.
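
The IV idea in part (iv) can be sketched with simulated data. The setup below is an assumption for illustration only: x*t follows an AR(1) process with coefficient 0.7, the measurement error and the structural error are independent standard normal draws, and b1 = 1. The simple IV estimator is Cov(xt-1, yt)/Cov(xt-1, xt).

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

random.seed(2)
b0, b1, rho = 0.0, 1.0, 0.7    # hypothetical parameters
n = 100000

x, y = [], []
x_star_prev = 0.0
for _ in range(n):
    x_star_t = rho * x_star_prev + random.gauss(0, 1)   # persistent true regressor
    x.append(x_star_t + random.gauss(0, 1))             # observed xt = x*t + et
    y.append(b0 + b1 * x_star_t + random.gauss(0, 1))   # yt = b0 + b1 x*t + ut
    x_star_prev = x_star_t

x_lag, x_cur, y_cur = x[:-1], x[1:], y[1:]

first_stage = cov(x_lag, x_cur) / cov(x_lag, x_lag)   # slope from regressing xt on xt-1
b1_ols = cov(x_cur, y_cur) / cov(x_cur, x_cur)        # attenuated toward zero
b1_iv = cov(x_lag, y_cur) / cov(x_lag, x_cur)         # uses xt-1 as the IV for xt

print("first-stage slope:", round(first_stage, 3))
print("OLS:", round(b1_ols, 3), " IV:", round(b1_iv, 3), " true:", b1)
```

The first-stage slope is the relevance check described in the text; the IV estimate should be close to b1 while OLS falls short of it.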

15.13 (i) The equation estimated by OLS is

children hat = -4.138 - .0906 educ + .332 age - .00263 age^2

(0.241) (.0059) (.017) (.00027)

n = 4361, R^2 = .569.

Another year of education, holding age fixed, results in about .091 fewer children. In other words, for a group of 100 women, if each gets another year of education, they collectively are predicted to have about nine fewer children.

(ii) The reduced form for educ is

educ = pi0 + pi1 age + pi2 age^2 + pi3 firsthalf + v,

and we need pi3 not equal to 0. When we run the regression we obtain pi3 hat = -.852 and se(pi3 hat) = .113. Therefore, women born in the first half of the year are predicted to have almost one year less education, holding age fixed. The t statistic on firsthalf is over 7.5 in absolute value, and so the identification condition holds.
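
The t statistic quoted in part (ii) can be reproduced from the reported estimate and standard error. The sketch below also squares it, since with a single instrument the first-stage F statistic for the instrument equals the squared t statistic; comparing that value with a threshold of 10 is a common rule of thumb, not something stated in the problem.

```python
pi3_hat = -0.852    # reported coefficient on firsthalf
se_pi3 = 0.113      # reported standard error

t_stat = pi3_hat / se_pi3    # about -7.54
f_stat = t_stat ** 2         # about 56.8

print("t statistic on firsthalf:", round(t_stat, 2))
print("implied first-stage F:", round(f_stat, 1))
```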

(iii) The structural equation estimated by IV is

children hat = -3.388 - .1715 educ + .324 age - .00267 age^2

(0.548) (.0532) (.018) (.00028)

n = 4361, R^2=.550.

The estimated effect of education on fertility is now much larger. Naturally, the standard error for the IV estimate is also bigger, about nine times bigger. This produces a fairly wide 95% CI for b1.
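
The standard error comparison and the confidence interval in part (iii) follow directly from the reported numbers. The sketch below uses the normal critical value 1.96, which is an assumption that is reasonable here given n = 4361.

```python
b1_ols, se_ols = -0.0906, 0.0059    # reported OLS estimate and standard error
b1_iv, se_iv = -0.1715, 0.0532      # reported IV estimate and standard error

se_ratio = se_iv / se_ols           # about 9.0
crit = 1.96                         # normal approximation for a 95% interval
ci_low = b1_iv - crit * se_iv       # about -0.276
ci_high = b1_iv + crit * se_iv      # about -0.067

print("IV se / OLS se:", round(se_ratio, 2))
print("95% CI for b1 (IV):", (round(ci_low, 3), round(ci_high, 3)))
print("OLS estimate inside the IV interval:", ci_low < b1_ols < ci_high)
```

The interval excludes zero but is wide enough to contain the OLS estimate, which is one way to see how imprecise the IV estimate is.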

(iv) When we add electric, tv, and bicycle to the equation and estimate it by OLS we obtain

children hat = - 4.390 - .0767 educ +.340 age - .00271 age^2 - .303 electric

(.0240) (.0064) (.016) (.00027) (.076)

- .253 tv + .318 bicycle

(.091) (.049)

n = 4356, R^2 = .576.

The 2SLS (or IV) estimates are

children hat = -3.591 - .1640 educ + .328 age - .00272 age^2 - .107 electric

(0.645) (.0655) (.019) (.00028) (.166)

- .0026 tv + .332 bicycle

(.2092) (.052)

n = 4356, R^2 = .558.

Adding electric, tv, and bicycle to the model reduces the estimated effect of educ in both cases, but not by too much. In the equation estimated by OLS, the coefficient on tv implies that, other factors fixed, four families that own a television will have about one fewer child than four families without a TV. Television ownership can be a proxy for different things, including income and perhaps geographic location. A causal interpretation is that TV provides an alternative form of recreation.

Interestingly, the effect of TV ownership is practically and statistically insignificant in the equation estimated by IV (even though we are not using an IV for tv). The coefficient on electric is also greatly reduced in magnitude in the IV estimation. The substantial drops in the magnitudes of these coefficients suggest that a linear model might not be the best functional form, which would not be surprising since children is a count variable. (See Section 17.4.)

15.14 (i) IQ scores are known to vary by geographic region, and so does the availability of four-year colleges. It could be that, for a variety of reasons, people with higher abilities grow up in areas with four-year colleges nearby.

(ii) The simple regression of IQ on nearc4 gives

IQ = 100.61 + 2.60 nearc4

(0.63) (0.74)

N = 2061, R^2 = .0059,

which shows that predicted IQ score is about 2.6 points higher for a man who grew up near a four-year college. The difference is statistically significant (t statistic ~ 3.51).

(iii) When we add smsa66, reg662, ... , reg669 to the regression in part (ii), we obtain

IQ = 104.77 +.348 nearc4 + 1.09 smsa66 + …

(1.62) (.814) (0.81)

N = 2061, R^2 = .0626,

where, for brevity, the coefficients on the regional dummies are not reported. Now, the relationship between IQ and nearc4 is much weaker and statistically insignificant. In other words, once we control for region and environment while growing up, there is no apparent link between IQ score and living near a four-year college.
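
The change in significance between parts (ii) and (iii) can be checked from the reported coefficients and standard errors. The sketch below compares each t statistic with 1.96, a normal-approximation critical value for a two-sided 5% test that is assumed here because N = 2061 is large.

```python
crit = 1.96   # approximate two-sided 5% critical value

# Part (ii): simple regression of IQ on nearc4.
t_simple = 2.60 / 0.74        # about 3.51

# Part (iii): adding smsa66 and the regional dummies.
t_controls = 0.348 / 0.814    # about 0.43

print("t without controls:", round(t_simple, 2), "significant:", abs(t_simple) > crit)
print("t with controls   :", round(t_controls, 2), "significant:", abs(t_controls) > crit)
```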

(iv) The findings from parts (ii) and (iii) show that it is important to include smsa66, reg662, … , reg669 in the wage equation to control for differences in access to colleges that might also be correlated with ability.

15.17 (i) Sixteen states executed at least one prisoner in 1991, 1992, or 1993. (That is, for 1993, exec is greater than zero for 16 observations.) Texas had by far the most executions with 34.

(ii) The results of the pooled OLS regression are

mrdrte = -5.28 - 2.07 d93 + .128 exec + 2.53 unem

(4.43) (2.14) (.263) (0.78)

N = 102, R^2 = .102, Radj^2 = .074.

The positive coefficient on exec provides no evidence of a deterrent effect. Statistically, the coefficient is not different from zero. The coefficient on unem implies that higher unemployment rates are associated with higher murder rates.

(iii) When we difference (and use only the changes from 1990 to 1993), we obtain

Δmrdrte = .413 - .104 Δexec - .067 Δunem

(.209) (.043) (.159)

N = 51, R^2 = .110, Radj^2 =.073.

The coefficient on exec is negative and statistically significant (p-value ~ .02 against a twosided alternative), suggesting a deterrent effect. One more execution reduces the murder rate by about. 1, so 10 more executions reduce the murder rate by one (which means one murder per 100,000 people). The unemployment rate variable is no longer significant.

(iv) The regression exec on exec-1 yields

exec = .350 - 1.08 exec-1

(.370) (0.17)

N = 51, R^2 = .456, Radj^2 = .444,

which shows a strong negative correlation in the change in executions. This means that, apparently, states follow policies whereby if executions were high in the preceding three-year period, they are lower, one-for-one, in the next three-year period.

Technically, to test the identification condition, we should add Δunem to the regression. But its coefficient is small and statistically very insignificant, and adding it does not change the outcome at all.
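
Two t statistics summarize part (iv), and both can be computed from the reported slope and standard error. The first tests whether the slope is zero (the relevance of the instrument); the second tests whether it equals -1, which is one way to read the "one-for-one" remark. Framing the second as a formal test is an illustration added here, not part of the original answer.

```python
slope, se = -1.08, 0.17     # reported slope and standard error

t_zero = slope / se                 # H0: slope = 0, about -6.35
t_minus_one = (slope - (-1.0)) / se # H0: slope = -1, about -0.47

print("t for slope = 0 :", round(t_zero, 2))
print("t for slope = -1:", round(t_minus_one, 2))
```

The first statistic is far from zero, so the lagged change is strongly related to the current change, while the second is small, so the data do not reject a one-for-one reversal.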

(v) When the differenced equation is estimated using Δexec-1 as an IV for Δexec, we obtain

Δmrdrte = .411 - .100 Δexec - .067 Δunem

(.211) (.064) (.159)

N = 51, R^2 = .110, Radj^2 = .073.

This is very similar to when we estimate the differenced equation by OLS. Not surprisingly, the most important change is that the standard error on b1 is now larger and reduces the statistical significance of b1.