Professor Kasey Buckles, Fall 2007

Economics 30331: Econometrics

MID-TERM 2 EXAMINATION

Instructions

Write your name on this test and on the blue book. Multiple choice answers should be written in the space provided on this exam; short answers should be given in the blue book.
You may use your one page of additional notes, as well as a calculator. You should not share these with other students.
You have 75 minutes to take the exam. The exam is 65 points, and the points assigned to each question should give you an idea of how many minutes to spend on each question (with 10 ‘extra’ minutes left over to use as necessary).
There are 5 Multiple Choice questions and 5 Short Answer questions. Please check to make sure that you have the full test.
This exam is administered under Notre Dame’s Honor Code.

Name: ______

(Signing your name indicates that you have read and understood the above instructions)

I. Multiple Choice (4 points each)

Select the best answer for each question, and write the letter of your answer to each question in the following space. No explanations are necessary.

1. ______ 2. ______ 3. ______ 4. ______ 5. ______

1. For β1-hat to be a consistent estimator of β1, we need:

A. the zero-conditional mean assumption to hold.

B. E[u|x] = 0.

C. covariance of u and xj to be zero for all xj.

D. none of the above.

E. all of the above.

2. Billy Beane wants to find undervalued baseball players. He believes that walks (walks) and extra bases (xtrabases) determine wins. Which of the following strategies will help him find the players he wants?

A. Estimate the model salary = β0 + β1walks + β2xtrabases + u. Use the model to predict each player’s residual, uhat. Players with negative uhat are undervalued.

B. Estimate the model salary = β0 + β1walks + β2xtrabases + u. Use the model to predict each player’s residual, uhat. Players with positive uhat are undervalued.

C. Estimate the model salary = β0 + β1walks + β2xtrabases + u. Use the model to predict each player’s salary, yhat. Players with small yhat are undervalued.

D. Estimate the model salary = β0 + β1walks + β2xtrabases + u. Use the model to predict each player’s salary, yhat. Players with large yhat are undervalued.

Questions 3 and 4. Suppose a researcher wants to investigate the role of breastfeeding on children’s intelligence, and estimates this model:

kidIQ = 0 + 1monthsBF + 2 momIQ + 3female + 4 povinc + u

Where monthsBF is months the child was breastfed, momIQ is mom’s IQ score, female is a dummy =1 if the child is a girl, and povinc is family income relative to the poverty line.

3. Suppose that we estimate a coefficient on monthsBF of 0.5. However, the variable monthsBF is measured with error, so that monthsBF = months breastfed + e1. Suppose further that this measurement error, e1, is uncorrelated with the true months breastfed. We would expect:

A) the true to be greater than 0.5.

B) the true to be less than 0.5.

C) that hat is unbiased.

D) that hat is unbiased but that the standard error for -hat is larger than if we

observed the months breastfed.

4. The researcher believes that “mom’s health” belongs in the true model, but does not observe it. Instead she includes “bmi,” or body-mass index, which she does observe. If mom’s health is correlated with monthsBF and with kidIQ, including bmi in the model will:

A) result in unbiased estimates of because bmi is likely uncorrelated with mom’s

health.

B) cause attenuation bias in 

C) result in a lower R-squared if bmi is an irrelevant variable

D) result in unbiased estimates of , where she is using bmi as a proxy for mom’s

health.

5. We estimate the following model: bwght = β0 + β1npvis + β2npvissq + β3mage + β4magesq + u, where npvis is the number of prenatal visits; npvissq is npvis²; mage is mother’s age; and magesq is mage².

. reg lbwght npvis npvissq mage magesq

      Source |       SS       df       MS              Number of obs =    1764
-------------+------------------------------           F(  4,  1759) =   11.56
       Model |  1.90136387     4  .475340968           Prob > F      =  0.0000
    Residual |  72.3040459  1759    .0411052           R-squared     =  0.0256
-------------+------------------------------           Adj R-squared =  0.0234
       Total |  74.2054098  1763   .04209042           Root MSE      =  .20274

------------------------------------------------------------------------------
      lbwght |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       npvis |   .0180374   .0037086     4.86   0.000     .0107636    .0253112
     npvissq |  -.0004079   .0001204    -3.39   0.001    -.0006441   -.0001717
        mage |    .025392   .0092542     2.74   0.006     .0072417    .0435423
      magesq |  -.0004119   .0001548    -2.66   0.008    -.0007154   -.0001083
       _cons |   7.583713   .1370568    55.33   0.000     7.314901    7.852524
------------------------------------------------------------------------------

Holding npvis fixed, at what mother’s age is the birth weight of the child maximized?

A. 61.6

B. 30.8

C. The younger the mother is, the greater the birthweight is on average.

D. The older the mother is, the greater the birthweight is on average.

II. Short Answer

6. (8 points)

a.) Under what conditions would an OLS estimator be consistent but not unbiased? Be sure to give all the necessary conditions for consistency.

b.) Under what conditions can we relax the normality assumption and still have valid inference?

7. (8 points) Consider a linear model to explain monthly beer consumption:

beer = β0 + β1inc + β2price + β3educ + β4female + u

E[u|inc,price,educ,female] = 0 and Var[u|inc,price,educ,female] = σ²inc²

a.) Write the transformed equation that has a homoskedastic error term (the WLS model).

b.) Suppose the researcher is no longer sure that she has specified the form of the heteroskedasticity correctly, so she decides to do Feasible GLS. What is the general form of heteroskedasticity that Feasible GLS estimates? That is, what is the formula for heteroskedasticity we use with Feasible GLS?

c.) In what way is Feasible GLS an improvement over OLS with robust standard errors if there is heteroskedasticity in the model?

8. (8 points) An econometrician estimates the following model:

colgpa = 0 + 1sat + 2tothrs + 3sathours + 4female + u

where colgpa is college GPA, sat is SAT score, tothrs is the number of credit hours accumulated prior to the semester, sathours is an interaction of sat and tothrs, and female is a dummy variable = 1 if the student is female.

. reg colgpa sat tothrs sathours female

      Source |       SS       df       MS              Number of obs =    4137
-------------+------------------------------           F(  4,  4132) =  294.98
       Model |  398.539145     4  99.6347864           Prob > F      =  0.0000
    Residual |  1395.65653  4132  .337767795           R-squared     =  0.2221
-------------+------------------------------           Adj R-squared =  0.2214
       Total |  1794.19567  4136  .433799728           Root MSE      =  .58118

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sat |   .0027439   .0001184    23.17   0.000     .0025117     .002976
      tothrs |   .0155283   .0018969     8.19   0.000     .0118095    .0192472
    sathours |  -.0000128   1.83e-06    -7.00   0.000    -.0000164   -9.24e-06
      female |   .2248177   .0184042    12.22   0.000     .1887355    .2608998
       _cons |  -.3971054   .1236172    -3.21   0.001    -.6394617   -.1547492
------------------------------------------------------------------------------

a.) What is the marginal effect of SAT score on college GPA? Interpret this marginal effect.

b.) Suppose SAT score is reported with error, and that people with higher SAT scores are less likely to report their SAT score incorrectly. Is this classical errors-in-variables? Will this type of measurement error bias the coefficients in any way? If so, which way?

9. (9 points) An econometrician estimates the following model:

lwage = β0 + β1educ + β2exper + β3female + u

After estimating the model, she predicts the fitted values yhat, the residuals uhat, and creates squares of each of those terms (yhatsq and uhatsq) and the cube of yhat (yhatcu). She then performs the following test:

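. reg uhatsq yhat yhatsq

      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  2,   523) =    4.28
       Model |  .754211511     2  .377105756           Prob > F      =  0.0143
    Residual |  46.0645188   523  .088077474           R-squared     =  0.0161
-------------+------------------------------           Adj R-squared =  0.0123
       Total |  46.8187303   525  .089178534           Root MSE      =  .29678

------------------------------------------------------------------------------
      uhatsq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        yhat |   -.530426   .2964568    -1.79   0.074    -1.112818    .0519664
      yhatsq |   .1904855   .0910306     2.09   0.037     .0116549    .3693161
       _cons |   .5227343   .2375785     2.20   0.028      .056009    .9894597
------------------------------------------------------------------------------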

a.) What kind of test is the econometrician performing? What is the null hypothesis of the test? Does she reject or fail to reject the null hypothesis at the 5% level?

b.) The econometrician decides to use robust standard errors in her model by using Stata’s robust option (, r). Intuitively, what does the robust option do?

10. (12 points) Using the same model as in question 9, the researcher performs the following test:

. reg lwage educ exper female yhatsq yhatcu

      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  5,   520) =   61.16
       Model |  54.9301257     5  10.9860251           Prob > F      =  0.0000
    Residual |   93.399636   520  .179614685           R-squared     =  0.3703
-------------+------------------------------           Adj R-squared =  0.3643
       Total |  148.329762   525   .28253288           Root MSE      =  .42381

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |  -.2582977   .1351011    -1.91   0.056    -.5237087    .0071134
       exper |  -.0269987   .0141256    -1.91   0.057     -.054749    .0007515
      female |   .9985318   .5191083     1.92   0.055    -.0212753    2.018339
      yhatsq |   2.128348   1.002903     2.12   0.034     .1581091    4.098586
      yhatcu |  -.3708878   .2194771    -1.69   0.092    -.8020587     .060283
       _cons |   .7946423   .1680487     4.73   0.000     .4645045     1.12478
------------------------------------------------------------------------------

a.) What type of test is the econometrician performing? What is the null hypothesis of the test? What would she need to do to test this null hypothesis?

b.) If the econometrician rejected the null hypothesis, what should she do?

c.) In the model for questions 9 and 10, the dependent variable is log(wage). This means that people who do not work will be omitted from the sample. Is this endogenous or exogenous sample selection? What is the consequence of this type of sample selection?