Tom Dolezal
Econometrics HW #6
1. a. Here are the copy/pasted SAS results:
The SAS System 20:11 Wednesday, April 23, 2008 1
The REG Procedure
Model: MODEL1
Dependent Variable: lnpricek
Number of Observations Read 880
Number of Observations Used 880
Analysis of Variance
                           Sum of        Mean
Source             DF     Squares      Square   F Value   Pr > F
Model               6    92.51688    15.41948    420.24   <.0001
Error             873    32.03227     0.03669
Corrected Total   879   124.54915
Root MSE          0.19155    R-Square   0.7428
Dependent Mean    4.64671    Adj R-Sq   0.7410
Coeff Var         4.12232
Parameter Estimates
                              Parameter     Standard
Variable   Label        DF     Estimate        Error   t Value   Pr > |t|
Intercept  Intercept     1      3.99461      0.03778    105.73     <.0001
sqfth                    1      0.06374      0.00204     31.32     <.0001
beds       beds          1     -0.08486      0.01337     -6.35     <.0001
baths      baths         1      0.00891      0.01812      0.49     0.6231
age        age           1     -0.00245   0.00036601     -6.70     <.0001
stories    stories       1     -0.01827      0.02191     -0.83     0.4045
Vacant     Vacant        1     -0.08031      0.01324     -6.06     <.0001
Overall, the model is significant (F = 420.24, p < .0001), with a decently high adjusted R2 of 0.7410. The signs of sqfth and baths are positive, as expected, since more of either should raise the price. The sign of age is also as expected: it is negative, and an older house, all other things equal, should sell for less than a newer one. But stories and beds are negative. I would expect those two coefficients to be positive, since more stories and more bedrooms would seemingly increase the value of a home. As for the interpretation, the dependent variable is the log of price (in thousands), so a 1-unit increase in a variable changes the price by approximately 100 times its coefficient, in percent, all other things equal (ex: 1 extra bathroom increases the price by about 0.891%). Baths and stories are not significant at the 5% level (p = 0.6231 and 0.4045), but all of the other variables are.
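As a quick check on the significance statement, each t value is just the estimate divided by its standard error. The sketch below copies the estimates and standard errors from the output above and compares each |t| with 1.96, which is an assumed large-sample stand-in for the exact two-sided 5% critical value with 873 degrees of freedom. The names coefs, t_stats, and not_significant are only illustrative. Because the printed standard errors are rounded, the recomputed t values can differ slightly from the SAS column.

```python
# Estimates and standard errors copied from the part (a) SAS output.
coefs = {
    "sqfth":   (0.06374, 0.00204),
    "beds":    (-0.08486, 0.01337),
    "baths":   (0.00891, 0.01812),
    "age":     (-0.00245, 0.00036601),
    "stories": (-0.01827, 0.02191),
    "Vacant":  (-0.08031, 0.01324),
}

CRIT_5PCT = 1.96  # assumption: normal approximation to the t critical value


def t_stats(table):
    """Return {name: estimate / standard error}."""
    return {name: est / se for name, (est, se) in table.items()}


def not_significant(table, crit=CRIT_5PCT):
    """Return the sorted names whose |t| is below the critical value."""
    return sorted(name for name, t in t_stats(table).items() if abs(t) < crit)


for name, t in t_stats(coefs).items():
    print(f"{name:8s} t = {t:7.2f}")
print("not significant at 5%:", not_significant(coefs))
```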
b. If the home is vacant, the sale price will be approximately 8.031% lower than if it were not vacant, all other things equal.
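Reading a log-model coefficient as a percentage is an approximation that works best for small coefficients. For a dummy variable like Vacant, the exact percentage difference is 100*(exp(b) - 1). The sketch below compares the two for the Vacant coefficient; the function names are only illustrative.

```python
import math


def approx_pct(beta):
    """Percent change read directly off a log-linear coefficient."""
    return 100.0 * beta


def exact_pct(beta):
    """Exact percent change implied by a log-linear coefficient."""
    return 100.0 * (math.exp(beta) - 1.0)


vacant = -0.08031  # Vacant coefficient from the part (a) output
print(f"approximate: {approx_pct(vacant):.3f}%")
print(f"exact:       {exact_pct(vacant):.3f}%")
```

The exact figure comes out at roughly -7.7%, a little smaller in magnitude than the -8.031% approximation.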
c. Here is the copy/pasted SAS output. I used ‘proc sort’ and ‘by vacant’ statements:
------Vacant=0 ------
The REG Procedure
Model: MODEL1
Dependent Variable: lnpricek
Number of Observations Read 415
Number of Observations Used 415
Analysis of Variance
                           Sum of        Mean
Source             DF     Squares      Square   F Value   Pr > F
Model               5    50.04064    10.00813    233.63   <.0001
Error             409    17.52030     0.04284
Corrected Total   414    67.56094
Root MSE          0.20697    R-Square   0.7407
Dependent Mean    4.73155    Adj R-Sq   0.7375
Coeff Var         4.37427
Parameter Estimates
                              Parameter     Standard
Variable   Label        DF     Estimate        Error   t Value   Pr > |t|
Intercept  Intercept     1      3.97972      0.05544     71.78     <.0001
sqfth                    1      0.06847      0.00301     22.78     <.0001
beds       beds          1     -0.09777      0.01996     -4.90     <.0001
baths      baths         1      0.01932      0.02517      0.77     0.4433
age        age           1     -0.00200   0.00053874     -3.70     0.0002
stories    stories       1     -0.06547      0.03374     -1.94     0.0530
------Vacant=1 ------
The REG Procedure
Model: MODEL1
Dependent Variable: lnpricek
Number of Observations Read 465
Number of Observations Used 465
Analysis of Variance
                           Sum of        Mean
Source             DF     Squares      Square   F Value   Pr > F
Model               5    37.25184     7.45037    242.82   <.0001
Error             459    14.08308     0.03068
Corrected Total   464    51.33492
Root MSE          0.17516    R-Square   0.7257
Dependent Mean    4.57099    Adj R-Sq   0.7227
Coeff Var         3.83206
Parameter Estimates
                              Parameter     Standard
Variable   Label        DF     Estimate        Error   t Value   Pr > |t|
Intercept  Intercept     1      3.92459      0.04721     83.13     <.0001
sqfth                    1      0.05931      0.00277     21.42     <.0001
beds       beds          1     -0.06785      0.01780     -3.81     0.0002
baths      baths         1     -0.01034      0.02646     -0.39     0.6961
age        age           1     -0.00289   0.00049956     -5.78     <.0001
stories    stories       1      0.02653      0.02847      0.93     0.3519
Again, both models are significant, with decently high adjusted R2 values (0.7375 for Vacant=0 and 0.7227 for Vacant=1). The coefficient on baths is positive in the first (Vacant=0) yet negative in the second (Vacant=1). Also, the stories coefficient is negative in the first yet positive in the second. None of these four estimates is significant at the 5% level, though. Beds still has the opposite of the expected sign in both. The sqfth, beds, and age coefficients are about the same in both.
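d. Ho: The slope coefficients are the same for vacant and non-vacant homes Ha: at least one slope coefficient differs
The restricted model is the regression from part a. It already includes Vacant, so the intercept is allowed to differ and the model has 7 parameters. The unrestricted model is the pair of separate regressions from part c, each with k = 5 regressors plus an intercept, for 2(k+1) = 12 parameters. That gives 12 - 7 = 5 restrictions and (n1+n2) - 2(k+1) = 868 denominator degrees of freedom.
F = ((SSErestricted - (SSE1+SSE2))/5) / ((SSE1+SSE2)/((n1+n2) - 2(k+1))) =
((32.03227 - (17.52030+14.08308))/5) / ((17.52030+14.08308)/((415+465) - 2(5+1))) = 2.36
The 5% critical value of F5,868 is about 2.22, and 2.36 > 2.22, so we reject Ho at the 5% level (but not at the 1% level, where the critical value is about 3.0). So I conclude that the slope coefficients differ between vacant and non-vacant homes.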
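The F statistic can be recomputed from the three SSE values in the SAS output. In the sketch below the number of restrictions and the denominator degrees of freedom are passed in as arguments, since they depend on how the parameters are counted. The call shown assumes that the part a regression (which already includes Vacant, so 7 parameters) is the restricted model and that the two part c regressions (6 parameters each) are the unrestricted model. That gives 12 - 7 = 5 restrictions and 880 - 12 = 868 denominator degrees of freedom. The function name chow_f is only illustrative.

```python
def chow_f(sse_restricted, sse_groups, n_restrictions, df_denominator):
    """F statistic comparing one pooled regression with separate group regressions."""
    sse_unrestricted = sum(sse_groups)
    numerator = (sse_restricted - sse_unrestricted) / n_restrictions
    denominator = sse_unrestricted / df_denominator
    return numerator / denominator


n_obs = 415 + 465            # observations in the two groups
params_unrestricted = 2 * 6  # intercept + 5 slopes, estimated separately per group
params_restricted = 7        # intercept + 5 slopes + Vacant (assumed restricted model)

f_value = chow_f(
    sse_restricted=32.03227,
    sse_groups=[17.52030, 14.08308],
    n_restrictions=params_unrestricted - params_restricted,
    df_denominator=n_obs - params_unrestricted,
)
print(f"F = {f_value:.2f}")
```

With these counts the statistic comes out near 2.36, to be compared with the F critical value on 5 and 868 degrees of freedom.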
2. Here are the regression results:
The SAS System 22:03 Wednesday, April 23, 2008 2
The REG Procedure
Model: MODEL1
Dependent Variable: lnsalary
Number of Observations Read 353
Number of Observations Used 353
Analysis of Variance
                           Sum of        Mean
Source             DF     Squares      Square   F Value   Pr > F
Model              13   321.65592    24.74276     49.19   <.0001
Error             339   170.51958     0.50301
Corrected Total   352   492.17551
Root MSE          0.70923    R-Square   0.6535
Dependent Mean   13.49218    Adj R-Sq   0.6403
Coeff Var         5.25660
Parameter Estimates
                              Parameter     Standard
Variable   Label        DF     Estimate        Error   t Value   Pr > |t|
Intercept  Intercept     1     11.12955      2.30445      4.83     <.0001
years      years         1      0.05842      0.01227      4.76     <.0001
gamesyr    gamesyr       1      0.00977      0.00338      2.89     0.0041
bavg       bavg          1   0.00048138      0.00114      0.42     0.6734
hrunsyr    hrunsyr       1      0.01915      0.01596      1.20     0.2312
rbisyr     rbisyr        1      0.00179      0.00748      0.24     0.8112
runsyr     runsyr        1      0.01187      0.00453      2.62     0.0091
fldperc    fldperc       1   0.00028326      0.00231      0.12     0.9024
allstar    allstar       1      0.00634      0.00288      2.20     0.0287
frstbase   frstbase      1     -0.13280      0.13092     -1.01     0.3111
scndbase   scndbase      1     -0.16110      0.14143     -1.14     0.2555
thrdbase   thrdbase      1      0.01453      0.14304      0.10     0.9192
shrtstop   shrtstop      1     -0.06057      0.13020     -0.47     0.6421
catcher    catcher       1      0.25356      0.13131      1.93     0.0543
a. Ho: catcher = outfield
Ha: catcher ≠ outfield
Note: this can be done in 2 different ways. The first is to just use the t-statistic for catcher above. Outfield is the omitted base category (it is absorbed into the intercept), so the catcher coefficient measures the difference in lnsalary between a catcher and an outfielder, all other things equal. The p-value here is .0543, so I can NOT reject the null at the α = .05 significance level.
Alternatively, it can be done with an F test (this is what I did first, before I realized that the above way works as well; it gives the same result):
SAS was giving me an error, so I had to include outfield in the above model. I used the ‘test’ statement, and got the following results:
The REG Procedure
Model: MODEL1
Test 1 Results for Dependent Variable lnsalary
                           Mean
Source          DF       Square   F Value   Pr > F
Numerator        1      1.87551      3.73   0.0543
Denominator    339      0.50301
At the α = .05 level, I can NOT reject the null (although it is very close to being rejected), because .0543 > .05. So I conclude that there is no statistically significant difference between the salaries of an outfielder and a catcher, all other things equal.
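The two approaches agree because an F test of a single restriction is the square of the corresponding t test. The sketch below checks this with the numbers printed in the SAS output: the catcher estimate and standard error, and the numerator and denominator mean squares from the test table. The variable names are only illustrative, and small differences come from rounding in the printed values.

```python
# Catcher estimate and standard error from the parameter estimates table.
t_catcher = 0.25356 / 0.13131

# Squaring the t statistic gives the F statistic for a single restriction.
f_from_t = t_catcher ** 2

# F statistic from the 'test' output: numerator mean square / denominator mean square.
f_from_ms = 1.87551 / 0.50301

print(f"t = {t_catcher:.2f}")
print(f"t squared = {f_from_t:.2f}")
print(f"F from mean squares = {f_from_ms:.2f}")
```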
b. Ho: frstbase = scndbase = thrdbase = shrtstop = outfield = catcher
Ha: at least one position differs
Again, here I used the ‘test outfield = catcher = frstbase = scndbase = thrdbase = shrtstop’ statement in SAS, and got this result:
The REG Procedure
Model: MODEL1
Test 1 Results for Dependent Variable lnsalary
                           Mean
Source          DF       Square   F Value   Pr > F
Numerator        5      0.89406      1.78   0.1168
Denominator    339      0.50301
Here, the p-value (.1168) > .05, so I can NOT reject the null. So, in conclusion, there is no statistically significant difference in average salary across positions, all other things equal.
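The numerator DF of 5 is the number of restrictions: setting six positions equal to each other takes five equalities. The F value is the numerator mean square divided by the denominator mean square. The sketch below recomputes it from the printed table and also backs out the restricted SSE that the test implies. That restricted SSE is a derived figure, not something SAS printed, and the variable names are only illustrative.

```python
n_positions = 6                  # frstbase, scndbase, thrdbase, shrtstop, outfield, catcher
n_restrictions = n_positions - 1

ms_numerator = 0.89406           # from the 'test' output
ms_denominator = 0.50301         # error mean square of the full model
sse_unrestricted = 170.51958     # error sum of squares of the full model

f_value = ms_numerator / ms_denominator

# Numerator mean square = (SSE_restricted - SSE_unrestricted) / restrictions,
# so the implied restricted SSE can be recovered from the printed values.
sse_restricted = sse_unrestricted + ms_numerator * n_restrictions

print(f"F = {f_value:.2f} with {n_restrictions} and 339 degrees of freedom")
print(f"implied restricted SSE = {sse_restricted:.5f}")
```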