EXAM3 PRACTICE PROBLEMS
The data from Exam2 reported Car Class (Compact, Midsize, and Large), Displacement (liters), Fuel Type (Premium or Regular), and MPG for 60 US car models. The first two and last three observations and summary stats appear below.
Car / Class / Displacement / Fuel Type / Hwy MPG
1 / Midsize / 3.5 / R / 28
2 / Midsize / 3 / R / 26
. / . / . / .
. / . / . / .
58 / Compact / 6 / P / 20
59 / Midsize / 2.5 / R / 30
60 / Midsize / 2 / R / 32
Statistic / Displacement / Hwy MPG
Average / 3.287 / 25.683
StdDev / 1.059 / 3.721
Min / 2 / 15
Max / 6.2 / 33
Count / 60 / 60
Answer these first five questions without using the data.
1. Al regressed MPG on displacement and got an estimated coefficient (b1) of -2.97. Al forgot to write down his estimated intercept (b0). Please calculate it for him.
We know that Al’s regression will go through the sample averages. This means that
25.683 = b0 – 2.97 (3.287). So b0 = 25.683 + 2.97*3.287 = 35.4
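The same arithmetic as a minimal Python sketch. The variable names are illustrative only; the numbers come from the summary table and from Al's reported coefficient.

```python
# Summary statistics from the table above.
mean_displacement = 3.287   # average displacement (liters)
mean_mpg = 25.683           # average highway MPG
slope = -2.97               # Al's estimated coefficient on displacement

# A least-squares line passes through (mean x, mean y), so
# mean_mpg = intercept + slope * mean_displacement.
intercept = mean_mpg - slope * mean_displacement
print(round(intercept, 1))  # 35.4
```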
2. The engine in car B is a full liter larger than the engine in car A. Using Al’s regression, how much lower do we expect the mileage (MPG) of B to be?
Could the question be any easier? We expect car B’s MPG to be 2.97 MPG lower than A’s.
3. Is Al’s t-stat (the one associated with the coefficient of his X-variable) positive or negative?
The sign of the t-stat of a regression coefficient always matches the sign of the coefficient. Since Al’s coefficient is negative, his t-stat will be negative.
4. Car B has a 4.2 liter engine. Will car B get more than 30 MPG? (You phone Al and discover that the standard error of his regression model is 2.00.)
The point forecast of B’s MPG is 35.4 – 2.97 (4.2) = 22.93. The probability its MPG will exceed 30 is t.dist.rt[(30-22.93)/2.00,60-2] = 0.0004, so it is very unlikely. This calculation requires the four regression assumptions (MPG is linearly related to displacement, homoskedasticity, normality, and independence).
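For readers working outside Excel, here is a hedged sketch of the same calculation using only the Python standard library. The helper `t_right_tail` is a hypothetical stand-in for t.dist.rt: it integrates the Student's t density numerically with Simpson's rule and is an approximation written for illustration, not a library routine. A statistics package would normally supply this function.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_right_tail(t, df, steps=2000):
    """Approximate P(T > t) by Simpson's rule on the density between 0 and t."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

intercept, slope = 35.4, -2.97   # Al's fitted line (intercept from Question 1)
std_error = 2.00                 # standard error of Al's regression
n = 60                           # cars in the sample

forecast = intercept + slope * 4.2             # point forecast for car B
t_value = (30 - forecast) / std_error          # standardized distance to 30 MPG
prob_above_30 = t_right_tail(t_value, n - 2)   # plays the role of t.dist.rt

print(round(forecast, 2), round(prob_above_30, 4))
```

The degrees of freedom are 60 - 2 because Al's model estimates two coefficients (intercept and slope).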
5. Bo created dummy variable P defined to be 1 if the car used premium gasoline and 0 if it used regular. Bo then regressed MPG on DISPLACEMENT and P and got
Coefficients
Intercept / 35.518
Displacement / -2.761
P / -1.270
Do the cars in the sample that use premium gasoline have a higher or lower average displacement?
The coefficient of Displacement went from -2.97 to -2.76 when P was added to the model. That means that higher Displacement cars tended to use Premium (that makes sense): with P left out of the model, Displacement picks up part of Premium’s negative effect, which moved the -2.76 down to -2.97. So the Premium cars in the sample had higher average displacement.
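The direction of this shift can be checked with the omitted-variable identity for least squares: the Displacement slope without P equals the slope with P plus (coefficient of P) times delta, where delta is the slope from regressing P on Displacement in the same sample. The sketch below backs out delta from the reported numbers. The variable names are illustrative, and because -2.97 is rounded, only the sign and rough size of delta should be trusted.

```python
slope_without_p = -2.97    # Al: MPG on Displacement only
slope_with_p = -2.761      # Bo: MPG on Displacement and P
coef_p = -1.270            # Bo's coefficient on the Premium dummy

# Omitted-variable identity for least squares:
#   slope_without_p = slope_with_p + coef_p * delta
# where delta is the slope from regressing P on Displacement.
delta = (slope_without_p - slope_with_p) / coef_p
print(round(delta, 3))

# For a 0/1 dummy, a positive delta means the P = 1 group has the
# higher average displacement.
premium_has_higher_displacement = delta > 0
print(premium_has_higher_displacement)
```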
You may use the data to help answer the remaining questions. You can find the data linked to the Class 19 assignment.
6. A new midsize car with a 3.1-liter engine uses Premium gasoline. Will its MPG be less than 27?
To answer this question, we need a model. We know three things about the car: its class, it uses P, and it has a 3.1-liter engine. First we create dummy variables DM (midsize) and DL (large) for car class, leaving Compact as the baseline. Then we fit the four-variable model.
Coefficients / Standard Error / t Stat / P-value
Intercept / 29.001 / 1.263 / 22.953 / 0.000
DM / 4.262 / 0.701 / 6.077 / 0.000
DL / 2.055 / 0.584 / 3.518 / 0.001
Displacement / -1.613 / 0.275 / -5.859 / 0.000
P / -0.569 / 0.445 / -1.279 / 0.206
Notice that Displacement is a significant predictor of MPG but gasoline type is not. Also notice that Midsize cars get the highest MPGs (for their engine size and fuel type), and Large cars the next highest. Compact cars get the lowest. It is pretty clear that many of the compact cars are low-mileage sports cars. The Midsize cars are the higher mileage economy cars.
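To make the class comparison concrete, the sketch below codes the two class dummies (Compact is the baseline, so both are 0 for a compact car) and evaluates the four-variable model for each class at the same engine size and fuel type. The 3.0-liter, regular-fuel car is a hypothetical example chosen for illustration; the coefficients are the ones in the table above.

```python
# Coefficients of the four-variable model.
coef = {"Intercept": 29.001, "DM": 4.262, "DL": 2.055,
        "Displacement": -1.613, "P": -0.569}

def predict(car_class, displacement, premium):
    """Fitted MPG; Compact is the baseline class (DM = DL = 0)."""
    dm = 1 if car_class == "Midsize" else 0
    dl = 1 if car_class == "Large" else 0
    p = 1 if premium else 0
    return (coef["Intercept"] + coef["DM"] * dm + coef["DL"] * dl
            + coef["Displacement"] * displacement + coef["P"] * p)

# Hypothetical comparison: same engine (3.0 liters) and fuel (regular).
for car_class in ("Compact", "Large", "Midsize"):
    print(car_class, round(predict(car_class, 3.0, False), 2))
```

Holding displacement and fuel fixed, the gaps between classes are just the dummy coefficients: Midsize sits 4.262 MPG above Compact and Large sits 2.055 MPG above Compact.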
Since P is not significant, we drop it from the model and fit the simpler 3-variable model.
Regression Statistics
Multiple R / 0.917
R Square / 0.840
Adjusted R Square / 0.832
Standard Error / 1.527
Observations / 60
ANOVA
df / SS / MS / F / Significance F
Regression / 3 / 686.365 / 228.788 / 98.088 / 2.8283E-22
Residual / 56 / 130.618 / 2.332
Total / 59 / 816.983
Coefficients / Standard Error / t Stat / P-value
Intercept / 28.641 / 1.239 / 23.123 / 0.000
DM / 4.487 / 0.683 / 6.570 / 0.000
DL / 2.115 / 0.585 / 3.614 / 0.001
Displacement / -1.640 / 0.276 / -5.944 / 0.000
Note that the overall model is significant (p-value of 2.8E-22), displacement is significant (p-value of 0.000 to three decimals), and car class is significant (both the DM and DL coefficients are significant).
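The summary statistics in the regression output all follow from the two sums of squares in the ANOVA table. The sketch below recomputes them as a consistency check; the variable names are illustrative and the formulas are the standard ones for a regression with an intercept.

```python
import math

n, k = 60, 3                         # observations, predictors (DM, DL, Displacement)
ss_reg, ss_res = 686.365, 130.618    # sums of squares from the ANOVA table
ss_total = ss_reg + ss_res

r_square = ss_reg / ss_total
multiple_r = math.sqrt(r_square)
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

ms_reg = ss_reg / k                  # mean square, regression
ms_res = ss_res / (n - k - 1)        # mean square, residual
f_stat = ms_reg / ms_res
std_error = math.sqrt(ms_res)        # standard error of the regression

print(round(multiple_r, 3), round(r_square, 3), round(adj_r_square, 3))
print(round(f_stat, 2), round(std_error, 3))
```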
Our point forecast of MPG for the new car is
Coefficients / New Car
Intercept / 28.641 / 1
DM / 4.487 / 1
DL / 2.115 / 0
Displacement / -1.640 / 3.1
Point Forecast / 28.043
Assuming the four regression assumptions hold, the probability the new car will get more than 27 MPG is t.dist.rt[(27-28.043)/1.527,56] = 0.75. So the probability its MPG will be less than 27 is about 1 - 0.75 = 0.25; it probably will not be.
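A hedged Python sketch of the forecast and the tail probability, using only the standard library. As an assumption made for illustration, `t_right_tail` is a hypothetical helper that approximates t.dist.rt by numerically integrating the Student's t density; it is not a library function. The forecast computed from the rounded coefficients differs from the table in the third decimal.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_right_tail(t, df, steps=2000):
    """Approximate P(T > t) by Simpson's rule on the density between 0 and t."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Three-variable model and the new car (midsize, 3.1 liters).
coef = {"Intercept": 28.641, "DM": 4.487, "DL": 2.115, "Displacement": -1.640}
new_car = {"Intercept": 1, "DM": 1, "DL": 0, "Displacement": 3.1}

forecast = sum(coef[name] * new_car[name] for name in coef)
std_error, resid_df = 1.527, 56

prob_above_27 = t_right_tail((27 - forecast) / std_error, resid_df)
prob_below_27 = 1 - prob_above_27

print(round(forecast, 2), round(prob_above_27, 2), round(prob_below_27, 2))
```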
7. Which car’s mileage is most unusually high (low)?
When I ran the above regression, I checked the box for RESIDUALS (not shown here). Car 10 has the highest positive residual of 3.97, meaning its MPG is 3.97 higher than expected for its class and engine size. (10 is a Midsize, 2.5 liter getting 33 MPG.) Car 49 has the most negative residual. Its MPG is 4.29 below its expectation. (49 is a Compact, 5.7 liter getting only 15 MPG.) Interestingly, cars 10 and 49 also have the overall highest and lowest MPGs, respectively. These cars’ MPGs are both the most extreme AND the most extreme for their class/displacement.
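The two residuals quoted above can be reproduced from the three-variable model, since a residual is actual MPG minus fitted MPG. This sketch uses only the two cars described in the text (the full residual column is not reproduced here), and the function name is illustrative.

```python
# Three-variable model; Compact is the baseline class.
coef = {"Intercept": 28.641, "DM": 4.487, "DL": 2.115, "Displacement": -1.640}

def expected_mpg(car_class, displacement):
    """Fitted MPG for a car of the given class and engine size."""
    dm = 1 if car_class == "Midsize" else 0
    dl = 1 if car_class == "Large" else 0
    return (coef["Intercept"] + coef["DM"] * dm + coef["DL"] * dl
            + coef["Displacement"] * displacement)

# car number: (class, displacement in liters, actual MPG), as given in the text
cars = {10: ("Midsize", 2.5, 33), 49: ("Compact", 5.7, 15)}

residuals = {car: mpg - expected_mpg(car_class, disp)
             for car, (car_class, disp, mpg) in cars.items()}

for car, resid in residuals.items():
    print(car, round(resid, 2))
```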