Pollution and Mortality Stepwise Selection (Part I)
Output from Stepwise (menu version)
*** Stepwise Regression ***
*** Stepwise Model Comparisons ***
Start: AIC= 98206.93
MORTALITY ~ PRECIP + HUMIDITY + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + SOUND + DENSITY + NONWHITE + WHITECOL + POOR
Single term deletions
Model:
MORTALITY ~ PRECIP + HUMIDITY + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + SOUND + DENSITY +
NONWHITE + WHITECOL + POOR
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                    63229.12    98206.9
PRECIP     1    4884.88   68114.00   100401.2
HUMIDITY   1      14.46   63243.58    95530.8
JANTEMP    1    6316.09   69545.21   101832.4
JULYTEMP   1    3747.87   66976.99    99264.2
OVER65     1     415.17   63644.29    95931.5
HOUSE      1    2741.69   65970.81    98258.0
EDUC       1    4660.46   67889.57   100176.8
SOUND      1      57.89   63287.01    95574.2
DENSITY    1    2189.56   65418.68    97705.9
NONWHITE   1   33120.46   96349.58   128636.8
WHITECOL   1      79.36   63308.48    95595.7
POOR       1      63.04   63292.16    95579.4
Step: AIC= 95530.79
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + SOUND + DENSITY + NONWHITE + WHITECOL + POOR
Single term deletions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + SOUND + DENSITY + NONWHITE +
WHITECOL + POOR
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                    63243.58    95530.8
PRECIP     1    5040.61   68284.19    97880.8
JANTEMP    1    6616.78   69860.36    99457.0
JULYTEMP   1    4621.08   67864.66    97461.3
OVER65     1     406.09   63649.67    93246.3
HOUSE      1    2727.89   65971.48    95568.1
EDUC       1    4718.21   67961.79    97558.4
SOUND      1      51.42   63295.01    92891.6
DENSITY    1    2249.71   65493.29    95089.9
NONWHITE   1   33121.69   96365.27   125961.9
WHITECOL   1      77.44   63321.02    92917.6
POOR       1      57.75   63301.33    92897.9
Single term additions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + SOUND + DENSITY + NONWHITE +
WHITECOL + POOR
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                    63243.58   95530.79
HUMIDITY   1    14.4627   63229.12   98206.93
Step: AIC= 92891.62
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + DENSITY + NONWHITE + WHITECOL + POOR
Single term deletions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + DENSITY + NONWHITE + WHITECOL +
POOR
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                    63295.01    92891.6
PRECIP     1    5155.58   68450.59    95356.6
JANTEMP    1    9717.81   73012.82    99918.8
JULYTEMP   1    4711.22   68006.23    94912.2
OVER65     1     435.81   63730.81    90636.8
HOUSE      1    2757.86   66052.87    92958.9
EDUC       1    5428.63   68723.63    95629.6
DENSITY    1    2209.05   65504.06    92410.1
NONWHITE   1   33200.71   96495.72   123401.7
WHITECOL   1      57.79   63352.80    90258.8
POOR       1      15.60   63310.60    90216.6
Single term additions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + DENSITY + NONWHITE + WHITECOL +
POOR
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                    63295.01   92891.61
HUMIDITY   1    7.99578   63287.01   95574.22
SOUND      1   51.42473   63243.58   95530.79
Step: AIC= 90216.61
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + DENSITY + NONWHITE + WHITECOL
Single term deletions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + DENSITY + NONWHITE + WHITECOL
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                     63310.6    90216.6
PRECIP     1    5333.89    68644.5    92859.9
JANTEMP    1   14436.46    77747.1   101962.5
JULYTEMP   1    5788.63    69099.2    93314.6
OVER65     1     571.09    63881.7    88097.1
HOUSE      1    2892.42    66203.0    90418.4
EDUC       1    5655.27    68965.9    93181.3
DENSITY    1    2661.80    65972.4    90187.8
NONWHITE   1   46479.10   109789.7   134005.1
WHITECOL   1      53.60    63364.2    87579.6
Single term additions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + DENSITY + NONWHITE + WHITECOL
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                    63310.60   90216.61
HUMIDITY   1    7.54469   63303.06   92899.67
SOUND      1    9.27203   63301.33   92897.94
POOR       1   15.59837   63295.01   92891.61
Step: AIC= 87579.61
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + DENSITY + NONWHITE
Single term deletions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + DENSITY + NONWHITE
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                     63364.2    87579.6
PRECIP     1    5425.43    68789.6    90314.4
JANTEMP    1   14469.14    77833.3    99358.2
JULYTEMP   1    6095.32    69459.5    90984.3
OVER65     1     589.85    63954.1    85478.9
HOUSE      1    2839.41    66203.6    87728.4
EDUC       1    9826.79    73191.0    94715.8
DENSITY    1    2608.21    65972.4    87497.2
NONWHITE   1   47493.24   110857.4   132382.2
Single term additions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + OVER65 + HOUSE + EDUC + DENSITY + NONWHITE
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                    63364.21   87579.61
HUMIDITY   1    7.40322   63356.80   90262.81
SOUND      1    5.37617   63358.83   90264.84
WHITECOL   1   53.60301   63310.60   90216.61
POOR       1   11.40700   63352.80   90258.81
Step: AIC= 85478.86
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + HOUSE + EDUC + DENSITY + NONWHITE
Single term deletions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + HOUSE + EDUC + DENSITY + NONWHITE
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                     63954.1    85478.9
PRECIP     1    5421.96    69376.0    88210.2
JANTEMP    1   14954.43    78908.5    97742.7
JULYTEMP   1    5601.36    69555.4    88389.6
HOUSE      1    2562.10    66516.2    85350.4
EDUC       1    9758.83    73712.9    92547.1
DENSITY    1    2983.22    66937.3    85771.5
NONWHITE   1   65702.71   129656.8   148491.0
Single term additions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + HOUSE + EDUC + DENSITY + NONWHITE
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                    63954.06   85478.86
HUMIDITY   1     0.0990   63953.96   88169.37
OVER65     1   589.8494   63364.21   87579.61
SOUND      1     3.5439   63950.51   88165.92
WHITECOL   1    72.3655   63881.69   88097.10
POOR       1   139.3969   63814.66   88030.07
Step: AIC= 85350.36
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + EDUC + DENSITY + NONWHITE
Single term deletions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + EDUC + DENSITY + NONWHITE
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                     66516.2    85350.4
PRECIP     1    6681.27    73197.4    89341.0
JANTEMP    1   12537.37    79053.5    95197.1
JULYTEMP   1    6093.75    72609.9    88753.5
EDUC       1    7403.87    73920.0    90063.6
DENSITY    1    6540.34    73056.5    89200.1
NONWHITE   1   73230.76   139746.9   155890.5
Single term additions
Model:
MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + EDUC + DENSITY + NONWHITE
scale: 1345.3
          Df  Sum of Sq        RSS         Cp
<none>                    66516.16   85350.36
HUMIDITY   1      0.298   66515.86   88040.67
OVER65     1    312.544   66203.61   87728.42
HOUSE      1   2562.100   63954.06   85478.86
SOUND      1      6.230   66509.93   88034.73
WHITECOL   1     19.630   66496.53   88021.33
POOR       1     30.794   66485.36   88010.17
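The Cp column in the trace above can be checked by hand. The printed values are consistent with Cp = RSS + 2 * p * scale, where p counts the coefficients in the candidate model (intercept included) and scale is the error variance estimate from the full 12-predictor model. What follows is a minimal sketch in Python (not S-PLUS) under that assumption. The sample size n = 60 is inferred from the 53 residual degrees of freedom reported for the 7-coefficient final model, and the helper name cp_from_rss is made up for illustration.

```python
# Numbers are copied from the stepwise trace; nothing here is fitted from data.
N_OBS = 60            # assumption: 53 residual df + 7 coefficients in the final model
FULL_RSS = 63229.12   # RSS of the full 12-predictor model
FULL_COEFS = 13       # 12 predictors + intercept

# "scale" is taken to be the full-model error variance estimate, RSS / residual df.
scale = FULL_RSS / (N_OBS - FULL_COEFS)

def cp_from_rss(rss, n_coefs):
    """Unscaled Cp as it appears in the trace: RSS + 2 * n_coefs * scale."""
    return rss + 2 * n_coefs * scale

cp_full = cp_from_rss(FULL_RSS, 13)           # trace reports 98206.93
cp_drop_humidity = cp_from_rss(63243.58, 12)  # trace reports 95530.79
cp_final = cp_from_rss(66516.16, 7)           # trace reports 85350.36

# First "Single term deletions" table: the term whose removal gives the
# smallest Cp is the one dropped at that step.
first_step_cp = {
    "PRECIP": 100401.2, "HUMIDITY": 95530.8, "JANTEMP": 101832.4,
    "JULYTEMP": 99264.2, "OVER65": 95931.5, "HOUSE": 98258.0,
    "EDUC": 100176.8, "SOUND": 95574.2, "DENSITY": 97705.9,
    "NONWHITE": 128636.8, "WHITECOL": 95595.7, "POOR": 95579.4,
}
dropped = min(first_step_cp, key=first_step_cp.get)

print("scale:", round(scale, 1))
print("Cp values:", round(cp_full, 2), round(cp_drop_humidity, 2), round(cp_final, 2))
print("first term dropped:", dropped)
```

Because scale is held fixed at the full-model value throughout, removing a term changes Cp by (Sum of Sq) - 2 * scale, so a deletion lowers Cp only when its Sum of Sq is below about 2690.6.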
*** Linear Model ***
Call: lm(formula = MORTALITY ~ PRECIP + JANTEMP + JULYTEMP + EDUC + DENSITY + NONWHITE, data = ex1217,
na.action = na.exclude)
Residuals:
    Min     1Q Median     3Q    Max
 -80.68 -21.53  1.424  22.77  83.05
Coefficients:
                Value Std. Error  t value Pr(>|t|)
(Intercept) 1242.4388   123.2833  10.0779   0.0000
PRECIP         1.4013     0.6073   2.3073   0.0250
JANTEMP       -1.6845     0.5330  -3.1607   0.0026
JULYTEMP      -2.8403     1.2890  -2.2035   0.0319
EDUC         -16.1578     6.6524  -2.4289   0.0186
DENSITY        0.0076     0.0033   2.2828   0.0265
NONWHITE       5.2753     0.6906   7.6387   0.0000
Residual standard error: 35.43 on 53 degrees of freedom
Multiple R-Squared: 0.7086
F-statistic: 21.48 on 6 and 53 degrees of freedom, the p-value is 1.305e-012
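Several numbers in this summary follow from quantities already printed, which gives a quick consistency check on the final model. The sketch below is Python (not S-PLUS) and uses only values copied from the output above: the residual standard error is sqrt(RSS / df), the overall F-statistic can be recovered from R-squared, and each t value is the coefficient divided by its standard error. Small discrepancies are expected because the inputs are rounded.

```python
import math

RSS = 66516.16       # RSS of the final model, from the last stepwise table
RESID_DF = 53        # residual degrees of freedom
N_PREDICTORS = 6     # PRECIP, JANTEMP, JULYTEMP, EDUC, DENSITY, NONWHITE
R_SQUARED = 0.7086   # Multiple R-Squared

# Residual standard error = sqrt(RSS / residual df); the summary reports 35.43.
resid_se = math.sqrt(RSS / RESID_DF)

# Overall F = (R^2 / k) / ((1 - R^2) / residual df); the summary reports 21.48.
f_stat = (R_SQUARED / N_PREDICTORS) / ((1 - R_SQUARED) / RESID_DF)

# t value = estimate / standard error; the summary reports 2.3073 for PRECIP.
t_precip = 1.4013 / 0.6073

print(round(resid_se, 2), round(f_stat, 2), round(t_precip, 3))
```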
Alternative Search using the Command Line:
The function leaps(X, Y, method="Cp") finds the "best" subsets for each number of predictors, ranking them by one of the model selection criteria "Cp", "r2", or "adjr2".
The output is a list with four components giving information on the regression subsets. The components Cp (or adjr2 or r2), size, and label all have the same length, one element per subset, which equals the number of rows in which.
Cp, adjr2, r2
    The first returned component is named Cp, adjr2, or r2, depending on the method used to evaluate the subsets. It gives the values of the chosen statistic. If r2 or adjr2 is used, the result is in percent.
size
    The number of explanatory variables (including the constant term if int is TRUE) in each subset.
label
    A vector of character strings, each element giving the names of the variables in the subset.
which
    A logical matrix with as many rows as there are returned subsets. Each row is a logical vector that can be used to select the columns of x in the subset.
int
    A logical value telling whether the which matrix contains, as its first column, the status of the intercept variable.
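To make the returned components concrete, here is a small Python sketch (not S-PLUS) of an all-subsets search scored by Mallows' Cp in its usual scaled form, Cp = RSS_p / s^2 - n + 2p, with s^2 taken from the full model. Under that definition the full model always has Cp equal to its own p. The sketch enumerates every subset by brute force, which is a stand-in for the more efficient branch-and-bound search that leaps is named after. The data are an invented toy example, and all function and variable names are hypothetical.

```python
from itertools import combinations

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def rss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus the given columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)                            # normal equations
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def all_subsets_cp(x, y):
    """Return (Cp, size, label) for every subset, sorted so the smallest Cp comes first."""
    names = list(x)
    n = len(y)
    full_p = len(names) + 1                           # predictors + intercept
    s2 = rss([x[v] for v in names], y) / (n - full_p)
    results = []
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            p = k + 1                                 # size counts the intercept
            cp = rss([x[v] for v in subset], y) / s2 - n + 2 * p
            results.append((cp, p, ",".join(subset)))
    return sorted(results)

# Invented toy data: y depends on x1 and x2 only; x3 is irrelevant.
x = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "x3": [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0],
}
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, 0.2, -0.2]
y = [3 + 2 * a - 1.5 * b + e for a, b, e in zip(x["x1"], x["x2"], noise)]

table = all_subsets_cp(x, y)
for cp, size, label in table:
    print(f"{cp:10.2f}  p={size}  {label or '(intercept only)'}")
```

Each tuple plays the role of one element of the Cp, size, and label components. The which matrix is omitted, and unlike leaps the sketch returns every subset rather than only the best few of each size.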
> leaps1217 <- leaps(ex1217[,3:14], ex1217[,2])   # X = columns 3 to 14 (predictors), Y = column 2 (response)
> plot(leaps1217$size, leaps1217$Cp, ylab="Mallows' Cp", xlab="p (Number of Coefficients in the Model)")
> title("Cp Plot for Mortality Data without Pollution Variables")
> abline(0,1)   # reference line Cp = p
> leaps1217$rank = rank(leaps1217$Cp)   # gives the rank of each model
> leaps1217$best = sort.list(leaps1217$Cp)   # leaps1217$best[1] is the index of the best model (number 51)
> leaps1217$label[51]   # included variables:
"PRECIP,JANTEMP,JULYTEMP,EDUC,DENSITY,NONWHITE"
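The rank and sort.list calls answer two different questions: rank gives each model's position in the Cp ordering, while sort.list gives the indices of the models in order of increasing Cp, so its first element points at the best model. A minimal Python sketch of the distinction follows. Since the actual leaps1217$Cp vector is not shown, it uses the seven Cp values from the "Start" and "Step" lines of the stepwise trace as stand-in data, and it ignores ties, which rank would average.

```python
# Stand-in data: Cp at each stage of the stepwise trace (not the leaps output).
cp = [98206.93, 95530.79, 92891.62, 90216.61, 87579.61, 85478.86, 85350.36]

# Analogue of sort.list(cp): indices that put cp in increasing order (0-based here).
order = sorted(range(len(cp)), key=lambda i: cp[i])

# Analogue of rank(cp) when there are no ties: 1 = smallest Cp.
ranks = [0] * len(cp)
for position, index in enumerate(order):
    ranks[index] = position + 1

best_index_one_based = order[0] + 1   # S-PLUS indices start at 1
print("order (1-based):", [i + 1 for i in order])
print("ranks:", ranks)
print("best model is number", best_index_one_based)
```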
Next time: BIC