My E-Lab Tools Report
Regression Analysis with Diagnostic Tools
Data on highway deaths per 100 million vehicle miles and highway speed limits for 10 countries are given below:
(Death, Speed) = (3.0, 55), (3.3, 55), (3.4, 55), (3.5, 70), (4.1, 55), (4.3, 60), (4.7, 55), (4.9, 60), (5.1, 60), and (6.1, 75).
I use these data, with the speed limit as x and the death rate as y, to test this program.
Results are:
Mean (x) = 60; Mean (y) = 4.24
Variance (x) = 50; Variance (y) = 0.9493333
Slope = 0.0755556; Its Standard Error = 0.0407401
Intercept = -0.293333; Its Standard Error = 2.4596341
Correlation = 0.54833; Its Standard Error = 0.2956633
F-Statistic = 3.4394525; Its P-value = 0.1008
Conclusion: Strong evidence against linearity
Report on the residuals:
Randomness: Strong evidence against randomness
Normality: No evidence against normality.
The numerical results agree with results from Excel. Notice that the p-value of 0.1008 for the F-statistic is marginally above 0.10, so strictly the null hypothesis of no linear relationship is not rejected at the 0.10 significance level. The result is borderline, however, and suggests weak evidence of a linear relationship.
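The reported quantities can be cross-checked without the E-Lab program or Excel. The following is a minimal sketch using only the Python standard library. It assumes the usual least-squares formulas and obtains the F p-value from the equivalent two-sided t-test with n - 2 degrees of freedom, integrated numerically. The helper name t_two_sided_p is my own and is not part of the E-Lab tool.

```python
import math

# (deaths per 100 million vehicle miles, speed limit) pairs from the report
data = [(3.0, 55), (3.3, 55), (3.4, 55), (3.5, 70), (4.1, 55),
        (4.3, 60), (4.7, 55), (4.9, 60), (5.1, 60), (6.1, 75)]
y = [d for d, _ in data]   # response: death rate
x = [s for _, s in data]   # predictor: speed limit
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

var_x = sxx / (n - 1)                      # sample variances
var_y = syy / (n - 1)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
r = sxy / math.sqrt(sxx * syy)

mse = (syy - slope * sxy) / (n - 2)        # residual mean square
se_slope = math.sqrt(mse / sxx)
se_intercept = math.sqrt(mse * (1 / n + mean_x ** 2 / sxx))
se_r = math.sqrt((1 - r ** 2) / (n - 2))
f_stat = (slope / se_slope) ** 2           # F(1, n-2) equals t squared


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value for Student's t, using Simpson's rule on the density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    b = abs(t)
    h = b / steps                          # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1 - 2 * (total * h / 3)


p_value = t_two_sided_p(math.sqrt(f_stat), n - 2)

print(f"slope={slope:.7f} (SE {se_slope:.7f})")
print(f"intercept={intercept:.6f} (SE {se_intercept:.7f})")
print(f"r={r:.5f} (SE {se_r:.7f})  F={f_stat:.7f}  p={p_value:.4f}")
```

Comparing the printed p-value with the chosen significance level makes the borderline nature of the result explicit.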
Notice also that there might be a scaling problem with the data, since x and y differ by roughly an order of magnitude. One might therefore divide the x values by, say, 10 before constructing a regression model.
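To see what such a rescaling actually changes, here is a short sketch that fits the line twice, once with the raw speed limits and once with the speed limits divided by 10. The function name fit is hypothetical. Under ordinary least squares the slope should scale by 10 while the intercept and the correlation stay the same, so the rescaling affects only how the coefficients read, not the quality of the fit.

```python
import math

deaths = [3.0, 3.3, 3.4, 3.5, 4.1, 4.3, 4.7, 4.9, 5.1, 6.1]
speeds = [55, 55, 55, 70, 55, 60, 55, 60, 60, 75]


def fit(x, y):
    """Return (slope, intercept, correlation) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)


raw = fit(speeds, deaths)
scaled = fit([s / 10 for s in speeds], deaths)   # x divided by 10

print("raw    slope, intercept, r:", raw)
print("scaled slope, intercept, r:", scaled)
```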
To overcome the non-randomness of the residuals, one may try a log transformation of the x values.
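Whether the transformation helps can be checked by counting runs of residual signs before and after it. The sketch below assumes a Wald-Wolfowitz runs test with a normal approximation. The report does not say which randomness test the E-Lab program applies, so this is only one plausible choice, and the function names are my own. Note that the observations are listed in increasing order of death rate, and any test based on the order of the residuals depends on that listing order.

```python
import math

deaths = [3.0, 3.3, 3.4, 3.5, 4.1, 4.3, 4.7, 4.9, 5.1, 6.1]
speeds = [55, 55, 55, 70, 55, 60, 55, 60, 60, 75]


def residuals(x, y):
    """Residuals of the least-squares line, in the order the data are listed."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def runs_test(values):
    """Runs test on residual signs: returns (runs, z, two-sided p)."""
    signs = [v > 0 for v in values if v != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2 * n_pos * n_neg / n + 1
    variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n * n * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))     # normal approximation
    return runs, z, p


raw_result = runs_test(residuals(speeds, deaths))
log_result = runs_test(residuals([math.log(s) for s in speeds], deaths))

print("raw x:  runs=%d z=%.3f p=%.4f" % raw_result)
print("log x:  runs=%d z=%.3f p=%.4f" % log_result)
```

With only 10 observations the normal approximation is rough, so the p-values are best read as indicative.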
Scatter Diagram
Data from textbook problem 12.8 are used in this program. Results are:
Scatter Diagram
The numbers represent observation-counts,
which may have repeated values or almost the same magnitude.
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / 1 / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / 2 / -- / -- / -- / -- / --
-- / -- / -- / -- / 1 / -- / 1 / -- / -- / -- / -- / --
-- / -- / -- / -- / 1 / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / 1 / -- / -- / -- / 1 / -- / -- / --
-- / -- / -- / -- / 2 / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --
No outlier was removed.
The resulting scatter diagram gives a visual, although not exact, rendition of the data, which is nevertheless acceptable for a high-level view of the data and of a possible linear relationship.
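A diagram of this kind can be produced by binning each observation into a cell of a fixed grid and printing the count per cell. The following is a rough sketch of that idea, not the E-Lab program's actual algorithm. The grid size (11 rows by 12 columns), the binning rule, and the function names are all assumptions, and the highway data listed earlier in this report serve only as sample input.

```python
def text_scatter(xs, ys, rows=11, cols=12):
    """Bin (x, y) points into a rows-by-cols grid of counts; row 0 is the largest y."""
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    x_span = (x_hi - x_lo) or 1          # avoid division by zero for constant data
    y_span = (y_hi - y_lo) or 1
    counts = [[0] * cols for _ in range(rows)]
    for x, y in zip(xs, ys):
        col = int((x - x_lo) / x_span * (cols - 1))
        row = int((y_hi - y) / y_span * (rows - 1))
        counts[row][col] += 1
    return counts


def render(counts):
    """Format the grid with '--' for empty cells and the count otherwise."""
    return "\n".join(
        " / ".join(str(c) if c else "--" for c in row) for row in counts
    )


speeds = [55, 55, 55, 70, 55, 60, 55, 60, 60, 75]
deaths = [3.0, 3.3, 3.4, 3.5, 4.1, 4.3, 4.7, 4.9, 5.1, 6.1]

grid = text_scatter(speeds, deaths)
print(render(grid))
```

Because several observations can fall into the same cell, a cell may show a count greater than 1, which is why the picture is only an approximate rendition of the data.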
Testing the Population Correlation Coefficient
Data from textbook problem 12.8 are used. Results are:
H0: The population's correlation is about the claimed value.
Ha: The population's correlation is quite different from the claimed value.
With the claimed population’s correlation = 0, the results are:
Correlation (X, Y) = 0.9707253
Test-statistic = 2.97652
P-value = 0.00146
Conclusion: Very strong evidence against the null hypothesis.
Notice that the null hypothesis here is that the population’s correlation is 0. The test statistic is based on Fisher's transformation, which is more general because it can test any claimed value for the correlation, whereas the t-statistic given in our textbook is limited to testing zero correlation. For more technical details, I followed the link (Back to: Statistical Thinking for Decision-Making).
The program does conclude correctly that there is very strong evidence against the null hypothesis.
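For reference, here is a minimal sketch of a test based on Fisher's transformation, alongside the textbook t-statistic for the zero-correlation case. The function names are hypothetical. The report does not state the sample size, so n = 5 below is an assumption, chosen only because it reproduces the reported test statistic. The p-value is computed as a one-sided normal tail, which is also an assumption about what the program reports.

```python
import math


def fisher_z_test(r, n, rho0=0.0):
    """Fisher z-test of H0: population correlation equals rho0.

    Returns (test statistic, upper-tail p-value under the normal approximation).
    """
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    p_upper = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_upper


def t_stat_zero_corr(r, n):
    """Textbook t-statistic, valid only for H0: correlation equals 0 (n - 2 df)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r_sample = 0.9707253   # correlation reported by the program
n_assumed = 5          # ASSUMPTION: sample size is not given in the report

z_stat, p_one_sided = fisher_z_test(r_sample, n_assumed, rho0=0.0)
print(f"Fisher z statistic = {z_stat:.5f}, one-sided p = {p_one_sided:.5f}")
print(f"t statistic (zero-correlation case only) = {t_stat_zero_corr(r_sample, n_assumed):.3f}")
```

Passing a nonzero rho0 tests any other claimed correlation, which the t-statistic cannot do.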