My E-Lab Tools Report

Regression Analysis with Diagnostic Tools

Data on highway deaths per 100 million vehicle miles and highway speed limits for 10 countries are given below:

(Death, Speed) = (3.0, 55), (3.3, 55), (3.4, 55), (3.5, 70), (4.1, 55), (4.3, 60), (4.7, 55), (4.9, 60), (5.1, 60), and (6.1, 75).

I use these data to test this program.

Results are:

Mean (x) = 60; Mean (y) = 4.24

Variance (x) = 50; Variance (y) = 0.9493333

Slope = 0.0755556; its standard error = 0.0407401

Intercept = -0.293333; its standard error = 2.459341

Correlation = 0.54833; its standard error = 0.2956633

F-statistic = 3.4394525; its P-value = 0.1008

Conclusion: Strong evidence against linearity

Report on the residuals:

Randomness: Strong evidence against randomness

Normality: No evidence against normality.

The numerical results agree with results from Excel. Notice that the p-value of 0.1008 for the F-statistic is so close to 0.10 that, at the 0.10 significance level, the result is borderline: strictly the null hypothesis of no linear relationship is not rejected, but I read the result as weak evidence of a linear relationship.
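As a cross-check that does not depend on Excel, the summary statistics can be recomputed from the ten data pairs with the usual least-squares formulas. This is a minimal sketch in Python, not the E-Lab program's own code; the variable names are mine, and the p-value is left out because the standard library has no F distribution.

```python
import math

# Data from the report: (deaths per 100 million vehicle miles, speed limit)
pairs = [(3.0, 55), (3.3, 55), (3.4, 55), (3.5, 70), (4.1, 55),
         (4.3, 60), (4.7, 55), (4.9, 60), (5.1, 60), (6.1, 75)]
y = [d for d, s in pairs]  # response: death rate
x = [s for d, s in pairs]  # predictor: speed limit
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products about the means
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

var_x = sxx / (n - 1)  # sample variances
var_y = syy / (n - 1)

slope = sxy / sxx
intercept = mean_y - slope * mean_x
r = sxy / math.sqrt(sxx * syy)

# Residual mean square, slope standard error, and F-statistic (1 and n-2 d.f.)
sse = syy - slope * sxy
mse = sse / (n - 2)
se_slope = math.sqrt(mse / sxx)
f_stat = (slope * sxy) / mse

print(f"mean x = {mean_x}, mean y = {mean_y:.2f}")
print(f"var x = {var_x}, var y = {var_y:.7f}")
print(f"slope = {slope:.7f} (SE {se_slope:.7f}), intercept = {intercept:.6f}")
print(f"r = {r:.5f}, F = {f_stat:.7f}")
```

The intercept is obtained as mean(y) minus slope times mean(x), so its sign follows directly from the slope and the two means.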

Notice also that there might be a scaling problem with the data, since the x and y values differ greatly in magnitude. One might therefore divide the X values by, say, 10 before constructing a regression model.

To overcome the non-randomness of the residuals, one may apply a log transformation to the X values.
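Both suggestions above are easy to try out. The sketch below, written in Python with a hypothetical helper `fit` of my own, refits the line after dividing X by 10 and after taking the natural log of X. Dividing X by a constant only changes the units of the slope; the intercept and the correlation stay the same. Whether the log transformation actually removes the non-randomness would still have to be checked with the program's residual report, which this sketch does not reproduce.

```python
import math

def fit(x, y):
    """Least-squares fit; returns (slope, intercept, correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)

# Data from the report
deaths = [3.0, 3.3, 3.4, 3.5, 4.1, 4.3, 4.7, 4.9, 5.1, 6.1]
speed = [55, 55, 55, 70, 55, 60, 55, 60, 60, 75]

slope, icpt, r = fit(speed, deaths)

# Suggestion 1: divide X by 10 (the factor 10 is the report's example)
slope_scaled, icpt_scaled, r_scaled = fit([s / 10 for s in speed], deaths)

# Suggestion 2: natural-log transformation of X
slope_log, icpt_log, r_log = fit([math.log(s) for s in speed], deaths)

print("original :", slope, icpt, r)
print("X / 10   :", slope_scaled, icpt_scaled, r_scaled)
print("log(X)   :", slope_log, icpt_log, r_log)
```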

Scatter Diagram

Data from textbook problem 12.8 are used in this program. Results are:

Scatter Diagram

The numbers represent observation-counts,
which may have repeated values or almost the same magnitude.

-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / 1 / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / 2 / -- / -- / -- / -- / --
-- / -- / -- / -- / 1 / -- / 1 / -- / -- / -- / -- / --
-- / -- / -- / -- / 1 / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / 1 / -- / -- / -- / 1 / -- / -- / --
-- / -- / -- / -- / 2 / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --
-- / -- / -- / -- / -- / -- / -- / -- / -- / -- / -- / --

No outlier was removed.

The resulting scatter diagram gives a visual, although not exact, rendition of the data, which is nevertheless acceptable for a high-level view of the data and of a possible linear relationship.
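To illustrate how such a diagram of observation counts can be built, here is a minimal sketch in Python. The binning rule (rounding each point to the nearest cell of a 12 by 12 grid spanning the data range) is my assumption, not necessarily what the program does, and the function names are hypothetical. Since the textbook data are not reproduced in this report, the highway data from the first section are used.

```python
def count_grid(xs, ys, size=12):
    """Bin points into a size x size grid of counts; row 0 holds the largest y."""
    x_min, y_min = min(xs), min(ys)
    x_span = (max(xs) - x_min) or 1  # avoid division by zero for constant data
    y_span = (max(ys) - y_min) or 1
    grid = [[0] * size for _ in range(size)]
    for xv, yv in zip(xs, ys):
        col = round((xv - x_min) / x_span * (size - 1))
        row = (size - 1) - round((yv - y_min) / y_span * (size - 1))
        grid[row][col] += 1  # nearby or repeated points share a cell
    return grid

def render(grid):
    """Format the grid as text, with '--' marking empty cells."""
    return "\n".join(
        " / ".join("--" if c == 0 else str(c) for c in row) for row in grid
    )

deaths = [3.0, 3.3, 3.4, 3.5, 4.1, 4.3, 4.7, 4.9, 5.1, 6.1]
speed = [55, 55, 55, 70, 55, 60, 55, 60, 60, 75]

grid = count_grid(speed, deaths)
print(render(grid))
```

Because several points can fall in the same cell, a cell shows a count rather than a single mark, which is why the picture is only an approximate rendition of the data.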

Testing the Population Correlation Coefficient

Data from textbook problem 12.8 are used. Results are:

H0: The population's correlation is about the claimed value.
Ha: The population's correlation is quite different from the claimed value.

With the claimed population’s correlation = 0, the results are:

Correlation (X, Y) = 0.9707253

Test-statistic = 2.97652

P-value = 0.00146

Conclusion: Very strong evidence against the null hypothesis.

Notice that the null hypothesis here is that the population correlation is 0. The test statistic is based on Fisher's transformation, which is more general because it can test any claimed value of the correlation, whereas the t-statistic given in our textbook is limited to testing zero correlation. For more technical details I followed the link back to Statistical Thinking for Decision-Making.
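The Fisher transformation test can be sketched in a few lines of Python. The statistic is z = (atanh(r) - atanh(rho0)) * sqrt(n - 3), compared with the standard normal distribution. Two things here are my assumptions and are not stated in the program output: the sample size n = 5, which I inferred because it reproduces the reported test statistic, and the reading of the reported p-value as a one-sided (upper-tail) probability. The function name is hypothetical.

```python
import math

def fisher_z_test(r, rho0, n):
    """Fisher z-test of H0: population correlation = rho0.

    Returns (z, upper-tail p-value, two-sided p-value).
    """
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    p_upper = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # P(|Z| > |z|)
    return z, p_upper, p_two_sided

# r is from the report; n = 5 is an assumption (see the note above)
z, p_upper, p_two_sided = fisher_z_test(0.9707253, 0.0, 5)
print(f"z = {z:.5f}, one-sided p = {p_upper:.5f}, two-sided p = {p_two_sided:.5f}")
```

Unlike the textbook t-statistic, the same function can test a nonzero claimed value simply by passing a different `rho0`.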

The program does conclude correctly that there is very strong evidence against the null hypothesis.