Activity 6: Regression and Correlation
Observations on rainfall volume (m³) and runoff volume for a particular location were recorded. These data can be obtained on the website. When you click on the link, the data should open automatically in Minitab.
This lab will cover some things we haven’t discussed in class, including Sections 12.4 and 12.5. The topics covered in this lab might be on the final exam.
A) Are the assumptions for regression reasonable in this case? Go to Stat > Regression > Regression and enter the variables. The response is runoff volume and the predictor is rainfall volume. Under ‘Graphs’, obtain a normal probability plot of the residuals and a plot of the residuals versus rainfall volume. Copy and paste these plots below. What do these plots indicate about the assumptions? (The model assumptions are on p. 500. The two plots indicate whether the normality assumption is satisfied and whether the assumption of constant variance σ² is satisfied. You should know how to check the former assumption using the normal probability plot. The latter assumption may be checked by verifying that the vertical spread of the residuals around the center line is roughly constant across the width of the plot of residuals versus rainfall.)
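For reference, the residuals that Minitab plots are simply the observed runoff values minus the fitted values from the least-squares line. The sketch below shows that calculation in Python. The data values are made-up placeholders, not the lab data, and the function name `fit_line` is hypothetical.

```python
# Made-up placeholder data, NOT the lab's rainfall/runoff measurements.
rainfall = [10, 20, 30, 40, 50]   # predictor x
runoff = [8, 17, 24, 35, 41]      # response y


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(rainfall, runoff)

# Residual = observed response minus fitted response.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(rainfall, runoff)]

# The (rainfall, residual) pairs are the points shown in the
# residuals-versus-rainfall plot.
for xi, ei in zip(rainfall, residuals):
    print(f"rainfall={xi:>3}  residual={ei:+.2f}")
```

Least-squares residuals always sum to zero (up to rounding), so the center line of the residual plot sits at zero. What matters for the constant-variance check is how the spread around that line changes from left to right.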
B) Is there a significant linear relationship between rainfall and runoff volume? Give the appropriate hypotheses, show how the test statistic (given in the output) was calculated, report the p-value (also given in the output), and make a complete conclusion. Hint: The linear relationship is significant as long as the estimated slope coefficient is significantly different from zero. If you’re stuck, refer to the model utility test on p. 526.
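As a rough illustration of how the test statistic for the slope is built, the sketch below computes t = (estimated slope) / (standard error of the slope) with n - 2 degrees of freedom. The data are made-up placeholders, not the lab data, and the function name `slope_t_statistic` is hypothetical. The p-value itself should still be read from the Minitab output.

```python
import math

# Made-up placeholder data, NOT the lab's rainfall/runoff measurements.
rainfall = [10, 20, 30, 40, 50]
runoff = [8, 17, 24, 35, 41]


def slope_t_statistic(x, y):
    """Return (slope, standard error of slope, t statistic, degrees of freedom)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Error sum of squares and the estimate s of sigma.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    se_b1 = s / math.sqrt(sxx)
    # Test statistic for H0: slope = 0.
    return b1, se_b1, b1 / se_b1, n - 2


b1, se_b1, t_stat, df = slope_t_statistic(rainfall, runoff)
print(f"slope={b1:.4f}  SE={se_b1:.4f}  t={t_stat:.2f}  df={df}")
```

In the Minitab output, the same ratio appears in the coefficient table: the slope's T value is its Coef divided by its SE Coef.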
C) The sample correlation, r, is a numerical measure of the strength of the linear relationship between two variables. It always lies between -1 and 1, and its properties are listed on p. 540. Its square is the coefficient of determination, and its sign (plus or minus) is the same as the sign of the slope of the regression line. What is the correlation between rainfall and runoff volume? Explain how you found it from the regression output.
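The relationship just described can be sketched in a few lines: take the square root of the coefficient of determination and attach the sign of the slope. The example below uses made-up placeholder data, not the lab data, and the helper name `r_from_regression_output` is hypothetical. It also computes r directly from the data to confirm that the two routes agree.

```python
import math

# Made-up placeholder data, NOT the lab's rainfall/runoff measurements.
rainfall = [10, 20, 30, 40, 50]
runoff = [8, 17, 24, 35, 41]


def r_from_regression_output(r_squared, slope):
    """Recover r from R-squared (as a proportion, not a percent) and the slope's sign."""
    return math.copysign(math.sqrt(r_squared), slope)


n = len(rainfall)
x_bar = sum(rainfall) / n
y_bar = sum(runoff) / n
sxx = sum((xi - x_bar) ** 2 for xi in rainfall)
syy = sum((yi - y_bar) ** 2 for yi in runoff)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(rainfall, runoff))

slope = sxy / sxx
r_squared = sxy ** 2 / (sxx * syy)        # coefficient of determination
r_direct = sxy / math.sqrt(sxx * syy)     # definition of the sample correlation

r_from_output = r_from_regression_output(r_squared, slope)
print(f"R-squared={r_squared:.4f}  r={r_from_output:.4f}")
```

If the regression output reports R-Sq as a percentage, divide it by 100 before taking the square root.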
D) Use Minitab to test for the absence of correlation. The alternative will be the 2-sided alternative on p. 544. Go to Stat > Basic Statistics > Correlation. Report a p-value for the test, and interpret it in plain English. (Is there evidence that a linear relationship between rainfall volume and runoff volume exists?)
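A common form of the test statistic for the hypothesis of zero correlation is t = r·sqrt(n - 2) / sqrt(1 - r²), with n - 2 degrees of freedom. The sketch below assumes that form and applies it to made-up placeholder data, not the lab data. The function name `correlation_t_statistic` is hypothetical.

```python
import math

# Made-up placeholder data, NOT the lab's rainfall/runoff measurements.
rainfall = [10, 20, 30, 40, 50]
runoff = [8, 17, 24, 35, 41]


def correlation_t_statistic(x, y):
    """Return (r, t statistic, degrees of freedom) for H0: correlation = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return r, t, n - 2


r, t_stat, df = correlation_t_statistic(rainfall, runoff)
print(f"r={r:.4f}  t={t_stat:.2f}  df={df}")
```

Algebraically, this t value equals the t statistic for the slope in simple linear regression, so the two-sided p-value from Stat > Basic Statistics > Correlation should match the slope's p-value in the regression output.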
E) In a year with 30 m³ of rainfall, what is a good estimate of the expected (mean) amount of runoff? What is a good estimate for the actual (mean plus error) runoff? The difference between these questions is the difference between a confidence interval for the mean (top of p. 532) and a prediction interval for a future observation (p. 535). Minitab can give you both intervals at the same time if you go to Stat > Regression > Regression and click the Options button. Type the value(s) of the predictor variable you want in the “new observations” box and make sure the confidence level is 95%. You do not need to check any of the storage boxes; the confidence interval for the mean (CI) and the prediction interval (PI) will be displayed automatically. Give each of these intervals below, and explain in plain English why the PI is wider than the CI.
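To see where the two intervals differ, the sketch below computes both from the standard simple-regression formulas. The only difference is an extra "1 +" under the square root for the prediction interval, which accounts for the random error of a single new observation. The data are made-up placeholders, not the lab data; the function name `mean_ci_and_pi` is hypothetical; and the critical value 3.182 is the t-table value for 95% confidence with 3 degrees of freedom, which fits only this five-point example.

```python
import math

# Made-up placeholder data, NOT the lab's rainfall/runoff measurements.
rainfall = [10, 20, 30, 40, 50]
runoff = [8, 17, 24, 35, 41]

# t critical value for a 95% two-sided interval with n - 2 = 3 df (from a t table).
T_CRIT = 3.182


def mean_ci_and_pi(x, y, x_star, t_crit):
    """Return (fitted value, CI for the mean response, PI for a new observation)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                      # estimate of sigma squared
    fit = b0 + b1 * x_star
    h = 1 / n + (x_star - x_bar) ** 2 / sxx
    se_mean = math.sqrt(s2 * h)             # uncertainty in the estimated mean
    se_pred = math.sqrt(s2 * (1 + h))       # adds the new observation's own error
    ci = (fit - t_crit * se_mean, fit + t_crit * se_mean)
    pi = (fit - t_crit * se_pred, fit + t_crit * se_pred)
    return fit, ci, pi


fit, ci, pi = mean_ci_and_pi(rainfall, runoff, 30, T_CRIT)
print(f"fit={fit:.2f}")
print(f"95% CI for mean runoff: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI for new runoff:  ({pi[0]:.2f}, {pi[1]:.2f})")
```

Because the prediction standard error always contains the extra error-variance term, the PI is wider than the CI at every value of the predictor, no matter how large the sample is.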