Biometry Assignment #11 – Simple Linear Regression (50 pts.)

Spring 2017

1 - Mercury Contamination of Walleyes in Island Lake Reservoir

Goal: Develop a regression model to predict/explain the mercury level found in the tissues of a walleye (ppm) using length (in.).

Data Set: Island Lake

This assignment is similar to your simple linear regression handout; however, I want you to investigate mercury contamination levels found in walleyes (ppm) versus the length of the walleye (in.). The primary interest is in developing a walleye consumption advisory based on length for walleyes in Island Lake Reservoir near Duluth, so let Y = HGPPM and X = LGTHIN.

Main Items to address:

a. Obtain a linear correlation measurement to initially investigate the linear relationship between these two variables. Are these variables linearly related to each other? Explain. (2 pts.)
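The correlation in item a can be checked by hand with a short script. The sketch below assumes the two columns have been exported into plain Python lists; the function name `pearson_r` and the numbers shown are made-up placeholders, not the Island Lake data.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up placeholder values for illustration only (NOT the Island Lake data).
lengths_in = [12.0, 15.0, 18.0, 21.0, 24.0]   # stands in for LGTHIN
hg_ppm = [0.20, 0.35, 0.45, 0.70, 0.80]       # stands in for HGPPM

r = pearson_r(lengths_in, hg_ppm)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear association; a value near 0 indicates little linear association, although a curved relationship may still be present.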

b. Perform the overall regression usefulness test (i.e. HO: Regression is not useful vs HA: Regression is useful) to formalize your initial investigation of these variables. What is your decision for this test? Write a conclusion in everyday language for this test. (2 pts.)

c. Perform the test to determine whether the slope of our regression line differs from zero (i.e. HO: slope for LGTHIN = 0 vs HA: slope for LGTHIN ≠ 0). What is your decision for this test? Write a conclusion using everyday language for this test. (2 pts.)

d. What is the RSquare value for this analysis? In the context of this problem, carefully explain what this number is measuring. (3 pts.)
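The quantities behind items b through d can be reproduced with the usual least-squares formulas. The following is a minimal sketch, assuming the data are in plain Python lists; `fit_simple_regression` is a hypothetical helper name and the numbers are made-up placeholders, not the Island Lake data. P-values are not computed here; they would come from your statistics software or from t and F tables with the error degrees of freedom shown.

```python
import math

def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x with summary statistics."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Error sum of squares from the fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)

    return {
        "slope": slope,
        "intercept": intercept,
        "r_square": 1 - sse / sst,                 # share of variation in y explained
        "rmse": math.sqrt(mse),
        "t_slope": slope / math.sqrt(mse / sxx),   # test statistic for slope = 0
        "f_ratio": (sst - sse) / mse,              # overall usefulness test statistic
        "df_error": n - 2,
    }

# Made-up placeholder values for illustration only (NOT the Island Lake data).
lengths_in = [12.0, 15.0, 18.0, 21.0, 24.0]
hg_ppm = [0.20, 0.35, 0.45, 0.70, 0.80]

fit = fit_simple_regression(lengths_in, hg_ppm)
for name, value in fit.items():
    print(name, round(value, 4))
```

With a single predictor, the F ratio equals the square of the slope t statistic, so the tests in items b and c always lead to the same decision.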

e. Create a scatterplot of the data with the estimated regression line added.

In the context of this problem, carefully interpret the y-intercept and slope of your estimated regression line. Again, carefully explain what these numbers are measuring. (You need to do more than say they are the y-intercept and slope of the line.) (4 pts.)

f. Discuss whether or not the assumptions for this procedure are being met. Also, identify any outliers in the data set. If there are problems, I do not expect you to try to fix them; just identify them, and for the purposes of the rest of the problem we will pretend they are not there. (4 pts.)
Checking the assumptions:
> Model Appropriate: Make sure no trends remain in the residual plot.
> Constant Variance: Make sure there are no megaphone patterns in the residual plot.
> Independence: Don’t really need to check this as these data are not collected over time.
> Normality: Make a histogram of the residuals and make sure they follow a normal distribution.
> Outliers: Any observations whose residuals fall outside ±2*RMSE are considered possible outliers.
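
The ±2*RMSE outlier screen in the checklist above can be sketched as follows. This assumes the slope and intercept come from your fitted line; `flag_possible_outliers` is a hypothetical helper name, and the data and coefficients below are made up purely to show the mechanics.

```python
import math

def flag_possible_outliers(x, y, slope, intercept):
    """Return (index, residual) pairs whose residual lies outside +/- 2*RMSE."""
    n = len(x)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # RMSE uses n - 2 degrees of freedom because two coefficients are estimated
    rmse = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return [(i, e) for i, e in enumerate(residuals) if abs(e) > 2 * rmse]

# Made-up example: points on the line y = 1 + 2x, with one value shifted up by 10.
x_demo = [float(i) for i in range(10)]
y_demo = [1.0 + 2.0 * xi for xi in x_demo]
y_demo[5] += 10.0

flagged = flag_possible_outliers(x_demo, y_demo, slope=2.0, intercept=1.0)
print(flagged)
```

The same list of residuals is what you would plot against X to look for leftover trends or megaphone patterns, and what you would put in a histogram to judge normality.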

g. It is recommended that humans should not consume more than one fish per month with mercury levels in its tissues greater than 0.5 ppm. Because your average walleye angler does not carry a gas spectrometer in their fishing boat, actually measuring the Hg level found in a walleye they have caught is a problem. However, it is very easy for an angler to measure the length of their walleye in inches.

Using your regression model, what length of walleye would you recommend for the “do not eat more than one walleye exceeding ______ inches per month” advisory? (2 pts.)

It is also recommended that humans should never consume fish with mercury levels exceeding 1 ppm in their tissues. Complete the following: “we recommend that you do not eat any walleyes exceeding ______ inches from Island Lake”. (2 pts.)
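
One way to approach item g is to solve the fitted equation, predicted Hg = intercept + slope × length, for the length at which the predicted mercury level reaches a threshold. The sketch below assumes a positive fitted slope; `length_at_threshold` is a hypothetical helper, and the coefficients are made-up placeholders, not the answers for the Island Lake data.

```python
def length_at_threshold(hg_threshold, slope, intercept):
    """Length at which the fitted line reaches the given mercury level."""
    if slope == 0:
        raise ValueError("slope is zero; the line never reaches the threshold")
    return (hg_threshold - intercept) / slope

# Made-up coefficients for illustration only; substitute your own estimates.
demo_slope = 0.05        # ppm per inch (placeholder)
demo_intercept = -0.40   # ppm (placeholder)

print(length_at_threshold(0.5, demo_slope, demo_intercept))   # one-per-month advisory
print(length_at_threshold(1.0, demo_slope, demo_intercept))   # never-eat advisory
```

This inverts the estimated mean line only, so it does not account for fish-to-fish variation around that line.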

h. Using your regression analysis, estimate the mean mercury level found in the population of walleyes in Island Lake that are the lengths below. Give both a single value estimate and a 95% confidence interval for the mean. Also give the correct interpretation of the confidence interval for each case.

a) 21.2 inches in length (Note: this is the actual length of one of the walleyes in the data) (3 pts.)

b) 11 inches in length (Note: this is the actual length of one of the walleyes in the data) (3 pts.)

i. Suppose you just caught a whopper walleye measuring 25.1 inches from Island Lake. What do you predict the mercury level would be in this particular fish? Give both a single value estimate and an interval estimate, giving the correct interpretation of the interval estimate. (3 pts.)
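
Items h and i use the same point estimate but different intervals: a confidence interval for the mean of all fish of a given length, and a wider prediction interval for one individual fish. The sketch below uses the textbook standard-error formulas; `interval_at` is a hypothetical helper, the critical value `t_crit` is assumed to be looked up from a t table with n - 2 degrees of freedom, and the data are made-up placeholders, not the Island Lake data.

```python
import math

def interval_at(x, y, x0, t_crit, kind="mean"):
    """Point estimate and interval at x0 from a simple linear regression.

    kind="mean"       -> confidence interval for the mean response
    kind="individual" -> prediction interval for a single new observation
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / (n - 2))

    fit = intercept + slope * x0
    core = 1 / n + (x0 - x_bar) ** 2 / sxx
    if kind == "mean":
        se = rmse * math.sqrt(core)
    elif kind == "individual":
        se = rmse * math.sqrt(1 + core)   # extra 1 covers fish-to-fish variation
    else:
        raise ValueError("kind must be 'mean' or 'individual'")
    return fit, fit - t_crit * se, fit + t_crit * se

# Made-up placeholder values for illustration only (NOT the Island Lake data).
lengths_in = [12.0, 15.0, 18.0, 21.0, 24.0]
hg_ppm = [0.20, 0.35, 0.45, 0.70, 0.80]
T_CRIT_DF3 = 3.182   # 97.5th percentile of t with 3 df (n - 2 for these 5 points)

fit_m, lo_m, hi_m = interval_at(lengths_in, hg_ppm, 18.0, T_CRIT_DF3, kind="mean")
fit_i, lo_i, hi_i = interval_at(lengths_in, hg_ppm, 18.0, T_CRIT_DF3, kind="individual")
print(fit_m, (lo_m, hi_m))
print(fit_i, (lo_i, hi_i))
```

For the real data, the critical value must match that data set's own n - 2 degrees of freedom. Both intervals widen as the length moves away from the average length in the sample.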

j. Would you recommend using your model to predict the mercury level for a walleye that is 8 inches in length? How about 29 inches? Explain your reasoning. (1 pt.)
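
A quick check that helps with item j is whether a proposed length falls inside the range of lengths actually observed, since the fitted line is only supported by data within that range. The sketch below uses a hypothetical helper and made-up placeholder lengths, not the Island Lake data.

```python
def within_observed_range(x_new, x_observed):
    """True if x_new lies between the smallest and largest observed x."""
    return min(x_observed) <= x_new <= max(x_observed)

# Made-up placeholder lengths for illustration only (NOT the Island Lake data).
lengths_in = [12.0, 15.0, 18.0, 21.0, 24.0]

for candidate in (8.0, 18.0, 29.0):
    print(candidate, within_observed_range(candidate, lengths_in))
```

Run the same check against the actual minimum and maximum of LGTHIN in your data before deciding.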

k. Would you recommend using this model to predict the mercury levels for walleyes in the Mississippi River? Explain. (1 pt.)

l. The Island Lake walleye data file also contains the weight (lbs.) for each of the fish sampled. Do you think using weight as opposed to length to establish consumption advisories is a good idea? Justify your answer. (2 pts.)

2 - Waist Circumference and Deep Abdominal AT

Goal: Develop a regression model to predict/explain deep abdominal AT (Y) using waist circumference (cm) as the predictor (X).

Data File: Waist Circumference

Despres et al. in “Estimate of Deep Abdominal Adipose-Tissue Accumulation from Simple Anthropometric Measurements in Men”, American Journal of Clinical Nutrition (1991), point out that the topography of adipose tissue (AT) is associated with metabolic complications considered as risk factors for cardiovascular disease. It is important, they state, to measure the amount of intraabdominal AT as part of the evaluation of the cardiovascular-disease risk of an individual. Computed tomography (CT), the only available technique that precisely and reliably measures the amount of deep abdominal AT, however, is costly and requires irradiation of the subject. In addition, the technique is not available to many physicians. Despres and his colleagues conducted a study to develop equations to predict the amount of deep abdominal AT from simple anthropometric measurements. Their subjects were men between the ages of 18 and 42 years who were free from metabolic disease that would require treatment. Among the measurements taken on each subject were deep abdominal AT obtained by CT and waist circumference (cm). The question of interest is how well one can predict and estimate deep abdominal AT from knowledge of waist circumference.

Main Items to address:

a.) Create a scatter plot of the data and compute the correlation between waist circumference and deep abdominal AT. Comment on what you see in this plot in terms of the relationship between deep abdominal AT and waist circumference. (2 pts.)

b.) In the context of this problem, carefully interpret the y-intercept and slope of your estimated regression line, i.e. carefully explain what these numbers are measuring. (You need to do more than say they are the y-intercept and slope of the line.) Also explain to a colleague how they would use this model to predict deep abdominal AT. (3 pts.)

c.) What is the RSquare value for this analysis? In the context of this problem, carefully explain what this number is measuring. (2 pts.)

d.) Discuss whether or not the regression assumptions are being met. Also, identify any outliers in the data set. (4 pts.)
Checklist for checking the regression model assumptions:
> Model Appropriate: Make sure no trends remain in the residual plot.
> Constant Variance: Make sure there are no megaphone patterns in the residual plot.
> Independence: Don’t really need to check this as these data are not collected over time.
> Normality: Make a histogram of the residuals and make sure they follow a normal distribution.
> Outliers: Any observations whose residuals fall outside ±2*RMSE are considered possible outliers.

e.) Give a 95% prediction interval for the deep abdominal AT for a 25-year-old man with a waist circumference of 105 cm. Interpret this interval. (2 pts.)
Note: There is an individual with a waist circumference of 105 cm in these data.

f.) Can we use the results of this study to predict the deep abdominal AT of an individual with a waist circumference of 135 cm? Explain. (1 pt.)

g.) Can we use the results of this study to predict the deep abdominal AT of a 50-year-old male with a waist circumference of 100 cm? Explain. (1 pt.)

h.) Can we use the results of this study to predict the deep abdominal AT of a 24-year-old female with a waist circumference of 70 cm? Explain. (1 pt.)
