Title: Correlation

Topic: Correlation

Objectives:

1. To interpret correlation

2. To learn to roughly estimate correlation

3. To understand when correlation is not useful

4. To recognize the distinction between correlation and causation

State Core Reference: ______

Materials: A TI-84 or TI-83 for each group of 2-3 participants (to play “Guess the Correlation”). The game will need to be loaded onto the calculators.

Instructions and Activity Description:

Students will work through the following questions individually or in groups. There are four parts: ‘motivation’, ‘what is correlation?’, ‘interpreting correlation’, and ‘cautions’. Work through and discuss Part I before handing out the remainder of the task sheet. There are comments (in italics) in the outline below that do not appear on the student task sheets. After working through and discussing Parts I-IV, play “Guess the Correlation”.

Activity Guide

Part I. Motivation: % body fat is difficult to measure, so it would be helpful to use a more easily measured variable to estimate an individual’s % body fat.

A. Without looking at the data, which variable would you expect to be most helpful in estimating % body fat: age, weight, or abdomen circumference? Why?

B. Now considering the scatterplots below, which variable would you use to estimate % body fat? Why?

The relationship between % body fat and ab. circumference appears to be the strongest, so this variable should give the best estimate.

C. Which variable would give the worst estimate of % body fat? Explain.

Age does not appear to be associated with % body fat.

D. Use the variable you chose in B and the plots to estimate the % body fat of a 43-year-old who weighs 175 lbs and has an ab. circumference of 80 cm. About how accurate do you expect your estimate to be?

Estimates of error can be found by considering the spread in the plot. This is not examined in detail in this activity.

Scatterplots of pairs of body size variables for 252 men are shown below. (This dataset is available at lib.stat.cmu.edu/datasets/bodyfat)


Part II. What is correlation? The correlation, denoted r, between two variables describes how closely the points are clustered around a line and indicates the strength and direction of a linear association.

-1 ≤ r ≤ 1
o  r = -1 signifies perfect negative correlation.
o  r = 0 signifies no linear association.
o  r = 1 signifies perfect positive correlation.
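
For instructors who want to show how r is computed, the following is a minimal Python sketch. It assumes the usual Pearson formula: the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the data are made up for illustration and do not come from the body fat dataset.

```python
import math

def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustration data (not from the body fat dataset).
x = [1, 2, 3, 4, 5]
perfect_positive = [2 * v + 1 for v in x]   # points on a rising line
perfect_negative = [10 - 3 * v for v in x]  # points on a falling line

print(pearson_r(x, perfect_positive))  # points on a rising line give r = 1
print(pearson_r(x, perfect_negative))  # points on a falling line give r = -1
```

Real data rarely fall exactly on a line, so r for the body fat plots lies strictly between these two extremes.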

Use the scatterplots to estimate the correlation between the other pairs of variables:

1. The correlation between age and weight is about______.
The actual correlation is about -0.013.
2. The correlation between age and ab. circumference is about______.
The actual correlation is about 0.230.
3. The correlation between weight and ab. circumference is about______.
The actual correlation is about 0.888.
Name two variables for which the correlation would be negative.

Part III. Interpreting correlation: Students will discuss the following scenarios in pairs or in groups.

Main point: Correlation does not equal causation.

A. As abdominal circumference increases, so does age. Does this mean that an increase in your belly size will cause you to get older? Explain.

B. In a recent article entitled “The leaner you are, the richer you'll get” by T. Kostigen of MarketWatch, the author reported on a study that concluded “All other things considered… obese people tend to earn less [than average].” That is, weight is negatively correlated with income. The author summed up by writing “So don't be stupid: Get in shape to get rich.” Is the author’s advice justified by the study? Explain.
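
One common reason two variables are correlated without one causing the other is a third, lurking variable that influences both. The sketch below is a simulation with made-up numbers: a hypothetical variable z drives both x and y, which end up strongly correlated even though neither affects the other. The names z, x, y, and pearson_r are illustrative only and are not taken from the MarketWatch article or the study it reports.

```python
import math
import random

def pearson_r(xs, ys):
    """Correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 500

# Hypothetical lurking variable z that influences both x and y.
z = [rng.gauss(0, 1) for _ in range(n)]
# x and y each depend on z plus independent noise; x never enters y.
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

r_xy = pearson_r(x, y)

# Subtract z from each variable; what is left is independent noise.
x_left = [xi - zi for xi, zi in zip(x, z)]
y_left = [yi - zi for yi, zi in zip(y, z)]
r_left = pearson_r(x_left, y_left)

print(f"r between x and y: {r_xy:.2f}")
print(f"r after removing z: {r_left:.2f}")
```

Under these assumptions the theoretical correlation between x and y is 1 / (1 + 0.25) = 0.8, while the correlation after removing z is 0 in theory, so a sample of 500 points should land close to those values.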

Part IV. Cautions

Non-linear association: The correlation between temperature in Nottingham and time is about 0.05. Does this mean there is no association between the variables? Explain.

Correlation is a measure of linear association only. The variables ‘temperature’ and ‘time’ are strongly associated, as shown in the graph below, but the association is not linear. Show examples of other situations in which the correlation coefficient gives a misleading picture (outliers, quadratic association, etc.).
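
To make this caution concrete, here is a minimal Python sketch of two strongly patterned relationships whose correlation is at or near zero. Both data sets are made up for illustration: the quadratic example uses y = x squared on points symmetric about zero, and the seasonal example uses an assumed sine-wave temperature over 120 months. It is not the Nottingham series.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Quadratic association: y is completely determined by x, yet r is 0
# because the falling left half cancels the rising right half.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]
r_quadratic = pearson_r(x, y)

# Seasonal association: an assumed temperature that repeats every
# 12 months for 10 years (made-up values, degrees are arbitrary).
months = list(range(120))
temps = [10 + 8 * math.sin(2 * math.pi * m / 12) for m in months]
r_seasonal = pearson_r(months, temps)

print(f"quadratic: r = {r_quadratic:.3f}")
print(f"seasonal:  r = {r_seasonal:.3f}")
```

In both cases a scatterplot shows an obvious pattern, which is why the plot should always be examined alongside r.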

Part V. Play “Guess the Correlation”.

Notes to Self:

Correlation Activity Task Sheet

The body fat percentage for an individual is a difficult variable to measure. One way to get around the difficulty is to use a more easily measured variable to estimate % body fat.

Part I: Motivation

1. The age, weight, and abdomen circumference of an individual are easily obtained. Which variable would you expect to be most helpful in estimating body fat? Why?

2. Consider the scatterplots below. The three plots show % body fat plotted against age, weight, and abdomen circumference. Based on what you see in the plots, which variable would you use to estimate % body fat? Why?

3. Which of the three variables would you expect to give the worst estimate of % body fat? Explain.

4. Using only the variable you chose in question 2 and the plots below, estimate the % body fat of a 43-year-old who weighs 175 lbs and has an abdomen circumference of 80 cm. About how accurate do you expect your estimate to be?

Scatterplots of pairs of body size variables for 252 men are shown below. (This dataset is available at lib.stat.cmu.edu/datasets/bodyfat)

Part II. What is correlation?

The correlation between two variables describes how closely the points are clustered around a line and indicates the strength and direction of a linear association. It is described by the correlation coefficient, r.

-1 ≤ r ≤ 1
o  r = -1 signifies perfect negative correlation.
o  r = 0 signifies no linear association.
o  r = 1 signifies perfect positive correlation.

Use the scatterplots to estimate the correlation between the other pairs of variables:

1. The correlation between age and weight is about______.
2. The correlation between age and ab. circumference is about______.
3. The correlation between weight and ab. circumference is about______.
Name two variables for which the correlation would be negative.

Part III. Interpreting correlation

Discuss the following scenarios in pairs or with groups.

1. As abdominal circumference increases, so does age. Does this mean that an increase in your belly size will cause you to get older? Explain.


2. In a recent article entitled “The leaner you are, the richer you'll get” by T. Kostigen of MarketWatch, the author reported on a study that concluded “All other things considered… obese people tend to earn less [than average].” That is, weight is negatively correlated with income. The author summed up by writing “So don't be stupid: Get in shape to get rich.” Is the author’s advice justified by the study? Explain.

Part IV. Cautions

Non-linear association: The correlation between temperature in Nottingham and time is about 0.05. Does this mean there is no association between the variables? Explain.