Scatter Plot, Coefficient of Correlation (r), Linear Regression Line and Equation

  1. In order to display r, you must first turn on the diagnostics with the DiagnosticOn command.

To do so, choose 2nd CATALOG, which is found above the 0 key.

Use the down arrow to scroll to DiagnosticOn, and press ENTER twice.

  2. To plot the scatter plot, type the x values in L1 and the y values in L2. Choose 2nd STAT PLOT (above Y=). Turn Plot 1 on. Choose the first type of graph. Make sure the Xlist and Ylist are correct. Choose ZOOM, then ZoomStat (9) to set the window and graph.
  3. To draw the line of fit and find the r value and the values for the equation (y=ax+b), choose STAT, CALC, option 4: LinReg(ax+b).

Choose the VARS button (next to CLEAR), right arrow to Y-VARS, option 1: Function, press ENTER, which will place Y1 after the command LinReg(ax+b). Press ENTER. Alternatively, type the equation into Y= and graph.

  4. Press GRAPH, and the line of fit (the regression line) will be drawn.
  5. Linear Regression Hypothesis T Test

a)State the null and alternative hypotheses: H0: ρ = 0, Ha: ρ ≠ 0

b)Specify the significance – given

c)Identify the degrees of freedom as n-2

d)Determine CV and rejection regions

e)Find the t statistic: STAT, TESTS, choose E: LinRegTTest, enter the lists used, select β & ρ: ≠0, then Calculate.

f)Make a decision to reject or fail to reject Ho

g)Interpret decision.
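As a cross-check on steps c) and e), the sketch below computes r, the degrees of freedom, and the t statistic by hand in Python, using the safety-training data from Problem #1 of this worksheet. The function names `correlation` and `t_statistic` are illustrative choices, not calculator commands, and the formula t = r·√(n−2)/√(1−r²) is the standard test statistic for H0: ρ = 0. It should agree with the t value that LinRegTTest reports.

```python
from math import sqrt

# Problem #1 data from this worksheet:
# x = work-hours of safety training, y = work-hours lost to accidents
x = [10.0, 19.5, 30.0, 45.0, 50.0, 65.0, 80.0]
y = [80, 65, 68, 55, 35, 10, 12]

def correlation(xs, ys):
    """Sample linear correlation coefficient r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

n = len(x)
r = correlation(x, y)
df = n - 2                 # step c): degrees of freedom
t = t_statistic(r, n)      # step e): t statistic
print(f"r = {r:.3f}, df = {df}, t = {t:.2f}")
```

The critical value in step d) still comes from a t table for the given significance level and df; for instance, at a two-tailed significance level of 0.05 with 5 degrees of freedom the table value is 2.571.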

Excel directions

Type the data: x in column A and y in column B. Choose Insert, Scatter to sketch the graph. To find the linear correlation coefficient, click fx, choose Statistical, then CORREL. Type in the two input ranges, A1:A? and B1:B?, and the correlation coefficient, r, will appear. Then choose LINEST, enter the y range (B1:B?) first and the x range (A1:A?) second, and set both Boolean options to TRUE; this will give you the coefficients for the linear equation: the slope and the y-intercept (y=mx+b). Or, under Chart Tools, choose Design. On the far left choose Add Chart Element, then Trendline, More Trendline Options. A window will open on the right of the screen. Check Display Equation and Display R-squared value.

Problem #1:

A large industrial plant has seven divisions that do the same type of work. A safety inspector visits each division of 20 workers quarterly. The number x of work-hours devoted to safety training and the number y of work-hours lost due to industry-related accidents are recorded for each separate division.

| Data | x | y |
|---|---|---|
| 1 | 10.0 | 80 |
| 2 | 19.5 | 65 |
| 3 | 30.0 | 68 |
| 4 | 45.0 | 55 |
| 5 | 50.0 | 35 |
| 6 | 65.0 | 10 |
| 7 | 80 | 12 |

  1. Make a scatter diagram for these pairs of data. Sketch below.
  2. According to the scatter plot, as the number of hours spent on safety training increases, what happens to the number of hours lost due to industry related accidents? ___decreases___
  3. Does this indicate a positive or negative correlation? __negative______
  4. Complete the steps to find the correlation coefficient, the line of fit and the values for the linear regression equation.

1)Does the line of fit, fit reasonably well? ___yes______

2)What is the correlation coefficient, r, for the safety report? r= _-0.953______

3)r is negative; why? _As safety training hours increase, hours lost due to accidents decrease.

4)Does the correlation coefficient, r, substantiate your conclusion above? ___yes_____

5)What does a positive correlation tell you? _As x increases, y also increases____

  1. Write the equation of the least-squares line (line of fit). ___y = -1.066x + 92.057___
  2. What is the slope of the line of fit? ___-1.066___
  3. What is the marginal change in work-hours lost due to industry-related accidents for each additional work-hour of safety training? ___-1.066___
  4. If 85 hours were devoted to safety training, how many hours would be lost due to work-related accidents? ___1.447 hours___
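The Problem #1 answers can be checked with a short least-squares sketch in Python. The helper name `linreg` is an illustrative choice, and the coefficients are rounded to three decimals before predicting on the assumption that the answer key worked from the rounded equation.

```python
from math import sqrt

# Problem #1 data: x = work-hours of safety training, y = work-hours lost
x = [10.0, 19.5, 30.0, 45.0, 50.0, 65.0, 80.0]
y = [80, 65, 68, 55, 35, 10, 12]

def linreg(xs, ys):
    """Return (slope a, intercept b, correlation r) for the line y = ax + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    syy = sum((yi - mean_y) ** 2 for yi in ys)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    a = sxy / sxx                 # slope = marginal change in y per unit x
    b = mean_y - a * mean_x       # intercept
    r = sxy / sqrt(sxx * syy)     # correlation coefficient
    return a, b, r

a, b, r = linreg(x, y)

# Round the coefficients as the worksheet does, then predict at x = 85.
a3, b3 = round(a, 3), round(b, 3)
prediction = a3 * 85 + b3

print(f"r = {r:.3f}")
print(f"y = {a3}x + {b3}")
print(f"predicted hours lost at x = 85: {prediction:.3f}")
```

With the unrounded coefficients the prediction at 85 hours comes out to about 1.41 hours; the 1.447 in the answer key follows from rounding the slope and intercept first.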

Problem # 2:

Throughout the world, natural ocean beaches are beautiful to see. If you have visited natural beaches, you may have noticed that when the gradient or drop-off is steep, the grains of sand tend to be larger. In fact, a manmade beach with the “wrong” size granules of sand tends to be washed away and eventually replaced with the proper size grains of sand by the action of the ocean and the gradient of the bottom.

In the data that follows, x = median diameter (mm) of granules of sand, and y = gradient of beach slopes in degrees on the natural ocean beaches.

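| Data | x | y | Data | x | y |
|---|---|---|---|---|---|
| 1 | 0.17 | 0.63 | 7 | 0.66 | 9.62 |
| 2 | 0.19 | 0.70 | 8 | 0.30 | 1.50 |
| 3 | 0.22 | 0.82 | 9 | 0.35 | 4.40 |
| 4 | 0.235 | 1.15 | 10 | 0.42 | 7.30 |
| 5 | 0.235 | 0.88 | 11 | 0.85 | 11.30 |
| 6 | 0.51 | 8.11 | 12 | 1.00 | 12.40 |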

  1. Make a scatter diagram for these pairs of data. Sketch below.
  2. According to the scatter plot, as the diameter of the sand increases, what happens to the gradient of the beach in degrees? ____increases______
  3. Does this indicate a positive or negative correlation? __positive______
  4. Complete the steps to find the correlation coefficient, the line of fit and the values for the linear regression equation.

a)Does the line of fit, fit reasonably well? _yes______

b)What is the correlation coefficient, r, for the beach data? r= _0.959______

c)r is positive; why? _As the diameter of the granules increases, the gradient of the beach increases.

d)Does the correlation coefficient, r, substantiate your conclusion above? ___yes_____

e)What does a positive correlation tell you? __As x values increase, so do y values

f)Write the equation of the least-squares line (line of fit). __y = 15.981x − 1.944_____

1)What is the slope of the line of fit? __15.981______

2)What is the marginal change in the gradient of the beach (degrees) for each 1 mm increase in the diameter of the sand? __15.981 degrees per mm______

3)If the size of the sand granule is 0.60 mm, what gradient, in degrees, is needed to keep the sand from washing away from the beach? __7.645 degrees______
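The Problem #2 line of fit and prediction can be worked by hand from the column sums. The derivation below is a sketch that assumes the twelve data pairs read as x = 0.17, 0.19, 0.22, 0.235, 0.235, 0.51, 0.66, 0.30, 0.35, 0.42, 0.85, 1.00 with the matching y values from the table, so that n = 12, Σx = 5.14, Σy = 58.81, Σxy = 38.44385 and Σx² = 3.03095.

```latex
\begin{align*}
a &= \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}
   = \frac{38.44385 - \frac{(5.14)(58.81)}{12}}{3.03095 - \frac{(5.14)^2}{12}}
   \approx \frac{13.2536}{0.8293} \approx 15.981 \\[4pt]
b &= \bar{y} - a\,\bar{x}
   \approx 4.9008 - (15.981)(0.4283) \approx -1.944 \\[4pt]
\hat{y}(0.60) &= 15.981(0.60) - 1.944 = 9.5886 - 1.944 \approx 7.645 \text{ degrees}
\end{align*}
```

The slope a is also the marginal change: each additional millimetre of median grain diameter is associated with roughly 15.981 more degrees of beach gradient over the range of the data.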

Group Correlation Exercise:

Air Force servicemen and servicewomen were tested for maximal oxygen use in a 12 min distance run versus a 1.5 mile run. The participant runs on level terrain for the prescribed distance. The technician records the participant's time to the closest second and the heart rate for 15 s immediately after the participant crosses the finish line, and then estimates the VO2 max consumption using a Wilmore and Bergfeld fitness table. The data for both the 12 min and the 1.5 mile run, with the corresponding VO2 maximum oxygen levels, are listed below.

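Data: 1.5 Mile Run (n=29) and 12 Min Distance (n=25)

| 1.5 Mile Run: x = Time (min) | 1.5 Mile Run: y = VO2 Max (ml/kg·min) | 12 Min Distance: x = Miles | 12 Min Distance: y = VO2 Max (ml/kg·min) |
|---|---|---|---|
| 8.1 | 59.2 | 1.1 | 23.2 |
| 8.2 | 60.0 | 1.15 | 30.0 |
| 9.0 | 49.7 | 1.2 | 30.0 |
| 9.5 | 59.9 | 1.15 | 32.5 |
| 9.8 | 51.2 | 1.25 | 36.3 |
| 10.0 | 48.5 | 1.26 | 37.0 |
| 10.3 | 49.5 | 1.28 | 40.0 |
| 10.5 | 49.5 | 1.31 | 32.3 |
| 10.9 | 50.0 | 1.3 | 35.0 |
| 11.0 | 47.5 | 1.38 | 40.0 |
| 11.1 | 40.0 | 1.4 | 45.1 |
| 11.9 | 45.1 | 1.43 | 47.5 |
| 12.0 | 44.0 | 1.45 | 50.0 |
| 12.3 | 49.1 | 1.5 | 46.2 |
| 12.4 | 50.0 | 1.6 | 41.0 |
| 12.5 | 40.0 | 1.65 | 46.4 |
| 12.6 | 40.0 | 1.67 | 49.0 |
| 12.5 | 43.0 | 1.67 | 50.0 |
| 13.0 | 38.8 | 1.7 | 48.0 |
| 13.1 | 40.0 | 1.8 | 47.5 |
| 13.8 | 32.0 | 1.85 | 51.5 |
| 13.9 | 36.1 | 1.9 | 59.8 |
| 14.5 | 39.8 | 2.0 | 49.8 |
| 14.8 | 38.1 | 2.15 | 60.0 |
| 15.0 | 37.5 | 2.2 | 59.0 |
| 15.5 | 30.0 | | |
| 15.9 | 32.0 | | |
| 16.8 | 30.0 | | |
| 17.3 | 24.5 | | |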

Data One: 1.5 mile run Time vs. VO2 Max

Name ______

  1. Make a scatter diagram for these pairs of data. Sketch below.
  2. According to the scatter plot, as the time increases, what happens to the oxygen consumed? ______
  3. Does this indicate a positive or negative correlation? ______
  4. Complete the steps to find the correlation coefficient, the line of fit and the values for the linear regression equation.
  5. Does the line of fit, fit reasonably well? ______
  6. What is the correlation coefficient, r, for the 1.5 mile run? r= ______
  7. Does the correlation coefficient, r, substantiate your conclusion above? ______
  8. What does the r value tell you about the data values x and y? ______
  9. Write the equation of the least-squares line (line of fit). ______
  10. What is the slope of the line of fit (regression line)? ______
  11. What is the marginal change in maximal oxygen consumption (ml/kg·min) for each additional minute of 1.5 mile run time? ______
  12. If the 1.5 mile run time is 7 minutes, what is the amount of oxygen consumed? ______, ______

Data Two: 12 Min distance vs. VO2 Consumed

  1. Make a scatter diagram for these pairs of data. Sketch below.
  2. According to the scatter plot, as the distance increases, what happens to the oxygen consumed? ______
  3. Does this indicate a positive or negative correlation? ______
  4. Complete the steps to find the correlation coefficient, the line of fit and the values for the linear regression equation.
  5. Does the line of fit, fit reasonably well? ______
  6. What is the correlation coefficient, r, for 12 min. distance run? r= ______
  7. What does the r value tell you about the data values x and y? ______
  8. Does the correlation coefficient, r, substantiate your conclusion above? ______
  9. Write the equation of the least-squares line (line of fit). ______
  10. What is the slope of the line of fit? ______
  11. What is the marginal change in oxygen consumption (ml/kg·min) for each additional mile covered in the 12 min distance run? ______
  12. If the 12 min distance is 2.3 miles, what is the amount of oxygen consumed? ______

[Blank grids for sketching the two scatter plots. Left: "1.5 MILE RUN VS VO2 MAX", x-axis 1.5 mile run time (min.) from 6 to 18, y-axis VO2 from 0 to 80. Right: "12 MIN. DISTANCE VS VO2 MAX", x-axis 12 min. distance (miles) from 0.7 to 2.5, y-axis VO2 from 0 to 80.]