Chapter 13

Linear Regression and Correlation

True/False

1. If a scatter diagram shows very little scatter about a straight line drawn through the plots, it indicates a rather weak correlation.

Answer: False

2. A scatter diagram is a chart that portrays the correlation between a dependent variable and an independent variable.

Answer: True

3. An economist is interested in predicting the unemployment rate based on gross domestic product. Since the economist is interested in predicting unemployment, the independent variable is gross domestic product.

Answer: True

4. There are two variables in correlation analysis referred to as the dependent and determination variables.

Answer: False

5. Correlation analysis is a group of statistical techniques used to measure the strength of the relationship (correlation) between two variables.

Answer: True

6. The purpose of correlation analysis is to find how strong the relationship is between two variables.

Answer: True

7. Originated by Karl Pearson about 1900, the coefficient of correlation describes the strength of the relationship between two interval-scaled or ratio-scaled variables.

Answer: True

8. The coefficient of correlation, r, is often referred to as Spearman's rho.

Answer: False


9. The coefficient of correlation r is often referred to as the Pearson product-moment correlation coefficient.

Answer: True

10. A correlation coefficient equal to –1 or +1 indicates perfect correlation.

Answer: True

11. The strength of the correlation between two variables depends on the sign of the coefficient of correlation.

Answer: False

12. A coefficient of correlation, r, close to 0 (say, 0.08) shows that the relationship between two variables is quite weak.

Answer: True

13. Correlation coefficients of –0.91 and +0.91 represent relationships between two variables that have equal strength but different directions.

Answer: True

14. A coefficient of correlation of –0.96 indicates a very weak negative correlation.

Answer: False

15. The coefficient of determination is the proportion of the total variation in the dependent variable Y that is explained or accounted for by its relationship with the independent variable X.

Answer: True

16. The coefficient of determination is found by taking the square root of the coefficient of correlation.

Answer: False

17. If the coefficient of correlation is –0.90, the coefficient of determination is –0.81.

Answer: False

18. If the coefficient of correlation is –0.50, the coefficient of determination is +0.25.

Answer: True

19. If the coefficient of correlation is 0.68, the coefficient of determination is 0.4624.

Answer: True
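
The arithmetic behind questions 16 through 19 can be checked with a short sketch. The helper name `coefficient_of_determination` is hypothetical and introduced only for illustration; it simply squares r, which is why the result cannot be negative.

```python
def coefficient_of_determination(r: float) -> float:
    """Return r squared for a correlation coefficient r in [-1, +1]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1 inclusive")
    return r * r

# The three correlation coefficients used in questions 17 to 19.
for r in (-0.90, -0.50, 0.68):
    print(r, round(coefficient_of_determination(r), 4))
```

Squaring removes the sign, so r = –0.90 and r = +0.90 give the same coefficient of determination.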


20. The correlation coefficient is the proportion of total variation in Y that is explained by X.

Answer: False

21. The coefficient of determination is the proportion of total variation in Y that is not explained by X.

Answer: False

22. The coefficient of determination is the proportion of total variation in Y that is explained by X.

Answer: True

23. Pearson's product-moment correlation coefficient, r, requires that the data be interval or ratio scaled, such as incomes and weights.

Answer: True

24. The standard error of estimate measures the accuracy of our prediction.

Answer: True

25. Pearson's coefficient of correlation can be used if the data is nominally scaled.

Answer: False

26. The coefficient of determination can only be positive.

Answer: True

27. If the coefficient of determination is expressed as a percent, its value is between 0% and 100%.

Answer: True

28. A t test is used to test the significance of the coefficient of correlation.

Answer: True

29. To test the significance of Pearson's r, we use the standard normal z distribution.

Answer: False

30. When testing the strength of the relationship between two variables, the null hypothesis is: H0: ρ = 0.

Answer: True


31. When testing the strength of the relationship between two variables, the alternate hypothesis is: H1: ρ ≠ 0.

Answer: True

32. The basic question in testing the significance of ρ (rho) is to make a statistical inference about the true relationship between two variables.

Answer: True

33. One assumption underlying linear regression is that the Y values are statistically dependent. This means that in selecting a sample, the Y values chosen, for a particular X value, depend on the Y values for any other X value.

Answer: False

34. The technique used to measure the strength of the relationship between two variables using the coefficient of correlation and the coefficient of determination is called regression analysis.

Answer: False

35. A regression equation may be determined using a mathematical method called the least squares principle.

Answer: True

36. A regression equation found using the least squares principle is the best-fitting line because the sum of the squares of the vertical deviations between the actual and estimated values is minimized.

Answer: True

37. The least squares technique minimizes the sum of the squares of the vertical distances between the actual Y values and the predicted values of Y.

Answer: True
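
The least squares principle in questions 35 through 37 can be illustrated with a minimal sketch. The data values and the function names `least_squares` and `sum_squared_errors` are assumptions made up for this example; they do not come from the textbook.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line Y-hat = a + bX that minimizes squared vertical deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # Y intercept
    return a, b

def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances between actual Y and predicted Y."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
best = sum_squared_errors(xs, ys, a, b)

# Nudging either coefficient away from the fitted value increases the sum.
print(a, b, best)
print(sum_squared_errors(xs, ys, a + 0.1, b) > best)
print(sum_squared_errors(xs, ys, a, b - 0.1) > best)
```

Any other choice of a and b gives a larger sum of squared deviations for the same data, which is the sense in which the fitted line is "best".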

38. The values of a and b in the regression equation are called the regression coefficients.

Answer: True

39. One assumption underlying linear regression is that for each value of X there is a group of Y values that is normally distributed.

Answer: True

40. In order to visualize the regression equation line, we can draw a scatter diagram.

Answer: True


41. A regression equation is a mathematical equation that defines the relationship between two variables.

Answer: True

42. The equation for a straight line going through the plots on a scatter diagram is called a regression equation. It is alternately called an estimating equation and a predicting equation.

Answer: True

43. The regression equation is used to estimate a value of the dependent variable Y based on a selected value of the independent variable X.

Answer: True

44. In regression analysis, the predicted value, Ŷ, rarely agrees exactly with the actual Y value, i.e., we expect some prediction error.

Answer: True

45. Trying to predict weekly sales with a standard error of estimate of $1,955, we would conclude that 68 percent of the predictions would not be off by more than $1,955, 95 percent would not be off by more than $3,910, and 99.7 percent would not be off by more than $5,865.

Answer: True
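
The dollar figures in question 45 are one, two, and three standard errors of estimate. A small sketch of that arithmetic follows; the function name `error_bounds` is hypothetical, and the 68/95/99.7 percentages assume the scatter about the regression line is approximately normal.

```python
def error_bounds(standard_error):
    """Map each empirical-rule percentage to its multiple of the standard error."""
    multiples = ((68, 1), (95, 2), (99.7, 3))
    return {pct: k * standard_error for pct, k in multiples}

bounds = error_bounds(1955)
for pct, limit in bounds.items():
    print(f"{pct}% of predictions within ${limit:,}")
```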

46. The standard error of estimate is used to construct confidence intervals when the sample size is large and the scatter about the regression line is somewhat normally distributed.

Answer: True

47. A confidence interval can be determined for the mean value of Y for a given value of X.

Answer: True

48. A confidence interval can be determined for the mean value of X for a given value of Y.

Answer: False

49. The smaller the samples, the smaller the standard error of estimate.

Answer: False

50. Explained variation equals total variation minus unexplained variation.

Answer: True
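
The identity in question 50 can be verified numerically. This is a sketch on made-up data; the variable names are assumptions, and the line is fitted by ordinary least squares.

```python
# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares slope and intercept.
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b * mean_x
predicted = [a + b * x for x in xs]

total = sum((y - mean_y) ** 2 for y in ys)                    # total variation
unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # variation about the line
explained = sum((p - mean_y) ** 2 for p in predicted)           # variation due to the line

print(total, unexplained, explained)
print(explained / total)  # coefficient of determination
```

The ratio of explained to total variation is the coefficient of determination described in question 22.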


51. In regression analysis, there is no difference in the width of a confidence interval and the width of a prediction interval.

Answer: False

52. A confidence interval is narrower than a prediction interval because a confidence interval estimates a mean Y for a given X.

Answer: True

53. The least squares method assumes the relationship between the dependent and independent variables is linear.

Answer: True

54. When analyzing data with regression, a transformation is necessary when the relationship between the dependent and independent variables is linear.

Answer: False

55. When analyzing a curvilinear relationship between dependent and independent variables, a transformation of the data is necessary.

Answer: True

56. A mathematical transformation can be used to change a curvilinear relationship between two variables to a linear relationship.

Answer: True
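
As an illustration of questions 54 through 56, the sketch below generates an exponential (curvilinear) relationship and shows that taking the logarithm of Y makes it linear. The data, the choice of a log transformation, and the helper name `pearson_r` are all assumptions made for this example; other transformations may suit other curves.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical curvilinear data: Y grows exponentially with X.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Transform Y with the natural log; log(Y) = log(2) + 0.5 X is a straight line.
log_ys = [math.log(y) for y in ys]

r_raw = pearson_r(xs, ys)
r_log = pearson_r(xs, log_ys)
print(r_raw, r_log)
```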

Multiple Choice

57. What is the chart called when the paired data (the dependent and independent variables) are plotted?

A) Scatter diagram

B) Bar chart

C) Pie chart

D) Histogram

Answer: A

58. What is the variable used to predict the value of another called?

A) Independent

B) Dependent

C) Correlation

D) Determination

Answer: A


59. Which of the following statements regarding the coefficient of correlation is true?

A) It ranges from –1.0 to +1.0 inclusive

B) It measures the strength of the relationship between two variables

C) A value of 0.00 indicates two variables are not related

D) All of the above

Answer: D

60. What does a coefficient of correlation of 0.70 imply?

A) Almost no correlation because 0.70 is close to 1.0

B) 70% of the variation in one variable is explained by the other

C) Coefficient of determination is 0.49

D) Coefficient of nondetermination is 0.30

Answer: C

61. What is the range of values for a coefficient of correlation?

A) 0 to +1.0

B) –3 to +3 inclusive

C) –1.0 to +1.0 inclusive

D) Unlimited range

Answer: C

62. The Pearson product-moment correlation coefficient, r, requires that variables are measured with:

A) An interval scale

B) A ratio scale

C) An ordinal scale

D) A nominal scale

E) Either A or B.

Answer: E

63. If the correlation between two variables is close to one, the association is

A) strong.

B) moderate.

C) weak.

D) none.

Answer: A

64. If the correlation coefficient between two variables equals zero, what can be said of the variables X and Y?

A) Not related

B) Dependent on each other

C) Highly related

D) All of the above are correct

Answer: A


65. What can we conclude if the coefficient of determination is 0.94?

A) Strength of relationship is 0.94

B) Direction of relationship is positive

C) 94% of total variation of one variable is explained by variation in the other variable.

D) All of the above are correct

Answer: C

66. If r = –1.00, what inferences can be made?

A) The dependent variable can be perfectly predicted by the independent variable

B) All of the variation in the dependent variable can be accounted for by the independent variable

C) High values of one variable are associated with low values of the other variable

D) Coefficient of determination is 100%.

E) All of the above are correct

Answer: E

67. If r = 0.65, what does the coefficient of determination equal?

A) 0.194

B) 0.423

C) 0.577

D) 0.806

Answer: B

68. What does the coefficient of determination equal if r = 0.89?

A) 0.94

B) 0.89

C) 0.79

D) 0.06

Answer: C

69. Which value of r indicates a stronger correlation than 0.40?

A) –0.30

B) –0.50

C) +0.38

D) 0

Answer: B
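
Question 69 turns on the absolute value of r, not its sign. A brief sketch of that comparison, using the four answer choices as input (the variable names are illustrative):

```python
reference = 0.40
candidates = [-0.30, -0.50, 0.38, 0]

# Strength is judged by |r|; the sign only gives the direction.
stronger = [r for r in candidates if abs(r) > abs(reference)]
print(stronger)
```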

70. What is the range of values for the coefficient of determination?

A) –1 to +1 inclusive

B) –100% to +100% inclusive

C) –100% to 0% inclusive

D) 0% to 100% inclusive

Answer: D


71. If the decision in the hypothesis test of the population correlation coefficient is to reject the null hypothesis, what can we conclude about the correlation in the population?

A) It is zero

B) It could be zero

C) It is not zero

D) It equals the computed sample correlation

Answer: C

72. A hypothesis test is conducted at the .05 level of significance to test whether or not the population correlation is zero. If the sample consists of 25 observations and the correlation coefficient is 0.60, then what is the computed value of the test statistic?

A) 1.96

B) 2.07

C) 2.94

D) 3.60

Answer: D
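
The test statistic in question 72 follows from the usual t formula for a correlation coefficient, t = r√(n – 2) / √(1 – r²), with n – 2 degrees of freedom. A minimal sketch of the calculation; the function name `t_statistic` is hypothetical:

```python
import math

def t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t = t_statistic(0.60, 25)
print(round(t, 2))
```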

73. In the regression equation, what does the letter "a" represent?

A) Y intercept

B) Slope of the line

C) Any value of the independent variable that is selected

D) None of the above

Answer: A

74. In the regression equation, what does the letter "b" represent?

A) Y intercept

B) Slope of the line

C) Any value of the independent variable that is selected

D) Value of Y when X=0

Answer: B

75. Suppose the least squares regression equation is Ŷ = 1,202 + 1,133X. When X = 3, what does Ŷ equal?

A) 5,734

B) 8,000

C) 4,601

D) 4,050

Answer: C
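
The substitution in question 75 can be written out as a short sketch. The function name `predict` is an assumption for illustration; the intercept and slope are the values given in the question.

```python
def predict(x, a=1202, b=1133):
    """Evaluate the regression equation Y-hat = a + bX."""
    return a + b * x

y_hat = predict(3)
print(y_hat)
```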


76. What is the general form of the regression equation?

A) Ŷ = ab

B) Ŷ = a + bX

C) Ŷ = a – bX

D) Ŷ = abX

Answer: B

77. What is the measure that indicates how precise a prediction of Y is based on X or, conversely, how inaccurate the prediction might be?

A) Regression equation

B) Slope of the line

C) Standard error of estimate

D) Least squares principle

Answer: C

78. Which of the following are true assumptions underlying linear regression: 1) for each value of X, there is a group of Y values which is normally distributed; 2) the means of these normal distributions of Y values all lie on the straight line of regression; and/or 3) the standard deviations of these normal distributions are equal?

A) Only (1) and (2)

B) Only (1) and (3)

C) Only (2) and (3)

D) All of them

Answer: D

79. Based on the regression equation, we can

A) predict the value of the dependent variable given a value of the independent variable.

B) predict the value of the independent variable given a value of the dependent variable.

C) measure the association between two variables.

D) all of the above.

Answer: A

80. Which of the following is true about the standard error of estimate?

A) It is a measure of the accuracy of the prediction

B) It is based on squared vertical deviations between Y and Ŷ

C) It cannot be negative

D) All of the above

Answer: D


81. If all the plots on a scatter diagram lie on a straight line, what is the standard error of estimate?

A) –1

B) +1

C) 0

D) Infinity

Answer: C
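
Question 81 can be checked with a small sketch: when every point lies exactly on the line, each vertical deviation is zero, so the standard error of estimate is zero. The data and the function name `standard_error_of_estimate` are assumptions for illustration; the formula used is the square root of the sum of squared deviations divided by n – 2.

```python
import math

def standard_error_of_estimate(xs, ys, a, b):
    """Square root of the summed squared deviations from the line, over n - 2."""
    n = len(xs)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))

# Hypothetical points that all lie on the line Y = 3 + 2X.
xs = [1, 2, 3, 4]
ys = [3 + 2 * x for x in xs]

s = standard_error_of_estimate(xs, ys, a=3, b=2)
print(s)
```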