Chapter 13
Linear Regression and Correlation
True/False
1. If a scatter diagram shows very little scatter about a straight line drawn through the plots, it indicates a rather weak correlation.
Answer: False
2. A scatter diagram is a chart that portrays the correlation between a dependent variable and an independent variable.
Answer: True
3. An economist is interested in predicting the unemployment rate based on gross domestic product. Since the economist is interested in predicting unemployment, the independent variable is gross domestic product.
Answer: True
4. There are two variables in correlation analysis referred to as the dependent and determination variables.
Answer: False
5. Correlation analysis is a group of statistical techniques used to measure the strength of the relationship (correlation) between two variables.
Answer: True
6. The purpose of correlation analysis is to find how strong the relationship is between two variables.
Answer: True
7. Originated by Karl Pearson about 1900, the coefficient of correlation describes the strength of the relationship between two interval- or ratio-scaled variables.
Answer: True
8. The coefficient of correlation, r, is often referred to as Spearman's rho.
Answer: False
9. The coefficient of correlation r is often referred to as the Pearson product-moment correlation coefficient.
Answer: True
10. A correlation coefficient equal to –1 or +1 indicates perfect correlation.
Answer: True
11. The strength of the correlation between two variables depends on the sign of the coefficient of correlation.
Answer: False
12. A coefficient of correlation, r, close to 0 (say, 0.08) shows that the relationship between two variables is quite weak.
Answer: True
13. Correlation coefficients of –0.91 and +0.91 represent relationships between two variables that have equal strength but different directions.
Answer: True
14. A coefficient of correlation of –0.96 indicates a very weak negative correlation.
Answer: False
15. The coefficient of determination is the proportion of the total variation in the dependent variable Y that is explained or accounted for by its relationship with the independent variable X.
Answer: True
16. The coefficient of determination is found by taking the square root of the coefficient of correlation.
Answer: False
17. If the coefficient of correlation is –0.90, the coefficient of determination is –0.81.
Answer: False
18. If the coefficient of correlation is –0.50, the coefficient of determination is +0.25.
Answer: True
19. If the coefficient of correlation is 0.68, the coefficient of determination is 0.4624.
Answer: True
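Worked check for questions 16 through 19: the coefficient of determination is the square of the coefficient of correlation, so it cannot be negative. The following is a minimal sketch; the function name is illustrative and not taken from the chapter.

```python
def coefficient_of_determination(r: float) -> float:
    """Square the coefficient of correlation r to get r squared."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.0 and +1.0 inclusive")
    return r * r


# The r values used in questions 17, 18, and 19.
for r in (-0.90, -0.50, 0.68):
    print(r, round(coefficient_of_determination(r), 4))
```

Squaring removes the sign, which is why r = –0.90 gives +0.81 rather than –0.81.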
20. The correlation coefficient is the proportion of total variation in Y that is explained by X.
Answer: False
21. The coefficient of determination is the proportion of total variation in Y that is not explained by X.
Answer: False
22. The coefficient of determination is the proportion of total variation in Y that is explained by X.
Answer: True
23. Pearson's product-moment correlation coefficient, r, requires that the data be interval or ratio scaled, such as incomes and weights.
Answer: True
24. The standard error of estimate measures the accuracy of our prediction.
Answer: True
25. Pearson's coefficient of correlation can be used if the data is nominally scaled.
Answer: False
26. The coefficient of determination can only be positive.
Answer: True
27. If the coefficient of determination is expressed as a percent, its value is between 0% and 100%.
Answer: True
28. A t test is used to test the significance of the coefficient of correlation.
Answer: True
29. To test the significance of Pearson's r, we use the standard normal z distribution.
Answer: False
30. When testing the strength of the relationship between two variables, the null hypothesis is: H0: ρ = 0.
Answer: True
31. When testing the strength of the relationship between two variables, the alternate hypothesis is: H1: ρ ≠ 0.
Answer: True
32. The basic question in testing the significance of ρ (rho) is to make a statistical inference about the true relationship between two variables.
Answer: True
33. One assumption underlying linear regression is that the Y values are statistically dependent. This means that in selecting a sample, the Y values chosen, for a particular X value, depend on the Y values for any other X value.
Answer: False
34. The technique used to measure the strength of the relationship between two variables using the coefficient of correlation and the coefficient of determination is called regression analysis.
Answer: False
35. A regression equation may be determined using a mathematical method called the least squares principle.
Answer: True
36. A regression equation found using the least squares principle is the best-fitting line because the sum of the squares of the vertical deviations between the actual and estimated values is minimized.
Answer: True
37. The least squares technique minimizes the sum of the squares of the vertical distances between the actual Y values and the predicted values of Y.
Answer: True
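Illustration for questions 35 through 37: a minimal sketch of the least squares principle for one independent variable. The data are made up for illustration and the function names are assumptions, not the chapter's notation. The fitted a and b give a smaller sum of squared vertical deviations than nearby alternatives.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line Y-hat = a + bX that minimizes squared vertical deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # Y intercept
    return a, b


def sum_squared_errors(a, b, xs, ys):
    """Sum of squared vertical distances between actual Y and predicted Y."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
print("a =", round(a, 4), "b =", round(b, 4))
print("SSE at fitted line:", round(sum_squared_errors(a, b, xs, ys), 4))
print("SSE with shifted intercept:", round(sum_squared_errors(a + 0.5, b, xs, ys), 4))
```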
38. The values of a and b in the regression equation are called the regression coefficients.
Answer: True
39. One assumption underlying linear regression is that for each value of X there is a group of Y values that is normally distributed.
Answer: True
40. In order to visualize the regression equation line, we can draw a scatter diagram.
Answer: True
41. A regression equation is a mathematical equation that defines the relationship between two variables.
Answer: True
42. The equation for a straight line going through the plots on a scatter diagram is called a regression equation. It is alternately called an estimating equation and a predicting equation.
Answer: True
43. The regression equation is used to estimate a value of the dependent variable Y based on a selected value of the independent variable X.
Answer: True
44. In regression analysis, the predicted value Ŷ rarely agrees exactly with the actual Y value, i.e., we expect some prediction error.
Answer: True
45. Trying to predict weekly sales with a standard error of estimate of $1,955, we would conclude that 68 percent of the predictions would not be off by more than $1,955, 95 percent would not be off by more than $3,910, and 99.7 percent would not be off by more than $5,865.
Answer: True
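Worked check for question 45: the three dollar amounts are one, two, and three times the standard error of estimate. The sketch below assumes the usual 68/95/99.7 rule for roughly normal scatter about the regression line; the function name is illustrative.

```python
def error_bands(standard_error):
    """Return the 1x, 2x, and 3x standard-error bounds keyed by approximate coverage."""
    return {
        "68%": 1 * standard_error,
        "95%": 2 * standard_error,
        "99.7%": 3 * standard_error,
    }


for coverage, bound in error_bands(1955).items():
    print(coverage, "of predictions off by no more than", bound)
```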
46. The standard error of estimate is used to construct confidence intervals when the sample size is large and the scatter about the regression line is somewhat normally distributed.
Answer: True
47. A confidence interval can be determined for the mean value of Y for a given value of X.
Answer: True
48. A confidence interval can be determined for the mean value of X for a given value of Y.
Answer: False
49. The smaller the samples, the smaller the standard error of estimate.
Answer: False
50. Explained variation equals total variation minus unexplained variation.
Answer: True
51. In regression analysis, there is no difference in the width of a confidence interval and the width of a prediction interval.
Answer: False
52. A confidence interval is narrower than a prediction interval because a confidence interval estimates a mean Y for a given X.
Answer: True
53. The least squares method assumes the relationship between the dependent and independent variables is linear.
Answer: True
54. When analyzing data with regression, a transformation is necessary when the relationship between the dependent and independent variables is linear.
Answer: False
55. When analyzing a curvilinear relationship between dependent and independent variables, a transformation of the data is necessary.
Answer: True
56. A mathematical transformation can be used to change a curvilinear relationship between two variables to a linear relationship.
Answer: True
Multiple Choice
57. What is the chart called when the paired data (the dependent and independent variables) are plotted?
A) Scatter diagram
B) Bar chart
C) Pie chart
D) Histogram
Answer: A
58. What is the variable used to predict the value of another called?
A) Independent
B) Dependent
C) Correlation
D) Determination
Answer: A
59. Which of the following statements regarding the coefficient of correlation is true?
A) It ranges from –1.0 to +1.0 inclusive
B) It measures the strength of the relationship between two variables
C) A value of 0.00 indicates two variables are not related
D) All of the above
Answer: D
60. What does a coefficient of correlation of 0.70 imply?
A) Almost no correlation because 0.70 is close to 1.0
B) 70% of the variation in one variable is explained by the other
C) Coefficient of determination is 0.49
D) Coefficient of nondetermination is 0.30
Answer: C
61. What is the range of values for a coefficient of correlation?
A) 0 to +1.0
B) –3 to +3 inclusive
C) –1.0 to +1.0 inclusive
D) Unlimited range
Answer: C
62. The Pearson product-moment correlation coefficient, r, requires that variables are measured with:
A) An interval scale
B) A ratio scale
C) An ordinal scale
D) A nominal scale
E) Either A or B.
Answer: E
63. If the correlation between two variables is close to one, the association is
A) strong.
B) moderate.
C) weak.
D) none.
Answer: A
64. If the correlation coefficient between two variables equals zero, what can be said of the variables X and Y?
A) Not related
B) Dependent on each other
C) Highly related
D) All of the above are correct
Answer: A
65. What can we conclude if the coefficient of determination is 0.94?
A) Strength of relationship is 0.94
B) Direction of relationship is positive
C) 94% of total variation of one variable is explained by variation in the other variable.
D) All of the above are correct
Answer: C
66. If r = –1.00, what inferences can be made?
A) The dependent variable can be perfectly predicted by the independent variable
B) All of the variation in the dependent variable can be accounted for by the independent variable
C) High values of one variable are associated with low values of the other variable
D) Coefficient of determination is 100%.
E) All of the above are correct
Answer: E
67. If r = 0.65, what does the coefficient of determination equal?
A) 0.194
B) 0.423
C) 0.577
D) 0.806
Answer: B
68. What does the coefficient of determination equal if r = 0.89?
A) 0.94
B) 0.89
C) 0.79
D) 0.06
Answer: C
69. Which value of r indicates a stronger correlation than 0.40?
A) –0.30
B) –0.50
C) +0.38
D) 0
Answer: B
70. What is the range of values for the coefficient of determination?
A) –1 to +1 inclusive
B) –100% to +100% inclusive
C) –100% to 0% inclusive
D) 0% to 100% inclusive
Answer: D
71. If the decision in the hypothesis test of the population correlation coefficient is to reject the null hypothesis, what can we conclude about the correlation in the population?
A) It is zero
B) It could be zero
C) It is not zero
D) It equals the computed sample correlation
Answer: C
72. A hypothesis test is conducted at the .05 level of significance to test whether or not the population correlation is zero. If the sample consists of 25 observations and the correlation coefficient is 0.60, then what is the computed value of the test statistic?
A) 1.96
B) 2.07
C) 2.94
D) 3.60
Answer: D
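Worked check for question 72: a minimal sketch of the t statistic for testing whether the population correlation is zero, t = r√(n – 2) / √(1 – r²), with n – 2 degrees of freedom. The function name is illustrative.

```python
import math


def correlation_t_statistic(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and +1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


t = correlation_t_statistic(0.60, 25)
print("t =", round(t, 2), "with", 25 - 2, "degrees of freedom")
```

With 23 degrees of freedom, the two-tailed critical value at the .05 level from a t table is about 2.069, so a computed t of about 3.60 falls in the rejection region.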
73. In the regression equation, what does the letter "a" represent?
A) Y intercept
B) Slope of the line
C) Any value of the independent variable that is selected
D) None of the above
Answer: A
74. In the regression equation, what does the letter "b" represent?
A) Y intercept
B) Slope of the line
C) Any value of the independent variable that is selected
D) Value of Y when X=0
Answer: B
75. Suppose the least squares regression equation is Ŷ = 1,202 + 1,133X. When X = 3, what does Ŷ equal?
A) 5,734
B) 8,000
C) 4,601
D) 4,050
Answer: C
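Worked check for question 75: substituting X = 3 into the regression equation with intercept 1,202 and slope 1,133. The function name is an illustrative assumption.

```python
def predict(a: float, b: float, x: float) -> float:
    """Evaluate the regression equation Y-hat = a + bX at a chosen X."""
    return a + b * x


y_hat = predict(a=1202, b=1133, x=3)
print(y_hat)  # 1202 + 3 * 1133 = 4601
```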
76. What is the general form of the regression equation?
A) Ŷ = ab
B) Ŷ = a + bX
C) Ŷ = a – bX
D) Ŷ = abX
Answer: B
77. What is the measure that indicates how precise a prediction of Y is based on X or, conversely, how inaccurate the prediction might be?
A) Regression equation
B) Slope of the line
C) Standard error of estimate
D) Least squares principle
Answer: C
78. Which of the following are true assumptions underlying linear regression: 1) for each value of X, there is a group of Y values which is normally distributed; 2) the means of these normal distributions of Y values all lie on the straight line of regression; and/or 3) the standard deviations of these normal distributions are equal?
A) Only (1) and (2)
B) Only (1) and (3)
C) Only (2) and (3)
D) All of them
Answer: D
79. Based on the regression equation, we can
A) predict the value of the dependent variable given a value of the independent variable.
B) predict the value of the independent variable given a value of the dependent variable.
C) measure the association between two variables.
D) all of the above.
Answer: A
80. Which of the following is true about the standard error of estimate?
A) It is a measure of the accuracy of the prediction
B) It is based on squared vertical deviations between Y and Ŷ
C) It cannot be negative
D) All of the above
Answer: D
81. If all the plots on a scatter diagram lie on a straight line, what is the standard error of estimate?
A) –1
B) +1
C) 0
D) Infinity
Answer: C
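Illustration for questions 80 and 81: a minimal sketch of the standard error of estimate, computed as √(Σ(Y – Ŷ)² / (n – 2)). The data sets are made up for illustration and the function name is an assumption. When every point lies on the line, each deviation is zero, so the standard error is zero.

```python
import math


def standard_error_of_estimate(xs, ys):
    """Fit Y-hat = a + bX by least squares, then return sqrt(SSE / (n - 2))."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))


# Hypothetical data: points exactly on Y = 1 + 2X, then points with scatter.
print(standard_error_of_estimate([1, 2, 3, 4], [3, 5, 7, 9]))
print(round(standard_error_of_estimate([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 4))
```

Because the result is a square root of a sum of squares, it cannot be negative.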