Practice Exam 3 Answer Key

For each of the following short scenarios, describe briefly and specifically how you would respond to the issue raised:

1. You are working on a sales forecasting project using multiple regression. You have your equation and wish to forecast sales. You need to have values to use for your independent variables. What alternatives are available for those values of the independent variables?

Use forecasts generated by economic/other organizations

Use naïve trend forecasts or forecasts from other known independent variables

Use information from financial statements for variables like square feet or number of stores

2. What problem does multicollinearity cause? What would you do if you found severe multicollinearity among your independent variables?

Problems with the slopes (interpreting them, wrong signs, unreliable significance tests). Remedy: omit one or more of the variables that are highly correlated with each other (especially those with high P-values).

3. How would you diagnose an observation that might cause the regression equation to be misleading?

Look for a high value of Cook’s D for that observation.
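
As an illustration of the Cook's D check, here is a minimal sketch for a simple (one-predictor) regression. The data are invented, the function name is hypothetical, and the "D greater than 1" cutoff in the comment is only a common rule of thumb, not something stated in this answer key.

```python
def cooks_distance(x, y):
    """Cook's D for each observation in a simple linear regression of y on x."""
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage of each point in a one-predictor model
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [(e ** 2 / (p * mse)) * (h / (1 - h) ** 2)
            for e, h in zip(residuals, leverage)]

# Invented data: the last point sits far from the others and off their trend
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 20.0]
d_values = cooks_distance(x, y)
for i, d in enumerate(d_values, start=1):
    flag = "  <-- investigate (rule of thumb: D > 1)" if d > 1 else ""
    print(f"obs {i}: D = {d:.3f}{flag}")
```

In this made-up data set only the last observation should stand out, because it combines high leverage with a sizeable residual.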

4. National Airlines recently announced a daily early morning non-stop flight between Chicago and Houston. Thirty flights were chosen; ten each from National and their two major competitors. Explain how you would analyze the data to determine whether a significant relationship exists between passenger load (defined as percent of unfilled seats) and airline.

One-factor ANOVA test for equal population means.
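
A minimal sketch of the one-factor ANOVA calculation for this scenario follows. The percent-unfilled-seat figures are invented for illustration (the exam supplies no raw data), and the function and variable names are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-factor ANOVA."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Invented percent-unfilled-seat data: ten flights per airline
national = [12, 15, 9, 11, 14, 10, 13, 8, 12, 11]
competitor_a = [18, 16, 20, 17, 19, 15, 21, 18, 16, 17]
competitor_b = [17, 19, 16, 20, 18, 22, 15, 19, 17, 18]

f_stat, df_between, df_within = one_way_anova_f([national, competitor_a, competitor_b])
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

The F statistic would then be compared with the F distribution on (2, 27) degrees of freedom to obtain the P-value.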

5. In the previous problem, explain how you would determine specifically if National had a lower percent of unfilled seats than its 2 major competitors.

Tukey simultaneous comparisons

6. Variables are removed from consideration early in the forecasting process by using the correlation matrix and by using VIF values. What is the difference between the 2 methods?

Correlations look at only 2 variables to see if they are highly related whereas VIF values determine how each X variable is related to all other X variables.
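
The difference can be seen in a small sketch with invented data, where a third predictor is almost the sum of the other two. It relies on the standard identity that the VIF values are the diagonal entries of the inverse of the predictors' correlation matrix; all names and numbers are hypothetical.

```python
import math

def correlation_matrix(columns):
    """Pairwise correlations between equal-length data columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]

def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif_values(columns):
    """VIF of each column = diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Invented predictors: x3 is nearly x1 + x2
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [5, 1, 7, 2, 8, 3, 6, 4]
x3 = [6.5, 2.5, 10.3, 5.8, 13.1, 8.6, 13.2, 12.0]

corr = correlation_matrix([x1, x2, x3])
vifs = vif_values([x1, x2, x3])
print("pairwise correlations:", [round(corr[i][j], 2) for i, j in [(0, 1), (0, 2), (1, 2)]])
print("VIF values:", [round(v, 1) for v in vifs])
```

Here no single pairwise correlation looks extreme, yet every VIF is large, because each variable is almost fully explained by the other two taken together.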

The data consist of 160 charge cardholders and are used to study yearly income and level of education. The 4 education levels are: High School (HS), some college (SomeC), Bachelor’s degree (BD) and advanced degree (AD).

7. We wish to determine whether the population mean incomes are equal for the 4 different education levels. Set up hypotheses and use Appendix A to help you write up a statement of how confident you are.

H0: µ1 = µ2 = µ3 = µ4 vs. H1: Not all population means are equal

We can be almost 100% confident that differences exist between the means.

8. Use the Tukey pairwise comparisons to help you write up a series of statements detailing which degree holders have higher population means than which others.

We can be 99% confident that advanced degree holders have a higher population mean income than high school degree holders. We can be 95% confident that AD has a higher mean than SomeC and that BD has a higher mean than HS.

Appendix B details a seasonal and trend analysis of 24 quarters of revenues ($millions) data from Jack-in-the-Box.

9a. In the trendline, what does the number 9.3998 mean in terms of revenues?

On average, revenues increase by $9,399,800 every quarter.

b. Interpret the number 1.219 in Appendix B.

First quarter revenues are 1.219 times average quarterly revenues

First quarter revenues are 21.9% higher than an average quarter.

c. Calculate a seasonal forecast for the 25th quarter (a first quarter).

Trend forecast = 510.755; seasonal forecast = 510.755 × 1.219 = 622.6103 ($millions)
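
The two numbers in this answer are linked by the first-quarter seasonal index from part b: the trend value is multiplied by the index. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def seasonal_forecast(trend_value, seasonal_index):
    """Scale a trend forecast by the seasonal index for that quarter."""
    return trend_value * seasonal_index

trend_q25 = 510.755   # trend forecast for quarter 25 ($ millions), from the answer key
index_q1 = 1.219      # first-quarter seasonal index from Appendix B
forecast_q25 = seasonal_forecast(trend_q25, index_q1)
print(round(forecast_q25, 2))  # 622.61
```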

The operations director for a television station wishes to study the issue of standby hours (union graphic artists are paid to be there but are not doing anything for those hours). The dependent variable is Standby (total number of hours per quarter). The independent variables are Total Staff (people-days worked per quarter), Remote (total number of hours worked by employees away from the TV station), and Total Labor (the total number of hours worked by all employees per quarter).

10. Look at Appendix C. The theory that the operations director came up with was that there should be a direct relationship between Standby hours and each of the predictors Total Staff and Total Labor, but there should be an inverse relationship with Remote hours.

a. Did the results come out as theory would predict?

Yes, all signs were as predicted.

b. Is there a significant relationship between Standby hours and this set of 3 variables? Set up hypotheses and come to a conclusion.

Null: All 3 population slopes equal 0. Alt: At least one population slope differs from 0.

We can be 99.67% confident that there is a relationship with this set of 3 variables.

c. Is there a relationship with total staff? Set up hypotheses and come to a conclusion by writing a short statement detailing your confidence.

Null: Population slope = 0 for Total Staff. Alt: Population slope ≠ 0.

P-value = .00433, so we can be 99.6% confident of a direct relationship.

d. Would you keep all 3 variables in the equation? Why or why not?

No. Total Labor has a P-value of .232, which is higher than .05.

11. Look at Appendix D. Please ignore this question

a. Does it appear that there is a significant quarterly influence on Standby hours? Why or why not?

Since Q1 has a significant P-value, we can say that there is a significant quarterly influence.

b. Interpret the number 64.87813.

Quarter 1 standby hours are 64.878 more than standby hours in quarter 4, holding other variables constant.

12. Look at Appendix E.

a. Does it appear that adding a squared term for Remote was useful in this case? Why or why not?

Since RemSq has a P-value of .0085, we can be 99.15% confident that a squared term was useful in this case.

b. Find the derivative of standby hours with respect to the variable Remote. If a given quarter had 200 remote hours what would the slope be equal to?

Slope = -.9157 + .002Remote

Slope = -.5157
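
The slope calculation can be sketched as below. The linear coefficient is taken from the answer above; the squared-term coefficient of 0.001 is an assumption inferred from the ".002Remote" term (2 × 0.001), since Appendix E itself is not reproduced here. The function name is hypothetical.

```python
def remote_slope(remote_hours, linear_coef=-0.9157, squared_coef=0.001):
    """Derivative of Standby with respect to Remote for a fit that
    includes both Remote and Remote^2 (squared_coef is an inferred value)."""
    return linear_coef + 2 * squared_coef * remote_hours

slope_at_200 = remote_slope(200)
print(round(slope_at_200, 4))  # -0.5157
```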

c. Given the results in part a, work backwards to envision what the residuals from Appendix C must have looked like when plotted against Remote. Describe what you would have seen in this plot.

Since the slope was negative for Remote and we moved up the ladder for Remote, the linearity plot must have been a rainbow shape.

13. Look at Appendices C, D. and E. If you were to pick one of those equations to predict standby hours, which would it be? Think of as many reasons as you can to support your choice.

E has the lowest standard error and ANOVA P-value, and the highest R-squared and adjusted R-squared.

For the following charts: State whether or not the process is in control or out of control, and describe the reason(s) why.

14. Out of control – outside limits + more than 8 both below and above center line

15. Out of control by all 3 reasons

The following situations call for the use of a control chart. Look at each situation and decide which type of control chart is appropriate for the situation.

16. A cookie company sells chocolate chip cookies in shopping malls. The manager institutes a process where 50 cookies from each batch are examined carefully and the number of cookies that are unacceptable are noted. After 30 batches a control chart is prepared.

P chart
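
A minimal sketch of how P chart limits might be computed for this scenario, assuming the usual three-sigma limits around the overall proportion defective. The sample size of 50 comes from the question; the batch counts and function name are invented for illustration.

```python
import math

def p_chart_limits(defective_counts, sample_size):
    """Return (LCL, center line, UCL) for a P chart with equal sample sizes."""
    p_bar = sum(defective_counts) / (len(defective_counts) * sample_size)
    sigma = math.sqrt(p_bar * (1 - p_bar) / sample_size)
    lcl = max(0.0, p_bar - 3 * sigma)  # a proportion cannot fall below 0
    ucl = min(1.0, p_bar + 3 * sigma)  # or rise above 1
    return lcl, p_bar, ucl

# Invented counts of unacceptable cookies in each of 30 batches of 50
unacceptable = [4, 6, 5, 3, 7, 5, 4, 6, 5, 5,
                8, 4, 3, 6, 5, 7, 4, 5, 6, 4,
                5, 3, 6, 5, 7, 4, 5, 6, 5, 12]

lcl, center, ucl = p_chart_limits(unacceptable, sample_size=50)
print(f"LCL = {lcl:.4f}, center = {center:.4f}, UCL = {ucl:.4f}")

# Flag batches whose proportion unacceptable falls outside the limits
for batch, count in enumerate(unacceptable, start=1):
    proportion = count / 50
    if proportion < lcl or proportion > ucl:
        print(f"batch {batch}: proportion {proportion:.2f} is outside the limits")
```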

17. Consistency in the dough is a major concern for the cookie company. Every batch is sampled at random in 10 different places to record the density of the dough. They keep track of this data until they get 30 batches and then construct a control chart.

Mean and Range charts

18. They inspect all 25 shopping malls once each per month and record how many problems they see in the outlet. The problems can be of any kind such as the counter being dirty or spills on the floor or specials not being posted correctly. At the end of each month they construct a control chart of the 25 malls.

C chart
Appendix A

Appendix B: 24 quarters of revenues from Jack-in-the-Box ($Millions)

Appendix C:

Appendix D

Appendix E
