Lakeway Enterprises

Regression Analysis Case

Georganna Russell, the controller of Lakeway Enterprises, a fabricator specializing in the manufacture of copper pipe, has undertaken a project to study the behavior of overhead cost. She has assembled data for this month from each of the firm’s twenty-eight manufacturing facilities. In addition, all the allocated components of the factory overhead account, such as head office costs, have been eliminated from the factory overhead account balances.

Georganna has asked you to develop a regression model to predict the level of manufacturing overhead based on the data in Table 1.

Required:

·  Determine the best regression equation possible given the data in Table 1 and the fact that overhead is to be the dependent variable. You should indicate why the regression model which you selected was determined to be the best and why other models were rejected. Factors which you should consider in your assessment of potential regression models include but are not limited to the following items: (1) whether the independent variable(s) can be used to predict the dependent variable; (2) the greatest amount of confidence which can be placed in the regression equation; and (3) the significance of the coefficients of determination and correlation.

·  Discuss key assumptions and limitations of the regression models which you considered.

·  Make a cost prediction using the regression equation which you selected if Susan expects to incur 2,000 labor-hours, 1,000 machine-hours, 500 tons of raw materials handled, and 70 production line setups. Next, if management desires to be at least 95 percent certain that actual costs will not exceed their estimate for overhead costs, what would be their estimate of overhead? Lastly, if management desires to be 90 percent certain of the actual overhead cost, what would be their estimate of overhead?

·  Discuss the problems of estimating overhead cost at the following projected activity level: 3,000 labor hours, 500 machine hours, 700 tons of raw materials handled, and
100 production line setups.

You are to use the Excel computer program and include the printout for all regression models considered.

Plant Number / Factory Overhead / Labor Hours / Machine Hours / Tons of Raw Material Handled / Production Line Setups
1 / 137,896 / 2,092 / 959 / 414 / 67
2 / 174,342 / 1,617 / 1,227 / 623 / 88
3 / 168,896 / 2,215 / 1,351 / 437 / 50
4 / 178,059 / 1,584 / 1,480 / 479 / 89
1 / 166,605 / 1,930 / 952 / 678 / 52
2 / 165,320 / 1,717 / 986 / 666 / 50
7 / 157,585 / 2,319 / 931 / 585 / 75
8 / 165,667 / 2,312 / 1,439 / 479 / 90
9 / 155,657 / 1,880 / 945 / 619 / 94
10 / 144,605 / 1,723 / 869 / 489 / 60
11 / 157,608 / 1,992 / 1,171 / 445 / 56
12 / 171,700 / 2,476 / 1,228 / 581 / 75
13 / 140,686 / 2,087 / 928 / 446 / 61
14 / 171,982 / 2,256 / 950 / 688 / 99
15 / 155,252 / 2,179 / 1,016 / 580 / 78
16 / 140,793 / 1,806 / 902 / 464 / 60
17 / 154,377 / 1,671 / 948 / 610 / 61
18 / 150,886 / 2,019 / 1,130 / 532 / 89
19 / 159,198 / 1,585 / 1,335 / 415 / 98
20 / 145,379 / 1,747 / 1,052 / 517 / 91
21 / 152,614 / 1,618 / 860 / 640 / 64
22 / 159,450 / 2,122 / 1,188 / 548 / 61
23 / 160,983 / 1,697 / 1,254 / 425 / 56
24 / 175,393 / 2,406 / 1,187 / 695 / 58
25 / 153,031 / 1,917 / 948 / 468 / 97
26 / 166,110 / 1,658 / 1,015 / 660 / 74
27 / 150,041 / 2,042 / 971 / 478 / 51
28 / 170,419 / 1,757 / 1,111 / 652 / 85

Table 1: Lakeway Enterprises

Solution

·  First we fit a regression line with Factory Overhead as the dependent variable and Labor Hours, Machine Hours, Tons of Raw Material Handled and Production Line Setup as independent variables. The regression results are given below:

SUMMARY OUTPUT
Regression Statistics
Multiple R / 0.92
R Square / 0.85
Adjusted R Square / 0.82
Standard Error / 4720.67
Observations / 28
ANOVA
df / SS / MS / F / Significance F
Regression / 4 / 3E+09 / 7E+08 / 3E+01 / 4E-09
Residual / 23 / 5E+08 / 2E+07
Total / 27 / 3E+09
Coefficients / Standard Error / t Stat / P-value
Intercept / 55244.91 / 11100.47 / 4.98 / 0.00
Labor Hours / 0.28 / 3.40 / 0.08 / 0.93
Machine Hours / 52.76 / 5.61 / 9.40 / 0.00
Tons of Raw Material Handled / 88.34 / 10.24 / 8.63 / 0.00
Production Line Setups / -31.90 / 56.49 / -0.56 / 0.58

From the above results we can see that from the ANOVA table we can see that the F value is 3E+01 with p-value almost equal to zero. Since the p-value is less than 0.05 therefore we reject the null hypothesis of insignificance of the regression model.

Here the estimated regression equation is:

Factory Overhead = 55244.91 + 0.28*Labor hours + 52.76*Machine Hours + 88.34*Tons of Raw Material Handled – 31.90*Production Line Setups

We can see that from the last table, for the significance of regression coefficients, the p-value of for the test is greater than 0.05 for Labor Hours and Production Line setups. Therefore, we conclude that the regression coefficient for Machine Hours and Tons of Raw Material Handled is significant at 5% level of confidence.

Therefore, in the next model we have used only Machine Hours and Tons of Raw Material Handled as the independent variables. The regression results are given below:

SUMMARY OUTPUT
Regression Statistics
Multiple R / 0.92
R Square / 0.85
Adjusted R Square / 0.83
Standard Error / 4561.22
Observations / 28
ANOVA
df / SS / MS / F / Significance F
Regression / 2 / 2.9E+09 / 1.4E+09 / 6.9E+01 / 6.4E-11
Residual / 25 / 5.2E+08 / 2.1E+07
Total / 27 / 3.4E+09
Coefficients / Standard Error / t Stat / P-value
Intercept / 54607.58 / 8921.06 / 6.12 / 0.00
Machine Hours / 52.05 / 5.23 / 9.96 / 0.00
Tons of Raw Material Handled / 87.69 / 9.79 / 8.96 / 0.00

From the above ANOVA Table we can see that the observed F value is 6.9E+01 with p-value almost equal to zero. Since the p-value is less than 0.05 therefore we reject the null hypothesis of insignificance of the regression model.

Here the estimated regression equation is:

Factory Overhead = 54607.58 + 52.05*Machine Hours + 87.69*Tons of Raw Material Handled

We can see that from the last table, for the significance of regression coefficients, the p-value of for the test is less than 0.05 for all the independent variables. Therefore, we conclude that the regression coefficient is significant at 5% level of confidence.

Here the R-square value is 0.85. Therefore, almost 85% of the total variability in Factory Overhead is explained by the regression model.

·  An assumption in regression model is of Normality of the dependent variable. The following figure shows the Normal p-p plot for Factory Overhead:

We can see that from the above figure the plot is almost linear. So the Normality is valid.

·  Cost prediction using the regression equation which you selected if Susan expects to incur 2,000 labor-hours, 1,000 machine-hours, 500 tons of raw materials handled, and 70 production line setups is 150501.9. The 95% confidence interval for predicted value is (141249.4, 159754.4). The 95% confidence interval for predicted value is (142736.4, 158267.4).

·  The problems of estimating overhead cost at the following projected activity level: 3,000 labor hours, 500 machine hours, 700 tons of raw materials handled, and
100 production line setups is that none of the independent variables has a value close to these in the sample data set.