An Artificial Neural Network Approach to Quantify Change Order Impact on Construction Productivity

Term Project Report for:

ECE/CS/ME 539 (Fall 2001, Lecture 1)

Professor: Yu Hen Hu

Lee, Min-Jae

(9014179882)

Department of Civil and Environmental Engineering

University of Wisconsin - Madison

TABLE OF CONTENTS

1. Introduction

2. Background

3. Previous research and Problem Statement

4. Data collection, treating, setup

4.1 Data Collection

4.2 Data Setup for the Neural Network

5. Neural Network approach

5.1 Neural Network Design

5.2 Selection of Network Parameters

5.3 Network Usage (Testing new case studies)

6. Results and comparison

7. Conclusion

8. Data input and output features

9. Reference

1. Introduction

A fact of life for a construction project is change. Changes result from the necessity to modify aspects of the construction project in reaction to circumstances that develop during the construction process. The changes may be small, well managed, and have little effect on the whole construction project. On the other hand, changes may be large, poorly managed, and have tremendous negative impacts on the construction project performance in terms of time and cost.

After the contract is awarded, only the owner has the right to change the contract or the scope of work. If the original design contains mistakes, the owner has to revise the original contract documents. When change orders occur, the contractor usually suffers a loss of productivity. For example, when the project is stopped for some time so that the original design can be changed and corrected, the contractor must still pay labor wages, equipment rental fees, and similar costs. More seriously, if the delay continues, the contractor can lose its next job. Problems of this kind occur frequently in the construction business and often end in court disputes (claims). Resolving them calls for a reliable decision tool developed from historical data.

In this report, the author tries to develop a well-trained artificial neural network that can decide whether or not a project was impacted by change orders, using 140 case studies (historical data).

2. Background

What is a change order? A change order can be defined as “any event, which results in a modification of the original scope, execution time or cost of work” (Ibbs and Allen, 1995).

Why do change orders occur? First, because of the unique characteristics of construction, no two projects are exactly the same. Second, a project must be completed with limited time and money, and this pressure can cause change orders. Third, construction projects are subject to contingencies, which are an inevitable reality of construction. The Construction Industry Institute (CII 2001) reports some of the influences that can cause change orders: “Additions or deletions in project scope; changes in codes, laws, or standards; design optimization; project planning deficiencies; incomplete design documents; workers or material availability limitations; unknown site conditions; schedule compression; or unexpected weather problems.” Accidents or damage can also cause change orders. For these reasons, it is inevitable that changes will arise.

What is the problem with change orders? Construction contracts often include a change clause, which authorizes the project owner to alter the work performed by the contractor through an appropriate change order process. However, such change orders often result in a loss of productivity and, furthermore, in disruption of the whole project because of inefficient labor usage or the cumulative impact (ripple effect) of the change orders.

The idea behind cumulative impact is that the contractor is unable to predict the unforeseen impact of change on other areas of work. The construction industry is based upon sequential production. Any disruption to a task in the sequence will impact the remaining tasks even if the change order itself does not involve these tasks. This is commonly referred to as the ripple effect of changes.

Even though there have been some studies of the impact of change orders on labor productivity and of change order management, the industry is still struggling with issues surrounding change order management and the numerous related court cases.

A recent report of state projects built in the state of Washington reviewed a total of 865 projects and found that 87% (752) of the projects were completed with a combined total of 6,413 change orders of various sizes with an estimated value of $94 million. The report stated that one-third of the total number of change orders, or $35.4 million, could have been avoided. Inadequate field investigation, unclear specifications, plan error, and design change or mistakes by the consulting engineer were cited as causes for these changes (Cambridge Systematics, Inc. 1998).

By clearly understanding the causes of productivity loss due to change orders and the relationships among the factors that influence it, stakeholders can manage project change orders better and can further avoid the loss of productivity.

3. Previous research and Problem Statement

Few studies have attempted to quantify the impact of change orders on project cost and schedule as well as labor productivity.

Leonard et al. (1991) made a significant effort to quantify the effect of change orders on labor efficiency. The study used 90 cases that involved disputes between owners and contractors. The change order impacts were divided into 3 categories: minor, medium, and high. Graphic results that related loss of efficiency to the percentage of changes were presented for each impact category. However, the study had several deficiencies: a limited number of variables, data adjustment, combination of data, a biased sample, and the classification of impact.

The CII commissioned a study titled “Quantitative impact of Project Change” (Ibbs and Allen 1995). In this study, a total of 89 projects were obtained from CII member companies. Even though this study adopted a better methodology, it showed several limitations, including a low R-square in the model that explains the relationship between the variables. The study also failed to support the concept that the timing of a change affects the loss of labor efficiency.

Another CII study, completed at the University of Wisconsin-Madison, used statistical methods to quantify the impact of change orders on labor productivity for mechanical and for electrical construction (Hanna et al. 2001). This study used questionnaires that were distributed to mechanical and electrical contractors, respectively. The data from the questionnaires was used in a regression analysis to build a model that predicts whether or not a project was impacted by change orders. The study identified a limited set of factors that affect the loss of labor efficiency, such as percent change order hours, number of change orders, timing of the change orders and processing time, management experience, and overmanning. The problem with this study is that it showed the impact of only a limited set of factors and considered only a limited number of qualitative variables. The research examined over 70 possible factors that might influence how change orders affect productivity. However, because of the limitations of the regression methodology, it retained only a few factors related to labor productivity. The resulting logistic regression model is shown below:



Table & Figure 1: Logistic Regression Results

Factor                                                Coefficient   P-value
Constant                                              -6.997        0.000
Mechanical or Electrical (x1)                         -1.0939       0.143
Percent Change * Mech_or_Elec (x2)                    3.889         0.006
Estimated/Actual Peak Manpower (x3)                   -1.0371       0.039
Processing Time (x4)                                  0.6342        0.002
Overmanning (x5)                                      2.6433        0.000
Overtime (x6)                                         1.1933        0.052
Peak Manpower / Average Manpower (x7)                 1.2048        0.036
Percent Change Orders Related to Design Issues (x8)   0.017154      0.074

The problem with this model is that logistic regression does not give satisfactory results. In testing, the model achieved only 73% accuracy. This means that, even though the model was developed from 130 case studies, only 73% of them were classified correctly; the model gives the wrong answer for the remaining 27% of cases. The industry needs a model that gives more accurate results.

There are two major difficulties in determining the cumulative impact of change orders on project productivity. First, there are many input parameters that affect the loss of labor productivity. Second, many of these parameters are qualitative in nature and hard to quantify. Examples include the quality of the bid documents, the quality of the contractor’s pre-planning effort, the contractor’s project management experience, and the quality of the engineering designs.
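To make the tabulated model concrete, the following minimal sketch evaluates it for one project. It assumes the standard logistic form (the probability is the logistic function of the constant plus the weighted sum of x1 to x8) and a 0.5 decision threshold. Neither the threshold nor the coding and units of the predictors are stated in this report, so the function names and the example inputs are illustrative only.

```python
import math

# Coefficients copied from the logistic regression table: constant, then x1..x8.
COEFFICIENTS = [-6.997, -1.0939, 3.889, -1.0371, 0.6342,
                2.6433, 1.1933, 1.2048, 0.017154]


def impact_probability(x):
    """Return the modelled probability that a project was impacted.

    x holds the eight predictor values x1..x8. Their coding and units are
    assumptions here, because the report does not spell them out.
    """
    if len(x) != 8:
        raise ValueError("expected eight predictor values x1..x8")
    z = COEFFICIENTS[0] + sum(b * v for b, v in zip(COEFFICIENTS[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def classify(x, threshold=0.5):
    """Label a project as impacted (1) or not impacted (0).

    The 0.5 threshold is an assumed default, not a value from the report.
    """
    return 1 if impact_probability(x) >= threshold else 0
```

With all predictors at zero only the constant remains, so the modelled probability is close to zero; positive coefficients such as the one for overmanning (x5) push the probability upward.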

From the literature review, three methods could be used to predict the outcomes of engineering systems, namely regression analysis, artificial neural networks, and fuzzy logic. However, when handling a problem like quantifying the impact of change on productivity, each of these methods has its own limitations.

Regression analysis cannot capture a highly nonlinear input-output function and, moreover, has limited success when dealing with many qualitative or noisy input variables.

Fuzzy logic is capable of expressing fuzzy knowledge about real-life systems. However, as the system complexity increases, it becomes difficult to use fuzzy logic to determine the right set of rules and membership functions. Moreover, fuzzy logic does not make use of real-life data to learn the system behavior.

Neural networks are capable of approximating any continuous function describing the system behavior through learning from examples. However, in complex systems, they face two problems. First, for a desired continuous function there are an optimum size and optimum interconnections of the neural network, which are a priori unknowns. On one hand, a much smaller network might not be able to achieve the goal of approximating the function. On the other hand, a much larger network can represent a pathological function that is far from being real, and can further be very slow to train. Second, neural networks may be very slow to learn.

4. Data collection, treating, setup

4.1 Data Collection

One hundred forty case studies were collected from electrical and mechanical specialty contractors through research conducted by the principal investigator for the Construction Industry Institute (CII). The database for this research consisted of sixty-five mechanical projects and seventy-five electrical projects from thirty-three mechanical contractors and thirty-five electrical contractors across the United States. The contractors represent a wide range of large and small contractors that perform both large and small amounts of work. The database also showed a low average number of litigations per year, indicating that the sample is not biased towards contractors that are troublesome.

These case studies include 70 independent factors that may impact project performance. Many of these factors are qualitative and were difficult to model using traditional statistical techniques.

4.2 Data Setup for the Neural Network

The data set has 140 case studies with 68 input features and one output feature. The input features comprise numerical inputs, binary inputs (0 or 1), and categorical inputs (1, 2, 3, 4, 5). The output feature is binary (1 if the project was impacted by change orders, 0 if it was not). Brief descriptions of each feature are given in Chapter 8 (Data input and output features).

One of the challenges of this project is reducing the number of input features. Some of the factors are very important, while others do not help to improve the result. In class, we learned two methods for evaluating features (feature dimension reduction). The first is “irrelevant feature reduction” (removing uncorrelated features by calculating the mean and variance of each feature dimension). The second is “redundant feature reduction” (if the value of a feature is linearly dependent on the remaining features, that feature can be removed). Both can be carried out with statistical methods. A statistical significance test detects the features that have a strong relationship with the output feature, so irrelevant features can be removed. A statistical correlation test calculates the correlation between each pair of features, so redundant features can be removed. I performed these tasks with the statistical software Minitab® and reduced the number of features from 68 to 20. The test values (r-value and p-value) are summarized in Table 2 below.
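The two reduction steps can be illustrated with a short sketch. This is not the Minitab® procedure used for this report: it uses the magnitude of the Pearson correlation as a stand-in for the significance test, and the two thresholds (`min_relevance`, `max_redundancy`) are hypothetical values chosen only for illustration.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no information
    return cov / math.sqrt(var_a * var_b)


def select_features(columns, target, min_relevance=0.2, max_redundancy=0.9):
    """Return the indices of the features kept after both reduction steps.

    columns: list of feature columns, each a list with one value per case.
    target:  list of 0/1 outcomes, one per case.
    Both thresholds are illustrative assumptions.
    """
    # Step 1 (irrelevant features): drop columns weakly related to the output.
    relevant = [i for i, col in enumerate(columns)
                if abs(pearson_r(col, target)) >= min_relevance]
    # Step 2 (redundant features): visit the most relevant columns first and
    # skip any column that is almost a linear copy of one already kept.
    relevant.sort(key=lambda i: -abs(pearson_r(columns[i], target)))
    kept = []
    for i in relevant:
        if all(abs(pearson_r(columns[i], columns[j])) < max_redundancy
               for j in kept):
            kept.append(i)
    return sorted(kept)
```

A pairwise correlation check only catches a feature that duplicates one other feature; detecting linear dependence on several features at once would need a regression of each feature on the others.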

To run the MATLAB® programs, I divided the data set into two sets: a training set (130 case studies) saved in impacttr.txt, and a testing set (10 case studies) saved in impactte.txt.
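A split of this kind can be sketched as follows. The report does not say how the 10 test cases were chosen, so the random shuffle and the fixed seed below are assumptions, and `split_cases` is a hypothetical helper name.

```python
import random


def split_cases(rows, n_test=10, seed=0):
    """Shuffle the case studies and hold out n_test of them for testing.

    rows: one list per case study, features first and the 0/1 outcome last.
    Returns (training_rows, testing_rows).
    """
    shuffled = list(rows)                      # leave the caller's list intact
    random.Random(seed).shuffle(shuffled)      # seeded, so the split repeats
    return shuffled[n_test:], shuffled[:n_test]
```

For 140 cases the call `split_cases(rows)` returns 130 training rows and 10 testing rows, which could then be written to impacttr.txt and impactte.txt.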


Table 2: Correlation and P-value for selected input features

5. Neural Network approach

5.1 Neural Network Design

Over the course of the ECE 539 semester, I was exposed to many artificial neural network learning algorithms. Based on the results of the learning experiments conducted during the semester, I determined that back-propagation perceptron learning would be the best choice for my application. Figure 2 below shows the basic design of the neural network.

Figure 2: Building Block of the Neural Computing Network [Neural 1989]

I used the multi-layer perceptron back-propagation implementation in Professor Hu’s bp.m and its sub-programs (bpconfig.m - Initial configurations of a MLP, cvgtest.m - Convergence test routine, bpdisplay.m - Display intermediate and final result of BP training, bptest.m - Test classification, bptestap.m - Test approximation, rsample.m - random sample K out of Kr rows from matrix x, actfun.m and actfunp.m - Compute activation function and their derivatives, partunef.m - partition the training data samples x into a training set and a tuning set according to a user specified percentage ratio, fsplit.m - Random partition a matrix x row-wise according to ratio:1, scale.m, randomize.m). It uses a training file (impacttr) and a testing file (impactte), in which each row specifies one case and the last column specifies the result. As mentioned before, I configured bp.m to use 20 inputs and one output.

Professor Hu’s program (bp.m) lets the user specify the name of the training file, the name of the testing file, the number of hidden neurons to use, the learning rate, the momentum constant, and the maximum number of epochs to run. These options are used to select the optimized network settings and number of iterations.
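For readers without access to bp.m, the sketch below shows the general technique in a few lines: a one-hidden-layer perceptron trained by back-propagation with a momentum term. It is not Professor Hu's program. The class name `MLP`, the sigmoid activations, the per-sample weight updates, and the weight initialization range are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class MLP:
    """One-hidden-layer perceptron with a single sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row ends with a bias weight.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        # Previous weight changes, needed for the momentum term.
        self.dw_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.dw_out = [0.0] * (n_hidden + 1)

    def forward(self, x):
        """Return (hidden activations, network output) for one input vector."""
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        output = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, output

    def train_epoch(self, samples, lr=0.01, momentum=0.8):
        """One pass of per-sample updates; returns the sum of squared errors."""
        sse = 0.0
        for x, target in samples:
            hidden, y = self.forward(x)
            err = target - y
            sse += err * err
            delta_out = err * y * (1.0 - y)
            # Hidden deltas use the output weights before they are updated.
            delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                            for j, h in enumerate(hidden)]
            for j, h in enumerate(hidden + [1.0]):
                self.dw_out[j] = lr * delta_out * h + momentum * self.dw_out[j]
                self.w_out[j] += self.dw_out[j]
            xb = list(x) + [1.0]
            for j, row in enumerate(self.w_hidden):
                for i, v in enumerate(xb):
                    self.dw_hidden[j][i] = (lr * delta_hidden[j] * v
                                            + momentum * self.dw_hidden[j][i])
                    row[i] += self.dw_hidden[j][i]
        return sse

    def classification_rate(self, samples):
        """Percentage of samples whose thresholded output matches the target."""
        hits = sum(1 for x, t in samples
                   if (self.forward(x)[1] >= 0.5) == (t == 1))
        return 100.0 * hits / len(samples)
```

A network matching the 20-input, one-output setup would be created with `MLP(20, n_hidden)` and trained by calling `train_epoch` once per epoch on a list of `(features, outcome)` pairs.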

5.2 Selection of Network Parameters


To select the network parameters, I first ran experiments to get a rough idea of which parameters worked well. Then I held all parameters constant except one and varied that one. In this way I selected the number of hidden neurons, the learning rate, the momentum constant, the number of samples to include in each epoch, and the maximum number of epochs to run. The figure below shows the basic setup of the network.

First, I ran the network with the default values given in the bp.m program (20-3-1 structure, learning rate = 0.01, momentum = 0.8, 500 epochs). The result was very disappointing: the neural network did not learn, and the classification rate was low (53.0769%). Figure 3 below shows this training result.

Figure 3: Initial training with default values (classification rate = 53.0769%)

To improve the result, I followed the procedure learned in class. First, I ran the network with different combinations of learning rate and momentum values. Table 3 below shows the results (each case was repeated 10 times).

Table 3: Classification rate for each case


From the results in the table above, we should pick the case with learning rate = 0.01 and momentum = 0.8. This case gives a better result than the earlier run because the number of neurons in the hidden layer was increased from 3 to 5. This suggests that the number of hidden layers and neurons matters more in this study.

To improve the result further, we need to optimize the number of hidden neurons in the network. The results are summarized in Table 4 below.

Table 4: Comparison of different numbers of neurons

Comparing the classification results, we should choose between 10 and 20 hidden neurons. The twenty-neuron case has the highest classification rate but also a large variance, so fifteen neurons is likely the best choice.

To finalize the network selection, we also need to check the effect of the number of hidden layers. I tried several configurations; Table 5 below compares the results.
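The repeated runs over learning rate and momentum can be organized as a small grid sweep. In the sketch below, `evaluate` is a hypothetical callable that trains one network with the given settings and random seed and returns its classification rate; it stands in for a run of bp.m and is not part of that program.

```python
import statistics


def sweep(evaluate, learning_rates, momenta, repeats=10):
    """Call evaluate(lr, momentum, seed) `repeats` times for every setting.

    Returns a dict mapping (lr, momentum) to (mean rate, standard deviation),
    so that both the typical result and its spread can be compared.
    """
    results = {}
    for lr in learning_rates:
        for momentum in momenta:
            rates = [evaluate(lr, momentum, seed) for seed in range(repeats)]
            results[(lr, momentum)] = (statistics.mean(rates),
                                       statistics.pstdev(rates))
    return results


def best_setting(results):
    """Return the (lr, momentum) pair with the highest mean rate."""
    return max(results, key=lambda key: results[key][0])
```

Reporting the standard deviation next to the mean makes it easier to see when a setting has a high average rate but unstable results across repeats.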


Table 5: Comparison of different numbers of hidden layers

Different combinations of hidden layers and neurons did not improve the result. Since additional hidden layers do not help, I stay with a single hidden layer.

Summarizing the results above, my choice is a multi-layer perceptron with a 20-15-1 structure (15 hidden neurons), a learning rate of 0.01, and a momentum value of 0.8.

After discussing my results with the professor and the TA, I realized that I could improve the classification rate slightly by scaling the input features to the same range. Most of my input features range from 0 to 100, but some are binary (0 or 1) and some have a larger scale (from 0 to 1000). I therefore re-scaled some of the features to a uniform scale (from 0 to 100) and obtained a slightly better result (classification rate = 91.5385%). Figure 4 below shows the error versus epoch for the selected network.

Figure 4: Error vs. Epoch for selected network
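The re-scaling step described above can be sketched as a linear (min-max) mapping of each input column onto the range 0 to 100. The report does not state the exact formula that was used, so the min-max form and the helper names below are assumptions.

```python
def rescale_column(values, low=0.0, high=100.0):
    """Linearly map one feature column onto the range [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [low for _ in values]  # constant column: nothing to spread out
    return [low + (v - v_min) / (v_max - v_min) * (high - low) for v in values]


def rescale_features(rows):
    """Rescale every column of a list of cases (rows of equal length).

    Pass only the input features; the 0/1 output column should be left out.
    """
    columns = [rescale_column(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]
```

Under this mapping a binary feature becomes 0 or 100 and a feature ranging from 0 to 1000 is compressed by a factor of ten, so every input contributes on a comparable scale to the weighted sums in the hidden layer.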


Table 6 below shows the confusion matrix of the classification. It indicates that the network model misclassified only 4 of the 58 unimpacted projects and only 7 of the 72 impacted projects, for an overall accuracy of 91.54%.
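The overall accuracy follows directly from the counts quoted above, as the short check below shows. The matrix entries are derived from those counts (58 - 4 = 54 and 72 - 7 = 65 correct classifications); the row and column ordering is an assumption made for this sketch.

```python
def classification_rate(confusion):
    """Overall accuracy in percent from a square confusion matrix.

    confusion[i][j] counts projects of true class i assigned to class j.
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * correct / total


# Rows: truly unimpacted, truly impacted.
# Columns: predicted unimpacted, predicted impacted.
confusion = [[54, 4],
             [7, 65]]
print(round(classification_rate(confusion), 2))  # 91.54
```

The 11 misclassified projects out of 130 give 119 / 130, which matches the reported classification rate of 91.5385.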