In construction project management, several factors influence the final project cost. Among the available approaches, estimate at completion (EAC) is an essential tool for forecasting the final project cost. The main merit of EAC is that it incorporates the probability of project performance and risk. In addition, EAC helps project managers identify critical problems as the project progresses and determine appropriate solutions to them. In this research, a relatively new intelligent model called the deep neural network (DNN) is proposed to calculate the EAC. The proposed DNN model is validated against one of the predominant intelligent models used for EAC prediction, namely, the support vector regression (SVR) model. To demonstrate the capability of the model in engineering applications, historical information obtained from fifteen projects in Iraq is examined. The second phase of this research integrates two input selection algorithms with the proposed and the comparable predictive models. These input optimization algorithms are the genetic algorithm (GA) and the brute force algorithm (BF). The aim of integrating them is to select the input attributes and investigate the factors that most strongly influence the calculation of EAC. Overall, the motivation of this study is to provide a robust intelligent model that estimates the project cost more accurately than the traditional methods. A second aim is to introduce a reliable methodology that can provide efficient and effective project cost control. The proposed GA-DNN is demonstrated to be a reliable and robust intelligent model for EAC calculation.
The importance of early planning to final project outcomes is emphasized widely in the literature [
Project cost control is a crucial concern in construction project engineering. However, controlling cost is a time-consuming and difficult process, because a large number of factors affect project cost and the influence of each factor should be considered individually at each stage of the project [
Various regression-based approaches have also been proposed as advantageous alternatives to the index-based approach for performing cost estimation activities [
Recently, advanced methods have been proposed to overcome the drawbacks of the traditional methodologies. For instance, Caron et al. [
In one of the earlier investigations conducted by Iranmanesh and Zarezadeh [
Although several investigations since 2008 have addressed EAC estimation using soft computing models, the topic remains underexplored and requires more attention from experts; in particular, only a limited number of studies address project cost control with AI techniques. Several limitations of AI models have been reported, such as their black-box nature, the requirement of a significant amount of data, overfitting, model interaction, and time consumption [
The proposed GA-DNN model for the prediction of the EAC.
Considering the mentioned drawbacks and conclusions, it is imperative to design a fast and effective system that addresses cost control during project execution by predicting project EAC with AI methods. This study aims to resolve the identified issues in project cost management by collecting relevant historical data and reviewing studies on project cost management to identify the factors that significantly affect project cost. The historical data are collected from several construction projects located in Iraq. This project information is used to establish the trend of project cost flow, and the relationship between project EAC and monthly costs is mapped based on historical knowledge and experience. Based on the historical data, a new intelligent model called the deep neural network (DNN) is developed for the prediction and control of EAC variation during project execution. The suggested DNN model is validated against the support vector regression (SVR) prediction model. The second phase of the research is devoted to the implementation of hybrid models in which a genetic algorithm (GA) or a brute force (BF) procedure is integrated with the deep neural network, giving GA-DNN and BF-DNN. The aim of applying the selection phase as a prior stage to the predictive model is to identify the correlated attributes needed to build an accurate predictive model. Again, the hybrid models are validated against GA-SVR and BF-SVR. This step supports the identification of potential issues so that effective measures can be implemented in time.
Neural networks can be applied to many problems because of their ability to compute any computable function. They are mainly useful for problems that can tolerate some level of error, or that come with abundant historical data but cannot easily be handled with hard-and-fast rules [
The construction of standard NNs requires the use of neurons to produce real-valued activations, and the NNs can behave as expected by adjusting the weights of the neurons. There may be several chains of computational stages during the training of NNs depending on the problem to be solved. Since 1980, backpropagation, an efficient gradient descent algorithm, has played a significant role in NNs by its capability of training ANN via a teacher-based supervised learning method [
The layer-wise-greedy-learning method was proposed by Hinton et al. [
Although several types of deep learning model exist, this discussion focuses on deep neural networks constructed from multiple hidden layers, often known as backpropagation neural networks. Deep learning is historically based on the use of backpropagation with gradient descent and a large number of nodes and hidden layers. This type of backpropagation neural network is indeed the first deep learning approach that showed a wide range of applications. A typical DNN comprises closely connected input, output, and several hidden layers. The input and hidden layers are directly connected and operate together to weight the input values and produce a new set of real numbers that is transmitted to the output layer (Figure
(a) The standard architecture of the deep neural network and (b) the support vector regression model structure.
The main merit of the DNN is that the deep multilayer neural network is made up of several levels of nonlinearity, which makes it applicable to the representation of highly nonlinear and/or highly varying functions. Such networks can identify complicated patterns in data and can be applied to naturally complex problems. The connection weights between the layers, as in the single-layer neural network, are updated so that the output value approaches the targeted output.
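The layered weighting described above can be illustrated with a minimal forward pass. This is only a sketch under stated assumptions: the hidden-layer sizes (16 and 8), the tanh activation, the linear output, and the random initial weights are hypothetical choices for illustration and are not the configuration reported in this study. Only the forward computation is shown; the weight update by backpropagation is omitted.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)  # small random weights (assumed initialisation)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """Propagate one input vector: tanh on hidden layers, linear output layer."""
    activation = list(x)
    for index, (weights, biases) in enumerate(layers):
        # Weighted sum of the previous layer's activations plus the bias.
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activation = z if is_output else [math.tanh(v) for v in z]
    return activation


rng = random.Random(0)
layer_sizes = [9, 16, 8, 1]  # 9 inputs, two hidden layers (hypothetical), 1 output
layers = [init_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]
sample = [rng.random() for _ in range(9)]  # one synthetic record scaled to [0, 1]
eac_hat = forward(sample, layers)          # untrained network output, one value
```

Each hidden layer applies a nonlinearity to a weighted sum of the previous layer, which is the stacking of nonlinear levels that the paragraph refers to; training would then adjust `weights` and `biases` to bring `eac_hat` closer to the target.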
Figure
Note that the outcome of
Vapnik [
The optimization problem of the SVR model is usually elucidated using the Lagrangian multipliers, sequential minimal optimization [
GA is a very well-known optimization technique that can be classified as an evolutionary method based on biological process [
The proposed hybrid genetic algorithm deep neural network (GA-DNN) predictive model.
In Figure
In the evolutionary algorithm, the first step is the creation of a population of individuals that evolves over time. This initial step is known as the initialization phase of the GA. In the starting population, the individuals are randomly generated and represented as bit vectors, as described earlier. These individuals can be created by tossing a coin for each available attribute; the outcome of the toss determines whether the attribute is included in the individual. There are no rules governing the size of the initial population; however, there must be at least 2 individuals in a GA population to proceed to the crossover phase. A common rule of thumb is to set the size of the initial population to between 5 and 30% of the total number of attributes. Having created the initial population, several steps need to be performed to reach the stopping criterion.
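The initialization phase described above can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the population size of 6 is a hypothetical value, the attribute order is arbitrary, and an individual that selects no attribute is redrawn because it could not feed a predictive model. The function names are illustrative only.

```python
import random

# The nine candidate predictors named in this study (order is arbitrary).
FEATURES = [
    "CV", "SV", "CPI", "SPI",
    "subcontractor billed index", "owner billed index",
    "climate effect index", "change order index", "CCI",
]


def random_individual(n_features, rng):
    """Toss a fair coin per attribute; 1 means the attribute is included."""
    while True:
        bits = [rng.randint(0, 1) for _ in range(n_features)]
        if any(bits):  # assumption: redraw an individual with no attributes
            return bits


def init_population(n_features, size, rng):
    """Create the starting population of bit vectors."""
    if size < 2:
        raise ValueError("crossover needs at least 2 individuals")
    return [random_individual(n_features, rng) for _ in range(size)]


def decode(bits, names):
    """Translate a bit vector into the list of selected attribute names."""
    return [name for bit, name in zip(bits, names) if bit]


rng = random.Random(42)
population = init_population(len(FEATURES), size=6, rng=rng)
selected_inputs = [decode(individual, FEATURES) for individual in population]
```

Each decoded individual is one candidate input combination; the later GA steps would evaluate these candidates with the predictive model and evolve the population.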
Brute force (BF) is a systematic selection approach for problems that require the enumeration of all the possible feature combinations [
The current research is conducted on fifteen construction projects executed in Baghdad city, Iraq. The detailed information about those projects is provided in Table
The details of the inspected construction projects.
Project name | Total area (m2) | Underground floors | Ground floors | Buildings | Start date | Finish date | Duration (days) | Contract amount ($) | Prediction periods |
---|---|---|---|---|---|---|---|---|---|
A | 7,854 | 2 | 2 | 2 | 04/22/2007 | 04/17/2008 | 361 | 4,319,000 | 9 |
B | 6,238 | 1 | 1 | 1 | 01/03/2008 | 10/24/2009 | 295 | 3,512,900 | 12 |
C | 7,284 | 0 | 1 | 1 | 09/28/2006 | 08/31/2007 | 337 | 5,119,050 | 10 |
D | 6,824 | 0 | 2 | 1 | 02/15/2005 | 01/17/2006 | 336 | 4,519,050 | 13 |
E | 6,453 | 1 | 1 | 2 | 06/05/2005 | 07/02/2006 | 392 | 4,812,800 | 10 |
F | 7,471 | 2 | 1 | 2 | 10/05/2008 | 08/12/2009 | 312 | 3,627,300 | 12 |
G | 7,864 | 1 | 1 | 1 | 03/03/2009 | 01/27/2010 | 330 | 4,423,050 | 11 |
H | 6,678 | 1 | 1 | 1 | 05/01/2007 | 02/24/2008 | 299 | 3,627,300 | 9 |
I | 9,340 | 1 | 2 | 3 | 06/23/2007 | 08/13/2008 | 417 | 6,339,350 | 11 |
J | 10,628 | 0 | 2 | 2 | 10/09/2004 | 11/29/2005 | 416 | 6,128,150 | 13 |
K | 8,245 | 0 | 1 | 3 | 08/15/2010 | 07/13/2011 | 332 | 5,111,350 | 11 |
L | 8,782 | 0 | 2 | 3 | 01/10/2006 | 01/05/2007 | 360 | 3,914,600 | 12 |
M | 6,625 | 1 | 2 | 1 | 01/01/2008 | 02/28/2009 | 424 | 7,529,350 | 14 |
N | 5,441 | 1 | 1 | 2 | 08/25/2007 | 08/18/2008 | 359 | 5,223,100 | 13 |
O | 5,730 | 1 | 1 | 1 | 04/01/2005 | 02/27/2006 | 332 | 5,522,500 | 14 |
Total | | | | | | | | | 174 |
Training | | | | | | | | | 131 |
Testing | | | | | | | | | 43 |
The collected project information includes cost variance (CV), schedule variance (SV), cost performance index (CPI), schedule performance index (SPI), subcontractor billed index, owner billed index, climate effect index, change order index, and construction price fluctuation (CCI). These nine factors are used as predictors, and the estimate at completion (EAC) is the target variable. The 15 projects comprise 174 periods; 75% of the periods (131) are used for the training phase and 25% (43) for the testing phase of the predictive models. The historical data are normalized linearly to the range between 0 and 1 so that the programming environment is supplied with scaled numerical values. The normalization is performed as follows:
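A linear rescaling to the interval [0, 1] is commonly written as x' = (x − x_min) / (x_max − x_min). The sketch below assumes this min-max form and also reproduces the 131/43 partition of the 174 periods. The function names are hypothetical, and how the periods were assigned to each subset (randomly or chronologically) is not stated here, so only the counts are computed.

```python
import math


def min_max_scale(values):
    """Linearly rescale a list of numbers to [0, 1] (assumed min-max form)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_counts(n_periods, train_fraction=0.75):
    """Return (n_train, n_test); the training count is rounded up."""
    n_train = math.ceil(train_fraction * n_periods)
    return n_train, n_periods - n_train


# Contract amounts of projects A, B, and M from the project table, used
# here only as example numbers for the scaling.
contract_amounts = [4_319_000, 3_512_900, 7_529_350]
scaled = min_max_scale(contract_amounts)

n_train, n_test = split_counts(174)
```

In practice the minimum and maximum would normally be taken from the training subset and reused for the testing subset, so that no information from the test periods enters the scaling; whether this was done in the present study is not stated.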
Input-output variables system structure using the hybrid intelligent GA-DNN predictive model.
Following various engineering applications and within prediction problems [
Prior to the prediction process, the cost database of the selected projects was established. The data represent the planned and the actual cost values for each month and the computed difference between them. The mathematical relationship between the nine attributes (from which the input combinations are drawn) and the EAC is explored using the learning capability of AI models. The motivation for applying AI models to compute the EAC is to overcome the drawbacks of the classical index-based formulations, since AI models can mimic human intelligence in solving complex real-life problems.
The primary prediction modeling was conducted for the stand-alone proposed DNN and its comparable SVR predictive model. Table
The numerical evaluation indicators for the DNN and SVR predictive models “stand-alone versions” over the testing modeling phase.
Predictive models | RMSE | MAE | MRE | NSE | SI | BIAS | WI |
---|---|---|---|---|---|---|---|
DNN | 0.1303 | 0.0778 | 5.7468 | 0.4964 | 0.8313 | 0.0162 | 0.7419 |
SVR | 0.1360 | 0.0857 | 11.4615 | 0.4514 | 0.8676 | -0.0026 | 0.6931 |
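The indicators reported in the table can be computed along the following lines. This is a sketch under assumptions, because the defining equations are not reproduced here: RMSE, MAE, the Nash-Sutcliffe efficiency (NSE), and Willmott's index (WI) follow their usual textbook forms; SI is taken as RMSE divided by the mean observation; and BIAS is taken as the mean of predicted minus observed, whose sign convention is assumed. MRE is omitted because its exact definition cannot be inferred from the table alone.

```python
import math


def evaluate(observed, predicted):
    """Return a dict of error and agreement indicators for two equal-length lists."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    sse = sum(e * e for e in errors)  # sum of squared errors

    rmse = math.sqrt(sse / n)
    mae = sum(abs(e) for e in errors) / n
    bias = sum(errors) / n  # assumed sign: predicted minus observed
    # Nash-Sutcliffe efficiency: 1 is a perfect fit.
    nse = 1.0 - sse / sum((o - mean_obs) ** 2 for o in observed)
    # Willmott's index of agreement: 1 is a perfect fit.
    wi = 1.0 - sse / sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                         for o, p in zip(observed, predicted))
    si = rmse / mean_obs  # scatter index (assumed definition)
    return {"RMSE": rmse, "MAE": mae, "NSE": nse,
            "SI": si, "BIAS": bias, "WI": wi}


# Synthetic example: every prediction overshoots the observation by 0.1.
observed = [0.1, 0.2, 0.3, 0.4]
predicted = [0.2, 0.3, 0.4, 0.5]
scores = evaluate(observed, predicted)
```

Lower RMSE, MAE, and SI and a BIAS near zero indicate smaller errors, whereas NSE and WI approach 1 as the predictions agree more closely with the observations.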
The motivation for coupling the input selection approach with the predictive model is to explore the predominant input combination correlated with the EAC magnitude. This is highly significant for recognizing the main variables that influence the variance of the EAC results as the project progresses. The nature-inspired genetic algorithm was hybridized with the DNN to select the suitable input combination. In addition, the brute-force selection procedure is used as a benchmark for the GA.
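The brute-force benchmark can be sketched as an exhaustive enumeration of input combinations. The sketch assumes that combinations of two to eight of the nine attributes are examined and that the best combination is retained for each size; both are inferences from the way the results are tabulated, not statements from the text. The scoring function is a placeholder standing in for the test error of a model trained on each combination.

```python
from itertools import combinations

FEATURES = [
    "CV", "SV", "CPI", "SPI",
    "subcontractor billed index", "owner billed index",
    "climate effect index", "change order index", "CCI",
]


def enumerate_subsets(names, min_size=2, max_size=8):
    """Yield every combination of names with min_size..max_size members."""
    for size in range(min_size, max_size + 1):
        yield from combinations(names, size)


def best_subset_per_size(names, score, min_size=2, max_size=8):
    """Return {size: lowest-scoring subset}; lower scores are better."""
    best = {}
    for subset in enumerate_subsets(names, min_size, max_size):
        size = len(subset)
        if size not in best or score(subset) < score(best[size]):
            best[size] = subset
    return best


def toy_score(subset):
    # Placeholder only: a real run would train the DNN or SVR on `subset`
    # and return an error measure such as the testing RMSE.
    return sum(len(name) for name in subset)


candidates = list(enumerate_subsets(FEATURES))
selected = best_subset_per_size(FEATURES, toy_score)
```

With nine attributes there are 501 combinations of sizes two to eight, so exhaustive search is feasible here; the count grows exponentially with the number of attributes, which is the usual reason for preferring a GA on larger problems.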
The input combination and the prediction skill results of the hybrid model GA-DNN are indicated in Tables
The input combination attributes used to determine the value of the EAC using GA-DNN model.
Number of inputs | Models | Input variables | Output variable |
---|---|---|---|
2 | Model 1 | CV, SV | EAC |
3 | Model 2 | CV, SV, CPI | EAC |
4 | Model 3 | CV, SV, CPI, subcontractor billed index | EAC |
5 | Model 4 | CV, SV, SPI, owner billed index, CCI | EAC |
6 | Model 5 | CV, SV, CPI, SPI, change order index, CCI | EAC |
7 | Model 6 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, CCI | EAC |
8 | Model 7 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, change order index, climate effect index | EAC |
The numerical evaluation indicators for the GA-DNN predictive model over the testing modeling phase.
Method | RMSE | MAE | MRE | NSE | SI | BIAS | WI |
---|---|---|---|---|---|---|---|
Model 1 | 0.0843 | 0.0555 | −0.1082 | 0.7890 | 0.5382 | 0.0229 | 0.8710 |
Model 2 | | | | | | | |
Model 3 | 0.0910 | 0.0620 | 0.2943 | 0.7541 | 0.5809 | 0.0036 | 0.8722 |
Model 4 | 0.0919 | 0.0629 | −0.1093 | 0.7493 | 0.5865 | 0.0334 | 0.9106 |
Model 5 | 0.1010 | 0.0629 | 0.0671 | 0.7049 | 0.6268 | 0.0288 | 0.8561 |
Model 6 | 0.0777 | 0.0510 | −0.1594 | 0.8253 | 0.4823 | 0.0175 | 0.9182 |
Model 7 | 0.1002 | 0.0595 | 0.2029 | 0.7020 | 0.6394 | 0.0102 | 0.8404 |
The input combination attributes used to determine the value of the EAC using the BF-DNN model.
Number of inputs | Models | Input variables | Output variable |
---|---|---|---|
2 | Model 1 | CV, SV | EAC |
3 | Model 2 | CV, subcontractor billed index, CCI | EAC |
4 | Model 3 | CV, SV, CPI, SPI | EAC |
5 | Model 4 | CV, SV, SPI, subcontractor billed index, climate effect index | EAC |
6 | Model 5 | CV, SV, CPI, SPI, owner billed index, climate effect index | EAC |
7 | Model 6 | CV, SV, CPI, SPI, subcontractor billed index, change order index, CCI | EAC |
8 | Model 7 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, change order index, climate effect index | EAC |
The numerical evaluation indicators for the BF-DNN predictive model over the testing modeling phase.
Models | RMSE | MAE | MRE | NSE | SI | BIAS | WI |
---|---|---|---|---|---|---|---|
Model 1 | 0.0843 | 0.0555 | −0.1082 | 0.7890 | 0.5382 | 0.0229 | 0.8978 |
Model 2 | 0.0712 | 0.5721 | 0.4154 | 0.8494 | 0.4546 | −0.0086 | 0.9229 |
Model 3 | 0.0750 | 0.0537 | 0.2552 | 0.8333 | 0.4783 | 0.0020 | 0.9179 |
Model 4 | 0.0802 | 0.0527 | 0.0250 | 0.8091 | 0.5118 | 0.0155 | 0.9038 |
Model 5 | 0.0890 | 0.0594 | 0.4141 | 0.7651 | 0.5678 | 0.0034 | 0.8778 |
Model 6 | | | | | | | |
Model 7 | 0.1009 | 0.0607 | 0.5762 | 0.6981 | 0.6437 | 0.0168 | 0.8654 |
The modeling input combinations and prediction skills results of the GA-SVR and BF-SVR are tabulated in Tables
The input combination attributes used to determine the value of the EAC using the GA-SVR model.
Number of inputs | Models | Type of input variables | Output variable |
---|---|---|---|
2 | Model 1 | CV, CCI | EAC |
3 | Model 2 | CV, subcontractor billed index, change order index | EAC |
4 | Model 3 | CV, CPI, owner billed index, CCI | EAC |
5 | Model 4 | CV, subcontractor billed index, owner billed index, change order index, CCI | EAC |
6 | Model 5 | CV, CPI, SPI, subcontractor billed index, change order index, CCI | EAC |
7 | Model 6 | CV, SV, CPI, SPI, change order index, CCI, climate effect index | EAC |
8 | Model 7 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, change order index, climate effect index | EAC |
The numerical evaluation indicators for the GA-SVR predictive model over the testing modeling phase.
Method | RMSE | MAE | MRE | NSE | SI | BIAS | WI |
---|---|---|---|---|---|---|---|
Model 1 | 0.0923 | 0.0571 | 0.3860 | 0.7473 | 0.5889 | −0.0064 | 0.8710 |
Model 2 | 0.0847 | 0.5820 | 0.3972 | 0.7874 | 0.5401 | −0.0016 | 0.8925 |
Model 3 | | | | | | | |
Model 4 | 0.1014 | 0.0588 | 0.4669 | 0.6948 | 0.6471 | −0.0001 | 0.8359 |
Model 5 | 0.1056 | 0.0641 | 0.6324 | 0.6772 | 0.6555 | 0.0074 | 0.8241 |
Model 6 | 0.1096 | 0.0652 | 0.5087 | 0.6523 | 0.6803 | 0.0244 | 0.8359 |
Model 7 | 0.1121 | 0.0629 | 0.7884 | 0.6271 | 0.7154 | −0.0229 | 0.8187 |
The input combination attributes used to determine the value of the EAC using the BF-SVR model.
Number of inputs | Models | Type of input variables | Output variable |
---|---|---|---|
2 | Model 1 | CV, SV | EAC |
3 | Model 2 | CV, SV, CPI | EAC |
4 | Model 3 | CV, SV, CPI, SPI | EAC |
5 | Model 4 | CV, SV, CPI, SPI, subcontractor billed index | EAC |
6 | Model 5 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index | EAC |
7 | Model 6 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, change order index | EAC |
8 | Model 7 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, change order index, CCI | EAC |
The numerical evaluation indicators for the BF-SVR predictive model over the testing modeling phase.
Method | RMSE | MAE | MRE | NSE | SI | BIAS | WI |
---|---|---|---|---|---|---|---|
Model 1 | 0.0853 | 0.0497 | 0.2128 | 0.7840 | 0.5445 | 0.0062 | 0.8871 |
Model 2 | 0.0778 | 0.5431 | 0.3569 | 0.8204 | 0.4964 | −0.0063 | 0.9147 |
Model 3 | | | | | | | |
Model 4 | 0.0966 | 0.0576 | 0.3964 | 0.7235 | 0.6160 | −0.0004 | 0.8590 |
Model 5 | 0.0970 | 0.0501 | 0.4953 | 0.7208 | 0.6190 | 0.0070 | 0.8533 |
Model 6 | 0.0961 | 0.0593 | 0.4255 | 0.7262 | 0.6148 | 0.0228 | 0.8802 |
Model 7 | 0.1148 | 0.0638 | 0.5869 | 0.6088 | 0.7327 | 0.0316 | 0.8236 |
A scatter plot is one of the best ways to visualize the correlation between the actual observations and the predicted values. Figure
The scatter plot graphical visualization between the actual observation of EAC and the intelligence predictive models: (a) optimal input combination for GA-DNN; (b) optimal input combination for GA-SVR; (c) optimal input combination for BF-DNN; (d) optimal input combination for BF-SVR.
Figure
Two-dimensional standard deviation and correlation statistic (i.e., Taylor diagram): (a) stand-alone DNN and SVR models; (b) GA-DNN input combinations; (c) GA-SVR input combinations; (d) BF-DNN input combinations; (e) BF-SVR input combinations.
Actual observation of EAC and the optimal combination for (a) GA-DNN and GA-SVR and (b) BF-DNN and BF-SVR.
To conclude the discussion, the main contribution of this work is the demonstrated robustness of the hybrid GA-DNN, which comprises two modeling phases: the GA, a nature-inspired evolutionary algorithm, performs the input feature selection, and the DNN serves as the predictive model. The proposed methodology produced very positive results, which can help construction project managers monitor the cost at completion throughout the progress of the project.
The methodology applied in the current research was motivated by the search for a new, reliable approach to modeling EAC in construction projects. The proposed model distinguished itself by its capability to capture the actual relationship between the related variables and the target variable more robustly. This is essential for practical implementation in construction project management. Overall, with the evolutionary optimization algorithm acting as the selection procedure, the predictive model (i.e., the deep neural network) attained convincing results from the perspective of scientific research and the exploration of innovative modeling strategies.
Based on the various statistical indicators, the best results showed outstanding performance with respect to the minimal absolute error measures and the best goodness of fit (RMSE and correlation value (
In this research, a new hybrid data-intelligence predictive model called GA-DNN is explored to provide construction managers with a reliable and robust methodology for controlling project cost and attaining an accurate estimate of the EAC. The methodology is intended as an automated system in which project activities can be monitored and controlled and adverse consequences can be avoided. The intelligent system comprises two phases: (i) the evolutionary phase, in which the genetic algorithm selects the influential input attributes for the prediction matrix, and (ii) the DNN prediction model, which uses the selected variables of each input combination to model the EAC. The BF input selection procedure is used as a benchmark for the GA optimizer, and SVR as a comparable prediction model. The results confirmed the superior predictability of the stand-alone DNN over the stand-alone SVR. In addition, hybridization with the nature-inspired input selection algorithm improved the prediction outcomes. For future research, this methodology can be implemented on other construction projects as a real-time application, where its contribution can be recognized in the form of a practical solution. Its advantage would be more reliable monitoring of the project life cycle according to the status of the project.
The authors are grateful to the construction companies for providing the data used in this research.
The authors declare that they have no conflicts of interest.