In recent years, cost index prediction for construction engineering projects has become an important research topic in the field of construction management. Previous methods are limited in their ability to reflect the timeliness of engineering cost indexes. The recurrent neural network (RNN) is a time series network that carries information forward in time by sharing weights across time steps. The long short-term memory neural network (LSTM NN) retains these advantages while overcoming the RNN’s vanishing gradient problem and its inability to capture long-term dependence. The present study proposes a new framework based on LSTM in order to explore the applicability and optimization mechanism of the algorithm in the field of cost index prediction. A survey was conducted in Shenzhen, China, where a total of 143 data samples were collected based on the index set for the time interval from May 2007 to March 2019. A prediction framework based on the LSTM model was trained and tested on the collected data for the purpose of cost index prediction. The testing results showed that the proposed LSTM framework had clear advantages in prediction because of its ability to process high-dimensional feature vectors and to selectively record historical information. Compared with other advanced cost prediction methods, such as the support vector machine (SVM), this framework can capture long-distance dependent information and can provide short-term predictions of engineering cost indexes both effectively and accurately. This research extends the algorithmic tools available for forecasting cost indexes and evaluates the optimization mechanism of the algorithm in order to improve the efficiency and accuracy of prediction, which has not been explored in existing research.
Due to unique industry characteristics, construction projects require a large amount of capital investment [
The research of engineering cost prediction has a long history. The initial prediction form must be based on a complete design drawing, which is unable to meet the requirements of the actual scenarios in terms of time and efficiency. With the development of information technology, cost prediction began to break away from the limitation of drawings, and gradually a solution emerged to establish an information model based on various algorithms. At present, the cost prediction methods available based on information technology can be divided into two branches: construction cost prediction and cost indexes prediction. The types of construction cost prediction mainly include the prediction of bidding price [
The methods of cost predictions can be divided into two categories: causal analysis and time series analysis [
Statistical methods are also known as black box methods or time series methods which can be divided into the two categories of univariate and multivariate time series analyses [
The machine learning methods most widely used in the field of engineering cost prediction are neural networks, support vector machines (SVM), and k-nearest neighbour (KNN) algorithms. Juszczyk and Leśniak proposed a model based on artificial neural networks (ANN) with radial basis functions (RBF) for forecasting site overhead cost indexes, and the prediction models achieved satisfactory results [
As discussed in previous literature review, traditional prediction methods have their own limitations in predicting the construction cost index. For example, the causal methods require many explanatory variables to be predicted and cannot reflect the uncertain price fluctuations [
This study poses two research questions: how LSTM NN can be applied to predict engineering cost indexes, and how various factors, including input features, time series length, and model structure, affect model performance.
Applying LSTM NN to the prediction of construction engineering cost indexes and exploring its optimization mechanism are motivated by two considerations. First, according to the literature review and theoretical analysis, the structure type, training cost, and calculation efficiency of LSTM NN are suitable for processing cost index data, yet the performance of LSTM NN in this field has not been explored in previous research. Second, feature selection has a major influence on the prediction accuracy of an LSTM neural network model, yet there is currently no standard criterion for selecting the parameters of such a model.
The aims of this research were to explore the various theories and methods available for LSTM neural networks in the accurate predictions of construction engineering cost indexes and to evaluate the proposed model’s prediction performances. The first objective of this research is to investigate the research gaps in the field of cost prediction and the limitations of the current forecasting methods. The corresponding research methods are literature review and theoretical analysis methods. The second research objective is to determine a set of indicators suitable for the prediction of China’s cost indexes. Through literature review and expert argumentation, the content of the indicator set can reach a certain level of comprehensiveness. The third research goal is to verify the applicability of the LSTM NN, for which the case analysis method can be used to objectively judge the predictive performance of the model. The final research goal is to explore the optimization mechanism of the LSTM model. Different input features, time series lengths, and model structures are set by comparative analysis to improve prediction accuracy.
The indicators selection criteria of this study benefit from previous research of Zhang [
Coding, type, and data sources of the index set.
Indicator name | Brief explanation | Data source |
---|---|---|
Gross domestic product (GDP) | GDP refers to the final result of the production activities of all resident units in a country (or region) calculated in accordance with the national market price in a certain period of time, and is often recognized as the more accurate indicator to measure the economic status of the country | National Bureau of Statistics of China |
Consumer price index (CPI) | CPI is a macroeconomic indicator that reflects changes in the price levels of consumer goods and service items generally purchased by households | |
Money supply (MS) | Money supply refers to the sum of cash and deposits in circulation at a certain point in time; money supply is one of the main economic statistical indicators compiled and published by the central banks of various countries | |
Floor space started (FSS) | Floor space started is the construction area of each house newly started in the report period; the commencement of the house shall be subject to the date when starting to break the ground and dig the trench (foundation treatment or permanent pile driving) | China entrepreneur Investment Club |
Crude oil price (COP) | Crude oil price is the first purchase price of a barrel of crude oil | |
Loan rate (LR) | Loan rate is the interest rate charged to borrowers when banks and other financial institutions issue loans; this refers to the annual interest rate of the Bank of China’s loans | |
Steel bar price (SBP), glass price (GLP), concrete price (CP), wood price (WP), cement price (CEP), block price (BP), medium sand price (MSP), diesel price (DP), gravel price (GRP), material cost index (MCI) | The information price is the publicly announced average social price determined by the government cost management department based on the amount of various typical engineering materials and social supply; it is generally updated once a month | Shenzhen housing and Construction Bureau, Glodon Company Limited |
Shenzhen engineering cost index | Shenzhen engineering cost index is an indicator that reflects the degree of impact of price changes on engineering costs over a certain period of time | |
The original data used for the training of forecasting model in this research were collected from several different data resources, including the CEIC database (
Because the raw data could contain noise, anomalous points, missing values, errors, and frequency differences, they were preprocessed before being used to train the model. A boxplot describes data using five statistics: minimum, first quartile, median, third quartile, and maximum [
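As an illustration of the five boxplot statistics, the following minimal sketch computes them for a single indicator series and flags candidate anomalous points. The 1.5 × IQR whisker rule, the function names, and the sample values are assumptions made for this example; the study does not state which rule or thresholds were used.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

def boxplot_outliers(values, whisker=1.5):
    """Flag points lying more than whisker * IQR outside the quartiles.

    The 1.5 * IQR rule is a common convention and is assumed here.
    """
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical monthly values for one indicator, with one suspicious entry.
series = [100.0, 102.0, 101.0, 103.0, 250.0, 99.0, 104.0, 98.0]
print(five_number_summary(series))
print(boxplot_outliers(series))
```

Points flagged in this way would still need manual review before being corrected or removed, since a genuine price shock can also fall outside the whiskers.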
Statistics of the primary selection index set.
Indicator name | Mean | Median | Maximum | Minimum | Std. Dev. |
---|---|---|---|---|---|
Shenzhen engineering cost index | 136.181 | 139.260 | 177.460 | 100.000 | 22.005 |
Material cost index | 112.432 | 113.110 | 134.290 | 99.520 | 7.528 |
Floor space started | 10300.070 | 10297.610 | 22178.700 | 2580.100 | 4547.512 |
Crude oil price | 572.126 | 540.860 | 948.800 | 231.870 | 188.525 |
Loan rate | 5.700 | 6.000 | 7.470 | 4.350 | 0.918 |
Consumer price index | 124.960 | 127.260 | 144.080 | 100.000 | 11.624 |
Money supply | 92.915 | 89.557 | 173.986 | 31.671 | 43.158 |
Steel bar price | 4225.671 | 4250.000 | 6050.000 | 2400.000 | 868.063 |
Concrete price | 385.115 | 360.720 | 540.920 | 314.780 | 57.445 |
Cement price | 491.685 | 495.000 | 630.000 | 49.000 | 68.999 |
Medium sand price | 97.531 | 95.000 | 146.000 | 54.000 | 34.170 |
Gravel price | 111.252 | 120.000 | 160.000 | 60.000 | 26.289 |
Glass price | 38.979 | 35.000 | 52.000 | 30.000 | 7.864 |
Wood price | 1169.238 | 1200.000 | 1499.000 | 1000.000 | 139.684 |
Block price | 279.203 | 290.000 | 290.000 | 233.000 | 14.054 |
Diesel price | 7.325 | 7.350 | 9.360 | 5.470 | 1.110 |
Gross domestic product | 14125.420 | 14035.170 | 25269.960 | 6015.410 | 5116.898 |
Based on standard machine learning theory and previous research [
Because of the particular nature of recurrent computation, the original data had to be segmented by time step, with the input at each time step corresponding to all of the index features at a single time point. In addition, because the LSTM is effective at combining information across time steps, the index features of multiple time points could be gradually extracted and synthesized as the time steps advanced, making the extracted feature vectors increasingly expressive of the input data. Therefore, the data were divided into blocks, with each block containing the input index features of multiple time steps.
The data used in this study were divided into blocks of five consecutive months, each of which was used to predict the cost index of the following sixth month. The model structure of the LSTM was organized by groups, as shown in Figure
LSTM model data input structure.
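The block structure described above can be sketched as a sliding window over the monthly records. This is a minimal illustration rather than the authors' preprocessing code: the function name `make_windows`, the placeholder data, and the choice to keep the target series separate from the input features are assumptions. With 143 months and a window of 5, the procedure yields 143 − 5 = 138 input blocks.

```python
def make_windows(features, targets, steps=5):
    """Pair each run of `steps` consecutive months with the next month's target.

    features: list of per-month feature vectors (one list of floats per month)
    targets:  list of per-month cost index values, same length as features
    """
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    inputs, labels = [], []
    for start in range(len(features) - steps):
        inputs.append(features[start:start + steps])  # months t .. t+steps-1
        labels.append(targets[start + steps])         # month t+steps
    return inputs, labels

# Placeholder data with the study's dimensions: 143 months, 16 indicators.
months, n_features = 143, 16
features = [[float(m)] * n_features for m in range(months)]
targets = [100.0 + m for m in range(months)]

inputs, labels = make_windows(features, targets, steps=5)
print(len(inputs), len(inputs[0]), len(inputs[0][0]))  # blocks, steps, features
```

Each block has shape (time steps, features), which is the layout a recurrent layer consumes one time step at a time.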
During the process of developing deep neural networks of LSTM, a method of multilayered LSTM was applied for the structural combinations, as shown in Figure
Training structure of the LSTM model.
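To make the recurrent computation concrete, the sketch below runs one block of five time steps through a single LSTM layer using the standard gate equations. It is a plain-Python illustration, not the MXNet implementation used in the study: the parameter layout, the uniform initialization range, and the hidden size of 50 are assumptions, and only one layer is shown although the study stacks several.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(x, h, W, U, b):
    """Compute W x + U h + b for list-based matrices and vectors."""
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]

def lstm_step(x, h, c, params):
    """One LSTM time step; returns the new hidden state and cell state."""
    i = [sigmoid(z) for z in affine(x, h, *params["input"])]        # input gate
    f = [sigmoid(z) for z in affine(x, h, *params["forget"])]       # forget gate
    o = [sigmoid(z) for z in affine(x, h, *params["output"])]       # output gate
    g = [math.tanh(z) for z in affine(x, h, *params["candidate"])]  # candidate state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def init_params(n_in, n_hidden, rng):
    """Random weights and zero biases for the four gates (assumed scheme)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        gate: (mat(n_hidden, n_in), mat(n_hidden, n_hidden), [0.0] * n_hidden)
        for gate in ("input", "forget", "output", "candidate")
    }

rng = random.Random(0)
n_in, n_hidden, steps = 16, 50, 5
params = init_params(n_in, n_hidden, rng)
h, c = [0.0] * n_hidden, [0.0] * n_hidden
window = [[rng.random() for _ in range(n_in)] for _ in range(steps)]
for x in window:  # the same weights are reused at every time step
    h, c = lstm_step(x, h, c, params)
print(len(h))
```

The forget gate `f` decides how much of the previous cell state is kept, which is the mechanism behind the selective recording of historical information; a final dense layer mapping `h` to a single value would produce the predicted index.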
MXNet is an open-source deep learning framework backed by Amazon. Currently, distributed machine learning platforms that support time series prediction based on LSTM include MXNet, PyTorch, and Caffe2. Compared with other deep learning frameworks, MXNet has the advantages of strong readability, ease of learning, high parallel efficiency, and memory saving [
The LSTM model training process is shown in Figure
Training process of the LSTM model.
The loss function L2 was calculated by comparing the predicted values calculated each month with the actual values placed in the output layer. During the training phase, L2 was taken as the optimization goal, and the L2 loss function was defined as follows:
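For reference, a common form of the L2 loss over a batch of n predictions is sketched here. The symbols are notation assumed for this sketch: the predicted index is written as \hat{y}_i, the actual index as y_i, and the batch size as n. The factor 1/2 follows the convention of MXNet Gluon's `L2Loss`; the exact definition used by the authors may differ.

```latex
L_2 = \frac{1}{2n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2
```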
The momentum method was used as the model’s optimization algorithm. By introducing an intermediate variable that accumulates past gradients, gradient components that alternate in sign along irrelevant directions cancel out. This overcomes the slow convergence, or even nonconvergence, that traditional gradient descent suffers when the gradient swings back and forth in non-descending directions. The parameter update formulas for each iteration of the momentum method were as follows:
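A minimal sketch of one widely used formulation of the momentum update is given below, assuming a velocity variable v, momentum coefficient 0.9, and an optional weight decay term; these names and values are assumptions for illustration and may not match the authors' exact formulas. The toy objective has very different curvature along its two axes, which is the situation in which plain gradient descent oscillates.

```python
def sgdm_step(params, grads, velocity, lr=0.1, momentum=0.9, weight_decay=0.0):
    """Apply one momentum update and return the new parameters and velocity.

    v <- momentum * v - lr * (g + weight_decay * w)
    w <- w + v
    """
    new_params, new_velocity = [], []
    for w, g, v in zip(params, grads, velocity):
        v = momentum * v - lr * (g + weight_decay * w)
        new_velocity.append(v)
        new_params.append(w + v)
    return new_params, new_velocity

# Toy objective f(w) = w0^2 + 10 * w1^2, whose gradient is (2*w0, 20*w1).
w, v = [5.0, 5.0], [0.0, 0.0]
for _ in range(300):
    grad = [2.0 * w[0], 20.0 * w[1]]
    w, v = sgdm_step(w, grad, v, lr=0.01, momentum=0.9)
print(w)
```

Because the velocity averages successive gradients, components that flip sign from step to step largely cancel, while components that point consistently downhill accumulate.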
After the gradient calculation for a single batch was completed, the network parameters were updated by the momentum optimizer. Stochastic gradient descent with momentum (SGDM) achieved faster parameter updates and improved the convergence of the model. Finally, the trained model parameters were saved for use during the prediction phase.
In addition, the detailed methods used in the training process included batch processing and flow training. With consideration given to the performance levels of the computers used in this study, the batch size of the training set was established as 5; the batch size of the test set was 1; and the learning rate was initialized to 0.1 after the in learning of 200 epochs was completed. Then, in order to avoid the model parameter values becoming too large, this method used a weight decay technique with a value of 5
After the development of the LSTM model, the decreases in the output for the loss function with the number of learning iterations could be used to determine the convergence and fitting effects of the model. As shown in Figure
Trend of the loss function of the LSTM training set.
The differences observed in the fitting effects between the predicted and real values are displayed in Figure
Prediction fitting of the LSTM model.
As can be seen in Figure
Prediction error trend of the LSTM model.
Prediction error and accuracy results of the LSTM model.
Model | Absolute maximum error | Absolute minimum error | MAE | MSE | MAPE (%) | Model accuracy (%) |
---|---|---|---|---|---|---|
LSTM | 2.03 | 0.7 | 0.96 | 1.03 | 0.71 | 99.29 |
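The error measures reported in the table can be computed as in the sketch below. In both result tables of this study the accuracy column equals 100 minus the MAPE (for example, 100 − 0.71 = 99.29), so that definition is assumed here; the function name and the sample values are hypothetical and are not taken from the study's test set.

```python
def evaluate(actual, predicted):
    """Return MAE, MSE, MAPE (%) and accuracy (%) for paired values."""
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    # Accuracy is assumed to be defined as 100 - MAPE.
    return {"MAE": mae, "MSE": mse, "MAPE": mape, "accuracy": 100.0 - mape}

# Hypothetical index values for four test months.
actual = [140.0, 142.0, 145.0, 150.0]
predicted = [141.0, 141.0, 146.5, 148.5]
metrics = evaluate(actual, predicted)
print(metrics)
```

MAE and MSE are expressed in index points (and squared index points), whereas MAPE and accuracy are percentages, which makes the latter two easier to compare across indicators with different scales.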
To benchmark the proposed model, this paper selects the SVM algorithm, a current advanced method, as the comparison object and trains it on the same dataset. The predicted results are shown in Figure
Prediction error trend of the SVM model.
Prediction error and accuracy results of the SVM model.
Model | Absolute maximum error | Absolute minimum error | MAE | MSE | MAPE (%) | Model accuracy (%) |
---|---|---|---|---|---|---|
SVM | 7.23 | 0.74 | 2.83 | 10.51 | 1.99 | 98.01 |
The comparison shows that LSTM has advantages in terms of both prediction accuracy and parameter adjustment. The accuracy of the SVM model is 98.01%, while that of the LSTM model is 99.29%, and the fitting effect of the LSTM model is better. The absolute error and mean square error of the LSTM model also fluctuate less than those of the SVM model. In addition, although the SVM model involves only two parameters, the penalty term “C” and the kernel function coefficient “gamma,” there is no universally accepted method for determining them. The conventional approach is to choose values from an empirically determined range and then gradually narrow the range by comparing the MSE after training until suitable parameters are found. LSTM involves more parameters, but generally only the input dimension, the output dimension, and the number of hidden layer neurons must be adjusted; the weights and thresholds are randomly initialized, and the parameters are then updated using SGDM. Taking these aspects together, the proposed prediction framework is shown to be competitive.
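The conventional range-narrowing approach for “C” and “gamma” can be sketched as a coarse-to-fine search. This is an illustration under stated assumptions, not the procedure used in the study: the log-spaced grid, the halving of the range each round, and all function names are hypothetical, and the `toy_score` function stands in for training an SVM and returning its validation MSE.

```python
import math

def log_grid(low, high, points):
    """Return `points` values spaced evenly on a log scale in [low, high]."""
    step = (math.log10(high) - math.log10(low)) / (points - 1)
    return [10 ** (math.log10(low) + k * step) for k in range(points)]

def shrink(bounds, center, factor=0.5):
    """Narrow a log-scale range to `factor` of its width around `center`."""
    low, high = bounds
    half = factor * (math.log10(high) - math.log10(low)) / 2
    c = math.log10(center)
    return 10 ** (c - half), 10 ** (c + half)

def refine_search(score, c_range, gamma_range, rounds=3, points=5):
    """Coarse-to-fine search for the (C, gamma) pair with the lowest score.

    score: callable (C, gamma) -> validation MSE, supplied by the caller.
    """
    best = None
    for _ in range(rounds):
        for c in log_grid(*c_range, points):
            for g in log_grid(*gamma_range, points):
                mse = score(c, g)
                if best is None or mse < best[0]:
                    best = (mse, c, g)
        # Narrow both ranges around the best point found so far.
        c_range = shrink(c_range, best[1])
        gamma_range = shrink(gamma_range, best[2])
    return best

def toy_score(c, gamma):
    """Placeholder for a real train-and-validate run; minimum at C=10, gamma=0.01."""
    return (math.log10(c) - 1.0) ** 2 + (math.log10(gamma) + 2.0) ** 2

best_mse, best_c, best_gamma = refine_search(toy_score, (0.01, 1000.0), (1e-4, 10.0))
print(best_mse, best_c, best_gamma)
```

Each call to `score` corresponds to one full training run, so the cost of this search grows with the number of rounds and grid points even though only two parameters are tuned.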
The proposed framework can be applied to forecasting the short-term or long-term trend of the macroeconomic situation, which has great influence on the cost and financial budget of a construction project. Practical scenarios include policy making by government departments, investment decision-making by real estate enterprises, checking the rationality of technical and economic indicators by design units, and dispute settlement between the client and the general contractor.
Take the issue of contract risk between the client and the general contractor as an example. In the bidding stage, the contracting company usually imposes harsh bidding conditions for the price adjustment of building materials, which often puts the construction units in a passive position. The proposed framework can reduce this risk for the construction party to a certain extent. The specific steps are as follows. Firstly, the construction units can quickly establish a training team within the validity period of the tender to collect the indicators of the current period and previous years and use them as the original training data. Secondly, the team members predict the monthly engineering cost indexes during the construction phase based on the proposed model and the construction period. Finally, they judge the rationality of the relevant requirements of the bidding documents according to the change in the cost indexes between the completion period and the current period. If the requirements are reasonable, the construction units participate in the bidding as normal. Otherwise, they can apply to negotiate with the general contractor or abandon the bid to minimize their own risks. In summary, the proposed framework has practical value in assisting bidding decisions.
The aforementioned research results showed that the proposed LSTM neural network model was suitable in prediction applications of construction engineering cost indexes. However, during the process of creating the LSTM model, it was found that there was no standard method available for sample selections, parameter settings, setup of time series lengths, and the designing of the model structure. Generally speaking, the setup of the model was in accordance with previous experience. However, it was accepted that the selection of the various samples and other model settings would potentially affect the prediction performance of the model. Therefore, it was necessary in this study to discuss the mechanisms and optimization of the parameter selections and model settings for the development process of the LSTM model.
There are many factors which may potentially affect the predictions of construction project cost indexes. These factors can mainly be divided into four categories: economic, energy, construction market, and all indicators.
In the present study, in accordance with the aforementioned four groups of indicators, the following four models were established, and a basic model of all the indicators was used as a comparison model in order to explore the impacts of the input features on the engineering cost indexes. The prediction results of the other three models were obtained by modifying the input sample dimensions of the base model. The mean square error and prediction accuracy of the model were then successfully calculated. The results are shown in Table
Effects of the input features on the model’s accuracy.
Model | Dimension | Category | Indicators | MSE | Model accuracy (%) |
---|---|---|---|---|---|
M1 | 4 | Economy | LR, CPI, MS, GDP | 3.92 | 98.71 |
M3 | 10 | Construction market | MCI, FSS, SBP, CP, CEP, MSP, GRP, GLP, WP, BP | 1.27 | 99.22 |
M12 | 6 | Economy + energy | LR, CPI, MS, GDP, COP, DP | 3.16 | 98.85 |
M123 | 16 | All | All | 1.03 | 99.29 |
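The four input configurations can be derived from one base dataset by selecting indicator columns, as in the sketch below. The grouping assumes that M12 is the union of the four economic indicators of M1 and the two energy indicators (COP and DP), which is consistent with its dimension of 6; the dictionary layout, function name, and placeholder data are hypothetical.

```python
GROUPS = {
    "economy": ["LR", "CPI", "MS", "GDP"],
    "energy": ["COP", "DP"],
    "construction_market": [
        "MCI", "FSS", "SBP", "CP", "CEP", "MSP", "GRP", "GLP", "WP", "BP",
    ],
}

MODELS = {
    "M1": ["economy"],
    "M3": ["construction_market"],
    "M12": ["economy", "energy"],  # assumed: M1 indicators plus energy
    "M123": ["economy", "energy", "construction_market"],
}

def select_features(rows, model):
    """Keep only the indicator columns that belong to the given model.

    rows: list of dicts mapping indicator code -> value, one dict per month.
    """
    codes = [code for group in MODELS[model] for code in GROUPS[group]]
    return [[row[code] for code in codes] for row in rows]

# Placeholder records: 143 months with all 16 indicators present.
all_codes = [code for group in GROUPS.values() for code in group]
rows = [{code: float(month) for code in all_codes} for month in range(143)]

dimensions = {name: len(select_features(rows, name)[0]) for name in MODELS}
print(dimensions)
```

The resulting dimensions (4, 10, 6, and 16) match the Dimension column of the table, so each variant differs from the base model only in the width of its input vectors.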
The absolute error values of the prediction results were calculated according to the prediction results of the 27 test sets. In order to compare the error values of the models and their stability, the absolute error values of the predictions of the four models were determined, as described in Figure
Predictions of the absolute error trends of the four models with different input features.
As previously illustrated in Table
This study then compared the four models in combination with Table
In summary, among the three types of indicators (economic, energy, and construction market), the construction market indicators had the most significant impact on the prediction of the engineering cost indexes and could be used as effective information for the proposed model. When the dimension of the input data was small, appropriately adding effective information could improve the prediction accuracy of the model. However, when the dimension of the input data was already large, adding further features did not greatly improve the prediction accuracy, and redundant input information could even reduce it. Therefore, it was determined in this study that the economic, energy, and construction market indicators should all be used as the input features for the proposed model, which would improve the prediction accuracy of the LSTM model. If data collection is difficult, the construction market indicators alone can be used as the input features.
The length of the time series may also affect the prediction accuracy of a model. The length of a time series is usually obtained from the analysis of specific problems, and there currently is no standard determination method. In the present study, 16 indicators were used as the input variables, and the data were processed into time series of lengths of 3, 5, 7, and 10, respectively. Then, the model was established and trained. The results are shown in Table
Effects of the time series lengths on the accuracy rates of the models.
Model | Time series length | Mean square error | Model accuracy (%) |
---|---|---|---|
M123-d3 | 3 | 2.27 | 98.95 |
M123-d5 | 5 | 1.03 | 99.29 |
M123-d7 | 7 | 1.10 | 99.22 |
M123-d10 | 10 | 0.88 | 99.33 |
Predictions accuracy and absolute error trends of the models with different time lengths.
As can be seen in Table
It was observed in this study that when the time series length was too short, the samples provided insufficient effective information and the proposed LSTM model could not learn the transformation rules of the training samples, which led to a low accuracy rate. However, the further the data are from the prediction period, the smaller their influence on that period, so the prediction accuracy of the model was not significantly improved when the time series length was increased. Moreover, the longer the time series was, the more noise it contained, which is not conducive to accurate prediction. In summary, the time series length has a certain influence on the prediction accuracy, but the training cost is more sensitive to its change: as the time series length increases, the training cost grows much faster than the prediction accuracy improves. Therefore, it is necessary to select an appropriate time series length to improve the application efficiency of the model. In the present study, the M123-d5 and M123-d10 models exhibited the highest prediction accuracy. The accuracy rates of the two models were similar, although the training duration of model M123-d10 was longer. Consequently, the time series length was set to 5 in this research in order to achieve better overall model performance.
For LSTM neural networks, the number of hidden layer neurons determines the structure of the neural network model. However, there is currently no unified method which can be applied to determine the number of neurons in a hidden layer. In this study, by comparing the prediction accuracy rates of the model under the conditions of various numbers of neurons in the hidden layer, the most suitable number of neurons was selected.
Therefore, with all 16 indicators used as the input variables of the model and the time series length set to 5, the number of hidden layer neurons was set to multiples of 10 from 10 to 150. The models were trained and the prediction results obtained, as shown in Table
Effects of the model’s structure on the accuracy results.
Model | Model structure | Mean absolute error | Mean square error | Accuracy (%) |
---|---|---|---|---|
M123-h10 | 16-10-1 | 1.11 | 1.37 | 99.18 |
M123-h20 | 16-20-1 | 1.07 | 1.28 | 99.21 |
M123-h30 | 16-30-1 | 1.07 | 1.23 | 99.21 |
M123-h40 | 16-40-1 | 0.85 | 0.89 | 99.34 |
M123-h50 | 16-50-1 | 0.96 | 1.02 | 99.29 |
M123-h60 | 16-60-1 | 1.02 | 1.14 | 99.24 |
M123-h70 | 16-70-1 | 1.05 | 1.26 | 99.21 |
M123-h80 | 16-80-1 | 1.06 | 1.22 | 99.21 |
M123-h90 | 16-90-1 | 1.01 | 1.15 | 99.27 |
M123-h100 | 16-100-1 | 1.25 | 1.67 | 99.11 |
M123-h110 | 16-110-1 | 1.05 | 1.16 | 99.24 |
M123-h120 | 16-120-1 | 1.14 | 1.41 | 99.17 |
M123-h130 | 16-130-1 | 1.16 | 1.45 | 99.15 |
M123-h140 | 16-140-1 | 1.09 | 1.29 | 99.20 |
M123-h150 | 16-150-1 | 1.13 | 1.48 | 99.18 |
It was observed in this study that when the number of hidden layer neurons increased from 10 to 40, the mean square error of the model’s predictions gradually decreased to its minimum value, and the prediction accuracy gradually increased to its maximum value. As the number of hidden layer neurons continued to increase, the prediction accuracy did not improve, and once the number reached approximately 100, the accuracy tended to fluctuate. It was determined in this study that too many or too few hidden layer neurons would potentially reduce the prediction accuracy of the model. If the number of hidden layer neurons was too small, the model underfitted and the prediction errors increased. If the number was too great, the prediction accuracy did not improve, and the model tended to become unstable and overfit.
In this article, we proposed a prediction model based on an LSTM neural network, which is suitable for short-term prediction of engineering cost indexes or of other cost data with temporal or spatial properties. The proposed model can be applied to the feasibility study stage or bidding stage of a project. It can provide accurate industry trends so that all engineering participants can evaluate the project risk comprehensively in advance, which helps in formulating response plans. This research contributes a newly emerging tool and a new AI algorithm to the traditional field of construction cost index prediction. Firstly, a review of previous research indicates that this emerging tool is applied in this area for the first time. Although LSTM NN has been used for prediction problems in other application areas, there has been a lack of exploratory research that trains the model on specific construction data and evaluates the forecasting results to establish its theoretical suitability for cost index prediction. Secondly, LSTM NN can overcome the limitations of existing methods in cost index prediction. Most traditional methods are not suitable for nonlinear fitting and respond poorly to the timeliness of the data, whereas LSTM NN addresses the vanishing gradient problem and the inability to capture long-term dependence.
Upon analysing the experimental results of the LSTM model, the following key findings are observed. (1) The sixteen prediction indicators reflect the domestic economic, energy, and market conditions comprehensively and in a timely manner, which meets the requirement of capturing the fluctuation trend of the engineering cost indexes. (2) The proposed LSTM model has a good fitting effect and a small prediction error, which demonstrates the ability of the algorithm to utilize long-distance dependent information in sequence data. (3) Through the optimization mechanism in three aspects, the experience of model creation is converted into principle standards, among which the optimization of the input features is the most critical. (4) Compared with other methods, the LSTM model possesses significant advantages in training cost, time series processing, and short-term prediction accuracy. This model can also be used to deal with similar time series, such as crowd or vehicle flow and stock prices. Generally speaking, this study confirmed that LSTM neural networks are applicable and effective for predicting construction cost indexes. The findings could provide guidance for subsequent researchers in selecting prediction algorithms and model parameters.
However, the proposed framework still has some limitations. First, the data required for the research were mainly taken from four domestic databases, and the authenticity of these historical data has not been verified. Second, the index determination criteria used in this article lack authority, and different countries or organizations may apply different criteria. Finally, due to the limited amount of statistical data available in China at present, this study only validated the short-term prediction performance of LSTM. Based on the above limitations, our future work will focus on improving the structural layer of LSTM in order to compensate for its disadvantages in long-term prediction.
The data used to support the findings of this study are available from the corresponding author upon request.
The authors declare that there are no conflicts of interest regarding the publication of this article.