The Improved Least Square Support Vector Machine Based on Wolf Pack Algorithm and Data Inconsistency Rate for Cost Prediction of Substation Projects

Long Yuan (Beijing) Wind Power Engineering & Consulting Co., Ltd., Beijing 100034, China
School of Management, Hebei GEO University, Shijiazhuang 050031, China
Strategy and Management Base of Mineral Resources in Hebei Province, Hebei GEO University, Shijiazhuang 050031, China
School of Economics and Management, North China Electric Power University, Beijing 102206, China
State Grid Zhejiang Electric Power Company, Hangzhou 310007, China


Introduction
Poor control over the cost of substation projects easily leads to high costs, which seriously affects the economy and sustainability of power engineering projects [1]. The forecasting of the cost level is an important part of the cost control of substation projects, and it also has important guiding significance for cost saving in substation projects. However, historical data contain numerous indicator attributes owing to factors such as the overall planning of the power grid, total capacity, topographical features, design and construction level, and the comprehensive economic level of the construction area. Simultaneously, the number of construction projects in the same period is limited, and it is impossible to collect many comparable engineering projects in a short time, which leads to scarce sample data and greater difficulty in the cost forecasting of substation projects [2]. Therefore, constructing a cost forecasting model that yields accurate cost forecasting results for substation projects is of great significance to the sustainability of power engineering investment.
At present, few scholars have studied the cost forecasting of substation projects, but many have studied the cost forecasting of other engineering projects. The forecasting methods fall into two categories: traditional forecasting methods and modern intelligent forecasting methods. The traditional forecasting methods mainly include time series prediction [3], regression analysis [4], Bayesian models [5], and fuzzy prediction [6]. A time series method for the cost forecasting of projects based on the bill-of-quantities pricing model was built in [3], which could control the forecasting error within 5%. In [4], a forecasting model based on the integral linear regression of multiple structures was established according to the features of building project cost, and the principal component factor method was introduced to solve the problem of multicollinearity among variables. In [5], a Bayesian project cost forecasting model was proposed that adaptively integrates preproject cost risk assessment and actual performance data into a range of possible project costs at a chosen confidence level. Gao et al. [6] utilized significant cost theory to determine the significant factors affecting the cost and adopted the fuzzy c-means clustering method to predict the project cost. The basic theory and verification approach of these methods are relatively mature, and the calculation process is also relatively simple. However, these methods are often suitable only for a single object, and the forecasting accuracy is not ideal. Therefore, against the background of the rapid development of artificial intelligence technology, the use of intelligent forecasting methods to predict the cost of substation projects is all the more significant. Intelligent forecasting methods mainly include artificial neural networks (ANNs) and support vector machines (SVMs) [7].
Artificial neural networks, such as the back propagation neural network (BPNN), the radial basis function neural network (RBFNN), and the general regression neural network (GRNN), are mainly used in the forecasting field [8]. A combination forecasting method for power grid engineering technical transformation projects was proposed in [9], based on Data Envelopment Analysis (DEA) and a genetic BP neural network model. The selected factors were used as the input variables of a 3-layer BP neural network, and the forecasting model was validated by training on actual engineering data. In [10], a cost forecasting model of engineering projects based on BPNN and RBFNN was constructed, in which the radial basis function (RBF) neural network was shown to have higher forecasting accuracy on the basis of training and testing with sample data from 55 engineering projects. In [11], a forecasting model for transmission line ice coating based on the generalized regression neural network and the fruit fly optimization algorithm was proposed, and good forecasting results were achieved. However, the above research shows that artificial neural networks suffer from a slow convergence rate and easily fall into local optima, which greatly reduces the forecasting accuracy. In [12], it was shown that the SVM model can avoid the neural network's structure selection and local minima problems.
Therefore, the SVM model has been widely used in research on project cost forecasting. In [13], the influencing factors of substation project cost were determined through a literature review and other methods, and a cost forecasting model of substation projects was established based on SVM theory. In [14], the parameters of the SVM forecasting model were optimized by an improved firefly algorithm, which achieved higher forecasting accuracy. Compared with neural networks, better forecasting performance for substation project cost can be obtained by applying the SVM model. Nevertheless, the research above also reveals some shortcomings of the traditional support vector machine. When the traditional support vector machine is used for cost forecasting, the solution process needs to be converted into a quadratic programming process through the kernel function, which reduces efficiency, and the convergence precision is not high. The least squares support vector machine (LSSVM), however, avoids the quadratic programming process by using a least squares linear system as the loss function. Meanwhile, the forecasting problem is transformed into a set of linear equations, and the inequality constraints are transformed into equality constraints by means of the kernel function, which increases the accuracy and speed of the forecasting [15, 16]. Therefore, some scholars have tried to use LSSVM for the cost forecasting of engineering projects [17, 18]. In [17], the variable importance in projection (VIP) was used to extract the features of the multiple factors that affect project cost, and the LSSVM was used as a nonlinear approximator to establish the cost forecasting model of engineering projects. Liu [18] proposed a project cost forecasting model based on chaos theory and the least squares support vector machine, aimed at the time-varying and chaotic nature of cost changes.
Although LSSVM shows better forecasting performance than SVM in cost forecasting, there is still the problem of blind selection of the penalty parameter and the kernel parameter [19]. Therefore, it is necessary to select an appropriate intelligent algorithm to optimize the parameters. Common intelligent algorithms mainly include the genetic algorithm [20], the ant colony algorithm [21], the fruit fly optimization algorithm [22], and the particle swarm algorithm [23]. Although these algorithms all have their own advantages, they also have corresponding flaws.
The genetic algorithm suffers from premature convergence, complicated calculation, small processing scale, difficulty in handling nonlinear constraints, poor stability, and so on. The ant colony algorithm and the firefly algorithm cannot guarantee convergence to the best point and easily fall into a local optimum, leading to a decrease in forecasting accuracy. And the particle swarm algorithm cannot fully satisfy the parameter optimization of the LSSVM due to its insufficient local search accuracy. Based on the above analysis, the wolf pack algorithm (WPA) was applied to optimize the parameters of the LSSVM in this paper. The performance of WPA is not affected by small changes in its parameters, and parameter selection is relatively easy, so it has good global convergence and computational robustness, making it especially suitable for solving high-dimensional, multipeak complex functions [24].
Moreover, the number of influencing factors of substation project cost is very large. If all the influencing factors were used as input indicators of the forecasting model, a large amount of redundant data would appear, so feature selection is also important. Feature selection refers to the identification and selection of appropriate input vectors for the forecasting model to reduce redundant data and improve computational efficiency [25]. In the DIR model, the feature set is divided into several subsets, and the minimum inconsistency rate is calculated over the feature subsets to determine the optimal feature subset and complete the feature selection [26]. Both [27] and [28] adopted the DIR model for feature selection and obtained effective forecasting results. Using the DIR model for feature selection can eliminate redundant features based on the inconsistency of the data set while also taking the correlation between features into account. The selected optimal features can represent all the data information well because the relationships between features are not ignored. Therefore, the data inconsistency rate (DIR) was chosen for the feature selection in this paper.
According to the above research, an ILSSVM model integrating DIR with WPA is proposed. It is the first time these three models have been combined in cost forecasting, and several comparison methods are utilized to validate the effectiveness of the proposed hybrid model. The paper is organized as follows: Section 2 introduces the implementation process of DIR and of the ILSSVM optimized by WPA. Section 3 provides a case study to validate the proposed model. Section 4 presents the conclusions of this paper.
There are three innovations in this paper: (1) DIR is applied to the selection of the influencing factors for substation project cost prediction; (2) the LSSVM model is improved by applying the wavelet kernel function to replace the traditional radial basis kernel function; (3) the LSSVM parameters are optimized by WPA.

Wolf Pack Algorithm.
Wolf pack algorithm (WPA) is derived from the study of wolf hunting behaviors. During the hunting process, different divisions of labor are assigned to wolves according to their respective roles. Wolves in the pack are divided into three types: the leader wolf, the scout wolf, and the ferocious wolf, and these three types work together to complete the hunting process. The wandering behavior, summoning behavior, and siege behavior are the three main behaviors of WPA, which mimic the behavioral features of wolves during the hunting process. Simultaneously, the generation of the leader wolf and the replacement of the wolf pack follow the rules of "survival of the fittest" and "winner-take-all," respectively [29]. The optimal solution is obtained through continuous iterations of these bionic behaviors. The bionic model of the wolf pack algorithm is shown in Figure 1. The principle and steps of the wolf pack algorithm are as follows: (1) Initialization of the Wolf Pack. Suppose that there are N artificial wolves in the D-dimensional space; the location of the ith wolf is X_i = (x_i1, x_i2, ..., x_iD). The initial position is generated by x_id = x_min + rand × (x_max − x_min), where rand is a uniformly distributed random number in [0, 1] and x_max and x_min are the upper and lower limits of the search space, respectively. (2) Generation of the Leader Wolf. The wolf at the position with the optimal objective function value is chosen as the leader wolf. The leader wolf neither updates its position during the hunting process nor participates in hunting activities but iterates directly into the next generation. (3) Close to the Prey. The location update of the wolf pack is driven by the summoning behavior of the leader wolf: each wolf obtains its new location based on the leader's summons.
The updated position of wolf i at the dth dimension is x_id^(t+1) = x_id^t + step_b · (x_ld^t − x_id^t)/|x_ld^t − x_id^t|, where step_a is the length of the wolf's stride in the search (wandering) process, step_b is the length of the stride when the wolf moves toward the target, x_id is the current position of wolf i at the dth dimension, and x_ld is the position of the leader wolf at the dth dimension. (4) Encirclement of the Prey. After discovering the prey, the surrounding wolves complete the reclamation of the target prey according to the signals issued by the leader wolf. The reclamation update is X_i^(t+1) = X_i^t + λ · ra · |X_l − X_i^t|, where t is the number of iterations, λ is a random number uniformly distributed in [−1, 1], ra is the length of the wolf's stride during the reclamation, X_l is the position of the leader that issued the signals, and X_i^t is the position of wolf i in the tth iteration. (5) Update Mechanism of Wolf Competition. Wolves without food are eliminated in the process of reclamation hunting; the wolves that survive are those that obtain food first. The weak wolves without food are eliminated, while an equal number of new, randomly generated wolves are added to the wolf pack.
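The steps above can be sketched in compact form. This is a minimal illustration rather than the exact update rules of [29]: the simplified summoning/siege moves, the stride constants, and the sphere test function are all assumptions for demonstration.

```python
import numpy as np

def wpa_minimize(objective, dim, n_wolves=20, n_iter=100,
                 x_min=-5.0, x_max=5.0, step_b=0.8, ra=0.15, seed=0):
    rng = np.random.default_rng(seed)
    # (1) Initialization: x = x_min + rand * (x_max - x_min)
    wolves = x_min + rng.random((n_wolves, dim)) * (x_max - x_min)
    for _ in range(n_iter):
        scores = np.apply_along_axis(objective, 1, wolves)
        leader_idx = int(np.argmin(scores))
        leader = wolves[leader_idx].copy()  # (2) the leader wolf
        for i in range(n_wolves):
            if i == leader_idx:
                continue  # the leader iterates directly without moving
            # (3) summoning: move toward the leader's position
            wolves[i] += step_b * (leader - wolves[i])
            # (4) siege: a small random stride around the prey
            wolves[i] += ra * rng.uniform(-1.0, 1.0, dim)
            np.clip(wolves[i], x_min, x_max, out=wolves[i])
        # (5) survival of the fittest: respawn the worst wolf at random
        scores = np.apply_along_axis(objective, 1, wolves)
        worst = int(np.argmax(scores))
        if worst != leader_idx:
            wolves[worst] = x_min + rng.random(dim) * (x_max - x_min)
    scores = np.apply_along_axis(objective, 1, wolves)
    return wolves[int(np.argmin(scores))], float(scores.min())

# Toy usage on the sphere function; in the paper the objective would be
# the fitness computed from the ILSSVM training accuracy.
best_x, best_f = wpa_minimize(lambda x: float(np.sum(x ** 2)), dim=2)
```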

Data Inconsistency Rate (DIR).
The feature selection over the extensive historical cost data of substation projects aims at distinguishing the most relevant data features, which makes the input vector of the power project cost forecasting model strongly targeted and reduces the redundancy of the input information, thereby improving the accuracy of the cost forecasting of substation projects. The inconsistency rate of data can accurately describe the discrete characteristics of the input features. Different feature characteristics can be obtained under different division modes, and different division modes yield different frequency distributions. The ability to discriminate data categories can be measured by calculating the inconsistency rate: the lower the inconsistency rate of the data, the stronger the classification ability of the feature vectors.
It is necessary to know the specific calculation formula of the inconsistency rate in order to select features by the inconsistency rate method. Therefore, it is assumed that the collected cost data of substation projects have g features, such as main transformer capacity, floor space, and main transformer unit price, which are represented by G1, G2, ..., Gg, respectively. Γ is the feature set and L is a subset of the feature set. Next, set the standard class M with c categories and N data instances. The feature value corresponding to feature F_i is represented as z_ji, and λ_j is the class value of M; then the jth data instance can be represented as (z_j1, z_j2, ..., z_jg, λ_j). The inconsistency rate is calculated as τ = (1/N) Σ_{k=1}^{p} (Σ_l f_kl − max_l f_kl), where f_kl is the number of data instances of class l that fall into the feature partition mode x_k. The data set has a total of p feature partition modes (k = 1, 2, ..., p and p ≤ N). The feature selection on the basis of the inconsistency rate proceeds as follows: (1) Initialize the optimal feature subset Γ to an empty set.
(2) Calculate the inconsistency rate of the data sets G1, G2, ..., Gg under the subset mode that consists of the subset Γ together with each remaining feature.
(3) Tabulate the inconsistency rates of the feature subsets and rank them from small to large.
(4) Select the feature subset L with the smallest number of features. If τ_L ≈ τ_Γ or τ_L′/τ_L is the minimum among all adjacent inconsistency rates, then the optimal feature subset is L, where L′ is the feature subset adjacent to L in the ranking.
By calculating the inconsistency rate, redundant features can be eliminated based on the inconsistency of the data set on the one hand; on the other hand, the correlation between features is also taken into account in the selection process, so the selected optimal features give a better representation of all the data information.
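As a concrete illustration of the inconsistency rate, the following sketch groups instances by their value pattern on a chosen feature subset and counts the instances that disagree with the majority class of their group. The toy rows and labels are invented for demonstration, and feature values are assumed to be already discretized.

```python
from collections import Counter, defaultdict

def inconsistency_rate(rows, labels, feature_idx):
    # Group instances by their value pattern on the chosen feature subset;
    # each distinct pattern is one partition mode x_k.
    groups = defaultdict(list)
    for row, y in zip(rows, labels):
        groups[tuple(row[i] for i in feature_idx)].append(y)
    # f_kl is the class count inside mode x_k; instances outside the
    # majority class of their mode are the inconsistent ones.
    inconsistent = sum(len(ys) - max(Counter(ys).values())
                       for ys in groups.values())
    return inconsistent / len(rows)

rows = [(1, 0), (1, 0), (1, 1), (0, 1)]       # discretized feature values
labels = ["low", "high", "high", "low"]       # class values of M
rate = inconsistency_rate(rows, labels, [0])  # subset containing only G1
```

Here the pattern (1,) covers three instances with classes {low, high, high}, so one instance is inconsistent and the rate is 1/4; the selection loop would keep the candidate feature whose addition minimizes this rate.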

Improved Least Squares Support Vector Machine (LSSVM)
2.3.1. Least Squares Support Vector Machine. The least squares support vector machine (LSSVM) is an extension of the support vector machine (SVM). It constructs the optimal decision surface by projecting the input vector nonlinearly into a high-dimensional space. Next, the inequality constraints of the SVM are converted into a set of equations by applying the risk minimization principle, which reduces complexity and speeds up the calculation [30].
Suppose T = {(x_i, y_i) | i = 1, 2, ..., N} is the given sample set, where N is the total number of samples. The regression model of the sample is y(x) = w^T φ(x) + b, where φ(·) is the function that projects the training samples into a high-dimensional space, w is the weight vector, and b is the offset parameter. The optimization problem of the LSSVM can be converted into the following function to solve [31]: min (1/2)‖w‖² + (c/2) Σ_{i=1}^{N} ξ_i², subject to y_i = w^T φ(x_i) + b + ξ_i, where c is the penalty coefficient, which balances the complexity and accuracy of the model, and ξ_i is the estimation error. These equations can be solved by converting them into a Lagrangian function, as detailed in Section 2.3.3.

Improved Method of LSSVM

(1) Horizontal Weighting of Input Vectors.
The cost forecasting of substation projects is mostly a multi-input and single-output model. The values of the input vectors are horizontally distributed with the item serial number, and the influence of the actual values of the cost influencing factors on the final forecasting value can be reflected by means of weighted processing. Therefore, the input vectors are weighted as follows, where x_i is the weighted input vector, x_ki is the original input vector, k is the dimension number of the input vector, and δ is a constant.
(2) Longitudinal Weighting of the Training Sample Sets. The cost forecasting value is related not only to the elements of the input vectors but also to the sample groups: a close (recent) sample has a greater influence on the forecasting value, and a long-range sample has less influence. Therefore, it is necessary to enhance the influence of close samples on the forecasting model by assigning different degrees of membership to the influencing factors of the current substation project cost, and to reduce the influence of the long-range samples at the same time. The linear membership degree μ_i is used to calculate the assigned degree of membership [32], where μ_i is the degree of membership, β is a constant in [0, 1], and i = 1, 2, ..., N. Then the input sample set becomes T = {(x_1, y_1, μ_1), (x_2, y_2, μ_2), ..., (x_N, y_N, μ_N)}. The determination of β directly affects the fitting performance of the LSSVM [33]. The value of β can be obtained through the gray correlation coefficient. In this paper, x_0 = Y, Y = (y_1, y_2, ..., y_N), since the forecasting model for the cost prediction of substation projects is usually a multi-input and single-output model.
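Since the exact membership formula is elided in the text, the sketch below assumes one common linear form, μ_i = β + (1 − β)(i − 1)/(N − 1), which gives the most recent sample μ = 1 and the oldest sample μ = β; in practice, β would come from the gray correlation coefficient as described above.

```python
import numpy as np

def linear_membership(n_samples, beta=0.5):
    # mu_i = beta + (1 - beta) * (i - 1) / (N - 1), i = 1..N, so later
    # (closer) samples receive larger membership degrees, as intended by
    # the longitudinal weighting; the exact formula is an assumption.
    i = np.arange(1, n_samples + 1)
    return beta + (1.0 - beta) * (i - 1) / (n_samples - 1)

# The weighted training set then consists of triples (x_i, y_i, mu_i).
mu = linear_membership(5, beta=0.4)
```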

Weighted Least Squares Support Vector Machine.
The improvement in the above section is applied to the LSSVM to obtain the weighted least squares support vector machine. The objective function becomes min (1/2)‖w‖² + (c/2) Σ_{i=1}^{N} μ_i ξ_i², subject to y_i = w^T φ(x_i) + b + ξ_i. To solve this problem, the Lagrangian function is established, where α_i is the Lagrange multiplier. The partial derivatives of the function with respect to its variables are set equal to zero, and the resulting equations are converted by eliminating w and ξ_i, which yields y(x) = Σ_{i=1}^{N} α_i K(x_i, x) + b, where K(x_i, x) is the kernel function. The wavelet kernel function is selected to replace the Gaussian kernel function of the standard least squares support vector machine; the construction of the wavelet kernel function is detailed in the next section. Bringing the wavelet kernel function into y(x) gives the final regression model of the weighted least squares support vector machine. In this paper, the wavelet kernel function was used to replace the traditional radial basis kernel function mainly based on the following considerations. (a) The wavelet kernel function has the excellent property of describing the data information step by step, and an LSSVM that uses the wavelet kernel function can approximate any function with high precision, whereas the traditional Gaussian kernel is comparatively less effective. (b) The wavelet kernel function is orthogonal or approximately orthogonal, while the traditional Gaussian kernel function is correlated or even redundant. (c) The wavelet kernel function supports multiresolution analysis of wavelet signals. Therefore, the nonlinear processing ability of the wavelet kernel function is better than that of the Gaussian kernel function, which improves the generalization ability and robustness of the LSSVM regression model.
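A minimal sketch of the weighted LSSVM training step: in the standard LSSVM dual, the problem reduces to a single linear system in (b, α), with the membership degrees μ_i scaling the regularization term. A Gaussian kernel is used here purely as a stand-in for the paper's wavelet kernel, and the sine toy data are an assumption for demonstration.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    # Stand-in kernel; the paper's model would use the wavelet kernel.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def train_weighted_lssvm(X, y, mu, c=10.0, sigma=1.0):
    # Dual system: [[0, 1^T], [1, K + diag(1/(c*mu_i))]] [b; alpha] = [0; y]
    n = len(y)
    K = gaussian_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.diag(1.0 / (c * mu))
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    b, alpha = sol[0], sol[1:]
    # Regression model: y(x) = sum_i alpha_i K(x_i, x) + b
    return lambda Z: gaussian_kernel(Z, X, sigma) @ alpha + b

X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])
predict = train_weighted_lssvm(X, y, mu=np.ones(20), c=100.0, sigma=0.2)
```

Setting all μ_i = 1 recovers the unweighted LSSVM; unequal membership degrees loosen the fit on down-weighted (older) samples.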

Construction of the Wavelet Kernel Function.
The kernel function of the LSSVM is the inner product of two input-space data points in a feature space, and it has two obvious properties. Firstly, k(x, x′) = k(x′, x): the inner product kernel is a symmetric function of its variables. Secondly, the sum of the kernel functions on the same plane is a constant. In short, only when the kernel function satisfies the following two theorems can it serve as the kernel function of the least squares support vector machine [34, 35].
(1) Mercer's Theorem. k(x, x′) is a continuous symmetric kernel that can be expanded into the series k(x, x′) = Σ_{i} λ_i g_i(x) g_i(x′), where λ_i is a positive value. The following necessary and sufficient condition must be met to ensure the complete convergence of the above expansion: ∬ k(x, x′) g(x) g(x′) dx dx′ ≥ 0 for all functions g(·) satisfying ∫ g(x)² dx < ∞. Here, g_i(x) is the feature (eigen)function and λ_i is the eigenvalue, all of which are positive. Therefore, the kernel function k(x, x′) is a positive definite function.
(2) Smola and Schölkopf Theorem. When the kernel function satisfies Mercer's theorem, k(x, x′) can be used as the kernel of the least squares support vector machine once it is also proved to satisfy equation (22):

(3) Construction of the Wavelet Kernel Function. The wavelet mother function ψ(x) must satisfy the admissibility condition C_ψ = ∫ |ψ̂(w)|²/|w| dw < ∞, where ψ̂(w) is the Fourier transform of ψ(x). Then the translated and dilated wavelet can be defined as follows [36]: ψ_{σ,m}(x) = |σ|^{−1/2} ψ((x − m)/σ), where σ is the shrinkage (dilation) coefficient, m is the horizontal float (translation) coefficient, σ > 0, and m ∈ R. When f(x) satisfies f(x) ∈ L²(R), f(x) can be wavelet transformed as W(σ, m) = ∫ f(x) ψ*_{σ,m}(x) dx, where ψ*(x) is the complex conjugate function of ψ(x). The wavelet transform W(σ, m) is invertible and can be used to reconstruct the original signal. The wavelet decomposition theory provides an infinite approximation to a set of functions based on linear combinations of wavelet functions. Suppose that ψ(x) is a one-dimensional mother wavelet; then, according to the tensor product theory, the multidimensional wavelet function can be written as ψ(x) = ∏_{d=1}^{N} ψ(x_d). In this way, the horizontally floating kernel function is constructed as k(x, x′) = ∏_{d=1}^{N} ψ((x_d − x′_d)/σ). The kernel function of the least squares support vector machine needs to satisfy the Fourier transform condition; therefore, the wavelet kernel function can be used as the kernel of the LSSVM once it is proved that F[k](w) ≥ 0. In this paper, the Morlet mother wavelet ψ(x) = cos(1.75x) exp(−x²/2) is chosen to prove this condition, which ensures the generalization ability of the wavelet kernel function. Accordingly, k(x, x′) can be expressed as k(x, x′) = ∏_{d=1}^{N} [cos(1.75 (x_d − x′_d)/σ) exp(−(x_d − x′_d)²/(2σ²))], where σ can be obtained through sample fitting, x ∈ R^N, and σ, x_i ∈ R^N. It can be seen from this equation that the multidimensional wavelet function can be used as a kernel function of the multidimensional least squares support vector machine. Carrying out the corresponding Fourier transform yields F(x)(w) ≥ 0 whenever |σ| ≠ 0. It can therefore be concluded that the wavelet kernel function can be used as the kernel function of the least squares support vector machine.
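The Morlet-based wavelet kernel can be written directly as a product over dimensions. A short sketch (the kernel form follows the construction above; the sample points are invented):

```python
import numpy as np

def morlet_wavelet_kernel(x, z, sigma=1.0):
    # k(x, x') = prod_d cos(1.75 (x_d - x'_d)/sigma)
    #                   * exp(-(x_d - x'_d)^2 / (2 sigma^2))
    d = (np.asarray(x, float) - np.asarray(z, float)) / sigma
    return float(np.prod(np.cos(1.75 * d) * np.exp(-(d ** 2) / 2.0)))
```

By construction the kernel is symmetric and equals 1 at x = x′, since it depends only on the per-dimension differences.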

DIR-ILSSVM Optimized by the WPA Algorithm
The cost forecasting model of substation projects combining WPA, DIR, and ILSSVM is constructed as illustrated in Figure 2. As shown in the figure, the cost forecasting model of substation projects proposed in this paper mainly includes three sections. The first section is the feature selection based on the inconsistency rate. The second section is the sample training based on the ILSSVM model. The third section is the cost forecasting based on the ILSSVM model. When the established feature subset L cannot satisfy the stopping conditions of the algorithm, the program continues to loop until it reaches the desired precision and outputs an optimal feature subset. Therefore, in the cost forecasting model of substation projects constructed in this paper, the purpose of the first section is to find the optimal feature subset and the optimal regression model parameters by iterative calculation. The purpose of the second section is to calculate the forecasting accuracy of the training sample in each iteration, so that the fitness function value can be calculated. The third section makes use of the optimal feature subsets and parameters obtained in the above two sections; finally, the substation project cost of the test sample can be forecasted by retraining the ILSSVM regression model. The specific steps for the cost forecasting of substation projects are listed as follows: (1) Determine the Initial Candidate Feature Values.
By combing through the related references [37–39], the candidate features of the influencing factors of substation project cost selected in this paper are the floor space, construction properties, substation voltage level, main transformer capacity, number of outlets on the high-voltage side, number of outlets on the low-voltage side, topography, duration, substation type, number of transformers, economic development level of the construction area, inflation rate, transformer single unit price, high-side circuit breaker unit price, high-side circuit breaker unit, low-voltage capacitor quantity, high-voltage fuse price, current transformer price, power capacitor price, reactor price, power bus price, arrester price, measuring instrument price, relay protection device price, signal system price, automatic device price, site leveling fee, foundation treatment fee, designer's skill level, number of accidents, deviation rate of project volume, construction progress level, and days of rain and snow. In the DIR algorithm, the optimal feature subset needs to be initialized to an empty set, namely, Γ = ∅.
(2) Initialize the Parameters of WPA. Set the total number of wolves N = 50, the number of iterations t = 100, step_a = 1.5, step_b = 0.8, q = 6, and h = 5. (3) Calculate the Inconsistency Rate. After steps (1) and (2) are completed, the candidate features are gradually substituted into the DIR feature selection model. Calculate the inconsistency rate of the data sets G1, G2, ..., Gg under the subset mode that consists of the subset Γ together with each remaining feature. The feature Gi corresponding to the minimum inconsistency rate is selected as the optimal feature, and Γ = Γ ∪ {Gi} is the updated optimal feature set. (4) Determine the Optimal Feature Subset and the Parameters of the Optimal Regression Model. The current feature subset is substituted into the ILSSVM model to calculate the forecasting accuracy r(j) in the learning process of the current training cycle, and the fitness function value Fitness(j) of each cycle is obtained. The optimal feature subset can be determined by comparing the fitness function values between generations. Determine whether each iteration reaches the stopping condition of the algorithm; if not, a new feature subset is reinitialized and a new cycle is entered until the globally optimal feature subset is obtained. It is important to note that the parameters of the ILSSVM model also need to be optimized, and the initial values of c and σ are allocated randomly. The fitness function is constructed from the double factors of forecasting accuracy and feature selection quantity, where Numfeature(x_i) is the number of features selected in each iteration and a and b are constants in [0, 1]. In each iteration, the number of selected features is proportional to the fitness function value, while the cost forecasting accuracy of substation projects is inversely proportional to the fitness function value.
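The exact fitness expression is not reproduced in the text. The sketch below assumes one common double-factor form consistent with the description: a larger selected-feature count raises the fitness value and a higher training accuracy lowers it, so that WPA's minimization favors small, accurate feature subsets. The constants a and b and the functional form itself are assumptions.

```python
def fitness(num_selected, total_features, accuracy, a=0.5, b=1.0):
    # Assumed double-factor fitness: proportional to the share of selected
    # features, inversely related to the training accuracy r(j); minimizing
    # it trades off subset size against forecasting accuracy.
    return a * num_selected / total_features - b * accuracy
```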

(5) Stop Optimization and Start the Cost Forecasting.
The circulation ends at the maximum number of iterations. At this point, the optimal feature subset and the best values of c and σ can be substituted into the ILSSVM model for the cost forecasting of substation projects.

Collection and Processing of Data.
The relevant cost data of substation projects from 2015 to 2017 in different regions were collected in this paper, covering 88 substation projects at various voltage levels. The cost levels and the influencing factors of the first 66 substation projects are used as the training set, and the last 22 data sets are used as the test set. The collected original cost data of the substation projects are shown in Table 1.
The processing of the input indicators is explained as follows. The data on the construction properties of substation projects are divided into three categories: a new substation is assigned 1, a main transformer extension is assigned 2, and an interval extension project is assigned 3. The data on the substation types are divided into three categories: the indoor type is assigned 1, the half-indoor type is 2, and the outdoor type is 3. The topography and landform are divided into the following eight situations: hillock is assigned 1, slope is 2, plain is 3, flat ground is 4, paddy is 5, dry land is 6, mountain is 7, and muddy land is 8. The level of economic development in the construction area is based on the data of the local GNP, and the technical level of the designers is based on the proportion of employees with a bachelor's degree or above in the project. The difference between the actual construction progress and the scheduled progress plan is selected to represent the construction progress level. And the data are normalized according to formula (34): y_i = (x_i − x_min)/(x_max − x_min), where x_i is the actual value, x_min and x_max are the minimum and maximum values of the sample data, and y_i is the normalized value.
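The min–max normalization described above can be expressed directly as a one-line sketch:

```python
def min_max_normalize(x, x_min, x_max):
    # y_i = (x_i - x_min) / (x_max - x_min), mapping each sample into [0, 1]
    return (x - x_min) / (x_max - x_min)
```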

Evaluation Indicators of the Forecasting Results.
The evaluation indicators adopted in this paper for the cost forecasting results of substation projects are as follows.
(1) Relative error (RE): RE = (x̂ − x)/x × 100%. (2) Root-mean-square error (RMSE): RMSE = sqrt((1/n) Σ_{i=1}^{n} (x̂_i − x_i)²). (3) Mean absolute percentage error (MAPE): MAPE = (1/n) Σ_{i=1}^{n} |(x̂_i − x_i)/x_i| × 100%. (4) Average absolute error (AAE): AAE = (1/n) Σ_{i=1}^{n} |x̂_i − x_i| / ((1/n) Σ_{i=1}^{n} x_i). In equations (35)–(38), x is the actual value of the substation project cost, x̂ is the forecasting value, and n is the number of data groups. The smaller the values of the above indicators, the higher the accuracy of the cost forecasting.
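The four indicators can be implemented directly; the functions below follow the standard definitions of RE, RMSE, MAPE, and AAE (the equation bodies are not fully reproduced in the text, so these are the usual forms):

```python
import numpy as np

def relative_error(actual, pred):
    # RE, in percent
    return (pred - actual) / actual * 100.0

def rmse(actual, pred):
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((p - a) ** 2)))

def mape(actual, pred):
    # mean absolute percentage error, in percent
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.mean(np.abs((p - a) / a)) * 100.0)

def aae(actual, pred):
    # mean absolute error scaled by the mean of the actual values
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.mean(np.abs(p - a)) / np.mean(a))
```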

Feature Selection.
The main content of this section is the selection of the optimal feature subset based on the DIR model, which helps determine the input indicators of the forecasting model. In this paper, Matlab R2014b is used to carry out the programming computation. The test platform environment is based on an Intel Core i5-6300U, 4 GB of memory, and the Windows 10 Pro edition system. The iterative process of extracting the training sample features based on the WPA-DIR-ILSSVM model is shown in Figure 3. The accuracy curve in the figure describes the forecasting accuracy of the ILSSVM for the training samples in different iterations. The fitness curve describes the fitness function value calculated during each iteration. The selected number is the optimal number of features calculated by the DIR model in the process of convergence, and the reduced number of features refers to the number of features eliminated by the WPA algorithm during the convergence process.
From the figure, it can be seen that the WPA algorithm converges at 39 iterations, where the optimal fitness function value is −0.91. The forecasting accuracy on the training sample is 98.9% at the 39th iteration, which shows that the fitting ability of ILSSVM is enhanced through the learning and training of the algorithm and that the forecasting accuracy on the training sample is at its highest. Meanwhile, the number of selected features also stabilizes at the 39th iteration. The algorithm eliminates 26 redundant features from the 33 candidate features; the finally selected features are construction properties, substation voltage level, main transformer capacity, substation type, number of transformers, main transformer unit price, and floor space.
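The paper does not reproduce the fitness function that drives this wrapper-style selection; a common choice, sketched here purely as an assumption, rewards training accuracy while penalizing the number of retained features (the weight `alpha` and the linear subset penalty are hypothetical, not taken from the paper):

```python
def wpa_dir_fitness(train_accuracy, n_selected, n_total, alpha=0.9):
    """Hypothetical fitness for wrapper feature selection, to be MINIMIZED by WPA:
    high training accuracy and small feature subsets both lower the value."""
    subset_reward = 1.0 - n_selected / n_total  # fewer features -> larger reward
    return -(alpha * train_accuracy + (1.0 - alpha) * subset_reward)
```

Under such a scheme the optimum is negative by construction; the actual weighting used to obtain the reported value of −0.91 may differ.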

Cost Forecasting of Substation Projects and Result Analysis.
The input vectors are substituted into the ILSSVM model for training and testing after all the optimal features of the sample data are obtained. A self-written program is run in Matlab. It is worth noting that the wavelet kernel function is chosen as the kernel function of the ILSSVM regression model, and the important parameters of the model are obtained through optimization with the WPA algorithm to ensure the accuracy of ILSSVM. The parameter settings of WPA were given in Section 3; the parameters of ILSSVM calculated by running the program are c = 43.0126 and σ = 19.0382. The unoptimized ILSSVM, LSSVM, SVM, and BP neural network (BPNN) models are also used to forecast the cost of substation projects, which helps demonstrate the forecasting performance of the model proposed in this paper. The topological structure of the BPNN model is 7-5-1, and the transfer functions of the hidden layer and the output layer are the tansig and purelin functions, respectively. The maximum number of training epochs is 200, the minimum training target error is 0.0001, the learning rate is 0.1, and the initial weights and thresholds are obtained by the network's own training. In the SVM model, the penalty parameter c is 10.276, the kernel function parameter σ is 0.0013, and the loss function parameter ε is 2.4375. In the LSSVM model, c is 10.108, σ is 0.0026, and ε is 0.0018. In the ILSSVM model, c is 16.263, σ is 0.0012, and ε is 0.0015. The forecasting results are given in Table 2 and plotted in Figure 4 to facilitate a more intuitive analysis. The RE of each forecasting model is shown in Figure 5. It can be seen from Table 2 and Figures 4 and 5 that the forecasting errors of the WPA-DIR-ILSSVM model are all within [−3%, 3%]; the minimum absolute relative error is 0.06% and the maximum is 1.30%.
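The paper names the wavelet kernel but does not reproduce its formula. A form commonly used in the LSSVM literature is the Morlet-type wavelet kernel, sketched below; the dilation parameter `a` plays the role of the kernel parameter σ, and this specific formulation is an assumption rather than a quotation from the paper:

```python
import math

def wavelet_kernel(x, z, a=1.0):
    """Morlet-type wavelet kernel:
    K(x, z) = prod_i cos(1.75 * u_i) * exp(-u_i**2 / 2), with u_i = (x_i - z_i) / a."""
    k = 1.0
    for xi, zi in zip(x, z):
        u = (xi - zi) / a
        k *= math.cos(1.75 * u) * math.exp(-u * u / 2.0)
    return k
```

Like the radial basis kernel it replaces, this kernel satisfies K(x, x) = 1, so it slots directly into the LSSVM kernel matrix without other changes to the model.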
Among them, 5 errors fall outside [−1%, 1%], namely, the substation projects with serial numbers 67, 70, 72, 79, and 83, with relative errors of −1.14%, 1.13%, 1.30%, 1.28%, and 1.08%, respectively. The relative errors of the 11 sample points in the ILSSVM forecasting model are controlled within [−3%, 3%], and 3 sample points, with serial numbers 70, 77, and 83, fall within [−2%, 2%]; their relative errors are −1.94%, 1.68%, and −0.89%, respectively, while the minimum absolute relative error of the model is 0.89% and the maximum is 3.66%. In the LSSVM forecasting model, the relative errors of the sample points with serial numbers 68, 76, 72, 80, and 86 are controlled within [−3%, 3%], with relative errors of −2.78%, 2.19%, −2.74%, and 1.36%, respectively; however, all of them are outside [−1%, 1%], and the minimum absolute relative error is 1.36% and the maximum is 6.27%. The minimum absolute relative error in the SVM forecasting model is 1.44% and the maximum is 7.10%, with the errors of most sample points between [−6%, −4%] and [4%, 6%]. The minimum absolute relative error in the BPNN forecasting model is 2.57% and the maximum is 7.71%, with the errors of most sample points between [−7%, −5%] and [5%, 7%]; meanwhile, the fluctuation range of the errors is relatively large in the BPNN model. According to the absolute relative errors, the WPA-DIR-ILSSVM model has the highest forecasting accuracy, followed by the ILSSVM, LSSVM, and SVM models, and the worst is the BPNN model. Therefore, the forecasting accuracy and stability of the WPA-DIR-ILSSVM model are the best, which reflects that the WPA algorithm can effectively enhance the training and learning process, avoid falling into local optima, and improve the global search ability of ILSSVM.
Simultaneously, satisfactory forecasting results can be obtained by using the input vectors based on the DIR model. Additionally, ILSSVM performs better than LSSVM, SVM, and BPNN, which indicates that more accurate forecasting results can be achieved through the improvement of LSSVM. The RMSE, MAPE, and AAE of BPNN, SVM, LSSVM, ILSSVM, and WPA-DIR-ILSSVM are shown in Figure 6. From Figure 6, the RMSE, MAPE, and AAE of the proposed model are 0.8025%, 0.7159%, and 0.7157%, respectively, all the smallest among the five models. The overall forecasting effect of ILSSVM is better than that of LSSVM, SVM, and BPNN, and that of LSSVM is better than that of SVM and BPNN, which indicates that the overall forecasting performance of LSSVM is significantly improved after the weighted improvement. The forecasting accuracy of WPA-DIR-ILSSVM is better than that of ILSSVM, which proves that the parameters c and σ of ILSSVM selected by the WPA algorithm have a good optimization effect. The DIR model ensures the integrity of the input information while reducing redundant data, so ideal forecasting results are achieved. In addition, K-fold cross-validation is utilized in this paper to improve generalization performance, which further advances the forecasting accuracy.
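The K-fold procedure mentioned above can be sketched as a minimal index-splitting helper; the paper does not specify the value of K or whether the samples are shuffled, so this is only an illustration:

```python
def kfold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation.
    Each sample appears in exactly one test fold; fold sizes differ by at most 1."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

Averaging the validation error over the K folds gives a less optimistic estimate of generalization than a single train/test split, which matters with the small sample sizes typical of substation cost data.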

Conclusions
This paper presents a hybrid cost forecasting model that combines DIR with ILSSVM optimized by WPA. First, to forecast the substation project cost, DIR combined with WPA is employed to select the input features. Furthermore, WPA is also adopted to optimize the parameters of ILSSVM. Finally, after obtaining the optimized input subset and the best values of c and σ, the proposed model is used for the cost forecasting of substation projects. Several conclusions can be drawn from the studies: (a) through the utilization of DIR, the influence of unrelated noise can be reduced and the forecasting performance can be effectively improved; (b) the WPA optimization algorithm endows the model with strong global search capability, and the ILSSVM model optimized by WPA shows good performance; (c) based on the error evaluation criteria, ILSSVM achieves better forecasting results than LSSVM, which shows that improving LSSVM by replacing the traditional radial basis kernel function with the wavelet kernel function is effective; (d) through the example verification of substation projects in different regions, of different voltage grades, and of different scales, an ideal forecasting effect is obtained, which shows that the proposed model is adaptable and stable. Hence, the proposed WPA-DIR-ILSSVM cost forecasting method is effective and feasible, and it may be an effective alternative for cost forecasting in the electric power industry. However, more sample data are needed for verification, and adopting more intelligent models to forecast substation project costs is our next work.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.