Model Building for Regional Ecological Risk Prediction and Evaluation of Prediction Accuracy

A regional ecological risk model is built to predict the regional ecological risk level more accurately by combining principal component analysis with an improved BP neural network. Taking Xiangxi Tujia and Miao Autonomous Prefecture as an example, twelve primary factors affecting regional ecological risk are selected. The sample data are processed by principal component analysis. The resulting principal components are then used as the input factors of the improved BP neural network, and the level of ecological risk is used as the output factor. The results indicate that the error between the expected output and the actual output is 4.36% in 2016, 1.08% in 2017, and 5.18% in 2018, all within 6%. Compared with the standard BP neural network without principal component analysis, the improved BP neural network with principal component analysis achieves greatly improved prediction accuracy. This comprehensive prediction model provides a better evaluation method for the prediction of ecological risk level.


Introduction
Just like political security, economic security, and military security, regional ecological security constitutes an important part of national security. Accurate prediction of regional ecological risk is the key to maintaining regional ecological security. Before the ecological environment deteriorates, we should accurately predict the ecological risk level, take effective measures to control ecological risk, and guide the regional ecological system back into a virtuous circle. Regional ecological risk prediction is a complex systematic project. The prediction methods are various, and the evaluation indexes differ dramatically. In recent years, scholars have put forward many prediction methods, including the fuzzy matter element method [1,2], the artificial neural network method [3,4], the grey sequence model [5], and the probabilistic method [6]. These methods mainly focus on partial evaluation indexes in the process of ecosystem evolution. Their precision is not very high, so the results cannot precisely reflect the actual situation. When the standard backpropagation (BP) neural network method is applied to the prediction of regional ecological risk level, it ignores the correlation among the input variables, which may lead to large prediction error. Besides, because of the excessive input data, the efficiency of the standard BP neural network method is also clearly decreased [7]. In view of these disadvantages, and to predict the regional ecological risk level more accurately, a model that combines the principal component analysis method with an improved BP neural network method is built in this paper. The principal components of the original sample data are extracted with the SPSS software. These independent principal components summarize most of the information in the raw data and can be used as the input factors for the improved BP neural network. In this way, the efficiency of the model can be greatly improved, which consequently increases the prediction accuracy of regional ecological risk.

Basic Principle of the Principal Component Analysis Method.
Principal component analysis (PCA) is a data dimensionality reduction method [8]. In the analysis, multiple indexes are transformed into a few representative indexes with little loss of information. The mathematical model of PCA is as follows [9-12].
This paper supposes a set of variables $X = \{X_1, X_2, \ldots, X_n\}$, which are used to describe the research subjects. If there are $m$ evaluation subjects, the sample matrix can be built as follows:

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix}$$

The original index data should be standardized owing to the differences in dimensions and orders of magnitude. The standardized matrix can thereafter be built.
According to formula (2), the correlation coefficient $R_{ij}$ between different variables can be calculated, and the covariance matrix $R$ can be established.
If $R_{ij}$ is large, the correlation between the corresponding variables is high and PCA should be conducted.
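The standardization and correlation steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the study itself used SPSS, the sample values are hypothetical, the function names `standardize` and `correlation_matrix` are introduced here, and the sample standard deviation (divisor m − 1) is an assumed convention.

```python
import math

def standardize(rows):
    """Column-wise z-scores: subtract the column mean, divide by the sample std."""
    m, n = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / m for j in range(n)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (m - 1))
        for j in range(n)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(n)] for r in rows]

def correlation_matrix(z):
    """R[i][j] = sum over subjects of z[k][i] * z[k][j], divided by (m - 1)."""
    m, n = len(z), len(z[0])
    return [
        [sum(row[i] * row[j] for row in z) / (m - 1) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical sample: 5 evaluation subjects (rows) and 3 indexes (columns).
samples = [
    [2.0, 10.0, 5.0],
    [4.0, 21.0, 3.0],
    [6.0, 29.0, 4.0],
    [8.0, 41.0, 1.0],
    [10.0, 50.0, 2.0],
]
Z = standardize(samples)
R = correlation_matrix(Z)
for row in R:
    print([round(v, 3) for v in row])
```

Off-diagonal entries of `R` with a large absolute value point to strongly correlated indexes, which is the situation in which PCA is worth applying.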
Based on the covariance matrix $R$, the eigenvalues, the contribution rate of each principal component, and the accumulative variance contribution rate can be calculated, and the number of principal components can be determined. The load matrix of the initial factors is then established, which can be used to explain the principal components. $\mu$ represents the mean value of the random variable $X$, and $X$ can be linearly transformed. The principal components are uncorrelated linear combinations of the initial variables:

$$F_i = a_{i1} X_1 + a_{i2} X_2 + \cdots + a_{in} X_n, \quad i = 1, 2, \ldots, k \qquad (3)$$

where $a_{ij}$ denotes the coefficient of variable $X_j$ in the $i$-th principal component and $k$ is the number of principal components retained.
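To make the eigenvalue and contribution-rate calculation concrete, the sketch below decomposes a small symmetric matrix by power iteration with deflation and then computes the contribution rates. It is an illustration only: the matrix `R` is hypothetical, the helper names are introduced here, and power iteration is a stand-in technique chosen for brevity, not the routine SPSS uses.

```python
import math

def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def dominant_eigenpair(a, iters=1000):
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    v = [1.0 / (i + 1) for i in range(len(a))]  # asymmetric start vector
    for _ in range(iters):
        w = matvec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(vi * wi for vi, wi in zip(v, matvec(a, v)))  # Rayleigh quotient
    return lam, v

def eigen_decompose(r):
    """Eigenpairs in descending order, found one at a time by deflation."""
    a = [row[:] for row in r]
    n = len(r)
    pairs = []
    for _ in range(n):
        lam, v = dominant_eigenpair(a)
        pairs.append((lam, v))
        # Remove the component just found so the next one becomes dominant.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return pairs

def contribution_rates(eigenvalues):
    """Per-component contribution rate and accumulative contribution rate."""
    total = sum(eigenvalues)
    rates = [lam / total for lam in eigenvalues]
    cumulative, running = [], 0.0
    for rate in rates:
        running += rate
        cumulative.append(running)
    return rates, cumulative

# Hypothetical correlation matrix of three standardized indexes.
R = [[1.0, 0.8, 0.6], [0.8, 1.0, 0.5], [0.6, 0.5, 1.0]]
pairs = eigen_decompose(R)
eigenvalues = [lam for lam, _ in pairs]
rates, cumulative = contribution_rates(eigenvalues)
print([round(x, 4) for x in eigenvalues])
print([round(x, 4) for x in cumulative])
```

Each eigenvector returned in `pairs` supplies the coefficients of one linear combination of the form in formula (3).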

Improved Backpropagation (BP) Neural Network.
The backpropagation (BP) neural network is a multilayer feedforward network trained by the error backpropagation algorithm [13]. In the forward propagation process, the input information is processed by the input layer and the hidden layer, and the actual output of each neuron is calculated. If the actual output does not conform to the expected output in the output layer, the output error is propagated backward through the hidden layer. At the same time, the error is apportioned among all the units in the hidden layer, and the error signal of each layer is obtained. Based on the error signal, the weight of each unit is corrected.
The forward propagation of information and the backpropagation of error alternate in a continuous cycle, which stops when the squared error of the network reaches its minimum [14]. The standard backpropagation algorithm is widely used [15][16][17]. However, it has some shortcomings, such as long training time and slow convergence.
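The forward and backward passes described above can be sketched for a tiny network with one hidden layer. This is a minimal illustration, not the network used in the study: the layer sizes, the tanh activation, the linear output unit, the learning rate, and the four training samples are all assumptions chosen to keep the example short.

```python
import math
import random

def forward(x, w_hidden, w_out):
    """Forward pass: tanh hidden layer, linear output (last weight is the bias)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(ws[:-1], x)) + ws[-1])
        for ws in w_hidden
    ]
    y = sum(w * h for w, h in zip(w_out[:-1], hidden)) + w_out[-1]
    return hidden, y

def train_epoch(samples, w_hidden, w_out, lr):
    """One pass of per-sample backpropagation; returns the summed squared error."""
    total = 0.0
    for x, target in samples:
        hidden, y = forward(x, w_hidden, w_out)
        err = y - target
        total += 0.5 * err * err
        # Error signal of each hidden unit: the output error apportioned by the
        # outgoing weight and scaled by the tanh derivative.
        deltas = [err * w_out[j] * (1.0 - hidden[j] ** 2) for j in range(len(hidden))]
        # Correct the output-layer weights and bias.
        for j, h in enumerate(hidden):
            w_out[j] -= lr * err * h
        w_out[-1] -= lr * err
        # Correct the hidden-layer weights and biases.
        for j, ws in enumerate(w_hidden):
            for i, xi in enumerate(x):
                ws[i] -= lr * deltas[j] * xi
            ws[-1] -= lr * deltas[j]
    return total

rng = random.Random(0)  # fixed seed so the run is repeatable
n_in, n_hidden = 2, 3
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

# Hypothetical training samples: target is the sum of the two inputs.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 2.0)]
errors = [train_epoch(samples, w_hidden, w_out, lr=0.1) for _ in range(2000)]
print(f"first epoch error: {errors[0]:.4f}, last epoch error: {errors[-1]:.6f}")
```

Plain gradient descent of this kind typically needs many epochs, which is the slow convergence that motivates a faster training algorithm.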
The Levenberg-Marquardt (L-M) algorithm is specifically designed to minimize a squared error [18]. Essentially, the L-M algorithm combines the gradient descent method with the Newton method. It can shorten the training time of the neural network, accelerate the convergence of the network, and obtain accurate prediction results. The squared error of this algorithm is

$$E = \frac{1}{2} \sum_{p} (\varepsilon_p)^2 = \frac{1}{2} \lVert \varepsilon \rVert^2$$

where $p$ indexes the samples and $\varepsilon$ is the vector whose elements are $\varepsilon_p$. Suppose the current location is $\omega_0$ and it moves to the new location $\omega_1$. If the amount of movement is small, $\varepsilon$ can be expanded into a first-order Taylor series:

$$\varepsilon(\omega_1) = \varepsilon(\omega_0) + Z(\omega_1 - \omega_0)$$

where the elements of $Z$ are

$$Z_{pi} = \frac{\partial \varepsilon_p}{\partial \omega_i}$$

and the error function can be written in the following form:

$$E = \frac{1}{2} \lVert \varepsilon(\omega_0) + Z(\omega_1 - \omega_0) \rVert^2$$

To minimize $E$, its derivative with respect to $\omega_1$ is set to zero, which gives

$$\omega_1 = \omega_0 - (Z^T Z)^{-1} Z^T \varepsilon(\omega_0)$$

Since the step length given by this formula may be too long for the first-order expansion to remain valid, the squared error is corrected by a term that penalizes large steps:

$$\tilde{E} = \frac{1}{2} \lVert \varepsilon(\omega_0) + Z(\omega_1 - \omega_0) \rVert^2 + \frac{\lambda}{2} \lVert \omega_1 - \omega_0 \rVert^2$$

The value of $\omega_1$ that minimizes $\tilde{E}$ is

$$\omega_1 = \omega_0 - (Z^T Z + \lambda I)^{-1} Z^T \varepsilon(\omega_0) \qquad (10)$$

When $\lambda$ is very small, the update becomes the Newton method. When $\lambda$ is very large, it becomes the gradient descent method with step length $\lambda^{-1}$. During the calculation, $\lambda$ should be adjusted according to the actual situation. A frequently used method is as follows. In the beginning, $\lambda$ is selected arbitrarily, and the change of $E$ is analyzed at each step. If the error declines after using formula (10), $\omega_1$ is retained, $\lambda$ is reduced, and these steps are repeated. If the error increases, $\omega_0$ is kept, $\lambda$ is increased tenfold, and $\omega_1$ is recalculated. This process repeats until $E$ reaches the required precision [19].
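The damped update and the rule for adjusting λ can be sketched on a small curve-fitting problem. This is an illustration under stated assumptions rather than the training code of the study: the two-parameter model a·exp(b·x), the noise-free data, the starting point, the initial λ, and the tenfold reduction of λ after a successful step are all choices made here; the source only states that λ is reduced on success and increased tenfold on failure.

```python
import math

def residuals(w, xs, ys):
    """Error vector: model output minus target for each sample."""
    a, b = w
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]

def jacobian(w, xs):
    """Z[p][i]: derivative of the p-th residual with respect to the i-th parameter."""
    a, b = w
    return [[math.exp(b * x), a * x * math.exp(b * x)] for x in xs]

def squared_error(eps):
    return 0.5 * sum(e * e for e in eps)

def lm_step(w, eps, z, lam):
    """Solve (Z^T Z + lam * I) d = -Z^T eps for two parameters by Cramer's rule."""
    a11 = sum(r[0] * r[0] for r in z) + lam
    a12 = sum(r[0] * r[1] for r in z)
    a22 = sum(r[1] * r[1] for r in z) + lam
    g1 = sum(r[0] * e for r, e in zip(z, eps))
    g2 = sum(r[1] * e for r, e in zip(z, eps))
    det = a11 * a22 - a12 * a12
    d1 = (-g1 * a22 + g2 * a12) / det
    d2 = (-g2 * a11 + g1 * a12) / det
    return [w[0] + d1, w[1] + d2]

def levenberg_marquardt(w0, xs, ys, lam=0.01, tol=1e-12, max_iter=200):
    w = list(w0)
    err = squared_error(residuals(w, xs, ys))
    for _ in range(max_iter):
        if err < tol:
            break
        eps = residuals(w, xs, ys)
        w_new = lm_step(w, eps, jacobian(w, xs), lam)
        err_new = squared_error(residuals(w_new, xs, ys))
        if err_new < err:
            w, err = w_new, err_new  # error declined: keep the new location
            lam /= 10.0              # assumed reduction factor
        else:
            lam *= 10.0              # error increased: keep the old location
    return w, err

# Hypothetical noise-free data generated from a = 2, b = 0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
w_fit, final_error = levenberg_marquardt([1.0, 0.3], xs, ys)
print([round(v, 6) for v in w_fit], final_error)
```

In a neural network the parameter vector holds all weights and biases, so the 2×2 solve above becomes a larger linear system, but the accept-or-reject logic for λ is unchanged.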

Prediction Model Based on the PCA Method and Improved BP Neural Network.
The prediction model of regional ecological risk is built by combining the PCA method with the improved BP neural network. Firstly, the original data related to ecological risk are collected and processed for correlation analysis by using the SPSS software. Secondly, after the original data are standardized by the SPSS software, the principal components ($X_1, X_2, \ldots, X_k$), which contain the vast majority of the information in the raw data, are extracted by PCA. Lastly, the principal components ($X_1, X_2, \ldots, X_k$) are used as the input factors for the improved BP neural network, and $Y$ is used as the output factor. This model is intended to give a precise prediction of the regional ecological risk level. During this process, correlated input variables are transformed into uncorrelated ones by the PCA method. In this way, the model reduces the dimensions of the data and the number of input factors for the improved BP neural network. Compared with the standard BP neural network, the training algorithm of the improved BP neural network is changed, which clearly shortens the training time, accelerates convergence, and increases the prediction accuracy. In summary, this prediction model makes full use of the advantages of the two methods and can effectively solve the classification problems in regional ecological risk assessment. Its structure is shown in Figure 1.
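The claim that PCA turns correlated inputs into uncorrelated ones can be checked on a two-variable case. The sketch below is illustrative only and relies on a known property: for two standardized variables, the principal axes of the correlation matrix are (1, 1)/√2 and (1, −1)/√2. The data values and function names are assumptions introduced here.

```python
import math

def zscores(values):
    """Standardize a series with the sample standard deviation."""
    m = len(values)
    mean = sum(values) / m
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (m - 1))
    return [(v - mean) / std for v in values]

def pearson(a, b):
    za, zb = zscores(a), zscores(b)
    return sum(x * y for x, y in zip(za, zb)) / (len(a) - 1)

# Two hypothetical, strongly correlated raw indexes.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

z1, z2 = zscores(x1), zscores(x2)
s = 1.0 / math.sqrt(2.0)
f1 = [s * (a + b) for a, b in zip(z1, z2)]  # first principal component scores
f2 = [s * (a - b) for a, b in zip(z1, z2)]  # second principal component scores

r_raw = pearson(x1, x2)
r_components = pearson(f1, f2)
print(round(r_raw, 4), round(abs(r_components), 10))
```

The raw indexes are almost perfectly correlated, while the component scores are uncorrelated up to rounding error, so a network fed with `f1` and `f2` no longer receives redundant inputs.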

Case Study
Taking Xiangxi Tujia and Miao Autonomous Prefecture as an example, the ecological risk level in this area is predicted by the PCA method and the improved BP neural network. Twelve factors affecting regional ecological risk are selected [20][21][22][23][24]: the density of population ($I_1$), pesticide usage per hectare of cultivated land ($I_2$), fertilizer usage per hectare of cultivated land ($I_3$), volume of wastewater discharged per ten thousand yuan of industrial output ($I_4$), volume of solid waste produced per ten thousand yuan of industrial output ($I_5$), domestic sewage discharged per capita ($I_6$), energy consumption per ten thousand yuan of GDP ($I_7$), water consumption per ten thousand yuan of industrial output ($I_8$), the proportion of environmental investment in gross fixed assets formation ($I_9$), the standard discharge rate of industrial wastewater ($I_{10}$), the comprehensive utilization of solid waste ($I_{11}$), and the repeating utilization rate of industrial water ($I_{12}$). The data come from the relevant statistical materials about Xiangxi Tujia and Miao Autonomous Prefecture, which include the Xiangxi Statistical Yearbook (2009-2018), the Twelfth Five-Year Plan of Xiangxi, and the Xiangxi statistical information network. Specific data are shown in Table 1. Based on the twelve evaluation indexes, the regional ecological risk level is calculated by using the variable weight method and the grey correlation theory [24]. The evaluation results are also shown in Table 1. The numbers 1, 2, 3, 4, and 5 represent the ecological risk levels I, II, III, IV, and V, which indicate great risk, large risk, normal risk, small risk, and no risk, respectively. The characteristics of each ecological risk level are presented in Table 2.
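The level coding above maps directly onto a small lookup table. In the sketch below the mapping itself comes from the text, while the function `describe_level` and its rule of rounding a continuous output to the nearest level and clamping it to the range 1 to 5 are assumptions added for illustration.

```python
import math

# Numeric code -> (level, meaning), as listed in the text.
RISK_LEVELS = {
    1: ("I", "great risk"),
    2: ("II", "large risk"),
    3: ("III", "normal risk"),
    4: ("IV", "small risk"),
    5: ("V", "no risk"),
}

def describe_level(output):
    """Map a continuous output to the nearest risk level (assumed rule)."""
    code = int(math.floor(output + 0.5))  # round half up
    code = max(1, min(5, code))           # clamp to the valid range
    return RISK_LEVELS[code]

print(describe_level(3.13))
print(describe_level(4.2))
```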

Correlation Analysis.
To prevent collinearity among different factors, which may cause errors in the grading results, the data shown in Table 1 are processed for correlation analysis by the SPSS software. The correlation is measured by the simple Pearson correlation coefficient, and the significance test is carried out with the two-tailed method. Based on the diagnosis results, the Pearson correlation coefficient matrix is established (Table 3). The results show that there is obvious collinearity among the density of population, pesticide usage per hectare of cultivated land, domestic sewage discharged per capita, energy consumption per ten thousand yuan of GDP, the standard discharge rate of industrial wastewater, and the repeating utilization rate of industrial water. Therefore, it is necessary to conduct PCA.
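A Pearson coefficient with a two-tailed significance check can be sketched as below. The series are hypothetical ten-point samples (one value per year is an assumption about the data layout), the 5% significance level is assumed, and 2.306 is the standard two-tailed critical t value for 8 degrees of freedom at that level. The study itself relied on SPSS for this step.

```python
import math

T_CRITICAL_DF8 = 2.306  # two-tailed critical t value, 5% level, 8 degrees of freedom

def pearson_r(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def t_statistic(r, n):
    """t value for testing r against zero with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def is_significant(a, b, t_critical=T_CRITICAL_DF8):
    r = pearson_r(a, b)
    return r, abs(t_statistic(r, len(a))) > t_critical

# Hypothetical yearly series with 10 observations each.
factor_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
factor_b = [2.0, 2.9, 4.2, 4.8, 6.1, 7.2, 7.9, 9.1, 9.8, 11.2]
factor_c = [5.0, 3.0, 6.0, 2.0, 7.0, 4.0, 6.0, 3.0, 5.0, 4.0]

r_ab, sig_ab = is_significant(factor_a, factor_b)
r_ac, sig_ac = is_significant(factor_a, factor_c)
print(round(r_ab, 4), sig_ab)
print(round(r_ac, 4), sig_ac)
```

Pairs flagged as significant with a large coefficient are the collinear factors that justify running PCA before training the network.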

Principal Component Analysis.
The original data are standardized by the SPSS software, and the results are shown in Table 4. The data in Table 4 are then analyzed with the PCA procedure provided by SPSS, which yields the scree plot (Figure 2), the list of principal components (Table 5), and the load matrix of principal components (Table 6). Figure 2 indicates that the difference in eigenvalue between Component 1 and Component 2 is relatively large, whereas the differences among the other components are small. It can be preliminarily determined that the first two components capture the vast majority of the information. Table 5 shows that the eigenvalues of the first two components are both greater than 1 and that together they explain 85.678% of the total variation. This meets the requirement that the variance of the principal components accounts for 75%-85% of the total variance. Therefore, the first two components are selected as the principal components, which can replace the original variables.
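The two selection criteria used here, eigenvalue greater than 1 and a sufficient accumulative contribution, can be written as a short check. The eigenvalues below are hypothetical and are not the values in Table 5; the function name and the 0.75 default threshold (the lower end of the 75%-85% range quoted above) are assumptions.

```python
def select_components(eigenvalues, min_eigenvalue=1.0, min_cumulative=0.75):
    """Keep components with eigenvalue above min_eigenvalue and report whether
    their accumulative contribution reaches min_cumulative."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    kept = [lam for lam in ordered if lam > min_eigenvalue]
    cumulative = sum(kept) / total
    return len(kept), cumulative, cumulative >= min_cumulative

# Hypothetical eigenvalues for a five-variable problem.
hypothetical_eigenvalues = [3.1, 1.2, 0.4, 0.2, 0.1]
k, cumulative, enough = select_components(hypothetical_eigenvalues)
print(k, round(cumulative, 3), enough)
```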

Figure 1: Structure of the prediction model, in which principal component analysis feeds the improved BP neural network.

Table 6 shows the correlation coefficients between the original variables and the principal components, which express the loadings of the two components $F_1$ and $F_2$ on each original variable. According to formula (3), the factor expressions for the principal components can be written as linear combinations of the standardized variables. Based on these factor expressions, the principal components of the standardized data can be calculated and used as the input data for the improved BP neural network. The results are shown in Table 7.

Table 1: Statistical data of regional ecological risk factors and ecological risk level.
Columns: Years; statistical value of regional ecological risk factors $I_1$ (person/km$^2$), $I_2$ (kg), $I_3$ (kg), $I_4$ (t), $I_5$ (t), $I_6$ (t), $I_7$ (ton of SCE), $I_8$ (t), $I_9$ (%), $I_{10}$ (%), $I_{11}$ (%), $I_{12}$; ecological risk level.

Training and Prediction of Improved BP Neural Network.
In the improved BP neural network, the principal components $F_1$ and $F_2$ are used as the input factors, and the regional ecological risk level $R$ is used as the output factor. The model is established by using the MATLAB software. The data in Table 7 are divided into two subsets: the training samples (2009-2015) and the prediction samples (2016-2018). In constructing the improved BP neural network, the related parameters are set as follows: the learning rate is 0.9 and the momentum factor is 0.7. The network structure finally obtained through training has two input nodes, ten hidden layer nodes, and one output node. The training process of the standard BP neural network without PCA is shown in Figure 3, while the training process of the improved BP neural network with PCA is shown in Figure 4. These two figures show that the number of learning steps of the improved BP neural network with PCA is clearly reduced and that the training speed is significantly accelerated. The predictions are shown in Table 8. From 2016 to 2018, the ecological risk levels of Xiangxi Tujia and Miao Autonomous Prefecture are III, III, and IV. The relative error between the actual output and the desired output of the improved BP neural network with PCA is less than 6%, whereas the relative error of the standard BP neural network without PCA is greater than 9%. Compared with the predictions made by the standard BP neural network without PCA, the prediction accuracy of the improved BP neural network with PCA is greatly improved.
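The year-based split and the relative error measure can be sketched as follows. The record layout, the function names, and the example output value 3.12 are hypothetical; only the split years (2009-2015 for training, 2016-2018 for prediction) come from the text, and the formula |actual − expected| / |expected| × 100 is the usual definition of relative error, assumed to be the one used here.

```python
def split_by_year(records, last_training_year):
    """Training samples up to and including the given year, the rest for prediction."""
    train = [r for r in records if r["year"] <= last_training_year]
    test = [r for r in records if r["year"] > last_training_year]
    return train, test

def relative_error_percent(actual, expected):
    """Relative error between the actual and the desired output, in percent."""
    return abs(actual - expected) / abs(expected) * 100.0

# Placeholder records: one entry per year, component scores omitted.
records = [{"year": year} for year in range(2009, 2019)]
train, test = split_by_year(records, last_training_year=2015)
print(len(train), len(test))

# Hypothetical example: desired level 3, network output 3.12.
example_error = relative_error_percent(3.12, 3.0)
print(round(example_error, 2))
```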
Table 6: Load matrix of principal components.
F_1      F_2      Original variable
−0.853   0.356
0.209    −0.871   Volume of wastewater discharged per ten thousand yuan of industrial output (I_4)
0.925    −0.273   Volume of solid waste produced per ten thousand yuan of industrial output (I_5)
0.714    0.536    Domestic sewage discharged per capita (I_6)
−0.913   −0.366   Energy consumption per ten thousand yuan of GDP (I_7)
0.730    −0.513   Water consumption per ten thousand yuan of industrial output (I_8)
−0.591   0.512    The proportion of environmental investment in gross fixed assets formation (I_9)
0.774    0.521    The standard discharge rate of industrial wastewater (I_10)
−0.655   0.676    The comprehensive utilization of solid waste (I_11)
0.522    0.785    The repeating utilization rate of industrial water (I_12)

Conclusions
In this paper, twelve factors affecting regional ecological risk are selected. The principal components of the original sample data are extracted with the SPSS software. In this way, the correlation between different indexes is eliminated, and the number of input variables of the neural network is reduced. The improved BP neural network is used to predict the regional ecological risk level, which speeds up training and improves the prediction accuracy. The relative error between the actual output and the desired output of the improved BP neural network with PCA is 4.36%, 1.08%, and 5.18% for 2016, 2017, and 2018, respectively, all within 6%. Compared with the standard BP neural network without PCA, the prediction accuracy of the improved BP neural network with PCA is clearly improved.
Based on the prediction model combining the principal component analysis method with the improved BP neural network, the ecological risk level in Xiangxi Tujia and Miao Autonomous Prefecture can be predicted. The predicted results are consistent with the expected output of the network. This shows that the prediction model is reasonable and feasible and is a better solution for regional ecological risk prediction.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Advances in Civil Engineering