Crack Prediction Based on Wavelet Correlation Analysis Least Squares Support Vector Machine for Stone Cultural Relics

Preventive protection of cultural relics draws on all science and technology that benefits the research and protection of archaeological heritage in order to predict the diseases of cultural relics. Existing preventive protection systems have made some achievements in environmental monitoring, but the analysis and use of big data on cultural relics are still insufficient. In this paper, following the idea of multisource information fusion, a least squares support vector machine regression method based on wavelet correlation analysis of multivariate time series is proposed to achieve accurate crack prediction for stone cultural relics. Firstly, the correlation among the multivariate time series of stone cultural relics is quantitatively analyzed, and the validity of the characteristic variables of the crack is assessed by wavelet correlation analysis; then, a least squares support vector machine prediction model is constructed based on the correlations obtained from the analysis; finally, the performance of the method is verified using the environmental monitoring data of the rock mass fracture in the North Qianfo Cliff of Dafo Temple in Binzhou City of Shaanxi Province. The experimental results show that the proposed method is more effective than the traditional backpropagation neural network, support vector machine, and relevance vector machine regression methods. The method is general and easy to implement for multisource data prediction of immovable cultural relics diseases, and it provides a scientific theoretical reference for the preventive protection of cultural relics.


Introduction
Immovable cultural relics refer to ancient cultural sites, ancient tombs, ancient buildings, cave temples, stone carvings, murals, important modern historical sites, and representative buildings. The protection of cultural heritage is not only the foundation of maintaining the world's cultural diversity and inheriting human civilization but also the responsibility of mankind [1,2].
Immovable cultural relics diseases are mainly influenced by natural and man-made environmental factors [3]. These environmental factors, which include wind speed, temperature, humidity, and light, are monitored by sensors and usually stored as multivariate time series; the resulting datasets are characterized by high dimensionality, complex correlations, and nonlinearity. Diseases of immovable cultural relics have been found to correlate with the environment in which the relics are located. However, current studies usually start from a single attribute and only display the multivariate time series and the diseases visually, without in-depth correlation analysis or multiattribute fusion of the large volume of multisource data collected, so the degree of disease cannot be accurately predicted.
On the one hand, according to different disciplines, current correlation analysis methods can be divided into data mining methods, statistical methods, wavelet correlation analysis (WCA), correlation analysis based on mutual information [4][5][6][7][8][9], and so forth. WCA is based on the wavelet transform, which uses decaying wavelets of finite length [10][11][12][13][14][15]. In this way, not only can the frequency be obtained, but the time can also be located; that is, a time-frequency analysis of the signal can be carried out [16,17]. Therefore, in view of the nonlinear and nonstationary characteristics of the disease data series of immovable relics, the quantitative correlation between the independent variables (environmental characteristics) and the dependent variables (disease) can be obtained by WCA. On the other hand, disease prediction for immovable cultural relics is still in its infancy, and the existing prediction methods are all based on the backpropagation neural network (BPNN) [18,19], support vector machine (SVM) [20,21], and relevance vector machine (RVM) [22,23]. However, BPNN uses gradient descent to minimize the objective function, so it easily falls into a local optimum. SVM adopts the VC dimension theory of statistical learning and structural risk minimization to avoid the local optimum problem [24], but when the sample set is large, the training time of the prediction model becomes too long. For RVM, the single kernel function leads to a trade-off between learning ability and generalization ability. Therefore, with the rapid development of artificial intelligence technology, it is urgent to introduce a multisource information fusion algorithm that predicts diseases quickly, efficiently, and accurately [25][26][27].
This paper proposes a least squares support vector machine (LSSVM) regression method based on WCA of multivariate time series, which accurately predicts the diseases of immovable cultural relics. WCA is used to identify the validity of the characteristic variables of cultural relics, in addition to the normalization of data and the dimension reduction of independent variables. LSSVM transforms the convex quadratic optimization problem of SVM into the solution of linear equations, which greatly reduces the difficulty of training. Moreover, experiments demonstrate the superiority of the proposed method: accurate prediction of cracks in stone cultural relics is achieved, which provides solid theoretical guidance for a concept of cultural relics protection that is both "rescue" and "preventive".
Compared with the existing crack prediction methods of stone cultural relics, this paper has the following innovation points.
(i) The main factors affecting cracks in stone cultural relics are obtained quantitatively by the WCA method, and the dimension reduction of the data is realized without affecting the prediction accuracy of the model.
(ii) A method of stone cultural relics crack prediction based on wavelet correlation analysis least squares support vector machine is proposed, and the prediction model is established to achieve accurate prediction of the degree of cracking in stone cultural relics.
(iii) The proposed method is compared with the traditional immovable cultural relics disease prediction methods based on BPNN, SVM, and RVM. Experimental results show that the method not only reduces the cost of prediction but also has higher prediction accuracy than the other traditional machine learning models.

Preprocessing of Stone Cultural Relics Environment Data
The monitoring data of stone cultural relics can be divided into independent variable and dependent variable multivariate time series. The independent variable sequence refers to the environmental data, while the dependent variable sequence can be regarded as the quantity to be predicted from the independent variable sequence (such as the degree and type of disease). In this paper, the disease refers to the crack. Due to the diversity, complexity, and redundancy of environmental data, as well as the different units and value ranges of each variable attribute, the preprocessing of environmental data of stone cultural relics is divided into two steps: (I) data normalization [28] and (II) dimension reduction of independent variables.

Normalization of Environmental Data of Stone Cultural Relics.
Let the independent variable matrix of the environmental data be $X(t) = [x_1(t), \ldots, x_n(t)]$, where $x_i(t)$, $i = 1, \ldots, n$, is the observed column variable of an environmental attribute (e.g., temperature, humidity, and other parameters) and is an independent variable in the multivariate time series; $t = 1, \ldots, T$ is the time variable. Similarly, let the dependent variable matrix of cultural relics disease be $Y(t) = [y_1(t), \ldots, y_m(t)]$, where $y_j(t)$, $j = 1, \ldots, m$, is a column variable of relics disease data (e.g., crack) and a dependent variable in the multivariate time series. Each variable is then normalized, and the normalized multivariate time series matrices are denoted by $X'(t)$ and $Y'(t)$. The normalized data not only maintain the diversity of their properties but also eliminate the differences in value range between different attributes.
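The normalization step can be illustrated with a short sketch. It assumes column-wise min-max scaling to [0, 1]; this choice, the function name `normalize_columns`, and the sample readings are assumptions made for illustration, and the normalization formula actually used may differ.

```python
import numpy as np

def normalize_columns(data):
    """Scale each column (one variable's time series) to the range [0, 1]."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_max = data.max(axis=0)
    span = col_max - col_min
    # Guard against constant columns, which would otherwise divide by zero.
    span[span == 0] = 1.0
    return (data - col_min) / span

# Hypothetical readings: columns are temperature (deg C) and humidity (%).
X = np.array([[10.0, 40.0],
              [20.0, 60.0],
              [30.0, 80.0]])
X_norm = normalize_columns(X)
```

After scaling, every attribute spans the same range, so variables with large raw values (for example humidity in percent) no longer dominate variables with small raw values.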

Independent Variable Dimension Reduction Based on WCA.
The diseases of immovable cultural relics are affected by a variety of environmental variables, and the multivariate time series composed of these environmental variables are characterized by nonlinearity, uncertainty, redundancy, and nonstationarity. In order to reduce the cost of predictive modeling, it is necessary to determine the principal factors related to the diseases. Therefore, dimension reduction of the independent variables based on WCA is required for the environmental variables.

Wavelet Correlation Analysis Method.
$L^2(R)$ is defined as the space of measurable and square-integrable functions on the real axis. For a time series $x(t) \in L^2(R)$, the continuous wavelet transform [29] is

$$W_x(a, b) = \int_{-\infty}^{+\infty} x(t)\, \psi^{*}_{a,b}(t)\, dt,$$

where $W_x(a, b)$ is the wavelet coefficient obtained by the continuous wavelet transform of $x(t)$, $a$ is the timescale factor, $b$ is the time-position factor, and $\psi^{*}_{a,b}(t)$ is the complex conjugate of $\psi_{a,b}(t)$. Here,

$$\psi_{a,b}(t) = |a|^{-1/2}\, \psi\!\left(\frac{t - b}{a}\right),$$

where $\psi_{a,b}(t)$ is the wavelet function; that is, the wavelet basis function $\psi(t)$ scaled by $a$ and shifted by $b$. Wenyi Chen gave the definition of the cross wavelet transform of two time series [30]:

$$W_{xy}(a, b) = W_x(a, b)\, W^{*}_y(a, b),$$

where $W_x(a, b)$ and $W_y(a, b)$ are the wavelet transforms of the time series $x(t)$ and $y(t)$ at scale factor $a$ and time-position factor $b$, and $W^{*}_y(a, b)$ is the complex conjugate of $W_y(a, b)$. Yanfang Sang, Dong Wang, and other scholars gave the definition of the wavelet correlation coefficient $WR_{xy}(a, k)$ [10,26,28]:

$$WR_{xy}(a, k) = \frac{R\big(Wcov_{xy}(a, k)\big)}{\sqrt{Wcov_{xx}(a, 0)\, Wcov_{yy}(a, 0)}}, \tag{8}$$

where $Wcov_{xy}(a, k)$ is the wavelet cross covariance,

$$Wcov_{xy}(a, k) = E\big[W_x(a, b)\, W^{*}_y(a, b + k)\big].$$

Here $WR_{xy}(a, k)$ is the wavelet correlation coefficient of $x(t)$ and $y(t)$ at scale factor $a$ and lag factor $k$, $E[\cdot]$ is the mean value of the quantity in brackets, and $R(\cdot)$ is the real part of the quantity in parentheses. $WR_{xy}(a, k)$ takes values in $[-1, 1]$ and represents the correlation between the independent variable $x(t)$ and the dependent variable $y(t)$: "−1" means completely negative correlation, "0" means no correlation, and "1" means completely positive correlation.
Based on the theory of wavelet correlation analysis and the collected multicharacteristic environmental variables, the number of variables in the multivariate time series can be reduced. The retained variables are the key factors affecting the diseases of cultural relics. Modeling on this basis can greatly reduce the cost of prediction without significantly affecting the prediction results.
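As a rough numerical illustration of the wavelet correlation coefficient, the sketch below computes it at a single scale with a complex Morlet wavelet using only NumPy. The wavelet choice, the parameter `omega0`, the truncation of the wavelet support, the use of the overlapping segment for normalization at nonzero lag, and the function names are all assumptions made for this example rather than the settings used in the experiments.

```python
import numpy as np

def morlet_cwt(x, scale, omega0=6.0):
    """Continuous wavelet transform of x at one scale (complex Morlet wavelet)."""
    x = np.asarray(x, dtype=float)
    # Truncate the wavelet support so the kernel is never longer than the signal.
    half = min(int(np.ceil(4 * scale)), (len(x) - 1) // 2)
    u = np.arange(-half, half + 1) / scale
    psi = np.pi ** -0.25 * np.exp(1j * omega0 * u) * np.exp(-0.5 * u ** 2) / np.sqrt(scale)
    # The Morlet wavelet satisfies conj(psi(-u)) = psi(u), so correlating x with
    # conj(psi) is the same as convolving x with psi.
    return np.convolve(x, psi, mode="same")

def wavelet_correlation(x, y, scale, lag=0):
    """Wavelet correlation coefficient of x and y at one scale and a lag >= 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    wx = morlet_cwt(x - x.mean(), scale)
    wy = morlet_cwt(y - y.mean(), scale)
    n = len(wx) - lag
    wx, wy = wx[:n], wy[lag:lag + n]          # pair W_x(a, b) with W_y(a, b + lag)
    cov_xy = np.mean(wx * np.conj(wy))        # wavelet cross covariance
    var_x = np.mean(np.abs(wx) ** 2)
    var_y = np.mean(np.abs(wy) ** 2)
    return float(np.real(cov_xy) / np.sqrt(var_x * var_y))

# Hypothetical series: a periodic "temperature" signal and its mirror image.
t = np.arange(256)
temperature = np.sin(2 * np.pi * t / 32)
r_same = wavelet_correlation(temperature, temperature, scale=30)
r_opposite = wavelet_correlation(temperature, -temperature, scale=30)
```

Here `r_same` is close to 1 and `r_opposite` is close to −1, matching the interpretation of the coefficient as a measure of positive or negative correlation at the chosen scale.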

Data Dimensionality Reduction Based on WCA and Stepwise Regression Forward Algorithm.
The wavelet correlation analysis method is adopted to calculate the correlation coefficient matrix between the normalized independent variables $X'(t)$ and the dependent variables $Y'(t)$ using formula (8), denoted as $R_{xy}$ ($n \times m$), as follows [28]:

$$R_{xy} = \begin{bmatrix} R_{11} & \cdots & R_{1m} \\ \vdots & \ddots & \vdots \\ R_{n1} & \cdots & R_{nm} \end{bmatrix},$$

where each element $R_{ij}$ is the wavelet correlation coefficient between the $i$th variable of $X'(t)$ and the $j$th variable of $Y'(t)$; for a single dependent variable such as the crack ($m = 1$), $R_{xy}$ is a column vector. The dimensionality reduction of the independent variables adopts the forward stepwise regression algorithm, where P denotes the set of normalized variables, N is the number of input variables, and H is the set of independent variables after dimension reduction. According to the theory of wavelet correlation analysis, the dimensionality reduction of the multivariate time series proceeds as follows:
(i) Initialization: set the normalized $X'(t)$ as P and H as the empty set.
(ii) Based on formula (8), calculate the wavelet correlation coefficients between the input and output variables to obtain $R_{xy}$ ($n \times m$).
(iii) Move the variable in P with the largest wavelet correlation coefficient to H as the basic attribute.
(iv) Taking the set of independent variables H as the input of the prediction model, obtain the root mean square error RMSE(H) of the prediction results, where

$$RMSE(H) = \sqrt{\frac{1}{A} \sum_{i=1}^{A} \big(y_{r,i} - y_{p,i}\big)^2},$$

in which A is the number of test samples, $y_r$ is the real value, and $y_p$ is the predicted value.
(v) Select one variable $X_i$ from those remaining in P, add it to H, and calculate the RMSE again; $X_i$ is kept in H only if the RMSE decreases.
(vi) Repeat steps (iv) and (v) until all variables in P have been traversed, which completes the dimension reduction.
The set H of independent variables obtained after dimension reduction is the input of the prediction model. The purpose of reducing the dimension of the independent variables is to remove redundant variables, which improves the accuracy of the prediction model and provides scientific guidance for reducing equipment investment in data collection.
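The forward stepwise procedure can be sketched as follows. This is an illustrative outline only: an ordinary least squares predictor stands in for the prediction model, Pearson correlations stand in for the wavelet correlation coefficients in the usage example, the traversal order (strongest correlation first) and the rule of keeping a variable only when the test RMSE drops are assumptions, and all function names and data are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between real and predicted values."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def fit_predict_linear(X_train, y_train, X_test):
    """Stand-in predictor (least squares with intercept), not the LSSVM model."""
    A_train = np.column_stack([X_train, np.ones(len(X_train))])
    A_test = np.column_stack([X_test, np.ones(len(X_test))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef

def forward_select(X_train, y_train, X_test, y_test, corr,
                   fit_predict=fit_predict_linear):
    """Forward stepwise selection seeded by the most correlated variable."""
    order = list(np.argsort(-np.abs(corr)))      # strongest correlation first
    selected = [order[0]]                        # basic attribute
    best = rmse(y_test, fit_predict(X_train[:, selected], y_train,
                                    X_test[:, selected]))
    for i in order[1:]:
        trial = selected + [i]
        err = rmse(y_test, fit_predict(X_train[:, trial], y_train,
                                       X_test[:, trial]))
        if err < best:                           # keep only if the RMSE drops
            selected, best = trial, err
    return sorted(int(i) for i in selected), best

# Synthetic example: only columns 0 and 2 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - X[:, 2] + 0.01 * rng.normal(size=200)
corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
selected, best_rmse = forward_select(X[:160], y[:160], X[160:], y[160:], corr)
```

Any predictor with the same `fit_predict(X_train, y_train, X_test)` signature can be passed in, so the stand-in can be swapped for an LSSVM model without changing the selection loop.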

Stone Cultural Relics Crack Prediction
SVM, which originated from statistical learning theory [31][32][33] in the 1990s, is a commonly used model in machine learning. It can be used not only for the classification of nonlinear samples but also for regression prediction. It adopts the VC dimension theory of statistical learning and minimizes the structural risk, avoiding the problem of local optima. However, the more sample data there are, the longer the training time of the prediction model becomes. Therefore, this paper proposes to use LSSVM to accurately predict the crack of stone cultural relics [34][35][36][37]. Its loss function involves all samples rather than only the support vectors, so the fitting error of every sample is corrected, which greatly improves the prediction accuracy. LSSVM transforms the convex quadratic optimization problem of the support vector machine into the solution of linear equations, which greatly reduces the difficulty of training [38][39][40][41][42][43]. References [41,42] discussed the use of the LSSVM model for power module fault prediction and short-term traffic flow prediction, where the prediction accuracy and training efficiency of the LSSVM model are much higher than those of standard SVM and BPNN, and reference [43] compared LSSVM and RVM for predicting the performance of a tunnel boring machine in rock tunnel construction. These results indicate that LSSVM is also effective and feasible in fault prediction and engineering applications.

LSSVM Prediction Modeling.
Under the condition that the learning samples are known, the LSSVM algorithm maps the nonlinear samples to a high-dimensional feature space through the kernel function. The optimal decision function is constructed in the high-dimensional feature space:

$$y(x) = w^T \phi(x) + b,$$

where $w$ is the weight vector, $\phi(\cdot)$ is the nonlinear mapping associated with the kernel function, $b$ is the bias (displacement), and $x$ is the input sample. The LSSVM model controls the learning error of all samples; according to the principle of structural risk minimization, the regression problem is expressed as a constrained optimization problem:

$$\min_{w, b, e}\; J(w, e) = \frac{1}{2} w^T w + \frac{c}{2} \sum_{i=1}^{N} e_i^2, \quad \text{s.t.}\;\; y_i = w^T \phi(x_i) + b + e_i,\;\; i = 1, \ldots, N,$$

where $c$ is the regularization parameter, $e_i$ is the error (relaxation) variable of sample $x_i$, and $J(w, e)$ is the objective function. The following Lagrangian function is constructed:

$$L(w, b, e; \alpha) = J(w, e) - \sum_{i=1}^{N} \alpha_i \big[ w^T \phi(x_i) + b + e_i - y_i \big],$$

where $\alpha_i$ is the Lagrange multiplier. Taking the partial derivatives of $L(w, b, e; \alpha)$ with respect to $w$, $b$, $e$, and $\alpha$, setting them equal to 0, and eliminating $w$ and $e$ gives the linear equations

$$\begin{bmatrix} 0 & \vec{1}^T \\ \vec{1} & P + c^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix},$$

where $\vec{1}^T = [1, \ldots, 1]$, $\alpha^T = [\alpha_1, \ldots, \alpha_N]$, $y^T = [y_1, \ldots, y_N]$, $I$ is the $N \times N$ identity matrix, and $P$ is the matrix of inner products given by the kernel function, $P_{ij} = \phi(x_i)^T \phi(x_j) = K(x_i, x_j)$. The kernel function maps the nonlinear sample set in a low-dimensional space to a high-dimensional feature space, so that the nonlinear problem in the low-dimensional space becomes a linear problem in the high-dimensional space [42]. The kernel function also avoids the time-consuming inner product operation in the high-dimensional space [40]. At present, the commonly used kernel functions include the linear kernel function, the polynomial kernel function, and the radial basis function (RBF) [40][41][42]. Among them, the linear and polynomial kernel functions are global kernel functions, which have strong generalization ability but weak nonlinear approximation and learning abilities. The RBF kernel function is a local kernel function, which has strong nonlinear approximation and learning abilities.
Selecting the RBF kernel function to train the LSSVM model can give better overall performance than other types of kernel function. Therefore, the RBF kernel function is used to train the LSSVM model in this paper. The specific form of the RBF kernel function is as follows:

$$K(x, x_i) = \exp\!\left( -\frac{\| x - x_i \|^2}{2\sigma^2} \right),$$

where $\sigma$ is the parameter of the kernel function. According to formula (14), the analytical solution of LSSVM is then

$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b.$$

According to the above model theory, the nonlinear sample data are first mapped to a high-dimensional space through the kernel function and then predicted by the LSSVM regression model. LSSVM not only retains the advantages of SVM but also replaces the inequality constraints of the standard SVM with equality constraints, turning the quadratic programming problem into the solution of a system of linear equations. This improves the speed of computation, so the method can be applied in many engineering fields.
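To make the training step concrete, the sketch below solves the LSSVM linear system directly with NumPy and predicts with the resulting multipliers and bias. It assumes the RBF kernel convention exp(−‖x − x_i‖² / (2σ²)); the function names, the toy sine data, and the parameter values `c=100.0` and `sigma=1.0` are illustrative choices, not the settings tuned for the case study.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """K[i, j] = exp(-||A_i - B_j||^2 / (2 * sigma^2))."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def lssvm_fit(X, y, c, sigma):
    """Solve [[0, 1^T], [1, K + I/c]] [b; alpha] = [0; y] for b and alpha."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / c
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]                      # bias b, multipliers alpha

def lssvm_predict(X_train, b, alpha, sigma, X_new):
    """y(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Toy regression problem: recover a sine curve from 60 samples.
X = np.linspace(0.0, 2.0 * np.pi, 60)[:, None]
y = np.sin(X[:, 0])
b, alpha = lssvm_fit(X, y, c=100.0, sigma=1.0)
y_fit = lssvm_predict(X, b, alpha, sigma=1.0, X_new=X)
```

Training reduces to one call to `np.linalg.solve` on an (N + 1) × (N + 1) system, which is the practical consequence of replacing the quadratic program of the standard SVM with linear equations.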

Wavelet Correlation Analysis LSSVM-Based Crack Prediction.
The structure of the model for crack prediction of stone cultural relics based on wavelet correlation analysis least squares support vector machine is shown in Figure 1.
Firstly, WCA is used to obtain the main factors affecting the crack of stone cultural relics, and the data for these main factors are divided into a training set and a test set. Then, the RBF kernel function is selected as the kernel function of LSSVM, and the prediction model is trained repeatedly with the training set to obtain the regularization parameter c and the kernel function parameter σ, which are 86.68 and 9.06, respectively. Finally, the test set is used to verify the quality of the model.

Effect Analysis of Dimension Reduction of Independent Variables Based on WCA
In order to verify the effectiveness of the proposed wavelet correlation analysis LSSVM-based method for crack prediction (WCA-LSSVM-CP), it is compared with the existing methods based on BPNN (WCA-BPNN-CP), standard SVM (WCA-SVM-CP), and RVM (WCA-RVM-CP). The case data set consists of the environmental monitoring data of rock mass fractures provided by Shaanxi PI Culture and Education Technology Co., LTD., including temperature, humidity, rainfall, SO2, O3, NO2, volatile organic compounds, light intensity, wind speed, wind direction, and crack disease data. Figure 2 shows the site map of the North Qianfo Cliff of Dafo Temple, with the study area marked by the red box. The monitoring data of the rock fracture environment of the North Qianfo Cliff of Dafo Temple are obtained through the knowledge cloud platform, which uses Internet of Things and big data technology to monitor the disease monitoring points of stone cultural relics: the data collected from the sensors by the acquisition instrument are uploaded to the knowledge cloud platform by wireless transmission, and the long-term monitoring data are stored on the platform. The rock crack data are collected by a fracture meter sensor, as shown in Figure 3. Table 1 shows the monitoring data of the cultural relics, 460 sets in total [28]. The value ranges of the monitoring data vary greatly, and the units of the variable attributes differ. Therefore, in order to ensure the accuracy of the analysis, it is necessary to normalize the multivariate time series X(t) and Y(t) and to reduce the dimension.
After normalization of the cultural heritage monitoring data, by wavelet correlation analysis, the correlation coefficient matrix R xy between environmental variable multivariate time series and heritage disease was obtained, and the results of wavelet correlation analysis are shown in Figure 4.
In Figure 4, the abscissa is the serial number of the attributes of 13 environmental variables, and the ordinate is the wavelet correlation coefficient between the environmental variables and the cracks in cultural relics [28].
Then, according to the forward stepwise regression algorithm, the environmental variable with the largest wavelet correlation coefficient is selected as the basic attribute, and all the remaining environmental variable attributes are traversed in turn according to the root mean square error of the prediction. Finally, the environmental variable attributes numbered 2, 4, 6, 11, and 13 were obtained, which correspond to temperature, ultraviolet intensity, precipitation intensity, humidity, and dew point, respectively; together they form the multidimensional time series after dimension reduction, denoted as X″(t).

Crack Prediction Results and Discussion.
The environmental variables X″(t) after dimension reduction were used as the input to the LSSVM, BPNN, standard SVM, and RVM prediction models. The first 405 sets of data were selected as the training samples, and the last 55 sets were used as the test samples. The prediction results obtained by simulation are shown in Figures 5 and 6.

Figure 1: Structure of the crack prediction model based on WCA-LSSVM (flowchart). Steps: (1) quantify the correlation between environmental variables and crack opening using WCA; (2) determine the key factors affecting crack opening with the forward stepwise regression algorithm; (3) divide the key environmental factors into a training set and a test set, and use the training set to train the LSSVM prediction model; (4) select the radial basis function as the kernel function of the model and tune the kernel and regularization parameters of the LSSVM prediction model; (5) check whether the required prediction accuracy is achieved; (6) obtain the parameters of the prediction model and establish the LSSVM prediction model; (7) input the test set into the prediction model.

Mathematical Problems in Engineering
As can be seen from Figures 5 and 6, when the environmental variables reduced in dimension by wavelet correlation analysis are used as the input of the prediction model, there are significant differences among the five methods. The errors of WCA-BPNN-CP are concentrated within 0.2 mm, the absolute error of WCA-SVM-CP is less than 0.15 mm, and the maximum absolute error of WCA-RVM-CP is about 0.05 mm. Although LSSVM-CP and WCA-RVM-CP are close to WCA-LSSVM-CP, the median absolute error of WCA-LSSVM-CP is lower than that of LSSVM-CP and WCA-RVM-CP, which indicates that the prediction model built after WCA not only identifies the main environmental factors affecting the crack opening disease of stone cultural relics but also improves the prediction accuracy.

Prediction Performance Evaluation.
In order to further study the predictive performance of the models quantitatively, with the training samples and test samples kept unchanged, the root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and determination coefficient ($R^2$) were used to evaluate the prediction performance of the above models:

$$RMSE = \sqrt{\frac{1}{A} \sum_{i=1}^{A} (y_i - f_i)^2}, \qquad MAE = \frac{1}{A} \sum_{i=1}^{A} |y_i - f_i|,$$

$$MAPE = \frac{100\%}{A} \sum_{i=1}^{A} \left| \frac{y_i - f_i}{y_i} \right|, \qquad R^2 = 1 - \frac{\sum_{i=1}^{A} (y_i - f_i)^2}{\sum_{i=1}^{A} (y_i - \bar{y})^2},$$

where $f_i$ is the predicted value, $y_i$ is the real value, $\bar{y}$ is the average of the real values, and $A$ is the number of test samples. Table 2 summarizes the performance indexes of the five prediction models after dimension reduction of the environmental variables by wavelet correlation analysis. Moreover, the coefficient of variation (CV) is used to analyze the stability of the cultural relics crack data. The coefficient of variation is an index reflecting the degree of data dispersion [44][45][46]. By calculating and comparing the coefficients of variation of the machine learning prediction models in this paper, the performance of the prediction models is analyzed. The coefficient of variation is calculated as follows:

$$CV = \frac{\sigma^{*}}{c} \times 100\%,$$

where $\sigma^{*}$ and $c$ represent the standard deviation and average value of the fracture sequence of the cultural relics disease, respectively. The CV of the predicted values of each model is also shown in Table 2.
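The evaluation indexes are straightforward to compute. The following sketch uses their standard definitions; the function names, the use of the population standard deviation in the CV, and the sample numbers are assumptions for illustration and are not taken from Table 2.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in percent), and R^2 for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        # MAPE is undefined when a real value is zero.
        "MAPE": float(100.0 * np.mean(np.abs(err / y_true))),
        "R2": float(1.0 - np.sum(err ** 2)
                    / np.sum((y_true - y_true.mean()) ** 2)),
    }

def coefficient_of_variation(values):
    """CV in percent: standard deviation divided by the mean."""
    values = np.asarray(values, dtype=float)
    return float(100.0 * values.std() / values.mean())

# Hypothetical crack openings (mm) and model predictions.
scores = evaluate([1.0, 2.0, 4.0], [1.0, 2.5, 3.0])
cv = coefficient_of_variation([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

Comparing the CV of a model's predictions with the CV of the measured series indicates whether the model reproduces the dispersion of the real crack data, in addition to its pointwise accuracy.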
It can be seen from Table 2 that, under the same conditions, the determination coefficient $R^2$ of WCA-LSSVM-CP is close to 1, which is the best result; $R^2$ of the WCA-RVM-CP model is 0.9870, that of LSSVM-CP is 0.9823, and that of WCA-SVM-CP is 0.8583, followed by the WCA-BPNN-CP model. The RMSE, MAE, and MAPE of the WCA-RVM-CP model are roughly one quarter of those of the BPNN and SVM models; however, those of WCA-LSSVM-CP are lower still. In other words, WCA-LSSVM-CP has the smallest error. On all four performance evaluation indexes, the predicted values of WCA-LSSVM-CP are closest to the real values. Finally, the CV of the proposed method is 21.72%, which is the closest to the CV of the true values; that is to say, the stability of the WCA-LSSVM-CP predictions reflects the real stability of the cultural relics crack. Therefore, the WCA-LSSVM-CP model has better prediction performance and is more suitable for predicting the degree of cracking in stone cultural relics.

Conclusions
To address the insufficient analysis of big data on stone cultural relics, this paper proposes an LSSVM method for predicting the crack of stone cultural relics based on WCA of multivariate time series. Firstly, the cultural relics monitoring data are normalized, and then WCA and the forward stepwise regression algorithm are used to reduce the dimension of the environmental variables. On the premise that the prediction accuracy is not significantly affected, the number of independent variables and the cost of prediction are reduced. Secondly, the WCA-LSSVM-CP method is proposed to predict the crack of stone cultural relics. Finally, a real case is used for comparison, and the experimental results show that the method in this paper can predict the crack of stone cultural relics accurately.
The parameter values of the WCA-LSSVM-CP model presented in this paper apply only to the case of the Dafo Temple. As with most machine learning methods, the model parameters of LSSVM are key factors affecting the prediction performance of the model and need to be determined through data experiments for each specific scenario.
Finally, the proposed method is suitable for disease prediction of immovable cultural relics by multivariate time series environmental variables. It is universal and easy to implement, which provides a scientific theoretical reference for the preventive protection of cultural relics and can be extended to other aspects of cultural relics protection.
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.