A Hybrid Model for Prediction in Asphalt Pavement Performance Based on Support Vector Machine and Grey Relation Analysis

Pavement performance prediction is a crucial issue in big data maintenance. This paper develops a hybrid technique that combines grey relation analysis (GRA) with support vector machine regression (SVR) to predict pavement performance. The model addresses the shortcomings of traditional models, namely the consideration of a single factor, a short prediction period, and a tendency to overfit. GRA is employed to select the main factors affecting the performance of asphalt pavement, and SVR is then used to predict the performance. The data collected from the weather station installed on the Guangyun Expressway were used to verify the validity of the GRA-SVR model. Meanwhile, the model was contrasted with the grey model (GM (1, 1)), genetic algorithm optimization BP [...] 081%, −0.823%, 1.270%, and −4.569%, respectively. The study concluded that the nonlinear and multivariate prediction model established by GRA-SVR has higher precision and operability and can be used in long-period pavement performance prediction.


Introduction
Big data maintenance is a central issue in highway management. By the end of 2018, highway maintenance mileage accounted for 97.7% of the total highway mileage in China. Notably, expressways have moved from the construction period to the maintenance period, and with the spread of big data technology, roads have entered the era of big data maintenance. The performance of asphalt pavement is a vital component of maintenance management and operation, because rational maintenance decisions and the allocation of maintenance funds depend on an accurate prediction model. Therefore, the scientific establishment of a pavement performance prediction model is significant for asphalt pavement maintenance and can provide a model for big data maintenance. The pavement management system (PMS) is applied for road life cycle management; it generally uses analytical tools and statistical methods to predict pavement performance [1]. Predicting pavement performance is critical but very complex, because the performance of asphalt pavement is affected by a combination of structural design, material properties, construction quality, traffic load, natural factors, and maintenance [2].
The pavement performance prediction model is a relationship that characterizes the variation of pavement performance with time, material, and traffic load [3].
There are different methods available for the determination of pavement performance, and many scholars have attempted to develop a scientifically derived, accurate model. There are four types of prediction models: uncertainty model, certainty model, dynamic model, and bionic model [4].
(i) Uncertainty model: the commonly used model is the grey theoretical model, which requires only a small amount of data, offers high prediction accuracy, and uses a simple calculation method. Therefore, it is widely used in pavement performance prediction. For example, Zhang et al. [5], Shen and Du [6], Wang and Li [7], and Zhang and Ji [8] used this model to predict pavement smoothness and rutting. Peng et al. [9] applied the Weibull distribution to pavement performance prediction and obtained ideal results.
(ii) Certainty model: this is an empirical method that uses traditional regression to fit data from experiments and finite element mechanics in order to obtain the form and parameters of the model. For example, Sun and Liu [10] proposed a decay equation of asphalt pavement performance that was obtained through engineering experiments. Abed et al. [11] investigated the effect of variability in the thickness and stiffness of pavement layers; they used the Monte Carlo method to obtain the probability distribution function of pavement performance from these parameters. Gong et al. [12] proposed a regularized regression method to estimate the asphalt concrete moduli with data available from the long-term pavement performance (LTPP) database.
(iii) Bionic model: this model has high prediction accuracy. Yang et al. [13] used a genetic neural network model to estimate rutting and driving quality. Bianchini and Bandini [1] proposed a neuro-fuzzy hybrid model to predict the present serviceability index (PSI). Ferreira and Lima Cavalcante [14] and Beltran and Romo [15] presented the application of artificial neural networks (ANN) to pavement performance.
(iv) Dynamic model: it is based on the traditional model. For instance, Shen et al. [16] improved the traditional grey model and proposed a dynamic grey model.
To date, various pavement performance prediction models have been proposed by scholars, but the models still have defects. For example, the grey model adopts only the time factor and does not take into account other factors, such as the natural environment and traffic load, which may have the greatest impact. In addition, as the forecast period increases, the stability and accuracy of the prediction decrease. The Weibull distribution model is only suitable for small-sample prediction. The certainty model is mainly determined by factors such as the initial performance index of the asphalt pavement and the road age. It is simple and convenient to use, but because it does not take dynamic data into account, it can only predict performance over a short period.
The genetic neural network and ANN models are prone to overfitting when data are insufficient. The dynamic prediction model can make full use of later data to predict longer periods. However, because the model is based on time series, it can only consider the impact of time on pavement performance. Hence, a new model needs to be devised for pavement performance prediction.
Recently, support vector machines have been applied in various fields. Zhao et al. [20] proposed a k-means and SVM hybrid model for the development of an electric vehicle urban driving cycle. Hoang et al. [21] used SVM to recognize pavement cracks. Wang et al. [22] proposed a support vector machine online model for predicting metro ridership. Karballaeezadeh et al. [23] applied this model to the prediction of road residual life and compared it with artificial neural network (ANN) and multilayer perceptron (MLP) models. The results show that the support vector machine model has the highest accuracy.
In this study, the factors affecting the performance of asphalt pavement were first processed by GRA. The SVR, which has the advantages of structural risk minimization and strong generalization performance, was then used to establish a hyperplane as a decision surface. Finally, the asphalt pavement performance prediction model was established to provide a model that can be applied to maintenance decision-making, maintenance fund investment, and big data pavement maintenance. The structure of this paper is as follows. Section 2 introduces the main principles of GRA-SVR. Section 3 contains the modeling process of the whole model. Section 4 verifies the model with an example. Finally, the results are analyzed.

Basic Principle of Grey Relation Analysis.
The grey system theory holds that complex objective systems, whose data are ordered and discrete, must contain inherent laws [24].
There are many factors affecting the performance of asphalt pavement, but the effects of the various factors are not very clear, so the factors can be called grey.
Therefore, GRA is used to quantitatively reflect the correlation between asphalt pavement performance and the various factors. This method can find the main factors among the many factors that affect pavement performance.
The method converts the statistical data of the influencing factors in the system into geometric curves; the closer the geometry of a factor's curve is to that of the dependent variable, the greater the degree of association is [25].

Basic Principle of Support Vector Machine Regression.
SVR is a model derived from the support vector machine (SVM) proposed by Vapnik [26]. The SVM model is a machine learning method that mainly solves classification problems involving small samples, nonlinearities, and high-dimensional data [27, 28]. Its principle is based on the VC theory of statistical learning and on structural risk minimization, and the optimal solution in data mining is sought by establishing an optimal hyperplane [29]. Usually, we reduce the dimension of the sample to simplify the problem, while the SVM method does the opposite. It uses the kernel function to map the sample points to a high-dimensional or even infinite-dimensional space, where the problem can be treated as a linear one, as shown in Figure 1.
Regression is essentially similar to classification. The SVM classification model seeks a plane such that the support vectors of the two classification sets, or all the data, are farthest from the classification plane, whereas the SVR model seeks a regression plane such that all data of a collection are closest to the plane, as shown in Figure 2. The SVR predicts the output for the test data by establishing a nonlinear relationship between the training data and the support vectors. Most of the influencing factors of asphalt pavement performance are nonlinear. The specific method is as follows.
Assume the sample set $(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)$, with $x \in R^n$ and $y \in R$. Then, y and x in the sample set can be expressed as follows [2]:

$$f(x) = w \cdot x + b, \tag{1}$$

where w and b are the coefficients of the hyperplane.
If the original data fit well with the support vector machine regression, then the minimization of $\frac{1}{2}\|w\|^2$ is as follows [2]:

$$\min \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad |y_i - (w \cdot x_i + b)| \le \varepsilon, \quad i = 1, \ldots, l, \tag{2}$$

where ε is a positive number. Equation (1) is transformed into (3) by introducing Lagrange multipliers [2]:

$$f(x) = \sum_{i=1}^{l} (a_i - a_i^*)(x_i \cdot x) + b, \tag{3}$$

where $a_i$ and $a_i^*$ are the Lagrange multipliers, which take a value of zero for most samples; the samples with nonzero multipliers are the support vectors.
The above process is the linear regression principle of SVR, but the effects of factors such as rainfall, traffic volume, maximum temperature, and minimum temperature on pavement performance are nonlinear. When dealing with the nonlinear problem, the sample $x_i$ is mapped to a high-dimensional space H by $\psi: x \to H$, and an optimal hyperplane is constructed in that space. To avoid the "curse of dimensionality", the inner product in H is computed from the original-space inputs even when ψ is unknown. The kernel function $K(x_i, x_j) = \psi(x_i) \cdot \psi(x_j)$ can be obtained when it satisfies Mercer's condition [30]. At the same time, Lagrange multipliers are introduced to obtain equation (4) [31]. Finally, the transformed regression function [31] is as follows:

$$f(x) = \sum_{i=1}^{l} (a_i - a_i^*) K(x_i, x) + b. \tag{5}$$

This method can avoid the overfitting caused by traditional methods. SVR nonlinear regression controls the fitting process by increasing the dimension. The high generalization performance, which is closely related to the choice of kernel function, is a major advantage of SVR.
Commonly used kernel functions are listed as follows [32]:
(1) Linear kernel function: $K(x_i, x_j) = x_i \cdot x_j$
(2) Polynomial kernel function: $K(x_i, x_j) = (\mu\, x_i \cdot x_j + r)^p$
(3) RBF kernel function: $K(x_i, x_j) = \exp(-\mu \|x_i - x_j\|^2)$
(4) Sigmoid kernel function: $K(x_i, x_j) = \tanh(\mu\, x_i \cdot x_j + r)$
Here μ, r, and p are parameters of the kernel function. Each type of kernel function has different advantages and disadvantages: ① Linear kernel functions are used to generalize linear samples. ② Polynomial kernel functions are mostly used to process text data. ③ Although the Sigmoid kernel function has higher accuracy, it is complicated, which increases the complexity of the whole model. Therefore, in this paper, the RBF kernel function is used for support vector machine regression prediction.
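As an illustrative sketch, the four kernel types can be written in plain Python. The function names and default parameter values are assumptions made for this example; μ, r, and p follow the symbols used above, and the expressions are the standard forms of these kernels, which may differ in detail from those given in [32].

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    # K(u, v) = u . v
    return dot(u, v)

def polynomial_kernel(u, v, mu=1.0, r=0.0, p=3):
    # K(u, v) = (mu * u . v + r) ** p
    return (mu * dot(u, v) + r) ** p

def rbf_kernel(u, v, mu=1.0):
    # K(u, v) = exp(-mu * ||u - v||^2); equals 1 when u == v
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-mu * sq_dist)

def sigmoid_kernel(u, v, mu=1.0, r=0.0):
    # K(u, v) = tanh(mu * u . v + r)
    return math.tanh(mu * dot(u, v) + r)
```

The RBF kernel depends only on the distance between the two samples, so a single width parameter (written g in the parameter selection step) controls how quickly similarity decays.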

Construction of GRA-SVR Asphalt Pavement Performance Model

Selection of the Best Parameters.
It is important to select an appropriate penalty parameter c and kernel function parameter g to ensure the accuracy of the entire model when using SVR for prediction. The cross-validation (CV) method, a statistical analysis method for verifying the performance of a model, is generally adopted to solve this problem. The principle is to group the original data and divide them into verification and training sets. In this way, it is possible to effectively avoid the states of underlearning and overlearning and ultimately obtain the accuracy. Common CV methods are as follows:
(1) Hold-out method: the method randomly divides the data into two sets: one is the training set used to train the model, and the other is the verification set used to verify the model [20]. The final accuracy is the performance metric of the model.
(2) LOO-CV: assuming there are N samples in the original data, each sample in turn serves as an independent verification set and the remaining N−1 samples form the training set, so N models are obtained (hence the method is also called N-CV). The average accuracy on the verification sets is used as the performance indicator for the model. However, due to its high computational cost, the method is difficult to use in practice.
(3) K-CV: the original data are equally divided into K groups. The data of each group are used as the verification set once, and the data of the remaining K−1 groups are used as the training set; therefore, K models are obtained. Then, the average of the classification accuracy calculated on the verification sets of those K models is used as the performance index of the model [33]. This method is more accurate because it can effectively avoid the states of underlearning and overlearning.
After comparing the three methods, the K-CV method is adopted to cross-validate and select the best penalty parameter c and kernel function parameter g. The specific method is as follows. Firstly, the parameters c and g are limited to a specific range, and then the K-CV method is applied to the training set for each pair in the range to obtain the accuracy. Finally, the parameters c and g that give the training set the highest accuracy are selected as the optimal parameters. The procedure can be implemented using the LIBSVM 3.20 tool.
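The selection procedure can be sketched in pure Python as a grid search scored by K-fold cross-validation. To keep the example self-contained, an RBF kernel ridge regressor is used as a stand-in for SVR (an assumption of this sketch, not the method of the paper, which relies on LIBSVM); here 1/c acts as the regularization strength and g as the kernel width. All function names are hypothetical.

```python
import math

def rbf(a, b, g):
    return math.exp(-g * sum((p - q) ** 2 for p, q in zip(a, b)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_kernel_ridge(X, y, c, g):
    """Stand-in regressor: solves (K + I/c) alpha = y and returns a predictor."""
    n = len(X)
    K = [[rbf(X[i], X[j], g) + (1.0 / c if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y)
    return lambda x: sum(a * rbf(xi, x, g) for a, xi in zip(alpha, X))

def kfold_mse(X, y, c, g, k):
    """Mean squared validation error over k folds (sample i goes to fold i % k)."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        model = fit_kernel_ridge([X[i] for i in train], [y[i] for i in train], c, g)
        total += sum((model(X[i]) - y[i]) ** 2 for i in test)
    return total / n

def grid_search(X, y, c_exps, g_exps, k=4):
    """Try every (2**ce, 2**ge) pair and keep the one with the lowest CV error."""
    best = None
    for ce in c_exps:
        for ge in g_exps:
            err = kfold_mse(X, y, 2.0 ** ce, 2.0 ** ge, k)
            if best is None or err < best[0]:
                best = (err, 2.0 ** ce, 2.0 ** ge)
    return best  # (error, c, g)
```

A call such as `grid_search(X, y, range(-6, 7), range(-8, 9))` would scan a power-of-two grid; the search can then be repeated on a narrower range with a finer exponent step.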

Construction of Asphalt Pavement Performance Model.
The pavement performance is affected by many factors, and the influence of these factors on performance is uncertain and nonlinear. Hence, the performance and its factors together form a grey system. Therefore, the grey correlation analysis can be used as an attribute processor to select several important influencing factors, and then the SVR is used to perform the regression prediction. The comprehensive GRA-SVR model is established to predict the trend of pavement performance under the influence of various factors; the specific modeling process is shown in Figure 3.
Specific steps are as follows:
(1) Select dependent and independent variables.
(2) Establish a raw data matrix of the sequences $x_i(k)$, where $x_i(k)$ represents the k-th value of the i-th influencing factor and $x_0(k)$ denotes the corresponding value of the dependent variable.
(3) Normalize the data.
(4) Calculate the difference sequence [34]:

$$\Delta_i(k) = |x_0(k) - x_i(k)|. \tag{6}$$

(5) Obtain the largest and smallest differences of the sequence [34] as in equation (7). Write the maximum value as M and the minimum value as N:

$$M = \max_i \max_k \Delta_i(k), \qquad N = \min_i \min_k \Delta_i(k). \tag{7}$$

(6) Calculate the correlation coefficient of each sample [35]:

$$\zeta_i(k) = \frac{N + \xi M}{\Delta_i(k) + \xi M}, \tag{8}$$

where ξ is called the resolution coefficient. When ξ ≤ 0.5463, the resolution is the best. Usually, the value of ξ is 0.5, which is also taken in this paper.
(7) Calculate the correlation between each influencing factor and the system [35]:

$$\gamma_i = \frac{1}{n} \sum_{k=1}^{n} \zeta_i(k), \tag{9}$$

where n is the number of values in each sequence.
(8) Choose the factors that have a greater influence on pavement performance.
(9) To improve the accuracy and training speed of the model and to prevent large values from swamping small ones during the calculation, normalize the data to the interval [0, 1].
(10) Studies have shown that the RBF kernel has high precision [36, 37], so this paper selects the RBF kernel function to predict the performance.
(11) Use the K-CV method to cross-validate and select the best penalty parameter c and kernel function parameter g.
(12) Use the optimal parameters for SVR fitting to obtain the prediction data.
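Steps (3) to (7) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: min-max scaling is assumed for the normalization step (the text does not name the method), the function and variable names are hypothetical, and the coefficient is computed as (N + ξM) / (Δ + ξM) with ξ = 0.5, where Δ is the absolute difference between the normalized reference and factor sequences.

```python
def min_max(seq):
    """Scale a sequence to [0, 1]; a constant sequence maps to zeros."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(v - lo) / (hi - lo) for v in seq]

def grey_relation_degrees(reference, factors, xi=0.5):
    """Return {factor name: correlation degree} against the reference sequence.

    reference: values of the dependent variable (e.g. a yearly index)
    factors:   dict mapping a factor name to a sequence of the same length
    xi:        resolution coefficient
    """
    x0 = min_max(reference)
    # Difference sequence for every factor
    diffs = {
        name: [abs(a - b) for a, b in zip(x0, min_max(vals))]
        for name, vals in factors.items()
    }
    all_diffs = [d for seq in diffs.values() for d in seq]
    big_m, small_n = max(all_diffs), min(all_diffs)
    if big_m == 0.0:
        # Every factor coincides with the reference after normalization
        return {name: 1.0 for name in factors}
    degrees = {}
    for name, seq in diffs.items():
        coeffs = [(small_n + xi * big_m) / (d + xi * big_m) for d in seq]
        degrees[name] = sum(coeffs) / len(coeffs)
    return degrees

# Hypothetical toy data, not taken from the paper
toy = grey_relation_degrees(
    [1, 2, 3, 4],
    {"proportional": [10, 20, 30, 40], "shuffled": [4, 1, 3, 2]},
)
```

Factors can then be ranked by their degree, and those above a chosen threshold are kept as SVR inputs.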

Data Acquisition.
This paper is based on the highway from Guangzhou to Yunfu (Guangyun highway), where a weather station was installed in 2010 to collect climate data including road temperature, humidity, wind speed, and solar radiation.
The installation details and pavement structure are shown in Figures 4 and 5. The pavement temperature is measured with a ZDR-41 temperature sensor, and the subgrade temperature and humidity are measured with a 5TE sensor (see Figure 6). The climate of Guangdong province is humid and the temperature is extremely high, rising to 41 °C. Under the influence of large traffic volume, the rutting is serious, as shown in Figure 7.
The RDI prediction models GRA-SVR, PPI, GA-BP, and GM (1, 1) were established to analyze the accuracy of each model. They were based on the RDI, maintenance funds, traffic volume, and data collected by the weather station from 2011 to 2018 (see Table 1 for the survey results). Pavement structure and materials should also be considered as factors in performance prediction. Usually, the pavement structure needs to be expressed as a numerical value. To address this issue, the structural number [12, 38-40] is usually adopted. However, it needs to be calculated in two cases as follows:
(i) Different structures: in this case, the thickness and material of each layer of the road are different. The structural number (SN) [41] is adopted according to the AASHTO guide for design of pavement structures. Road network level performance prediction can apply this case. The specific calculation method is as follows:

$$SN = \sum_{i} a_i D_i m_i,$$

where $a_i$ is the i-th layer coefficient, which needs to be obtained through experiments, $D_i$ is the i-th layer thickness, and $m_i$ is the i-th layer drainage coefficient.
(ii) Same structure: the performance of the pavement material can be affected by the environment, which changes the structural bearing capacity. The pavement structural bearing capacity can be expressed by the pavement structure strength ratio (SSR) [42]. The specific calculation method is

$$SSR = \frac{l_0}{l},$$

where $l_0$ is the pavement deflection standard value (0.01 mm) and $l$ is the measured pavement deflection value (0.01 mm).
This paper relies on an engineering project with only one pavement structure, so the SSR is calculated to represent the influence of pavement structure on pavement performance.
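Both structural indicators reduce to short arithmetic, sketched below. The layer values in the usage line are hypothetical and for illustration only; the SN form (sum over layers of a_i · D_i · m_i) follows the AASHTO definition, while the SSR is assumed here to be the ratio of the standard deflection value to the measured deflection value.

```python
def structural_number(layers):
    """AASHTO-style structural number.

    layers: iterable of (a_i, D_i, m_i) tuples, i.e. layer coefficient,
            layer thickness, and drainage coefficient for each layer.
    """
    return sum(a * d * m for a, d, m in layers)

def strength_ratio(l0, l):
    """Assumed form of the structure strength ratio: standard deflection l0
    divided by measured deflection l (both in 0.01 mm)."""
    return l0 / l

# Hypothetical two-layer example (coefficients are not from the paper)
sn_example = structural_number([(0.44, 4.0, 1.0), (0.14, 8.0, 0.9)])
```

Under this assumed form, a larger measured deflection yields a smaller SSR, that is, a weaker structure.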

Grey Relation Analysis.
The correlation of the data in Table 1 can be analyzed to obtain the correlation degree of each influencing factor, as shown in Table 2.
The effects of the various factors on rutting are sorted by correlation degree. Generally, the greater the degree of relevance, the better the correlation of the factor with the main direction of system development, that is, the greater the influence of this factor on the evaluation index. When γ > 0.8, the factor is well correlated; when γ = 0.6 to 0.8, the correlation is good. We can see that γ of 18 factors is greater than 0.6, indicating that these factors have an impact on the rutting. Among the 19 factors, γ of 12 factors is greater than 0.8, indicating that these 12 factors have a strong influence on the formation of rutting. So, the factors with the greatest impact were selected to establish the model, and the other factors were removed. The selected results are as follows: equivalent single axle loads > maintenance funds > pavement structure strength ratio > mean value of soil moisture > highest temperature in the middle surface > highest temperature in the road surface > annual cumulative total radiation > annual average rainfall > lowest temperature in middle surface > highest temperature in the upper surface > lowest temperature of upper surface > highest temperature in lower surface. The following can be observed from the above analysis:
(1) The primary factor in the formation of rutting is the equivalent single axle loads. The greater the equivalent single axle loads are, the more serious the rutting is. The reason is that, under the action of traffic load, large shear stress is generated in the asphalt pavement, which causes irreversible cumulative deformation in the surface layer.
(2) The maintenance funds have a significant repairing effect on the rutting. For example, in this section of the highway, the maintenance funds were RMB 81,500 in 2013. Although the traffic volume and rainfall increased, the rutting was significantly improved in 2014.

Journal of Advanced Transportation
(3) The degree of relevance of SSR is 0.9301, which shows that SSR has a strong impact on the rutting. The specific reason is that water, solar radiation, and temperature affect the pavement material, so the structural bearing capacity becomes insufficient, resulting in the occurrence of rutting.
(4) The annual cumulative radiation ages the asphalt and accelerates the formation of rutting. After the asphalt ages, the overall shear resistance of the asphalt surface layer is reduced, resulting in a decrease in rutting resistance. For example, the annual cumulative radiation was the largest in 2015, and the rutting in 2016 was more serious.
(5) The maximum shear stress generally occurs in the middle surface layer, and rainfall and wind speed accelerate heat dissipation at the highest temperature of the environment and road surface. For these reasons, the influence of the highest temperature of the middle layer on the formation of rutting is greater than that of the highest temperature of the road surface and the upper layer.
(6) Under the action of traffic load, the water that infiltrates the asphalt surface layer from the soil and from rainfall becomes high-pressure water, which weakens the bond between asphalt and aggregate, resulting in lower pavement strength and lower resistance to rutting.
(7) The lowest temperature of the road surface can cause other distresses in the asphalt pavement, which indirectly lead to the occurrence of rutting.
The dimensionally reduced data are normalized by software, and the processing results are shown in Table 3.

Penalty Parameter Selection.
In this paper, the best penalty parameter c and kernel function parameter g are selected by K-CV cross-validation (see Figure 8). The abscissa indicates the base 2 logarithm of c, and the ordinate indicates the base 2 logarithm of g. Contour lines indicate the errors over the range of c and g. When the error is the smallest, the corresponding c and g are the best. First, c and g are initially selected. The range of c is 2^(−6) to 2^(6) and that of g is 2^(−8) to 2^(8). When the error is 0.0572, the optimal penalty parameter is c = 64.0 and g = 0.0039. After this primary selection, the range of values for c can be reduced to 2^(−3) to 2^(2) and that of g to 2^(−4) to 2^(4) (see Figure 9). At the same time, the interval between the contour lines in the two-dimensional and three-dimensional views is reduced. When the error is 0.0605, the optimal penalty parameter is c = 4.0 and g = 0.0884.
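The reported optima can be related back to the power-of-two grids with a short check. The sketch below builds the coarse and refined grids and converts the reported values to base 2 exponents; the exponent step of 0.5 for the refined grid is an assumption suggested by g = 0.0884 ≈ 2^(−3.5), and the helper name is hypothetical.

```python
import math

def log2_grid(lo_exp, hi_exp, step=1.0):
    """Return [2**lo_exp, ..., 2**hi_exp] with exponents spaced by `step`."""
    count = int(round((hi_exp - lo_exp) / step)) + 1
    return [2.0 ** (lo_exp + i * step) for i in range(count)]

coarse_c = log2_grid(-6, 6)            # primary selection range for c
coarse_g = log2_grid(-8, 8)            # primary selection range for g
fine_c = log2_grid(-3, 2, step=0.5)    # refined range for c (assumed step)
fine_g = log2_grid(-4, 4, step=0.5)    # refined range for g (assumed step)

# Values reported in the text: (c, g) for the primary and final selections
reported = {"primary": (64.0, 0.0039), "final": (4.0, 0.0884)}
for name, (c, g) in reported.items():
    print(name, "log2(c) =", round(math.log2(c), 2),
          "log2(g) =", round(math.log2(g), 2))
```

Both reported pairs sit on grid points: c = 64.0 and g ≈ 0.0039 lie at the upper and lower edges of the coarse grid, respectively, and c = 4.0 lies at the upper edge of the refined range for c.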

Results and Discussion
The GRA-SVR, GM (1, 1) [43], GA-BP [44], and PPI models were applied and compared in predicting the RDI of 2018, based on a training set consisting of the various factors and the RDI from 2011 to 2017. The PPI model [10] is as follows:

$$PPI = PPI_0 \left\{ 1 - \exp\left[ -\left( \frac{\alpha}{y} \right)^{\beta} \right] \right\},$$

where PPI is the performance index; $PPI_0$ is the initial performance index; y is the road age; and α and β are model parameters. In this paper, $PPI_0$ = 94, y = 8, α = 13.2, and β = 1.409. The prediction results are shown in Table 4, the accuracy comparison is shown in Table 5, and the corresponding variation trends and actual values of the different models are shown in Figures 10 and 11.
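For orientation, the PPI model can be evaluated with the stated parameters. The sketch assumes the decay equation takes the form PPI = PPI_0 · {1 − exp[−(α / y)^β]}; the function name is hypothetical, and the computed value is only as reliable as that assumed form.

```python
import math

def ppi(ppi0, age, alpha, beta):
    """Assumed decay equation: ppi0 * (1 - exp(-(alpha / age) ** beta))."""
    return ppi0 * (1.0 - math.exp(-((alpha / age) ** beta)))

# Parameters stated in the text: PPI_0 = 94, y = 8, alpha = 13.2, beta = 1.409
predicted_2018 = ppi(94.0, 8.0, 13.2, 1.409)
print(round(predicted_2018, 2))
```

Because the only time-varying input is the road age, this model yields a smooth decay curve fixed by the initial index, which is consistent with the later observation that it cannot react to traffic, climate, or maintenance data.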
The evaluation parameters in Table 5 for the four models in predicting RDI show the following. The GRA-SVR and GA-BP models both showed good performance in terms of the overall correlation and the deviation of the predicted value from the true value. However, with respect to the relative error in 2018, GRA-SVR is the best, followed by GM (1, 1). Figure 11 shows the relative errors of the predicted and true values for the four models from 2011 to 2018. It can be observed that the relative error of the GA-BP model is the smallest from 2011 to 2015, but it is higher than that of GRA-SVR in 2016 and higher than that of GM (1, 1) in 2018. This is because the model is prone to overfitting for small samples, resulting in reduced prediction accuracy. The trends of the predicted and actual RDI values for the different models are depicted in Figure 10(a). It can be seen that the GRA-SVR and GA-BP models display nonlinear trends, which are close to the actual value. The other two models show a linear relationship, which differs from the actual value.
All four models have good accuracy in short-period prediction (see Figure 10(b)), but the accuracy changes as the prediction period increases (see Figure 10(c)). The GRA-SVR model has the highest prediction accuracy because the old data are replaced by the new prediction data to form the new training set. The GA-BP model takes second place. Thirdly, the GM (1, 1) model used only the data of 7 years, and its accuracy decreased over time because new data were not replenished. The PPI model has the worst prediction accuracy, because the model only uses the first-year data for prediction. As the prediction period increases, the controllability of the model decreases. In order to verify the accuracy of the model, it was also applied to the prediction of the pavement surface condition index (PCI) and the pavement skidding resistance index (SRI). The relative errors were −0.115% and 0.111%, respectively.
In the GRA-SVR and GA-BP modeling process, more of the important factors that affect the production of rutting are considered, so the modeling process is more complex than that of the other two models, but the prediction results are stable. The PPI model considers only the age and regional conditions and does not use the main factors affecting the pavement performance; therefore, its prediction accuracy is lower. The GM (1, 1) model considers only the time factor, and its prediction accuracy depends greatly on the accuracy of the annual data. If the data of a certain year deviate, the whole system trend will have a large error; the ease of operation of this model lies between that of the other models. Therefore, the GRA-SVR model is suitable for multivariate, long-period, and nonlinear prediction of pavement performance. The accuracy, prediction period, and operability of the four models are compared and analyzed. The results are shown in Table 6.
Overall, our study establishes a model that offers better performance than the other models. However, there are also limitations. In future work, we want to choose the best parameters with better methods, including the genetic algorithm and particle swarm optimization. These algorithms are also widely used in other fields. If we find a better optimization method, we can make the prediction accuracy higher. We will build a database with more road information. Then, the GRA-SVR model at the computing terminal will be used to predict the performance, and a decision model will be applied to maintenance decisions. Finally, the results will be uploaded to the pavement management system (see Figure 12). We firmly believe that this will have far-reaching implications for road maintenance projects.

Conclusion
In this study, a GRA-SVR hybrid predictive model, combining the grey correlation analysis with support vector machine regression, was proposed and applied for the first time to predict the performance of asphalt pavement. The main conclusions are drawn as follows:
(1) The main factors, including equivalent single axle loads, maintenance funds, highest temperature in the middle surface, pavement structure strength ratio, average value of soil moisture, highest temperature in the road surface, lowest temperature in the road surface, highest temperature in the upper surface, annual average rainfall, annual cumulative total radiation, lowest temperature of upper surface, highest temperature in lower surface, lowest temperature in lower surface, and annual maximum wind speed, are well correlated with pavement performance.
(2) Compared with other models, the GRA-SVR model is highly accurate and time-independent, which makes it suitable for short and long period predictions.
In conclusion, the GRA-SVR model is applicable to multivariate, long-period, and nonlinear prediction of pavement performance and is restricted by the amount of data. It is reliable for asphalt pavement maintenance decision-making. At the same time, this model can also be applied to big data road maintenance prediction.

Figure 5: Weather station layout. (a) Drilling cores of asphalt pavement, (b) installation of temperature sensor in a pavement structure, (c) installation of temperature sensor on road surface, and (d) bracket mounting.

Figure 8: Best primary selection of penalty parameters. (a) Parameters c and g versus the accuracy rate in two dimensions; (b) parameters c and g versus the accuracy rate in three dimensions.

Figure 9: Best final selection of penalty parameters. (a) Parameters c and g versus the accuracy rate in two dimensions and (b) parameters c and g versus the accuracy rate in three dimensions.

Figure 10: Trend charts of RDI predicted value of different models.

Figure 11: Trend charts of the actual value of different models.

Table 3: Standardized data after normalization.

Table 4: Comparison of predicted and actual values of RDI.

Table 5: Precision comparison of forecast results for the three models.

Table 6: Performance comparison of four models.
★ means performance in general, ★★ means better performance, and ★★★ means the best performance.