Soft Computing Models to Predict Pavement Roughness: A Comparative Study

Pavement roughness, as a critical determinant of public satisfaction, can play a major role in the allocation of road or highway resources among competing pavement resurfacing projects. With this in mind, the aim of the present paper is to develop an accurate model for the prediction of pavement roughness in terms of the International Roughness Index (IRI) using artificial neural networks (ANNs) and support vector machines (SVMs). The modeling is based on pavement roughness data collected yearly for a high-volume motorway over a seven-year period. The comparative study of the developed models concludes that the ANN model performs slightly better than the SVM model in terms of prediction accuracy. Further, the analysis results provide evidence that both models are capable of accurately predicting pavement roughness; hence, they are deemed useful for supporting decision making on pavement maintenance and rehabilitation strategies.


Background and Objectives
Road pavements deteriorate under the combined effect of traffic loading and environmental conditions. Performance is a general term describing how pavement condition changes over time or how well a pavement fulfils its intended function, offering at least an acceptable level of service to road users over its design life. Over the past few decades, road agencies have established performance indicators to assess the effectiveness and efficiency of their service provision. Among others, an important indicator of pavement performance is ride quality. This is a rather subjective measure of performance that depends on (i) the physical properties of the pavement surface, (ii) the mechanical characteristics of the ride vehicle, and (iii) the standards of the road users concerning the acceptability of the perceived ride quality. Because of the subjectivity of ride quality assessment, many researchers have worked to establish an objective indicator of pavement performance. This work started in the early 1960s with the development of the present serviceability index (PSI) [1,2]; nowadays, the International Roughness Index (IRI) seems to have the broadest application for the assessment of ride quality [3][4][5].
IRI is considered to be a good indicator of pavement performance with respect to road roughness. It was developed to be linear, portable, and stable with time. It is portable since it can be measured with a wide range of equipment giving the same results, and stable with time since it is defined as a mathematical transform of a measured profile; thus, it is affected neither by the measurement procedure nor by the characteristics of the vehicle used for profile measurement [6]. IRI is based on the concept of a true longitudinal profile, rather than the physical properties of a particular type of instrument [7].
Following the identification and quantification of ride quality through IRI, several studies have been performed to identify the variables affecting roughness development over time, especially where a pavement management system (PMS) is concerned. Pavement management activities include the collection of roughness data in terms of IRI, which are used as input into the planning and prioritization of road infrastructure work programs [8,9]. The IRI data also provide input into a variety of engineering-economic analyses, which assist in determining the future road network condition under a range of infrastructure funding scenarios.
Prediction models are indispensable components of a PMS, and their prediction accuracy is vital for the efficient management of the road infrastructure [10]. For instance, by minimizing the prediction error of pavement roughness, agencies can achieve significant budget savings through timely intervention and accurate planning. The fundamental calculation of future pavement condition is commonly based on a pavement age versus pavement roughness relationship. However, the roughness-age relationships commonly used in pavement deterioration and economic modeling do not take into account the pavement's historical performance; instead, an "average" rate of roughness progression is assigned to each pavement based on its current age or current roughness measurement. In addition, the analysis of available roughness time series data with regression techniques that account for the effect of time on roughness progression is usually extremely difficult and requires an a priori definition of the form of the regression equation. Therefore, simple statistical approaches such as linear regression may not be an appropriate means to model and predict pavement roughness, and other approaches such as soft computing methods should be considered for this purpose.
Several researchers have used the artificial neural network (ANN) technique to develop predictive models for pavement structural and functional conditions. For instance, a recent study demonstrated that the ANN technique can model the in situ pavement structural condition, in terms of critical tensile strains ε_r at the bottom of asphalt layers, more accurately than traditional regression methods. This model, combined with the establishment of ε_r trigger values, is deemed useful for analyzing, ranking, and prioritizing sections whose reduced asphalt layer fatigue life calls for maintenance and rehabilitation treatments [11]. Further, with regard to pavement functional condition assessment, it has been demonstrated that an ANN model based on the backpropagation learning method and on experimental measurements obtained from pavement surfaces can be an effective and accurate way to predict surface roughness [12][13][14][15][16]. In addition, the support vector machine (SVM) method has emerged as a promising soft computing technique derived from statistical learning theory, although it is not often used to solve pavement engineering problems [17].
In this paper, pavement roughness data collected yearly for a high-volume motorway over a seven-year period are modeled using ANN and SVM techniques. A comparative study of the developed models is intended to investigate and document their capability and accuracy in predicting pavement surface roughness, in response to the need for effective and practical tools for planning pavement maintenance activities.

Methodology
Artificial Neural Networks.
Artificial neural networks, among soft computing methods, have been applied to solve complex problems in the field of civil engineering because of their flexibility and adaptability [18][19][20][21][22]. ANNs offer quite accurate solutions for modeling complex data sets with nonlinear behaviour and are thus capable of overcoming many of the limitations of traditional methods. An ANN is a computational intelligence system that mimics the behaviour of the human brain; it consists of neurons, which are simple, interconnected, and adaptive processing units [23].
An artificial neuron receives information (signals) from many sources referred to as inputs (x_i), processes it, and then transfers the filtered signal to other neurons. The inputs are received through synapses (connections) and scaled by an adjustable factor called the weight (w_ij). Hence, the signal transferred through the connection is equal to a portion of the original signal. The larger the value of a weight, the stronger the incoming signal and hence the more influential the corresponding input. The weighted signals are summed and passed to the activation function (f), which is a preselected transfer function such as the hyperbolic tangent, logistic sigmoid, or exponential function. A filtered output (y_i) is then generated by applying the transfer function to this weighted sum. There are different types of ANNs, including backpropagation-type neural networks, which are among the ANNs most used in civil engineering applications because of their power, versatility, and simplicity [24].
This ANN type refers to multilayered, feed-forward neural networks trained using backpropagation algorithms. Such networks can approximate any continuous function; hence, they are considered universal approximators [25]. Generally, an ANN consists of an input layer, an output layer, and one or several hidden layers in between, as shown in Figure 1. Each layer contains a set of neurons, which receive signals from many inputs and are connected with each other. The training process involves appropriately setting the ANN architecture, namely, determining the optimum number of neurons in the hidden layer(s) of the network, with the aim of "learning" the mapping defined by a representative set of input-output data. For this purpose, input data examples are propagated forward through each layer of the network to emerge as outputs. The response error (between the network output and the desired output) is then distributed backwards, and the synaptic weights are adjusted to minimize the error. Usually, a statistical criterion is used for the training of a neural network, and the chosen criterion is optimized by adapting the network's synaptic weights (w). This process is called supervised learning and is illustrated in Figure 2.
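The neuron and forward-propagation steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the network used in the study: the bias term, the linear output neuron, the function names, and all weight values are assumptions introduced here.

```python
import math

def logistic(z):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias=0.0, activation=math.tanh):
    """Weighted sum of the inputs passed through a transfer function.

    The bias term is an assumption; the text only mentions weighted inputs.
    """
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)

def forward_pass(inputs, hidden_weights, output_weights):
    """One tanh hidden layer followed by a single linear output neuron."""
    hidden = [neuron_output(inputs, w) for w in hidden_weights]
    return neuron_output(hidden, output_weights, activation=lambda z: z)

# Hypothetical inputs and weights, chosen only for illustration.
x = [1.0, 2.0]
hidden_w = [[0.5, -0.25], [1.0, 0.0]]
output_w = [2.0, 1.0]
y = forward_pass(x, hidden_w, output_w)
```

Backpropagation would then adjust `hidden_w` and `output_w` to reduce the difference between `y` and the desired output; that step is omitted here for brevity.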
Generally, ANNs may suffer from overtraining. A common method to avoid overtraining is to split the data set into training, testing, and validation subsets and to evaluate the generalization performance of the model with data that are not included in the training set.
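A three-way split of this kind can be sketched as follows. The default fractions mirror the 50%/25%/25% split reported later in the paper for the ANN models; the random shuffling, the seed, and the sample count of 500 (a 5 km section sampled every 10 m) are assumptions made for illustration.

```python
import random

def split_dataset(samples, train_frac=0.50, test_frac=0.25, seed=0):
    """Shuffle and split samples into training, testing, and validation subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains
    return train, test, validation

# Hypothetical sample identifiers standing in for the roughness records.
train, test, validation = split_dataset(range(500))
```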

Support Vector Machines.
In the 1990s, a new type of learning algorithm was developed, based on results from statistical learning theory by Vapnik [26]: the support vector machine. This gave rise to a new class of theoretically elegant learning machines that use a central concept of SVMs, namely kernels, for a number of learning tasks. Kernel machines provide a modular framework that can be adapted to different tasks and domains by the choice of the kernel function (e.g., linear, polynomial, and radial basis function) and the base algorithm. SVM is a type of learning algorithm based on statistical learning theory and the structural risk minimization (SRM) principle, which can be adjusted to map the input-output relationship for a nonlinear system. Initially, SVMs were used as classifiers focused on optical character recognition and object recognition tasks. However, with the introduction of the ε-insensitive loss function, SVMs have been extended to solve nonlinear regression estimation problems, and they have been shown to exhibit excellent performance [27].
In particular, for regression approximation, SVM principles can be summarized as follows: (a) SVMs use a set of linear functions defined in a high-dimensional space, (b) SVMs carry out risk minimization using loss functions, and (c) SVMs use a risk function consisting of the empirical error and a regularization term derived from the SRM principle. These characteristics are assumed to be the main difference from regression by conventional ANNs. It is this difference that equips SVM with a greater ability to generalize, as prediction error and model complexity are simultaneously minimized, which is the goal in statistical learning.
This study uses the SVM as a regression technique by introducing an ε-insensitive loss function. In this section, a brief introduction on how to construct an SVM for a regression problem is presented and shown schematically in Figure 3. More details can be found in many publications [26][27][28].
Suppose we are given training data {(x_1, y_1), ..., (x_n, y_n)} ⊂ X × ℝ, where X denotes the space of the input patterns (e.g., X = ℝ^d). In ε-SV regression [26], the goal is to find a function f(x) that has at most ε deviation from the actually obtained targets y_i for all the training data and, at the same time, is as flat as possible. In other words, errors are ignored as long as they are less than ε, but any deviation larger than this is not accepted. For instance, when ε-SV regression is applied with radial basis functions, f(x) is built from kernels evaluated at the vectors x_j, which are inputs from the training data. The vector of unknown parameters w is determined by minimizing a regularized risk function, in which the parameter C > 0 determines the trade-off between the flatness of f(·) and the amount up to which deviations greater than ε are tolerated. The dual of this optimization problem is solved using convex programming techniques [27].
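For orientation, the textbook form of ε-SV regression with an RBF kernel is sketched below. It follows the standard formulation found in SVM tutorials and is not necessarily identical to the equations of the original paper; the feature map φ, the offset b, the dual coefficients α_j and α_j^*, and the kernel width γ are symbols assumed here.

```latex
% Regression function in feature space and its kernel expansion
f(x) = \langle w, \phi(x) \rangle + b
     = \sum_{j=1}^{n} (\alpha_j - \alpha_j^{*})\, K(x_j, x) + b,
\qquad
K(x_j, x) = \exp\!\left(-\gamma \lVert x - x_j \rVert^{2}\right)

% Regularized risk minimized over w (flatness term plus penalized deviations)
\min_{w,\,b}\;\; \frac{1}{2} \lVert w \rVert^{2}
  + C \sum_{i=1}^{n} \lvert y_i - f(x_i) \rvert_{\varepsilon},
\qquad
\lvert \xi \rvert_{\varepsilon} = \max\left(0,\; \lvert \xi \rvert - \varepsilon\right)
```

The first term rewards flatness of f, while the second penalizes only those deviations that exceed ε, which is the trade-off controlled by C.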

Models' Performance Measures.
The performance of the developed models can be evaluated in terms of statistical measures of goodness of fit. In the present study, several statistical measures were used, namely, the correlation coefficient (r), the coefficient of determination (R²), the mean absolute error (MAE), and the root mean square percentage error (RMSPE).
In the definitions of these measures, N is the number of IRI values in the sample, y_{i,exp} and y_{i,est} are the experimental and estimated values of IRI, respectively, and ȳ_exp and ȳ_est are their respective means. Values of r and R² close to 1 indicate better correlation between experimental and predicted IRI values. The MAE and RMSPE measure the model's ability to predict the experimental values; hence, lower values of MAE and RMSPE indicate better model performance.
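The four measures can be computed as in the sketch below, which uses their common textbook definitions. These are assumptions rather than the paper's own equations: in particular, R² is taken as 1 − SS_res/SS_tot, and MAE is computed in IRI units, whereas the paper reports MAE as a percentage, which suggests a relative variant may have been used. The sample values are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def correlation_coefficient(y_exp, y_est):
    """Pearson correlation coefficient r."""
    m_exp, m_est = mean(y_exp), mean(y_est)
    cov = sum((a - m_exp) * (b - m_est) for a, b in zip(y_exp, y_est))
    var_exp = sum((a - m_exp) ** 2 for a in y_exp)
    var_est = sum((b - m_est) ** 2 for b in y_est)
    return cov / math.sqrt(var_exp * var_est)

def coefficient_of_determination(y_exp, y_est):
    """R^2 = 1 - SS_res / SS_tot (assumed definition)."""
    m_exp = mean(y_exp)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_exp, y_est))
    ss_tot = sum((a - m_exp) ** 2 for a in y_exp)
    return 1.0 - ss_res / ss_tot

def mean_absolute_error(y_exp, y_est):
    """MAE in the units of the data (m/km for IRI)."""
    return mean([abs(a - b) for a, b in zip(y_exp, y_est)])

def rmspe(y_exp, y_est):
    """Root mean square percentage error, in percent."""
    return 100.0 * math.sqrt(mean([((a - b) / a) ** 2 for a, b in zip(y_exp, y_est)]))

# Hypothetical measured and predicted IRI values (m/km).
measured = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.4]
```

With these hypothetical values every prediction is off by 10% of the measured value, so `rmspe(measured, predicted)` evaluates to approximately 10.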

Experimental Data.
As mentioned, pavement roughness data were collected yearly for a high-volume motorway over a seven-year period. The measurement system used is a laser profiler system, as shown in Figure 4. This system is vehicle-mounted, laser-based instrumentation capable of recording data at vehicle speeds up to 100 km/h [29].
To meet the objectives of the present investigation, the analysis focused on the data collected along the heavily trafficked lane of a 5 km long motorway section that is homogeneous in terms of pavement structure, environmental conditions, and traffic. The IRI values were calculated at 10 m intervals as the average of the measured IRIs along the wheel paths. Figure 5 presents the variability of IRI values in terms of the coefficient of variation (CV) for each measurement year. The CV values range from 35% to 39%, for the average IRI values ranging from 0.36 to 3.63. These numbers indicate significant variability of pavement roughness within the pavement section under investigation. However, the variation of IRI values is similar every year within the 7-year measurement period.
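The coefficient of variation quoted above is the standard deviation expressed as a percentage of the mean. A minimal sketch follows; the use of the sample standard deviation (n − 1 denominator) is an assumption, and the IRI values are hypothetical.

```python
import math

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    n = len(values)
    avg = sum(values) / n
    variance = sum((v - avg) ** 2 for v in values) / (n - 1)
    return 100.0 * math.sqrt(variance) / avg

# Hypothetical IRI values (m/km) for one measurement year.
iri_year = [1.0, 2.0, 3.0]
cv = coefficient_of_variation(iri_year)
```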
It should be noted that during monitoring no major distresses, such as cracking, rutting, and potholes, were observed on the pavement under investigation; therefore, roughness progression over time is considered to be unaffected by pavement surface condition [30]. Another point worth mentioning is that the specific motorway has been built and operated on a public-private partnership (PPP) basis, meaning, among other things, that timely preservation and maintenance/rehabilitation of the pavement are imperative for minimizing costs and maximizing benefits. Hence, the development of pavement performance prediction models can be useful to support the decision-making process concerning maintenance priorities and strategies, and roughness is one of the most important pavement performance indicators.
For the model development, the methodological approach described in [11] was followed in terms of data splitting (i.e., training: 50%, testing: 25%, and validation: 25%), activation functions (i.e., hyperbolic tangent, logistic sigmoid, and exponential), and backpropagation algorithms (i.e., quasi-Newton and scaled conjugate gradients). A network with one hidden layer was chosen for all models trained in this study because such networks can approximately realize any continuous mapping [31].

Results and Discussion
Trial networks with varying numbers of neurons in the hidden layer were trained to evaluate the performance of different network architectures. The neural networks were optimized using the training sample set. There are several techniques to combat the problem of overfitting and to tackle the generalization issue. Here, the test data sample was used as a means to halt training in order to mitigate overfitting and improve the generalization ability of the developed networks. The error function, in terms of the Sum of Squares (SoS) error between the target and predicted IRI_t outputs, was monitored during the training process. Normally, the error decreases during the initial phase of training. However, when the network begins to overfit the data (a situation arising when the ANN works well only with the training data), the error on the test set will typically begin to increase. Training was stopped when the test error had increased for a specified number of iterations, and the weights at the minimum of the test error were saved.
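The stopping rule described above can be sketched as a small function that scans the test error recorded after each training iteration. This is an illustrative variant, assuming that "increased for a specified number of iterations" means the test error failed to improve on its best value for `patience` consecutive iterations; the function name and the error values are hypothetical.

```python
def early_stopping_epoch(test_errors, patience=3):
    """Return (best_epoch, stop_epoch) for a sequence of per-epoch test errors.

    Training stops once the test error has failed to improve on its best
    value for `patience` consecutive epochs; the weights saved at
    `best_epoch` are the ones that would be kept.
    """
    best_epoch, best_error = 0, float("inf")
    epochs_without_improvement = 0
    for epoch, error in enumerate(test_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # No stop was triggered: training ran through all recorded epochs.
    return best_epoch, len(test_errors) - 1

# Hypothetical sum-of-squares test errors recorded during training.
errors = [0.20, 0.10, 0.05, 0.031, 0.033, 0.036, 0.040, 0.050]
best, stopped = early_stopping_epoch(errors, patience=3)
```

In this hypothetical run the minimum test error occurs at epoch 3 and training halts at epoch 6, so the weights from epoch 3 would be retained.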
Figure 6 shows a comparison between the experimental (i.e., IRI_t measured) and predicted IRI values using the ANN model (i.e., IRI_t ANN) for both training and test data, for which the SoS error remains almost constant (i.e., 0.030 and 0.031, respectively). The R² values corresponding to training and test data are 0.92 and 0.93, respectively; hence, they indicate that the ANN model can explain 92% and 93% of the variability in IRI_t at the 95% confidence level. In addition, the prediction errors are normally distributed, as depicted in Figure 7. As can also be seen in Figure 7, prediction errors are mainly concentrated between −0.1 and 0.1 m/km.
To ensure the good generalization ability of a trained neural network, once each network was developed and tested, it was validated using the validation data set. Like the test sample, the validation sample is never used for training the neural network. Using the validation data set, ANN-based predictions of IRI were carried out and compared with the measured IRI. Figure 8 shows scatter plots of the experimental and predicted IRI values using the ANN model and exhibits a good distribution of data points around the line of equality. Moreover, the MAE and RMSPE were calculated. The related values (i.e., 6.9% and 8.3%) combined [...]
Overall, the analysis results provide evidence that the ANN models converge well and could be effectively developed to rapidly predict the pavement roughness condition, in terms of the IRI, one year ahead.

SVM Model.
Additionally, the SVM method was used to model the IRI progression.
The data were divided into training and validation subsets: 75% were used as training data and the remaining 25% as the validation data set, so that the prediction performance of the two soft computing methods can be compared directly. Six input parameters (i.e., IRI_{t-6}, IRI_{t-5}, IRI_{t-4}, IRI_{t-3}, IRI_{t-2}, and IRI_{t-1}) were used in the development of the SVM regression model, with IRI_t as the output variable. One of the important steps in SVM model development is to determine the optimal modeling parameters, such as the kernel function K and the parameters C and ε. Concerning the kernel type, preliminary analyses showed that the use of the radial basis function (RBF), as described in (2), gives the highest prediction performance.
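The input-output arrangement described above, six previous yearly values predicting the seventh, can be sketched as follows. The function name and the IRI values are hypothetical; the only element taken from the text is the use of IRI_{t-6} to IRI_{t-1} as inputs and IRI_t as the target.

```python
def make_lagged_sample(iri_series, n_lags=6):
    """Split a yearly IRI series into (inputs, target) = (IRI_{t-6..t-1}, IRI_t)."""
    if len(iri_series) < n_lags + 1:
        raise ValueError("need at least n_lags + 1 yearly values")
    window = iri_series[-(n_lags + 1):]  # the last n_lags + 1 values
    return list(window[:-1]), window[-1]

# Hypothetical yearly IRI values (m/km) for one 10 m segment, oldest first.
segment = [1.10, 1.15, 1.21, 1.26, 1.33, 1.40, 1.48]
inputs, target = make_lagged_sample(segment)
```

With seven yearly measurements, each 10 m segment yields exactly one such sample.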
The parameters C and ε and the RBF kernel-specific parameter (i.e., γ) that maximize the predictive power of the SVM model were chosen by fivefold cross-validation and a trial-and-error approach. The optimal SVM model parameters used in this study are summarized in Table 1.
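Parameter selection by fivefold cross-validation can be sketched as below. An exhaustive grid search stands in for the trial-and-error procedure mentioned in the text, which is an assumption; `score_fn` is a hypothetical callable that would train an SVM regression model on the training indices and return its error on the held-out fold. The dummy scoring function and candidate values are for illustration only.

```python
import itertools

def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

def grid_search(score_fn, c_values, eps_values, gamma_values, n_samples, k=5):
    """Return the (C, epsilon, gamma) triple with the lowest mean fold error."""
    best_params, best_score = None, float("inf")
    for params in itertools.product(c_values, eps_values, gamma_values):
        fold_scores = [score_fn(params, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(fold_scores) / len(fold_scores)
        if mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params

# Dummy scoring function (hypothetical): pretends the error is lowest at
# C = 10, epsilon = 0.1, gamma = 0.5, regardless of the fold.
def dummy_score(params, train_idx, val_idx):
    c, eps, gamma = params
    return abs(c - 10) + abs(eps - 0.1) + abs(gamma - 0.5)

folds = list(k_fold_indices(10, k=5))
best = grid_search(dummy_score, [1, 10, 100], [0.01, 0.1], [0.5, 1.0], n_samples=10)
```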
The training data subset was used for SVM model learning, and the validation data subset was used to examine the prediction accuracy of the developed SVM model. Figure 9 illustrates the experimentally measured (i.e., IRI_t measured) and predicted (i.e., IRI_t SVM) outputs corresponding to the training data samples.
According to Figure 9, the results of SVM modeling are in good agreement with the experimental values. The R² value above 0.9 suggests that a strong correlation exists between the predicted and measured IRI values. To assess the fitted model's prediction accuracy, the MAE and RMSPE were calculated as 8.2% and 9.9%, respectively. The statistical evaluation results reveal that the SVM model has good prediction ability. Moreover, the prediction errors are normally distributed and are mainly concentrated between −0.1 and 0.1 m/km, as illustrated in Figure 10.
Using the validation data set, SVM-based predictions of IRI were performed and compared with the measured IRI. Figure 11 shows scatter plots of the experimental and predicted IRI values using the SVM model and exhibits a good distribution of data points around the line of equality. Also, the MAE and RMSPE were calculated as 7.7% and 8.9%, respectively. It is worth mentioning that these values are lower than those obtained for the training data (8.2% and 9.9%). This finding supports the former statement on the good generalization ability of SVM models.

Comparison of Models.
Table 2 illustrates a comparison between the developed ANN- and SVM-based roughness prediction models. The comparison is carried out in terms of statistical performance measures. For the first level of comparison, R², MAE, and RMSPE were calculated for each model using the training data sets. For the second level of comparison, the validation data sets were used, and R², MAE, and RMSPE were again calculated for each model.
From the results reported in Table 2 for model development using the training data sets, it can be observed that the ANN and SVM models exhibit high R² and low MAE and RMSPE values. Specifically, both models show R² > 0.9 and MAE/RMSPE < 10%, although the goodness-of-fit statistics of the ANN model are slightly better than those of the SVM model. It is noteworthy that, although the statistical measures for the training data set already suggest that the ANN and SVM models predict roughness with high accuracy, the error values (MAE and RMSPE) of both models for the validation data are even lower; this is more evident for the ANN model. Hence, the models' performance on the training and validation data suggests that both have very good predictive ability and generalization performance.

Conclusions
The present study evaluated two advanced computational tools, based on ANN and SVM techniques, for modeling roughness progression in terms of IRI. These computational tools showed the ability to build accurate models with high predictive capability for IRI on the basis of available roughness time series data. Based on statistical performance measures, the ANN and SVM models produced similar prediction results, which also suggests good generalization capabilities, although the ANN statistical metrics were slightly better. On the whole, the analysis results provide evidence that both models are capable of accurately predicting pavement roughness. Of course, the degree to which a model accurately represents roads with different characteristics should be taken into consideration from the perspective of the model's intended uses. In any case, it seems that modeling pavement roughness, even for the short term, makes it possible to analyze, design, plan, project, rank, and optimize the choice of alternatives, and to allocate costs and apportion funds for pavement maintenance activities.

Figure 8: Measured versus predicted IRI_t using the ANN model: validation data.

Figure 9: Measured versus predicted IRI_t using the SVM model: training data.