Analysis and Prediction of Overloaded Extra-Heavy Vehicles for Highway Safety Using Machine Learning

Along with the prosperity and rapid development of the national economy, the transportation industry in China has developed rapidly. However, overloaded vehicles have been causing frequent traffic accidents. Thus, to alleviate or resolve the corresponding problems associated with highway engineering safety and the market economy, an improved technique for overload management is urgently required. In this study, to analyze the overload data on expressways and highways in China, we developed a machine learning model by comparing the performances of cluster analysis, backpropagation neural network (BPNN), generalized regression neural network (GRNN), and wavelet neural network (WNN) in analyzing global and local time series overload data. In a case study, our results revealed the trends of overloading on highways in Jiangsu Province. Given sufficient data, BPNN performed better than GRNN and WNN. As the amount of training data increased, GRNN performed better, but the runtime increased. WNN had the shortest runtime among the three methods and could reflect the future trends of the overload rate in the monthly data prediction of overload. Our model provides information with potential value for expressway network management departments through data mining. This information could help management departments allocate resources reasonably and optimize the information utilization rate.


Introduction
Along with the prosperity and rapid development of the national economy, the transportation industry has also rapidly developed in China. Nevertheless, the related overloading management system has not developed accordingly and the enforcement of transportation laws has not become stricter. Thus, transport operators hope to expand their profits by overloading their vehicles; however, overloading leads to frequent traffic accidents. According to the Highway Bureau of the Chinese Ministry of Transport, more than 80% of truck road traffic accidents are caused by overloaded transport. Overloading continuously causes road damage [1] and traffic accidents [2], disrupts the normal economic order of the logistics market, and affects the healthy development of the highway transportation economy [3]. Thus, to alleviate or resolve the problems associated with highway engineering safety and the market economy, an improved technique for overload management is urgently required.
In this study, we established a machine learning model for highway overrun and overload. We explored the overload characteristics through cluster analysis and compared the performances of three neural networks to determine the best method to mine, analyze, and predict the overload data: the backpropagation neural network (BPNN), a commonly employed method for training neural networks; the generalized regression neural network (GRNN), a radial basis function neural network; and the wavelet neural network (WNN), a novel neural network that combines classical sigmoid neural networks with wavelet analysis (WA). Furthermore, we provide information of potential value for highway network management, such as overload characteristics, overload rate, and overload trend, and explore how the results of data mining with cluster analysis and neural networks (NNs) can be applied in daily and instant overload management work.
This study helps highway management departments allocate resources reasonably, optimize information utilization, improve transportation management methods and efficiency, and alleviate highway engineering safety problems. Furthermore, it supports economic and environmental development and provides a practical reference for future work on overload management.

Background
Rapid advancements in science and technology have enabled the constant updating of software and hardware for overload control; consequently, toll collection centers at all levels in China are able to store increasing amounts of detailed historical data [4]. These massive amounts of data contain abundant valuable information. Conventional research methods for analyzing the characteristics of overloading in China mainly include literature review, statistical prediction models, and game analysis models. For example, Li and Wang analyzed the characteristics of overloaded freeway vehicles via cluster analysis [5]. Ryu et al. analyzed vehicle overrun behaviors using game theory and the risk preference model [6]. In addition, some scholars now study the related characteristics by building data models. Cheng et al. established a behavior judgment and prediction model and studied vehicle detour behavior [7]. Zhang analyzed the impact of overload on road capacity through VISSIM simulation [8]. Most current research thus relies on data modeling for data mining; we therefore examine below which models are suitable for this study.
Past research has revealed that the overload characteristics [4,9] should first be clarified to further study the problem of overload management. Machine learning, a typical data mining technique, uses the "black box" principle to build a machine learning model to address complex and diverse data [10]. In the field of traffic management, machine learning is currently applied to feature mining, prediction, response, and timely disposal of traffic events. For example, Muhammed Yasin and Ahmet proposed a highway traffic accident prediction model based on an artificial neural network (ANN) to analyze the characteristics of traffic accidents [11]. Li et al. proposed a situational awareness machine learning model to thoroughly investigate the factors affecting the fatigue degree of traffic management personnel [12]. Mohammed Abdulhafedh et al. reviewed data mining and machine learning methods for sustainable intelligent urban traffic classification and indicated that machine learning is extremely effective for the identification and classification of traffic states [13]. Therefore, machine learning can be used for data mining, studying the characteristics of overloading, and forecasting overloading patterns to thoroughly exploit the massive amounts of historical data in the existing domestic highway toll collection centers at all levels; this complies with the current demand for off-site law enforcement and technology. This could significantly improve highway transportation management and alleviate highway engineering safety problems.
In terms of research on traffic transport characteristics, the currently adopted machine learning methods include support vector machine (SVM), logistic regression (LR), cluster analysis, and ANN.
SVM has been successfully employed for classification, regression, and pattern recognition [14]. Sangare et al. highlighted that SVM can demonstrate better performance with low data volume; however, it lacks the ability to automatically identify relevant features, and its calculation cost is high [15]. Lin and Li suggested that NNs are better than SVM in predicting traffic congestion [16]. LR is a statistical method used to predict event probability and is a significant model for classified data [17]. However, it is sensitive to multicollinearity among the independent variables, which affects the regression results. Moreover, LR would be highly suited for feature mining through the analysis of the influencing factors [18]. Because sufficient data are available in this research, there is no need to choose SVM solely because it performs better than other methods with less data. Furthermore, considering its high computational cost and poor prediction performance compared with that of NNs, SVM is not a practical choice. Moreover, the independent variables affecting overload may cause multicollinearity problems; thus, the regression results of LR would also be affected. Therefore, SVM and LR have limitations in mining the characteristics of overload and in predicting overload.
Cluster analysis is a typical and extremely effective method of exploring characteristics and laws. It can discover the internal distribution structure of data without prior knowledge regarding the correct results [19]. Yassin and Pooja used K-means clustering to extract hidden information from traffic accident data and created training sets; the accuracy of this method was 99.86% [19]. Žunić et al. used K-means clustering to analyze the impact of road, environment, vehicles, and drivers on traffic accidents and highlighted that clustering has been applied in several cases in current professional and scientific practice in the field of traffic management [20]. The most commonly employed clustering analysis methods include the hierarchical clustering algorithm and the partition-based clustering algorithm. Given the multiperiod original data of this research, systematic clustering and K-means clustering, which are the most classic, universal, and commonly used algorithms in the field of transportation, were selected and combined. K-means is easy to implement; however, the number of classes must be specified in advance. The use of systematic clustering can solve this problem; concurrently, K-means can compensate for the imprecise termination conditions of systematic clustering.
ANN can be used to analyze nonlinear factors, and ANN models demonstrate high accuracy and degree of fit. The commonly used ANNs include BPNN, GRNN, and WNN. For example, Wang et al. established a BPNN model to estimate the probability of collision results for different traffic accidents. The model accuracy was greater than 90%, and the model exhibited a good fitting effect [21]. Zhang and Zhang believed that the GRNN model has higher accuracy and reliability than historical average and vector autoregression (VAR) in making short-term traffic flow predictions [22]. Hou et al. suggested that WNN has favorable adaptive ability and self-learning ability for forecasting short-term traffic volume [23]. BPNN is relatively mature in network theory and performance and has high self-learning and adaptive abilities, which enables it to extract "reasonable rules" between data in the event of unseen patterns and to adequately mine the data. In contrast to the global approximation algorithm of BPNN, GRNN is characterized by an optimal approximation, which can process unstable data and avoid the problem of the local optimum that could occur with the BPNN, whereas WNN has significant advantages for sequential data, implying that these three NNs have different functions [3,24]. Therefore, when sufficient historical data are available for training, BPNN, GRNN, and WNN can suitably adapt to the characteristics of overload behavior, such as the large impact of timing sequence, large randomness, and nonlinear influencing factors, and are well suited to feature mining and prediction modeling of overloaded vehicles.
Most of the current research is based on data modeling for data mining; therefore, this study explores the models that can be used and finally selects two types of machine learning model: clustering analysis and NN. Machine learning has been used in the field of transportation research but rarely in the analysis of overload characteristics. Therefore, as discussed earlier, we explore the overload characteristics through cluster analysis and compare the performances of BPNN, GRNN, and WNN to determine the best method to mine, analyze, and predict overload data.

Experimental Data.
The expressway data of Jiangsu Province (15 565 data units) and the national provincial highway data of Liyang city (30 764 data units), from 2018 to 2019, were collected from the expressway transportation management department of Jiangsu Province. The datasets included "License plate," "Lane," "Time," "Gross weight of vehicles and goods," "Axle number," "Rated weight," "Overload standard," "Overload rate," and "Overload interval." The details of these datasets are as follows.
License plate. License plate is the number plate identifying the vehicle, and the dataset is in the form of "苏 DETXXX," where "苏" represents Jiangsu Province and "X" is an integer.
Lane. A lane is the road that a vehicle travels on. If there are three lanes, the one closest to the central zoning is defined as the first lane, the middle lane is the second lane, and the outer lane is the third lane.
Time. Time refers to the time when the vehicle passes through the detection station; the dataset is in the form of "year/month/day, hour: minutes."
Gross weight of vehicles and goods. This refers to the total weight of the vehicle and cargo; the unit is ton (e.g., "49 t").
Axle number. Axle number refers to the total number of axles configured in the lower part of the vehicle underframe. Vehicles are usually divided into classes of one to six axles.
Rated weight. Rated weight refers to the maximum allowable vehicle load under the condition of ensuring safe driving. The unit is ton (e.g., "49 t").
Overload standard. Overload standard determines whether the vehicle weight exceeds the rated weight indicated on the driving license. If it exceeds the rated weight, it is termed overweight. The overload rate is

$$\text{overload rate} = \frac{\text{actual weight} - \text{rated weight}}{\text{rated weight}}. \quad (1)$$

Overload interval. Overload interval is divided into "less than 5%," "5-30%," "30-50%," "50-100%," and "more than 100%."
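As a small illustration of equation (1) and the overload intervals listed above, the following Python sketch computes the overload rate and maps it to an interval label. The function names and the handling of values that fall exactly on an interval boundary are assumptions made for this example; the source does not state which side a boundary value belongs to.

```python
def overload_rate(actual_t: float, rated_t: float) -> float:
    """Equation (1): (actual weight - rated weight) / rated weight, as a fraction."""
    if rated_t <= 0:
        raise ValueError("rated weight must be positive")
    return (actual_t - rated_t) / rated_t


def overload_interval(rate: float) -> str:
    """Map an overload rate (fraction) to one of the interval labels.

    Boundary handling (e.g. exactly 30%) is an assumption of this sketch.
    """
    pct = rate * 100.0
    if pct < 5:
        return "less than 5%"
    if pct < 30:
        return "5-30%"
    if pct < 50:
        return "30-50%"
    if pct <= 100:
        return "50-100%"
    return "more than 100%"


# Example: a vehicle rated at 49 t that weighs 58.8 t is about 20% overloaded.
rate = overload_rate(58.8, 49.0)
label = overload_interval(rate)
```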

Data Preprocessing.
The raw datasets in this research were not correctly formatted and could not be processed directly by a computer. Before building the model, this problem had to be resolved to ensure favorable data quality. Consequently, data cleaning, missing data handling, encoding, and normalization were performed. The attribute "Maximum wheelbase" was missing from 2007 records of the provincial highway data and was not contained in the national highway data at all; thus, we deleted this attribute. Character data such as "Time," "License plate," and "Detection station type" were converted to numerical data (Table 1). All the data were normalized between zero and one to eliminate the differences in the orders of magnitude, which can also reduce the possibility of overfitting in the BPNN model.
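The normalization step can be sketched as follows. The source states only that all data were scaled between zero and one; min-max scaling is assumed here as the most common way to do this, and the function name is hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1.

    Assumption: min-max scaling; the source does not name the exact method.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Example: gross weights in tons scaled to the unit interval.
scaled = min_max_normalize([18.0, 27.0, 49.0])
```

Scaling each attribute separately keeps, for example, gross weight (tens of tons) from dominating axle number (one to six) in distance-based methods such as clustering.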

Data Extraction.
Data were extracted from the preprocessed raw data and divided into expressway data and national and provincial highway data.
Cluster analysis extracts two-dimensional (2D) data for analysis. "Overload rate" is the concrete data of "Overload interval"; thus, "Overload interval" was discarded. The remaining attributes were "License plate," "Gross weight of goods and vehicles," "Overload standard (rated weight)," "Axle number," and "Overload rate." Because "Overload standard" carries information similar to that represented by "Gross weight of goods and vehicles," we selected "Gross weight of goods and vehicles" to avoid duplication. The NN can also extract 2D data for analysis. Because the data for December and the speed data from June to November were lacking in 2019, only the national and provincial data of Liyang city in 2018 were selected for analysis. For May 2019, 738 hourly expressway data points were available for Jiangsu Province. In this research, the national and provincial highway data of Liyang city in 2018 were selected as monthly, quarterly, and annual data. The monthly data were measured in hours and contributed 289 data points for the 12 months of 2018. The quarterly data were measured in hours, and 964 data points were available in the first quarter of 2018. The annual data were measured in hours, and 3367 data points were available for 306 days in 2018.
In summary, cluster analysis was performed to extract the attributes "Time," "Gross weight of vehicles and goods," "Axle number," and "Overload rate" to analyze the data. The BPNN and GRNN extracted "Time," "Gross weight of goods and vehicles," "Axle number," and "Overload rate" from the Jiangsu expressway and the provincial highway data. WNN was only used to process 2D time-series data, i.e., the "Overload rate" data.

Cluster Analysis.
Clustering, which was proposed in the mid-20th century, is a typical research method of unsupervised learning. It aims to group similar observations into multiple clusters according to the observations of multiple variables of each individual, to maximize the similarity of samples within each cluster, maximize the differences between clusters, and reveal the inherent distribution structure of data. The concept underlying cluster analysis originated from traditional taxonomy, in which complex classification problems are encountered. Unlike traditional classification, the classes into which the data are to be divided are typically unknown in cluster analysis and are defined by analyzing the specific cluster results.
(1) Systematic clustering: systematic clustering aims to merge samples step by step according to a distance criterion until the samples satisfy the termination requirements of the algorithm. The distance formulas and criteria are primarily used to calculate the distances between samples and the clusters formed in each iteration, and between those clusters themselves. Different distance formulas and criteria yield different results. The common distance criteria include the shortest distance method, longest distance method, intermediate distance method, barycenter method, and class average distance method.
(2) K-means clustering: K-means is considered a crucial clustering algorithm [25], which processes datasets iteratively. In this iterative process, the datasets are divided into "k" predefined nonoverlapping clusters or subgroups. The data points within a cluster are as similar as possible, and the clusters are kept as far apart as possible. It assigns data points to clusters so that the sum of the squared distances between the cluster centroid and the data points is the smallest.
Here, the cluster centroid is the arithmetic average of the data points in the cluster.
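The iterative procedure described above can be sketched in plain Python as follows. This is a minimal illustration of the standard K-means (Lloyd) iteration, not the implementation used in the study; the function name, the random initialization, and the sample points are assumptions.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Minimal K-means: alternate nearest-centroid assignment and centroid update.

    points: list of equal-length numeric tuples. Returns (labels, centroids).
    Initial centroids are k distinct input points chosen at random (an assumption).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins the cluster with the nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the arithmetic mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Example with two well-separated groups of (normalized) observations.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(pts, k=2)
```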
In this study, systematic clustering and K-means clustering were combined for clustering analysis; the specific process is shown in Figure 1. K-means clustering requires the number of clustering classes "k" to be determined before clustering, and this research aims to cluster the overloading data of the national and provincial highways in Liyang city in every quarter from 2018 to 2019. Therefore, we used systematic clustering to obtain the clustering tree diagram for the eight quarters from 2018 to 2019 and determined the same number of clustering classes "k" for all eight quarters, taking into account the continuity of the data. On this basis, the clustering results were obtained by K-means analysis.
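One way to turn the clustering tree into a value of "k" is sketched below. In the study, "k" was read from the tree diagrams; the automatic rule used here (cut the tree at the largest jump between successive merge distances) is an assumption made for illustration, as are the function names. The class average distance criterion mentioned earlier is used as the linkage.

```python
import math


def merge_distances(points):
    """Agglomerative clustering with the class average distance criterion.

    Returns the distance at which each successive merge happens.
    """
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                d = sum(
                    math.dist(a, b) for a in clusters[i] for b in clusters[j]
                ) / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        heights.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return heights


def suggest_k(points):
    """Suggest k by cutting the tree at the largest gap between merge distances.

    Assumption: a heuristic stand-in for reading k off the tree diagram by eye.
    """
    n = len(points)
    if n < 3:
        return 1
    heights = merge_distances(points)
    gaps = [heights[t + 1] - heights[t] for t in range(len(heights) - 1)]
    t = max(range(len(gaps)), key=gaps.__getitem__)
    # After merge t (0-based) there are n - (t + 1) clusters left.
    return n - (t + 1)


pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
k = suggest_k(pts)
```

The suggested value can then be passed to K-means as the number of classes, which mirrors the two-stage process of Figure 1.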

Neural Network
(1) BPNN. ANN is a supervised machine learning algorithm [26]. The BPNN is an error backpropagation ANN.
This is a commonly employed method for training NNs, which assists in calculating the gradient of the loss function relative to all weights in the network and enhancing the connectivity between layers to obtain the optimal solution [27]. A BPNN usually has three or more layers. Each layer of a BPNN comprises several neurons. In this research, a BPNN was used to build a machine learning model for the expressway data and to predict the overload rate and trend. The algorithm flow of the BPNN can be divided into three steps (Figure 2): BPNN establishment, training, and prediction. First, the specific structure of the BPNN was determined according to the characteristics of the fitting function. Because there are four input parameters ("Time," "Gross weight of vehicles and goods," "Axle number," and "Type of inspection station") and only one output parameter, namely, "Overload rate," the number of nodes in the three layers of the BPNN was "4-?-1"; this implies there are four input layer nodes, the number of hidden layer nodes is unknown, and there is one output layer node. In reality, we can only use data from the months that have already occurred to predict subsequent results. Therefore, to reflect actual situations, we arranged the data in chronological order and systematically divided them into training data and testing data in a 10:1 ratio. The number of hidden layer nodes can be calculated as follows:

$$L = \sqrt{n + m} + a, \quad (2)$$

where $L$ denotes the number of hidden layer nodes, $n$ denotes the number of neurons in the input layer, $m$ denotes the number of neurons in the output layer, and $a$ is a constant having a value between 1 and 10.
Substituting $n = 4$ and $m = 1$ in equation (2), we obtain a value between $\sqrt{5} + 1$ and $\sqrt{5} + 10$; because the number of nodes $L$ must be an integer, $L$ lies between 4 and 12. If the fitted nonlinear function is simple, the prediction error of the BPNN decreases with an increase in the number of nodes. If the function is complex, the error decreases first and then increases.
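The arithmetic above can be checked with a short sketch. It assumes that equation (2) has the common empirical form $L = \sqrt{n + m} + a$, which is consistent with the bounds $\sqrt{5} + 1$ and $\sqrt{5} + 10$ quoted in the text; the function name is hypothetical.

```python
import math


def hidden_node_candidates(n_in, n_out, a_min=1, a_max=10):
    """Integer hidden-layer sizes L with sqrt(n + m) + a_min <= L <= sqrt(n + m) + a_max."""
    base = math.sqrt(n_in + n_out)
    return list(range(math.ceil(base + a_min), math.floor(base + a_max) + 1))


# Four inputs and one output: sqrt(5) + 1 is about 3.24 and sqrt(5) + 10 is about 12.24.
candidates = hidden_node_candidates(4, 1)
```

Each candidate size can then be trained and scored in a loop, keeping the size with the best goodness of fit.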
To explore the appropriate number of hidden nodes, a loop was written in the code to evaluate the goodness of fit of the prediction for each candidate and thereby obtain the optimal number of hidden nodes. Goodness of fit refers to the fitting accuracy of the regression line of the model to the observed values. Generally, the coefficient of determination is used to measure the goodness of fit. The closer the value is to one, the better the fitting degree of the model is; in contrast, the closer it is to zero, the worse the fitting degree of the model is. In this research, the prediction ability of the model was evaluated by calculating the coefficient of determination $R^2$ (equation (3)), where $\hat{y}_i$ is the $i$th predicted value, $y_i$ is the $i$th true value, and $l$ denotes the number of samples.

(2) Generalized regression neural network. GRNN was proposed by Donald F. Specht, an American scholar, in 1991. It is essentially a radial basis function NN but differs from the conventional radial basis function NN. It is a strong regression tool with strong nonlinear mapping ability and a flexible network structure and can be employed in regression, prediction, and classification research [28]. In this research, the GRNN was used to build a machine learning model for the expressway data and to predict the overload rate and trend. The algorithm flow of the GRNN can be divided into three steps (Figure 3): GRNN establishment, training, and prediction. First, the GRNN structure is determined according to the characteristics of the fitting function. The number of neuron nodes in the input layer is equal to the dimension of the learning samples.
The number of neuron nodes in the pattern layer is equal to the number of learning samples. The input dimensions are "Time," "Gross weight of vehicle and goods," "Axle number," and "Type of inspection station," whereas the number of nodes in the output layer is equal to the dimension of the output vector of the learning samples, i.e., "Overload rate." Therefore, the GRNN does not need to determine the number of hidden layer nodes. Unlike the BPNN, the GRNN only needs to confirm the SPREAD parameter and thus exhibits considerable computational advantages. In this study, SPREAD was determined by loop training. The loop code starts with "for SPREAD = 0.1:0.1:2," which means that SPREAD starts at 0.1 and increases by 0.1 in each loop until SPREAD = 2. Similar to the manner followed for BPNN, for GRNN, the data are arranged in chronological order and systematically divided into training data and testing data in a 10:1 ratio.

(3) Wavelet neural network. WNN is a novel NN that combines classical sigmoid NNs with wavelet analysis (WA) [29]. The concept of WA was developed to address a shortcoming of the Fourier transform, which cannot accurately determine the time at which a certain signal occurs. In contrast, WA can analyze the local characteristics of signals through wavelet basis function transformation and select the direction of signals in two dimensions. The WNN can perform time-frequency local analysis, has a fast convergence speed, and can effectively avoid falling into a local optimum.
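The GRNN prediction and the SPREAD search described above for the GRNN can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the Gaussian kernel uses Specht's smoothing parameter directly as SPREAD (a toolbox implementation may scale it differently), goodness of fit is computed as $R^2 = 1 - SS_{res}/SS_{tot}$ (the exact expression used in the study is not reproduced here), and all function names and sample data are hypothetical.

```python
import math


def grnn_predict(train_x, train_y, x, spread):
    """GRNN output: Gaussian-kernel weighted average of the training targets."""
    weights = [
        math.exp(-math.dist(x, xi) ** 2 / (2.0 * spread ** 2)) for xi in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed; fall back to the plain mean of the targets.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def r_squared(pred, true):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed form)."""
    mean = sum(true) / len(true)
    ss_res = sum((t - p) ** 2 for p, t in zip(pred, true))
    ss_tot = sum((t - mean) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot


def best_spread(train_x, train_y, val_x, val_y):
    """Try SPREAD = 0.1, 0.2, ..., 2.0 and keep the value with the highest R^2."""
    best = None
    for step in range(1, 21):
        spread = step / 10.0
        preds = [grnn_predict(train_x, train_y, x, spread) for x in val_x]
        score = r_squared(preds, val_y)
        if best is None or score > best[1]:
            best = (spread, score)
    return best


# Toy data: one normalized input feature and a linear target.
train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.0, 1.0, 2.0, 3.0]
spread, score = best_spread(train_x, train_y, [(0.5,), (1.5,), (2.5,)], [0.5, 1.5, 2.5])
```

Because the pattern layer simply stores the training samples, the only quantity tuned in this sketch is SPREAD, which matches the description that no hidden-layer size has to be chosen.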
In this study, WNN was used to build a machine learning model for expressway, national, and provincial highway data and predict the overload rate and trend. The algorithm flow of the WNN can be divided into three steps (Figure 4): WNN establishment, training, and prediction.
First, the structure of the WNN was determined according to the characteristics of the fitting function. In this study, according to the periodic characteristics of overload and the goodness of fit of the operation results, the number of neuron nodes in the three layers of the NN was determined to be "4-4-1." The four nodes in the input layer represent the overload rate at the four time points before the predicted time node. The hidden layer nodes were composed of wavelet basis functions.
There was one node in the output layer, which denoted the overload rate to be predicted. Similar to the manner followed for BPNN and GRNN, for WNN, the data are arranged in chronological order and systematically divided into training data and testing data in a 10:1 ratio.

Table 2 presents the clustering results when the number of clustering classes was six, i.e., K = 6. "Gross weight of vehicle and goods" gives the average gross weight of the vehicles and goods at the time of entering the highway. "Category of gross weight" and "Proportion of gross weight" are the proportions of the components of the average gross weight. "Axis number" and "Proportion of axis number" are the proportions of the components of the axis number at the time of entering the highway. "Overload rate" gives the average overload rate of the vehicles at the time of entering the highway.
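The way the WNN inputs are formed (the overload rate at the four preceding time points predicts the next one) and the chronological 10:1 division can be sketched as follows. The function names are hypothetical, and the reading of "systematically divided" as "the earliest ten parts for training, the final part for testing" is an assumption based on the stated aim of predicting later results from earlier data.

```python
def make_windows(series, lag=4):
    """Pair each value with the `lag` values that immediately precede it."""
    inputs = [series[i - lag:i] for i in range(lag, len(series))]
    targets = [series[i] for i in range(lag, len(series))]
    return inputs, targets


def chronological_split(samples, train_parts=10, test_parts=1):
    """Split time-ordered samples so the earliest portion trains and the latest tests."""
    cut = len(samples) * train_parts // (train_parts + test_parts)
    return samples[:cut], samples[cut:]


# Hourly overload rates (fractions), oldest first.
rates = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
x, y = make_windows(rates)          # x[0] holds the four rates preceding y[0]
train, test = chronological_split(list(range(22)))
```

Keeping the split chronological avoids training on observations that occur after the ones being predicted.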

Results
"Proportion of vehicles" shows the percentage of vehicles passing at the time of entering the highway in the total number of vehicles. It can be observed that, on the expressways in Jiangsu Province, vehicles with six axles and weighing above 49 t entering the expressways at 23:00-04:00 and 12:00-14:00 are more likely to exhibit overload behavior, and the overload rate of these vehicles is higher than that of others during this period. Most vehicles enter the expressway at 05:00 and 17:00-22:00, among which six-axle vehicles weighing above 49 t are more likely to be overloaded. Among the vehicles entering the expressway between 06:00-07:00 and 15:00-16:00, two-axle vehicles weighing 18-27 t are more likely to be overloaded. Among the vehicles entering the expressway between 08:00-11:00, four-axle vehicles weighing 37-43 t and 43-49 t are more likely to be overloaded. Table 3 presents the clustering results of the national and provincial data in the eight quarters of 2018 and 2019. The definitions of "Gross weight of vehicle and goods," "Category of gross weight," "Proportion of gross weight," "Axis number," "Proportion of axis number," "Overload rate," and "Proportion of vehicles" are the same as above. The analysis is as follows.
The data clustering results of the national and provincial highways in the four quarters of 2018 were as follows:
(1) In the first quarter, most vehicles entered the provincial highway at 00:00-08:00 and 16:00-20:00, among which 6-axle vehicles weighing above 49 t were more likely to be overloaded; the overload rate of 6-axle vehicles weighing above 49 t and entering the provincial highway at 02:00-03:00 was higher. Among the vehicles entering the national provincial highway at 11:00, 2-axle, 3-axle, and 4-axle vehicles weighing 27-36 t and 37-49 t were more likely to be overloaded.
(2) In the second quarter, most vehicles entered the highway between 00:00-08:00 and between 14:00-21:00, among which 6-axle vehicles weighing above 49 t were more likely to be overloaded; the overload rate of 6-axle vehicles weighing above 49 t and entering the highway between 21:00-23:00 was high. Among the vehicles entering the national provincial highway from 09:00 to 13:00, 3-axle vehicles weighing 27-36 t and 37-49 t were more likely to be overloaded.
(3) In the third quarter, most vehicles entered the highway between 00:00-08:00 and between 16:00-20:00, among which 6-axle vehicles weighing above 49 t were more likely to be overloaded; the overload rate of 6-axle vehicles weighing above 49 t and entering the highway between 21:00-23:00 was high. Among the vehicles entering the highway between 09:00-15:00, 2-axle, 3-axle, and 4-axle vehicles weighing 27-36 t and 37-49 t were more likely to be overloaded.
(4) In the fourth quarter, most vehicles entered the highway between 00:00-08:00 and 13:00-22:00, among which 6-axle vehicles weighing above 49 t were more likely to be overloaded. At 23:00, the overload rate of 6-axle vehicles weighing above 49 t and entering the highway was high. Among the vehicles entering the highway between 09:00-12:00, 4-axle vehicles weighing 27-36 t and 37-49 t were more likely to be overloaded.
The data clustering results of the national and provincial highways in the four quarters of 2019 were as follows:
(1) In the first quarter, most vehicles entered the national and provincial highway between 04:00-13:00 and between 18:00-23:00, among which 6-axle vehicles weighing above 49 t were more likely to be overloaded; the overload rate of 6-axle vehicles weighing above 49 t and entering the provincial highway between 00:00-03:00 was high. Among the vehicles entering the highway between 14:00-17:00, 2-axle vehicles weighing 18-27 t and 27-36 t were more likely to be overloaded.
(2) In the second quarter, most vehicles entered the highway between 00:00-15:00, among which 6-axle vehicles weighing above 49 t were more likely to be overloaded. From 00:00 to 03:00, the overload rate of 6-axle vehicles weighing above 49 t and entering the highway was high. Among the vehicles entering the national provincial highway between 06:00-15:00, 4-axle vehicles weighing 37-43 t were more likely to be overloaded. Among the vehicles entering the national highways between 16:00-23:00, 4-axle and 6-axle vehicles weighing above 49 t were more likely to be overloaded.
(3) In the third quarter, most vehicles entered the highway between 04:00-05:00 and between 14:00-18:00, among which 6-axle vehicles weighing above 49 t were more likely to be overloaded. From 02:00 to 03:00, the overload rate of 6-axle vehicles weighing above 49 t and entering the national provincial highway was high. Among the vehicles entering the highways between 22:00-01:00 and between 07:00-13:00, 4-axle vehicles weighing 37-43 t were more likely to be overloaded. Among the vehicles entering the highway at 06:00, 2-axle and 3-axle vehicles weighing 27-36 t were more likely to be overloaded.
(4) In the fourth quarter, most vehicles entered the highway between 08:00-15:00, among which 6-axle vehicles weighing above 49 t were more likely to be overloaded. The overload rate of 6-axle vehicles weighing above 49 t and entering the highway between 00:00-04:00 was high. Among the vehicles entering the highway from 05:00 to 07:00, 2-axle vehicles weighing 18-27 t and 27-36 t were more likely to be overloaded. Among the vehicles entering the highway from 16:00 to 23:00, 4-axle and 6-axle vehicles weighing above 49 t were more likely to be overloaded.

[Table: Clustering results of national and provincial highway data when K = 4.]

Figures 6(a), 7(a), 8(a), and 9(a) show that the predicted overload rate fits the actual overload rate well when using BPNN with the fitting curves "y = 0.99x + 0.0068," "y = 0.97x + 0.0088," "y = 0.96x + 0.011," and "y = 0.97x + 0.0021," respectively. The $R^2$ values are always larger than 0.95; this indicates the validity of the BPNN model. Figures 6(b), 7(b), 8(b), and 9(b) show the fitting degree of the predicted overload rate and the actual overload rate when using GRNN with the fitting curves "y = 0.53x + 0.045," "y = 0.49x + 0.036," "y = 0.83x + 0.0236," and "y = 0.86x + 0.0126," respectively. The $R^2$ values are approximately 0.5, which does not correspond to a fitting degree as good as that exhibited by the results obtained using BPNN. Figures 6(c), 7(c), 8(c), and 9(c) show the fitting degree of the predicted overload rate and the actual overload rate when using WNN with the fitting curves "y = 0.16x + 0.151," "y = 0.18x + 0.1748," "y = 0.17x + 0.198," and "y = 0.02x + 0.217," respectively. The $R^2$ values are always less than 0.2, which corresponds to the lowest fitting degree among those of the three methods. Tables 4-6 present the prediction results of the BPNN, GRNN, and WNN, respectively, indicating the data and results in the process of machine learning. All data are divided into training data and test data in a ratio of 10:1 and run according to the process described in Section 3.2.
The prediction results of BPNN exhibit a high degree of fitting for the validation data, training data, and test data alike, which also shows that the model performed well without overfitting. GRNN prediction required a long runtime owing to its inclusion of cross-validation and cycle training. WNN could not accurately predict the overload rate, and there was a gap between its predicted value and the actual value. We speculate that this is because WNN is typically used for short-term time-series prediction, whereas quarterly and annual data prediction are long-term time-series prediction; thus, the prediction results are not ideal for either the overload rate or the overload trend. Although the overload rate obtained from monthly data was not sufficiently accurate, it could reflect the overload trend. Furthermore, the WNN omitted attributes that were useful for prediction by BPNN, such as "License plate," "Lane," "Gross weight of vehicles and goods," "Axle number," and "Speed," and only used "Overload rate" and short-term time series for the prediction. Thus, the prediction results reveal the weak correlation between the overload rate and short-term time.
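The fitting curves of the form "y = ax + b" and the accompanying R² values quoted for each network can be illustrated with an ordinary least-squares line through the (actual, predicted) pairs. The following is a minimal sketch under that assumption; the source does not state how the fitting curves were computed, and the function name `fit_line` and the sample values are hypothetical.

```python
def fit_line(actual, predicted):
    """Least-squares line predicted = slope * actual + intercept, plus its R^2.

    Assumes both sequences have the same length and are not constant.
    """
    n = len(actual)
    mean_x = sum(actual) / n
    mean_y = sum(predicted) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - mean_x) ** 2 for x in actual)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Hypothetical overload rates, for illustration only.
actual_rates = [0.05, 0.10, 0.15, 0.20, 0.25]
predicted_rates = [0.06, 0.10, 0.16, 0.19, 0.26]
slope, intercept, r2 = fit_line(actual_rates, predicted_rates)
print(f"y = {slope:.2f}x + {intercept:.4f}, R^2 = {r2:.3f}")
```

A slope near 1 with an intercept near 0 and R² close to 1 means the predictions track the actual overload rate closely, which is the pattern reported for BPNN; a slope far below 1 means the predictions barely respond to changes in the actual rate.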

Discussion
According to the clustering results of the expressways in Jiangsu Province, 6-axle vehicles weighing above 49 t and entering the expressways between 23:00-04:00 and 12:00-14:00 are more likely to exhibit overloaded behavior, and the overload rate of these vehicles is higher than that of others. [...] This is speculated to be because the working hours of overload management changed over 2018 and 2019. (2) In 2018 and 2019, the overload rate in the first quarter was lower than that in the other three quarters. Compared with that in 2018, the overload rate in 2019 was reduced. This is speculated to be because Changzhou performed several joint actions of overloading management in 2019, which strengthened overloading governance. (3) Between 2018 and 2019, there were no evident changes in the "Gross weight of vehicles and goods" and "Axle number." Table 7 summarizes the clustering results of national highways in 2018 and 2019. In 2018, most overloaded vehicles entered the highways between 00:00-08:00 and between 16:00-20:00, and vehicles with higher overload rates were concentrated between 21:00-23:00 and between 02: [...] 15:00, and vehicles with a high overload rate were concentrated between 00:00-03:00. In 2018-2019, most overloaded vehicles were 6-axle vehicles weighing above 49 t, and the vehicles with high overload rates were also 6-axle vehicles weighing above 49 t.
To compare the prediction results of the BPNN, GRNN, and WNN, we note that the common evaluation indexes of a regression prediction model include the mean-squared error (MSE), the coefficient of determination R², and goodness of fit. The numerical value of R² reflects the goodness of fit. Therefore, we selected the coefficient of determination, the MSE, and the running time t of the entire code as the model performance criteria. In addition, this study used the root-mean-squared error (RMSE), mean error (ME), mean absolute error (MAE), mean absolute percentage error (MAPE), and root-mean-squared percentage error (RMSPE) as five further performance criteria (Table 8). The averages of (1 − R²), MSE, RMSE, ME, MAE, MAPE, and RMSPE are calculated (Table 9) to reduce the effect of individual prediction errors of the BPNN, GRNN, and WNN [30].
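For reference, the seven criteria can be computed as in the following minimal sketch, which uses the standard textbook definitions. The source does not give its formulas, so the sign convention for ME (predicted minus actual), the use of fractions rather than percentages for MAPE and RMSPE, and the function name `regression_metrics` are assumptions.

```python
import math


def regression_metrics(actual, predicted):
    """Return R^2, MSE, RMSE, ME, MAE, MAPE, and RMSPE for paired sequences.

    Assumes equal lengths, non-constant `actual`, and no zero in `actual`
    (the percentage errors divide by the actual value).
    """
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]  # assumed sign: predicted - actual
    rel_errors = [e / a for e, a in zip(errors, actual)]

    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)

    return {
        "R2": 1.0 - (mse * n) / ss_tot,   # 1 - SS_res / SS_tot
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "ME": sum(errors) / n,            # signed, so positive and negative errors cancel
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(abs(r) for r in rel_errors) / n,               # as a fraction
        "RMSPE": math.sqrt(sum(r * r for r in rel_errors) / n),    # as a fraction
    }


# Hypothetical overload rates, for illustration only.
metrics = regression_metrics([0.10, 0.20, 0.40], [0.12, 0.19, 0.41])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Using (1 − R²) rather than R² when averaging, as the text does, puts every criterion on a "smaller is better" scale.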
The average values and all seven performance criteria show that the prediction of BPNN is better than that of GRNN and WNN. Its performance is more accurate, especially for the "highway data" and "national and provincial highway data (month)." This is because the amount of data increases with time. Although the overall fitting degree is high, the individual prediction errors are greater, and certain performance criteria also increase with the increase in the number of errors. To evaluate the adequacy of this performance, the performance of BPNN in this study was compared with that in existing studies. For example, Kumar used an ANN to predict the short-term traffic flow of nonurban expressways. In that study, when the amount of data was 160, the fitting degree was 0.9984, and when the amount of data was 480, the fitting degree was 0.9988 [31]. The fitting degree of the BPNN in this study was larger than 0.90 and was nearly 0.99, which shows that the BPNN in this study has a very high degree of fitting and the model performs well.
To identify the shortcomings of GRNN and WNN, their performances were also compared with those of other studies. Zhang and Zhang used the spatial relationship of traffic flow near U.S. Highway and developed two multivariate forecasting approaches, in which the GRNN was also used [22]. In that research, 91.32% of the forecasting error was less than 20%. Tang et al. used WNN to predict the short-term traffic flow with a fitting degree of 0.9453 and used 230 datasets in modeling [31]. They used the WNN to predict short-term traffic flow with a fitting degree of 0.936, 864 datasets for modeling, and a running time of 0.995. Compared with that of Tang et al. [32], the fitting degree of WNN in this research was extremely low. However, when the datasets were the same size, the running time was similar to that in other studies. The reason is that, although the subjects of the studies are different, traffic flow has a strong correlation with time. In terms of overload, compared with other attributes such as "Time," "License plate," "Lane," "Gross weight of vehicles and goods," "Axle number," and "Speed," the overload rate has a weak correlation with time.
By comparison, the combined results of Tables 8 and 9 are as follows.
(1) The prediction performance of BPNN is superior to that of GRNN and WNN. Given sufficient data, BPNN is a better forecasting method with a high fitting degree, minimum MSE, and short running time. BPNN performs well in hourly highway data prediction and annual overload rate prediction, and the prediction results of BPNN can help determine the centralized overload management time. (2) With the increasing amount of data, the fitting degree of GRNN increases, and the running time also increases. In this study, the method of adding a "for loop" to find the best SPREAD greatly increases the running time. Compared with other networks, GRNN had the longest running time. (3) Compared with BPNN and GRNN, WNN has the shortest running time in overload rate prediction.
Although the fitting degree of WNN is low, it only requires an overload rate for prediction. It can reflect the future trends of the overload rate in the monthly data prediction of overload, and its prediction results can help determine the centralized overload management time when the data are insufficient.
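The cost of searching for the best SPREAD with a "for loop" can be seen in the following minimal sketch of a GRNN, written as Gaussian-kernel weighted averaging of the training targets, with the smoothing parameter chosen by leave-one-out cross-validation. This is an illustration under stated assumptions, not the authors' code: the exact scaling of SPREAD in their toolbox may differ from the kernel width used here, the cross-validation scheme is assumed, and the names `grnn_predict`, `loo_mse`, and `best_spread` are hypothetical.

```python
import math


def grnn_predict(train_x, train_y, query, spread):
    """GRNN output: Gaussian-weighted average of the training targets."""
    weights = []
    for x in train_x:
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-dist_sq / (2.0 * spread ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed; fall back to the plain mean of the targets.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def loo_mse(xs, ys, spread):
    """Leave-one-out mean-squared error for one candidate spread."""
    squared_error = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        pred = grnn_predict(rest_x, rest_y, xs[i], spread)
        squared_error += (pred - ys[i]) ** 2
    return squared_error / len(xs)


def best_spread(xs, ys, candidates):
    """The 'for loop' over candidate spreads; returns (spread, loo_mse)."""
    best = None
    for spread in candidates:
        err = loo_mse(xs, ys, spread)
        if best is None or err < best[1]:
            best = (spread, err)
    return best


# Hypothetical one-feature data, for illustration only.
xs = [[i / 10.0] for i in range(11)]
ys = [2.0 * x[0] for x in xs]
print(best_spread(xs, ys, [0.05, 0.5, 5.0]))
```

Each candidate spread requires a full pass of leave-one-out prediction, so the cost grows with the number of candidates times the square of the number of training samples, which is consistent with the running time rising as more data are added.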

Conclusions
In this study, we used machine learning to establish a model for highway overrun and overload, considering Jiangsu Province as an example. The characteristics of overloading were summarized by clustering the historical data of overloading, and a forecasting model of overloading with a high fitting degree based on BPNN was obtained. We provided information with potential value for expressway network management departments through data mining. This information could help these management departments allocate resources reasonably and optimize the information utilization rate. The conclusions of this study can be grouped into two aspects: daily overload management based on cluster analysis and immediate overload management based on NNs. Based on the overload characteristics obtained by cluster analysis, human resources and other related resources should be allocated rationally. Targeted monitoring should be performed to raise the level of overload control and management, and the governance efficiency should be enhanced using information technology. Currently, numerous large loopholes remain in time and space in law enforcement inspection, and law enforcement personnel and equipment are limited. Therefore, the thorough use of historical overload data is necessary to satisfy the demand for taking severe measures on overloaded vehicles according to the law of overload characteristics. For example, based on the conclusion of this study, we can focus on monitoring 6-axle vehicles weighing above 49 t and entering an expressway between 23:00-04:00 and 12:00-14:00, and 6-axle vehicles weighing above 49 t entering the national and provincial highways of a city between 21:00-23:00 and 02:00-03:00. The allocation of manpower and other related resources should be increased in these periods, and related resources should be saved during other periods.
Based on the overloading characteristics, different types of vehicles in different periods must be thoroughly investigated to improve the management efficiency of the transportation departments. For example, special attention must be paid to 2-axle vehicles weighing 18-27 t and entering a highway between 06:00-07:00 and 15:00-16:00 and to 4-axle vehicles weighing 37-43 t and 43-49 t and entering a highway between 08:00-11:00. Concurrently, the traffic management department can reasonably adjust the working hours of staff in different seasons. For example, the working hours of staff in the first quarter can [...]

Based on the prediction model obtained by the NN, we can predict the overload rate and overload trend, develop auxiliary programs or software, and improve the existing law enforcement system. The information resources of administrative departments in fundamental areas have been in a closed state. The information resources of administrative law enforcement and administrative law enforcement supervision should be integrated and shared, and real-time information, such as that on personnel and equipment, should be added to the existing system data to improve the overloaded vehicles database. Based on the BPNN model used in this study, the real-time information on personnel and equipment can be integrated to develop auxiliary programs or software to realize online inquiry by law enforcement departments and online supervision of law enforcement activities. Thus, all departments can inquire about the current overload rate and trend at any time, especially in centralized law management, and can reasonably determine the time and place of centralized law management with the help of prediction results and integrated information on manpower and equipment.
By checking changes in manpower, equipment, and other information, we can determine real-time actions for various departments, ensure business linkage, promote the linkage work of administrative law enforcement, further promote the governance of overloading, and improve the efficiency of law enforcement systems. The overload data mining of expressways in this research is limited to expressways in Jiangsu Province, and the overload data mining of national and provincial highways is limited to Liyang city, so the results are not universal. Furthermore, the spatial attributes, namely, "Entrance site" and "Exit site," were not analyzed. Our follow-up research will focus on a more comprehensive feature analysis and mining of the spatiotemporal law of overloading. Using geographic information system (GIS), global positioning system, spatial statistical models, and other advanced technologies, we aim to explore the spatiotemporal law of overloaded vehicles on highways, visualize it with the help of GIS, predict the destination area of overloading, overloading transportation volume, and overloading transportation routes, and conduct more in-depth analysis to reveal the existence of overloading of freight vehicles in highway networks in different spatial regions.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.