A New Hybrid Deep Learning Algorithm for Prediction of Wide Traffic Congestion in Smart Cities

Department of Computer Science and Engineering, SRMIST, Chennai, India Department of Computer Science, Government Bikram College of Commerce, Patiala, India Department of Artificial Intelligence & Data Science, AITS, Rajampet, India Institute of Computer Technology and Information Security, Southern Federal University, Rostov Oblast 344006, Russia Department of Information Technology, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia School of Electronics and Electrical Engineering, Lovely Professional University, Punjab 144411, India Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia


Introduction
The vehicular adhoc network is one among the puissant research applications in the intelligent transportation system (ITS) that furnishes the information to prevent or reduce the traffic congestion. For exchanging the information in a network, the vehicular adhoc network has vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication. When a conveyance directly communicates with other con-veyance in a network is V2V communication and when a conveyance directly communicates with roadside units (RSU), then, it is V2I communication [1]. The momentous standards of VANET are the dedicated short-range communication (DSRC) protocol, IEEE 802.11 [2], and wireless access in vehicular environment (WAVE) [3,4]. Delays due to traffic, traffic that leads to congestion, consumption of energy, and the emission of pollution are the disputable in traffic management for smart cities [5][6][7][8][9]. The traffic management must effectuate the smart system for parking, an intelligent system for vehicles in routing management, and an intelligent system that predicts the traffic [10][11][12][13][14][15][16].
In recent years, there is a higher death rate in road accidents which must be conquered to save the lives of people. User behavior, infrastructure, environmental factors, and mechanical error in roads are the important factors that cause accidents on the road [17,18]. Traffic congestion is one among predicaments that need to be mutated in the transportation system [19]. As stirring of population accelerated, there is an increase in the number of vehicles on road that steer to traffic congestion, accidents, and pollution [20].
Collision in traffics is caused due to bad traffic management, poor law enforcements, poor infrastructure, and failure of signals [17,21]. Averting of transportation fatalities ahead is an open aftereffect in vehicular traffic on highways, cities, and urban areas [22]. Ammunition on the road comprehends traffic monitoring and channeling that begets which consist of various technology riveting alert systems, digital maps. The vehicle active safety is a consequential part in collision warning systems [23].
The congestion can be minimized by identifying traffic jams, attaining the estimation of congestion levels, relaying the information about prevailing traffic state, and proposing new routes [24,25]. Hence, to reduce the congestion level of traffic, the methodology mandatorily needed to predict the traffic jams. Prognosticating the prevalence of crashes that pertains to the count of crashes jotted down for a unit of time at a concrete location is benignant in monitoring highways [26,27]. Evading auguring collisions will have high strike on reducing road concussion [28]. The demurrer in vehicular networks comprehends the vehicles' rapid stirring and communication disassociation and conjunctions [29][30][31][32][33][34].
The various techniques for predicting the traffic collisions in machine learning are sampling, regressions, correlations [35], clustering algorithms [36,37], k-nearest neighbor (kNN) algorithm [38], and artificial neural network (ANN) [39] are clobbered by the deep learning (DL) models in terms of accuracy in predicting the collision. CNN [40], transpose CNN [41], and long short-term Memory (LSTM) [42] are some of the deep learning techniques [43][44][45][46][47][48] used for predicting the collision [41]. The systematic random sampling ameliorates in getting the automobilist samples, samples of the commuter, and samples of arid for reducing the hazards of bias. Purposive sampling ameliorates in electing the respondents of traffic officers that cynosure on the authentic traffic officer inaugurates at the sedulous streets [35].
Congestion cluster furnishes the adverting amount of flexibility in disparate needs in applications. These clusters vary dynamically in the network. These clusters accomplish intracluster similarity to disport the analogous development of driving speeds in the road segment over time [36]. The kNN inaugurates in classifying the conditions of the traffic and imputes the advertence to the class for receiving the considerable vote among the neighbors. This method identifies the accident betides due to traffic by utilizing the condition of traffics and constraining the factors of environments [38]. The ANN substantiates in extracting the features and dredging the incidents that furnishes and smashes the warn-ing to the commuters and operator [39]. The features of images are extracted by a feedforward neural network called CNN by applying convolution operations. A conventional recurrent neural network called LSTM contains the cell state, the memory part, and three gates to predict the collision based on the time series sequence of images in traffic. The transpose convolutional neural network produces the predicted images of collision [41].
These deep learning algorithms produce the high spatial resolution that leads to the overfitting problem and disaccords the access and amalgamate in vehicle stirring patterns and conditions in traffic. Exploration of accidents at junctions must be included, and visualizing the emission and dispersion of traffic must substantiate in evaluating the realtime environment. This paper establishes the hybrid BLSTME and CNN for overcoming the overfitting problem and in predicting the traffic collision. This paper squarely fractionalized into five segments. The related work has been elaborated in Section 2. The proposed methodology in this paper has been deliberated in Section 3 along with the equation. Section 4 demonstrates the implementation of the model that is proposed and compared the accuracy with the existing models. Finally, this paper is concluded in Section 5.

Related Work
A systematic random sampling approach by Onyeneke et al. [35] supports in reducing the chance of bias by getting the samples of allonges, travelers, and pedestrians. The purposive sampling focuses in recruiting the respondents from the right traffic officers. The simple linear regression canvasses the relationship to place the dependent and independent variables in data. Based on the casualties that are intended as a dependent variable and an independent variable is the number of registered vehicle; the future values are interpreted. This model examines the independent effects of manufacturing and enables the concrete absorption of causes and effects of congestion in traffic. The model fails in evaluating the accuracy, precision, and recall.
Wang et al. [49] studied the strike of congestion in traffic by using the spatial analysis technique for finding the frequency of accidents in the road. Poisson-lognormal, Poisson-gamma, and Poisson-lognormal with car priors for firstand second-order neighbors are the models to inquire the relationship amide congestion of traffic and the prevalence of distinct accidents on road. Accidents can be mapped to the veracious motorway segment, and the congestion index is evaluated to reckon the segment-level congestion of traffic. The Poisson-lognormal and Poisson-gamma models test the heterogeneity effects and exclude the spatial correlation effects, but the Poisson-lognormal car model holds up the effects of heterogeneity and spatial correlation. These models are consistent and confronted that congestion on traffic has no strike on the frequency of accidents. Exploration of congestion effects at junction on roads is required. The analysis is made by containing only the road segments in London from the M25 motorway.
Hao et al. [50] develop a system that conveys some intimation to the operators for transportation and officials of  [36] determine the congestion clusters that furnish the significant amount of flexibility for different applications by the clustering algorithm technique. A congestion cluster is identified by dynamic congestion pockets and construction of static congestion clusters. In the case study of the Munich road network in Germany, the clusters that are the static road network of Munich and discriminating amide days of regular and irregular are taken for cluster congestion analysis. The resulting cluster countenances in identifying the weekdays that do not bear systematically. Reckoning the times and variance of the congestion and quantifying the distinct clusters for correlating the congestion behavior are obtained. This model postulates the implementation and testing in an online traffic forecast system.
The CNN model outperforms the other deep learning algorithm as it inaugurates in furnishing the prognostication of flow in traffic by prying the features of traffic images and classifies the data of traffic based on any feature from traffic data. The convolutional and pooling layers are the two important layers that inaugurate in learning the feature representation of the input traffic images. The LSTM model trains and tests the feature and solves the dematerializing and detonating of the gradient problem in training the neural network.
Song et al. [51] aim to prognosticate the traffic speed and analogize the performance with the existing prediction models by exploiting the CNN. The CNN captures the local dependencies of data and is lesser inclined to clatter in traffic data. This method requires five input layers where one input layer is for furnishing the temporal data and the remaining layers are for the speed profile of links one, two, three, and four. Attaining the local dependencies and capitalizing on the strong relationship for proximate data or nodes in the convolutional layer catenates to a fraction of nodes in the antecedent layer. This algorithm serves to attain the local dependencies and is less sensitive to noises in data. For outperforming the existing models, there is a need for multiple submodels.
Hebert et al. [52] creates the high-resolution accident prediction model for prognosticating the circumstance of an accident within hours on segments of roads delineated by intersections through exploiting big data analytics. Big data analytics is an intended approach that permits data scientists for prying the significant information from large amount of heterogeneous and complex data. The balanced random forest algorithm exploits in amending or sampling the imbalance of data, and several machine learning algorithms like decision trees, artificial neural networks, and Bayesian networks abet in prognosticating the circumstance of road accidents. By exploiting the features and parameters such as weather attributes, attributes from arterial segment, and attributes from date and time in the dataset, the circumstance of road accidents can be successfully prognosticated. Various features like location and date of erection work on roads and population density must be added to the dataset for exceling the performance.
The hybrid multimodel deep learning framework (HMDLF) by Du et al. [53] is aimed at forecasting the traffic flow. This model incorporates gated recurrent units (GRU) and one-dimensional CNN for attaining the features of correlation amid drifts and elongate dependencies of any one modal traffic data, by incorporating the CNN with GRU delves and ascertaining the deep nonlinear correlation attributes of multimodal input data. The end-to-end multimodel communes the traffic sequential data processing framework that rivets on features of spatial locally, features that have long dependency, and correlations of spatial-temporal. The CNN-GRU dopes out the traffic flow auguring issues by ascertaining the long temporal dependencies and features of spatial-temporal correlation for determining the correlation between speed flow journeys' time in multimodality traffic data. Recasting in number of vehicles at the advertence point is awaited. Encompassing the ascertaining of time series precipitate, bouncing match with error tolerance, and spatial and temporal interdependence of multimodality input data are exploited. Collecting potent traffic data in a short epoch of time is a hindrance. The information that was congregated from the highways of England is traffic flow, speed, and passing time as they face rigor by traffic fatalities and ultimate weather events. Building a potent model for prognosticating the traffic abundance based on features that effectuate the hidden insights in vehicular stirs is the intent of Moses and Parvathi [54]. The author exploits the support vector regression that maps the input using nonlinear mapping on m-dimensional features. The mean square error approach estimates the performance by scaling the average squares of errors. The linear regression model erects in scaling the relation between scalar response and independent variables. The decision tree learning algorithm reckons the entropy or information gain. Efficient in identifying the optimal model to the open data that are available is the biggest profit. This model is generic; hence, integrating with existing agencies for doping out the traffic knot in real time is arduous.
Bang and Lee [25] redict the awaited position in stirrings and direction of each conveyance for avoiding amalgamate or access of collision between vehicles. The vector-based mobility prognosticate model in the TDMA-based VANET avoids the collision by apportioning the time slots and prognosticates the mobility of proximate vehicles through exploiting the habitation information of the control time slot, vehicle ID, direction of the vehicle stir, hop information, and latitude and longitude of a vehicle. The gain in performance of the algorithm is amended in the road ambient where the firmness of the traffic is high and the conveyances have high stirring and recasting the directions for travel constantly. Access and amalgamate collisions betide due to vehicle stir patterns and the condition of traffic.
Wei et al. [55] steer in ameliorating the prognosticate accuracy in the flow of traffic. In the autoencoder long short-term memory (AE-LSTM) approach, the autoencoder endorses the internal accordance of the flow in traffic by plucking the characteristics of the stream data in the traffic flow. The LSTM network cannibalizes the attained characteristic data and the historical data to prognosticate the baroque linear data in the traffic flow. This approach is arduous to implement and has a sober applicability. It also furnishes the exalted performance in prognosticating the traffic flow. But the strike is this study only esteems the time patterns and simple spatial patterns.
Sellami and Alaya [56] inquest the unpredictable density of conveyance and also furnish the attestation in the determinate load balance and other resources attainable between the distinct VANET networks for conveyances. The self-adaptive multikernel clustering for urban VANET (SAMNET) approach is postulated on a designated data that can be measured by depicting the ambivalent density of conveyance nodes, deceleration, acceleration, and bounded radio ranges for communication. It undergoes three stages, and they are the initialization stage of clusters, adaption stage of clusters, and fusion stage of clusters. It poses preeminent resultant alluded to the other distinct algorithms for any densities of traffic in the urban environment with the deduction of the arrant transmitted packets that was not unanimous at the destination. The incommodity in the adaption of the proposed algorithm parameters to concede the wielding of SAM-NET is more complex in the road for distinct scenarios, and for optimizing the performance in distinct scenarios, there is bearing of the machine learning technique.
Ranjan et al. [41] redict the congestion level of a transportation network by integrating the CNN, transpose CNN, and LSTM. The convolutional encoder as a spatial feature extraction network encodes the input image into a lowresolution latent state. The temporal or time series information on data is ascertained by a recurrent network hackneyed as long short-term memory. The reconstruction network postulates the convolutional decoder and transposes operation on data by the transpose convolutional neural network to consequence the predicted image. The PredNet and ConvLSTM models attain the towering accuracy, precision, and recall in predicting the traffic congestion by associating the spatial and temporal features. For learning the background area, the huge number of resources and computing time is debilitated [41].
By incorporating the CNN and BLSTME models, the prognostication of the traffic flow is acquired. The proposed model does not necessitate higher resource that may lead to higher computational time [41]. The hybrid does not furnish multiple submodels of CNN for extracting the features that reduces the imbalance in collecting the data [53], and BLSTME is integrating the LSTM. The AdaBoost algorithm is used for strengthening the weak classifiers that resolve the overfitting problem with stir patterns of vehicles [25] and fur-nish higher performance in prognosticating the flow of traffic in real time [54] with higher density of population [52].
The recent survey on VANET is tabulated in Table 1. The gap diagnosed from the above survey table is optimizing the performance and arduous in inferring the traffic problem in real time that is conquered by our proposed system that combines the CNN and BLSTME.

Methodology
Section 3 includes the proposed architecture for predicting the traffic congestion. The proposed hybrid incorporated CNN and BLSTME models prognosticate the traffic flow. The features of input traffic images are extracted by CNN, and the extracted features are trained based on the classes for prognosticating the traffic flow by BLSTME through strengthening the weak classifiers.
3.1. Proposed Architecture. A high spatial resolution is produced by the long short-term memory (LSTM) technique. Hence, to avoid this problem, a hybrid deep learning algorithm BLSTME-CNN is proposed, and the architecture is shown in Figure 1.

Convolutional Neural Network.
The CNN has the pulverized adroitness in the representation of a feature of an input image with nonpareil aspects as local connectivity to the neuron and sharing of the weight. The layers of CNN are the convolutional layer that learns in representing the feature of the input image and pooling layer that accomplishes the shift invariance. In the convolutional layer, the neurons will receive the inputs from its previous layer's neuron of the local group for the output layer. The distinct feature representations were erudite by convoluting several kernels from the previous layer. The convolution layer is incurred by equation (1) [41].
Equation (1) infers the f th activation map of the l th convolution layer which is denoted by y l f , k th activation map of ðl − 1Þ th layer is represented by y l−1 k , and W1 l kf and b1 l f refer to the weight that connects the f th activation map of the l th layer at position k. The several filters in l th layer can be represented by f l , and the elementwise nonlinear activation function is signified by [41].
The spatial size of the activation map can be subdued by the pooling operations, but these operations possess the vital information. y l f ði, jÞ in equation (2) [41] can be obtained by coiling the output of previous layer with the size ðm, nÞ in the convolution filter and touching bitwise nonlinear activation is imparted [41]. a1 and b1 is kernel location.
Wireless Communications and Mobile Computing The convolution layer supervened by the location of the f th activation map of l + 1 th pooling layer, by coiling the outcome of the previous layer with the filter of size ð2, 2Þ; y f l+1 ði, jÞ is obtained, and then, the bitwise nonlinear activation is applied and postulated in (3) [41].
3.3. Long Short-Term Memory. LSTM has been extensively used in many fields such as in generating music, captioning images, recognition of speech, and machine translation for improving the hidden layer cell on the basis of the recurrent neural network (RNN) [55]. The network consists of a cell to commemorate the values aloft the time intervals from LSTM memories and the gates [57,58]. The LSTM network is the RNN that consorts with LSTM units which is paraded in Figure 2. Figure 2 reminisces the output for the hidden layer as h t , preceding output as h t − 1, input of a cell, and output and preceding state as C t , G t and G t − 1, respectively. J t , T f , and T o are      Wireless Communications and Mobile Computing three gate states in the network. LSTM cells G t and h t are calculated by evaluating the three gate states and cell input state and can be transmitted to the next neural network [57]. The input gate is given in equation (4) [57].
The output gate is calculated by using equation (6) [57].
The cell input is given in equation (7) [57]. f The matrices of weight are G 0 l , G f l , G i l , G C l connected to the input gates of the output layers and are the weight matri- Get the hypothesis with error function with respect to Dk 6.
Error function is calculated at each stage which is then weighted Dk 7.
Calculate the error function and repeat step 4 10. If error is less than e k 11. Then, ensemble all the outputs HðxÞ = HðxÞ = sign ð∑α k T k Þ/α k 12. Else 13. Go to step 4 14. End 15. End  The calculated output state of a cell is given in equation (8) [57].
The calculated hidden layer output is given in equation (9) [57].
The number of concatenated cells designates the number of observations of the data that are regarded before making the prediction. Generally, more layers of LSTM cells are strong in predicting the collision but induce the overfitting problem. The boosted LSTM ensemble approach solves the problem by boosting LSTM for an effective traffic flow prediction.

AdaBoost Algorithms.
In the AdaBoost approach, hybrid ensemble learning algorithms are established by integrating the LSTM networks with AdaBoost learning algorithms for strengthening the weak classifiers. Normally, by updating its weights, the AdaBoost algorithm strengthens the weak classifiers until classification or prediction accuracy obtains a maximum value [59]. The proposed model is a strong model where each weak classifier satisfies the performance.
3.5. Boosted LSTM Ensemble. In this approach, the hybrid neural network aggregates BLSTME and CNN to reduce overfitting for the prediction of traffic congestion. The CNN extracts the features of the image, and the feature is trained using BLSTME. The pseudocode and mathematical expression of BLSTME are given below.

Wireless Communications and Mobile Computing
By the expression α k , the network parameter has been calculated and is given by equation (12): e k and the final ensemble boosted output is calculated for every iteration when error is zero, and the mathematical expression is given by equation (13) The complete pseudocode for the proposed BLSTME is rendered below.

Results and Discussion
For a potent perpetration of prognosticate models in the networks, the real-time data are congregated from the arterial network of Seoul city. To utensil the real-time data congregate mechanism, we have incorporated SUMO platforms that run in the OMNeT++ environment. The separate python API has been developed to interface with data collection unit which runs on the SUMO-OMNeT++ platforms to utensil the continuous simulation. However, the simulation analysis has been done using Python Tensor Flow API running on the Intel i3. Figure 3 flourishes the real-time scenario of the arterial system and the vehicles are surveilled in the SUMO. They are alchemized into vehicular nodes by catenate effectuated using the C++ programming for an apparent perpetration in OMNeT++ for the foster annotations and modeling.  Table 2.
From Table 2, it is inferred that the density value is based on the pixel value of the input image. If pixel values get increased then automatically density values of an image get increased. Based on the density, the classes are classified.

Wireless Communications and Mobile Computing
When the density ranges from 1 to 3, it comes under Class 0. When the density ranges from 4 to 6, it is Class1, and if the density ranges higher than 6, then it is Class 2.
Based on the traffic density, the different image data are gathered and the image dilation and image thresholding are performed. The frames are converted into gray scale, and the images are plotted after frame differencing. The density is obtained by calculating the horizontal and vertical edges by using Prewitt kernel as shown in Figure 5. The vehicle detection zone and the contours of the vehicle in the road network are shown in Figure 6. Figure 6(a) expresses the vehicle detection zone, and Figure 6(b) denotes the contours.
The convolutional neural network extracts the features from the input images, and the results are represented in Table 3.
The total parameters computed in CNN from the input data is 2,273,706. The trainable and nontrainable parameters for training the dataset in the network are 2,272,362 and 1,344 from the total parameters, respectively. The features that are extracted from the CNN are trained by BLSTME. It has higher computational load by handling 2,272,362 trainable parameters.

Performance Analysis.
The proposed predicted BLSTME model is analyzed based on the performance metrics. The performance standards of the proposed DL algorithm are calculated, and the parameters such as accuracy, precision, and recall are applicative and estimated in training datasets and by using equations (14), (15), and (16) The performance analysis of the proposed BLSTME-CNN and the existing models such as the autoencoder, convolutional long short-term memory (ConvLSTM), and Pre-dNet are tabulated in Table 4. The data of the existing prediction models such as the autoencoder, ConvLSTM,   and PredNet are collected from [41,60] for the statistical analysis of performance. The performance analysis of distinct performance metrics is evaluated for the various prediction models such as the autoencoder, ConvLSTM, PredNet, and proposed BLSTME-CNN and is represented in Figure 7. Figure 7 flaunts that the existing autoencoder model has 0.74 precision value, ConvLSTM has 0.86 precision value, and PredNet has 0.86 precision value. The proposed BLSTME-CNN model has 0.96 precision value that is 10% higher than the autoencoder, ConvLSTM, and PredNet. Figure 7 flaunts that the existing autoencoder model has 0.71 recall value, ConvLSTM has 0.78 recall value, and Pre-dNet has 0.85 recall value. The proposed BLSTME-CNN model has 0.94 recall value that is around 10% higher than the autoencoder, ConvLSTM, and PredNet. Figure 7 flaunts that the existing autoencoder model has 0.75 accuracy value, ConvLSTM has 0.82 accuracy value, and PredNet has 0.86 accuracy value. The proposed BLSTME-CNN model has 0.98 accuracy value that is around 10% higher than the autoencoder, ConvLSTM, and PredNet.

Conclusions and Future Work
The hybrid deep learning model is evolved by assimilating the CNN and BLSTME. The models can apprehend effectively based on the relation of both the temporal and spatial of the input images. It inaugurates in prognosticating the congestion of traffic for traffic management in smart cities that reduces delays which occurred by traffic, consuming energy, and travel management for passengers. By prognosticating the flow of traffic, the recurring and nonrecurring congestion of traffics are directed by smart traffic management by computing the density, calculating the edges, and frame differencing. Incorporating the CNN and BLSTME inaugurates for smart city traffic management by thresholding, dilation, contours, and detecting the vehicle zone. The CNN method extracts both spatial and temporal features from the traffic images, and BLSTME trains the features and strengthens the weak classifiers for predicting the traffic flow. Our proposed model is analogized with the existing models such as the autoencoder, ConvLSTM, and PredNet for predicting the traffic collision. The proposed model BLSTME-CNN achieves more than 10% higher accuracy, precision, and recall in predicting the collision than the existing models by strengthening the weak classifiers. Therefore, the proposed BLSTME-CNN algorithm produces the higher performance and computational efficiency in predicting the congestion. The future direction of our research work is to propose a hybrid incorporation of predictors with the attainment during collision in the network.
Another subject worth mentioning is real-time prediction. It plays an important role in modern cities and puts greater demands on the capacity of available prediction methods to forecast in real time. In the future, we will consider a more complex model architecture, especially for modeling temporal closeness, cycles, and patterns, in order to better capture temporal dependencies. We will also look at how to deal with sparse spatial traffic flow matrix inputs in order to minimize training time and maintain topological relationships.
Future research will concentrate on using larger datasets, exploring different combinations of flow, occupancy, speed, and other road traffic characteristics to improve prediction accuracy, improving prediction methodologies and analytics, using various types of road traffic datasets, fusing multiple datasets, and using multiple deep learning models.

Data Availability
The data used to support the findings of this study are available from the author upon request (gdhiman0001@gmail.com).

Conflicts of Interest
The authors declare that they have no conflicts of interest.