Data-Driven Approach for Modeling the Nonlane-Based Mixed Traffic Conditions



Background
Understanding the traffic performance of a road section is vital for its effective utilization. In this direction, traffic flow modeling concepts have proven to be an efficient means of gauging network elements' performance at microscopic and macroscopic levels. Over the years, researchers have conceptualized numerous traffic flow models for understanding the performance of road networks. Primarily, these include car-following models, such as Pipes [1], General Motors [2], Gipps [3], and Wiedemann [4], among others. To a certain extent, these concepts perform satisfactorily and can replicate the traffic characteristics.
Further, in this direction, researchers recognized the importance of driving behavior in traffic modeling and made significant efforts to incorporate human factors into car-following models, such as weather conditions [5], driver's perspective [6], fatigue driving [7], and anticipation [8], among others. By incorporating the human factors and the car's performance, the performance of the following models is found to increase. On the other side, researchers also understood the importance of the lateral behavior of vehicles in traffic streams. Numerous lane-changing models have been conceptualized for capturing the lateral behavior, including research studies [9][10][11], and are able to model the lane-changing movement of the vehicles. With advancements in technology and the availability of computational tools, numerous traffic microsimulation tools [12] have been developed by embedding the car-following and lane-changing models for different road geometry and vehicular characteristics.
This includes a few examples, such as PTV VISSIM [13], AIMSUN [14], PARAMICS [15], and SUMO [16]. It is well established that these microsimulation tools advanced traffic modeling studies considerably and helped to uncover numerous concepts. With the development of NGSIM [17], high-quality trajectory data and driver behavior aspects have further strengthened traffic flow modeling for evaluating policy interventions more comprehensively. Further, in recent times, researchers [18][19][20] tested deep learning and reinforcement learning strategies for improving mixed traffic flow efficiency levels.
It can also be noted that most of the literature mentioned above entirely belongs to a homogeneous traffic environment with lane-based traffic conditions. On the other hand, in the case of mixed traffic conditions with poor lane discipline, traffic streams can result in complex spatial interactions among the vehicles in longitudinal and transverse directions. In this direction, very few studies have been carried out to assess different traffic characteristics, especially driving behavior (vehicle-dependent). Furthermore, researchers applied the concepts mentioned above under mixed traffic conditions, which includes studies such as microsimulation [21], modeling traffic flow on expressways [22], calibration of car-following models [23], to name a few. To a certain extent, these strategies can perform better at the macroscopic level, but their performance at the microscopic level (vehicle-vehicle interaction) is questionable. Many studies [24,25] conducted under mixed traffic conditions reported the predominant lateral movement of vehicles.
In mixed traffic conditions with poor lane discipline and different vehicle categories, both longitudinal and lateral movements of vehicles can be observed simultaneously, whereas under homogeneous traffic conditions, lane-changing maneuvers are discrete. This significant difference in driving behavior can be regarded as one of the main reasons for the limited performance of established homogeneous traffic concepts in mixed traffic conditions. Under mixed traffic conditions, due to the predominance of lateral movement (driving behavior), the subject vehicle's movements are influenced by the presence of surrounding vehicles. As a result, numerous parameters that are not accounted for in traditional following behavior and lane-changing models affect the vehicles' movement. Furthermore, in recent times, researchers [26,27] highlighted the intricacies of nonlane-based mixed traffic conditions. Given the variation in physical properties, vehicles by virtue of their size occupy any available space, which can affect the driving behavior of the surrounding vehicles.
From the literature [28][29][30], it is inferred that machine learning tools have proven productive in understanding complex data patterns when supported by quality data. Yu et al. [31] tested Fixed Radius Near Neighbors (FRNN) for modeling longitudinal car-following behavior and strongly advocated the use of data-driven approaches in traffic modeling. After identifying the research gaps, it was decided in the present work to explore the performance of machine learning algorithms for modeling mixed traffic flow at both microscopic and macroscopic levels. Hence, this study focuses on employing machine learning tools from the branch of artificial intelligence to model this complex vehicular behavior using microlevel trajectory data. Finally, the performance of the selected algorithms is evaluated thoroughly at different levels, including microscopic and macroscopic level comparisons. Further, in recent times, researchers [32] have strongly advocated the importance of reproducible research in transportation engineering. In this direction, the trained models in this study can be used or improved with other data sources, which supports reproducible research in the domain of traffic modeling.
Given the limitations of the traditional following and lane-change models, this paper aims to model mixed traffic conditions with machine learning algorithms. The study consists of the following main tasks:
(i) Develop vehicular trajectory data to capture the study section's driving behavior using a semiautomated image processing tool
(ii) Explain mixed traffic conditions using an example and mark the surrounding vehicles, which can influence the vehicles' movements
(iii) Identify the parameters which influence the longitudinal and lateral movements using correlation analysis
(iv) Train three supervised algorithms and one deep learning algorithm with the parameters, such as dependent longitudinal and lateral speeds with independent settings
(v) Conduct simulation runs based on the trained algorithms, evaluate the algorithms' performance, and present the results with some meaningful insights

[34], to limit the noise in trajectory data, smoothening techniques were applied, and the details of traffic states, for which trajectory data is developed, are given in Table 1. More details about the data used in the present work can be found in the authors' previous studies [35,36]. The selected study section's snapshots and developed time-space plots of vehicles observed during real field conditions on the western expressway are depicted in Figure 1. Based on the videographic surveys, broadly six vehicle categories are found on the selected roadway study section: Motorized three-wheelers (MThW), Motorized two-wheelers (MTW), Buses, Cars, Trucks, and Light commercial vehicles (LCV).

Overview of Mixed Traffic Conditions.
To explain the mixed traffic conditions in a better manner, an example is presented in Figure 2. The movement of the subject vehicle (an MTW in Figure 2) is depicted for different time frames in (a) through (h). It can be observed that the subject motorized two-wheeler (marked with a yellow star) is largely influenced by its surrounding vehicles, and it tends to maneuver out from among them to move more freely and avoid delays. It may be noted that the MTWs occupy any available position on the roadway space based on the simultaneous availability of adequate longitudinal and lateral space. Due to this nature of traffic movement, even the most established following behavior and lane-changing models from homogeneous traffic conditions tend to underperform under mixed traffic conditions (as depicted in Figure 2). In general, most vehicle-following behavior models predict the acceleration ($a$) and speed ($v$) of the follower vehicle as a function of several variables related to its leader vehicle, as follows:
$$(a, v) = f(s, \Delta v, s_{\min}, V),$$
where $s$ = relative spacing, $\Delta v$ = relative speed, $s_{\min}$ = minimum spacing, and $V$ = desired speed.
On the other hand, in mixed traffic conditions, the subject vehicle is primarily influenced by its surrounding vehicles present in the traffic stream. Hence, the subject vehicles' movement can be governed by other added parameters discussed in the following sections.

Assessing Driving Behavior.
Considering the vehicles' naturalistic movement in the traffic stream, the subject vehicle's longitudinal and lateral movement is mainly dependent on its surrounding vehicles. In the present work, the surrounding vehicles for a given subject vehicle are first identified using trajectory data. In line with the literature [37,38], a surrounding zone is created by adding a 40 m distance in front (look-ahead) and a 30 m distance behind (look-back) from the center of the subject vehicle, giving a total longitudinal distance of 70 m that forms the longer side of a rectangle (Figure 3). A lateral distance of 5.5 m from the center position of the subject vehicle to the center position of the surrounding vehicles, including the total width of the subject vehicle (with an overlap of width), is considered over the entire road space (in longitudinal and lateral directions over time), as depicted in Figure 3. As per the developed logic, eight combinations of surrounding vehicles are possible for the subject vehicle.
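The zone test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Vehicle` record, its field names, and the function `surrounding_vehicles` are hypothetical and are not taken from the authors' code, and the sketch checks centre positions only, without the vehicle-width overlap mentioned in the text.

```python
from dataclasses import dataclass

LOOK_AHEAD_M = 40.0   # longitudinal distance in front of the subject centre
LOOK_BACK_M = 30.0    # longitudinal distance behind the subject centre
LATERAL_M = 5.5       # centre-to-centre lateral distance


@dataclass
class Vehicle:
    """Hypothetical trajectory record at one time instant."""
    vid: int
    x: float  # longitudinal position of the vehicle centre (m)
    y: float  # lateral position of the vehicle centre (m)


def surrounding_vehicles(subject, others):
    """Return the vehicles whose centres fall inside the surrounding zone."""
    inside = []
    for v in others:
        if v.vid == subject.vid:
            continue
        dx = v.x - subject.x   # positive when v is ahead of the subject
        dy = v.y - subject.y   # lateral offset, sign convention not assumed
        if -LOOK_BACK_M <= dx <= LOOK_AHEAD_M and abs(dy) <= LATERAL_M:
            inside.append(v)
    return inside
```

Each vehicle returned by such a filter could then be assigned to one of the surrounding positions (leader, follower, and so on) from the signs of `dx` and `dy`; that assignment is left out here because the exact rules are not given in the text.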
Based on the literature [39,40] from both homogeneous and mixed traffic conditions, the parameters that influence the longitudinal and lateral speeds are identified. On these lines, around 16 independent parameters are identified for longitudinal speeds and another 22 independent parameters for lateral speeds, as shown in Table 2, along with a brief description. Using the developed trajectory data sets and Python code, surrounding vehicles are identified, and all the mentioned parameters are evaluated for each vehicle at every instant of time.
Later, to identify the influential parameters for longitudinal and lateral movement, the Spearman correlation [41] test is performed between the vehicles' instantaneous longitudinal speeds and the 16 parameters that may influence the longitudinal movement. Similarly, the test is performed between the instantaneous lateral speeds of the vehicles and the other 22 parameters that may influence lateral movement. The parameters identified are presented in Table 3, along with their brief description. After correlation analysis, it is observed that with a change in traffic flow conditions, the correlation values differed substantially, and, in some cases, even the sign of the correlation varied. For example, in the case of longitudinal movement, the sign of the correlation varied with flow level for parameters such as lateral tilt (long_8), the lateral gap with adjacent vehicle (long_10), present lane (long_12), presence of left leading vehicle (long_13), and presence of right leading (long_14). It can be noted that the parameters mentioned above are mainly related to the lateral gap. It is inferred that with the rise in flow levels, the vehicles' longitudinal movement is constrained. As a result, vehicles tend to find lateral gaps for better maneuverability, particularly at flow 2 and flow 3 (higher traffic flow levels). Due to this, the correlation is found to vary with a change in flow levels.
On the other hand, parameters such as right longitudinal gap (long_7), angle of seeping (long_9), the lateral gap with left adjacent (long_10), lateral gap with right adjacent (long_11), presence of right leading (long_14), TTC (long_15), and Smin/S (long_16) are found to be sparsely active under the present traffic conditions for the longitudinal movements, whereas parameters such as leader presence (long_1), relative distance (long_2), relative speed (long_3), leader vehicle category (long_5), left longitudinal gap (long_6), lateral tilt (long_8), present lane (long_12), and presence of left leading (long_13) are found to play a governing role in the longitudinal movement of the vehicles.
From the correlation analysis of the instantaneous lateral speed with the 22 lateral influencing parameters, it is found that, unlike in the longitudinal movement correlation analysis, the sense of the parameters (positive/negative correlation values) is similar for all flow conditions. Parameters such as present lane (lat_6), left front vehicle (lat_7), right front vehicle (lat_8), left back speed (lat_11), lateral tilt (lat_16), distance from left back (lat_17), distance from right back (lat_18), left longitudinal gap (lat_21), and right longitudinal gap (lat_22) tend to have a good correlation with lateral speeds. Based on the correlation analysis, it is observed that in the case of flow 2 and flow 3 (higher flow levels), where longitudinal movement is constrained, vehicles tend to move laterally to escape the delay in the traffic stream. As a result, parameters associated with lateral gaps are better correlated with instantaneous longitudinal speeds. Similarly, parameters associated with longitudinal gaps are better correlated with instantaneous lateral speeds. In most of these cases, it may be noted that the magnitude of the correlation is within 0.5. Given the stochastic nature of driving behavior, in line with the literature [42,43], this can be treated as an acceptable correlation.
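As a reminder of what the Spearman test measures, the following minimal sketch computes the rank correlation as the Pearson correlation of the ranks, with tied values sharing their mean rank. The function names are illustrative, and the code is a plain-Python stand-in rather than the routine used in the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, the coefficient captures monotonic rather than strictly linear association, which suits parameters such as gaps and presence indicators that need not vary linearly with speed.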

Machine Learning Modeling
In line with the work objectives, an attempt is made to model the mixed traffic flow conditions. From the literature (Evan Lutins, 2017), it is inferred that numerous car-following behavior models have been conceptualized. The following behavior models stood out for homogeneous traffic conditions and proved their potential in traffic flow modeling. On the other hand, in mixed traffic involving different vehicle categories and a lack of lane discipline, the vehicles' spatial interactions are more involved. Even from the correlation analysis, it is identified that numerous parameters affect the longitudinal and lateral movement of the vehicles. Considering this, modeling mixed traffic conditions with established car-following and lane-changing models from homogeneous traffic conditions may not be prudent.
To overcome this challenge in the present work, machine learning from artificial intelligence is considered for modeling the mixed traffic. Three established supervised machine learning algorithms, namely k-NN, random forest, and regression tree, are selected. Along with that, deep learning is also explored for modeling mixed traffic conditions.

k-NN Algorithm.
In general, the k-NN algorithm (Min-Ling Zhang & Zhi-Hua Zhou, 2005) works on the principle of pattern recognition and learns the data patterns. To better explain this, let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be the data points from a sample space $S$ that belong to two classes, Class-I and Class-II. Let $(x_t, y_t)$ be the point of interest from another sample space, whose class is to be identified. To identify the class of this point, k-NN adopts the nearest neighbor approach. For example, say in the present case k-NN adopts three neighbors. Initially, the three nearest neighbors are identified by means of Euclidean distance. Further, the class of the data point $(x_t, y_t)$ is predicted as the majority class of those neighbors. On these lines, the performance of the algorithm can be improved by changing the number of neighbors and the distance measure. With the trial and error strategy adopted in the present study, five nearest neighbors are found to be the optimized value.
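The nearest-neighbor vote described above can be written compactly. The sketch below is a generic illustration with Euclidean distance and majority voting; the function name `knn_predict` and the toy points are made up for the example and do not come from the study data.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the class of `query` by majority vote of its k nearest neighbours."""
    distances = [
        (euclidean(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data: two well-separated groups (illustrative values only).
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["Class-I"] * 3 + ["Class-II"] * 3
```

Calling `knn_predict(points, labels, (2, 2), k=3)` votes among the three closest points, all of which belong to Class-I in this toy set. Changing `k` or swapping `euclidean` for another distance measure corresponds to the tuning mentioned in the text.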

Random Forest.
In general, random forest [44] learns the data by constructing a multitude of decision trees. Machine learning models produce target outcomes in the form of categories with associated probabilities. If the dependent variable is categorical, the category with maximum probability is given as the outcome. On the other hand, if the target outcome is a continuous variable, the outcome is predicted by means of a weighted mean approach. To explain the basic framework of the random forest algorithm in a better manner, let us consider $N$ classes with $M$ input variables or features. A number $t$ is specified ($t < M$) such that at each node, $t$ variables are selected at random out of $M$. The best split on these $t$ variables is used to split the node. The value of $t$ is held constant while the forest is developed. Further, each tree is grown to the largest extent possible.
Let the training set be $X = x_1, \ldots, x_n$ with responses $Y = y_1, \ldots, y_n$. Bagging repeatedly ($N$ times) selects a random sample with replacement from the training set and fits trees to these samples. For each sample $t = 1, \ldots, N$, training examples are selected from $X$, $Y$ as $X_t$, $Y_t$, and by means of the random forest framework a tree $f_t$ is trained on $X_t$, $Y_t$. After training, the prediction for an unseen sample $x'$ can be given as follows:
$$\hat{f}(x') = \frac{1}{N} \sum_{t=1}^{N} f_t(x'),$$
where the number of trees and the number of variables tried at each node are independent parameters that can be optimized using different cross-validation strategies.
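The bagging step (sample with replacement, fit a tree to each sample, average the predictions) can be illustrated with a deliberately small example. The sketch below is an assumption-laden simplification: it uses one-split regression trees on a single feature, so the random selection of variables at each node is not shown, and all names (`fit_stump`, `bagged_trees`, `predict`) are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) on a single feature."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
        # Sum of squared errors of the two leaf means.
        sse = (sum((y - mean_l) ** 2 for y in left)
               + sum((y - mean_r) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, mean_l, mean_r)
    if best is None:  # all x identical: predict the mean everywhere
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, threshold, mean_l, mean_r = best
    return lambda x: mean_l if x <= threshold else mean_r


def bagged_trees(xs, ys, n_trees=15, seed=0):
    """Fit one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees


def predict(trees, x):
    """Average the individual tree predictions."""
    return sum(tree(x) for tree in trees) / len(trees)
```

The default of 15 trees mirrors the number of trees reported later for the random forest training; the seed is fixed only so that repeated runs give the same sample draws.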

Regression Tree.
Decision tree learning [45] is a predictive model in which the decision tree is framed as branches (inputs) and leaves (outputs), and the decision variable is categorized into subsets. A tree can be learned using recursive partitioning, in which the training data is split into subsets until the training data is matched with an observed target value. This process of Top-Down Induction of Decision Trees (TDIDT) [46] is generally applied in developing decision trees. The independent variable that best splits the target variable is identified, and on this basis, the rule to split the node is selected. The same process is repeated until all the target values are sorted to one of the nodes.
Further, every branch of the decision tree terminates in a target value. Each target falls into one and exactly one terminal node, and each terminal node is uniquely defined by a set of rules [47]. Based on the class of the output decision variable, decision trees are classified as classification trees and regression trees.
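Further, regression trees employ Gini impurity [48] as a measure to check the accuracy of the tree labeling. The Gini impurity is the sum, over all classes, of the probability $p_i$ of a data point with class $i$ being chosen times the probability $\sum_{k \neq i} p_k = 1 - p_i$ of error in selecting the class.
The Gini impurity is given by the following:
$$I_G(p) = \sum_{i=1}^{J} p_i \sum_{k \neq i} p_k = \sum_{i=1}^{J} p_i (1 - p_i) = 1 - \sum_{i=1}^{J} p_i^2,$$
where $J$ is the number of classes.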
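For a concrete feel of the measure, the sketch below evaluates the Gini impurity as one minus the sum of squared class proportions, and the size-weighted impurity of a candidate split. The function names are illustrative and the labels in the usage note are toy values.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(left, right):
    """Size-weighted Gini impurity of the two children of a split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
```

A node holding a single class has impurity 0, while a node split evenly between two classes has impurity 0.5. A tree-growing routine would compare `split_impurity` across candidate splits and keep the one with the lowest value.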

Deep Learning.
Typically, deep learning is modeled on the architecture of neurons in the human brain. Similar to the way electrical signals travel across living cells, each subsequent layer of nodes is activated when it receives stimuli from its neighboring neurons. Given this, the accuracy of deep learning model predictions can be increased with the right amount of training data. A deep learning model has three kinds of layers, namely the input layer, hidden layers, and output layer, as shown in Figure 4. Specifically, the input layer is provided with the input vector $x_1, x_2, \ldots, x_n$, which is to be mapped to the final outcomes in the output layer. To do so, the input data is filtered through a series of hidden layers. The hidden layers are sandwiched between the input and output layers.
The deep learning models are developed in Python [49] using the Google TensorFlow [50] library environment. The input parameters and the output velocities are mapped over deep learning models with numerous combinations of hidden layers, neuron activation functions, and numbers of epochs. Applying a trial and error approach to limit overfitting, three hidden layers with 128, 64, and 16 nodes were adopted for the present case, with 250 epochs, using sequential modeling [51]. ReLU [52] activation functions are used in all layers except the final SoftMax [53] layer. The details of the trained deep learning models are presented in Table 4.
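To make the layer arrangement concrete, the following plain-Python sketch runs a forward pass through three hidden layers of 128, 64, and 16 nodes with ReLU activations, followed by a softmax output. It is not the TensorFlow model of the study: the weights are random and untrained, and the input size (8) and number of output classes (51, which would correspond to speeds from 0 to 25 m/s in 0.5 m/s steps) are assumptions chosen only for illustration.

```python
import math
import random


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Convert scores to probabilities that sum to one."""
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer: `weights` has one row per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """ReLU on every hidden layer, softmax on the final layer."""
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))


rng = random.Random(0)
N_INPUTS, N_CLASSES = 8, 51   # hypothetical sizes, not taken from the paper
sizes = [N_INPUTS, 128, 64, 16, N_CLASSES]
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
probabilities = forward([0.5] * N_INPUTS, layers)
```

The softmax output assigns a probability to each speed class, and the predicted speed would be the class with the highest probability. Training (the 250 epochs mentioned above) would adjust the weights; that part is omitted here.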
In the present study, the authors attempted to model the vehicles' instantaneous longitudinal and lateral speeds instead of the instantaneous acceleration. It can be noted that, in normal traffic conditions, the acceleration values are in the range of -3.5 to 3.5 m/s$^2$, whereas the longitudinal speeds are in the range of 0 to 25 m/s. The range and variation of speeds are therefore higher than those of the acceleration. Given the smaller range and variation of acceleration, it is envisaged that the models would be underfitted if they were trained with acceleration as the dependent parameter. Considering this, speed is taken as the dependent parameter rather than acceleration. In the present work, to improve the precision of the training of the algorithms, the dependent variables (instantaneous longitudinal and lateral speeds) were rounded off to 0.5 m/s and 0.01 m/s, respectively. Due to this, the number of variable classes decreases, and the data correlation patterns can be refined. For both the longitudinal and lateral movements, parameters with correlation coefficients equal to or greater than 0.4 at any of the flow levels were considered influential. Using a similar approach, the preceding algorithms were trained with instantaneous longitudinal speeds and instantaneous lateral speeds as the dependent variables. In the case of k-NN, based on trial and error, five nearest neighbors were considered. For random forest, 15 trees were selected for training. The regression trees and deep learning were trained with their exact formulations. The training and setup process is carried out in the Python 3.7.0 programming language ("A primer on scientific programming with python," 2013). For training and testing, the entire trajectory data from the three different traffic flow conditions are divided into two equal halves. One half is used for training and the other for testing.
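The three data preparation steps in this paragraph (rounding the speeds, keeping parameters with a correlation of at least 0.4 at any flow level, and halving the data) can be sketched as follows. The helper names and the correlation values are made up for illustration, the threshold is applied to the absolute value as an assumption, and the split shown is a simple first-half/second-half cut because the text does not say how the halves were drawn.

```python
def round_to_step(value, step):
    """Round `value` to the nearest multiple of `step`."""
    return round(round(value / step) * step, 6)


def influential(correlations, threshold=0.4):
    """Keep parameters whose |rho| reaches the threshold at any flow level."""
    return [name for name, rhos in correlations.items()
            if any(abs(r) >= threshold for r in rhos)]


def split_halves(records):
    """First half for training, second half for testing (assumed split rule)."""
    mid = len(records) // 2
    return records[:mid], records[mid:]


# Made-up correlations for flow 1, flow 2, flow 3 (illustrative only).
example_rho = {
    "long_2": [0.45, 0.30, 0.20],
    "long_15": [0.05, -0.10, 0.12],
    "long_8": [-0.10, -0.42, -0.50],
}
selected = influential(example_rho)
```

With these toy values, `selected` keeps `long_2` and `long_8` and drops `long_15`. A longitudinal speed of 12.34 m/s would be stored as 12.5 m/s and a lateral speed of 0.237 m/s as 0.24 m/s.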

Simulation of Mixed Traffic
Further, based on the trained algorithms, vehicle movements are simulated in Python 3.7.0, as shown in Figure 5. The initial positions and time stamps of all the vehicles observed in the field are taken, and the vehicles are generated one after another at those positions and time stamps, as shown in Figure 6. To initialize the movement, the initial vehicles are placed with their true speeds rather than model-predicted speeds (seven vehicles in the present study). Upon generation, the trained algorithms govern each subject vehicle's movement with respect to its surrounding vehicles, using the influential parameters derived from the traffic stream. On these lines, the simulation of mixed traffic is performed. With the trained models and the surrounding vehicle combination, the vehicle's longitudinal and lateral speeds at the next time instant are predicted. For example, given the data at time step $t_n$, the speeds at $t_{n+1}$ are predicted; then, with the updated combination data at $t_{n+1}$, the speeds at $t_{n+2}$ are predicted, and the process continues until all predictions are made.
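The step-by-step prediction can be pictured as a simple rollout loop. The sketch below follows a single vehicle and takes any callable as the speed predictor; the function `simulate`, the state dictionary, and the 0.5 s time step are assumptions for illustration, and the surrounding-vehicle features that the trained models actually use are reduced here to the vehicle's own state.

```python
def simulate(initial_state, predict_speeds, n_steps, dt=0.5):
    """Roll one vehicle forward: speeds predicted at t_n drive the move to t_n+1.

    `predict_speeds` stands in for a trained model and must return a
    (longitudinal speed, lateral speed) pair in m/s for the given state.
    """
    x, y = initial_state["x"], initial_state["y"]
    vx, vy = initial_state["vx"], initial_state["vy"]
    trajectory = [(0.0, x, y)]
    for step in range(1, n_steps + 1):
        # Predict the next speeds from the current state, then update position.
        vx, vy = predict_speeds({"x": x, "y": y, "vx": vx, "vy": vy})
        x += vx * dt
        y += vy * dt
        trajectory.append((step * dt, x, y))
    return trajectory
```

In a multi-vehicle simulation, the state passed to the predictor at each step would be rebuilt from the updated positions of all vehicles, so that every prediction reflects the latest surrounding-vehicle combination.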
To assess whether the calibrated models' performance mimics the traffic behavior, time-space plots of vehicles were plotted one after another and compared with field-extracted vehicle trajectories, as shown in Figure 7. From the primary visualization of the time-space scenarios, it is observed that the simulated time-space plots using k-NN and deep learning algorithms tended to match the field observed time-space plots reasonably well. However, with random forest and regression trees, significant variation in the time-space plots was found. To assess the performance, the combined mean absolute percentage error (MAPE) was computed among the vehicle longitudinal and lateral positions for the three flow levels, and the results are as shown in Table 5.
The MAPE values show that as the traffic flow level increases, the MAPE for each of the algorithms increases. In the case of k-NN, the MAPE was well within 10%. For deep learning, the MAPE was about 5%. On the other hand, in the random forest and regression tree case, the error was 7% to 17%.
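For reference, the mean absolute percentage error can be computed as below. This is the standard definition; how the study combined the longitudinal and lateral position errors into one figure is not stated, so the sketch handles a single pair of observed and simulated series, and skipping zero observations is an assumption made to avoid division by zero.

```python
def mape(observed, simulated):
    """Mean absolute percentage error (%), ignoring zero observations."""
    pairs = [(o, s) for o, s in zip(observed, simulated) if o != 0]
    return 100.0 * sum(abs((o - s) / o) for o, s in pairs) / len(pairs)
```

For example, observed positions of 100 m and 200 m against simulated positions of 110 m and 180 m each deviate by 10%, giving a MAPE of 10%.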
Similarly, based on the vehicles' simulated movement, macroscopic traffic characteristics, such as stream speed, density, and flow, are computed at every time frame of the simulation. As speed, density, and flow are computed at every time frame, a large amount of data is available for developing fundamental macroscopic plots for traffic conditions ranging from the free-flow regime to the congested regime. Further, to comprehensively develop the macroscopic plots, the Rakha model [54] is adopted, as it has previously been shown to work well under the traffic conditions considered in this study. The model formulation is briefly given by the following:
$$h = c_1 + c_3 u_n + \frac{c_2}{u_f - u_n}, \tag{8}$$
where $h$ = headway, $u_n$ = speed of the $n$th vehicle, $u_f$ = free-flow speed, and $c_1$, $c_2$, $c_3$ = constants. The density of the traffic stream $k$ at stream speed $u$ is given by the following:
$$k = \frac{1}{c_1 + c_3 u + \dfrac{c_2}{u_f - u}}. \tag{9}$$
Using Equations (8)-(10), the $c_1$, $c_2$, and $c_3$ parameters are estimated using the simulated macroscopic data, and speed-density plots are developed for each of the algorithms, as shown in Figure 8. Further, speed-flow plots are also compared with observed speed-flow data plots, as shown in Figure 9. Additionally, boundary conditions such as free-flow speed, optimum speed, capacity, and jam density derived from the macroscopic plots developed using the algorithms are evaluated and are depicted in Table 6. Similar to the microscopic outcomes, it is again observed that deep learning tends to match the field conditions better. With deep learning, the capacity is found to be around 11,810 pcu/h/direction, the free-flow speed 57 kmph, and the jam density 815 pcu/km/direction, whereas the observed field conditions have a capacity, free-flow speed, and jam density of 11,860 pcu/h/direction, 61 kmph, and 810 pcu/km/direction, respectively. This further indicates that the deep learning algorithm works fairly well under mixed traffic conditions, with a mean absolute percentage error (MAPE) of less than 5%.
On the other hand, in the case of random forest and regression trees, the free-flow speeds are higher (about 70 kmph), and the jam densities are also overestimated at 1,000 pcu/km/direction, contributing to a MAPE value of more than 10%. In addition, the shapes of the macroscopic plots developed using regression trees and random forests tend to deviate significantly from those developed using actual field data. Further, the macroscopic data is generated from the trajectory data for every time instant from the study section, with both the empirical data and the simulation data. Toward the end of the simulated congested period, the vehicles in the stream exit at lower speeds while no new vehicles enter the road space. There are then no vehicles left to enter the section, so the density in the traffic stream falls near the end time stamps of the simulation. As a result, lower speeds are observed at lower density levels.
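The speed-density relation used for the macroscopic plots can be explored numerically with a short sketch. It assumes the widely cited Van Aerde single-regime form, in which headway is $c_1 + c_3 u + c_2/(u_f - u)$ and density is its reciprocal, together with the commonly quoted closed-form link between the constants and the four boundary parameters; that link is an assumption brought in for illustration, whereas the study estimates the constants from simulated data. The optimum speed of 35 kmph is a hypothetical value; the other inputs are the observed field values quoted in the text.

```python
def van_aerde_constants(u_f, u_c, q_c, k_j):
    """Constants from free-flow speed, speed at capacity, capacity, jam density.

    Units: speeds in km/h, capacity in veh/h, jam density in veh/km.
    """
    m = u_f / (k_j * u_c ** 2)
    c1 = m * (2.0 * u_c - u_f)
    c2 = m * (u_f - u_c) ** 2
    c3 = 1.0 / q_c - m
    return c1, c2, c3


def density(u, u_f, c1, c2, c3):
    """Density (veh/km) at speed u (km/h), valid for 0 <= u < u_f."""
    return 1.0 / (c1 + c2 / (u_f - u) + c3 * u)


def flow(u, u_f, c1, c2, c3):
    """Flow (veh/h) as speed times density."""
    return u * density(u, u_f, c1, c2, c3)


# Field values from the text; the optimum speed U_C is a hypothetical choice.
U_F, U_C, Q_C, K_J = 61.0, 35.0, 11860.0, 810.0
C1, C2, C3 = van_aerde_constants(U_F, U_C, Q_C, K_J)
```

With these constants, the density at zero speed returns the jam density and the flow at the optimum speed returns the capacity, which is a quick consistency check on any fitted set of constants. Sweeping `u` from 0 toward `U_F` traces the speed-density and speed-flow curves.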

Practical Aspects
In the present study, the models' core logic in learning the data patterns played a large part in determining the models' performance. For example, the random forest and regression trees apply tree-based learning methods (an ensemble of trees in the case of random forest). Given the multitude of decision trees handling the data, the target outcome speeds deviated. In contrast, k-NN works on the nearest neighbor approach, and with less variation in the dependent variables in the close vicinity, k-NN tends to show better results. On the other hand, the deep learning model captures the speed variations through its internal layered neural structure. Presently, modeling the car-following, lane-changing, and lateral movements of vehicles is challenging for researchers and practitioners working under mixed traffic conditions. With the help of the depicted methodology, simulation modeling can be done with ease. Simultaneously, the accuracy of present simulation packages can be improved by embedding the illustrated algorithms. Presently, real trajectory datasets for mixed traffic are very scarce. As a result, very few driving behavior studies have been attempted in this direction. With the study methodology, naturalistic trajectory data can be generated for carrying out driving behavior studies. Along with that, the vehicle's driving behavior can be quantified, and the level of aggression can be checked with the modeled driving data. Currently, trajectory prediction is one of the significant hurdles remaining to achieve safe and reliable autonomous driving. There have been many proposed metrics for evaluating the quality of forecasts on static datasets. However, trajectory prediction for autonomous driving inherently must run in real-time, in conjunction with other driving pipeline components, such as planning.
We have discussed why algorithm runtime, environment complexity, and frequency of predictions should also be considered when evaluating a prediction algorithm. To do this, we implemented several state-of-the-art prediction models and evaluated their behavior in a realistic simulation.

Conclusions
In a mixed traffic stream, vehicular movement is primarily influenced by the surrounding vehicles in the flow. Thus, both longitudinal and lateral movement vary continuously over a given road space based on the traffic conditions. This continuous longitudinal and lateral movement is the principal cause of the limited performance of the established car-following and lane-changing models, which were developed for the perfect lane discipline prevailing under homogeneous traffic conditions. The correlation analysis shows that the vehicles' instantaneous longitudinal speeds are reasonably correlated with lateral gap parameters under mixed traffic conditions. Similarly, lateral speeds are also correlated with longitudinal distance parameters. Interestingly, in the case of flow 1, that is, free-flow traffic conditions, the sign of the correlation (positive/negative) for certain parameters differed from that at flow 2 and flow 3, both for longitudinal and lateral speeds. The analysis carried out in this research work indicates that, at free-flow conditions, vehicles tend to move continuously over the road space with less lateral amplitude, but they tend to move laterally to escape the delay in the traffic stream when their longitudinal movement is constrained.
Based on the study, it is established that by deploying advanced computational tools such as machine learning, mixed traffic conditions can be modeled with better accuracy. Based on the analysis conducted here, it is observed that k-NN and deep learning algorithms mimic the mixed traffic conditions better, with a MAPE of 3 to 9 percent at microscopic and macroscopic levels. The algorithms' performance can be attributed to their core model stability in handling complex data patterns. On the other hand, with random forest and regression tree, the results tend to deviate substantially from the actual field observed traffic conditions. It can be noted that in both algorithms, the data is trained in a tree assembly. Due to this predefined form (less flexibility) and the multiple causal parameters, the algorithms' performance is limited.
Further, the methodology adopted in the present work addresses the challenge of reasonably replicating mixed traffic conditions and will be useful in modeling such mixed traffic scenarios effectively to develop viable, practical applications. Interestingly, these tested algorithms can also be used in traffic microsimulation tools to replicate mixed traffic and boost traffic simulation studies in these conditions. Furthermore, due to variation in driving behavior, microscopic traffic simulation studies are limited under mixed traffic conditions. Given the better accuracy of modeling mixed traffic conditions using machine learning algorithms, these algorithms can be embedded in simulation tools as a substitute for following behavior and lane-changing models and could have a strong potential to boost simulation studies for mixed traffic conditions. By adopting this strategy for modeling homogeneous traffic conditions, the simulation models' accuracy may also be taken to the next level [55][56][57].

Limitations and the Future Scope
Along with the research findings, the present study has certain limitations, which can act as the future scope of the work.
(i) Driving behavior is a stochastic phenomenon. All drivers might behave differently by their own choice, which is highly discrete. Hence, the machine learning models and the deep learning model may at times be ineffective due to interdriver variability, but they are observed to perform better than conventional traffic flow models under mixed traffic conditions.
(ii) In the present work, the models are tested at three different flow conditions from the study section. However, the present study framework must be tested over study sections with more varied flow conditions and with variations in vehicle proportions. This can help in comprehensively gauging machine learning and deep learning models for traffic flow modeling.
(iii) In the present study, the simulation analysis is performed by generating the vehicles at the observed time stamps and positions and governing their movement with the models. As a result, the variation in vehicle arrivals and the impact of traffic composition are not fully tested in the simulation process. However, this can act as the future scope of the work.

Data Availability
The data that support the findings of this study can be made available upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.