Physical Model versus Artificial Neural Network (ANN) Model: A Comparative Study on Modeling Car-Following Behavior at Signalized Intersections

Many studies have simulated traffic behavior at signalized intersections using various Car-Following (CF) models. However, which CF model performs best at signalized intersections has not been thoroughly analyzed and evaluated. In this study, two novel Artificial Neural Network (ANN) CF models, the Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) model and the Convolution-LSTM (Conv-LSTM) model, are applied for the first time to predict CF behavior at signalized intersections. Both models can extract spatial and temporal information to address the long-term dependency problem more effectively. Based on the filtered NGSIM dataset, we conduct a comparative empirical study of three conventional CF models and five ANN CF models. The dataset is divided into two categories based on the characteristics of CF behavior at signalized intersections: continuous and discontinuous. The experiments demonstrate that the ANN CF models outperform the conventional CF models in both categories of traffic flow when the output is the velocity, and fail to do so only when the output is the acceleration in discontinuous traffic flow. The proposed models are capable of accurately predicting acceleration, although traffic fluctuations persist as time passes. Additionally, while the ANN CF model is preferable for traffic flow simulation, the conventional CF model still cannot be ignored for discontinuous traffic flow simulation, particularly when acceleration is required.


Introduction
With the development of Intelligent Transportation Systems (ITS), multitransport models, which include human-driven vehicles and Connected and Automated Vehicles (CAVs), have become the usual pattern of urban traffic flow research. The response mechanism of CAVs is more rapid and accurate than that of human-driven vehicles, so guiding CAVs to follow the preceding vehicle steadily in different categories of traffic flow is significant [1]. In comparison to the Car-Following (CF) behavior on the highway, the CF behavior at signalized intersections is more unstable. Drivers are frequently caught in a bind and unable to make quick decisions due to traffic signal control.
Thus, the CF model at signalized intersections is a thriving field of research. Numerous studies have used various conventional CF models to simulate traffic flow at signalized intersections and to analyze the time-varying characteristics of single-lane traffic dynamics.
Several studies [2][3][4][5][6] simulated the trajectory of a human-driven vehicle at intersections using the Intelligent Driver Model (IDM). Han et al. and Yao et al. [7,8] used the Gipps CF model to simulate human-driven vehicles crossing the intersection. Zhao et al. [9] enhanced the full velocity difference (FVD) model to better describe the CF behavior of an approach lane with a special width at an intersection. Zhang et al. [10] studied the CF behavior at intersections quantitatively and proposed a new CF model based on the FVD.
Tang et al. [11] proposed a method for speed guidance at intersections that used the FVD model as an input. Zhao et al. [12] and Yu and Shi [13] combined a new CF model of traffic flow at intersections with a basic FVD model based on multiple datasets to propose a new CF model. Liu et al. [14] improved the FVD model's stability and factored in the effect of short-term driving memory. Arvin et al. [15] used the Wiedemann CF model to evaluate the safety of connected and automated vehicles at intersections. Liu et al. [16] proposed an eco-speed guidance method for the mixed traffic flow of electric and conventional human-driven vehicles at an intersection, simulating CF behavior using the Wiedemann model. However, these conventional CF models are incapable of accurately simulating sophisticated vehicle movement at signalized intersections. Brockfeld et al. [17] stated that the error of conventional CF models is between 15% and 25%, which is difficult to reduce further. Albeaik et al. [18] found that the velocity simulated by the IDM is unrealistic for some specific initial data, which weakens the generalization ability of the model. With the advancement of artificial intelligence and data collection technologies, data-driven CF models, particularly Artificial Neural Network (ANN) CF models, are gradually emerging as a new area of research. The advantage of ANN CF models is that they do not depend on theoretical assumptions and do not strictly adhere to any mathematical derivation. Instead, they use a nonparametric approach to extract the inherent information in traffic data and build highly accurate CF models.
While several studies have explored the application of CF models at signalized intersections, few have examined the robustness of various CF models. The conventional CF models and the ANN CF models are evaluated in this study using the NGSIM ground-truth dataset. Two improved ANN CF models are proposed to address the limitations of existing CF models when confronted with sophisticated traffic flows. Finally, we conduct simulations to determine which CF models perform best under complicated traffic scenarios at signalized intersections. The main contributions of this paper are as follows:
(1) The CNN-LSTM CF model is proposed, which uses Convolutional Neural Networks (CNN) to extract key features of each time step's CF behavior and the Long Short-Term Memory (LSTM) model to predict the CF behavior of future time steps.
(2) The Conv-LSTM model is applied for the first time to the simulation of CF behavior, as it can extract spatial and temporal information from multiple time steps in a single LSTM cell, thereby resolving the long-term dependency problem more effectively.
(3) Three conventional CF models, five ANN CF models, and two machine learning CF models are compared at signalized intersections for two typical traffic flow categories. The acceleration, velocity, trajectory, and hysteresis loops, as well as the RMSE (Root Mean Square Error), are used to quantify the performance of all CF models in predicting the following vehicle's movement state.
The remainder of this study is organized as follows. Section 2 reviews the existing CF models and their application in traffic modeling. Section 3 preprocesses the training and cross-validation datasets. Section 4 analyzes three representative ANN CF models and proposes two novel ANN CF models. Section 5 analyzes the conventional CF models and calibrates their parameters using ground-truth data. Section 6 explores the performance of different types of CF models in continuous and discontinuous traffic flow, where the acceleration and the velocity are each used in turn as the single model output.

Conventional CF Models.
Pipes presented the first CF models in 1953 [19], and various CF models have been proposed since then. Conventional CF models can be classified into four broad categories: stimulus-response, safety distance, psycho-physical, and optimal velocity (OV).
The stimulus-response model focuses on the reaction of subject vehicles, and notable stimulus-response models include the Gazis-Herman-Rothery (GHR) model [20], the Newell model [21], and the IDM [22]. Wu et al. [23] used the IDM to validate the proposed variable speed limit control strategy. Cui et al. [24] used the IDM to investigate the fuel consumption and emission characteristics of adaptive cruise control/cooperative adaptive cruise control at a signalized intersection. Gipps [25] proposed the safety distance model, which regards the gap between consecutive vehicles as the most significant parameter. Yao and Li [5] simulated the trajectories of human-driven vehicles using the Gipps model. Additionally, the psycho-physical model is a decision-making model that employs various thresholds to denote the various driving stages of CF behavior. For instance, the Wiedemann model [26] presupposes that the driver employs various driving strategies depending on the traffic environment. Wang et al. [27] determined expected vehicle behaviors using the Wiedemann model. Chauhan et al. [28] calibrated the Wiedemann model's parameters using an Indian naturalistic dataset. Bando et al. [29] developed the optimal velocity model (OVM), and Jiang et al. [30] incorporated the effect of the velocity difference into the OVM and proposed the Full Velocity Difference (FVD) model. Zhao et al. [31] used an OVM to model human vehicle behavior. Qin et al. [32] used the OVM to analyze the stability of connected and automated vehicles. Although these conventional CF models are useful for studying the dynamics of micro traffic flows, they have some limitations. For instance, the robustness of the conventional CF model is insufficient because the model's parameters do not account for all traffic-influencing factors.

Data-Driven CF Models.
Data-driven methods include fuzzy logic, case paper, support vector regression, and ANN. The fuzzy logic method applies fuzzy sets and fuzzy rules to a problem to describe it, and it can be used to perform comprehensive analyses of unknown models and systems [33]. The addition of fuzzy logic to the CF model improves the model's ability to describe the driver's CF behavior [34]. The case paper method models the CF behavior using locally weighted regression or the k-nearest neighbor method. Toledo et al. [35] constructed a CF model using a locally weighted regression method and fitted the position of each vehicle sequentially using the weighted least squares method. Machine Learning (ML) is one of the most significant branches of data-driven research. He et al. [36] proposed a data-driven CF model based on the K-Nearest Neighbor (KNN) algorithm, eliminating the need for complicated mathematical formulations or parameter correction. Furthermore, Yu et al. [37] proposed a novel CF model based on the Fixed Radius Nearest Neighbor (FRNN) method, designed to balance accuracy and computational efficiency. The simulation experiments showed that the FRNN CF model performs better than the KNN CF model. Support vector regression is a technique based on the support vector machine (SVM); it augments the SVM with a regression algorithm. Lu et al. [38] developed a support vector regression CF (SVRCF) model and calibrated its parameters. The experiments showed that the SVRCF model performed better than the conventional CF models.
Although fuzzy logic, case papers, and SVM models can simulate CF behavior more precisely, they perform worse than neural network models when multisource data fusion is performed. A neural network-based CF model may use a simple network, such as a multilayer perceptron (MLP) or radial basis function (RBF) network. Other models are built using deep learning techniques such as recurrent neural networks (RNNs) and long short-term memory (LSTM). A few researchers have conducted prospective studies on these models. For example, Jia et al. [39] proposed a CF model based on a BP neural network. Khodayari et al. [40] developed a neural network CF model that considered the drivers' behaviors. Following that, Zheng et al. [41] developed a CF model based on neural networks characterized by an instantaneous driver-vehicle response delay. Colombaroni and Fusco [42] proposed a CF model based on a feedforward neural network that balanced efficiency and complexity via a particle swarm optimization algorithm. Zhu et al. [43] developed a deep deterministic policy gradient CF model (DDPGvRT) based on deep reinforcement learning by training the designed neural network on Shanghai's driving data. DDPGvRT is more consistent with actual driving than IDM or the other CF models. To overcome the problem of selfish objectives of human-driven vehicles, Peng et al. [44] designed a novel altruistic reward function for deep reinforcement learning to improve traffic efficiency at unsignalized intersections. Papathanasopoulou and Antoniou [45] proposed a locally optimal neural network CF model with a lower RMSE than other CF models, such as the Gipps model. Wang et al. [46] established an RNN CF model that takes the vehicle information of the previous time segment as input; the authors then simulated the driver's driving behavior and improved the model's accuracy. Zhou et al. [47] predicted traffic flow oscillations and distinguished driving behaviors using a neural network model.
Ma and Qu [48] proposed a new CF model based on sequence-to-sequence learning, and the proposed model outperformed conventional models in the CF pair and platoon experiments. Hui et al. [49] developed a novel mixed deep encoder-decoder neural network to predict the vehicle trajectory, and the optimal network structure was selected by training on different trajectory samples. These models demonstrated the neural network's nonlinear regression capability and its ability to simulate CF behavior.
According to the discussion above, only a few CF models based on neural networks have been applied to traffic flow research. The CF model's performance at signalized intersections has not been thoroughly investigated. Existing ANN CF models are primarily concerned with highways and pay little attention to urban roads. To address these gaps, this study compares the performance of ANN CF models and conventional CF models at signalized intersections.

Data Preparation
The precision of datasets is important for CF modeling [50,51]. Two datasets are used in this study to train and calibrate CF models of signalized intersections: the Lankershim dataset and the Peachtree dataset, both of which are provided by the Federal Highway Administration's NGSIM program [52]. The program extracted detailed information from the video data, including the vehicle's length, type, position, lane identification, velocity, and acceleration. The training and testing datasets were gathered from a section of Lankershim Boulevard in Los Angeles, California, USA, as illustrated in Figure 1(a). Five video cameras were installed on a 36-story building adjacent to US Highway 101 and Lankershim Boulevard. On June 16, 2005, between 8:45 and 9:00 am, the vehicles' trajectories were recorded at a frequency of 10 Hz. The monitored area is approximately 1600 ft long and includes four signalized intersections. As illustrated in Figure 1(b), this study's validation dataset is the Peachtree Street dataset, which comprises data collected on urban roads.
As illustrated by the red circle in Figure 2, these original datasets contain noise and redundant information, preventing them from being applied directly to CF behavior modeling. However, both conventional CF models and ANN CF models require high-quality data. As a result, we filter the datasets using the following four conditions, which are defined on fields listed in the metadata documentation.
(i) Vehicle_type = 2: It narrows the focus to the CF behavior of automobiles while excluding motorcycles and trucks
(ii) Following ≠ 0: It filters out other vehicle behaviors, such as lane changing and overtaking
(iii) 5 ft < space-headway < 130 ft: It eliminates the crash and free-flow traffic data
(iv) CF duration > 15 s: It ensures that the CF duration is sufficient for the study
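The four conditions above can be sketched as a simple predicate over data samples. This is a minimal illustration only: the `Sample` record and its field names are hypothetical stand-ins for the corresponding NGSIM fields, and the CF duration is assumed to have been computed per vehicle pair beforehand.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical field names standing in for the NGSIM metadata fields.
    vehicle_type: int        # Vehicle_type (2 = automobile)
    following: int           # Following (0 = no following relationship)
    space_headway_ft: float  # space headway in feet
    cf_duration_s: float     # duration of the CF episode in seconds (assumed precomputed)


def keep_sample(s: Sample) -> bool:
    """Return True if the sample satisfies conditions (i)-(iv)."""
    return (
        s.vehicle_type == 2                     # (i) automobiles only
        and s.following != 0                    # (ii) car-following behavior only
        and 5.0 < s.space_headway_ft < 130.0    # (iii) no crash or free-flow data
        and s.cf_duration_s > 15.0              # (iv) sufficiently long CF episode
    )


def filter_samples(samples):
    """Keep only the samples that pass all four conditions."""
    return [s for s in samples if keep_sample(s)]
```

All four conditions are strict inequalities or exact matches, so a headway of exactly 5 ft or a duration of exactly 15 s is excluded.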

Journal of Advanced Transportation
After data filtering, the filtered Lankershim dataset contained 377,000 data sample points: 90% (339,300) of the data were used in the training phase, whereas 10% (37,700) were used in the test phase. Cross-validation is performed on the filtered Peachtree dataset, which contains 83,000 data sample points.
Due to noise and abnormal data points, the datasets that satisfy the preceding four conditions still cannot be used directly. Montanino and Punzo [53] proposed a multistep filtering procedure to eliminate outliers from the dataset in order to address this issue. Coifman and Li [54] re-extracted data manually from the raw NGSIM video, which indicated that the dataset contained errors and therefore required filtering. In this study, the NGSIM dataset is filtered using the moving average algorithm, and the filtered results are shown in Figure 3. In comparison to the raw data, the filtered velocity and acceleration data are smoother and free of sharp fluctuations. Additionally, the Pearson correlation coefficient (R) is used to evaluate the consistency between the input data and the filtered data. Table 1 shows that the filtered data are highly consistent with the input data.
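As a rough illustration of this smoothing step and the consistency check, the sketch below applies a centered moving average and computes the Pearson correlation coefficient between the raw and filtered series. The window length and the shrinking-window treatment at the series ends are assumptions made for the example; the text does not state the settings actually used.

```python
import math


def moving_average(values, window=5):
    """Centered moving average; the window shrinks near both ends (assumed edge rule)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def pearson_r(x, y):
    """Pearson correlation coefficient R between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)
```

A value of R close to 1 between `values` and `moving_average(values)` indicates that the smoothing removed sharp fluctuations without distorting the overall trend.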
These variables are also used as inputs in this paper's ANN CF models. There are two options for the output of CF models: some models output the acceleration [40,48], while others output the velocity [41,42,45]. To examine the effect of varying the model's output, both acceleration and velocity are used, in turn, as the output of the ANN CF models. In the resulting formulation, $\tau$ is the simulation time step, and $v_n(t)$, $\Delta s_n(t)$, $\Delta v_n(t)$, and $a_n(t)$ represent the velocity, space headway, velocity difference, and acceleration of the following vehicle $n$ at time $t$, respectively. Additionally, the ANN CF models that use acceleration as output are named ANN CF-A, and those that use velocity as output are named ANN CF-V.
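One way to write the two input-output mappings, using the symbols defined above, is sketched below. It assumes that the three inputs are the velocity, space headway, and velocity difference at time $t$ and that the output is predicted one time step ahead; $f_A$ and $f_V$ are placeholder names for the trained ANN CF-A and ANN CF-V networks.

```latex
\begin{aligned}
a_n(t+\tau) &= f_A\bigl(v_n(t),\ \Delta s_n(t),\ \Delta v_n(t)\bigr) \\
v_n(t+\tau) &= f_V\bigl(v_n(t),\ \Delta s_n(t),\ \Delta v_n(t)\bigr)
\end{aligned}
```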

Architecture of ANN Models
The BP neural network is a popular ANN algorithm used to solve classification and regression problems. Figure 4 illustrates the structure of the BP neural network used in this study.
As illustrated in Figure 4, the network contains several layers. The first layer is the input layer, which contains three input variables. The second and third layers are hidden layers, whereas the fourth layer is an output layer that contains a single output variable. Selecting the number of hidden layers and hidden units is a difficult task, and Section 4.3 provides the optimal selection strategy for each ANN CF model.
In the RNN shown in Figure 5, I, S, and O represent the input, hidden, and output units, respectively. u is the weight matrix between the input and hidden layers; v is the weight matrix between the hidden and output layers. The output layer is a fully connected layer that calculates the final value. W is the last value of the hidden layer, used as the weight of the input, and n refers to the number of data labels. Compared with feedforward neural networks, the RNN saves the state of the training data at the final stage, thereby improving the network accuracy.

LSTM.
The gradient vanishing or explosion problem may occur in a conventional feedforward neural network, limiting the applicability of such networks. Hochreiter and Schmidhuber [55] proposed the LSTM neural network to address these issues. In contrast to a standard RNN network unit, which contains only one hidden layer, a typical LSTM network unit (as illustrated in Figure 6) contains an input gate, an output gate, a forget gate, and a cell. h is the filtered output, c is the cell state, and I is the input vector.
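To make the role of the three gates and the cell concrete, the following is a minimal scalar sketch of one step of a standard LSTM unit. It is not the network used in this study: the scalar state, the parameter dictionary `params`, and its gate names are assumptions introduced only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps each gate name to a tuple (w_x, w_h, b); the names
    "input", "forget", "output", and "candidate" are illustrative.
    """
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre_activation("input"))        # input gate: how much new information enters
    f = sigmoid(pre_activation("forget"))       # forget gate: how much old cell state is kept
    o = sigmoid(pre_activation("output"))       # output gate: how much of the cell is exposed
    g = math.tanh(pre_activation("candidate"))  # candidate cell update

    c = f * c_prev + i * g   # new cell state
    h = o * math.tanh(c)     # filtered output
    return h, c
```

Because the cell state is updated additively (`f * c_prev + i * g`), gradients can flow across many time steps, which is what mitigates the vanishing-gradient problem.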

CNN-LSTM.
Convolutional neural networks are a subclass of deep learning networks that have achieved remarkable results in various research areas, especially computer vision, image classification, and speech recognition. A conventional CNN model consists of three types of layers: convolution, pooling, and fully connected. In the key equation for the discrete one-dimensional convolution, $x$ is the input array, $\omega$ is the one-dimensional filter, $a$ is the measurement step, and $t$ is the current time. To address the limitations of a standalone CNN, a combined model called CNN-LSTM has been proposed that combines the advantages of CNN and LSTM; its architecture is shown in Figure 7. Additionally, CNN-LSTM is used here for the first time in a CF model study: it uses the CNN to extract key features of each time step's CF behavior and predicts the CF behavior of future time steps using an LSTM model. The CNN-LSTM architecture used in this study has several layers: an input layer, some convolutional layers, some LSTM layers, some fully connected layers, and an output layer.
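The discrete one-dimensional convolution mentioned earlier in this subsection has the standard textbook form $s(t) = \sum_a x(a)\,\omega(t-a)$. The sketch below implements that form directly in plain Python; it illustrates the operation itself, not the convolutional layers of the CNN-LSTM model, and the "full" output length is a choice made for the example.

```python
def conv1d(x, w):
    """Full discrete convolution: s[t] = sum over a of x[a] * w[t - a]."""
    n_out = len(x) + len(w) - 1
    s = [0.0] * n_out
    for t in range(n_out):
        for a in range(len(x)):
            k = t - a                  # index into the filter
            if 0 <= k < len(w):        # terms outside the filter contribute zero
                s[t] += x[a] * w[k]
    return s
```

For example, convolving a velocity series with the filter `[0.5, 0.5]` averages each pair of neighboring values, which is the kind of local feature a convolutional layer can learn.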

Convolution-LSTM.
While the convolution-LSTM can extract critical temporal information from the time series input array, it does not perform well in capturing spatial information. Shi et al. proposed the Conv-LSTM model and successfully applied it to convective precipitation prediction [56]. In this study, it is applied for the first time to the simulation of CF behavior; it can extract spatial and temporal information from multiple time steps in a single Conv-LSTM cell, allowing for a more accurate solution to the long-term dependency problem. In the primary formulation of Conv-LSTM, $i_t$, $f_t$, and $o_t$ represent the input gate, forget gate, and output gate, respectively, and $c_t$ is the output of a single cell. $W_{xi}$, $W_{hi}$, $W_{ci}$, $W_{xf}$, $W_{hf}$, $W_{cf}$, $W_{xc}$, $W_{hc}$, $W_{xo}$, $W_{ho}$, and $W_{co}$ are weight matrices, and $b_i$, $b_f$, $b_c$, and $b_o$ are the biases of the different gates. $\sigma$ and $\tanh$ are the sigmoid and hyperbolic tangent activation functions. Additionally, the asterisk "*" is the convolutional operator, and "•" is the Hadamard product.
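For reference, the Conv-LSTM cell as formulated by Shi et al. [56] can be written with the symbols listed above as follows. The input $X_t$ and the hidden output $h_t$ are notation assumed here, since they are not named in the text.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * h_{t-1} + W_{ci} \bullet c_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * h_{t-1} + W_{cf} \bullet c_{t-1} + b_f\right) \\
c_t &= f_t \bullet c_{t-1} + i_t \bullet \tanh\left(W_{xc} * X_t + W_{hc} * h_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * h_{t-1} + W_{co} \bullet c_t + b_o\right) \\
h_t &= o_t \bullet \tanh\left(c_t\right)
\end{aligned}
```

The only structural change relative to a fully connected LSTM is that the input-to-state and state-to-state products are convolutions, so each gate sees a local neighborhood of the input rather than the whole vector.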
The major difference between CNN-LSTM and Conv-LSTM is that in CNN-LSTM, the convolutional operation is performed as a single layer. In contrast, in Conv-LSTM, the convolutional operation is performed in each Conv-LSTM cell. Additionally, gate layers can be used to visualize the cell state, as indicated by the red line in Figure 8. The proposed model is composed of five layers: an input layer, a Conv-LSTM layer, one fully connected layer wrapped in time-distributed layers, and an output layer.

Models Training and Cross-Validation for ANN CF Models.
The Adam optimizer [57] (learning rate = 0.001), an extension of stochastic gradient descent, was used to train the ANN CF models. Hyperparameter tuning is tricky work: different values of the number of epochs, batch size, hidden layers, and cells were tried to find the optimal model structure. Training and validation indicated that the accuracy of the ANN CF models could not be improved further simply by adding hidden layers, and that the impacts of the different variables are independent. Therefore, a parameter sensitivity experiment covering the four variables mentioned above was conducted, and the optimal parameter settings for each model are shown in Table 2.
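For readers unfamiliar with Adam, the sketch below shows a single update of one scalar parameter. Only the learning rate of 0.001 comes from the text; `beta1`, `beta2`, and `eps` are set to the defaults suggested in the Adam paper [57] and are assumptions as far as this study's configuration is concerned.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter.

    theta: current parameter value; grad: gradient of the loss at theta;
    m, v: running first and second moment estimates; t: step count, starting at 1.
    """
    m = beta1 * m + (1.0 - beta1) * grad          # update biased first moment
    v = beta2 * v + (1.0 - beta2) * grad * grad   # update biased second moment
    m_hat = m / (1.0 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)                # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```

On the first step the bias correction makes the update size approximately equal to the learning rate regardless of the gradient's magnitude, which is one reason Adam is relatively insensitive to gradient scaling.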
To address overfitting, a dropout layer with a rate of 0.5 is used on each fully connected layer preceding the output layer [58]. The validation dataset, which was not used for training, is used for cross-validation. The training and validation costs, measured by the Mean Squared Error, are depicted in Figure 9. For acceleration prediction, as illustrated in Figures 9(a) and 9(b), the two proposed ANN CF models outperformed the other models not only in training loss but also in cross-validation loss. Due to the convolution operation, the two novel ANN CF models were able to extract more features of vehicle movement at each time step than the other CF models. Although the BP CF model converged quickly, it had the highest training and cross-validation losses due to the BP model's poor predictive ability and feature extraction limitations.
For velocity prediction, similar results are obtained for all models' training and cross-validation losses. However, there are some notable findings. First, the RNN CF models have the same training and cross-validation losses as the LSTM CF models and perform worse than the two novel ANN CF models. Second, both models' cross-validation losses fluctuate more than their training losses. A possible reason is that backpropagation would ideally use all of the training data at each update, which is impractical because of the cost of computing the gradient over the full dataset. Therefore, a smaller batch size is selected to accelerate the training process at the cost of some performance.

Conventional CF Models
In this study, three representative physical-based conventional CF models (IDM, Gipps, and FVD) are chosen for comparison with the ANN CF models in the simulation experiment.

Physical-Based CF Models

IDM.
The IDM [22] is the most widely used CF model. In comparison to other models, the IDM is more accurate and capable of simulating a variety of traffic flow scenarios. In its defining equation, $v$, $a$, $b$, and $s$ are the velocity, acceleration, deceleration, and trajectory of the following vehicle, respectively; $v_0$ is the desired velocity; $\Delta v$ is the velocity difference; $s_0$ is the minimum distance; $T$ is the time gap; and $\delta$ is the acceleration exponent.
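For reference, the standard form of the IDM [22] is $\dot{v} = a\left[1 - (v/v_0)^{\delta} - (s^{*}/s)^{2}\right]$ with desired gap $s^{*} = s_0 + vT + v\,\Delta v / (2\sqrt{ab})$. The sketch below implements this standard form under two assumptions that should be checked against the paper's own equation: $s$ is taken as the space gap to the preceding vehicle, and $\Delta v$ as the approach rate (follower velocity minus leader velocity). The default `delta=4.0` is the value commonly used in the literature, not a calibrated result from this study.

```python
import math


def idm_acceleration(v, gap, dv, v0, T, a, b, s0, delta=4.0):
    """Acceleration of the following vehicle under the standard IDM.

    v: follower velocity; gap: space gap to the leader (assumed meaning of s);
    dv: approach rate, v - v_leader (assumed sign convention);
    v0: desired velocity; T: time gap; a: maximum acceleration;
    b: comfortable deceleration; s0: minimum distance; delta: acceleration exponent.
    """
    desired_gap = s0 + v * T + v * dv / (2.0 * math.sqrt(a * b))
    free_road_term = (v / v0) ** delta          # vanishes at standstill, equals 1 at v0
    interaction_term = (desired_gap / gap) ** 2  # grows as the gap closes
    return a * (1.0 - free_road_term - interaction_term)
```

With a very large gap the interaction term vanishes, so a stationary vehicle accelerates at `a` and a vehicle already travelling at `v0` has zero acceleration.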

Gipps Model.
The Gipps model [25] is a typical safety distance CF model capable of simulating two distinct driver behaviors: CF and free flow. The model is defined in terms of the following quantities: $v_n$ and $v_{n-1}$ are the velocities of the following vehicle and the preceding vehicle, respectively; $a_{max}$ and $b_{max}$ are the desired acceleration and deceleration rates of the following vehicle, respectively; $b_0$ is the desired deceleration rate of the preceding vehicle; $V$ is the desired velocity of the following vehicle; $\Delta x_n(t)$ is the space headway; $d$ is the minimum distance when the vehicle is stationary; and $s_{n-1}$ is the length of the preceding vehicle.

FVD.
The FVD model [30] is an extension of the optimal velocity CF model. Equation (6) takes the effect of the velocity difference on the CF behavior into account. In that equation, $\Delta x_n(t)$ is the space headway, $v_n(t)$ is the velocity of the following vehicle, $\Delta v_n(t)$ is the velocity difference, $\alpha$ and $\lambda$ are sensitivity coefficients, and $V(\Delta x_n(t))$ is the optimal velocity function. In the definitions of the sensitivity coefficients and the optimal velocity function, $\lambda_0$ is the constant sensitivity coefficient, $v_0$ is the desired velocity of the following vehicle, and $b$ and $\beta$ are parameters.
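For reference, the FVD model of Jiang et al. [30] is commonly written as below, where $V(\cdot)$ denotes the optimal velocity function. The specific optimal velocity function and the headway-dependent sensitivity used in this study are not reproduced here.

```latex
\frac{\mathrm{d}v_n(t)}{\mathrm{d}t}
  = \alpha\left[V\bigl(\Delta x_n(t)\bigr) - v_n(t)\right] + \lambda\,\Delta v_n(t)
```

The first term pulls the follower's velocity toward the headway-dependent optimal velocity, and the second term adds a response to the velocity difference with the preceding vehicle, which is what distinguishes the FVD model from the original OVM.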

Parameters Calibration.
Several parameters are included in conventional CF models, including the space headway and velocity difference. The values of the parameters vary according to the traffic scenario, and leaving the parameters uncalibrated makes evaluating the CF model's performance difficult. The simulated annealing algorithm [59] is used in this study to calibrate the three conventional CF models. Simulated annealing begins with a high initial temperature and steadily decreases it while searching the solution space randomly for the global optimum of the given objective function. Because it occasionally accepts worse solutions, simulated annealing can escape local optima and approximate the global optimal solution. The calibration process uses the value bounds defined in [60] to calibrate each model's parameters on the NGSIM dataset. Table 3 summarizes the results of parameter calibration for the three CF models.
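The calibration loop described above can be sketched as follows. This is a generic simulated annealing routine, not the authors' implementation: the geometric cooling schedule, the Gaussian neighbor proposal, and all default values are assumptions, and `cost` stands for any objective (for example, the error between simulated and observed trajectories for a given parameter vector).

```python
import math
import random


def simulated_annealing(cost, x0, bounds, t0=1.0, cooling=0.95, steps=2000, seed=0):
    """Minimize cost(x) over box-bounded parameters with simulated annealing.

    cost: callable taking a list of parameters and returning a float;
    x0: initial parameter list; bounds: list of (low, high) per parameter.
    Returns the best parameter list found and its cost.
    """
    rng = random.Random(seed)
    x, fx = list(x0), cost(x0)
    best, f_best = list(x), fx
    temp = t0
    for _ in range(steps):
        # Propose a neighbor by perturbing one parameter, clipped to its bounds.
        cand = list(x)
        j = rng.randrange(len(cand))
        lo, hi = bounds[j]
        cand[j] = min(hi, max(lo, cand[j] + rng.gauss(0.0, 0.1 * (hi - lo))))
        fc = cost(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / temp):
            x, fx = cand, fc
            if fx < f_best:
                best, f_best = list(x), fx
        temp = max(temp * cooling, 1e-12)  # geometric cooling with a small floor
    return best, f_best
```

A usage example with a toy objective: `simulated_annealing(lambda p: (p[0] - 1.5) ** 2, [0.0], [(0.0, 3.0)])` searches for the minimizer of a quadratic within the stated bounds.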
As illustrated in Table 3, when compared to the original model parameters, the calibrated model parameters more accurately reflect the following vehicle's movement state. As can be seen, the deceleration is greater than the acceleration, indicating that vehicles typically decelerate rapidly at signalized intersections, resulting in frequent variations in the space headway. Additionally, the shorter minimum distance indicates that a signalized intersection has a high volume of traffic.

Simulations and Analyses
In this section, the three representative types of CF models (conventional, ANN, and ML) are compared. The parameter settings of the ML CF models are consistent with the literature [36,37]. Additionally, all model parameters and simulation environment parameters used in the study are given to ensure the reproducibility of the experimental results [61].

Continuous Traffic Flow.
Continuous traffic flow occurs when the traffic light turns green or when the remaining red time allows the vehicle to pass through the signalized intersection. More precisely, the vehicle can proceed through the signalized intersection without performing a stop-go or idling maneuver. Table 4 contains the parameters for the continuous traffic flow simulation experiment. Figures 10-12 illustrate the CF models' acceleration, velocity, trajectory, and hysteresis loops in continuous traffic flow.
For the acceleration comparison, all CF models could simulate the acceleration fluctuation of human-driven vehicles, as shown in Figure 10. However, some differences exist among the three types of models. The ANN CF-A models capture more details, and in particular, the proposed CNN-LSTM CF models outperform the other ANN CF models.
For the velocity comparison, as illustrated in Figure 11(a), the Gipps model is more unstable than the other conventional CF models. The velocity of the following vehicle is simulated in Figure 11(b) by integrating the acceleration predicted by the ANN CF-A (acceleration) models. The following vehicle's velocity is also simulated in Figure 11(c), but there it is predicted directly by the ANN CF-V (velocity) models. As can be seen, the ANN CF-V simulates a velocity closer to reality than the ANN CF-A, because the accumulated integration error negatively affects the ANN CF-A.
For the trajectory comparison, as shown in Figure 12(a), the FVD CF model is highly consistent with the real trajectory during the first twenty seconds, but the error gradually increases during the subsequent period. The trajectories obtained by double integrating acceleration and single integrating velocity are shown in Figures 12(b) and 12(c), respectively. Compared to Figure 12(b), the trajectory error obtained by ANN CF-V is less than that obtained by ANN CF-A. This is consistent with the findings of the literature [47]. Additionally, as indicated by the red circle in Figures 12(b) and 12(c), the KNN, FRNN, CNN-LSTM, and Conv-LSTM perform better than the other CF models, and the trajectory errors of the ANN CF-A and ANN CF-V remain low after twenty seconds of CF behavior, validating the proposed models' robustness.
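The effect of integration on the two model families can be illustrated with a simple forward-Euler scheme. The time step of 0.1 s follows from the 10 Hz sampling rate of the dataset; the choice of forward Euler is an assumption, as the integration scheme is not stated in the text.

```python
def integrate_acceleration(acc, v_init, x_init, dt=0.1):
    """Double integration (ANN CF-A case): acceleration -> velocity -> position."""
    v, x = [v_init], [x_init]
    for a in acc:
        x.append(x[-1] + v[-1] * dt)  # advance position with the current velocity
        v.append(v[-1] + a * dt)      # then advance velocity with the predicted acceleration
    return v, x


def integrate_velocity(vel, x_init, dt=0.1):
    """Single integration (ANN CF-V case): velocity -> position."""
    x = [x_init]
    for v in vel:
        x.append(x[-1] + v * dt)
    return x
```

Any error in a predicted acceleration is carried into every later velocity and, through it, into every later position, whereas a velocity error is only integrated once. This is consistent with the smaller trajectory error reported for ANN CF-V.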
Each model's Root Mean Square Error (RMSE) is calculated to provide a more intuitive assessment of its performance. The continuous traffic flow experiment contains 250 data points. Figure 13 depicts the RMSE of the conventional, ML, and ANN CF-A models. As can be seen, the IDM and BP had a higher RMSE for acceleration, velocity, and trajectory, indicating that they were unable to accurately simulate the movement of the following vehicle. The FVD model outperformed the RNN and LSTM models. Additionally, the proposed CNN-LSTM and Conv-LSTM models performed better than the KNN CF models but worse than the FRNN models. Figure 14 depicts the RMSE of the conventional, ML, and ANN CF-V models. Compared with the ANN CF-A models, the RMSE of the ANN CF-V models is further reduced. In particular, the Conv-LSTM and FRNN had the lowest RMSE.
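The RMSE follows its usual definition, $\mathrm{RMSE} = \sqrt{\tfrac{1}{N}\sum_{i=1}^{N}(\hat{y}_i - y_i)^2}$, where $\hat{y}_i$ is the simulated value and $y_i$ the observed value. A minimal sketch, with function and argument names chosen only for illustration:

```python
import math


def rmse(simulated, observed):
    """Root Mean Square Error between simulated and observed series."""
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = ((s - o) ** 2 for s, o in zip(simulated, observed))
    return math.sqrt(sum(squared_errors) / len(simulated))
```

The same function applies to acceleration, velocity, and trajectory series; only the units of the result differ.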
As illustrated in Figures 13 and 14, the above three labels indicate the difference between the vehicle movement features simulated by the CF models and the real vehicle movement features. However, there are distinct stages of CF behavior that must be explored further. As a result, the hysteresis loop is used to evaluate the CF models. As illustrated in Figure 15, the following vehicle's space-headway range is 10-60 m, and its velocity range is 6-18 m/s. This indicates that the following vehicle proceeds through the signalized intersection without encountering any traffic congestion. As illustrated in Figure 15(a), the hysteresis loops of the IDM and Gipps models are more "aggressive" than the real data and the FVD model. Additionally, although the FVD is closer to the real data, it has a limited capacity for detecting subtle changes in the real data. The hysteresis loops of the ML, ANN CF-A, and ANN CF-V models are depicted in Figures 15(b) and 15(c). Clearly, the ANN CF-V is more consistent with the real data than the ANN CF-A, particularly for the BP and RNN, which perform poorly and are more "conservative" than the real data. Additionally, the KNN CF models behave more "aggressively", like the Gipps and IDM models.

Discontinuous Traffic Flow.
When the traffic light is red or the green time is insufficient to allow the vehicle to pass through the signalized intersection, discontinuous traffic flow frequently occurs. This means that the vehicle must adopt a stop-go behavior in order to pass through the signalized intersection. The parameters of the discontinuous traffic flow simulation experiment are presented in Table 5.
For the acceleration comparison, the acceleration simulated by the ANN CF-A models was more consistent with the real acceleration of the following vehicle, as shown in Figure 16. The conventional CF models cannot reproduce many of the minor changes in acceleration. Additionally, even though the ML CF models are able to simulate the acceleration, the KNN CF model does not perform well, and some of the acceleration values it generates are unrealistic.
For the velocity comparison, Figure 17 shows the results for the conventional CF, ML CF, and ANN CF models. Among the ANN CF-A models, the CNN-LSTM CF and Conv-LSTM CF models could simulate velocity more realistically than the conventional CF models and the BP, RNN, and LSTM CF models, but when the traffic light is green, only the KNN and FRNN CF models could simulate the velocity of the following vehicle. Additionally, they both performed worse than the ANN CF-V models, as shown in Figure 17(c). At approximately the 40th second, the velocity simulated by IDM is below zero, while the velocity simulated by FVD is abnormal, as indicated by the red circle in Figure 17(a). Furthermore, as illustrated in Figure 17(b), the velocity simulated by BP and RNN is below zero. These results are not consistent with reality.
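The negative velocities noted above can be detected automatically in a simulated series. The sketch below is a hypothetical helper, not part of the study: the function name and the toy velocity values are assumptions for illustration only.

```python
def negative_velocity_steps(velocity):
    """Return the time-step indices at which the simulated velocity is below zero."""
    return [step for step, v in enumerate(velocity) if v < 0.0]


# Toy values only (hypothetical): a vehicle decelerating toward a stop, in m/s.
simulated_velocity = [2.0, 0.8, 0.1, -0.3, -0.1, 0.0, 0.5]
invalid_steps = negative_velocity_steps(simulated_velocity)
```

A non-empty result flags a simulation that is inconsistent with reality in the sense described above, since a following vehicle in a queue does not reverse.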
For the trajectory comparison, the unusual velocity fluctuations directly caused the same problems in the trajectory simulation. The trajectory simulated by FVD is inconsistent with the real trajectory of the following vehicle between the 10th and 20th seconds, and the trajectory simulated by BP and RNN is also inconsistent with the real trajectory of the following vehicle when it comes to a stop, as indicated by the red circles in Figures 18(a) and 18(b). The simulation results in Figures 16-18 indicate that when the loss of the ANN CF-A models is not sufficiently low, they may have a limited ability to simulate long-term stop-go behavior in discontinuous traffic flows. In comparison, although the conventional CF models could not accurately simulate vehicle movement, they were more stable than the ANN CF-A models. The ANN CF-V models, especially CNN-LSTM and Conv-LSTM, still performed best, as shown in Figure 18(c).
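One way to see why small acceleration errors matter for long stop-go sequences is to integrate a slightly biased acceleration over time. The sketch below is an illustration under stated assumptions, not the update scheme used in the study: it uses forward-Euler integration, a time step of 0.1 s, and a hypothetical constant acceleration error of -0.2 m/s².

```python
def integrate_acceleration(acceleration, v0, x0, dt=0.1):
    """Forward-Euler integration of acceleration into velocity and position."""
    velocity = [v0]
    position = [x0]
    for a in acceleration:
        # Advance position with the current velocity, then update the velocity.
        position.append(position[-1] + velocity[-1] * dt)
        velocity.append(velocity[-1] + a * dt)
    return velocity, position


# Toy scenario (hypothetical): a vehicle creeping at 1 m/s for 10 s.
steps = 100
true_acceleration = [0.0] * steps
biased_acceleration = [-0.2] * steps  # assumed constant prediction error

true_v, true_x = integrate_acceleration(true_acceleration, v0=1.0, x0=0.0)
sim_v, sim_x = integrate_acceleration(biased_acceleration, v0=1.0, x0=0.0)

final_velocity_error = sim_v[-1] - true_v[-1]
final_position_error = sim_x[-1] - true_x[-1]
```

In this toy case the simulated velocity drifts below zero and the position error grows to several metres, which mirrors how velocity anomalies carry over into the trajectory when acceleration is the model output.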
Similar to the simulation experiment for continuous traffic flow, the RMSE of acceleration, velocity, and trajectory for the ANN CF-A and ANN CF-V models is shown in Figures 19 and 20. The RMSE of the ANN CF-A and ML models is greater than that of the conventional CF models, which is consistent with the results stated earlier. Notably, all ANN CF-V models had a lower trajectory RMSE, and the CNN-LSTM and Conv-LSTM models had the lowest errors. In addition, the KNN and FRNN models simulate trajectory better than velocity, and the FRNN CF model achieved results similar to those of the ANN CF-V models.
As illustrated in Figure 21, the ANN CF-V models outperform the other three types of CF models. Among the conventional CF models, Gipps and IDM produce a narrower hysteresis loop than the FVD model. The reason for this is that the FVD model's abnormal simulated velocity had a negative effect on its performance. For the ANN CF-A models, the velocity of BP is negative, indicating that it violates the basic rules of CF behavior. For the ANN CF-V models, it can be seen that the CNN-LSTM and Conv-LSTM CF models are more consistent with the real data than the KNN and ML CF models.

Conclusion
In this study, we compare conventional and ANN CF models to analyze and evaluate how well they describe vehicle behavior at signalized intersections. The filtered NGSIM data are used to train and validate the CF models. We propose the CNN-LSTM and Conv-LSTM CF models. The simulation is based on two typical categories of traffic flow at a signalized intersection. The acceleration, velocity, trajectory, and hysteresis loops, as well as the RMSE, are used to quantify the performance of all CF models in predicting the movement state of the following vehicle. The findings are summarized as follows:
(1) While conventional CF models perform poorly in acceleration and velocity prediction for continuous traffic flow simulation and in velocity prediction for discontinuous traffic flow simulation, they perform better than ANN CF-A models in acceleration prediction for discontinuous traffic flow simulation. This demonstrates that conventional CF models have an advantage in discontinuous traffic flow simulation, particularly for forecasting the acceleration of vehicles with long-term stop-go behavior. Because the parameters of conventional CF models have explicit physical meanings, they can effectively follow the human driver rules integrated into the model. Thus, when the accuracy of ANN CF-A models is insufficient for discontinuous traffic flow, conventional CF models are recommended.
(2) The two proposed ANN CF models outperformed the other CF models except in acceleration prediction for discontinuous traffic flow simulation. The CNN-LSTM and Conv-LSTM models fully exploit the advantages of convolution and LSTM. The ANN CF models are recommended for velocity prediction and for short-term acceleration prediction in traffic flow simulations at signalized intersections. Additionally, the accuracy of the ANN CF models must be improved further if they are to be used in more complicated traffic flow scenarios at signalized intersections.
The study's limitation is that it considers only local following behavior (i.e., between two consecutive vehicles) and that different datasets may affect the performance of the ANN CF models. Vehicle platooning will be considered in future work, and extended experiments with diverse datasets will be required to gain additional insight into the proposed ANN-based CF models.

Data Availability
Two datasets are used in this study to train and calibrate car-following models of signalized intersections: the Lankershim dataset and the Peachtree dataset, both of which are provided by the Federal Highway Administration's NGSIM program. The NGSIM datasets include detailed vehicle and road information for microscopic traffic flow research.