Application of CNN-LSTM Model for Vehicle Acceleration Prediction Using Car-following Behavior Data

Accurate vehicle acceleration prediction is useful for developing reliable Advanced Driving Assistance Systems (ADAS) and improving road safety. The existence of driver heterogeneity magnifies the variations in acceleration data, which in turn degrades the precision of vehicle acceleration prediction. However, few studies have fully considered driver heterogeneity when predicting vehicle acceleration. To model the characteristics of individual drivers, this study first identifies the driving behavior semantics, defined as the underlying patterns of driving behaviors. The analysis results from the coupled hidden Markov model (CHMM) are used to evaluate the driving behavior differences between drivers by Wasserstein distance. Then the convolutional neural network (CNN) and long short-term memory (LSTM) network are applied to predict vehicle acceleration. To validate the accuracy of the proposed prediction framework, vehicle acceleration data in car-following conditions are extracted from the Safety Pilot Model Deployment (SPMD) dataset. The segmentation results indicate that the CHMM possesses a robust capacity for modeling driving behavior. The prediction results demonstrate that the proposed framework, which incorporates driver clustering before prediction, significantly improves the accuracy of predictions. In addition, the CNN-LSTM outperforms the LSTM in predicting vehicle acceleration during car-following scenarios. The findings from this study can enhance the development of personalized functionalities within ADAS to promote its deployment, thereby improving its acceptance and safety.


Introduction
Autonomous driving technology is gaining attention as a solution to promote traffic efficiency and prevent accidents in diverse traffic conditions. In the car-following scenario, a significant challenge in autonomous driving is to accurately model the driving behavior and predict the future movements of the preceding vehicle. Driving behavior reflects a driver's operations on the vehicle and has an important influence on road safety [1,2]. Many studies have shown that 80% of traffic accidents are caused by inappropriate driving behavior such as aggressive behavior [3]. Vehicle acceleration is a crucial aspect for describing driving behavior. Accurate prediction of vehicle acceleration allows advanced driver assistance systems (ADAS) to track dynamic changes in the driver's actions and to predict the driver's future operations. This improvement in ADAS can compensate for the driver's deficiencies in perception and decision-making, thereby improving traffic safety.
1.1. Driving Behavior Analysis. Many studies have demonstrated significant heterogeneity in car-following behavior among drivers, indicating diverse responses and adjustments to traffic flow changes [4,5]. The heterogeneity among drivers contributes to substantial fluctuations in the driver characteristic variables, thereby increasing the complexity of modeling driving behavior, which carries implications for both the design of ADAS and traffic safety [6,7]. However, most existing driving behavior modeling methods are unable to effectively handle this heterogeneity and therefore cannot accurately capture the driving behavior characteristics of different drivers. To bridge this gap, driving behavior analysis and evaluation have emerged as critical research areas. These methods aim to analyze the driving behavior characteristics of different drivers, enhance the understanding of driver heterogeneity, and evaluate differences between individual drivers based on specific indicators to enable effective clustering and modeling of driving behavior. Approaches currently available for driving behavior analysis can be broadly classified into two categories: those based on statistical features and those based on driving behavior semantics.
Methods based on statistical features usually consider the characteristics of driving data. Traditional methods often rely on a single statistical feature, such as the mean and standard deviation of brake pressure [8], the steering wheel position [9], and the throttle position [10]. The complexity of driving behavior arises from the intricate interplay of multiple factors, including but not limited to acceleration, braking, steering, and lane changing. Therefore, relying on a single feature to assess driving behavior may not adequately capture the full range of its characteristics. Many studies have used multiple indicators to analyze driving behavior. Fugiglando et al. [11] employed different statistical features to analyze driving behavior and applied the K-means algorithm for driver clustering [12]. Euclidean distance was applied to recognize the preference of driving behavior with different statistical features, including vehicle speed and throttle opening [13]. However, one major limitation of these driving behavior analysis approaches is that the intrinsic and dynamic characteristics of driving behavior cannot be captured from statistical data. Additionally, the driving behavior evaluation approaches above are based on static criteria and do not take into account the randomness of data changes during driving.
Driving behavior involves dynamic decision-making. Even when faced with identical traffic conditions, a driver's decision can change over time [9]. Therefore, it is crucial to fully capture dynamic characteristics from drivers' operations for a better analysis of driving behavior. Some studies suggest that driving behavior should be segmented into driving behavior semantics, which are defined as data blocks sharing the same behavior characteristics. The driving behavior semantics can reflect the distinct dynamic correspondence between the driving environment and driving operation caused by driver heterogeneity [14]. Methods based on driving behavior semantics can be classified into traditional methods and Markov chain-based methods. In traditional methods, supervised and unsupervised learning techniques have been commonly used [15]. Supervised learning methods, such as the fuzzy logic algorithm [16,17], have been introduced for segmentation and identification. Xie and Zhu [18] employed timestamp-based segmentation and random forest classification to analyze driving behavior. However, supervised learning methods require labeled training data, and labeling can be laborious and time-consuming. Unsupervised learning methods have emerged as an alternative. The two-step algorithm proposed by Higgs and Abbas [6] has been used for car-following segmentation and clustering of driving behavior using K-means. Taylor et al. [19] used regularization to model the heterogeneity of driving behavior semantics over time.
Methods based on Markov chains are often used in sequence segmentation fields, such as natural language processing. There are many similarities between driving behavior semantics analysis and natural language processing. Both driving behaviors and natural language data are time-series data, where the order of events is of vital importance. The meaning of a word in natural language is determined by the words surrounding it. Similarly, a specific driving action may imply different things depending on the surrounding driving environment. Thus, some natural language processing approaches have also been applied in driving behavior segmentation. The hidden Markov model (HMM) has demonstrated significant advantages in capturing dynamic processes in natural language and has found extensive applications in modeling driving behavior [20]. The HMM with Gaussian mixture emissions (GMM-HMM) was proposed to analyze the heterogeneity of different driving behavior [20,21]. These driving behavior analysis models could capture the underlying stochastic and dynamic characteristics of driver behaviors, but failed to obtain the microscopic preferences, which include the interrelationship between different characteristic variables. Wang et al. [22] applied a Markov chain-based method to identify primitive driving patterns from temporal driving data and subsequently employed Kullback-Leibler (KL) divergence to classify 75 driving patterns. KL divergence compares driving patterns through their feature distributions, which better takes into account the randomness of driving behavior data. Other commonly used methods for this purpose include the Jensen-Shannon (JS) divergence [23] and the Cauchy-Schwarz divergence [24]. These driving behavior evaluation methods consider the influence of the randomness of driving data and are capable of measuring the difference in information contained in two temporal driving data sets. However, these measures suffer from low discriminative power when the distributions have little or no overlap, leading to an inability to effectively distinguish differences between them, and they often require defining comparison intervals to ensure precise and stable computation results.
Therefore, the difference between driving behaviors should be distinguished more comprehensively and meticulously, considering the connection between characteristic variables. Driving behavior evaluation metrics should also be able to handle scenarios where the distributions have little overlap and should have a wider range of applicability.

Vehicle Acceleration Prediction.
There are two categories of vehicle acceleration prediction models: mathematical models and machine learning models. A mathematical model has a fixed structure based on different mathematical parameters. Kim and Yi [25] utilized a probabilistic method for holistic vehicle state prediction, including acceleration. This model could be solved as a multistage optimal estimation problem. Parameter analysis and the Fisher discriminant method were combined to predict vehicle acceleration [26], and the desired spacing car-following model was adopted in that model. However, these models have deficiencies in nonlinear fitting of vehicle acceleration data.
Machine learning has become an increasingly popular approach for vehicle acceleration prediction. Zhang et al. [27] introduced a nonlinear autoregressive model with exogenous inputs (NARX) for onboard implementation. The support vector machine (SVM) was used to train on acceleration samples and forecast acceleration [28]. Recently, deep learning models have gained significant attention. A long short-term memory (LSTM) neural network was used to generate accurate vehicle acceleration distributions and predict future acceleration values [29]. Lio et al. [30] demonstrated the performance of recursive, nonrecursive, structured, and nonstructured networks for vehicle acceleration prediction, and the nonrecursive network proved to be preferable. Although these models exhibit robust capabilities for nonlinear fitting, they do not have specialized modules for feature extraction. When dealing with complex data, such as high-dimensional time-series data, they may not fully capture the characteristics of car-following data, which are critical for accurately predicting vehicle acceleration.
Previous prediction methods overlook the differences in driving behavior among individuals and fail to fully extract the valuable information from traffic data. Furthermore, the current driving behavior evaluation indicators suffer from limited applicability, which can lead to unreasonable clustering results for drivers. These limitations may decrease the accuracy of vehicle acceleration prediction, which lowers the acceptance of ADAS [31]. To address the shortcomings above, this paper proposes a framework for predicting vehicle acceleration by analyzing driving behavior. Driving behavior clustering based on driving behavior semantics segmentation is conducted before prediction, and the prediction model, a fusion of LSTM and convolutional neural network (CNN), is referred to as the CNN-LSTM. In this study, a coupled hidden Markov model (CHMM) is utilized to capture the fundamental driving patterns. The CHMM provides several advantages compared to other hidden Markov models, particularly in its ability to model interacting processes across various domains [11,32]. It enables a comprehensive consideration of the interrelationships among different variables. Based on the results of driving behavior semantics segmentation, drivers are clustered into different groups by Wasserstein distance. The Wasserstein distance is a distribution distance metric that is insensitive to outliers and makes no assumptions on the distribution range. It effectively distinguishes the difference between two probability distributions, even in cases of limited overlap, making it a valuable tool for measuring driver heterogeneity [33,34]. The prediction model combines the LSTM with the CNN, since the CNN is good at extracting features from variables, which can improve the performance of the prediction model [35,36].
The contributions of this study can be summarized as follows: (1) A method based on CHMM and Wasserstein distance for driving behavior analysis and evaluation is introduced to undertake a refined clustering of drivers. (2) A CNN-LSTM model is proposed for predicting vehicle acceleration. The results indicate that the CNN-LSTM, with its strong feature extraction capability, outperforms the LSTM in vehicle acceleration prediction.
The remainder of this paper is arranged as follows: Section 2 introduces the data sources and preprocessing. Section 3 introduces the CHMM and the CNN-LSTM. Section 4 shows the results of semantic segmentation and evaluation for driving behavior and also presents the vehicle acceleration prediction results using the CNN-LSTM. Finally, Section 5 provides conclusions and future work for this study.

Data Description and Preprocessing
2.1. Data Extraction. The car-following data used in this paper are obtained from the Safety Pilot Model Deployment (SPMD) dataset. This comprehensive dataset contains driving data for 2,842 vehicles over two years in Ann Arbor, Michigan, USA. Ninety-eight sedans in this dataset are equipped with a data acquisition system and Mobileye [37]. The onboard data, including vehicle speed, acceleration, and GPS, are obtained from the data acquisition system, while the lateral position relative to the lane or road edge is recorded by the Mobileye system. Each driver operates a vehicle and engages in several car-following instances. During these instances, data such as the car-following event ID, relative distance between the subject vehicle and the preceding vehicle, relative speed, acceleration, and data collection timestamps are collected.
The extraction principles for stable car-following events are as follows [38]: (1) The ego vehicle is in the same lane as the vehicle in front. (2) The relative distance is greater than 5 m and less than 120 m. (3) The speed of the ego vehicle exceeds 5 m/s. (4) If the ID of the preceding vehicle changes, the event is terminated. (5) The duration of each car-following process cannot be less than 50 s. The data collection area for the car-following events used in this study is illustrated in Figure 1 [22]. Records from the 30 drivers with the longest trip durations are selected, and the histogram illustrating the speed distribution of the ego vehicle is shown in Figure 2.
The car-following scenario is shown in Figure 3. In this condition, the vehicle's acceleration or deceleration is predominantly determined by the relative distance and relative speed between the preceding vehicle and the ego vehicle [39,40]. To maintain the desired distance, the driver modulates the brake or accelerator pedal. On this basis, three characteristic variables, namely relative speed, relative distance, and the ego vehicle acceleration, are selected to describe different driving behavior. Acceleration of the ego vehicle ($a$): it can explain driving intentions and preferences in driving behavior. Relative distance ($\Delta d$): $\Delta d = x_2 - x_1$ represents the positional difference in the forward direction between the ego vehicle and the preceding vehicle. Relative speed ($\Delta v$): it signifies the speed discrepancy between the ego vehicle and the preceding vehicle, $\Delta v = v_2 - v_1$.
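The five extraction rules above can be sketched as a single pass over time-ordered samples. This is a minimal illustration, not the authors' pipeline: the record keys (`same_lane`, `distance`, `speed`, `lead_id`) are hypothetical names, and the sampling interval `dt = 0.1` s is an assumption that the text does not state.

```python
from typing import Dict, List


def extract_stable_events(records: List[Dict], dt: float = 0.1,
                          min_duration: float = 50.0) -> List[List[Dict]]:
    """Split time-ordered samples into stable car-following events.

    Each record is a dict with hypothetical keys:
      'same_lane' (bool), 'distance' (m), 'speed' (m/s), 'lead_id' (int).
    dt is the assumed sampling interval in seconds.
    """
    events: List[List[Dict]] = []
    current: List[Dict] = []

    def flush() -> None:
        # Rule (5): keep the running segment only if it is long enough.
        if len(current) * dt >= min_duration:
            events.append(list(current))
        current.clear()

    for rec in records:
        # Rules (1)-(3): same lane, 5 m < distance < 120 m, speed > 5 m/s.
        valid = (rec["same_lane"]
                 and 5.0 < rec["distance"] < 120.0
                 and rec["speed"] > 5.0)
        if not valid:
            flush()
            continue
        # Rule (4): a change of the preceding vehicle ends the running event.
        if current and rec["lead_id"] != current[-1]["lead_id"]:
            flush()
        current.append(rec)
    flush()
    return events
```

Segments shorter than `min_duration` are discarded rather than merged with their neighbours, which is one possible reading of rule (5).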

Variables Segmentation.
To better understand individual driving characteristics and to ensure that similar driving behavior semantics extracted from different drivers correspond to consistent driving behavior patterns, the three variables above are categorized into distinct levels. This classification takes into account the data characteristics and the physical and mental perception thresholds of drivers [23,41,42]. The variable segmentation information is shown in Table 1.
To eliminate the effect of dimension, the raw data are standardized as follows, so that after processing the characteristic variables have a mean of 0 and a standard deviation of 1:

$$\tilde{x}_{i,k} = \frac{x_{i,k} - \mu_i}{\sigma_i}, \quad k = 1, \dots, K,$$

where $x_{i,k}$ represents the input data of driver $i$ in stable car-following event $k$, and $i$ ranges from 1 to 30. $K$ is the number of stable car-following events of the driver. $\mu_i$ and $\sigma_i$ denote the mean and standard deviation of the characteristic variables for driver $i$, respectively.
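As a small illustration of this per-driver standardization, the sketch below pools all events of one driver to compute the mean and standard deviation of a single characteristic variable and then rescales each event. The function name and the use of the population standard deviation are assumptions; the text does not say which estimator is used.

```python
from statistics import mean, pstdev
from typing import List


def standardize_driver(events: List[List[float]]) -> List[List[float]]:
    """Z-score one characteristic variable of one driver.

    events: the driver's stable car-following events, each a list of
    raw values of the same variable (for example, acceleration).
    """
    pooled = [v for event in events for v in event]
    mu = mean(pooled)
    sigma = pstdev(pooled)  # population standard deviation (assumption)
    return [[(v - mu) / sigma for v in event] for event in events]
```

The event boundaries are preserved, so the standardized output can be passed on event by event to later processing steps.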

Methodology
3.1. Framework. Figure 4 displays the overall structure of the vehicle acceleration prediction in this paper. First, the car-following data are processed to obtain stable car-following events. Second, the CHMM is used to divide the processed data into driving behavior semantics segments. The difference in driving behavior is then assessed using the Wasserstein distance, and drivers are grouped into subgroups depending on the driving behavior evaluation results. Third, the CNN-LSTM is constructed for acceleration prediction in the different subgroups.

CHMM.
Driving behavior data are subject to random variation, and under the Markov assumption the next state depends only on the current state of the driving system. Traditional models struggle to capture the randomness of driving behavior. The HMM, with its powerful ability to describe dynamic processes, can effectively handle this stochasticity. Therefore, the HMM has gained extensive utilization in the analysis of driving behavior [43].
The CHMM is a variant of the HMM that extends its capabilities. It incorporates a coupled multichain structure that establishes interconnections between the hidden state variables of the individual HMM chains. This model effectively captures interactions among multiple sequences, thereby enhancing its ability to handle latent connections between variables [23]. Thus, the CHMM is able to process the interaction between different characteristic variables. The CHMM with three chains is shown in Figure 5. In this study, the hidden state sequence is represented by the driving patterns, while the characteristic variables are used as the observation sequence. The sequence of observations for each HMM is only related to the hidden states of this HMM. The hidden state in each chain is not only related to the hidden state at the previous moment in its own HMM chain but also influenced by the hidden states of the other chains [23]. In the CHMM, each HMM chain has $M$ hidden states, and the total number of hidden states is $M^3$. Because all variables are continuous, the emission function in the CHMM uses a Gaussian distribution to calculate the observation probability: for hidden state $q_{c,u}$ of HMM chain $c$, the observation follows a Gaussian distribution with mean $\mu_{c,u}$ and standard deviation $\sigma_{c,u}$. Driving behavior segmentation is a classic decoding problem in the application of HMM. The CHMM can compute the most likely sequence of hidden states for the time-series driving characteristics in the car-following process. The input of the CHMM includes the time series of characteristic variables in the car-following events, the number of hidden states, the initial state transition probabilities, and the emission probabilities, and the CHMM computes and outputs the most likely hidden state for each point in time.
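The decoding step can be illustrated with Viterbi decoding over the joint state space of the coupled chains. This sketch makes a simplifying assumption that is not taken from the paper: the coupled chains are treated as one HMM with $M^C$ joint states, the coupling is carried entirely by the joint transition matrix `log_A`, and the joint emission is the product of per-chain Gaussian densities. All function and parameter names are hypothetical.

```python
import itertools

import numpy as np


def gaussian_logpdf(x, mu, sigma):
    """Log-density of a univariate Gaussian."""
    return -0.5 * np.log(2.0 * np.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)


def viterbi_joint(obs, log_pi, log_A, mu, sigma):
    """Most likely joint hidden-state path.

    obs:    (T, C) observations, one column per chain.
    log_pi: (S,) initial log-probabilities over joint states, S = M**C.
    log_A:  (S, S) joint transition log-probabilities (row: from, column: to).
    mu, sigma: (C, M) Gaussian emission parameters per chain and state.
    Returns a list of T tuples, one hidden state per chain.
    """
    T, C = obs.shape
    M = mu.shape[1]
    states = list(itertools.product(range(M), repeat=C))
    S = len(states)

    # Joint emission log-probability: sum of per-chain Gaussian log-densities.
    log_B = np.zeros((T, S))
    for s, combo in enumerate(states):
        for c, u in enumerate(combo):
            log_B[:, s] += gaussian_logpdf(obs[:, c], mu[c, u], sigma[c, u])

    delta = log_pi + log_B[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A        # scores[i, j]: best path ending i -> j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_B[t]

    # Backtrack from the best final state.
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()
    return [states[s] for s in path]
```

Consecutive time steps that share the same decoded tuple would then form one driving behavior semantics segment. With three chains and $M = 3$ the joint space has only 27 states, so the exhaustive joint formulation stays cheap.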

CNN-LSTM.
A CNN-LSTM model consists of two parts: the CNN part is used to capture features of the car-following data and improve the algorithm's efficiency, and the LSTM part is then constructed to predict the acceleration data. The framework of the CNN-LSTM model is presented in Figure 6. First, the input sample $X_t$ contains the $l$ time steps, spaced $\Delta t$ apart, before the $t$-th time step, where $x_t$ is a multidimensional vector at time $t$, expressed as $x_t = [\Delta d_t, \Delta v_t, a_t]^T$. Second, the CNN part is utilized to extract different features. CNN is a deep neural network that employs convolutional computation [44,45]. It has been shown to be effective in extracting features from matrices and accelerating the training process [46]. The core component of CNN is the convolution layer, which extracts features from characteristic variables via the convolution operation [47,48]. The structure of a single-layer convolutional network can be expressed as follows:

$$N_1^j = \sigma\left(W_1^j * X_1^j + b_1^j\right),$$

where $N_1^j$ is the output of the convolution layer, $X_1^j$ is the collection of input samples $X_t$, $W_1^j$ is the convolution kernel, $b_1^j$ is the bias of the layer, $j$ is the channel index over the multiple convolution filters in the convolutional layer, $*$ denotes the convolution operation, and $\sigma$ is the activation function.
In general, the pooling layer follows the convolutional layer to reduce the dimensionality of the feature data obtained by convolution. In this paper, the maximum pooling layer is utilized. The CNN part is followed by the LSTM part. LSTM, a variant of recurrent neural networks (RNN), incorporates input gates, output gates, forget gates, and memory cells to address the challenge of gradient vanishing or explosion [49][50][51]. Figure 7 shows the structure of the LSTM unit. The forget gate (FG) determines which information in the cell state is discarded. According to $h_{t-1}$ and $x_t^*$, the forget gate outputs a number between 0 and 1 for the state, where 0 represents complete discard and 1 indicates full retention. The FG is expressed in (8):

$$f_t = \sigma\left(\omega_f \cdot [h_{t-1}, x_t^*] + b_f\right), \tag{8}$$

where $f_t$ represents the state of the forget gate at time $t$, $\omega_f$ and $b_f$ are the forget gate weight and bias, respectively, $x_t^*$ is the input at time $t$, which is also the output from maximum pooling, $h_t$ is the output, and $\sigma$ is the activation function. The input gate (IG) is used to update information in the cell state. It feeds the input $x_t^*$ at the present moment and the hidden layer state $h_{t-1}$ at the previous time into the sigmoid function. It is calculated as follows:

$$i_t = \sigma\left(\omega_i \cdot [h_{t-1}, x_t^*] + b_i\right),$$

where $i_t$ indicates the input gate state, and $\omega_i$ and $b_i$ are the weight and bias of the input gate, respectively. The cell state is updated by the information stored in the previous memory cell $c_{t-1}$ and the new candidate information, as shown in the following equations:

$$\tilde{c}_t = \tanh\left(\omega_c \cdot [h_{t-1}, x_t^*] + b_c\right),$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t,$$

where $\tilde{c}_t$ is the candidate cell state at $t$, $c_t$ is the updated cell state, and $\omega_c$ and $b_c$ are the weight and bias of the cell state, respectively.
The output $h_t$ is based on the memory cell state $c_t$ and the output gate state $o_t$. It is expressed in the following equations:

$$o_t = \sigma\left(\omega_o \cdot [h_{t-1}, x_t^*] + b_o\right),$$

$$h_t = o_t \odot \tanh(c_t).$$
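To make the feature extraction step concrete, the following NumPy sketch applies one convolution layer along the time axis of an input window and then max pooling, the pooling type used in this study. The kernel size, the number of filters, the pooling size, and the choice of ReLU as the activation $\sigma$ are illustrative assumptions; none of them are specified in the text.

```python
import numpy as np


def conv1d_relu(X, W, b):
    """Single convolution layer over time with ReLU activation.

    X: (l, d) input window (l time steps, d characteristic variables).
    W: (k, d, J) kernels: J filters, each spanning k time steps.
    b: (J,) biases.
    Returns an (l - k + 1, J) feature map.
    """
    l, _ = X.shape
    k, _, J = W.shape
    out = np.empty((l - k + 1, J))
    for t in range(l - k + 1):
        # Contract the (k, d) patch with every filter at once.
        out[t] = np.tensordot(X[t:t + k], W, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)


def max_pool1d(N, size=2):
    """Non-overlapping max pooling along the time axis."""
    steps = N.shape[0] // size
    return N[:steps * size].reshape(steps, size, -1).max(axis=1)
```

For a window of 80 time steps with three variables, a kernel of width 3 with four filters yields a 78 x 4 feature map, and pooling with size 2 reduces it to 39 x 4 before it is passed to the LSTM part.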

Journal of Advanced Transportation
Here, $\omega_o$ and $b_o$ are the weight and bias of the output gate, respectively.
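The gate computations described above can be sketched as a single LSTM time step. This assumes the standard LSTM formulation, in which each gate acts on the concatenation of $h_{t-1}$ and $x_t^*$ and the candidate state uses $\tanh$. The parameter dictionary and its key names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step.

    x:      (D,) pooled CNN features at time t.
    h_prev: (H,) previous hidden state; c_prev: (H,) previous cell state.
    params: dict with weights 'w_f', 'w_i', 'w_c', 'w_o' of shape (H, H + D)
            and biases 'b_f', 'b_i', 'b_c', 'b_o' of shape (H,).
    Returns the new hidden state and cell state.
    """
    z = np.concatenate([h_prev, x])
    f = sigmoid(params["w_f"] @ z + params["b_f"])        # forget gate
    i = sigmoid(params["w_i"] @ z + params["b_i"])        # input gate
    c_tilde = np.tanh(params["w_c"] @ z + params["b_c"])  # candidate cell state
    c = f * c_prev + i * c_tilde                          # updated cell state
    o = sigmoid(params["w_o"] @ z + params["b_o"])        # output gate
    h = o * np.tanh(c)                                    # hidden state / output
    return h, c
```

Running this step over the pooled feature sequence and feeding the final `h` to a linear output layer would give a one-step acceleration prediction.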

Wasserstein Distance.
In the driving behavior evaluation part, the Wasserstein distance is used to evaluate the difference between the distributions of various evaluation indicators for drivers. The Wasserstein distance is a mathematical method used to measure the distance between two probability distributions $f(x)$ and $g(x)$. It can be mathematically expressed as follows:

$$W_p(f, g) = \left( \inf_{\gamma \in \Gamma(f, g)} \int |x - y|^p \, \mathrm{d}\gamma(x, y) \right)^{1/p},$$

where $\Gamma(f, g)$ is the set of all joint distributions $\gamma(x, y)$ whose marginals are $f(x)$ and $g(x)$, and $W_p(f, g)$ is the Wasserstein distance of order $p$ between $f(x)$ and $g(x)$. When $p = 1$, the Wasserstein distance is referred to as the first-order Wasserstein distance or Earth Mover's Distance. In comparison to KL divergence and JS divergence, the Wasserstein distance has the advantage of being sensitive to the shape of distributions. Even when two distributions have minimal overlap, the Wasserstein distance can effectively capture the dissimilarities between them. Furthermore, KL divergence becomes undefined wherever one distribution assigns zero probability to a region in which the other does not, and JS divergence saturates at a constant once the two distributions no longer overlap. In contrast, the Wasserstein distance overcomes this drawback, which makes it appropriate for the evaluation indicators in this study.
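For one-dimensional empirical samples of equal size, the first-order Wasserstein distance reduces to the mean absolute difference between the sorted samples. The sketch below relies on that property; the equal-size restriction is a simplification for illustration, not something the paper imposes.

```python
from typing import Sequence


def wasserstein_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """First-order Wasserstein distance between two equal-size 1D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # Optimal transport in 1D pairs the i-th smallest value of xs
    # with the i-th smallest value of ys.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)
```

For example, `wasserstein_1d([0, 1, 2], [10, 11, 12])` gives 10.0 and `wasserstein_1d([0, 1, 2], [20, 21, 22])` gives 20.0: the distance keeps growing as two non-overlapping samples move apart, which is the behavior that motivates its use for comparing drivers.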

Driving Behavior Evaluation.
Given the diverse characteristics of drivers, similar car-following scenarios can prompt varied responses among individuals, resulting in significant fluctuations in the car-following characteristic variables between drivers. This poses a challenge to driving behavior modeling and prediction. By assessing the driving behavior among drivers and grouping those with minor differences together, the variability in the characteristic variables within a group is reduced, facilitating more accurate modeling and prediction of driving behavior. Depending on the characteristics of driving behavior, most studies have divided drivers into three types: aggressive, moderate, and conservative [52][53][54]. Therefore, in this study, the number of hidden states in a single CHMM chain is set to 3, and the model is referred to as CHMM_3.
Figure 8 shows the duration distribution results for driver #5. As can be discerned from the figure, the majority of driver #5's durations fall within the 1-10 second interval, with the average duration being 5 seconds. Figure 9 illustrates the segmentation results of driver #30. The CHMM divides the car-following process into several segments. Each segment represents a single primitive driving pattern, reflecting a uniform driving behavior characteristic. The background color blocks represent the extracted driving behavior patterns, with uniform colors indicating similar behavior types. The curves in the graph indicate significant fluctuations in the characteristic variables of driver #30. The segmentation results obtained from CHMM_3 align closely with the fluctuations in the driver characteristic data. Thus, the driving behavior semantics extracted from the data effectively reflect the potential driving behavior characteristics represented by the characteristic variables.
Table 2 displays the duration distribution of driving behavior semantics obtained from GMM-HMM and CHMM_3. The GMM-HMM has been widely applied in the field of segmentation to process sequence data and uncover hidden states [55,56]. The results from CHMM_3 indicate that 81.3% of pattern durations fall within the range of 1 to 10 seconds, while the proportion of durations from 1 to 10 seconds in the results of GMM-HMM is only about 60%. A similar study by Wang et al. [22] yielded similar findings, where over 80% of behavior semantics durations ranged between 1 and 10 seconds, with a mean duration of approximately 5.9 seconds. Thus, the behavior semantics segments obtained from CHMM_3 can better describe the actual driving behavior characteristics.
In this study, the difference in driving behavior among drivers is evaluated by Wasserstein distance from three aspects: the duration, the occurrence probability of a primitive driving pattern, and the distribution of characteristic variables between every pair of drivers [23]. To eliminate the influence of results with different magnitudes on the overall result, normalization is applied to the results of the three dimensions separately.
The duration distribution of semantics can provide insights into the characteristics of the patterns of driving behavior. For driver $i$, the duration of driving behavior semantics is $d_i$, and the duration distribution of semantics is represented as $g(d_i)$. For driver $j$, the distribution of the duration $d_j$ is represented as $g(d_j)$. The difference in the duration of driving behavior semantics between the two drivers, $D_{i,j}$, is expressed as the Wasserstein distance between $g(d_i)$ and $g(d_j)$.
The occurrence probability of a primitive driving pattern can serve as an indicator of drivers' driving behavior preference. After the average values of the three variables are calculated for each driving behavior semantics, labels are assigned to the driving behavior semantics based on the intervals defined in Table 1. For example, "CD-KE-AA" represents the semantic label "close distance-keeping-aggressive acceleration." Due to the discrete distribution of labels, the frequencies of different labels are used instead of the values of a continuous distribution function. By conditioning on the car-following distance, the frequencies of different labels are computed for each distance category. For drivers $i$ and $j$, the occurrence probabilities of labels are $P(l \mid S_{\Delta d}^{i})$ and $P(l \mid S_{\Delta d}^{j})$, respectively. The average Wasserstein distance over the three distance labels then gives $L_{i,j}$.
The distribution of characteristic variables reflects the degree of aggressiveness. According to Table 2, the driving behavior semantics are divided into eight segments based on the duration. The mean and standard deviation of the feature variables $x_k$ are separately calculated for drivers $i$ and $j$ within these eight intervals. The data distributions are then fitted using normal distributions, which are represented as $g(r_i^{x_k})$ and $g(r_j^{x_k})$. The average of the Wasserstein distances between these distributions gives the difference in the degree of aggressiveness, $R_{i,j}$. Finally, the average of $D_{i,j}$, $L_{i,j}$, and $R_{i,j}$ provides a comprehensive evaluation index of driving behavior.
Figure 10 shows the heatmap results of the three evaluation aspects after normalization, as well as the comprehensive evaluation results. The darker colors represent the larger Wasserstein distances, indicating a greater difference between the two drivers in each indicator. Rows and columns corresponding to drivers with significantly different behaviors compared to others in Figures 10(a
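The combination of the three aspects can be sketched as follows: each pairwise distance matrix is normalized separately and the three are then averaged. Min-max scaling is an assumption here, since the text only states that normalization is applied to each dimension; the function names are hypothetical.

```python
import numpy as np


def min_max(matrix):
    """Scale a pairwise distance matrix to the range [0, 1]."""
    m = np.asarray(matrix, dtype=float)
    span = m.max() - m.min()
    if span == 0:
        return np.zeros_like(m)
    return (m - m.min()) / span


def comprehensive_index(D, L, R):
    """Average of the normalized duration (D), label-probability (L)
    and characteristic-variable (R) distance matrices."""
    return (min_max(D) + min_max(L) + min_max(R)) / 3.0
```

Entry `[i, j]` of the result is then the comprehensive difference between drivers `i` and `j`, and drivers whose mutual entries are small are candidates for the same group.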

Prediction Results Analysis and Comparison.
We compare the performance of the proposed framework with the LSTM, and the acceleration prediction experiments are carried out separately for each group. All experiments have been conducted on a computer using the TensorFlow framework. As shown in equations (18) and (19), two indicators are employed for evaluating the performance of different algorithms on different groups: mean squared error (MSE) and mean absolute error (MAE).

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2, \tag{18}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left|y_i - \hat{y}_i\right|, \tag{19}$$

where $n$ is the number of prediction samples, $y_i$ represents the true value, and $\hat{y}_i$ represents the predicted value of the acceleration.
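The two indicators are straightforward to compute; a minimal sketch with hypothetical function names is given below.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between true and predicted accelerations."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between true and predicted accelerations."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
```

Because MSE squares each error, it penalizes occasional large misses more heavily than MAE, which is why the two indicators are usually read together.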
For each driver in each group, the data is divided into training and testing sets, which are grouped based on carfollowing events, with a 4 : 1 ratio.Within the training set, allocate 15% of the data as the validation set.Te input length for both models is 80, with a prediction length of 1. Due to the inherent randomness introduced by random seeds in deep learning models, each model's performance can vary with each run.To reduce the infuence of randomness in the CNN-LSTM and the LSTM, fve separate experiments are conducted for both models, and results are shown in Table 3.For both models, across fve repeated experiments, the data used and hyperparameters (including learning rate, number of neurons, type of optimizer, etc.) are kept consistent.Group 1 consists of eight drivers with a minor diference in driving behavior, whereas group 2 comprises drivers with varying driving behavior.Group 3 encompasses all the drivers examined in this study.And group 4 consists of eight drivers that were selected at random.Te degree of heterogeneity between the four driver groups follows the pattern: group 2 exhibits higher heterogeneity than group 3, while group 3 exhibits higher heterogeneity than group 1.And group 4 also exhibits higher heterogeneity than group 1. 
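The data preparation described above can be sketched in two steps: an event-level 4 : 1 split, then sliding windows of 80 input steps with a one-step target inside each event. Splitting events in their given order and cutting windows only within a single event are assumptions; the function names and the tuple layout `(dd, dv, a)` are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (relative distance, relative speed, acceleration)


def split_events(events: Sequence, ratio: float = 0.8):
    """Split a driver's car-following events into training and testing sets (4 : 1)."""
    cut = int(len(events) * ratio)
    return list(events[:cut]), list(events[cut:])


def make_windows(event: Sequence[Sample], input_len: int = 80,
                 horizon: int = 1) -> Tuple[List[List[Sample]], List[float]]:
    """Build (input window, target acceleration) pairs from one event."""
    inputs, targets = [], []
    for t in range(input_len, len(event) - horizon + 1):
        inputs.append(list(event[t - input_len:t]))  # the 80 preceding steps
        targets.append(event[t + horizon - 1][2])    # acceleration to predict
    return inputs, targets
```

Splitting at the event level rather than the sample level keeps overlapping windows from the same event out of both sets at once, which avoids leakage between training and testing data.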
Te bold results in the table represent the minimum experimental errors among diferent groups.In each group, the results of the fve experiments for diferent models are distinct, but they demonstrate minor fuctuations within a narrow range.Te results suggest the existence of randomness in the deep learning network   For further comparison, group 1 shows decreased predictive errors in both models compared to the other three groups.Tis observation indicates that the efectiveness of both models in vehicle acceleration prediction is diminished when applied to driver groups with greater heterogeneity.In particular, the CNN-LSTM of group 1 achieves a higher accuracy than group 4 with an improvement of 60.0% in MSE and 42.8% in MAE.Both group 1 and group 4 comprise an equal number of drivers.Te diference lies in the method of selecting these eight drivers.Group 1 is the result of clustering using the Wasserstein distance based on the behavior semantics divided by the CHMM, while group 4 is the result of random selection.Te better performance of group 1 is probably due to the fact that the CHMM is capable of efectively partitioning driving behavior semantics and that the driving behavior evaluation method using Wasserstein distance is comprehensive and rational, leading to a more accurate evaluation of driver heterogeneity and consequently making the driving behavior more consistent among drivers within group 1. 
Groups 2-4 show larger heterogeneity among drivers, which presents a greater challenge for vehicle acceleration prediction. The true values and the predicted values are compared in Figure 11. The areas where the two models differ significantly in performance are highlighted with green boxes. In each of the three subfigures, the CNN-LSTM aligns better with the true value curve than the LSTM, as is evident from visual inspection. This is probably because the proposed framework can proficiently leverage the informative data available. The model exhibits an exceptional nonlinear fitting capability for vehicle acceleration prediction in driver groups with minor differences in driving behavior. A pairwise comparison of the three subfigures reveals that the two models fit the ground truth values best in group 1, as shown by the relatively high degree of alignment between the lines. These findings are in line with the prediction results presented in Table 3.

Conclusions
This paper has proposed a framework for predicting vehicle acceleration based on the CNN-LSTM, considering heterogeneity in driving behavior among drivers. To achieve this, the CHMM is utilized to segment car-following data on driving behavior into semantic segments. This model enables an accurate description of the interconnections between multiple variables. Based on the results of the driving behavior evaluation, drivers are categorized into different groups by Wasserstein distance. Wasserstein distance has a wider range of applicability and can provide more accurate clustering results. The CNN-LSTM model is then employed to predict vehicle acceleration for the driver group with minor differences in characteristics. The experimental results demonstrate that the CNN-LSTM outperforms the LSTM in terms of prediction accuracy. Moreover, the CNN-LSTM based on driving behavior analysis exhibits superior prediction performance compared to the CNN-LSTM without clustering. Overall, the proposed model can provide high accuracy for vehicle acceleration prediction. In future work, the proposed CNN-LSTM model can be extended to process car-following data from other locations or different road conditions to further evaluate the generalization capabilities of the model.

Figure 2: The speed distribution of the ego vehicle.

Figure 1: The car-following data collection area.

Figure 4: Entire process of vehicle acceleration prediction.
[...] $Z_{2,t}, Z_{3,t}\}$. For time $t$, the observed values of $\Delta d$, $\Delta v$, and $a$ are $Z_{1,t}$, $Z_{2,t}$, and $Z_{3,t}$. The hidden states are $S_t^3 = \{S_{1,t}, S_{2,t}, S_{3,t}\}$, where $S_{1,t}$, $S_{2,t}$, and $S_{3,t}$ represent the hidden states of $\Delta d$, $\Delta v$, and $a$ at $t$. The CHMM is represented by the parameters $\lambda = (A, B, \Pi)$: the state transition matrix $A = \{a_{u,v}\}$, the observation probabilities $B = b_u(Z_t)$, and the initial state probabilities $\Pi = \pi_u$.

Table 1: Segmentation of variables.

Table 3: The prediction results for four different groups. The bold results represent the outcomes of the model with the smallest prediction errors among the four driver groups.

[...] LSTM across four driver groups. The CNN-LSTM consistently outperforms the LSTM for all four driver groups. For group 1, the CNN-LSTM demonstrates marginally superior predictive accuracy compared to the LSTM. For group 2, the CNN-LSTM outperforms the LSTM with a 21.5% improvement in MSE and a 13.5% improvement in MAE. Similarly, for group 3, the CNN-LSTM outperforms the LSTM with a 25.9% improvement in MSE and a 16.3% improvement in MAE. For group 4, the CNN-LSTM outperforms the LSTM with an 11.2% improvement in MSE and a 6.8% improvement in MAE.
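The percentage improvements quoted above can be read as relative reductions in error with respect to the baseline model. The text does not state the formula explicitly, so the following Python sketch is one plausible reading under that assumption; the function names are hypothetical, and any values passed to them in practice would come from Table 3 rather than from this sketch.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to the baseline.

    Assumption: improvement = 100 * (baseline - model) / baseline.
    """
    return 100.0 * (baseline_error - model_error) / baseline_error


def mean_over_runs(errors):
    """Average an error metric over repeated runs (e.g., five seeds)."""
    return sum(errors) / len(errors)
```

Under this reading, a positive value means the compared model has a lower error than the baseline, and averaging each metric over the repeated experiments before comparing reduces the influence of run-to-run randomness.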