Short-Term Passenger Flow Prediction of Urban Rail Transit Based on SDS-SSA-LSTM

Predicting rail transit passenger flow is crucial for adjusting metro schedules. To increase prediction accuracy, a model is proposed that combines long short-term memory (LSTM) with singular spectrum analysis (SSA). First, a stepwise decomposition sampling (SDS) strategy based on SSA progressive decomposition is proposed to solve the data leakage problem in traditional sequence decomposition. Then, based on this strategy, the passenger flow time series with complex features is decomposed into trend and fluctuation components with relatively simple features. Finally, the LSTM network is employed to perform short-term predictions on each component separately, and the predicted values of the components are summed to obtain the predicted passenger flow. The case study shows that, compared with the single LSTM and other hybrid models, the proposed method offers higher overall prediction accuracy on the experimental days and has a certain applicability.


Introduction
Subway systems have the advantages of large capacity, high speed, and high reliability, and they have attracted increasing attention both academically and practically [1]. However, as the public demands higher-quality metro service, problems such as congestion and inconvenient transfers must be solved.
Short-term passenger flow (STPF) prediction for the metro system is a crucial component of public transportation management. When the predicted STPF and the congestion status of bus routes are released to the public, passengers can adjust their travel mode, route, or departure time in advance and make better-informed travel choices. For the government and operators, an accurate STPF forecast makes it possible to monitor and assess the system status effectively and to implement response measures when emergencies or special events occur [2]. Hence, accurate STPF forecasting is of great significance.
In summary, to predict STPF more accurately, it is necessary to consider whether the prediction results reflect the actual online prediction performance and to avoid data leakage effectively. In this research, a prediction model based on SDS-SSA-LSTM is proposed. The SDS strategy prevents the prediction model from using future data in the testing phase and thus solves the data leakage problem of the traditional overall decomposition sampling (ODS) strategy. Using the SSA method, the trend and fluctuation components of a passenger flow sequence with mixed features can be extracted effectively. Combined with the ability of the LSTM network to capture long-term dependencies in sequence data, accurate STPF prediction can be achieved.

Literature Review
For unstable traffic flow, wavelet decomposition (WD) [3,4] and variational mode decomposition (VMD) [5] have been studied for rail transit STPF prediction. However, these studies all adopt the overall decomposition sampling (ODS) technique to process the data samples; that is, the entire passenger flow time series is decomposed into subsequences at once. The decomposition therefore assumes that future data is known, which causes data leakage. As a result, such models have poor practicability and reliability in online prediction.
The sampling technique based on stepwise decomposition sampling (SDS) [6] can effectively avoid the data leakage problem of the traditional ODS technique. However, not every decomposition method is suitable for SDS. Empirical mode decomposition (EMD) cannot be used because the number of decomposed modes is uncontrollable [7]. Owing to the algorithmic principles of EEMD and CEEMD [8,9], each decomposed component changes significantly when the edge values of the time series being decomposed are updated, so the time sequence of the training samples becomes uncontrollable when they are sampled by SDS. For WD, it is difficult to determine the wavelet basis and the decomposition level. The trend items directly decomposed by VMD [10] produce significant errors, which makes prediction difficult. SSA is a nonlinear time series analysis method [11] that can effectively extract the trend and volatility of the original sequence without selecting a priori basis function. When the time series being decomposed is updated, the historical sequence characteristics of each decomposed component are almost wholly preserved [12]. Zhou et al. [13] combined SSA with an AdaBoost weighted extreme learning machine, using SSA to divide the original data into three parts: trend, periodicity, and residue. Shuai et al. [14] showed that a decomposition-based hybrid traffic flow prediction model can capture the complex regular and random characteristics of traffic flow. Therefore, exploiting the advantages of SSA, this paper decomposes the mixed passenger flow series into a trend series and a fluctuation series with relatively simple characteristics, which are easier for the model to learn.
In terms of STPF prediction, traditional prediction models mainly include the autoregressive integrated moving average model [15], the gray model [16], and the Kalman filtering model [17]. Traditional time series prediction models have weak generalization ability, which limits their performance and application, and they struggle to meet the practical requirement of processing massive data. In recent years, the large amount of data generated by various city sources has allowed a better understanding of the hidden dynamics of the traffic system and has significantly changed the way passenger traffic is predicted. With higher computing capabilities, computational intelligence and data-driven technologies create opportunities to use big data and broaden the practical application of the intelligent transportation system (ITS) [18].
Advanced intelligent algorithms have attracted the attention of many scholars due to their superior structure and performance. Li et al. [19] combined the characteristics of segmented passenger flow data with a BP neural network to predict segmented passenger flow. Tsai et al. [20] proposed a new neural network that produced satisfactory results in forecasting STPF. Deep learning can capture complex nonlinear relationships in big data, which significantly improves predictive accuracy. In terms of STPF prediction, the most commonly employed models are CNN [21–24] and RNN [25], which have shown an excellent ability to extract spatiotemporal structure information. Studies based on LSTM [26–33] have shown that LSTM has certain advantages in capturing the regularity of STPF.
The SDS sampling strategy based on SSA stepwise decomposition can be applied to avoid the data leakage of traditional sequence decomposition methods. The LSTM neural network is suitable for predicting each component by learning from historical data. The SDS-SSA-LSTM hybrid model can therefore provide a new method for accurate STPF prediction. In this paper, we decompose the time series with complex characteristics through the SDS strategy based on SSA, so as to extract the internal characteristics of complex passenger flow data, and build an LSTM network model to forecast short-term metro passenger flow.

Figure 1: The input and output samples.
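(1) Embedding. The original series is mapped into an $L \times K$ trajectory matrix $Y$. The formula is as follows:

$$Y = \begin{bmatrix} x_1 & x_2 & \cdots & x_K \\ x_2 & x_3 & \cdots & x_{K+1} \\ \vdots & \vdots & \ddots & \vdots \\ x_L & x_{L+1} & \cdots & x_n \end{bmatrix}$$

where $x_1, x_2, \ldots, x_n$ is the original series of length $n$, $L$ is the window length, and $K = n - L + 1$.

(2) Singular value decomposition. Calculate the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d > 0$ of $YY^{T}$; the singular value decomposition of the trajectory matrix $Y$ can be written as follows:

$$Y = \sum_{i=1}^{d} Y_i = \sum_{i=1}^{d} \sqrt{\lambda_i}\, U_i V_i^{T}$$

where $\sqrt{\lambda_i}$ is the $i$th singular value of the trajectory matrix $Y$, $U_i$ is the corresponding left singular vector, and $V_i$ is the corresponding right singular vector. The contribution of each matrix $Y_i$ to the trajectory matrix $Y$ is related to the eigenvalue $\lambda_i$, and its contribution rate $\eta_i$ can be defined as follows:

$$\eta_i = \frac{\lambda_i}{\sum_{j=1}^{d} \lambda_j}$$

(3) Diagonal averaging. Each submatrix $Y_i$ is reduced to a time series of length $n$. Suppose that $Y$ is an $L \times K$ matrix with elements $y_{ij}$. Let $L^{*} = \min(L, K)$ and $K^{*} = \max(L, K)$, and let $y^{*}_{ij} = y_{ij}$ if $L < K$ and $y^{*}_{ij} = y_{ji}$ otherwise. The reconstructed series $Z_c = (z_{c1}, z_{c2}, \ldots, z_{cn})$ is defined as follows:

$$z_{ck} = \begin{cases} \dfrac{1}{k} \sum\limits_{m=1}^{k} y^{*}_{m,\,k-m+1}, & 1 \le k < L^{*} \\ \dfrac{1}{L^{*}} \sum\limits_{m=1}^{L^{*}} y^{*}_{m,\,k-m+1}, & L^{*} \le k \le K^{*} \\ \dfrac{1}{n-k+1} \sum\limits_{m=k-K^{*}+1}^{n-K^{*}+1} y^{*}_{m,\,k-m+1}, & K^{*} < k \le n \end{cases}$$

(4) Grouping. The singular vector corresponding to the largest eigenvalue represents the dominant trend of the original sequence, and the singular vectors corresponding to the smaller eigenvalues reflect the fluctuation components of the original sequence. By setting the contribution rate threshold $\eta_e$ (generally $\eta_e \ge 80\%$) and then gradually ...

The application of the above SDS strategy completely avoids the data leakage problem in traditional ODS, and the model based on this strategy conforms to the actual online prediction scenario.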
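As an illustration only, the four SSA steps (embedding, singular value decomposition, grouping, and diagonal averaging) might be sketched in Python with NumPy as follows. The function names `ssa_decompose` and `diagonal_average`, the default window length, and the rule that the leading components up to the cumulative contribution threshold form the trend are assumptions made for this sketch, not the authors' implementation.

```python
import numpy as np

def diagonal_average(M):
    """Average an L x K matrix over its antidiagonals (i + j = const)."""
    L, K = M.shape
    total = np.zeros(L + K - 1)
    count = np.zeros(L + K - 1)
    for i in range(L):
        # Row i holds the entries whose series index runs from i to i + K - 1.
        total[i:i + K] += M[i]
        count[i:i + K] += 1
    return total / count

def ssa_decompose(series, window=7, threshold=0.8):
    """Split a 1-D series into (trend, fluctuation) with basic SSA.

    Assumption: the trend is built from the leading components whose
    cumulative contribution rate first reaches `threshold`.
    """
    x = np.asarray(series, dtype=float)
    K = x.size - window + 1
    # (1) Embedding: L x K trajectory matrix with Y[i, j] = x[i + j].
    Y = np.column_stack([x[j:j + window] for j in range(K)])
    # (2) SVD: singular values s_i, so the eigenvalues are lambda_i = s_i**2.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    eta = s**2 / np.sum(s**2)  # contribution rate of each component
    # (4) Grouping: number of leading components assigned to the trend.
    r = min(int(np.searchsorted(np.cumsum(eta), threshold)) + 1, s.size)
    Y_trend = (U[:, :r] * s[:r]) @ Vt[:r]
    # (3) Diagonal averaging turns the grouped matrix back into a series.
    trend = diagonal_average(Y_trend)
    # Diagonal averaging is linear, so the remaining components sum to x - trend.
    return trend, x - trend
```

Calling `ssa_decompose(flow, window=7)` on a daily series `flow` returns two arrays of the same length as `flow` whose sum reproduces the input exactly, which is the property the component-wise forecasts rely on when they are added back together.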

Long Short-Term Memory Network (LSTM).
Compared with traditional neural networks, the recurrent neural network (RNN) can capture the temporal regularity of sequence data more effectively. LSTM is a variant of the traditional RNN. It has strong memory ability when dealing with time series and is widely used in time-series prediction scenarios with long intervals and delays. The basic structure of the LSTM unit is shown in Figure 2.
LSTM adds memory modules to the neural network that let information pass through selectively by means of gates, with each gate having a different function.
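A minimal sketch of one LSTM time step, written with NumPy, may help make the gate mechanism concrete. It assumes the common peephole formulation, in which the gates also read the cell state, as the cell-state weight names of the forget, input, and output gates used in this section suggest. The functions `lstm_step` and `init_params`, the parameter dictionary, and the bias name `b_c` are hypothetical names chosen for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_dim, hidden_dim, seed=0):
    """Small random weights for the four blocks: forget, input, candidate, output."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in ("f", "i", "c", "o"):
        p[f"w_{g}x"] = rng.normal(0.0, 0.1, (hidden_dim, input_dim))
        p[f"w_{g}h"] = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim))
        p[f"b_{g}"] = np.zeros(hidden_dim)
        if g != "c":
            # Peephole weights act element-wise on the cell state (assumption).
            p[f"w_{g}c"] = rng.normal(0.0, 0.1, hidden_dim)
    return p

def lstm_step(x_t, h_prev, c_prev, p):
    """One time step: returns the new output h_t and cell state c_t."""
    # Forget gate: how much of the previous cell state is kept.
    f_t = sigmoid(p["w_fx"] @ x_t + p["w_fh"] @ h_prev + p["w_fc"] * c_prev + p["b_f"])
    # Input gate: how much of the new candidate value is admitted.
    i_t = sigmoid(p["w_ix"] @ x_t + p["w_ih"] @ h_prev + p["w_ic"] * c_prev + p["b_i"])
    # Candidate value computed from the current input and the previous output.
    v_t = np.tanh(p["w_cx"] @ x_t + p["w_ch"] @ h_prev + p["b_c"])
    # Cell state update.
    c_t = f_t * c_prev + i_t * v_t
    # Output gate reads the updated cell state.
    o_t = sigmoid(p["w_ox"] @ x_t + p["w_oh"] @ h_prev + p["w_oc"] * c_t + p["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t
```

Iterating `lstm_step` over the inputs of a sample while carrying `h_t` and `c_t` forward yields the sequence representation from which a forecast would be read; deep learning frameworks provide trained, optimized versions of the same computation.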
The forget gate $f_t$ decides which information to discard by integrating the information from the previous moment with the input at the present moment. The formula is as follows:

$$f_t = \sigma\left(w_{fx} x_t + w_{fh} h_{t-1} + w_{fc} C_{t-1} + b_f\right)$$

Journal of Advanced Transportation
where $\sigma(\cdot)$ is the sigmoid activation function, $x_t$ is the input at time $t$, $h_t$ is the output at time $t$, $h_{t-1}$ is the output at time $t-1$, $C_t$ denotes the candidate vectors at time $t$, $w_{fx}$, $w_{fh}$, $w_{fc}$ are the weight coefficients of the forget gate, and $b_f$ is the bias of the forget gate.
An input gate $i_t$ is the opposite of a forget gate $f_t$. It decides which information enters the cell state based on the input threshold. The output of the input gate at time $t$ is as follows:

$$i_t = \sigma\left(w_{ix} x_t + w_{ih} h_{t-1} + w_{ic} C_{t-1} + b_i\right)$$

$$v_t = \tanh\left(w_{cx} x_t + w_{ch} h_{t-1} + b_c\right)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot v_t$$

where $w_{ix}$, $w_{ih}$, $w_{ic}$ are the weight coefficients of the input gate, $b_i$ is the bias of the input gate, $w_{cx}$, $w_{ch}$ are the weight coefficients of the candidate vectors, $b_c$ is the bias of the candidate vectors, $\tanh(\cdot)$ is the hyperbolic tangent activation function, and $v_t$ is the updated value of the candidate vectors.

The output gate $o_t$ determines what information will be output. No information can pass through the output gate except what is required. So the output at time $t$ is as follows:

$$o_t = \sigma\left(w_{ox} x_t + w_{oh} h_{t-1} + w_{oc} C_t + b_o\right)$$

$$h_t = o_t \odot \tanh\left(C_t\right)$$

where $o_t$ is the output gate, $w_{ox}$, $w_{oh}$, $w_{oc}$ are the weight coefficients of the output gate, and $b_o$ is the bias of the output gate.

STPF Prediction Model.
Figure 3 shows the process of the SSA-LSTM model based on the SDS strategy. The modeling steps are as follows:
Step 1. Original data processing. The preprocessed time series of the original rail transit passenger flow is divided into a training set and a test set.
Step 2. Decomposition of original passenger flow data. SDS is performed on the training set based on the SSA method until the ...
Step 3. Establish prediction model. An LSTM prediction model based on deep learning is established for each subsequence of the training set.
Step 4. Perform SSA stepwise decomposition on the test set, and input the resulting components into the LSTM models to generate the prediction results of each component.
Step 5. Accumulate the predicted values of each component to obtain the STPF forecast results.
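Steps 1 to 5 can be sketched end to end as follows. To keep the example short, a least-squares linear autoregression stands in for the LSTM networks, so the code illustrates only the SDS sampling and the accumulation of component forecasts, not the authors' model. The way training targets are formed (the last component values obtained by decomposing the series up to and including the target day), the window length, the lag, the grouping rule, and all function names are assumptions.

```python
import numpy as np

def ssa_split(x, window=7, threshold=0.8):
    """Compact SSA: returns (trend, fluctuation) with trend + fluctuation == x."""
    x = np.asarray(x, dtype=float)
    K = x.size - window + 1
    Y = np.column_stack([x[j:j + window] for j in range(K)])
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    share = np.cumsum(s**2) / np.sum(s**2)
    r = min(int(np.searchsorted(share, threshold)) + 1, s.size)
    Yr = (U[:, :r] * s[:r]) @ Vt[:r]
    # Antidiagonal means of Yr, read as diagonals of the row-flipped matrix.
    trend = np.array([Yr[::-1].diagonal(k).mean() for k in range(-window + 1, K)])
    return trend, x - trend

def sds_sample(x, t, lag):
    """Inputs for predicting x[t], built only from x[:t] (no future data)."""
    return [c[-lag:] for c in ssa_split(x[:t])]

def sds_target(x, t):
    """Component values at time t from decomposing x[:t + 1] (training only)."""
    return [c[-1] for c in ssa_split(x[:t + 1])]

def fit_linear(X, y):
    """Stand-in for LSTM training: least-squares fit with an intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, row):
    return float(np.dot(coef[:-1], row) + coef[-1])

def run_pipeline(x, n_train, lag=7, start=30):
    x = np.asarray(x, dtype=float)
    # Steps 1-2: stepwise decomposition of the training part x[:n_train].
    times = range(start, n_train)
    inputs = [sds_sample(x, t, lag) for t in times]
    targets = [sds_target(x, t) for t in times]
    # Step 3: one model per component (trend, fluctuation).
    models = []
    for c in range(2):
        Xc = np.array([sample[c] for sample in inputs])
        yc = np.array([target[c] for target in targets])
        models.append(fit_linear(Xc, yc))
    # Steps 4-5: decompose the history before each test day, predict each
    # component, and sum the component forecasts.
    preds = []
    for t in range(n_train, x.size):
        sample = sds_sample(x, t, lag)
        preds.append(sum(predict_linear(m, s) for m, s in zip(models, sample)))
    return np.array(preds)
```

Because every sample for day `t` is built from `x[:t]` alone, changing any value at or after `t` cannot alter the inputs used to predict that day, which is the property that distinguishes SDS from ODS. The price is one decomposition per sample instead of one for the whole series.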

Data Analysis.
The original data consists of the daily passenger flow of Xi'an Metro Line 3 from January 1, 2017, to January 24, 2020, and is used to evaluate the established model. The sampling interval is one day, and a total of 1119 sample points are obtained.
Passenger flow data is correlated in time. The closer two observations are in time, the higher the correlation between their passenger flows; as the time gap increases, the correlation weakens. In addition, the passenger flow data shows an obvious trend in the short term and periodicity in the long term. To analyze the temporal characteristics of the passenger flow data, some of the data is selected for visual analysis, as shown in Figure 4.
The weekly trend changes in Figure 4 are very similar, which shows that the passenger flow data has prominent periodicity. Therefore, in this study, the passenger flow data is grouped into two samples, from Monday to Thursday (data 1) and from Friday to Sunday (data 2), and the two datasets are predicted separately to verify the model.
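The grouping by day of the week can be illustrated with the standard library alone. The sketch below only enumerates the calendar days of the study period and splits them into the two groups; the function name `split_by_weekday` is hypothetical, and in practice each date would carry its observed passenger flow.

```python
from datetime import date, timedelta

START, END = date(2017, 1, 1), date(2020, 1, 24)

def split_by_weekday(days):
    """Return (data1, data2): Monday-Thursday dates and Friday-Sunday dates."""
    data1 = [d for d in days if d.weekday() <= 3]  # Monday=0 ... Thursday=3
    data2 = [d for d in days if d.weekday() >= 4]  # Friday=4 ... Sunday=6
    return data1, data2

days = [START + timedelta(days=k) for k in range((END - START).days + 1)]
data1, data2 = split_by_weekday(days)
print(len(days), len(data1), len(data2))  # prints: 1119 640 479
```

The total of 1119 days agrees with the number of sample points reported above, split into 640 Monday-to-Thursday days and 479 Friday-to-Sunday days.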

Passenger Flow Decomposition
4.2.1. SSA Decomposition. The SSA is applied to the historical time-series for SDS. Figures 5 and 6 show the decomposition results of training samples and test samples of the two datasets, respectively.

Data Normalization.
To improve the data processing efficiency of the subsequent prediction model, it is necessary to normalize each subsequence. In this paper, MinMax is selected for normalization.
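A minimal sketch of min-max normalization for one subsequence is given below. It assumes that the minimum and maximum are taken from the training part only and then reused for the test part and for mapping predictions back to passenger counts; the text does not state this, and the function names are hypothetical.

```python
def fit_min_max(train_values):
    """Return the (minimum, maximum) of the training subsequence."""
    return min(train_values), max(train_values)

def scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    span = hi - lo
    if span == 0:
        # A constant training subsequence carries no scale information.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

def unscale(scaled, lo, hi):
    """Invert scale(): map normalized values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]
```

With statistics fitted on the training set, test values may fall slightly outside the interval from 0 to 1, which is expected and harmless; predictions are passed through `unscale` before the components are summed.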

STPF Forecast
4.3.1. Model Parameter Setting. In this paper, the daily passenger flow data of Xi'an Metro Line 3 from January 1, 2017, to January 24, 2020, is used to predict the short-term passenger flow of the following period. After repeated tests, the parameters of the SDS-SSA-LSTM model are set as shown in Table 1.

Performance Comparison.
In this paper, three single models and four hybrid deep learning models commonly used in the existing literature are selected to compare prediction accuracy. The effectiveness and accuracy of the SDS-SSA-LSTM model are evaluated with the mean absolute error (MAE), mean square error (MSE), and root mean square error (RMSE). Table 2 shows the error comparison results of the prediction models.
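For reference, the three error measures follow their standard definitions and can be computed as in the sketch below; the function names and the sample values in the usage note are illustrative and are not taken from Table 2.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean square error."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the data."""
    return math.sqrt(mse(y_true, y_pred))
```

For example, with observed values `[3, 5, 7]` and predictions `[2, 5, 9]`, the absolute errors are 1, 0, and 2, so the MAE is 1 and the MSE is 5/3. Because MSE and RMSE square the errors, they penalize large deviations more heavily than MAE.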
According to the error results in Table 2, it can be seen that:
(1) Among the classic single models, the LSTM model shows better prediction performance on both datasets, with the lowest RMSE and MSE. The traditional statistical method, in contrast, has low prediction accuracy for passenger flow data with complex influencing factors.
(2) In terms of predicting STPF, the hybrid models with decomposition and fusion strategies perform better than the single models. To some extent, the prediction accuracy is improved.
(3) The prediction performance of the DWT-LSTM, EMD-LSTM, and VMD-LSTM models using the ODS strategy is lower than that of the SSA-LSTM model based on the SDS strategy.
(4) The ODS strategy assumes that future data is known from the beginning, but in practical applications the future data needs to be predicted. Therefore, the prediction results obtained by a model based on the ODS strategy cannot reflect the actual online prediction performance. The model based on the SDS strategy completely avoids the leakage of future data, and its prediction results can guide online prediction.

Conclusion
This research proposes a prediction method based on SDS-SSA-LSTM to predict the passenger flow sequence of rail transit more precisely. The SDS strategy prevents future data from entering the prediction model in the testing phase and thus solves the data leakage problem of the traditional ODS strategy. Using the SSA method, the passenger flow sequence with mixed features is decomposed into a trend sequence and a fluctuation sequence with relatively simple features, which are easier for the model to learn. The case study verifies that the proposed model performs well in STPF prediction, and the method shows good application prospects. In the future, developing a visual STPF forecasting platform is the next research focus, and prediction models based on decomposition need to be combined with the SDS strategy. In addition, the cumulative error caused by the increase in the number of models is a challenge for STPF prediction.

Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.