Research on Railway Passenger Volume Forecast Based on Spline Interpolation and the IPSO-Redifference Acceleration Rule



Introduction
As an important basis for preparing macro development strategies and passenger transportation plans, railroad passenger volume forecasting is of great significance for formulating railroad network planning, designing train operation plans, optimizing the passenger transportation product structure, and improving the passenger transportation service level. Existing research on railroad passenger traffic forecasting falls broadly into two categories. The first introduces corresponding methodologies to improve a forecasting model according to its own shortcomings, in order to improve prediction accuracy. Dempster and Bui et al. studied the neural network model and showed that its predictions are more accurate than those of other models [1, 2]. Aihara et al. successfully combined artificial neural networks with chaos theory to develop a freight volume forecasting model [3]. Li et al. predicted railway passenger transport volume based on the Grey-Markov chain model [4]. Qiu et al. forecasted China's railway freight volume based on the combined PSO-LSTM model [5].
The second category comprises combined prediction models formed by merging the characteristics of several models. Wang combined the Grey model and the neural network to obtain a Grey neural network model and successfully predicted highway passenger flow [6]. Ye et al. combined the Markov model with the Grey GM(1,1) model to establish a Grey-Markov forecasting model for freight volume [7]. Ge et al. studied an ARIMA model and combined it with FSVR to propose a hybrid prediction method for high-speed railway passenger traffic forecasting [8].
When social emergencies such as epidemics and major social events occur, scholars choose models with strong adaptability to passenger flow fluctuations to make the relevant predictions. Jiao proposed an improved STL-LSTM model to improve the accuracy of bus passenger flow prediction during COVID-19 [9]. Wang et al. showed that the SARIMA-NAR combined model can be used during the epidemic [10].
In summary, when the training data contain sudden fluctuations caused by unexpected social events, scholars select models with high adaptability to predict passenger flow fluctuations. However, the impact of sudden increases or decreases in passenger volume caused by epidemics, major social activities, and other unexpected social events on the trend and pattern of passenger volume has not been effectively addressed, and it consequently degrades prediction accuracy [11]. Sudden events such as the 2003 SARS epidemic in Beijing, the 2020 COVID-19 pandemic, and major holidays all cause sudden fluctuations in passenger volume [12]. When such data are incorporated for volume prediction, the fluctuations in passenger volume over time may distort the model's training parameters, resulting in higher prediction errors [13]. In addition, most existing forecasting methods either improve a forecasting model according to the characteristics of the research object or apply multiple forecasting models in a weighted combination. The accuracy of a weighted combination of multiple models depends too heavily on the prediction performance of the selected models and carries a degree of uncertainty: in particular, when one model predicts well while another predicts poorly, the combined prediction accuracy can end up worse [14].
Spline interpolation is a piecewise low-order polynomial interpolation method in which the interpolation conditions and the smoothness of the curve are achieved by adjusting the coefficients of the interpolation basis functions on each interval; it has high accuracy and stability, and the fitted curve is smooth and not prone to oscillation [15]. Spline interpolation can effectively uncover the pattern of smooth change of railroad passenger volume over a time series [16]. Therefore, to address the problem that irregular changes in railroad passenger volume under social emergencies affect the accuracy of passenger volume prediction, this paper introduces spline interpolation to correct, during preprocessing, the abnormal passenger volume data caused by major events such as the COVID-19 epidemic, thereby eliminating the interference of abnormal data with model prediction accuracy, and applies the Holt exponential smoothing method and the BP neural network for railroad passenger volume forecasting [17]. To address the shortcomings of existing combined forecasting methods, the redifference acceleration rule is introduced to correct the combined forecast results, and the redifference metric parameter is optimized by an improved particle swarm algorithm. Finally, combining the spline interpolation method and the redifference acceleration rule with Holt exponential smoothing and BP neural network forecasting yields a railroad passenger traffic forecasting method that can adapt to irregular fluctuations of passenger traffic under unexpected events and improve the accuracy of combined forecasting.

Cubic Spline Interpolation
Cubic spline interpolation (spline interpolation for short) is an important method used in numerical analysis for function estimation and numerical fitting, which can uncover the variation pattern of a data series and estimate the function values at certain interpolation nodes in between [18]. The basic idea of applying the spline interpolation method to correct abnormal railroad passenger volume data caused by holidays or major events is as follows: assume that the variation law of the railroad passenger volume data series of the study area over past years follows a cubic polynomial function S(x); after excluding the abnormal data due to holidays or major events, denote the anomalous interpolation node (i.e., time node) as x_c. S(x) is then a cubic polynomial on each small interval of the spline. If a function value y_j is given at node x_j, the interpolation condition is satisfied [19].

S(x_j) = y_j, j = 0, 1, ..., n. (1)

Then S(x) is called the cubic spline interpolation function.
On each small interval [x_j, x_{j+1}], the cubic spline interpolation function can be written as

S(x) = a_j + b_j(x − x_j) + c_j(x − x_j)^2 + d_j(x − x_j)^3, (2)

where a, b, c, and d are the parameters to be estimated.
To find S(x), it is necessary to determine the 4 parameters to be estimated on each small interval [x_j, x_{j+1}]. Since S(x) has a continuous second derivative on the interval (a, b), the continuity condition is satisfied at each interior node x_j:

S(x_j−) = S(x_j+), S′(x_j−) = S′(x_j+), S″(x_j−) = S″(x_j+). (3)
The natural boundary condition of S(x) at the interpolation nodes x_0 and x_n is

S″(x_0) = S″(x_n) = 0. (4)

By combining the interpolation condition, the continuity condition, and the natural boundary condition, S(x) can be solved. The corrected value of the abnormal data due to holidays or major events is then S(x_c).
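The correction step above can be sketched in code. The following is a minimal, self-contained natural cubic spline solver (a sketch only; the paper's case study uses MATLAB's `spline` function, and the data below are made-up, not the Beijing series):

```python
from bisect import bisect_right

def natural_cubic_spline(xs, ys):
    """Build S(x) with S(xs[j]) == ys[j] and natural boundary S''(x0) = S''(xn) = 0."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Tridiagonal system for the interior second derivatives M[1..n-1].
    a = [h[i - 1] for i in range(1, n)]                 # sub-diagonal
    b = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]  # main diagonal
    c = [h[i] for i in range(1, n)]                     # super-diagonal
    d = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
         for i in range(1, n)]
    for i in range(1, n - 1):                           # Thomas forward sweep
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    M = [0.0] * (n + 1)                                 # natural boundary: M[0] = M[n] = 0
    for i in range(n - 2, -1, -1):                      # back substitution
        M[i + 1] = (d[i] - c[i] * M[i + 2]) / b[i]

    def S(x):
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1) # locate containing interval
        t0, t1 = xs[i + 1] - x, x - xs[i]
        return ((M[i] * t0 ** 3 + M[i + 1] * t1 ** 3) / (6 * h[i])
                + (ys[i] / h[i] - M[i] * h[i] / 6) * t0
                + (ys[i + 1] / h[i] - M[i + 1] * h[i] / 6) * t1)
    return S

# Correct an anomalous year: fit the spline with the anomaly excluded, then read S(x_c).
years  = [2000, 2001, 2002, 2004, 2005, 2006]   # 2003 excluded as the anomaly
volume = [40.0, 43.1, 46.0, 52.5, 55.9, 59.2]   # made-up passenger volumes (millions)
corrected_2003 = natural_cubic_spline(years, volume)(2003)
```

The corrected value S(x_c) then replaces the anomalous observation in the training series.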

IPSO-Redifference Acceleration Rule
3.1. Redifference Acceleration Rule. The redifference acceleration rule is an approximation acceleration technique proposed by Liu Hui, a mathematician of the Wei-Jin period of China, to obtain values of higher precision. Its basic principle is as follows: in a monotonically bounded approximation series {a_n}, the terms a_1, a_2, ..., a_n gradually approach the limiting value a*. Assuming the terms of the approximating series all satisfy the redifference metric δ, i.e., the ratio of successive deviations of the series is a fixed value, the series can be accelerated toward the limiting value according to the following formula:

ã_n = a_{n+2} + (a_{n+2} − a_{n+1}) / (δ − 1), (5)

where ã_n is the improved value of a_n and δ is the redifference metric (δ > 1).
The redifference metric is calculated as

δ = (a_{n+1} − a_n) / (a_{n+2} − a_{n+1}). (6)

According to the redifference acceleration rule, a_{n+2} is the series approximation of the highest accuracy, with a_{n+1} and a_n of the next highest accuracy. Applying this idea to the correction of railroad passenger volume forecasts gives the following redifference acceleration correction formula:

F̃ = F_b + (F_b − F_a) / (δ − 1), (7)

where F̃ is the improved value of the railroad passenger volume forecast, F_b is the forecast value of model b, which has the higher forecast accuracy, and F_a is the forecast value of model a, which has a lower forecast accuracy than model b.
According to (7), implementing the acceleration rule requires selecting "one main and one auxiliary" forecasting method for railroad passenger traffic forecasting: the "main" method is the one with better forecasting accuracy, whose result is F_b, and the "auxiliary" method is the one with poorer forecasting accuracy, whose result is F_a. Optimization by the redifference acceleration rule can make the corrected forecast better than the result of the "main" model. This design maximally promotes the more accurate prediction and suppresses the less accurate one, so as to obtain a prediction of higher accuracy.
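As a concrete illustration of the correction in equation (7), the rule is a one-line function (a sketch with illustrative numbers, assuming the classical acceleration form; these are not the paper's data):

```python
def redifference_correct(F_b, F_a, delta):
    """Redifference acceleration: push the better forecast F_b further away
    from the poorer forecast F_a, scaled by the metric delta (> 1)."""
    assert delta > 1.0
    return F_b + (F_b - F_a) / (delta - 1.0)

# Example: the better model predicts 100, the poorer one 92; with delta = 2
# the corrected value moves past the better forecast.
improved = redifference_correct(100.0, 92.0, 2.0)
```

Note that the correction always lies beyond F_b on the side away from F_a, which is exactly the "promote the better, suppress the worse" behavior described above.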
The redifference metric can be calculated according to (6), but this method has two drawbacks. First, yet another forecasting model of better or worse accuracy is needed to forecast the railroad passenger traffic, which not only increases the difficulty and complexity of the forecasting work but also cannot guarantee the forecasting accuracy of the selected model. Second, the redifference metric calculated according to (6) is only an estimate and is not the optimal parameter value for combined forecasting. Therefore, this paper proposes an improved particle swarm optimization algorithm (IPSO) to solve for the optimal redifference metric.

Basic Principle of the Particle Swarm Algorithm (PSO).
PSO is a swarm intelligence search algorithm developed by imitating the foraging behavior of a flock of birds. Its basic idea is to find the final foraging location, i.e., the position of the optimal solution, by sharing the search information of each forager in the flock. In practical applications, each particle flies continuously in the given space and repeatedly adjusts its position according to its own search experience while seeking the optimal target location, until the search termination condition is satisfied and the optimal solution is found [20].
Suppose the position and velocity of the ith particle of a population at the tth iteration are x_{i,t} and v_{i,t}, respectively; the particle then updates its position and velocity by tracking the individual extremum and the population extremum so as to further approximate the optimal solution. The particle velocity and position updates are calculated as follows:

v_{i,t+1} = w·v_{i,t} + c_1·rand·(pbest_{i,t} − x_{i,t}) + c_2·rand·(gbest_t − x_{i,t}), (8)
x_{i,t+1} = x_{i,t} + λ·v_{i,t+1}, (9)

where w is the inertia weight, taken from 0.4 to 0.9; c_1 and c_2 are the individual learning factor and the group learning factor, respectively; rand is a random number generated between 0 and 1; pbest is the individual extremum; gbest is the group extremum; and λ is the speed coefficient, generally taken as 1 [21].
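The velocity and position updates above can be sketched for a single one-dimensional particle as follows (illustrative only):

```python
import random

def pso_step(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0, lam=1.0, rng=random):
    """One PSO velocity/position update for a scalar particle."""
    v_new = (w * v
             + c1 * rng.random() * (pbest - x)    # pull toward the individual best
             + c2 * rng.random() * (gbest - x))   # pull toward the group best
    x_new = x + lam * v_new
    return x_new, v_new
```

When the particle already sits at both extrema, the random attraction terms vanish and only the inertia term w·v remains, which is a convenient sanity check.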

Improvement of PSO.
Journal of Advanced Transportation

The basic PSO generally adopts fixed weights to seek the optimal solution, i.e., some fixed value in 0.4~0.9, and the learning factors are generally taken as c_1 = c_2 = 2. Depending on the data and simulation environment, these values affect the optimization ability and convergence speed of the particle swarm algorithm. Therefore, this paper applies the idea of nonlinear variation to optimize the weight and learning factors of PSO and thereby improve the optimization ability of the algorithm. The principle of weight optimization is as follows: at the initial iterations, a larger weight is set to ensure a higher particle search speed and better global search capability; as the number of iterations increases, the weight is reduced to slow the particle search speed and ensure better local search capability. The weight w is therefore decreased nonlinearly from w_max to w_min as a power function of the iteration ratio iter/ger, where w_max is the maximum weight, w_min is the minimum weight, ger is the maximum number of iterations, and iter is the current number of iterations, 0 < iter ≤ ger, iter ∈ N*. The learning factor optimization principle is that the optimal value of c_1 ranges from 2.5 down to 0.5 and the optimal value of c_2 from 0.5 up to 2.5. Within these ranges, to ensure global search ability at the beginning of the iterations, c_1 decreases nonlinearly as the number of iterations increases, so that the individual learning ability of each particle is larger at the start; at the same time, c_2 increases gradually with the iterations to strengthen group learning in the late iterations and prevent the algorithm from falling into a local optimum. Based on this idea, this paper uses a nonlinear change function constructed from a power function to adjust the learning factors.
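The exact power-function expressions did not survive extraction, so the following sketch is an assumption: a quadratic schedule that satisfies only what the text states, namely the endpoints (w: 0.9 → 0.4, c_1: 2.5 → 0.5, c_2: 0.5 → 2.5) and the decreasing/increasing directions:

```python
def ipso_schedules(iter_, ger, w_max=0.9, w_min=0.4):
    """Nonlinearly varying inertia weight and learning factors.
    The quadratic power is an assumed form; only the endpoints and the
    monotone directions come from the text."""
    s = (iter_ / ger) ** 2             # nonlinear (power-function) progress, 0 -> 1
    w = w_max - (w_max - w_min) * s    # inertia weight: 0.9 -> 0.4
    c1 = 2.5 - 2.0 * s                 # individual learning factor: 2.5 -> 0.5
    c2 = 0.5 + 2.0 * s                 # group learning factor:      0.5 -> 2.5
    return w, c1, c2
```

Early in the run s is close to 0, so the swarm searches globally with high inertia and strong individual learning; late in the run s approaches 1, shifting the emphasis to local refinement and group learning.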

Algorithm Flow for IPSO Optimization of the Redifference Metric.
The specific algorithm for IPSO optimization of the redifference metric value is shown in Figure 1.
Step 1. Forecast the railroad passenger volume according to the selected forecasting method.
Step 2. Use the redifference acceleration rule to process the predicted values of the railroad passenger traffic; at this point, the improved predicted value is a function expression containing the redifference metric δ.
Step 3. Initialize the population particle parameters. According to the constraints, such as the range of the particle parameters, set the initial parameter values.
Step 4. Calculate the particle fitness. With optimal prediction accuracy as the goal, the fitness function f(δ) for redifference metric optimization is defined as the mean absolute relative prediction error over all forecast years:

f(δ) = (1/m) Σ_{t=1}^{m} |F̃_t(δ) − X_t| / X_t, (10)

where F̃_t(δ) is the forecast value improved by the redifference acceleration rule in period t, m is the total number of forecast years, and X_t is the actual passenger traffic in period t.
Step 5. Compare the particle fitness values across iterations to find the current optimal particle fitness value and its position, and update the particle search position and search speed accordingly.
Step 6. Determine whether the iteration termination condition of the particle swarm algorithm is satisfied. If satisfied, the algorithm ends; if not, return to Step 4.
Step 7. Output the optimal solution.
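Steps 1-7 can be condensed into a runnable sketch. Everything here is illustrative: the fitness follows the mean-relative-error definition above, the correction follows equation (7), the nonlinear schedules are an assumed quadratic form, and the forecast/actual numbers are made up (the actuals below are generated with δ = 2, so a good optimizer should recover a metric near 2):

```python
import random

def fitness(delta, F_b, F_a, actual):
    """Mean absolute relative error of the redifference-corrected forecasts."""
    if delta <= 1.0:                       # the metric must satisfy delta > 1
        return float("inf")
    errs = [abs(fb + (fb - fa) / (delta - 1.0) - x) / x
            for fb, fa, x in zip(F_b, F_a, actual)]
    return sum(errs) / len(errs)

def ipso_optimize_delta(F_b, F_a, actual, n=30, ger=300, seed=0):
    rng = random.Random(seed)
    w_max, w_min = 0.9, 0.4
    pos = [rng.uniform(1.01, 10.0) for _ in range(n)]   # Step 3: initialize swarm
    vel = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    pbest = pos[:]
    pfit = [fitness(p, F_b, F_a, actual) for p in pos]  # Step 4: evaluate fitness
    gfit = min(pfit)
    gbest = pbest[pfit.index(gfit)]
    for it in range(1, ger + 1):
        s = (it / ger) ** 2                             # assumed power schedule
        w = w_max - (w_max - w_min) * s
        c1, c2 = 2.5 - 2.0 * s, 0.5 + 2.0 * s
        for i in range(n):                              # Step 5: update particles
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (pbest[i] - pos[i])
                      + c2 * rng.random() * (gbest - pos[i]))
            vel[i] = max(-0.5, min(0.5, vel[i]))        # speed search interval
            pos[i] += vel[i]
            f = fitness(pos[i], F_b, F_a, actual)
            if f < pfit[i]:
                pbest[i], pfit[i] = pos[i], f
                if f < gfit:
                    gbest, gfit = pos[i], f
    return gbest, gfit                                  # Step 7: optimal solution

# Synthetic check: actuals generated with delta = 2, so delta* should be near 2.
F_b = [100.0, 105.0, 110.0]                # "main" (better) model forecasts
F_a = [92.0, 96.0, 101.0]                  # "auxiliary" (poorer) model forecasts
actual = [fb + (fb - fa) for fb, fa in zip(F_b, F_a)]   # correction at delta = 2
best_delta, best_fit = ipso_optimize_delta(F_b, F_a, actual)
```

Because δ enters the fitness only through the corrected forecasts, the problem is one-dimensional and the swarm converges quickly; the termination condition here is simply the maximum iteration count.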

Forecasting Method Selection
Railroad passenger volume forecasting methods can be divided into linear and nonlinear forecasting models [22]. Among the linear forecasting models, time series models are represented by ARIMA and the exponential smoothing method, which have the advantage of not needing to study the influence of independent variables (influencing factors) on the forecast results. Among the nonlinear forecasting models, the BP neural network adapts well to nonlinear demand forecasting problems by virtue of its powerful self-adaptability, self-learning, and fault tolerance. Therefore, this paper chooses the classical Holt exponential smoothing method and the BP neural network as the representative linear and nonlinear methods for railroad passenger traffic forecasting.

Holt Exponential Smoothing Method.
The Holt exponential smoothing method extends simple exponential smoothing by forecasting the original time series from both a smoothed value and a trend value, which strengthens the prediction ability of exponential smoothing for trend data [23]. Its prediction formulas are

S_t = αX_t + (1 − α)(S_{t−1} + b_{t−1}), (11)
b_t = γ(S_t − S_{t−1}) + (1 − γ)b_{t−1}, (12)
F_{t+m} = S_t + m·b_t, (13)

where F_{t+m} is the predicted value in period t + m, S_t is the smoothed value in period t, X_t is the actual value in period t, b_t is the trend value in period t, m is the forecast horizon, and α and γ are the exponential smoothing parameters [24].
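The Holt recursion above can be sketched as follows (a minimal version with one common initialization choice; the smoothing parameters and the series are illustrative):

```python
def holt_forecast(series, alpha, gamma, m):
    """Holt two-parameter exponential smoothing: smooth the level s and trend b,
    then extrapolate m periods ahead as s + m * b."""
    s = series[0]                  # initial level (one common convention)
    b = series[1] - series[0]      # initial trend
    for x in series[1:]:
        s_prev = s
        s = alpha * x + (1.0 - alpha) * (s + b)
        b = gamma * (s - s_prev) + (1.0 - gamma) * b
    return s + m * b

# On a perfectly linear series the method reproduces the trend exactly.
forecast = holt_forecast([10.0, 12.0, 14.0, 16.0, 18.0], alpha=0.5, gamma=0.5, m=1)
```

In practice α and γ would be chosen by fitting software (the paper uses SPSS); the linear-series case is simply a convenient correctness check.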

BP Neural Network.
The BP neural network is a multilayer feedforward neural network trained by the error backpropagation algorithm and is widely used [25]. The topology of the BP neural network is shown in Figure 2; its basic idea is the gradient descent method, and the learning process consists of forward propagation and backpropagation. In forward propagation, the railroad passenger prediction factors are passed from the input layer to the hidden layer as input information and finally to the output layer, which outputs the corresponding railroad passenger volume and training error information [26]. If the output prediction error is larger than the training target, the error is backpropagated from the output layer, and the weights w_ij and w_jk are repeatedly trained and adjusted until the error is reduced to an acceptable level or the maximum number of learning iterations is reached. At that point, the sample data are input again to obtain the output value with the minimum error [27].
There are many types of activation functions for BP neural network prediction, each with its own advantages and disadvantages. In this paper, we choose the commonly used tansig and purelin functions as the activation functions of the hidden layer neurons and output layer neurons, respectively, whose expressions are [28]

f(x) = 2 / (1 + e^{−2x}) − 1 (tansig), (14)
f(x) = x (purelin). (15)
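The two transfer functions (standard in MATLAB's neural network toolbox) are easy to state directly; note that tansig(x) is mathematically identical to tanh(x):

```python
import math

def tansig(x):
    """Hyperbolic tangent sigmoid: 2 / (1 + exp(-2x)) - 1, with range (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

def purelin(x):
    """Linear transfer function, typically used on the output layer."""
    return x
```

The bounded, smooth tansig gives the hidden layer its nonlinearity, while the linear purelin lets the output layer produce unconstrained passenger-volume values.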

Railway Passenger Volume Forecast Based on Spline Interpolation and the IPSO-Redifference Acceleration Rule
The combined railroad passenger volume forecasting process, which integrates the spline interpolation method and the redifference acceleration rule with Holt exponential smoothing and BP neural network forecasting, is shown in Figure 3.
Step 1: the spline interpolation method is used to make data replacement corrections for the abnormal railroad passenger volume caused by holidays or major events, and the corrected railroad passenger volume data series is formed.
Step 2: based on the corrected railway passenger volume data series, the BP neural network and the Holt exponential smoothing method are applied to make predictions.
Step 3: the fitness function for IPSO optimization of the redifference metric is constructed based on the prediction results, and the IPSO is applied to solve for the optimal redifference metric.
Step 4: based on the optimal redifference metric, the redifference acceleration rule of equation (7) is applied to improve the prediction results of the BP neural network and Holt exponential smoothing.

Case Study
In 2003, Beijing was affected by the SARS epidemic, and railroad passenger traffic decreased abnormally [29-33]. Therefore, this research takes Beijing railroad passenger traffic from 2000 to 2019 as the research object, considering the influence of regional GDP, regional resident population, per capita consumption level, and number of tourists, and examines the combined prediction effect of the spline interpolation and redifference acceleration method together with Holt exponential smoothing and BP neural network prediction (Table 1).

Data Preprocessing Based on Spline Interpolation.
In the data series of railroad passenger volume in Beijing from 2000 to 2019, the passenger volume of 2003 is excluded, and cubic spline interpolation is performed by calling the spline function in MATLAB; the function image is shown in Figure 4. The interpolation yields a revised value of the 2003 railroad passenger volume of 52.27 million. The preprocessing of Beijing railroad passenger traffic by the spline interpolation method is shown in Figure 5. It can be seen that the overall smoothness of the railroad passenger volume after spline interpolation is better and more consistent with the overall development trend of the passenger volume.

Single Model Prediction Based on Spline Interpolation.
The railroad passenger volume data from 2000 to 2016 are used as training samples and those from 2017 to 2019 as test samples, and the Holt exponential smoothing method and the BP neural network are used for forecasting, respectively. In the research process, the "relative prediction error" is used as the evaluation index of model prediction accuracy; for convenience of presentation, it is abbreviated as "prediction error."

Holt Exponential Smoothing Prediction. The results and prediction errors of Holt exponential smoothing obtained with SPSS are shown in Figures 6 and 7.
It can be seen that Holt exponential smoothing has a good prediction effect. After the data were preprocessed by the spline interpolation method, the mean absolute fitting prediction error for the training samples was 2.053%, and the mean absolute prediction error for the test samples was 2.160%. The prediction effect is better after the spline interpolation method replaces and corrects the abnormal passenger traffic data. When predicting from the original Beijing railroad passenger volume data, the prediction error for 2003 was as high as 16.016%, and the mean absolute prediction errors of the training and test samples were 5.228% and 5.862%, respectively, about 2.6 times the errors obtained after the abnormal data were processed by the spline interpolation method.

BP Neural Network Prediction.
The prediction process of the BP neural network is implemented in MATLAB, with the main parameters set as follows: the number of training epochs is 1000, there is 1 hidden layer, and the number of hidden-layer neurons is set to 1. The prediction performance of the BP neural network is shown in Figures 8 and 9.
Figure 8 shows the variation of error with the number of training iterations for the different samples: blue is the training set, red is the test set, and green is the validation set generated by the system. The network converges after 14 training iterations; the network error of the validation set is 0.00917, and the network errors of the training and test sets are well below 0.00917, giving good error accuracy. It can also be seen from Figure 11 that the spline interpolation method yields a better prediction effect after processing the abnormal passenger traffic data, with the mean absolute prediction errors of the training and test samples reduced by 1.712% and 0.860%, respectively.

Combined Prediction Effect Based on the Redifference Acceleration Rule

6.3.1. Simulation Parameter Setting. The algorithm experiments in this section are conducted in the MATLAB 2019 environment, and the improved particle swarm algorithm is coded to find the optimal redifference metric. The main parameters of the particle swarm algorithm are initialized as follows: the population size is 30, the maximum number of iterations is 300, and the speed search interval is −0.5 to 0.5. To further verify the effect of the proposed nonlinear variation optimization on the optimum-seeking ability and convergence speed of the particle swarm algorithm, the optimum-seeking performance and convergence speed of the improved particle swarm algorithm are compared with those of the basic particle swarm algorithm. The fixed weight of the basic particle swarm algorithm is taken as 0.9, and the learning factors are c_1 = c_2 = 2.

Prediction Results and Analysis
(1) Analysis of the IPSO Algorithm's Advantage. The variation of the fitness values with the number of iterations for the different particle swarm algorithms solving the redifference metric is shown in Figure 12. The experimental results show that the improved particle swarm algorithm with nonlinearly varying weight and learning factors has better optimum-seeking ability and convergence speed: the fitness value of the improved particle swarm algorithm for the optimal redifference metric is 0.00640188456, converging after the 113th iteration, while that of the basic particle swarm algorithm is 0.00640189062, converging after the 234th iteration.
(2) Combined Prediction Effect of the IPSO-Redifference Acceleration Rule. The IPSO-optimized redifference metric is shown in Figure 13; the optimal redifference metric is solved as 1.9088. Section 6.2 shows that the prediction accuracy of Holt exponential smoothing is better than that of the BP neural network. The 2017-2019 prediction results of the BP neural network and Holt exponential smoothing, after processing the abnormal passenger traffic data by the spline interpolation method, are improved by the redifference acceleration rule of equation (7). To verify the improvement effect, the results are compared with those of the average weighted combination approach, i.e.,

F_w = w_1·F_bp + w_2·F_holt, (16)

where F_w is the average weighted combination forecast of the passenger traffic, F_bp is the BP neural network forecast, F_holt is the Holt exponential smoothing forecast, and w_1 and w_2 are the weights, w_1 = w_2 = 0.5. The predicted values of the railroad passenger traffic in Beijing from 2017 to 2019 are shown in Table 2.
The analysis shows that the prediction values improved by the redifference acceleration rule have better prediction accuracy, reflected as follows.
After the redifference acceleration rule improves the prediction results of the BP neural network and the Holt exponential smoothing method, the mean absolute prediction error for 2017-2019 is 0.642%, which is 3.320% and 1.518% lower than those of the BP neural network and the Holt exponential smoothing method, respectively.
Comparing the forecasts of the redifference acceleration rule with the average weighted combination forecasts, the mean absolute forecast error of the railroad passenger traffic for 2017-2019 is reduced by 2.419%. In addition, the mean absolute forecast error of the average weighted combination is 3.061%, which is higher than the 2.160% of the Holt exponential smoothing method. This also indicates that directly weighting and combining the predictions of multiple models carries some uncertainty, i.e., there is no guarantee that the combined prediction is better than that of a single model. The IPSO-redifference acceleration method effectively avoids this problem.
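The uncertainty of direct averaging is easy to see with two illustrative forecasts of an actual value of 100 (made-up numbers, not the Beijing results): averaging an exact forecast with a poor one drags the result away from the truth, whereas the redifference correction of equation (7) pushes past the better forecast instead.

```python
def rel_err(forecast, actual):
    """Relative prediction error used as the accuracy index."""
    return abs(forecast - actual) / actual

actual = 100.0
f_good, f_poor = 100.0, 92.0            # one exact forecast, one poor forecast
combo = 0.5 * f_good + 0.5 * f_poor     # average weighted combination
# The combined error is 4%, worse than the 0% error of the better model alone.
```

This is exactly the failure mode described above: the equal-weight combination cannot beat its better member when the members' accuracies differ widely.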
To further verify the advantage of the IPSO-redifference acceleration rule over the traditional redifference acceleration rule, the estimated value of the redifference metric is calculated according to (6). This method requires introducing another prediction method with worse prediction accuracy than the BP neural network and the Holt exponential smoothing method. After several experiments, the ARIMA (3,0,1) model was selected; its prediction results in the SPSS 25 environment are shown in Figure 14. The traditional redifference acceleration method improves the prediction values as follows: the mean prediction values of the BP neural network, the Holt exponential smoothing method, and the ARIMA (3,0,1) model for 2017-2019 are substituted into (6) to estimate the redifference metric, and the prediction results of the BP neural network and the Holt exponential smoothing method are then improved according to (7).
According to (7), the improvement values of the traditional redifference acceleration rule on the 2017-2019 prediction results of the BP neural network and the Holt exponential smoothing method are calculated as 138.95 million, 144.56 million, and 150.94 million, with absolute prediction errors of 0.157%, 1.282%, and 2.296%.
Comparing these results with Table 2 shows that, after the traditional redifference acceleration rule improves the forecasts of the BP neural network and the Holt exponential smoothing method, the mean absolute forecast error for 2017-2019 is 1.245%. The IPSO-redifference acceleration method improves the prediction values with higher accuracy than the traditional redifference acceleration method, its mean absolute prediction error being 0.602% lower.

Conclusion
To address the problem of abnormal railroad passenger volume data caused by holidays or major events and the uncertainty of weighted combination forecasting, this paper introduces the spline interpolation method to replace and correct abnormal railroad passenger volumes and reduce the interference of abnormal data with forecasting accuracy. In addition, an improved particle swarm algorithm is proposed to optimize the redifference acceleration rule for improving railroad passenger volume forecasts, combined with the BP neural network and the Holt exponential smoothing method to forecast the railroad passenger volume in Beijing. The results show that the spline interpolation method yields a better prediction effect after replacing and correcting the anomalous data, and the improved particle swarm algorithm shows better optimum-seeking ability and convergence speed when solving the optimal redifference metric. Compared with the prediction results of the BP neural network, the Holt exponential smoothing method, the average weighted combination method, and the traditional redifference acceleration rule, the IPSO-redifference acceleration rule improves the prediction results with higher accuracy. It is worth mentioning that there are further interpolation methods in the field of numerical analysis, such as Newton interpolation and Lagrange interpolation; which interpolation method best solves the problem of data anomalies caused by holidays or major events is a focus of future research.

Figure 9 shows the goodness of fit of the BP neural network: the fit goodness of the training set, validation set, test set, and all data is 0.99929, 0.97893, 0.99971, and 0.99777, respectively, which shows that the network training effect is very good. The absolute values of the predicted values and prediction errors of the BP neural network are shown in Figures 10 and 11; Figure 10 demonstrates that the predicted and true values of the BP neural network fit well.

Figure 8: Convergence process of the BP neural network.

Table 1: Basic data related to railroad passenger volume in Beijing.

Table 2: Forecast values of the railroad passenger traffic in Beijing for different forecasting methods.