Discrete Train Speed Profile Optimization for Urban Rail Transit: A Data-Driven Model and Integrated Algorithms Based on Machine Learning

The energy-efficient train speed profile optimization problem in urban rail transit systems has attracted much attention in recent years because of the requirement of reducing operation cost and protecting the environment. Traditional methods on this problem mainly focused on formulating kinematical equations to derive the speed profile and calculate the energy consumption.


Introduction
In recent years, urban rail transit has developed rapidly around the world owing to its high capacity, safety, superior energy performance, and reliable, punctual service [1], and it is becoming increasingly important for the development of large cities [2]. For example, 35 cities in China had urban rail transit in 2017, with a total length of over 4750 km [3]. According to the Web of China Rail Transit, more than 50 cities will be operating urban rail transit in the next few years. In 2020, the total mileage of urban rail transit in China will be 6000 km, making rail systems an important component of urban public transportation. Around the world, more and more cities are orienting travel toward public transportation. As shown in Figure 1 (taken from the Global Cities Public Transit Usage Report by Moovit), urban rail transit has attracted much attention in recent years, especially in large cities, and accounts for a high proportion of public transportation. However, the rapid expansion of urban rail transit networks has led to higher energy consumption. Taking Beijing rail transit as an example, in 2011 the total electricity consumption of Beijing urban rail transit was 750 million kWh, of which 470 million kWh was used for traction, with the proportion as high as 55%, which has attracted tremendous attention in recent years (Yin et al. [4]). In 2015, it reached 1.4 billion kWh, accounting for 40% of the total operating cost of the metro [5], which was equivalent to the annual electricity consumption of 730,000 households (the annual electricity consumption of one household is based on the 2016 Beijing Statistical Yearbook from the Beijing statistical information website). In the European Union (EU), for instance, transport causes approximately 31% of total greenhouse gas (GHG) emissions. Within this sector, metropolitan transportation is responsible for about 25% of the total CO2 emissions (González-Gil et al. [6]). Therefore, energy saving has become an important issue in real train operation, both to reduce operating cost and to meet environmental protection requirements.
To reduce energy consumption in urban rail transit, many models have been developed in recent years, most of which consider train control between two stations based on kinematic equations. In general, there are three types: mathematical optimization models, simulation methods, and data-based multiple linear regression and neural network models. Although much work has been done on optimizing speed profiles, existing methods have some limitations: (1) Mathematical optimization models are theoretically sound. However, the actual situation is often more complex, and the optimization theory may not perform well when real-world factors are taken into consideration. (2) Building a simulation model (e.g., agent-based simulation [7]) is complicated and costly. Further, there is a certain deviation between the simulation results and the actual measurement data. (3) The relationship between traction energy consumption and its influencing factors is not linear, so the precision of the multiple linear regression model is limited. The neural network relies too much on empirical information extracted from historical data. Overfitting is prone to occur, and the generalization ability may be hard to guarantee. Besides, it easily falls into a local optimum. In contrast, data-driven optimization based on machine learning theories could avoid these limitations. Firstly, real-world data, which contain the influences of actual factors, can be utilized well. Secondly, machine learning has been well applied in many fields; it provides a way to learn the existing information in data, acquire new information, and improve performance on a data set. The process of using input data (real-world profiles) to obtain output data (energy consumption) is easier to realize. Thirdly, machine learning is stable. For instance, random forest regression (RFR) and support vector regression (SVR) perform stably on the data set, and they have been widely used in many fields, such as biology, medicine, economics, management, and so on [8]. Therefore, it becomes possible to optimize the train speed profile in the urban rail transit system, on the premise that their effectiveness is verified. The main contributions of this research can be summarized as follows:
(1) A data-driven optimization model (DDOM) is proposed to optimize the speed profile in urban rail transit systems. The traditional speed profile optimization model is easy to analyze theoretically. In this paper, the train speed profile is optimized from the viewpoint of a discrete profile, which can be applied easily in practice.
(2) Based on actual data obtained by experimental measurements, a novel method is proposed that uses machine learning algorithms to calculate the energy consumption of a speed profile, which avoids having to consider longitudinal train dynamics. Besides, the calculation error of the machine learning algorithms (RFR and SVR) on speed profile energy is verified.
(3) To solve the proposed model, an integrated heuristic optimization algorithm based on RFR and SVR is developed. In addition, compared with the real data, the results show an average energy reduction of 2.84%.
The framework of this paper is shown in Figure 2.

Literature Review
In recent years, many studies have focused on the energy-efficiency analysis of train traction. Scheepmaker et al. [23] gave a review from two aspects: (1) optimizing the speed profiles and driving strategies to reduce energy consumption (e.g., Howlett [24,25]; Albrecht et al. [12]; Scheepmaker and Goverde [26]; Yang et al. [18,27]; Tian et al. [28]; Sun et al. [17]; Yang et al. [29]) and (2) optimizing the timetable to make use of regenerative energy with minimum energy consumption (e.g., Chevrier et al. [30]; Li and Lo [19,20]; Wang and Goverde [31]; Wang et al. [32]; Zhao et al. [33]). Some typical publications on energy-efficient research are listed in Table 1. In essence, energy consumption is related to the train traction process, so improving the speed profiles is fundamental work. Over the past 25 years, the challenges in train speed profile optimization have resulted in a variety of analysis frameworks.
(1) Mathematical optimization models. The modern theory of optimal train control was developed during the years 1992-2014 by the Scheduling and Control Group (SCG) at the University of South Australia in a collection of papers. For example, Howlett and Cheng [9] built a discrete control model and confirmed the fundamental optimality of the accelerate-coast-brake strategy for energy-efficient train operation. On the basis of the Pontryagin maximum principle, if no energy is recovered during braking, then it becomes an optimal switching strategy. Wong and Ho [11] showed that a genetic algorithm was more robust in the calculation process. After reformulating the necessary conditions for optimal switching, Howlett et al. [34] proposed a less general model in which the optimal switching points for each steep section can be found by minimizing an intrinsic local energy function. Albrecht et al. [13] used the Pontryagin principle to find necessary conditions on an optimal strategy and showed that a strategy of optimal type uses only a limited set of optimal control modes: Maximum Power, HoldP (Hold using Power), Coast, HoldR (Hold using Regenerative braking), and Maximum Brake. Albrecht et al. [14] developed general bounds on the position of optimal switching points and proved that an optimal strategy always exists. An intrinsic local energy minimization principle for determining the optimal switching points was also established, which shows that the optimal strategy is unique. Huang et al. [35] proposed an integrated approach for the energy-efficient driving strategy and timetable, which was solved by a particle swarm optimization (PSO) algorithm. Yang et al. [36] employed an energy-efficient through the Taylor approximation. They [37] modeled electric trains' energy consumption using neural networks, providing a reliable estimation of the consumption along a specific route when fed with input data such as train speed, acceleration, or track longitudinal slope. Big data analytics (BDA) has increasingly attracted the attention of analysts, researchers, and practitioners in the railway transportation and engineering field [38]. From a data-driven view, this paper mainly focuses on how to obtain the optimal speed profile based on well-developed machine learning algorithms. Few studies have so far sought the optimal speed profile with this kind of method.

Data Analysis and Preprocessing
Data Overview. During the operation of the subway, the most widely used power is electricity. Part of it is consumed by on-board facilities, such as air conditioning and lighting; the rest is used for traction of the metro trains. Our data consist of the running states of urban rail transit trains and the corresponding energy consumption, collected on the Changping Line of Beijing urban rail transit. The Changping Line runs from Xi'erqi station to Changpingxishankou station, with an operating mileage of 31.9 kilometers and a total of 12 stations (as illustrated in Figure 3). In order to accurately capture the actual traction power consumption during operation, we installed sensors and computers on the train. Both the total energy consumption and the energy consumption of the various electrical appliances in the train are recorded. Then, the energy consumed by the electrical appliances is subtracted from the total consumption, and the remainder is the energy consumed by the traction of the subway train. The data cover a running period of 4 months. Every night, two round-trip running tests are made in the up and down directions. The types of recorded data are shown in Table 2.

Data Preprocessing
Symbols
$n$: number of small sections into which the section is discretized.
$v_j^0$: the $j$th speed point of the original profile.
$\nabla t$: the time interval used to record the speed and displacement data during train traction.
Using these recorded data, we can reconstruct the running process of the urban rail transit train. Taking MingTombs-Changpingxishankou in the down direction as an example (shown in Figure 4), the train operation process is divided into three stages. In the first stage, the train accelerates until it approaches the maximum speed limit; in the second stage, its speed fluctuates in the high-speed zone; in the third stage, it decelerates and brakes until it stops. Normally, track conditions differ for construction and geological reasons, so different speed limits apply at different locations in each section of the urban rail transit. In this section, there are three speed-limited subsections, delimited by the start of the section, two intermediate positions, and the end of the section. Each subsection has its own maximum speed limit.
The form of the train running state is shown in Table 3 (m: the number of data records on an original speed profile). A speed profile has three elements: speed, time, and distance. The time interval between records in the table is 0.2 seconds, whereas the running time between two stations varies from about one hundred to several hundred seconds. This means that a speed profile may be made up of thousands of records. We need to calculate the energy consumption from the profile, that is, to find the relationship between energy consumption and these thousands of data records, which is so-called "high-dimensional" data in statistics.
Although machine learning algorithms backed by big data are suitable for dealing with high-dimensional data, extremely high-dimensional situations need large amounts of training data, and good calculation precision is hard to achieve [39]. Therefore, given the limited quantity of data, we choose dimensionality reduction. In this way, the algorithm can be trained effectively while the accuracy of the original high-dimensional data is preserved.
The process of reducing the dimension is as follows:
(1) The section length $S^0$ can be obtained from the records; $S^0$ is then divided into $n$ small sections (uniform segmentation is chosen in this paper). Thus, the $(n+1)$ points are represented by $\{s_0, \ldots, s_i, \ldots, s_n \mid i = 0, 1, \ldots, n\}$. Clearly, $s_0 = 0$ and $s_n = S^0$ (the total section length). Taking MingTombs-Changpingxishankou in the down direction as an example, as shown in Figure 5, uniform intervals of 50 m and 5 m are selected for the discretization. In Figure 5(a), the number of speed profile records drops to 26, giving 26 control points during train traction; in Figure 5(b), the number of speed profile records is 247, and the density of control points is higher.
(2) Find the positions immediately before and after $s_i$ in the original profile, within the $\nabla t$ interval, recorded as $s_i^-$ and $s_i^+$. From the original velocity profile, we get the velocities and times corresponding to $s_i^-$ and $s_i^+$, recorded as $v_i^-$, $v_i^+$, $t_i^-$, and $t_i^+$. In the small section from $s_i^-$ to $s_i^+$, the train is assumed to be uniformly accelerated. As shown in Figure 6, $v_i$ can be obtained from $v_i^-$, $v_i^+$, $t_i^-$, and $t_i^+$. Therefore, we get $\{v_0, \ldots, v_i, \ldots, v_n\}$, where $v_0 = v_n = 0$. Figure 6(a) indicates that the speed profile can be represented by fewer points. Figure 6(b) shows that the error between the simplified profile and the original one is negligible compared with the whole length of the section.
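The two-step reduction above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`uniform_grid`, `speed_at`, `resample_profile`) are hypothetical, and the interpolation assumes uniform acceleration between the two neighbouring records, so that the squared speed varies linearly with distance. The paper itself works from the recorded times as well as the speeds; this sketch uses positions and speeds only.

```python
from bisect import bisect_left
from math import sqrt


def uniform_grid(section_length, interval):
    """Positions 0, interval, 2*interval, ... plus the end of the section."""
    grid = []
    k = 0
    while k * interval < section_length:
        grid.append(k * interval)
        k += 1
    grid.append(section_length)
    return grid


def speed_at(s, positions, speeds):
    """Speed at position s, assuming uniform acceleration between the
    recorded points just before and just after s (v^2 is linear in s)."""
    j = bisect_left(positions, s)  # index of first record at or beyond s
    if j == 0:
        return speeds[0]
    if j == len(positions):
        return speeds[-1]
    s_lo, s_hi = positions[j - 1], positions[j]
    v_lo, v_hi = speeds[j - 1], speeds[j]
    frac = (s - s_lo) / (s_hi - s_lo)
    v_sq = v_lo ** 2 + (v_hi ** 2 - v_lo ** 2) * frac
    return sqrt(max(v_sq, 0.0))  # guard against tiny negative rounding


def resample_profile(positions, speeds, section_length, interval):
    """Reduce a densely recorded profile to speeds at uniform positions."""
    grid = uniform_grid(section_length, interval)
    return grid, [speed_at(s, positions, speeds) for s in grid]


# Toy record: accelerate from 0 to 10 m/s over 10 m, then brake to 0.
grid, v = resample_profile([0.0, 10.0, 20.0], [0.0, 10.0, 0.0], 20.0, 5.0)
```

With real data, `positions` and `speeds` would be the thousands of 0.2 s records of one run, and `interval` would be 5 m or 50 m as in Figure 5.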
The speed profile sequences $\{v_i - s_i\}$, $i = 1, 2, \ldots, n$, and the traction energy consumption $E$ of each sequence are extracted. The data are shown in Table 5 (q: number of processed data records). Then, to remove the effect of the different dimensions (units) of the variables, the data are normalized. The extracted data are divided into two parts: 80% is used as the training set and 20% as the test set.
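A minimal sketch of this normalization and 80/20 split is given below. It is an illustration under assumptions, not the authors' code: min-max scaling is assumed as the normalization (the paper does not say which one is used), the helper names are hypothetical, and the toy rows stand in for the processed records of Table 5.

```python
import random


def min_max_normalize(rows):
    """Scale every column of a list of equal-length rows to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]
    return scaled, lows, highs


def split_rows(rows, train_fraction=0.8, seed=0):
    """Shuffle the rows reproducibly and split them into train and test."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(rows)))
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test


# Toy records: two speed points (km/h) followed by the energy (kWh).
records = [[30.0 + k, 60.0 - 2 * k, 20.0 + 0.1 * k] for k in range(10)]
scaled, lows, highs = min_max_normalize(records)
train, test = split_rows(scaled)
```

Scaling here uses the minimum and maximum of the whole data set, following the order described in the text (normalize, then split); fitting the scaling on the training part only is a common alternative when leakage into the test set is a concern.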

Formulation
In this section, a data-driven optimization model (DDOM) is proposed to optimize the urban rail transit traction energy consumption, which discretizes velocity profile and describes the relation between velocity profile and energy consumption as a complex mapping-relation.
$v_i^{\min}$: minimum speed limit corresponding to $s_i$.
$v_i^{\max}$: maximum speed limit corresponding to $s_i$.
$a^{\min}$: minimum acceleration limit in the operational section.
$a^{\max}$: maximum acceleration limit in the operational section.
$T^{\min}$: minimum time limit in the operational section.
$T^{\max}$: maximum time limit in the operational section.
Assumption. During the process $s_i^- \to s_i \to s_i^+$, because the interval is small enough, the train is assumed to be in uniform acceleration. According to the $v$-$s$ relationship of uniformly accelerated motion in physics, a quadratic function can be given.
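The quadratic relation mentioned in the assumption can be written out as a short worked derivation. This is a sketch under the stated uniform-acceleration assumption, not a reproduction of the paper's formulas (1)-(3); $s_i^-$ and $s_i^+$ denote the recorded positions just before and after the grid position $s_i$, $v_i^-$ and $v_i^+$ the recorded speeds there, and the acceleration symbol $a_i$ is introduced here for illustration only.

```latex
% Constant acceleration a_i on [s_i^-, s_i^+] (a_i is an illustrative symbol)
(v_i^{+})^2 - (v_i^{-})^2 = 2 a_i \,(s_i^{+} - s_i^{-})
\quad\Longrightarrow\quad
a_i = \frac{(v_i^{+})^2 - (v_i^{-})^2}{2\,(s_i^{+} - s_i^{-})},
\\[4pt]
v_i^2 = (v_i^{-})^2 + 2 a_i \,(s_i - s_i^{-})
\quad\Longrightarrow\quad
v_i = \sqrt{(v_i^{-})^2
      + \frac{(v_i^{+})^2 - (v_i^{-})^2}{s_i^{+} - s_i^{-}}\,(s_i - s_i^{-})}.
```

In words, the squared speed varies linearly with distance between the two neighbouring records, so $v_i$ follows from the two recorded speeds and the relative position of $s_i$ between them.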
Derived from formulas (1)-(3), we get the velocity sequence $\{v_0, \ldots, v_i, \ldots, v_n\}$.
Train Operation Constraints. During the run from one station to the neighboring station, some constraints should be satisfied.
Speed limit (SL) constraints: the speed limit of the section at $s_i$ should be satisfied. $v_i^{\min}$ and $v_i^{\max}$ are determined by the actual speed limit of the section.
Acceleration constraints: for the comfort of passengers on the train, the acceleration needs to be kept in a suitable range. As shown in formulas (7)-(8), $a^{\max}$ and $a^{\min}$ are determined by actual empirical parameters, with $a^{\max} > 0$ and $a^{\min} < 0$.
Train operation time constraints: transportation efficiency should also be taken into account. Therefore, the train running time $T$ also needs to be within a certain range, as shown in formula (9), where $T^{\min}$ and $T^{\max}$ are determined by the service level and operational condition.
Train operation distance constraints: to ensure that the train reaches the station accurately, the total displacement of the train in the section must be equal to the length of the section.
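The four constraint families can be checked on a discrete profile with a few lines of Python. The sketch below is illustrative and makes its own assumptions: the function names are hypothetical, and the segment acceleration and running time are computed under the uniform-acceleration assumption (acceleration from the change in squared speed, time as segment length divided by mean speed), which may differ from the exact form of the paper's formulas.

```python
def segment_accelerations(positions, speeds):
    """Constant acceleration on each segment: (v1^2 - v0^2) / (2 * ds)."""
    return [
        (v1 ** 2 - v0 ** 2) / (2.0 * (s1 - s0))
        for s0, s1, v0, v1 in zip(positions, positions[1:], speeds, speeds[1:])
    ]


def running_time(positions, speeds):
    """Total time, using ds / mean speed on each uniformly accelerated segment."""
    total = 0.0
    for s0, s1, v0, v1 in zip(positions, positions[1:], speeds, speeds[1:]):
        mean_v = (v0 + v1) / 2.0
        if mean_v <= 0.0:
            return float("inf")  # the train never leaves this segment
        total += (s1 - s0) / mean_v
    return total


def is_feasible(positions, speeds, v_min, v_max, a_min, a_max,
                t_min, t_max, section_length, tol=1e-6):
    """Check speed-limit, acceleration, time, and distance constraints."""
    if abs((positions[-1] - positions[0]) - section_length) > tol:
        return False
    if any(not (lo <= v <= hi) for v, lo, hi in zip(speeds, v_min, v_max)):
        return False
    if any(not (a_min <= a <= a_max)
           for a in segment_accelerations(positions, speeds)):
        return False
    return t_min <= running_time(positions, speeds) <= t_max


# Toy profile in SI units: 0 -> 10 m/s over 50 m, then back to 0 over 50 m.
positions = [0.0, 50.0, 100.0]
speeds = [0.0, 10.0, 0.0]
ok = is_feasible(positions, speeds, [0, 0, 0], [0, 15, 0],
                 a_min=-1.2, a_max=1.2, t_min=15.0, t_max=25.0,
                 section_length=100.0)
```

The limit values in the example are made up for the toy profile; in the paper they come from the actual speed limits, empirical comfort parameters, and the service level.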
Objective Function. When the section running time of the train is $T$, the corresponding energy consumption is $E_T$, which has a complicated relationship with the sequence of velocity points. That is, $E_T(\{s_0 - v_0\}, \ldots, \{s_i - v_i\}, \ldots, \{s_n - v_n\})$, $i = 0, 1, \ldots, n$. The optimization of the urban rail transit speed profile is to minimize the energy consumption while fulfilling the transportation task, and the objective function of the data-driven optimization model (DDOM) is shown in (11).
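Put together, the model described in this section has the following general shape. This is a hedged sketch assembled from the verbal description, not the paper's formulas (6)-(11): $E_T$ is the learned mapping from the profile to energy, $S^0$ is the section length, and the acceleration and running-time expressions assume uniform acceleration on each small section.

```latex
\begin{aligned}
\min_{v_1,\ldots,v_{n-1}} \quad
  & E_T\bigl(\{s_0 - v_0\},\ldots,\{s_i - v_i\},\ldots,\{s_n - v_n\}\bigr) \\
\text{s.t.} \quad
  & v_i^{\min} \le v_i \le v_i^{\max}, && i = 0,\ldots,n, \\
  & a^{\min} \le \frac{v_{i+1}^2 - v_i^2}{2\,(s_{i+1}-s_i)} \le a^{\max},
      && i = 0,\ldots,n-1, \\
  & T^{\min} \le T = \sum_{i=0}^{n-1} \frac{2\,(s_{i+1}-s_i)}{v_i + v_{i+1}}
      \le T^{\max}, \\
  & \sum_{i=0}^{n-1} (s_{i+1}-s_i) = S^0, \qquad v_0 = v_n = 0 .
\end{aligned}
```

The objective has no closed form: it is evaluated by a regression model trained on measured data, which is what makes the model data-driven.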

A Greedy Heuristic Algorithm for the Model
In this section, two energy consumption calculation methods based on machine learning algorithms are first introduced. Then, by analyzing their characteristics, an integrated optimization flow is developed that combines their merits.

Energy Consumption Calculation Based on Machine Learning Algorithm. From the data-driven point of view, an urban rail transit train running within a section produces a traction speed profile that corresponds to an energy consumption value. Although the energy consumption of a train is affected by factors other than the speed profile, the external factors are fixed once the operational section is fixed. Moreover, the transmission characteristics of the train are fixed once the type of train is selected; the energy consumption is then related only to the speed profile during the traction process. Therefore, the speed profile becomes the key to the energy consumption of train traction.
In this paper, two typical machine learning algorithms (RFR and SVR) are introduced. RFR is used to get the importance degrees of the velocity points at different positions, which identify the space-speed pairs that contribute most to the energy consumption. SVR is employed to calculate the energy consumption of the profile. The programming environment is Python 3, and the machine learning module is scikit-learn.

Random Forest Regression (RFR) Algorithm Module. Random forest is an ensemble learning algorithm that trains multiple decision trees and combines their predictions; it is used for classification and can also be used for regression [40]. Based on decision trees combined with aggregation and bootstrap ideas, random forests were introduced by Breiman in 2001 and add an additional layer of randomness to bagging. In addition to constructing each tree from a different bootstrap sample of the data, random forests change how the classification or regression trees are constructed. They are a powerful nonparametric statistical method that handles regression problems in a single, versatile framework [41]. The random forest optionally produces two additional pieces of information: a measure of the importance of the predictor variables and a measure of the internal structure of the data (the proximity of different data points to one another). In this paper, we take advantage of this module to get the importance degrees of the velocity points at different positions, which are used in the heuristic solution process for the model.

Evaluation and Analysis of RFR.
When using the RFR algorithm, two important parameters should be calibrated: the number of split attributes (Mtry) and the number of decision trees (Ntree). For simplicity, enumeration is used to traverse the two parameters. The convergence process over ten experiments is shown in Figure 7. We can see that, when Ntree ≥ 50, the average error is close to 0.1 kWh. The errors for different Mtry values are shown in Figure 8(a), and there is an acceptable convergence range in Figure 8(b). When Mtry = 2 or 3, the error is minimal. Therefore, the optimal parameter combination used in this paper is Mtry = 2 or 3 and Ntree ≥ 50. With the RFR algorithm, the average error of the traction energy consumption evaluation is less than 0.1 kWh and within 1%.
In addition to the high-precision evaluation ability, we also get the importance degrees of the velocity at different displacements for the traction energy consumption of the urban rail transit. We can find the positions at which the speed is more significant to the energy consumption in a section, which indicates the contributions of the space-speed pairs to energy consumption. For instance, in the MingTombs-Changpingxishankou section, whose length is 1230 m, the importance degrees at different positions are shown in Figure 9.
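The importance ranking can be obtained from scikit-learn roughly as follows. This is a minimal sketch rather than the authors' code: it assumes that Mtry and Ntree correspond to the `max_features` and `n_estimators` parameters of `RandomForestRegressor`, the helper name `rank_positions_by_importance` is hypothetical, and the synthetic data merely stand in for the normalized speed sequences and measured energy.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def rank_positions_by_importance(X, y, n_trees=50, max_features=3, seed=0):
    """Fit a random forest and rank the input columns by importance.

    Each column of X is the speed at one discrete position; y is the
    measured traction energy of the profile.
    """
    model = RandomForestRegressor(
        n_estimators=n_trees,       # assumed counterpart of Ntree
        max_features=max_features,  # assumed counterpart of Mtry
        random_state=seed,
    )
    model.fit(X, y)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]  # most important position first
    return model, importances, order


# Synthetic stand-in: 200 profiles with 6 speed points each; the energy
# depends mostly on the speed at position index 2.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 6))
y = 5.0 * X[:, 2] ** 2 + 0.5 * X[:, 4] + rng.normal(0.0, 0.05, size=200)

model, importances, order = rank_positions_by_importance(X, y)
```

The array `order` plays the role of the descending importance ordering used later by the heuristic; reversing it gives the ascending ordering.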

Support Vector Machine Regression (SVR) Algorithm Module. The support vector machine (SVM) algorithm comes from statistical learning theory (SLT) and is based on the structural risk minimization principle, which can avoid excessive learning and ensure the generalization ability of the model. In essence, training reduces to a convex quadratic programming problem, which avoids falling into local minima. It can be applied not only to classification problems but also to regression [42]. Therefore, it can be divided into support vector classification (SVC) and support vector regression (SVR). Because of its solid theoretical foundation and complete theoretical derivation, the support vector machine is an effective tool for dealing with small-sample, nonlinear, and local-minimum problems. In this paper, it is applied to calculate the energy consumption based on real data.
Before using SVR, the first step is to determine the kernel function. The second step is to optimize the parameters corresponding to the different kernel functions. In this paper, three typical kernel functions are verified: the radial basis kernel function (RBF), the linear kernel function (LINEAR), and the polynomial kernel function (POLY).
(1) For RBF, the calibration parameters include the penalty factor $C$ and the $\gamma$ value. As shown in Figure 10(a), the convergence rate of RBF is very fast. When $C \ge 20$, the error drops to a lower level. When $C \ge 100$, the average error of traction energy consumption can reach about 0.1 kWh. The best combination of parameters is $C \ge 30$ and $\gamma = 3$.
(2) For LINEAR, the calibration parameter is the penalty factor $C$. As shown in Figure 10(b), the convergence is slow. When $C \ge 900$, the average error of traction energy consumption can also reach about 0.1 kWh, which means that it takes somewhat longer to reach the minimum error.
(3) For POLY, the calibration parameter is the penalty factor $C$. As shown in Figure 10(c), the average error fluctuates around 0.1 kWh and is not stable, so it fails to achieve better convergence.
Comparing the performance of the three kernel functions, the RBF kernel function gives the best average error, which means that the traction energy consumption can be calculated under the optimal parameter conditions.
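A kernel comparison of this kind can be sketched with scikit-learn's `SVR`. The example below is illustrative only: the helper name `compare_kernels` is hypothetical, the value of `C` and the use of `gamma="scale"` are assumptions rather than the calibrated settings reported above, and the synthetic data stand in for the normalized training and test sets.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.svm import SVR


def compare_kernels(X_train, y_train, X_test, y_test, C=100.0):
    """Fit one SVR per kernel and return the mean absolute test error."""
    errors = {}
    for kernel in ("rbf", "linear", "poly"):
        model = SVR(kernel=kernel, C=C, gamma="scale")
        model.fit(X_train, y_train)
        errors[kernel] = mean_absolute_error(y_test, model.predict(X_test))
    return errors


# Synthetic stand-in for normalized speed sequences and energy values.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 5))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.02, size=200)

# 80% of the rows for training, 20% for testing.
errors = compare_kernels(X[:160], y[:160], X[160:], y[160:])
```

Repeating the call over a range of `C` values (and, for the RBF kernel, of `gamma` values) gives convergence curves of the kind described for Figure 10.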
Analysis of the Two Machine Learning Algorithms. The RFR algorithm performs stably on the data set, and its evaluation results are satisfactory. More importantly, the importance degrees of the velocity points at different positions can be sorted, which provides valid guidance for the optimization control of the speed profile. For example, we can adjust the speeds with high importance degrees in the speed profile optimization process. As for the SVR algorithm, although its performance is not good with some kernels, its calculation ability with the RBF kernel function is serviceable enough. To optimize the speed profile of an urban rail transit train, we should find a speed profile whose energy consumption is no higher than, and ideally lower than, the existing energy consumption. However, the RFR algorithm has a fatal flaw: a random forest cannot produce outputs beyond the range of the data set, which may lead to overfitting when modeling some specific data with noise. Therefore, the design of urban rail transit speed profile optimization algorithms could benefit from combining the virtues of SVR and RFR.
Optimization Process. From the viewpoint of discrete train speed profile optimization, the key problem is how to design a method to get a more energy-efficient profile; thus a combination $\{v_i - s_i\}$ ($i = 0, 1, \ldots, n$) should be found. The velocity $v_i$ at every position can take values in a range, so the number of combinations $\{v_i - s_i\}$ is enormous. It is therefore necessary to discretize the speed change, using a step size for the speed adjustment. A simple and effective step size is the unit of the recording instrument (in our experiment, 0.001 km/h). Further, a heuristic process can be used to reduce the number of combinations: the importance degrees from RFR are used to adjust the velocities in a fixed order, which makes an energy-saving profile easier to find. As shown in Figure 11, in one operation section of the real-world data, there are many profiles with the same running time but different energy consumptions. For every running time, we try to find a satisfactory profile at this fixed running time. Then, the best of the profiles over the different fixed running times is taken as the optimal solution. Based on this, we develop an integrated greedy heuristic algorithm that combines RFR and SVR.

Parameters
$I^+$: set of index values of the speeds, arranged in descending order of importance degree.
$I^-$: set of index values of the speeds, arranged in ascending order of importance degree.
$i(k)^+$: in descending order, the speed index value corresponding to the $k$th importance degree.
$i(k)^-$: in ascending order, the speed index value corresponding to the $k$th importance degree.

Figure legend: collection of all solutions; feasible solutions at different times; local optimal solutions at different times; global optimal solutions.

Step 1. With the optimal parameters, the random forest regression (RFR) algorithm module (Section 5.1.1) is used to obtain the importance degrees of the speed series $\{v_i - s_i\}$. They are then sorted in descending order (because the importance degrees of $\{v_0 - s_0\}$ and $\{v_n - s_n\}$ are zero, these two are excluded), and the $K$ speed sequences $\{v_{i^+} - s_{i^+}\}$ in the first m% ($K = n \cdot m/100$) are selected. For the corresponding importance degrees $d_k^+$ ($1 \le k \le K$), we have $d_1^+ \ge d_2^+ \ge \ldots \ge d_k^+ \ge \ldots \ge d_K^+$. Similarly, in ascending order, the $K$ speed sequences $\{v_{i^-} - s_{i^-}\}$ in the first m% are selected, giving $d_1^- \le d_2^- \le \ldots \le d_k^- \le \ldots \le d_K^-$.
Step 2. Initialize the operation time $T$ of the urban rail transit train, and set $T_0 = T^{\min}$. $T^{\min}$ and $T^{\max}$ are determined from the minimum and maximum times in the data, and the discretized unit of time is denoted $\nabla T$. The iteration counters are then initialized (to 1 and 0, respectively).
Step 3. A new profile is obtained after adjusting $v_{i^+}$ and $v_{i^-}$. The support vector regression (SVR) module (Section 5.1.2) is used to calculate its energy consumption. The velocity is adjusted until the stopping condition is reached, and the minimum energy consumption found during the adjustment process and the corresponding speed profile are recorded. Formulas (12) and (13) give the displacement changes $\nabla s^{\wedge}$ and $\nabla s^{-}$ that result from the velocity changes $\nabla v^{\wedge}$ and $\nabla v^{-}$. To keep the displacement balanced, let $\nabla s^{\wedge} = \nabla s^{-}$.
Step 4. Collect the energy consumption obtained for every running time; the best of them is taken as the optimal solution.
Finally, the algorithm flow is shown in Figure 13.
We take the MingTombs-Changpingxishankou section of the Changping Line in the down direction as a numerical experiment to explain the optimization process; the section parameters are listed above. Two cases with different intervals are considered. A complete operation state is shown in Figure 14.
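The greedy idea behind Steps 1-4 can be illustrated with a simplified Python sketch. It is not the authors' algorithm: it adjusts one speed point at a time in importance order and does not reproduce the paired adjustment of formulas (12) and (13); the function name `greedy_adjust`, the toy energy function, and the toy feasibility rule are all assumptions made for illustration. In practice, `predict_energy` would wrap the trained SVR model and `is_feasible` would encode the operation constraints for a fixed running time.

```python
def greedy_adjust(speeds, order, predict_energy, is_feasible,
                  step=0.5, max_rounds=100):
    """Greedy coordinate search over a discrete speed profile.

    speeds: initial speed at each discrete position.
    order: indices of adjustable points, most important first.
    predict_energy: callable mapping a profile to predicted energy.
    is_feasible: callable returning True if a profile meets all constraints.
    """
    best = list(speeds)
    best_energy = predict_energy(best)
    for _ in range(max_rounds):
        improved = False
        for i in order:
            for delta in (-step, step):
                candidate = list(best)
                candidate[i] += delta
                if not is_feasible(candidate):
                    continue
                energy = predict_energy(candidate)
                if energy < best_energy - 1e-12:
                    # Keep the first improving move for this point.
                    best, best_energy = candidate, energy
                    improved = True
                    break
        if not improved:
            break  # no single-step change lowers the predicted energy
    return best, best_energy


# Toy stand-ins (assumptions): energy grows with squared speed, and interior
# speeds must stay within [4, 8] while the train is at rest at both ends.
def toy_energy(profile):
    return sum(v ** 2 for v in profile)


def toy_feasible(profile):
    at_rest = profile[0] == 0.0 and profile[-1] == 0.0
    return at_rest and all(4.0 <= v <= 8.0 for v in profile[1:-1])


profile, energy = greedy_adjust([0.0, 5.0, 6.0, 0.0], [2, 1],
                                toy_energy, toy_feasible)
```

Running the search once per discretized running time and keeping the lowest predicted energy over all of them corresponds to the final selection in Step 4.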

Optimization Result
Case 1. $s_i$ ($i = 0, 1, \ldots, n$) is set with a uniform interval of 5 m, and let $v_0 = v_{246} = 0$, $s_0 = 0$, $s_{246} = 1230$. The operation time is 103.4 s. The results after optimization are shown in Figure 15. We can see that the optimal profile is not smooth: it suddenly increases or decreases in some places. Clearly, the optimized profile is not usable enough in practice.
Case 2. $s_i$ ($i = 0, 1, \ldots, n$) is set with a uniform interval of 50 m, and let $v_0 = v_{26} = 0$, $s_0 = 0$, $s_{26} = 1230$. Figure 16 shows the optimal results when m = 50% (Figure 16(a)) and m = 100% (Figure 16(b)). In this case, the operation time is also 103.4 s. The optimized energy consumption is reduced by 0.65 kWh. We can see that the speed profile is much smoother than in Case 1, and the rate of energy reduction is 3.1% (0.65/21 × 100%). In Figure 16(a), for m = 50%, the acceleration stage is slightly flatter after optimization. In Figure 16(b), for m = 100%, the whole speed profile is flatter than the original profile, and it is more valuable in practice.
Operation sections of different lengths should not use the same discrete interval. For a longer section, the interval can be larger. For example, the distance between Xi'erqi and Life Science Park is 5455 m, and the interval could be 200 m.
In addition, the comparison of profiles before and after optimization is shown in Figures 17(a)-17(j). Optimization results of other operation sections are listed in Table 6. We can see that the maximum energy saving is obtained in the section Shahe to Shahe University Park, which is a good performance. For the whole line, 31.9 km long with 12 stations, the energy saving is 2.84%. The improvement may look modest when compared with previous research (most studies claim energy savings above 4%). However, our improvement is measured against a real-world result that had already been subject to optimal control (traditional optimal train control based on the Pontryagin maximum principle). There is an ATO (automatic train operation) system, equipped with optimal control, on the Beijing Changping Line and Yizhuang Line. The Yizhuang Line and the Changping Line have some similar features: train type, train formation, passenger intensity, power supply mode, and so on.
According to the operator, a well-designed method applied in practice on the Yizhuang Line achieves an average energy saving below 3%. Therefore, it is reasonable that an improvement obtained on top of an ATO profile looks modest. Besides, the improvement differs from section to section. The results may be influenced by many factors, such as the external environment of each section (curve radius, slope, air humidity, and so on). How well the existing control is already optimized in a section determines the room for improvement. If the room for improvement is limited, the real improvement may also be limited. Therefore, there is no quantitative result to illustrate the different improvements in each section.

Conclusion
Reducing train traction energy consumption is one of the efficient ways to cut energy cost in urban rail transit systems. To protect the environment as well, optimizing traction energy conservation has become a significant task in urban rail transit operation and management. The traction energy consumption of a single train is related to the speed profile between stations. When energy-efficient profiles are applied in every section, there will be a positive effect on reducing the energy consumption of the urban rail transit system. Therefore, train speed profile optimization is fundamental work.
In this paper, the speed profile optimization problem is discretized, and the decision variables of the speed profile become a series of space-speed points. From this viewpoint, a data-driven urban rail transit train speed profile optimization model (DDOM) is proposed to describe the relationship between profiles and energy consumption. Two machine learning algorithms, namely, random forest regression (RFR) and support vector regression (SVR), are used. RFR is applied to get the importance degrees of the velocity at different positions, and these degrees are used as heuristic information to decide the order in which the velocities at different positions are optimized. SVR is used to calculate the energy consumption of profiles with high accuracy (95%). Combining the advantages of the two algorithms, an integrated greedy heuristic optimization algorithm is developed to solve the model, which can reduce energy consumption by 2.84%. In some theoretical research, the energy conservation percentage is higher than in our results. However, few of those results are verified with real-world data. Furthermore, our methods are quite simple and can be applied in practice easily.
Nevertheless, because the data samples are far from sufficient, the range of velocity change is limited when adjusting the velocity at different positions to get a new profile in the optimization process. There is still some room for improvement on the basis of the optimization results. Although there are many different views, the data-driven method is new to this problem, and applying machine learning algorithms to energy saving in urban rail transit is the innovation of this work. Future research can focus on the following areas. Firstly, a further improved algorithm with a different heuristic strategy could be studied. For instance, based on the data-driven machine learning method, the regenerative electricity produced in the braking process may be reused by trains in neighboring sections. Thus, instead of optimizing a single train speed profile in each section separately, train speed profiles from neighboring sections should be taken into account. Secondly, in urban rail transit networks, if the power supply at the network nodes (transfer stations) comes from the same transformer substation, the energy-saving optimization of trains can be extended to the whole urban rail transit network.

Figure 1: Proportions of public transportation and urban rail transit.

Figure 8: Convergence process and errors in RFR. (a) Errors in different Mtrys. (b) Convergence range.

Figure 9: Importance of velocity at different locations in the section.

Figure 12: Explanation of changes of velocity and displacement.

Figure 17: The obtained profiles in different sections. Sections (a)-(j) are listed in Table 6.

Table 1: Some typical publications about energy-efficient research. I: speed profiles/driving strategy; II: energy-efficient timetable.

Table 2: Overview of measurement characteristics.

Table 3: Part types of the original data.

Table 4: Part of the velocity series after being processed.

Table 5: Data format of training and testing set.

Table 6: Optimization results of other sections.