Developing an Enhanced Short-Range Railroad Track Condition Prediction Model for Optimal Maintenance Scheduling

As railroad infrastructure ages and rail transportation moves toward higher speeds and heavier axle loads, the risk to safe rail transport and the expenses for railroad maintenance are increasing. A railroad infrastructure deterioration (prediction) model is vital to reducing the risk and the expenses. A short-range track condition prediction method was developed in our previous research on railroad track deterioration analysis. It is intended to give track maintenance managers the track condition two or three months in advance so that they can schedule track maintenance activities more effectively. Recent comparisons between track geometry exceptions calculated from track condition measured with track geometry cars and those predicted by the method showed that the method fails to provide reliable condition predictions for some analysis sections. This paper presents an enhancement to the method. One year of track geometry data for the Jiulong-Beijing railroad from track geometry cars was used to conduct error analyses and comparison analyses. The analysis results imply that the enhanced model makes robust and reliable predictions. Our in-process work on applying the predicted conditions to optimal track maintenance scheduling is also discussed briefly.


Introduction
Transportation systems play a critical role in the development of society and the economy. Railway systems account for the largest share of national freight ton-miles, for example, 38.2% in 2005 in the USA [1] and 49.70% in 2005 in China [2]. Railroad infrastructure, as a basic element of the railway system, has a great and direct influence on the safety and cost efficiency of rail transport. It is believed that as railroad infrastructure ages, the risk to safe transport and the expenses for preserving the infrastructure will increase. Specifically, once the infrastructure exceeds a certain age, the risk and the expenses will increase exponentially [3]. The last ten years (a small portion of the entire infrastructure evolution process) have seen linearly increasing expenses per mile for Class I railroad infrastructure in the USA, as illustrated by Figure 1. Furthermore, the recent development of rail transportation towards higher speeds and heavier axle loads is also believed to increase the risk and expenses.
Practices in transportation infrastructure management try to balance the cost associated with potential damage resulting from infrastructure in unfavorable condition against the cost of Maintenance and Renewal (M&R) activities in order to minimize the total cost. Management practices for highway pavement and some other infrastructures have been implemented in tools [4], but such tools for railroad infrastructure management are rare. Among the issues in achieving the balance between the two categories of costs, railroad infrastructure deterioration modeling is vital [5]. Infrastructure deterioration models fall into two categories: long-range and short-range deterioration (or prediction) models. The long-range models assist infrastructure management departments in making budget plans that minimize the cost over the planning horizon under constraints. The short-range models are necessary to optimally schedule M&R activities, subject to the limited budgets and other maintenance resources allocated through long-range models and to the requirement of acceptable infrastructure condition, so as to minimize the effects of the activities on rail traffic.
According to the M&R schedule, resources are allocated to each M&R activity within the planning horizon, and, accordingly, the balance between the cost associated with potential damage and the cost of M&R activities is achieved. This categorization of deterioration models applies to the infrastructure of all transport modes. If infrastructure condition predictions by deterioration models are not accurate enough, plans will be in question, and damage might sometimes be caused by unpredicted infrastructure failures. Therefore, both long-range and short-range prediction models should be characterized by high accuracy and broad suitability for infrastructure in various conditions.

A large number of models have been developed for highway pavement deterioration. For instance, the Markov decision process was employed to formulate pavement deterioration [6-10]. Kobayashi et al. modeled the pavement deterioration process through a hidden Markov model [11]. Using a time series method, Durango-Cohen formed a model for the pavement deterioration process [12]. Several investigations have been performed on mathematical modeling of railroad infrastructure deterioration. Kawaguchi et al. applied a double exponential smoothing method to the track geometry evolution process and formed a track deterioration model [13]. Based on a track degradation database, Alfelor et al. established a one-to-one linear relationship between track deterioration and one specified contributing factor through the least squares method [14]. Based on a common denominator regarding track deterioration, Veit and Marschnig used an exponential model to describe the track deterioration process [15]. Meier-Hirmer et al. fitted a gamma stochastic process to the evolution rate of track surface over a 1000-meter-long section of track [16]. He et al. considered that the track deterioration rate increases linearly with the current track condition and exponentially with the sum of the individual effects of five contributing factors and formulated their track deterioration model accordingly [17]. Liu et al. and Xu et al. assumed that the track deterioration process between two adjacent maintenance activities may take the form of any smooth, nonlinear, nondecreasing function and therefore employed piecewise linear regression models to describe track deterioration [18,19].
From the above literature review, it is seen that some investigations consider infrastructure deterioration stochastic, while others assume that the relationships between deterioration and impact factors are characterized by deterministic formulations, for example, linear, polynomial, exponential, and piecewise linear. From the perspective of approximating a curve with straight line segments, a piecewise linear relationship can be used to approximate any smooth nonlinear relationship. In other words, if properly modeled, the piecewise linear relationship can be more widely applicable than linear, polynomial, and exponential ones.
In previous research on track (geometry) condition prediction technologies, the authors proposed a method employing piecewise linear regression to make short-range predictions for condition indices [18,19]. The proposed method uses track inspection data within time periods of equal length to estimate deterioration rates for all unit sections of a track. Specifically, the lengths of the time periods are considered equal in both the temporal and spatial dimensions. The estimation of deterioration rates is triggered by the availability of new track inspection data. Details about the estimation are described in Section 3. The deterioration rate of a unit section for a time period is then used to predict the track condition of the unit section for the following two or three months. From this brief introduction, it is easily seen that the length of the time periods is a key parameter of the method. Hereafter, the time period is referred to as the time span. In the previous research, the length of the time span was determined mainly according to railroad field engineers' knowledge. Recent data analyses discovered that in some cases errors in track condition predictions are not normally distributed around 0 mm. Further rigorous analyses of these cases revealed that the undesirable error distributions may arise from an inappropriate length of the time span.
This paper attempts to formulate an optimization model to estimate the time span length for each unit section. The estimated time spans vary along a track, whereas the previous model takes a constant time span length for an entire track. This is the difference between the enhanced model and the previous one. The estimated time span and a normally distributed random variable are then incorporated into the previously proposed prediction method to enhance it. The enhanced prediction model allows maintenance-of-way departments to acquire accurate track condition two or three months in advance, depending on the railroad's transportation focuses, that is, freight and passenger, million gross tons, and traffic speeds.
The remainder of the paper is organized as follows. The effects of impact factors on track condition deterioration are descriptively analyzed in Section 2 in order to form a basis for the track condition prediction model. Section 3 presents the enhanced track condition prediction model based on the characteristics of track condition deterioration under the impact factors. Section 4 presents error analysis results for track condition predictions. Using the track condition predictions to optimally schedule track maintenance is briefly discussed in Section 5 for future research. Finally, conclusions regarding the research in this paper are drawn in Section 6.

Descriptive Analysis on the Effects of Influential Factors
Railroad track geometry deviations are usually termed track irregularities. Generally speaking, track irregularities are the result of the cumulative comprehensive effects of seven categories of impact factors [3,4,20-23].

As the base of railroad track, terrain has an obviously direct and considerable influence on track geometry. Any variations in terrain will be reflected immediately by sudden changes in track geometry, for example, the Taiwan high-speed rail subsidence [24]. What is more, the vertical stiffness of terrain varies along a track, resulting in longitudinally varying track deterioration processes.
Under the effects of these seven categories of impact factors, track deterioration processes fall into three groups: gradual deterioration, sudden deterioration (more precisely, damage), and improvement in track condition, as shown in Figure 2. As discussed above, the gradual deterioration process is the result of the effects of moderate environmental factors and the first four categories of impact factors, as demonstrated by the solid dark curves in Figure 2, and the improvement process is caused by track maintenance, as demonstrated by the dashed gray lines in Figure 2. Occurrences of the other factors, including variations in terrain and extreme environmental factors, make track deteriorate suddenly; this process is not presented in the figure.
In the process of track deterioration, some categories of the impact factors influence each other. For instance, wheel loads deteriorate tracks in terms of track geometry, the condition of track components, and the performance of the materials of the track components. Simultaneously, the deteriorated tracks increase the wheel loads and decrease track resistance to deterioration. Such interactive influences among these three categories of the impact factors continue as trains run over the tracks. Therefore, the track deterioration rate usually shows an accelerating trend, which is the basis on which researchers formulate track deterioration models. Sadeghi and Askarinejad have quantitatively investigated the influences of some impact factors on track deterioration [23].
Track deterioration proceeds under the influences of all these seven interacting categories of impact factors. The degree of their influence on track deterioration varies along tracks. In other words, each track location has its own distinctive track deterioration process. This distinctive characteristic of track deterioration has been experienced by railroad field engineers during the past several decades of track management practice. Values of geometrical parameters for track positions vary considerably along a track section, even when the whole track section is maintained during one maintenance activity with the same maintenance machine. At present, only a few of the impact factors can be measured, and the interactive effects among the impact factors cannot be measured. What is more, even data for the most accessible influential parameters, that is, million gross tons and train travelling speeds, are often unavailable to investigators because of the confidentiality regulations of operation divisions. Moreover, measurement data of the impact factors that can be measured are often contaminated with slight noise [16,25-27]. Because of the unavailability of data for most impact factors and the uncertainty in the measurement data, track deterioration is usually considered stochastic and is formulated with time as the independent variable.

Enhanced Track Condition Prediction Model
Because of the distinctive characteristic of track deterioration, the proposed track condition prediction model is formulated for each analysis object. The spatial and temporal dimensions of a practical analysis object are determined first in light of the facts about track geometry measurement data from track geometry cars. Hereafter, the analysis object is often called the analysis section for compatibility with railroad industry usage. For a specified analysis section, the track condition prediction model is then proposed to predict its condition two or three months in advance. Last, the optimal time span estimation model is formulated for the model.

Secondly, there are errors in milepoint measurements [27]. These errors make it impossible to acquire historical geometry measurements even for the sampled track positions. Lastly, geometry measurements are usually contaminated with slight noise. This fact makes it difficult to mine slightly contaminated historical geometry measurements for accurate deterioration processes of track points. Therefore, the analysis object cannot be a track point but should be a track section. Because of the distinctive track deterioration processes, the analysis track section should not be too long. To determine the length of the analysis track section, the above three factors have to be taken into consideration.
Track geometry cars throughout China Railways today have only one sampling distance, that is, 0.25 m. For many reasons [26-28], milepoint measurements from track geometry cars are inconsistent with the actual milepoints in the field. To deal with errors in milepoint measurements, two mathematical models have been developed by the authors: Key Equipment Identification (KEI) [28] and Dynamic Sampling Point Matching (DSPM) [26,27]. KEI is intended to automatically identify the sampled points associated with key locations of some track equipment (horizontal curves and diverging tracks of turnouts) in a specified inspection data file and then to revise the milepoints of the data file according to the actual milepoints of those identified points. After processing by KEI, milepoint errors are reduced considerably, but two processed inspection data files over the same track still have small differences in the milepoints of nearly identical sampling points. This kind of milepoint difference is referred to as milepoint shift in the references. DSPM was developed to reduce milepoint shifts. It is formulated to automatically match sampling points in one inspection with the closest sampling points in another inspection. After processing by both KEI and DSPM, milepoint shifts are, in most cases, reduced to below one standard sampling distance, that is, 0.25 m. Readers are referred to [26-28] for details about the above-mentioned data processing models.
As for slight noise in geometry measurements, an optimization model based on inspection data processed by KEI and DSPM was formulated to minimize the effects of the slightly noisy geometry measurements. The noise effects are quantified through the deviation of the geometrical parameter values predicted by the proposed model from the measured ones. For each inspection, the deviations associated with each analysis section length were calculated. The optimization model uses the sum of the squared deviations as the objective function to choose the analysis section length that produces the best fit (i.e., minimizes the objective function). The minimization of the objective function was accomplished by comparing the sums linked to different lengths. Two years of inspection data from the Jiulong-Beijing Railroad administered by the Jinan bureau of China Railways were used to accomplish the comparison analysis. A length of 0.5 m was obtained as the best analysis track section length.

Track Condition Prediction Model.
Occurrences of sudden deterioration cannot be predicted in advance, mainly because data for the impact factors causing this category of deterioration are unavailable. As shown in Figure 2, after sudden deterioration, deteriorated (or damaged) tracks are restored to satisfactory condition, which provides a safe running surface for trains. This indicates that, from the perspective of train safety, the restored tracks are of little concern for research on track condition prediction. But for research on long-range optimal scheduling of track maintenance, the restored tracks have to be taken into account in order to maximize benefit functions or to minimize cost functions over a planning horizon. Considering the focus of the current paper, it is assumed that deterioration rate estimation is made for a gradual deterioration process between the occurrences of two adjacent sudden deteriorations.
If the value of $r(\tau)$ over the time range from $t_k$ to $t$ is assumed to be equal to $r(t_k)$, the equation $C(t) = C(t_k) + \int_{t_k}^{t} r(\tau)\,d\tau$ is rewritten as $C(t) = C(t_k) + r(t_k)(t - t_k)$. The approximation basically holds when the time range is short, for example, less than half a year [5,19]. The reason is that within such a time range there is only a small probability that track components experience large performance degradations, which would result in rapid changes in the deterioration rate. Therefore, the assumption is acceptable for the problem of current concern (i.e., short-range prediction). But the approximation of the deterioration rate function $r(\tau)$ by $r(t_k)$ inevitably introduces errors into the predicted track condition $C(t)$. To account for this, a normal random variable, $\varepsilon$, is incorporated into the approximate equation. Accordingly, the short-range prediction model for an analysis section is formulated as

$$\hat{C}(t) = C(t_k) + \left[\bar{r}(t_k)(t - t_k) + \varepsilon\right]\mathbf{1}, \tag{1}$$

where $C(t_k)$ is a column vector of condition values of a specified geometrical parameter measured by a track geometry car at the time point $t_k$ over the sampling points in the analysis section, $\hat{C}(t)$ is a column vector of predicted condition values at the time point $t$ for the parameter on the sampling points, $\bar{r}(t_k)$ is the average of the deterioration rates of the parameter at $t_k$ on the sampling points and will be estimated in Section 3.3, and $\mathbf{1}$ is a column vector with the same dimension as $C(t_k)$ and all elements equal to 1.

In (1), the average deterioration rate $\bar{r}(t_k)$ rather than the individual deterioration rates is used to make predictions. Such treatment of the deterioration rate is feasible because the analysis section is only 0.5 m in length and the degrees of influence of all impact factors are almost identical along the analysis section. Furthermore, the use of the average deterioration rate may also reduce the effects of slight noise in track geometry measurements. According to the above discussion, the proposed model is built for the deterioration process of a geometrical parameter on a sampling point in an analysis section, and the deterioration process within a short time range is approximated by a linear model which has a normally distributed random component to quantify the errors introduced by the approximation. The parameters of (1), $C(t_k)$ and $\bar{r}(t_k)$, have to be updated continuously as track condition evolves. In the current research, updating the parameters is triggered by the availability of new inspection data from track geometry cars.
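As a minimal sketch of the linearized prediction described above, the following Python function extrapolates the measured values of the sampling points in one analysis section with a single average deterioration rate. The function name, the units (millimeters and days), and the example numbers are illustrative assumptions rather than part of the original method, and the random error term is left out, so only the deterministic part of the prediction is returned.

```python
def predict_condition(measured, avg_rate, days_ahead):
    """Extrapolate each measured value linearly with one shared rate.

    measured   -- values of one geometry parameter at the latest inspection,
                  one per sampling point in the analysis section (mm)
    avg_rate   -- average deterioration rate of the section (mm/day)
    days_ahead -- time elapsed since the latest inspection (days)
    """
    if days_ahead < 0:
        raise ValueError("days_ahead must be non-negative")
    drift = avg_rate * days_ahead  # same increment for every sampling point
    return [value + drift for value in measured]


# Hypothetical analysis section with three sampling points.
latest = [1.20, 1.35, 1.10]
forecast = predict_condition(latest, avg_rate=0.005, days_ahead=60)
print([round(v, 2) for v in forecast])
```

Because every sampling point receives the same increment, the shape of the measured profile within the section is preserved and only its level shifts with time.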

Optimal Time Span Estimation.
As concluded in Section 3.1, each geometrical parameter of each analysis section has its own distinctive deterioration process. This indicates that the length of the time range within which historical measurement values are used to estimate the average deterioration rate $\bar{r}(t_k)$ through the least squares method varies among the seven geometrical parameters and varies along the track. For a geometrical parameter on the sampling points of a given analysis section, the differences between the measured values and the values predicted by (1) on all the sampling points are calculated. The minimum sum of the squared differences is used as the objective function, formulated as (2), to determine the time range length at the time point $t_k$ for the geometrical parameter on the given analysis section:

$$\min_{n} \sum_{j=1}^{m} \left\| C(t_{k+j}) - \hat{C}(t_{k+j}, n) \right\|^{2}, \tag{2}$$

where

$$\hat{C}(t_{k+j}, n) = C(t_k) + \left[\bar{r}(t_k, n)(t_{k+j} - t_k) + \varepsilon\right]\mathbf{1}, \tag{3}$$

$$\bar{r}(t_k, n) = \frac{\sum_{i=0}^{n} \left(x_i - \bar{x}\right)\left(\bar{C}(t_{k-i}) - \bar{y}\right)}{\sum_{i=0}^{n} \left(x_i - \bar{x}\right)^{2}}, \quad x_i = t_{k-i} - t_k. \tag{4}$$

In the objective function, $\hat{C}(t_{k+j}, n)$ denotes the prediction of the column vector $C(t_{k+j})$ when the last $(n + 1)$ measurement column vectors $\{C(t_{k-i}), 0 \le i \le n\}$ are used to calculate the average deterioration rate $\bar{r}(t_k, n)$, and the constant $m$ equals the number of track geometry car inspections in the time period which the proposed model covers. Equation (3) is the prediction model developed in Section 3.2. In (4), $\bar{C}(t_{k-i})$ is the mean of the elements of the measurement column vector $C(t_{k-i})$, $\bar{x}$ and $\bar{y}$ are the means of $x_i$ and $\bar{C}(t_{k-i})$ over $0 \le i \le n$, and $\bar{r}(t_k, n)$ is calculated through the least squares method from the point set $\{(t_{k-i} - t_k, \bar{C}(t_{k-i})), 0 \le i \le n\}$.
When the optimal solution to the objective function, that is, $n^{*}$, is attained, the value of $\bar{r}(t_k, n^{*})$ in (4) is the average deterioration rate which the proposed model uses to make predictions, as shown in Figure 3. As track inspection cars continue to inspect track condition, the process illustrated in Figure 3 keeps repeating itself.
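The time span selection described in this subsection can be sketched as follows. This is only an illustration under stated assumptions: the function names and the data are hypothetical, the candidate look-back counts are evaluated against later inspections that are assumed to be available as comparison data, and the random error term is omitted (its mean is taken as zero).

```python
from statistics import mean


def least_squares_slope(xs, ys):
    """Slope of the ordinary least squares line through (xs, ys)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def average_rate(times, inspections, k, n):
    """Rate fitted to the section means of inspections k-n .. k."""
    idx = range(k - n, k + 1)
    xs = [times[i] - times[k] for i in idx]
    ys = [mean(inspections[i]) for i in idx]
    return least_squares_slope(xs, ys)


def squared_error(times, inspections, k, n, m):
    """Sum of squared differences over the m inspections after k."""
    rate = average_rate(times, inspections, k, n)
    total = 0.0
    for j in range(1, m + 1):
        dt = times[k + j] - times[k]
        for now, later in zip(inspections[k], inspections[k + j]):
            total += (later - (now + rate * dt)) ** 2
    return total


def optimal_span(times, inspections, k, m):
    """Look-back count n (1 <= n <= k) with the smallest squared error."""
    return min(range(1, k + 1),
               key=lambda n: squared_error(times, inspections, k, n, m))


# Hypothetical data: maintenance between inspections 2 and 3 resets the section.
times = [0, 20, 40, 60, 80, 100, 120]                # days
inspections = [[3.0, 3.2], [3.2, 3.4], [3.4, 3.6],   # before maintenance
               [1.0, 1.2], [1.2, 1.4], [1.4, 1.6],   # after maintenance
               [1.6, 1.8]]                           # later comparison inspection
print(optimal_span(times, inspections, k=5, m=1))
```

In this made-up example, look-back windows that reach across the maintenance event mix two deterioration processes and give a misleading rate, so a short window is selected. This mirrors the motivation for estimating the time span per analysis section instead of fixing it for the whole track.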

Performance Analysis
This section analyzes the performance of the enhanced model through statistical analyses of errors in the geometry parameter values predicted by the enhanced model and comparison analyses of errors between the enhanced and original models. Since the middle of 2007, the authors' team has been collaborating with a couple of bureaus of China Railways on optimal track maintenance scheduling. The Jinan bureau is one of these collaborating bureaus. The collaboration with these bureaus entitles us to access their track geometry data.

The inspection frequency of track geometry cars for the Jiulong-Beijing railroad is basically 3 times per 2 months. The track section whose track geometry data are used for the performance analyses starts at milepoint K612+000 and ends at K614+000. Rails in this track section were manufactured in early 2003 and were placed on March 1, 2003. They are all standard length rails, and their weight per meter is 60 kilograms. Because the Jiulong-Beijing railroad is a trunk line, its rails are all continuously welded. Ties were manufactured in 1994 and were placed in 1995. All ties are type II concrete ties, and the number of ties per kilometer is 1760. The rails are fastened onto the ties with type I fastening systems. Under the ties, a 30 cm thick granite ballast layer and a 20 cm thick granite subballast layer were laid in 1994. The section from K582+84512 to 650.33300, which spans the track section to be analyzed, was rehabilitated in 2003.
The track geometry data used for the performance analysis were acquired by a GJ-4 track geometry car, the most extensively used model of track geometry car in China Railways. Because of the errors in milepoint measurements mentioned in Section 3.1, the track geometry data were processed by both KEI and DSPM. As an example, Figure 4 shows the superposed waves of track geometry data on February 20 and March 6, 2008, for the analyzed track section. In Figure 4, the waves of February 20 are plotted in black and the waves of March 6 in gray. In this figure, it is hard to differentiate the superposed waves of the track geometry parameters. The reasons are that the milepoint measurements are largely corrected by the milepoint correction models, namely, KEI and DSPM, and that most track positions normally deteriorate only slightly within a short period of time, that is, 16 days in the demonstrated example.

Statistical Analyses of Prediction Errors.
After each inspection run of the track geometry car, track conditions within the following two months, namely, the values of each geometrical parameter on the sampling points in the analyzed track section, were predicted by the original and enhanced models, respectively. The length of the time period within which track geometry data are processed to calculate the deterioration rate was determined for the original model according to the experience of field engineers and takes the value of four months, whereas the value for the enhanced model varies. After the inspection on October 30, track condition predictions are made for each day from October 31 to December 31, 2008. Within this date range, there are three inspections by the track geometry car, on November 13, December 12, and December 25.
Figure 5 plots predicted values versus actual values on December 12 for each geometry parameter. The predicted values are plotted along the vertical axis, whereas the actual ones are along the horizontal axis. Table 1 tabulates three statistical indices of errors in predictions by the original and enhanced models on these three days for each of the geometrical parameters.

The plots in Figure 5 clearly show that the predicted values for each geometry parameter are close to the actual ones. For gauge, 95 percent of the predicted values have errors less than 0.2717 mm; the errors of 95% of the predictions for crosslevel, left surface, right surface, left alignment, right alignment, and twist are less than 1.1409 mm, 1.4894 mm, 1.2346 mm, 0.7765 mm, 0.8202 mm, and 1.6558 mm, respectively.
Table 1 shows that the mean and standard deviation for each geometry parameter are far below the corresponding theoretical measurement accuracy of the track geometry car. Figure 6 shows the standard deviations in Table 1 as vertical bars for each geometry parameter. From these plots, it is apparent that the standard deviations associated with the enhanced model are smaller than those of the original model. More importantly, the standard deviations of the original model basically increase as the time span between the date of the available inspection, for example, October 30, and the date for which predictions were made, for example, November 13, lengthens, whereas the standard deviation of the enhanced model stays stable over the whole time span. This implies that, compared with the original model, the enhanced model is robust.
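The error indices used in this section (mean and standard deviation of the errors, the correlation coefficient between predicted and actual values, and the bound that 95% of the absolute errors stay within) can be computed as in the sketch below. The function name and the sample numbers are hypothetical, and the use of the population standard deviation and of the Pearson correlation coefficient is an assumption, since the text does not state which estimators were used.

```python
import math
from statistics import mean, pstdev


def error_statistics(predicted, actual):
    """Summarize prediction errors (predicted minus actual) for one parameter."""
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two equally long series with at least two values")
    errors = [p - a for p, a in zip(predicted, actual)]

    # Pearson correlation coefficient between predictions and measurements.
    p_bar, a_bar = mean(predicted), mean(actual)
    cov = sum((p - p_bar) * (a - a_bar) for p, a in zip(predicted, actual))
    var_p = sum((p - p_bar) ** 2 for p in predicted)
    var_a = sum((a - a_bar) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for a constant series")
    corr = cov / math.sqrt(var_p * var_a)

    # Smallest bound that at least 95% of the absolute errors do not exceed.
    abs_errors = sorted(abs(e) for e in errors)
    rank = math.ceil(0.95 * len(abs_errors))
    return {
        "mean": mean(errors),
        "std": pstdev(errors),
        "corr": corr,
        "abs_error_95": abs_errors[rank - 1],
    }


# Hypothetical predicted and measured gauge values (mm) on one day.
predicted = [1.0, 2.1, 2.9, 4.2]
measured = [1.1, 2.0, 3.0, 4.0]
print(error_statistics(predicted, measured))
```

Running the function once per geometry parameter and per inspection date would give one row of statistics for each model, which is the layout that the comparison between the original and enhanced models relies on.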

Discussion about Predictions Usage for Optimal Track Maintenance Scheduling
After each inspection run of track geometry cars, the inspection data are used to calculate two categories of track condition indices: track geometrical exceptions and the Track Quality Index (TQI) [29-31]. A track geometrical exception is characterized by its geometry parameter, maximum value, exception class, and length. Track geometrical exceptions fall into four classes according to their maximum values: I, II, III, and IV. The lower and upper limits of each exception class for each geometry parameter within a speed range are specified by the Ministry of Railways (MOR) in the Railway Line Maintenance Regulations. Figure 7 depicts an exception of class III. For geometry exceptions of each class, MOR recommends certain measures to be taken. TQI is calculated for a track unit section and is the sum of the standard deviations of the seven geometry parameters over the unit section [19]. MOR also specifies the management thresholds of TQI for each class of railroads, as listed in Table 2. MOR recommends that track unit sections with a TQI greater than the corresponding threshold be considered when scheduling track tamping maintenance.
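The TQI definition above translates directly into a short computation. The sketch below is illustrative only: the function names, the dictionary layout, and the threshold value are hypothetical, and the population standard deviation is an assumption because the text does not specify the estimator.

```python
from statistics import pstdev

# The seven geometry parameters named in the performance analysis.
GEOMETRY_PARAMETERS = (
    "gauge", "crosslevel", "left_surface", "right_surface",
    "left_alignment", "right_alignment", "twist",
)


def track_quality_index(section):
    """Sum of the standard deviations of the seven geometry parameters.

    section -- mapping from parameter name to the list of values (mm)
               measured or predicted over one track unit section
    """
    missing = [name for name in GEOMETRY_PARAMETERS if name not in section]
    if missing:
        raise KeyError(f"missing geometry parameters: {missing}")
    return sum(pstdev(section[name]) for name in GEOMETRY_PARAMETERS)


def needs_tamping_review(tqi, threshold):
    """True when the TQI exceeds the management threshold for the line class."""
    return tqi > threshold


# Hypothetical unit section in which every parameter alternates between 0 and 2 mm.
section = {name: [0.0, 2.0, 0.0, 2.0] for name in GEOMETRY_PARAMETERS}
tqi = track_quality_index(section)
print(tqi, needs_tamping_review(tqi, threshold=10.0))  # threshold is made up
```

Applying the same function to predicted rather than measured values gives the TQI for a future date, which is how the predictions feed into tamping schedules.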
A program implementing the enhanced model has been coded by the authors. The two categories of track condition indices are therefore available two or three months in advance. According to the Railway Line Maintenance Regulations, the maintenance works of the coming two or three months are then known in advance. Given the maintenance works of two or three months in a planning horizon, they can be optimally scheduled so as to minimize, over the planning horizon, the sum of the cost of travel between maintenance sites and between maintenance sites and depots, the cost of the maintenance works themselves, and the cost of the impact that completing the works has on rail transportation. The objective function of the optimal scheduling problem is subject to the constraints of available maintenance resources, including maintenance machines, required materials, crew members, and the track windows left in the train timetable.

Conclusions and Following Research Areas
As railroad infrastructure ages and rail transportation moves toward higher speeds and heavier axle loads, the risk to safe rail transport and the expenses for railroad infrastructure maintenance are increasing. A railroad infrastructure deterioration (prediction) model is vital to reducing the risk and the expenses. This paper enhanced our previous railroad track prediction model. The previous model considers the length of the historical period (within which track geometry data from track geometry cars are used to estimate track deterioration rates) to be constant for all analysis sections in a railroad track. The value of the length for the previous model is determined according to field engineers' knowledge. Comparison analyses between track geometry exceptions calculated from track condition measurement data from track geometry cars and those calculated from predictions by the model imply that the method cannot provide reliable track condition predictions for all analysis sections. The enhanced model, according to the track deterioration process revealed by lining up historical track geometry data, employs the minimum sum of squared differences between prediction values and measurement values to estimate the optimal length of the historical period for each analysis section. One year of track geometry data for the Jiulong-Beijing railroad was used in the performance analysis section to perform error analyses and comparison analyses between the original and enhanced models. The analysis results show that the enhanced model makes more accurate and more robust predictions than the original model.

Mathematical Problems in Engineering

Figure 1: Railroad infrastructure expenses per mile for the last ten years.

Figure 2: Two groups of track deterioration process.

3.2.1. Method for Predicting Track Condition. As noted in Section 2, track condition is the result of the cumulative comprehensive influences of all impact factors. Track condition evolution for an analysis section is mathematically expressed as the equation $C(t) = C(t_k) + \int_{t_k}^{t} r(\tau)\,d\tau$, where $C(t)$ and $C(t_k)$ represent track condition at the time points $t$ and $t_k$ (wherein the subscript $k$ denotes the $k$th inspection of track geometry cars after a maintenance work covering the analysis section), respectively, and $r(\tau)$ denotes the deterioration rate of track condition at the time point $\tau$. The deterioration rate function $r(\tau)$ quantifies the comprehensive effects of all impact factors at a point in time. The cumulative influences over the time range from $t_k$ to $t$ are modeled through the integral of the deterioration rate function over the time period, that is, $\int_{t_k}^{t} r(\tau)\,d\tau$. The equation shows that accurate estimation of the deterioration rate function is crucial to accurately predicting track condition.

Figure 3: The formulated prediction model at the time point $t_k$.

Figure 4: Superposed waves of track geometry data on February 20 and March 6.

Figure 5: Comparison between actual values and predictions by the enhanced model on December 12.

Figure 6: Comparison of standard deviations of prediction errors between the original and enhanced models.

Determination of the Analysis Section. Track geometry measurement data from track geometry cars are usually used to analyze track deterioration [5]. Track geometry cars, running at train travelling speed, measure various track geometry parameters, positioning parameters (i.e., milepoint), and comfort-related parameters at a constant sampling distance.

Table 1: Statistics about errors in predictions by the original and enhanced models.

Table 2: TQI management thresholds specified by MOR.

From Table 1, it is clear that the mean of the errors in predictions by the enhanced model for each parameter on each day is almost identical to that of the original model and approximately equals 0. As for the standard deviation, the values for the enhanced model are smaller than those for the original model, whereas the correlation coefficient associated with the enhanced model is greater than that of the original model. These results imply that, in comparison with the original model, the enhanced model makes more accurate predictions.