Railroad Track Deterioration Characteristics Based Track Measurement Data Mining

Accurate information on future railroad track condition is essential for optimally scheduling track Maintenance & Renewal activities so as to minimize their influence on rail traffic under the constraints of limited budgets and of maintaining tracks in allowable condition. To this aim, this paper presents a track measurement data mining method developed on the basis of track deterioration characteristics. Actual track measurement data is used to analyze errors in the track condition predictions made by the method. The analysis results show that the proposed method can mine accurate track deterioration rates from historical track measurement data and can thus accurately provide future track condition two or three months in advance.


Introduction
Traffic accidents have long been a socioeconomic problem that has caused increasing concern to the public worldwide [1]. According to statistics on train accidents from the Office of Safety of the US Federal Railroad Administration, 542933 people were injured or killed in railway accidents, mainly resulting from railway track, from January 1975 to May 2011 [2].
Transportation systems play a critical role in the development of society and the economy. Railway systems account for the largest share of national freight ton-miles, for example, 38.2% in 2005 in the USA [3] and 49.70% in 2005 in China [4]. Railroad track, as a base element of the railway system, greatly and directly influences the safety and cost efficiency of rail transport. In the process of track management, maintenance-of-way departments have to balance the cost associated with potential damage arising from unfavorable tracks against the cost of Maintenance & Renewal activities in order to minimize the life cycle cost of track. To attain this minimization, several key issues need to be addressed. One of them is railroad track condition forecast technology that allows maintenance-of-way departments to acquire accurate track condition information two or three months in advance. Such information is essential for optimally scheduling Maintenance & Renewal activities, constrained by limited budgets and by maintaining track in allowable condition, so as to minimize the influence of the activities on rail traffic.
To date, several track condition prediction methods have been developed throughout the world by researchers at universities, technology firms, and railroads. Researchers of the Railway Technical Research Institute of Japan employed the Double Exponential Smoothing method to develop track degradation models for predicting standard deviations of track surface and alignment over 100-meter-long sections of track [5]. Alfelor and Fateh of the United States Department of Transportation and Carr of ENSCO Inc. built a track degradation database that stores measurements from the Gage Restraint Measurement System, including track gauge restraints and track geometry parameters, together with contributing factors covering traffic loads, environmental factors, and track structural characteristics [6]. Using the degradation database, a track degradation analysis program was coded that establishes a one-to-one linear relationship between track degradation and a specified contributing factor by least squares linear regression. Chen et al. of the Academy of Railway Sciences of China proposed an Integrated Factor Method (IFM) to predict next month's values of track geometrical parameters over sampling points [7]. IFM assumes that the track geometrical condition in the next month is linearly dependent on the current month's track geometrical condition. The linear evolution rates of a given geometrical parameter are determined according to standard deviations of the geometrical parameter over 200-meter-long sections of track. After analyzing the determined evolution rates of a track, Chen et al. categorized the evolution rates of a given track geometrical parameter over track sections into 17 groups, each of which has a constant evolution rate. To make the prediction, measurements from the latest run of the track geometry car are used.
Based on a common denominator, "A good track behaves well, while a poorer one deteriorates faster," Veit of the Graz University of Technology and Marschnig, an LCC rail consultant, of Austria considered the relationship between initial and future track condition to be linear and the relationship between future track condition and elapsed time since the last ballast tamping, cleaning, or renewal to be exponential, and thus proposed an exponential model to predict condition over 5-meter-long sections of track between two adjacent maintenance operations of given kinds [8]. Quiroga and Schnieder of the Braunschweig University of Technology of Germany proposed an exponential model, like the model of Veit and Marschnig, to predict mean deviations of track surface over 200-meter-long sections of track between two adjacent tamping activities [9]. Based on the experience of experts in railroad track deterioration, Meier-Hirmer, Riboulet, and Sourget of the SNCF and Roussignol of the Université Paris-Est Marne-la-Vallée of France employed a gamma stochastic process to fit the deterioration rate of track surface over 1000-meter-long sections of track [10]. According to track deterioration characteristics, Xu et al. proposed a multistage linear method to describe track condition deterioration processes between two adjacent maintenance activities [11]. Based on the research results in [17], Xu et al. employed piecewise linear regression to develop a method for predicting mean values and standard deviations of track condition over unit sections in the next two or three months [12].
According to the track deterioration characteristics, this paper presents a novel method that mines historical track measurements for track condition in the next two or three months. The rest of the paper is organized as follows. Section 2 briefly discusses the track deterioration characteristics. Based on these characteristics, the novel track measurement data mining method follows in Section 3. Using actual track measurements, the performance of the presented method is analyzed in Section 4. Conclusions are drawn, and future research areas related to the current research topic are briefly discussed, in Section 5.

Track Deterioration Characteristics
Generally speaking, railroad track deteriorates as a result of the cumulative, combined influences of seven categories of impact factors [13][14][15][16]: (1) wheel loads on the rails, (2) track characteristics, (3) materials and manufacture, (4) design and construction, (5) maintenance, (6) environment, and (7) terrain. The influence of the wheel loads is by far the primary cause of track deterioration. The track characteristics, that is, track configurations and the condition of track components, play a critical role in resisting track deterioration and greatly affect dynamic wheel loads. Track deterioration usually begins with small imperfections in the materials and errors in the manufacture of rails and other track components, and the performance of the materials and the efficacy of the manufactured components are crucial for maintaining track in good condition. Errors during the design and construction of track add to the influences of the materials and manufacture as the initial source of track deterioration. During a maintenance operation, survey errors, measurement errors, and maintenance machine tolerances may introduce additional track deviations. Moreover, different kinds of maintenance machines usually differ in effectiveness. During track deterioration, in addition to the wheel loads, environmental factors directly deteriorate track as well. As the base of railroad track, terrain has a direct and great influence on track deterioration. Any variations in terrain will be reflected in rapid deterioration of track.
During track deterioration, some of these categories of the impact factors interact with each other. For instance, wheel loads deteriorate a track in terms of condition of track components and performances of the materials of the track components. Simultaneously, the deteriorating track increases the wheel loads and reduces the resistance to the deterioration. Such interaction between these three categories of the impact factors continues as trains run over the track.
The above brief introduction shows that many kinds of impact factors affect track deterioration. The combined influences of all the impact factors vary from one track location to another. In other words, each track point location has its own unique track deterioration process. This uniqueness of track deterioration has been qualitatively confirmed by the past several decades of track management practice. To date, only a few of the impact factors are measurable, and the interactions among the impact factors are unmeasurable. Mainly because of these two facts, track deterioration is usually considered random when modeled.

Track Measurement Data Mining Method
Based on the above introduced track deterioration characteristics, this section will present a track measurement data mining method, which allows maintenance-of-way departments to acquire track condition two or three months in advance. To this aim, brief introductions of track condition and the corresponding measurement data are given first. The mining method is presented last.

Track Condition and the Corresponding Measurement Data.
Track condition is described by track geometrical condition and track structural condition [17]. However, the term track condition usually refers only to track geometrical condition, and our research follows this convention. Track condition is described by eight geometrical parameters [16]: Gauge, Cross Level, Left/Right Surface, Left/Right Alignment, Twist, and Curvature. These parameters of tracks under wheel loads are usually measured at a specified sampling interval with a Track Geometry Car. Within China Railroads, there are four kinds of Track Geometry Cars. GJ-4 is the most extensively used kind, and its sampling interval is 0.25 m. In addition to the eight geometrical parameters, measurement data from GJ-4 also includes two other categories of parameters: positioning and comfort. There are two positioning parameters, Milepoint and Auto Location Detection, and four comfort parameters: lateral and vertical box and axle accelerations.

Characteristics Based Track Measurement Data Mining.
As pointed out in Section 2, track deteriorates uniquely at each track point location. Ideally, therefore, track condition data should be mined on a track-location basis. In reality, it is impossible to obtain actual values of geometrical parameters at individual track locations, mainly because of errors in the milepoint measurements of track measurement data [18,19]. This means that track condition measurements cannot be mined on a point basis. To model track deterioration as accurately as possible, track deterioration modeling should consequently be done on a short track section (referred to as a unit section hereafter), whose length is determined by the accuracy of milepoint measurements. In previous research, two fine levels of milepoint error correction model have been developed. After being processed with the two correction models, the track condition measurement data most often achieves a milepoint accuracy far below two sampling intervals, that is, 0.5 m. Accordingly, the length of the unit section on which track deterioration is modeled is 0.5 meters. Except for unit sections that cover badly damaged rail joints, such a track length is reasonable for track deterioration modeling, because track within 0.5 m deteriorates in basically the same way.
For a given unit section, the track between two adjacent maintenance operations deteriorates nonlinearly under the cumulative influences of the combined impact factors. Within a short period of time, however, the track deterioration can be considered approximately linear. In other words, within the short period of time, the track has an approximately constant deterioration rate. It is important that the length of the short time period match the cumulative, combined influences of all the impact factors. Therefore, if the deterioration rate of a unit section in a short period of time is available, track condition over the unit section can be forecast. Because track condition is described with 8 geometrical parameters (see Section 3.1), the track deterioration rate of a unit section is likewise characterized by 8 deterioration rates, one for each geometrical parameter.
Because there are at least two sampling points on a unit section, the mean of the measurement values of a specified geometrical parameter over the section is used to represent the value of the geometrical parameter over the section. This processing has two main benefits: (1) it reduces the adverse effect of remaining milepoint errors and (2) it tolerates the negative effect of noise in geometrical measurements. For a given geometrical parameter on a given unit section, its deterioration rate within a short period of time can be obtained by employing the least squares method to fit the values of the parameter in the time period. Unfortunately, in the forecast scenario, the future values of the geometrical parameter are unavailable. It is known from practice, however, that a section of track that has been deteriorating rapidly continues to deteriorate rapidly, and a section that has been deteriorating slowly continues to deteriorate slowly. The reason is that two categories of impact factors, the track characteristics and the terrain, are basically unchanged within a short period over a section of track on which no maintenance operations are carried out. This practical knowledge indicates that, for a specified geometrical parameter over a given unit section, the deterioration rate in the next several months can be estimated from the historical values of the parameter over the section. The date range within which the historical values are used to make the estimation must match the track deterioration characteristics.
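As a minimal sketch of the section averaging described above, the following Python groups the sampling points of one inspection into 0.5 m unit sections and averages one geometrical parameter over each section. The input layout of (milepoint, value) pairs, the function name `section_means`, and the floor-based section indexing are assumptions made for illustration and are not taken from the original method.

```python
from collections import defaultdict

SAMPLING_INTERVAL_M = 0.25  # GJ-4 sampling interval stated in the text
UNIT_SECTION_M = 0.5        # unit-section length stated in the text


def section_means(samples, section_length_m=UNIT_SECTION_M):
    """Average one geometrical parameter over each unit section.

    samples: iterable of (milepoint_m, value) pairs from one inspection,
             with milepoints already corrected (hypothetical input layout).
    Returns {section_index: mean value}, where section_index is
    floor(milepoint_m / section_length_m) (an assumed indexing scheme).
    """
    buckets = defaultdict(list)
    for milepoint_m, value in samples:
        index = int(milepoint_m // section_length_m)
        buckets[index].append(value)
    # One mean per unit section, in order of increasing milepoint.
    return {index: sum(vals) / len(vals) for index, vals in sorted(buckets.items())}


# Four sampling points at 0.25 m spacing fall into two 0.5 m unit sections.
example = [(0.0, 1.0), (0.25, 3.0), (0.5, 2.0), (0.75, 4.0)]
means = section_means(example)
```

With a 0.25 m sampling interval, each 0.5 m unit section receives two sampling points, which is the minimum the averaging relies on.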
Let $D_{k,t}$ be the number of days between the day when the last maintenance operation was carried out on a track section covering the given unit section and the day $t$ days after the $k$th track geometry car inspection since that maintenance, and let $y^{i}_{k,t}$ be the value of the specified geometrical parameter at the $i$th sampling point in the given section $D_{k,t}$ days after the last maintenance operation. Assume that the number of sampling points in the given section is $n$. Let $\bar{y}_{k,t}$ denote the value of the specified parameter over the given section; that is,

$$\bar{y}_{k,t} = \frac{1}{n}\sum_{i=1}^{n} y^{i}_{k,t}.$$

Assume that the $k$th track geometry car inspection is the current inspection and that the date range within which the historical values are used to estimate the deterioration rate is from $D_{h,0}$ through $D_{k,0}$. Accordingly, the deterioration rate $r$, which is used to forecast track condition, is obtained as the least squares slope of the section values $\bar{y}_{j,0}$ against $D_{j,0}$, $j = h, \ldots, k$:

$$r = \frac{(k-h+1)\sum_{j=h}^{k} D_{j,0}\,\bar{y}_{j,0} - \sum_{j=h}^{k} D_{j,0}\sum_{j=h}^{k} \bar{y}_{j,0}}{(k-h+1)\sum_{j=h}^{k} D_{j,0}^{2} - \left(\sum_{j=h}^{k} D_{j,0}\right)^{2}}.$$

Track condition in the future can be considered the sum of the current condition and the cumulative, combined effects of all impact factors, which are quantified by the deterioration rate $r$. Therefore, the value of the specified parameter on day $D_{k,t}$ at the $i$th sampling point in the given section can be estimated as

$$\hat{y}^{i}_{k,t} = y^{i}_{k,0} + r\,t,$$

where $\hat{y}^{i}_{k,t}$ denotes the estimate of $y^{i}_{k,t}$. Let $Y_{k,t} = [y^{1}_{k,t}, \ldots, y^{n}_{k,t}]$ and let $\hat{Y}_{k,t}$ be the estimate of $Y_{k,t}$; that is, $\hat{Y}_{k,t} = [\hat{y}^{1}_{k,t}, \ldots, \hat{y}^{n}_{k,t}]$. The above process of mining historical measurement values of the specified track geometrical parameter for future track condition is demonstrated graphically in Figure 1.

As noted previously, on a unit section, the track deteriorates nonlinearly under the cumulative, combined influences of the impact factors. To approximate actual track deterioration rates as closely as possible, the estimated deterioration rates should be revised continuously as the impact factors deteriorate the track. Therefore, after a new track condition inspection has been carried out, the inspection data is included in the revision of the deterioration rate, as demonstrated in Figure 2.
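The least squares fit and the linear extrapolation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names `deterioration_rate` and `forecast_points`, the list-based inputs, and the sample numbers are all hypothetical.

```python
def deterioration_rate(days, section_values):
    """Least squares slope of section mean values against elapsed days.

    days:           days since the last maintenance for each historical
                    inspection in the chosen date range.
    section_values: unit-section mean of one geometrical parameter at
                    each of those inspections.
    """
    m = len(days)
    if m < 2 or m != len(section_values):
        raise ValueError("need at least two matching (day, value) pairs")
    mean_d = sum(days) / m
    mean_y = sum(section_values) / m
    sxx = sum((d - mean_d) ** 2 for d in days)
    if sxx == 0:
        raise ValueError("inspection days must not all be identical")
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in zip(days, section_values))
    return sxy / sxx


def forecast_points(current_point_values, rate, days_ahead):
    """Extrapolate each sampling point of a unit section linearly.

    Every sampling point in the section shares the section's rate and
    starts from its own value at the current inspection.
    """
    return [y + rate * days_ahead for y in current_point_values]


# Hypothetical history: four inspections, 30 days apart (values in mm).
rate = deterioration_rate([0, 30, 60, 90], [1.0, 1.3, 1.6, 1.9])
# Forecast two sampling points 60 days after the current inspection.
predicted = forecast_points([1.8, 2.0], rate, 60)
```

Revising the rate after a new inspection amounts to calling `deterioration_rate` again with the new inspection appended and the oldest inspections outside the date range dropped.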

Performance Analysis
The Kowloon-Beijing railroad track is very important in the Chinese rail network. It connects Kowloon to Beijing through 9 provinces. By million ton-kilometers, it ranks sixth in the entire Chinese rail network; by million passenger-kilometers, it ranks fourth. In this section, errors in track condition predictions for a 2-kilometer section of this track are analyzed. On the presented track section, 60 U71Mn rails were laid and continuously welded, where 60 indicates that 1 meter of such rail weighs 60 kilograms and U71Mn is the rail material. The ties on this section are concrete, of model II. The ballast is first-class granite, and the thickness of the ballast layer is 50 mm.
We discussed with field engineers the length of the short time period within which the track deteriorates approximately linearly and set it to 6 months. Considering the traffic characteristics of the track section, track condition is best predicted two months ahead. Because the 6-month period must cover both the historical data and the 2-month forecast horizon, track measurement data from the last four months is used to estimate the deterioration rate.
To measure the performance of the proposed method, four statistical indices are calculated: the mean error, the standard deviation of the errors, the mean absolute error, and the correlation coefficient between measured values and predicted values.
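The window arithmetic above (a 6-month linear period split into 4 months of history and a 2-month forecast horizon) can be sketched as a simple filter over the inspection history. The 30-day month, the constant names, the function name `history_window`, and the sample inspections are assumptions for illustration only.

```python
DAYS_PER_MONTH = 30                  # assumed month length for illustration
LINEAR_PERIOD_DAYS = 6 * DAYS_PER_MONTH   # period of approximately linear deterioration
FORECAST_DAYS = 2 * DAYS_PER_MONTH        # prediction horizon
HISTORY_DAYS = LINEAR_PERIOD_DAYS - FORECAST_DAYS  # 4 months of history


def history_window(inspections, window_days=HISTORY_DAYS):
    """Keep inspections no older than window_days before the latest one.

    inspections: list of (days_since_maintenance, section_value) pairs
                 (hypothetical layout).
    """
    if not inspections:
        return []
    latest_day = max(day for day, _ in inspections)
    return [(day, value) for day, value in inspections
            if latest_day - day <= window_days]


# Hypothetical inspections every 45 days since the last maintenance.
inspections = [(0, 1.0), (45, 1.2), (90, 1.5), (135, 1.9), (180, 2.2), (225, 2.6)]
recent = history_window(inspections)  # inspections from the last 120 days
```

Only the retained inspections would then be passed to the least squares fit of the deterioration rate.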
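The four indices can be computed as in the following sketch. The sign convention for the error (predicted minus measured), the use of the sample standard deviation with an n - 1 denominator, and the function name `error_statistics` are assumptions, since the text does not specify them; the sample values are hypothetical.

```python
import math


def error_statistics(measured, predicted):
    """Mean error, error standard deviation, MAE, and Pearson correlation."""
    n = len(measured)
    if n < 2 or n != len(predicted):
        raise ValueError("need at least two matching value pairs")
    # Assumed convention: error = predicted - measured.
    errors = [p - m for m, p in zip(measured, predicted)]
    mean_error = sum(errors) / n
    # Sample standard deviation (n - 1 denominator, an assumption).
    std_error = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1))
    mean_abs_error = sum(abs(e) for e in errors) / n
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant data")
    return {
        "mean_error": mean_error,
        "std_error": std_error,
        "mean_abs_error": mean_abs_error,
        "correlation": cov / math.sqrt(var_m * var_p),
    }


# Hypothetical measured and predicted values in mm.
stats = error_statistics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

A mean error near zero with a small standard deviation indicates unbiased, tightly clustered predictions, while the correlation coefficient shows how well the predicted profile follows the measured one.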

Right Alignment.
Errors in each prediction are presented in histograms, as shown in Figure 3. The four statistical indices are also calculated for the prediction on each target date, as listed in Table 1.
From these 6 histograms, it is concluded that errors of each prediction are normally distributed around 0 mm. The normal distributions of errors indicate that the proposed track measurement data mining method is able to capture the track deterioration trend components and then to use the captured deterioration trends, that is, the deterioration rates, to make predictions.
The above conclusions drawn from the error distributions are quantitatively verified by the calculated statistical indices. The mean error for each prediction is very close to 0 mm; the standard deviation for each prediction is far below 1 mm. From these statistics, it can be stated with confidence that the proposed method can accurately predict right alignment values two months in advance.

Twist.
Errors in each twist prediction are presented in histograms, as shown in Figure 4. The statistical indices of errors are worked out for each prediction, as listed in Table 2.

Mathematical Problems in Engineering
Histograms in Figure 4 show that errors in the twist predictions are normally distributed around 0 mm. These facts indicate that deterioration trends in the twist are captured by the proposed method. Statistical indices in Table 2 confirm the conclusions drawn from Figure 4. The mean error for each prediction is very close to 0 mm; the standard deviation is far below 1 mm. From these statistics, it is straightforward to infer that the proposed method can accurately predict values of twist over sampling points two months in advance.

Conclusions and Future Research Areas
In this paper, a track measurement data mining method has been proposed. The method can mine the track condition deterioration trends that are essential for predicting track condition. Track condition measurement data of the Kowloon-Beijing railroad track was used to analyze errors in the track condition predictions made by the method. The analysis results show that the proposed method can accurately predict values of geometrical parameters over sampling points. The analysis of the track deterioration characteristics shows that if the length of the time period (see Section 3.2) can be determined in accordance with actual track deterioration processes, errors in track condition predictions can be reduced further, making the predictions more reliable. This direction for improving the proposed method will therefore be investigated. Moreover, the distributions of errors in track condition predictions suggest that incorporating a normal random variable into the proposed method could improve it further.