This paper concerns the development of a condition monitoring system for railway switches and crossings that utilizes vibration data. Successful utilization of such a system requires robust and efficient train type identification. Given the complex and unique dynamic response of any vehicle-track interaction, machine learning was chosen as a suitable tool. Real on-site acceleration data were used for the design and validation of the system. The resulting theoretical and practical challenges are discussed.

A key and irreplaceable part of every railway track is its switches and crossings (S&C). In terms of dynamic effects, these are some of the most heavily loaded track sections: they not only interrupt the continuity of the running surface but also introduce a change in track stiffness. S&C represent only a small part of a railway network in terms of track length; however, their maintenance (which includes special rail structures such as road crossings) can be costly relative to that of conventional track [

For the above reasons, condition monitoring of railways (not only S&C) is a very current topic. In recent years, various sensors and methodologies for measuring and evaluating results have been developed [

According to [

This paper is focused on the first part of the self-diagnostic system for railway switches and crossings (S&C)—Train Identification System (TIS). A similar system for predictive maintenance for rail switches is, for example, Konux [

TIS is based on real on-site data from acceleration sensors, and embedded vibration acceleration sensors are expected to be used in the future. Accelerometers provide various benefits, including temperature stability (over a wide range of temperatures), wide frequency response, linearity, adaptability, and ruggedness. As such, they are suitable for fully operational long-term online measurement. Different dynamic effects can be observed for each train type [

This research was part of an initiative whose aim is to investigate, develop, validate, and initially integrate radically new concepts for switches and crossings that have the potential to lead to increases in capacity, reliability, and safety while reducing investment and operating costs.

The first part of the article is dedicated to the description of measurement of the data, selection of datasets, their analysis, and building of a vector for machine learning. The second part deals with the application of support vector machine and validation of the results.

The data used were collected during several measurement campaigns that took place in 2013 and 2014. The measurements were made primarily at two locations, Choceň and Ústí nad Orlicí, on two S&C at each location. The accelerometers were mostly placed around the crossing because the dynamic effects on rails and bearers during train passage are greatest there. The placement of the sensors is shown in Figures

Measurement methodology used for data acquisition from the crossing part of the S&C.

On-site installation of the sensors according to measurement methodology above.

All data were acquired with a Dewetron DEWE 2502 measuring system and two types of acceleration sensors: triaxial piezoelectric Brüel & Kjær 4524 B001 (for the rail) and piezoelectric Brüel & Kjær 4507 B004 (for the bearer). The sampling frequency was set to 10 kHz, the high-pass filter frequency to 3 Hz, and the low-pass filter frequency to 1000 Hz [

The train speed was measured with a Bushnell radar speed gun.

Acceleration is measured at several points along the crossing. The observed quantity is the vertical acceleration of the bearer under the crossing nose, as this is the point at which the greatest dynamic effects on bearers occur. Undoubtedly, any damage to the trackbed or the crossing would influence the frequency response. Figure

Acceleration plot of the bearer under the crossing nose.

The full dataset consists of over 100 complex measurements (in addition to the acceleration, which was measured at several S&C locations, train speed and rail displacements were also measured), taken from trains passing through crossings at a number of stations. However, building a successful classifier requires data that were obtained under the same or very similar conditions. In Figure

Comparison of data vectors from Ústí nad Orlicí (rows 1 and 2) and Choceň (rows 3 and 4).

The available dataset was able to meet the requirements mentioned for only four trains. However, the number of measurements was still sufficient to build the minimum number of data subsets for training and testing. The mechanical properties of the trains are given in Table

Mechanical properties of the trains.

Locomotive class | 151 | 362 | 380 | Leo |
---|---|---|---|---|
Distance between pivots (m) | 8.3 | 8.3 | 8.7 | 16.0 |
Axle spacing (m) | 3.2 | 3.2 | 2.5 | 2.7 |
Max. axle loading (t) | 20.5 | 21.75 | 21.5 | - |
Weight (t) | 82.0 | 87.0 | 86.0 | 150.0^{1} |

^{1}Total weight of the whole five-car unit.

Train types: (a) 151, (b) 362, (c) 380, and (d) Leo Express.

The three locomotives (151, 362, and 380) are very similar in terms of both geometry and design. They were made by the Czech industrial conglomerate Škoda Works. All of them are electric; however, class 151 can be powered only by direct current (3 kV), while 362 and 380 are adapted for other standardised voltages and currents (362 is equipped with a dual system, 3 kV DC/25 kV 50 Hz, and 380 with a triple system, 3 kV DC/25 kV 50 Hz/15 kV 16.7 Hz). The maximum speed is 160 km/h for class 151, 140 km/h for 362, and 200 km/h for 380. Locomotives 151 and 380 have the same fixed wheelbase and pivot spacing.

The Leo Express (LE) train is a Stadler Flirt IC five-car electric multiple unit. Its signal should therefore always have 12 peaks, one per axle. The major difference between the LE and the previously mentioned trains is the bogie arrangement: the LE has two powered bogies (one at each end of the train) and 4 Jacobs bogies [

Three groups of methods were considered: (i) complex time-frequency methods, (ii) methods based on statistical processing, and (iii) a combination of the two. The first group analyses the signal simultaneously in both the time and frequency domains. There are several time-frequency distribution functions, such as the wavelet transform (WT), Wigner–Ville transform (WVT), and short-time Fourier transform (STFT). With these methods, it is possible to analyse the structure’s frequency response in sufficient detail to reveal minor differences between individual signals that can indicate vehicle and track faults. However, the major disadvantage of these methods is their significant demand for data throughput and, hence, for computing resources. This is problematic when attempting to ensure long-term in situ measurements for multiple S&C. Expensive sensors are also necessary to ensure high signal quality; however, this may not align with other deployment objectives.

The second group of analysis methods can be used as an alternative, and these are based on statistical processing. For example, it is possible, with these methods, to evaluate the maximum amplitudes, as well as their count, standard deviation, and long- and short-term variance. This group of methods is, in essence, the opposite of the time-frequency methods because they have little sensitivity to imperfect input signals, their computational difficulty is negligible (in comparison with the first group of methods), and the device built as a result can be inexpensive. However, the main disadvantage of the second group is that there is limited information in the frequency domain, meaning that the detection of any defects might be too late to be of use. Nonetheless, the time domain of the signal provides very accurate information.

The methods in the third group combine the two previous approaches: the time domain of the signal is analysed using statistical methods, and in the identified areas of interest (for example, the axles with maximum amplitude), a simple frequency analysis is conducted using the frequency spectrum of the selected signal subsection and its statistical properties. This approach is advantageous for our research because it is computationally economical while still describing the signal adequately.

The use of statistical methods was inspired by previous research [

As a truly economical system, this can easily be scaled and expanded to other variables (as demonstrated in Figure

LEO Express train signal windowed variance (three passages). Window size: number of samples/300. Top: windowed variance value. Middle: windowed maximum. Bottom: windowed mean.

Peak detection of windowed variance. Window size: number of samples/300. Defines the area in the time domain; the analysis is then conducted in the frequency domain.
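The windowed-variance step and the subsequent peak detection can be illustrated with a minimal Python sketch. It assumes non-overlapping windows of length "number of samples/300" and a simple local-maximum rule with a relative height threshold; the authors used the R `findpeaks` function, so `find_peaks`, `synthetic_passage`, and the 10% threshold below are hypothetical stand-ins, and the signal is a toy record rather than measured data.

```python
import math
import statistics


def windowed_variance(signal, n_windows=300):
    """Variance of consecutive, non-overlapping windows (an assumed windowing scheme).

    The window length is len(signal) // n_windows, following the
    'number of samples/300' rule given in the figure captions.
    """
    size = max(2, len(signal) // n_windows)
    return [
        statistics.pvariance(signal[i:i + size])
        for i in range(0, len(signal) - size + 1, size)
    ]


def find_peaks(values, min_height):
    """Indices of local maxima above min_height (simplified stand-in for R findpeaks)."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i] > min_height
        and values[i] > values[i - 1]
        and values[i] >= values[i + 1]
    ]


def synthetic_passage(n_samples=6000, axle_positions=(1000, 2000, 4000, 5000)):
    """Toy acceleration record: a short oscillating burst at each axle position."""
    signal = []
    for n in range(n_samples):
        envelope = sum(math.exp(-((n - p) / 60.0) ** 2) for p in axle_positions)
        signal.append(envelope * math.sin(2 * math.pi * n / 8.0))
    return signal


signal = synthetic_passage()
variance = windowed_variance(signal)
# Hypothetical threshold: keep only peaks above 10% of the largest variance.
peaks = find_peaks(variance, min_height=0.1 * max(variance))
n_peaks = len(peaks)                          # interpreted as the number of axles
peaks_sum = sum(variance[i] for i in peaks)   # sum of the detected peak values
print(n_peaks, peaks)
```

Each detected window index marks an area of interest in the time domain; the corresponding signal subsection is what would be passed on to the frequency analysis.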

Evaluation is performed in the frequency domain with the Seewave package [

Discrete probability density plots of the Leo Express.

The analysis is supplemented further by the maximum and minimum frequencies at three density intervals (0.0001–0.00015, 0.00015–0.0002, and 0.0002–0.0004). Combining the scalar features of the statistical properties in the frequency and time domains enables a vector to be obtained that represents the signal in the time-frequency domain but with minimum resources in comparison with traditional methods such as WT or STFT. However, it should be noted that not all vector values are relevant.
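As an illustration of how the minimum and maximum frequencies within a density interval can be read from a spectrum, the following Python sketch normalises an amplitude spectrum so that it sums to 1 and then selects the frequencies whose density falls inside a given interval. It is a sketch under stated assumptions: the naive DFT, the function names, the toy two-tone signal, and the interval 0.3 to 0.7 are illustrative choices, not the Seewave implementation or the intervals used in the study.

```python
import cmath
import math


def amplitude_spectrum(segment, sample_rate):
    """Naive DFT (O(n^2)) returning frequencies and amplitudes of the positive bins."""
    n = len(segment)
    freqs, amps = [], []
    for k in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(segment)
        )
        freqs.append(k * sample_rate / n)
        amps.append(abs(coeff))
    return freqs, amps


def spectral_density(amps):
    """Scale amplitudes so they sum to 1 (a discrete probability density)."""
    total = sum(amps)
    return [a / total for a in amps]


def frequency_range_in_band(freqs, density, low, high):
    """Minimum and maximum frequency whose density lies in [low, high), or None."""
    selected = [f for f, d in zip(freqs, density) if low <= d < high]
    return (min(selected), max(selected)) if selected else None


def centroid(freqs, density):
    """Density-weighted mean frequency."""
    return sum(f * d for f, d in zip(freqs, density))


# Toy segment: 100 Hz and a weaker 250 Hz component, sampled at 1 kHz.
sample_rate = 1000.0
segment = [
    math.sin(2 * math.pi * 100 * i / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 250 * i / sample_rate)
    for i in range(200)
]
freqs, amps = amplitude_spectrum(segment, sample_rate)
density = spectral_density(amps)
band = frequency_range_in_band(freqs, density, 0.3, 0.7)
print(band, round(centroid(freqs, density), 3))
```

Each interval contributes only two scalars (a minimum and a maximum frequency), which is what keeps the resulting vector small compared with a full time-frequency representation.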

A wide variety of data and formats can be used as inputs for ML. A high-resolution accelerometer signal (such as 10 kHz) as an input is likely to be the simplest option. However, this would require a particularly powerful computing subsystem with substantial memory, which would render the method unsuitable for use in situ or on larger scales. In addition, it is not guaranteed that such a procedure will lead to the best results. Therefore, the selection of the descriptive features is an important step in the ML-based identification process, along with the creation of a sequence of

The initial set of 27 scalar features contains the number of peaks (number of axles), their minimum and maximum, standard deviation, and total sum. It also contains the mean of the signal, standard deviation, median, standard error of the mean, 25% and 75% quantiles, interquartile range, centroid, skewness, kurtosis, spectral flatness measure, and minimum and maximum frequencies for a given interval of discrete probability density. This vector was reduced to 5 features using an iterative optimisation process in which accuracy was maximised while training time, evaluation time, and classifier memory and loss were minimised. The use of the whole vector was considered; however, given the low number of data and the large number of possible parameters, the problem is underdetermined, and the authors therefore did not consider a detailed sensitivity analysis meaningful. Figure

Visualization of clustering of scalar features for individual train passages sorted by train type.

The following scalar features were chosen to describe the train passage:

n_{peaks}: number of peaks detected in the windowed variance. The R language findpeaks function was used for the detection. The number represents the number of axles on the train.

peaks_{sum}: sum of the maximum values of the n_{peaks} detected peaks. To a certain extent, this expresses the absolute amount of dynamic energy that is transmitted to the sleeper.

sem: the standard error of the mean, which describes the random sampling process, whereas the standard deviation of the sample data describes the variation in the measurements. The sem is a probabilistic statement of how, by the central limit theorem, the sample size provides a better bound on estimates of the population mean.

IQR: the interquartile range, which is also known as the midspread or the middle 50% (or, technically, H-spread), is a measure of statistical dispersion, which is equal to the difference between the upper and lower quartiles, or between the 75th and 25th percentiles. The IQR value represents the bandwidth of energy transferred to the sleeper.

prec: the spectrum’s frequency precision.
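The two dispersion measures in the list above follow directly from their definitions, as the short Python sketch below shows. It is only an illustration on a toy sample: the function names are hypothetical, and the quantile convention (`method="inclusive"`) is an assumption that may differ from the one used by the authors' tools.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Sample standard deviation divided by the square root of the sample size."""
    return statistics.stdev(values) / math.sqrt(len(values))


def interquartile_range(values):
    """Difference between the 75th and 25th percentiles (inclusive quantile method)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Toy sample, not measured data.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
sem = standard_error_of_mean(sample)
iqr = interquartile_range(sample)
print(round(sem, 4), iqr)
```

Because the sem shrinks with the square root of the sample size, it reflects how well the mean is estimated, while the IQR describes the spread of the central half of the data regardless of sample size.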

The aim of this study is to confirm the hypothesis regarding the possibility of using recorded acceleration data to identify specific train types at rail S&C. Utilisation of ML methods [

An in-depth literature review showed that the use of measured acoustic or acceleration signals with ML to identify train type was performed successfully on a segment of plain-line railway [

Currently, there are many machine learning methods that differ in the structure and complexity of the algorithm and the suitability for use with different types and sizes of input data. Based on recommendations derived from the literature review, as well as initial investigations using the available ML methods at Mathematica [

A decision tree [

Gradient boosting [

Logistic regression [

In a Markov model [

A naive Bayes [

Nearest neighbours [

The random forest [

A neural network (NN) [

Unlike neural network, the support vector machine [

In this research, machine learning and its postprocess were performed in Wolfram Mathematica 11.1 [

Comparison of ML methods shows the accuracy, training times, and required computation memory for some of the previously mentioned methods (Table

Comparison of ML methods.

Method | SVM | Neural net. | Log. reg. | Nearest neigh. | Rnd. forest |
---|---|---|---|---|---|
Accuracy (%) | 75 | 58 | 67 | 58 | 50 |
Train. time (s) | 2.0715 | 0.9397 | 0.293 | 0.0406 | 0.0529 |
Memory (kB) | 323.384 | 219.512 | 189.240 | 126.824 | 199.520 |

In terms of implementation, SVMs are regarded as binary classifiers [

Although classification using SVM can be controlled in a number of ways [

Full validation of the classifier is impossible due to the limited number of comparable train passages. In the smallest classes, it is only possible to use one train passage for validation, whereas it is possible to use the remaining four comparable train passages for training. This is the case for all combinations. In total, 19 train passages are used, with the recorded acceleration time history being reduced to 5 scalar features.

Due to the low number of comparable train passages in the classes, the reliability of the classifier depends heavily not only on the selection of the scalar features but also on the choice of the vectors (passages) for the training set. To avoid cherry-picking and to reduce the possibility of incorrect results caused by an inappropriate selection of data for training and testing, a bootstrap analysis was performed. Bootstrapping is a compute-intensive method for statistical data analysis [

Visualization of clustering of scalar features for individual train passages sorted by train type.

As soon as the sets were ready, ML was performed and the classifier was built. The result of the subsequent testing was a confusion matrix. This process of building subsets, training, and testing was repeated 1,000 times, yielding 1,000 confusion matrices (i.e., one matrix per subset selection).
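The repeated split, train, and test loop can be sketched in Python as follows. This is a minimal sketch under stated assumptions: a nearest-centroid classifier is swapped in for the SVM to keep the example dependency-free, the two-dimensional features and class sizes are synthetic, and four training passages per class are drawn at random in each repetition.

```python
import random
from collections import defaultdict

CLASSES = ["151", "362", "380", "Leo"]


def split_passages(passages, n_train, rng):
    """Randomly pick n_train passages per class for training; the rest go to testing."""
    by_class = defaultdict(list)
    for features, label in passages:
        by_class[label].append(features)
    train, test = [], []
    for label, items in by_class.items():
        shuffled = rng.sample(items, len(items))
        train += [(f, label) for f in shuffled[:n_train]]
        test += [(f, label) for f in shuffled[n_train:]]
    return train, test


def fit_centroids(train):
    """Stand-in classifier (NOT an SVM): one mean feature vector per class."""
    sums, counts = {}, defaultdict(int)
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(centroids[lab], features)),
    )


def confusion_matrix(train, test):
    """Rows: true class, columns: predicted class, entries: counts of test passages."""
    centroids = fit_centroids(train)
    index = {label: i for i, label in enumerate(CLASSES)}
    matrix = [[0] * len(CLASSES) for _ in CLASSES]
    for features, label in test:
        matrix[index[label]][index[predict(centroids, features)]] += 1
    return matrix


rng = random.Random(0)
# Synthetic feature vectors and hypothetical class sizes, for illustration only.
class_sizes = {"151": 5, "362": 5, "380": 5, "Leo": 6}
passages = [
    ([k + rng.gauss(0, 0.6), 2 * k + rng.gauss(0, 0.6)], label)
    for k, label in enumerate(CLASSES)
    for _ in range(class_sizes[label])
]
matrices = [confusion_matrix(*split_passages(passages, 4, rng)) for _ in range(1000)]
print(matrices[0])
```

Fixing the number of training passages per class keeps the training classes balanced, while the test sets inherit whatever passages remain, which is why their sizes can differ between classes.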

In ML, a confusion matrix (also known as an error matrix [

Three confusion matrices obtained by random selection of training and test sets.

From all 1,000 matrices, one average matrix was evaluated (i.e., the sum of all results at the same position in the matrix was divided by the number of matrices). For easier understanding, the values in each row of the matrix were rescaled to sum to 1, so that the probability of (mis)classification for each class can be read directly. Because the test sets were not all the same size, the colour of each field indicates the significance of the test: the darker the field, the higher the number of test samples. For example, a probability of correct classification of 0.9 obtained from 6 test passages of the same train type is more reliable than one obtained from only 1 test passage.
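The averaging and row rescaling described above amount to two small operations, sketched here in Python on three made-up 2x2 matrices. The function names and the example counts are hypothetical; only the procedure (element-wise mean, then division of each row by its sum) follows the text.

```python
def average_matrix(matrices):
    """Element-wise mean: sum the entries at each position, divide by the matrix count."""
    n = len(matrices)
    size = len(matrices[0])
    return [
        [sum(m[r][c] for m in matrices) / n for c in range(size)]
        for r in range(size)
    ]


def normalise_rows(matrix):
    """Rescale each row to sum to 1 so entries read as (mis)classification probabilities."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([v / total if total else 0.0 for v in row])
    return result


# Made-up confusion matrices (rows: true class, columns: predicted class).
examples = [
    [[1, 0], [0, 2]],
    [[0, 1], [1, 1]],
    [[1, 0], [0, 2]],
]
average = average_matrix(examples)
probabilities = normalise_rows(average)
# Row sums of the average matrix give the mean number of test passages per class,
# which is the quantity the shading of the figure is described as encoding.
test_counts = [sum(row) for row in average]
print(probabilities, test_counts)
```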

The confusion matrix shown in Figure

An average confusion matrix calculated during cross-validation using the SVM classifier. The accuracy is relatively poor at 61%, but this can be attributed to the lack of training data and train similarities.

Although the data contain passages from before and after the common crossing was renovated, the identification method is sufficiently robust, based on the probabilities, to allow for railway crossing component modification, provided that measurements are obtained at the same locations, and as long as the primary objective is TIS only, not condition assessment.

In this paper, the authors have conceptually approached the AI-assisted Train Identification System (TIS), a component of the self-diagnostic system for S&C, utilizing real on-site acceleration data from TEN-T railway lines in the Czech Republic. This research is part of the S-CODE project, whose overall aim is to investigate, develop, validate, and perform initial integration of radically new concepts for S&C with the potential to increase their capacity, reliability, and safety while reducing investment and operational costs. The presented approach is unique in attempting TIS based on acceleration time histories measured in S&C rather than in straight track.

The presented accuracies of the five ML classifiers are clearly limited by the number of uncontrollable variables and uncertainties, as well as by the limited number of comparable train passages, considering the dimensionality of both the physical problem and the abstract models. As the classification procedure can be sensitive to unequal class sizes, all training classes (train types) have an equal size of 4.

Although a bootstrapping analysis has been performed (1,000 training and testing subsets) in order to fully utilize the experimental evidence and to select the data for training and testing more objectively, the resulting average confusion matrices show prohibitive probabilities, which can be attributed to the similarities of the 151 and 380 locomotives, the low number of observations, and the complex dynamic interactions at S&C in general.

Nevertheless, based on the presented theoretical and practical arguments, it can be concluded that the support vector machine (SVM) can be recommended as the most suitable ML method. This conclusion is in line with the published evidence (TIS based on straight-track measurements) and is supported by the presented comparison of alternative ML methods. The obvious trade-off for the highest accuracy, namely increased training time and memory, is relatively cheap considering the efficiency and availability of current low-power (energy-harvesting, battery-powered) computer modules and relative to the hardware resources required for statistical preprocessing of the recorded vibration time histories.

In fact, the average accuracy of 75% for SVM-based TIS at S&C cannot be considered unreasonable, given that published results from straight-track TIS using SVM yield an accuracy of 96%, and considering the inherently more complex and uncertain response of S&C compared with straight track and the clear similarities of the 151 and 380 locomotives.

For future applications within an early warning system, it would be advisable to implement the SVM method, use it within the experimental envelope, avoid excessive extrapolation (as is generally recommended for all ML methods), and combine the diagnostics from S&C with straight-track measurements, where it is possible to identify carriage defects such as a flat wheel. This would dramatically improve the sensitivity and specificity of the TIS by, e.g., avoiding false positives from bogie defects. Although current optical systems can identify trains by detecting and evaluating the mark placed on each locomotive, these systems are relatively expensive and sensitive to maintenance and weather conditions compared with vibration data-driven ML models.

One of the contributing factors to the overall uncertainty is the variable number of passengers in each wagon, which significantly affects the dynamic characteristics of the train formation. This aspect can be approached by trimming the signal so that only the locomotive remains, resulting in easier-to-classify data while simultaneously reducing hardware requirements. However, it would be necessary to define an objective and universally applicable method of trimming, because of the complex interference of the vibrations caused by the locomotive and the following car, the nonuniform number of locomotive axles, and the presence of Jacobs bogies. In addition, shortening the signal discards some data that can provide valuable information about the condition. Most importantly, analytical approaches (classification based on, e.g., distance and number of axles) are typically sufficient for evaluating the locomotive-only signal, i.e., ML methods are not required at all.
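The kind of analytical classification mentioned above can be sketched in a few lines of Python. The sketch assumes that the times of the first two axle peaks and the train speed are known, and it compares the resulting axle spacing with the values in the table of mechanical properties; the function names and the 0.1 m tolerance are hypothetical.

```python
# Axle spacing in metres, taken from the table of mechanical properties.
AXLE_SPACING_M = {"151": 3.2, "362": 3.2, "380": 2.5, "Leo": 2.7}


def bogie_axle_spacing(peak_times_s, speed_kmh):
    """Distance between the first two axles from their peak times and the train speed."""
    speed_ms = speed_kmh / 3.6
    return (peak_times_s[1] - peak_times_s[0]) * speed_ms


def candidate_classes(spacing_m, tolerance_m=0.1):
    """Train types whose tabulated axle spacing lies within the (assumed) tolerance."""
    return sorted(
        name for name, spacing in AXLE_SPACING_M.items()
        if abs(spacing - spacing_m) <= tolerance_m
    )


# Hypothetical peak times for a passage at 90 km/h (25 m/s).
print(candidate_classes(bogie_axle_spacing([0.0, 0.100], 90.0)))
print(candidate_classes(bogie_axle_spacing([0.0, 0.128], 90.0)))
```

With the tabulated values, axle spacing alone cannot separate classes 151 and 362, so such a rule would need a further geometric or dynamic quantity to distinguish them.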

From a pure TIS perspective, the best input would clearly be repeated passages of (specially scheduled) separate locomotives; however, such a system could hardly be considered an early warning system but rather preventive monitoring, as is routinely done, e.g., in structural health monitoring of bridges with scheduled passages of specialized instrumented vehicles.

Although the cross-validation options available clearly limit the statistical significance, the results are unique in demonstrating that

ML- (SVM-) based TIS at S&C is feasible if, within the S&C, the monitoring location is consistent. In cases in which the monitoring location is not consistent, identification is not successful.

Specifically, the approach using SVM is insensitive to common crossing renovation, i.e., data from before and after the renovation can be combined if only TIS without S&C condition assessment is considered.

The input vector that reduces the full recorded time histories to a set of scalar characteristics must always be chosen subjectively so that it characterises all important features sufficiently while maintaining realistic hardware requirements stemming from the intended in situ implementation on energy-harvesting, battery-powered modules.

During an iterative optimisation process in which accuracy is maximised and training time, evaluation time, and classifier memory and loss are minimised, the initial vector of 27 scalar features is reduced to 5.

AI: Artificial intelligence

S&C: Switches and crossings

ML: Machine learning

TIS: Train Identification System

NN: Neural network

SVM: Support vector machine

LE: Leo Express train

WT: Wavelet transform

WVT: Wigner–Ville transform

STFT: Short-time Fourier transform.

All data are available on request through corresponding authors.

The authors declare that they have no conflicts of interest.

This research is part of the S-CODE project, which received funding from the Shift2Rail Joint Undertaking under the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement no. 730849. The support of the FAST-J-19-6062 project is also acknowledged. Furthermore, the research was supported by the project TAČR CK01000091 Výhybka 4.0.